Resit Summative Assessment
Final Project 60%
Resit Final Project Presentation 40%
Dear students,
This is your Resit Assessment. Prior to attempting this assignment, please carefully read
the relevant materials found in the Module Revision Material of the course shell.
Assessment Resit Submission Rules and Important Notes:
1. You are able to use the feedback and constructive comments provided by your
tutor in order to improve/enhance your work. Please ensure that you work on your original
assignment and improve it based on the feedback received.
2. During the resit period you are given the opportunity to revise and resubmit
originally failed module assessment(s), but no further academic instruction will be
provided. However, you are able to use the feedback and constructive comments
provided by your tutor in order to improve/enhance your work.
3. Your assessment should be submitted via the appropriate VLE Submission Link
by 11:59 PM (VLE time) at the end of the resit week. The specific date was
provided to you on the day you were granted access to the resit module. You may
request confirmation of your deadline in a timely manner via resubmission@unicaf.org.
5. We are here to help and support you during the resit period. If you need any
non-tutor, technical assistance for any issue affecting your ability to submit your resit
assignment, please contact the Resubmission Services via resubmission@unicaf.org as
a first step, and allow 48 hours for us to answer you before moving on
to Student Support.
6. The maximum mark attainable for the components upon reassessment will be
50%. Please write your solutions clearly and concisely. If you do not explain your answer
you will be given no credit. You must write your own solution. Copying someone else’s
solution will be considered plagiarism and may result in failing the whole course.
The overall mark for the coursework (CRWK) comes from two main activities, as follows:
1- Big Data Analytics report (around 5,000 words, with a tolerance of ± 10%) (60%)
2- Presentation (around 1000 words, with a tolerance of + 10%) (40%)
Total 100
Tasks:
The raw network packets of the UNSW-NB15 dataset¹ were created by the IXIA PerfectStorm
tool in the Cyber Range Lab of the Australian Centre for Cyber Security (ACCS) to generate
a hybrid of real modern normal activities and synthetic contemporary attack behaviours.
The Tcpdump tool was used to capture 100 GB of the raw traffic (e.g., Pcap files). This
dataset has nine types of attacks, namely Fuzzers, Analysis, Backdoors, DoS, Exploits,
Generic, Reconnaissance, Shellcode and Worms. The Argus and Bro-IDS tools were used and
twelve algorithms were developed to generate a total of 49 features with the class label.
c) In this coursework, we use a total of 10 million records stored in
the CSV file (download). The total size is about 600MB, which is big enough to
employ big data methodologies for analytics. As a big data specialist, you should first
read and understand the features, then apply modeling techniques. If
you want to see a few records of this dataset, you can import it into Hadoop HDFS,
then run a Hive query that prints the first 5-10 records.
(2) Big Data Query & Analysis by Apache Hive [30 marks]
This task uses Apache Hive to convert big raw data into useful information for the
end users. To do so, first understand the dataset carefully. Then write at least 4 Hive
queries (refer to the marking scheme). Apply appropriate visualization tools to present
your findings numerically and graphically. Briefly interpret your findings.
Finally, include screenshots of your outcomes (e.g., tables and plots), together with the
scripts/queries, in the report.
Tip: The mark for this section depends on the complexity of your Hive queries. For
instance, a simple select query alone will not earn full marks.
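To illustrate the difference between a simple select and a more complex query, the sketch below runs both against a tiny in-memory SQLite table that stands in for the Hive table. The table name `unsw_nb15`, the columns `proto`, `sbytes`, `attack_cat` and `label`, and the sample rows are assumptions made for illustration, so check them against the dataset's own feature list. The SQL uses constructs that HiveQL also provides (LIMIT, GROUP BY, CASE inside an aggregate, ORDER BY), but it has only been written for SQLite here, not run on Hive.

```python
import sqlite3

# Hypothetical miniature of the dataset: (proto, sbytes, attack_cat, label).
ROWS = [
    ("tcp", 500, "Normal", 0),
    ("tcp", 1200, "Exploits", 1),
    ("tcp", 100, "Normal", 0),
    ("udp", 150, "Generic", 1),
    ("udp", 250, "Generic", 1),
    ("udp", 200, "Normal", 0),
]

# Simple select: prints the first few records, useful only for a first look.
PREVIEW_SQL = "SELECT * FROM unsw_nb15 LIMIT 5"

# More complex query: per-protocol totals, attack counts and mean source bytes.
SUMMARY_SQL = """
SELECT proto,
       COUNT(*) AS total_records,
       SUM(CASE WHEN label = 1 THEN 1 ELSE 0 END) AS attack_records,
       AVG(sbytes) AS avg_src_bytes
FROM unsw_nb15
GROUP BY proto
ORDER BY attack_records DESC
"""


def run_queries(rows):
    """Load the rows into an in-memory table and run both queries."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE unsw_nb15 "
        "(proto TEXT, sbytes INTEGER, attack_cat TEXT, label INTEGER)"
    )
    conn.executemany("INSERT INTO unsw_nb15 VALUES (?, ?, ?, ?)", rows)
    preview = conn.execute(PREVIEW_SQL).fetchall()
    summary = conn.execute(SUMMARY_SQL).fetchall()
    conn.close()
    return preview, summary


preview, summary = run_queries(ROWS)
for row in summary:
    print(row)
```

The summary query answers a question an end user might actually ask (which protocol carries the most attack traffic), which is the kind of step up from a plain select that the tip describes.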
1 source: https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets/
(3) Advanced Analytics using PySpark [50 marks]
In this section, you will conduct advanced analytics using PySpark.
a) Design and build a binary classifier over the dataset. Explain your algorithm and its
configuration. Present your findings in both numerical and graphical
form. Evaluate the performance of the model and verify its accuracy
and effectiveness. [15 marks]
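The evaluation step in (a) typically starts from a confusion matrix. The sketch below is plain Python, independent of PySpark, and shows how accuracy, precision, recall and F1 follow from the four confusion-matrix counts. The function name `binary_metrics` and the label and prediction lists are made up for illustration (1 = attack, 0 = normal is an assumed encoding); in practice the two lists would come from your model's predictions on held-out data.

```python
def binary_metrics(y_true, y_pred):
    """Compute confusion-matrix counts and the usual summary metrics."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed attacks

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "accuracy": accuracy, "precision": precision,
        "recall": recall, "f1": f1,
    }


# Made-up ground truth and predictions for ten records.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(name, value)
```

Reporting precision and recall alongside accuracy matters for intrusion data, because a missed attack and a false alarm carry different costs and accuracy alone does not separate them.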
b) Apply a multi-class classifier to classify data into ten classes (categories): one
normal and nine attacks (i.e., Fuzzers, Analysis, Backdoors, DoS, Exploits,
Generic, Reconnaissance, Shellcode and Worms). Briefly explain your model, with
supporting statements on its parameters, accuracy and effectiveness. [20 marks]
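For the multi-class task in (b), overall accuracy can hide poor performance on rare attack categories, so per-class figures are worth reporting. The sketch below, in plain Python and independent of PySpark, computes per-class precision, recall and F1 plus the macro-averaged F1. The function name `per_class_report` and the ten labelled records are invented for illustration, and only three of the ten classes appear to keep the example short.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Return per-class precision/recall/F1, overall accuracy and macro F1."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    report = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        p_den = tp[cls] + fp[cls]
        r_den = tp[cls] + fn[cls]
        precision = tp[cls] / p_den if p_den else 0.0
        recall = tp[cls] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        report[cls] = {"precision": precision, "recall": recall, "f1": f1}

    accuracy = sum(tp.values()) / len(y_true)
    macro_f1 = sum(r["f1"] for r in report.values()) / len(report)
    return report, accuracy, macro_f1


# Made-up example: the model never predicts the rare "Worms" class.
y_true = ["Normal"] * 6 + ["DoS"] * 2 + ["Worms"] * 2
y_pred = ["Normal"] * 6 + ["DoS", "Normal"] + ["Normal", "Normal"]

report, accuracy, macro_f1 = per_class_report(y_true, y_pred)
print("accuracy", accuracy)
print("macro F1", macro_f1)
for cls, scores in report.items():
    print(cls, scores)
```

In this toy data the accuracy looks acceptable while recall for "Worms" is zero, which is the kind of gap that a per-class table or a confusion-matrix plot makes visible.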
Marking Scheme for the Presentation
Presentation design & layout features (20): Makes excellent use of fonts, colors,
graphics, effects and transitions to enhance the presentation.
Total 100
This is the second submission, located at a different submission link, where you
will submit a presentation based on the report above. It carries a weight of 40% of your
Final Grade.
2. Table of Contents
3. Report of the tasks (with sub-sections for the tasks, as appropriate)
SUBMISSION
single PDF into Turnitin in Moodle, by the end of Week 12
single PDF into Turnitin in Moodle at the second submission link for the presentation, by the end
of Week 12
PLAGIARISM
The University defines an assessment offence as any action(s) or behaviour likely to confer
an unfair advantage in assessment, whether by advantaging the alleged offender or
disadvantaging (deliberately or unconsciously) another or others. A number of
examples are set out in the Regulations and these include:
“D.5.7.1 (e) the submission of material (written, visual or oral), originally produced by another
person or persons, without due acknowledgement, so that the work could be assumed the
student’s own. For the purposes of these Regulations, this includes incorporation of
significant extracts or elements taken from the work of (an) other(s), without
acknowledgement or reference, and the submission of work produced in collaboration for an
assignment based on the assessment of individual work. (Such offences are typically
described as plagiarism and collusion.)”.