
Sample Test

Fill in the blank:


1. (2 points) Confidentiality, integrity, and availability form what is often referred
to as the ___.
Answer: CIA triad.
2. (2 points) ___ security, also called infrastructure security, protects the
information systems that contain data and the people who use, operate, and
maintain the systems.
Answer: Physical
3. (2 points) When a distribution is negatively skewed, the mean is pulled to
the right. (True/False)
Answer: False
4. (2 points) Histograms are used with numeric data rather than with categorical
data. (True/False)
Answer: True
5. Write the R code that creates a numeric vector from 0 to 100 in
increments of 1.
Answer: seq(0, 100, by = 1)
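
As a quick check of the answer to question 5, the following minimal sketch builds the vector and inspects it. The variable name `v` is an assumption made for illustration and is not part of the test.

```r
# Build the vector asked for in question 5.
v <- seq(0, 100, by = 1)

# Inspect the result: 101 values, starting at 0 and ending at 100.
length(v)
head(v)
tail(v)

# Every consecutive difference is 1, confirming the increment.
all(diff(v) == 1)
```

The shorthand `0:100` yields the same values, although it is stored as integer rather than double.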

Short answer questions:


1. (5 points) What will be the output of the following R code?

Answer: The numeric summary of the “Height” column in the
“01_heights_weights_genders1.csv” dataset. (1’) The numeric summary
includes the minimum, 1st quartile, median, mean, 3rd quartile, and
maximum. (4’)
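
The kind of output described in this answer can be illustrated with a small sketch. It does not reproduce the code from the question; the `heights` vector below is made-up stand-in data, not values from the dataset, and it assumes the question calls `summary()` on a numeric column.

```r
# Hypothetical heights (stand-in values, not taken from the CSV file).
heights <- c(60, 62, 65, 68, 70, 72, 75)

# summary() on a numeric vector returns six labelled statistics.
s <- summary(heights)
print(s)

# The labels are the six quantities listed in the answer.
names(s)
```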

2. (5 points) What is security analytics? List and explain three common
approaches used for Twitter spam detection.
Answer: We define security analytics as the adaptation of techniques from
data science to security challenges. (2’)

Spam detection using data analytics approaches:


(1) Detection based on syntax analysis: The detection methods based
on syntax analysis fall into two categories: 1) key segment and
2) tweet content. Key segment methods collect indicative segments
such as keywords, username patterns and URLs to represent the
context of tweets and posters. Tweet content methods focus on the text
of the tweet. There are currently three major techniques to represent
the textual content of tweets: TF-IDF (Term Frequency-Inverse Document
Frequency), bag-of-words and sparse learning. (1’)
(2) Detection based on feature analysis: The feature analysis-based
detection methods use statistical information and social graph
information. Statistical information is extracted from tweet
statistics, account statistics and campaign statistics. The social
graph information is extracted from the macroscopic attributes of
graph nodes as well as the relationships between graph nodes. (1’)
(3) Detection based on blacklist: Blacklist detection methods rely on
third-party blacklisting techniques. (1’)
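
To make the bag-of-words representation mentioned under (1) concrete, here is a minimal base R sketch. The three tweets are invented for illustration, and the whitespace tokenizer is a simplifying assumption; a real detector would need proper tokenization and a trained classifier on top of these counts.

```r
# Hypothetical tweets, invented for illustration only.
tweets <- c("win a free prize now",
            "free free followers click now",
            "meeting notes from today")

# Lower-case each tweet and split it on whitespace.
tokens <- strsplit(tolower(tweets), "\\s+")

# The vocabulary is the sorted set of distinct terms.
vocab <- sort(unique(unlist(tokens)))

# Count how often each vocabulary term occurs in each tweet.
counts <- vapply(
  tokens,
  function(tk) as.integer(table(factor(tk, levels = vocab))),
  integer(length(vocab))
)

# Bag-of-words matrix: one row per tweet, one column per term.
bow <- t(counts)
colnames(bow) <- vocab
print(bow)
```

Each row of `bow` is the feature vector for one tweet; word order is discarded and only term counts remain.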

3. (5 points) How do you read a box plot in statistics?

Answer: By its minimum, first quartile (Q1), median, third quartile (Q3),
maximum, and outliers. (5’)
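
The quantities a box plot displays can be computed directly, as in the following sketch. The data vector `x` is made up for illustration. Note that R's `boxplot.stats()` reports hinges, which approximate Q1 and Q3, and whisker ends, which are the most extreme points that are not flagged as outliers.

```r
# Hypothetical data with one unusually large value.
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 50)

# boxplot.stats() returns the numbers that boxplot() would draw.
b <- boxplot.stats(x)

# Lower whisker, lower hinge, median, upper hinge, upper whisker.
b$stats

# Points beyond 1.5 times the hinge spread from the box are outliers.
b$out
```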
