
Subcode CP5293
Subject Name BIG DATA ANALYTICS
Sem II
Degree M.E
Branch CSE
Staff Name M.P.RAJAKUMAR, R.RANJITH
Date of Updation 30.06.2021

Part Unit Group Q.No Question CO Statement Knowledge Level (Remember/Understand/Apply/Analyze/Evaluate/Create)
A 3 1 1 What is a linear regression model? CO3 Understand
A 3 1 2 List the main reasons for the widespread CO3 Understand
use of the linear model.
A 3 1 3 List the common reasons for doing a CO3 Understand
regression analysis.
A 3 1 4 How does a nonlinear regression model differ from a CO3 Understand
linear model? Apply
……
A 3 2 1 What is SVM? CO3 Understand
A 3 2 2 What is a kernel function? CO3 Understand
A 3 2 3 List the various clustering techniques CO3 Understand
A 3 2 4 List the applications of cluster analysis CO3 Understand
…….
A 3 3 1 After performing K-Means Clustering analysis CO3 Understand
on a dataset, you observed the following Apply
dendrogram. What conclusion can be Analyze
drawn from the dendrogram?

A 3 3 2 I am the marketing consultant of a leading CO3 Understand
e-commerce website. I have been given a task of Apply
making a system that recommends products to Analyze
users based on their activity on Facebook. I
realize that user interests could be highly
variable. Hence I decide to a. First, cluster the
users into communities of like-minded people
and b. Second, train separate models for each
community to predict which product category
(e.g. electronic gadgets, cosmetics, etc.) would
be the most relevant to that community. The
first task is a/an ______________ learning
problem while the second is a/an
________________ problem. (Fill in the blanks)
A 3 3 3 Differentiate between supervised and CO3 Understand
unsupervised learning Apply

A 3 3 4 What is the need for density-based clustering CO3 Understand
methods?
……..
A 3 4 1 How is soft clustering different from hard CO3 Understand
clustering? Apply

A 3 4 2 What do you mean by predictive analysis? CO3 Understand

A 3 4 3 List the advantages and disadvantages of R CO3 Understand
programming.
A 3 4 4 Are the traditional distance measures, which CO3 Understand
are frequently used in low-dimensional cluster Apply
analysis, also effective on high-dimensional Analyze
data? Justify.
……..
A 3 5 1 How can we find subspace clusters from high- CO3 Understand
dimensional data?
A 3 5 2 What do you mean by biclusters? List the CO3 Understand
requirements for biclusters.
A 3 5 3 What are two major types of methods for CO3 Understand
discovering biclusters in data that may come
with noise?
A 3 5 4 List the various dimensionality reduction CO3 Understand
methods
……..
B 3 1 1 a) Explain the k-means clustering algorithm in CO3
detail. List the advantages and disadvantages
of the k-means clustering algorithm.
b) Suppose we have several objects and each Understand
object has two attributes (or features) as Apply
shown in the table below. Apply the k-means Analyze
clustering algorithm and group these objects
into k=2 clusters.
Object Attribute 1 (x) Attribute 2 (y)
Medicine A 1 1
Medicine B 2 1
Medicine C 4 3
Medicine D 5 4
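As a cross-check for part (b), the following is a minimal sketch of the standard k-means (Lloyd) iteration on the four medicines. The question does not state the initial centroids, so the sketch assumes Medicine A and Medicine B as starting centroids and Euclidean distance; the function name `kmeans` is illustrative only.

```python
from math import dist

def kmeans(points, centroids, max_iter=100):
    """Assign each point to its nearest centroid, then recompute centroids, until stable."""
    centroids = list(centroids)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments unchanged: converged
            break
        labels = new_labels
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids

# Medicines A, B, C, D from the table; initial centroids A and B are an assumption.
medicines = [(1, 1), (2, 1), (4, 3), (5, 4)]
labels, centroids = kmeans(medicines, centroids=[(1, 1), (2, 1)])
print(labels)     # cluster index per medicine
print(centroids)  # final centroid per cluster
```

Under these assumptions the iteration groups A with B and C with D. A different choice of starting centroids may take a different number of passes.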
B 3 1 2 a) Explain single linkage clustering algorithm CO3
in detail. b) Suppose we have 6 objects and Understand
each object has 2 features as shown below in Apply
the distance measure table. Apply single Analyze
linkage clustering algorithm and group these
objects.
object x y
A 1 1
B 1.5 1.5
C 5 5
D 3 4
E 4 4
F 3 3.5
B 3 1 3 a) Explain complete linkage clustering CO3
algorithm in detail. b) Suppose we have 6 Understand
objects and each object has 2 features as shown Apply
below in the distance measure table. Apply Analyze
complete linkage clustering algorithm and
group these objects.
object x y
A 1 1
B 1.5 1.5
C 5 5
D 3 4
E 4 4
F 3 3.5
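The single linkage and complete linkage questions above use the same six objects, so one small sketch can check both. It assumes Euclidean distance between the (x, y) coordinates, which the question does not state explicitly, and uses a naive agglomerative loop. The function name `agglomerate` and its `linkage` parameter are illustrative.

```python
from itertools import combinations
from math import dist

def agglomerate(points, linkage=min):
    """Naive agglomerative clustering.

    linkage=min gives single linkage, linkage=max gives complete linkage.
    Returns the merges in order as (sorted member names, merge distance).
    """
    clusters = [frozenset([name]) for name in points]
    merges = []
    while len(clusters) > 1:
        def gap(pair):
            a, b = pair
            return linkage(dist(points[p], points[q]) for p in a for q in b)
        a, b = min(combinations(clusters, 2), key=gap)  # closest pair of clusters
        merges.append((sorted(a | b), round(gap((a, b)), 3)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges

objects = {"A": (1, 1), "B": (1.5, 1.5), "C": (5, 5),
           "D": (3, 4), "E": (4, 4), "F": (3, 3.5)}

single = agglomerate(objects, linkage=min)
complete = agglomerate(objects, linkage=max)
for members, height in single:
    print("single  ", members, height)
for members, height in complete:
    print("complete", members, height)
```

For this data both linkages merge the objects in the same order (D with F, A with B, then E, then C, then everything). Only the merge heights differ, which is what changes the shape of the dendrogram.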
B 3 1 4 Consider the similarity matrix given below. CO3 Understand
Show the hierarchy of clustering created by Apply
the single linkage clustering and complete Analyze
linkage clustering.
    P1   P2   P3   P4   P5   P6
P1  1    0.70 0.65 0.40 0.20 0.05
P2       1    0.95 0.70 0.50 0.35
P3            1    0.75 0.55 0.40
P4                 1    0.80 0.65
P5                      1    0.85
P6                           1
B 3 1 5 Explain predictive analytics in detail with CO3 Understand
different cases. Apply
……….
B 3 2 1 a) What do you mean by distance measures? CO3
List the axioms of distance measures.
b) Define L2-norm, L1-norm and L∞-norm. Understand
Consider the two-dimensional Euclidean Apply
space (the customary plane) and the points (2, Analyze
7) and (6, 4). Calculate the L2-norm, L1-norm
and L∞-norm distances between these points.
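A short sketch for checking part (b) numerically. It assumes the third norm in the question is the L∞ (maximum) norm; the helper names `lr_distance` and `linf_distance` are illustrative.

```python
def lr_distance(x, y, r):
    """L_r distance: r-th root of the sum of |x_i - y_i|^r."""
    return sum(abs(a - b) ** r for a, b in zip(x, y)) ** (1 / r)

def linf_distance(x, y):
    """L-infinity distance: largest coordinate-wise difference (limit of L_r as r grows)."""
    return max(abs(a - b) for a, b in zip(x, y))

p, q = (2, 7), (6, 4)
print("L1  :", lr_distance(p, q, 1))   # |2-6| + |7-4|
print("L2  :", lr_distance(p, q, 2))   # sqrt(4^2 + 3^2)
print("Linf:", linf_distance(p, q))    # max(4, 3)
```

The coordinate differences are 4 and 3, so the three distances are 7, 5 and 4 respectively.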
B 3 2 2 a) Define Cosine distance. Let our two vectors CO3 Understand
be x = [1, 2, −1] and y = [2, 1, 1]. Calculate the Apply
cosine distance. Analyze
b) Show that the cosine distance is indeed a
distance measure
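For part (a), a minimal sketch that treats the cosine distance as the angle between the two vectors, in degrees. The second vector is assumed to be named y, and the function name `cosine_distance` is illustrative.

```python
import math

def cosine_distance(x, y):
    """Angle between x and y in degrees, from cos(theta) = x.y / (|x| |y|)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    cos_theta = max(-1.0, min(1.0, dot / (norm_x * norm_y)))  # guard against rounding
    return math.degrees(math.acos(cos_theta))

x = [1, 2, -1]
y = [2, 1, 1]
angle = cosine_distance(x, y)
print(round(angle, 6))
```

Here the dot product is 3 and both vectors have length √6, so the cosine is 3/6 = 0.5 and the angle is 60 degrees.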
B 3 2 3 a) Define Jaccard distance. Find the Jaccard CO3 Understand
distance between the following pair of sets: Apply
{1, 2, 3, 4} and {2, 3, 4, 5} Analyze
b) Show that the Jaccard distance is indeed a
distance measure
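A minimal sketch for checking part (a), using the usual definition of Jaccard distance as one minus the ratio of intersection size to union size. The function name `jaccard_distance` is illustrative.

```python
def jaccard_distance(s, t):
    """1 - |s ∩ t| / |s ∪ t| for two sets."""
    s, t = set(s), set(t)
    if not s and not t:
        return 0.0  # convention assumed here: two empty sets are identical
    return 1 - len(s & t) / len(s | t)

d = jaccard_distance({1, 2, 3, 4}, {2, 3, 4, 5})
print(d)
```

The two sets share 3 elements out of 5 in the union, so the distance is 1 − 3/5 = 0.4.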
B 3 2 4 a) Define edit distance. Find the edit distances CO3 Understand
(using only insertions and deletions) between Apply
the following pairs of strings. (i) abcdef and Analyze
bdaefc. (ii) abccdabc and acbdcab.
b) Another way to define and calculate the
edit distance d(x, y) is to compute a longest
common subsequence (LCS) of x and y. Using
LCS calculate the edit distance of the above
pair of strings
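Part (b) can be checked with a short dynamic-programming sketch. With only insertions and deletions allowed, the edit distance equals |x| + |y| − 2·|LCS(x, y)|. The function names below are illustrative.

```python
def lcs_length(x, y):
    """Length of a longest common subsequence, computed row by row."""
    prev = [0] * (len(y) + 1)
    for a in x:
        cur = [0]
        for j, b in enumerate(y, start=1):
            if a == b:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def edit_distance(x, y):
    """Insert/delete-only edit distance via the LCS identity."""
    return len(x) + len(y) - 2 * lcs_length(x, y)

for x, y in [("abcdef", "bdaefc"), ("abccdabc", "acbdcab")]:
    print(x, y, "LCS =", lcs_length(x, y), "distance =", edit_distance(x, y))
```

For the first pair an LCS is "bdef" (length 4), giving a distance of 6 + 6 − 8 = 4. For the second pair an LCS is "acdab" (length 5), giving 8 + 7 − 10 = 5.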
B 3 2 5 a) Define Hamming distance. Find the CO3 Understand
Hamming distances between each pair of the Apply
following vectors: 000000, 110011, 010101, and Analyze
011100.
b) There are a number of other notions of edit
distance available. For instance, we can allow,
in addition to insertions and deletions, the
following operations:
i. Mutation, where one symbol is replaced by
another symbol. Note that a mutation can
always be performed by an insertion followed
by a deletion, but if we allow mutations, then
this change counts for only 1, not 2, when
computing the edit distance.
ii. Transposition, where two adjacent symbols
have their positions swapped. Like a
mutation, we can simulate a transposition by
one insertion followed by one deletion, but
here we count only 1 for these two steps. If
edit distance is defined to be the number of
insertions, deletions, mutations, and
transpositions needed to transform one string
into another, calculate the edit distance for the
following: abcdef and bdaefc.
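For part (a) of the question above, a minimal sketch that computes the Hamming distance (the number of positions in which two equal-length vectors differ) for every pair. The function name `hamming` is illustrative.

```python
from itertools import combinations

def hamming(u, v):
    """Number of positions at which two equal-length strings differ."""
    if len(u) != len(v):
        raise ValueError("Hamming distance needs vectors of equal length")
    return sum(a != b for a, b in zip(u, v))

vectors = ["000000", "110011", "010101", "011100"]
distances = {(u, v): hamming(u, v) for u, v in combinations(vectors, 2)}
for (u, v), d in distances.items():
    print(u, v, d)
```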
……….
A 4 1 1 Are opinion mining and sentiment analysis the CO4 Analyze
same? Justify your answer.
A 4 1 2 Give the properties of stream data. CO4 Apply
A 4 1 3 Define Sampling from data streams CO4 Remember
A 4 1 4 Define Aurora system model CO4 Remember
……….
A 4 2 1 Define Sensor data. CO4 Remember
A 4 2 2 Define Image Data. CO4 Remember
A 4 2 3 Discuss the issues in data streaming. CO4 Remember
A 4 2 4 What do you mean by stream queries? CO4 Analyze
……….
A 4 3 1 Define data source. CO4 Remember
A 4 3 2 Explain SQuAl CO4 Remember
A 4 3 3 Define QoS specifications CO4 Remember
A 4 3 4 Define SRS CO4 Remember
……….
A 4 4 1 Define Data sampling CO4 Remember
A 4 4 2 Define CP-tree. CO4 Remember
A 4 4 3 Define Curve fitting CO4 Remember
A 4 4 4 What is meant by Function approximation? CO4 Analyze
……….
A 4 5 1 Define ARIMA models. CO4 Remember
A 4 5 2 What are the two sets of conditions under CO4 Analyze
which much of the theory is built?
A 4 5 3 Define Ergodic process CO4 Remember
A 4 5 4 Why do we need RTAP? CO4 Analyze
……….
B 4 1 1 Draw and Explain Aurora system model CO4 Understand

B 4 1 2 Explain how sampling in database is done. CO4 Understand

B 4 1 3 Define Sampling data in a stream. CO4 Understand

B 4 1 4 Describe in detail about Mining Time-series data. CO4 Understand

B 4 1 5 What is Stream Data? CO4 Understand

…….
B 4 2 1 Describe in detail about Mining Data Streams CO4 Understand
Apply
B 4 2 2 Explain windowing approach to data stream CO4 Understand
mining. Apply
B 4 2 3 Describe the process of Stock Market CO4 Understand
Predictions. Apply
B 4 2 4 Describe Real Time Sentiment Analysis CO4 Understand
Apply
B 4 2 5 Describe Real Time Analytics Platform (RTAP) CO4 Understand
Applications Apply
…….
C 3 3 1 Write DBSCAN algorithm in detail. Suppose CO3 Understand
we have 8 objects and each object has 2 Apply
features as shown below in the table. Apply Analyze
DBSCAN algorithm and show how grouping is
done.
object x y
A 2 10
B 2 5
C 8 4
D 5 8
E 7 5
F 6 4
G 1 2
H 4 9
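The question does not give the DBSCAN parameters, so the sketch below assumes eps = 2 and min_pts = 2 (counting the point itself) with Euclidean distance. These values are illustrative assumptions and not part of the question. The function name `dbscan` is also illustrative, and label 0 marks noise.

```python
from math import dist

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: grow a cluster from each unvisited core point."""
    names = list(points)
    neighbours = {
        p: [q for q in names if dist(points[p], points[q]) <= eps] for p in names
    }
    labels = {}
    cluster = 0
    for p in names:
        if p in labels or len(neighbours[p]) < min_pts:
            continue  # already assigned, or not a core point
        cluster += 1
        labels[p] = cluster
        frontier = list(neighbours[p])
        while frontier:
            q = frontier.pop()
            if q in labels:
                continue
            labels[q] = cluster
            if len(neighbours[q]) >= min_pts:  # only core points expand the cluster
                frontier.extend(neighbours[q])
    return {p: labels.get(p, 0) for p in names}  # 0 = noise

objects = {"A": (2, 10), "B": (2, 5), "C": (8, 4), "D": (5, 8),
           "E": (7, 5), "F": (6, 4), "G": (1, 2), "H": (4, 9)}
labels = dbscan(objects, eps=2, min_pts=2)  # assumed parameters
print(labels)
```

With these assumed parameters, C, E and F form one cluster, D and H form a second, and A, B and G are left as noise. Larger values of eps would merge more points into clusters.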
C 3 3 2 Explain the EM algorithm in detail. Consider 6 Understand
points (3, 3), (4, 10), (9, 6), (14, 8), (18, 11), (21, Apply
7). Compute 2 fuzzy clusters using EM Analyze
algorithm.
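One common textbook treatment of fuzzy clustering with EM alternates two steps. The E-step sets each point's membership in a cluster proportional to the inverse squared distance to that cluster's center. The M-step recomputes each center as the mean of the points weighted by squared membership. The sketch below follows that scheme and is not the only possible reading of the question. The initial centers (the first two points), the iteration count and all function names are assumptions.

```python
def e_step(points, centers):
    """Membership of each point in each cluster, proportional to 1 / squared distance."""
    weights = []
    for p in points:
        d2 = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
        if 0 in d2:  # point sits exactly on a center: full membership there
            row = [1.0 if d == 0 else 0.0 for d in d2]
            total = sum(row)
            row = [w / total for w in row]
        else:
            inv = [1 / d for d in d2]
            total = sum(inv)
            row = [v / total for v in inv]
        weights.append(row)
    return weights

def m_step(points, weights):
    """New centers: mean of the points weighted by squared membership."""
    dims = range(len(points[0]))
    centers = []
    for j in range(len(weights[0])):
        w2 = [row[j] ** 2 for row in weights]
        total = sum(w2)
        centers.append(tuple(sum(w * p[k] for w, p in zip(w2, points)) / total for k in dims))
    return centers

def fuzzy_em(points, centers, iterations=20):
    for _ in range(iterations):
        centers = m_step(points, e_step(points, centers))
    return centers, e_step(points, centers)

points = [(3, 3), (4, 10), (9, 6), (14, 8), (18, 11), (21, 7)]
initial = [(3, 3), (4, 10)]  # assumed starting centers
first_weights = e_step(points, initial)
centers, memberships = fuzzy_em(points, initial)
print(centers)
for p, row in zip(points, memberships):
    print(p, [round(w, 2) for w in row])
```

In the first E-step the point (9, 6) lies at squared distances 45 and 41 from the two assumed centers, so its memberships are 41/86 and 45/86. Every row of the membership matrix sums to 1, which is what makes the clusters fuzzy and not hard.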
C 3 3 Compare and contrast the different methods Understand
available to analyze data. Apply
Analyze
………

Note :
• Name of the file should be same as subject code [for example BA5101.doc]
• Last date to submit the question bank at exam office for
  I Yr PG – 01.07.2021 (Unit 3 & 4)
• PART – C – Not Applicable for Mathematics Paper. For case study &
  comprehensive type of questions, there will be no choice.

PRINCIPAL
