Professional Documents
Culture Documents
net
17 Judge why the partitions are shuffled in map reduce? BTL5 Evaluating
18 How will you formulate Hadoop development BTL6 Creating
19 Who is generating big data and what are the ecosystem projects used
BTL6 Creating
for processing
20 Why BTL6 Creating
c o does
nv one
en choose
tio naanalyt ica
l system? ls yst e m o ve r
e p z .n e t w w w .p ad e
www.padeepz.net
PART – B
List the main characteristics of big data architecture with a neat BTL1 Remembering
1
schematic diagram.(13)
2 Explain in detail about the challenges of conventional system(13) BTL4 Analyzing
3 How would you show your understanding of the tools, trends and BTL3 Applying
technology in big data?(13)
4 i. What are the best practices in Big Data analytics? (6) BTL4 Analyzing
ii. Explain the techniques used in Big Data Analytics. (7)
5 i. What make a great analysis? State reason with example.(6) BTL3 Applying
ii. Examine in detail the trends and technology in big data (7)
i. Describe in detail about 5v‟s? (6)
6 ii. Illustrate Map-Reduce with example.(7) BTL1 Remembering
Discuss the use of Big Data Analytics in Business with suitable real
7 world example.(13) BTL2 Understanding
i. Generalize the list of tools related to Hadoop.(6) BTL 6 Creating
8 ii. How does Hadoop work?(7)
9 Discuss the following features of Apache Hadoop in detail with
BTL2 Understanding
diagram as necessary.(13)
10 Summarize briefly on BTL2 Understanding
i. Feature of MapR distribution(7)
ii. Explain the architecture for MapR.(6)
11 Explain the complexity theory for Map-Reduce? What is reducer size
BTL4 Analyzing
and replication rate (13)
12 Explain the significances Hadoop distributed file systems and its
BTL5 Evaluating
application (13)
13 i. Describe Map Reduce framework in detail. Draw the architectural
diagram for physical organization of compute nodes(7) BTL1 Remembering
ii.Define HDFS. Explain HDFS in detail(6)
14 i. Highlight the features of Hadoop and explain the functionalities of
Hadoop cluster?(7)
BTL1 Remembering
ii. Describe briefly about Hadoop input and output and write a note on
data integrity?(6)
PART – C
Generalize how the data flow takes places in MapReduce framework BTL6 Creating
1 (15)
State the significances of MapReduce and discuss about Hadoop
2 BTL5 Evaluating
distributed file system architecture with neat diagram (15)
Consider a collection of literature survey made by a researcher in the
form of a text document with respect to cloud and big data analytics.
3 Using Hadoop and Map Reduce , write a program to count the BTL5 Evaluating
occurrence of pre dominant key words (15)
4 Examine the NameNode recovery process. What will happen with a
NameNode that doesn‟t have any data? (15) BTL6 Creating
UNIT II CLUSTERING AND CLASSIFICATION
Advanced Analytical Theory and Methods: Overview of Clustering - K-means - Use Cases - Overview of the Method -
Determining the Number of Clusters - Diagnostics - Reasons to Choose and Cautions .- Classification: Decision Trees -
Overview of a Decision Tree - The General Algorithm - Decision Tree Algorithms - Evaluating a Decision Tree -
Decision Trees in R - Naïve Bayes - Bay es „T heo r em - N aï ve B ay e s C la ssifier.
w w w .p ad e e p z . n e t
www.padeepz.net
PART – A
Q.No. Question BT Level Competence
1 Define clustering BTL 1 Remembering
2 How can the initial number of clusters for k-means algorithm be
BTL3 Applying
estimated
3 Can you Pick K in a K-Means Algorithm? BTL5 Evaluating
4 What are the problems faced if clustering exists in non-Euclidean BTL2 Understanding
5 Point out the conclusions drawn from choosing clustroid? BTL4 Analyzing
6 Compare and contrast the relationship between centroids and
BTL4 Analyzing
clustering
7 Generalize the initialization of K-Means algorithm? BTL6 Creating
8 Define Bayes Theorem BTL 1 Remembering
9 Discuss the number of clusters? BTL2 Understanding
10 What is diagnotics? BTL 1 Remembering
11 Examine the use of object Attributes?
BTL3 Applying
What is unit of measure? BTL 1 Remembering
12
13 Summarize about Rescaling? BTL2 Understanding
14 What is metoids? BTL2 Understanding
15 What is Customer segmentation BTL 1 Remembering
16 Describe the prediction trees BTL2 Understanding
17 Analyze on internal nodes and leaf nodes BTL4 Analyzing
18 What is purity of node BTL 1 Remembering
Point out the CART BTL4 Analyzing
19
Illustrate Naïve Bayes
20 BTL3 Applying
PART – B
1 Explain the K-means clustering algorithm with an example.(13) BTL2 Understanding
i. Define K-Means algorithm(6)
2 BTL1 Remembering
ii. Examine how the data is processed in BFR Algorithm(7)
What are the main features of GRGPF Algorithm and explain it?(13) BTL1 Remembering
3
Summarize the hierarchical clustering in Euclidean and non-Euclidean
4 BTL2 Understanding
Spaces with its efficiency?(13)
5 Describe the various hierarchical methods of cluster analysis. (13) BTL2 Understanding
Explain and list the different hierarchical clustering techniques and
6 BTL4 Analyzing
explain any one. (13)
i. Given a one dimensional dataset {1, 5, 8, 10, 2} use the
7 BTL6 Creating
agglomerative clustering algorithms with the complete link with
Euclidean distance to westwabwlish.pa
ahiderearcehpicazl .gnroeupting
www.padeepz.net
relationship. By using the maximal lifetime as the cutting
threshold, how many clusters are there? What is their
membership in each cluster? (6)
ii. State and explain the clustering in non-Euclidean space with
example. (7) BTL2 Understanding
8 Describe about Market-Basket model.(13) BTL1 Remembering
9 Demonstrate about the two clustering techniques with suitable
BTL3 Applying
example.(13)
Illustrate about the clustering? Explain it with proper example.(13)
10 BTL1 Remembering
11 Explain in detail about the applications of clustering. (13)
BTL4 Analyzing
Generalize about general algorithm and decision tree algorithm. (13)
12 BTL 6 Creating
Illustrate in detail about Decision tree in R.(13)
13 BTL3 Applying
Explain in detail about Naïve Bayes Theorem, Classifier, Smoothing
14 BTL4 Analyzing
and Diagnostics. (13)
PART – C
Use the k-means algorithm and Euclidean distance to cluster the BTL5 Evaluating
1
following 8 examples into 3
clusters:A1=(2,10),A2=(2,5),A3=(8,4),A4=(5,8),A5=(7,5),A6=(6,4),A
7=(1,2),A8=(4,9).Suppose that the initial seeds(centers of each cluster)
are A1,A4 and A7.Run the k-means algorithm for 1 epoch only. At the
end of this epoch show
(i) The new clusters (5)
(ii) The centers of the new clusters (6)
(iii) How many more iterations are needed to coverage? Draw
the result for each epoch. (4)
2 Develop decision tree with an example to predict whether customers BTL 6 Creating
will buy a product(15)
3 Explain in detail about evaluate the decision tree algorithm(15) BTL5 Evaluating
Explain in detail about two methods of using the naïve Bayes classifier BTL 6 Creating
4 in R with examples.(15)
Advanced Analytical Theory and Methods: Association Rules - Overview - Apriori Algorithm - Evaluation
of Candidate Rules - Applications of Association Rules - Finding Association& finding similarity -
Recommendation System: Collaborative Recommendation- Content Based Recommendation - Knowledge
Based Recommendation- Hybrid Recommendation Approaches.
PART – A
BT
Q.No. Question Competence
Level
1 Define apriori algorithm BTL1 Remembering
2 State the use of Apriori algorithm in data mining BTL1 Remembering
3 State market basket analysis BTL1 Remembering
www.padeepz.net
www.padeepz.net
4 What is the logic behind association rule? BTL2 Understanding
5 What is Prune BTL2 Understanding
6 Define Confidence BTL1 Remembering
7 What do you mean by lift BTL2 Understanding
8 Show the advantage of leverage BTL3 Applying
9 What is frequent itemset generation BTL1 Remembering
10 Analyze the Validation and testing
BTL4 Analyzing
11 Demonstrate the approaches to improve Apriori efficiency
BTL3 Applying
12 How are interesting rules identified? BTL4 Analyzing
13 Examine the broad classification of Recommendation systems? BTL3 Applying
14 Summarize the interesting rules distinguished from coincidental rules? BTL 5 Evaluating
What is utility matrix
15 BTL4 Analyzing
16 Examine about item profile BTL1 Remembering
17 Give the Content Based recommendation system BTL 5 Evaluating
13 Generalize in detail about utility matrix and long tail. (13) BTL 6 Creating
Explain in detail about discovering features of documents. (13) BTL3 Applying
14
PART – C
A database has five transactions. Let min sup = 60% and min conf=80%
1
TID ITEMS
T100 Milk, Onion, Nuts, Kiwi, Egg, Yoghurt
T200 Dhal, Onion, Nuts, Kiwi, Egg, Yoghurt
T300 Milk, Apple, Kiwi, Egg BTL6 Creating
T400 Milk, Curd, Kiwi,Yoghurt
T500 Curd, Onion, Kiwi, Ice cream,Egg
Find all frequent item sets using Apriori method(15)
Narrate in detail about a model for Recommendation system.(15)
2 BTL 5 Evaluating
3 Illustrates with an example the application of the Apriori algorithm to a
relatively simple case that generalizes to those used in practice. Show how to
use the Apriori algorithm to generate frequent item sets and rules and to BTL6 Creating
evaluate and visualize the rules.(15)
Explain in detail about Hybrid and Knowledge based
4 recommendation.(15) BTL 4 Analyzing
How is sentiment analysis playing a major role in data mining? (13) BTL4 Analyzing
13
www.padeepz.net
www.padeepz.net
Explain in detail about Alon-Matias-Szegedy algorithm for second
14 BTL3 Applying
moments. (13)
PART – C
How does the Big Data Stream Analytics Framework (BDSAF)works
1 and explain with a neat architecture diagram (15) BTL6 Creating
Taking stock market preconditions as a case study, elaborate on the
2 Real-time Analytics Platform (RTAP). Present the assumptions mode. BTL6 Creating
(15)
www.padeepz.net
www.padeepz.net
Describe a pilot application for graph analytics BTL4 Analyzing
20
PART – B
List the classification of NoSQL Databases and explain about Key BTL3 Applying
1 Value Stores. (13)
2 Describe the system architecture and components of Hive and BTL1 Remembering
Hadoop(13)
3 What is NoSQL? What are the advantages of NoSQL? Explain the types
of NoSQL databases. (13) BTL1 Remembering
4 Explain about Graph databases and descriptive Statistics(13) BTL1 Remembering
5 Write short notes on BTL3 Applying
i. NoSQL Databases and its types(7)
ii. Illustrate in detail about Hive data manipulation, queries, data
definition and data types(6)
With suitable examples differentiate the applications, structure, BTL2 Understanding
6
working and usage of different NoSQL databases (13)
7 Differences between SQL Vs NoSQL explain it with suitable example. BTL2 Understanding
(13)
i. Explain Hive Architecture(6) BTL2 Understanding
8 ii. Write down the features of Hive(7)
9 Explain the types of NoSQL data stores in detail. (13) BTL3 Applying
Discuss in detail about the characteristics of NoSQL databases(13)
10 BTL4 Analyzing
Explain in detail about Market and Business drives for Big data BTL1 Remembering
12
Analytics. (13)
What is HBase? Give detailed note on features of HBASE(13) BTL4 Analyzing
13
i. What is the purpose of sharding? (6)
14 BTL6 Creating
ii. Explain the process of sharding in MongoDB. (7)
PART – C
1 Analyze the use of Hive. How does Hive interact with Hadoop explain
in detail.(15) BTL4 Analyzing
Formulate how big data analytics helps business people to increase
2 BTL5 Evaluating
their revenue. Discuss with any one real time application.(15)
3 Draw insights out of any one visualization tool.(15)
BTL5 Evaluating
Explain in detail about brief history of NoSQL. Explain in detail about
4 ACID vs. BASE.(15) BTL 6 Creating
www.padeepz.net