Big Data Analytics (15CS82) – Question Bank Santhosh Kumar D K

Question Bank
MODULE - I
1. What is HDFS? Explain its importance and all of its components.
2. Explain, with a neat diagram, the necessity of NameNode High Availability.
3. Explain the necessity of checkpoint, backup and snapshot. How do they differ?
4. Explain the benefits of NameNode federation with a neat diagram.
5. Write any 8 general HDFS commands with examples.
6. Explain the Hadoop parallel MapReduce data flow with a neat diagram.
7. Explain the importance of fault tolerance and speculative execution.
8. Write a short note on HDFS safe mode and rack awareness.
9. Explain compiling and running the Hadoop word count example, with MapReduce debugging.
10. Write an HDFS Java program that reads from and writes to HDFS.
11. Write a MapReduce Java word count program and explain its important functions.

MODULE - II
1. What is Pig? Explain its usage modes.
2. Explain the importance of Hive in Hadoop.
3. Explain how data is imported and exported in Hadoop using Sqoop, with a diagram.
4. Explain the differences between Sqoop V1 and V2.
5. Explain how Flume agents are used to acquire data streams in Hadoop, with all
configurations.
6. How does Hadoop manage workflows using Oozie? Explain with a diagram.
7. What is HBase? Explain its features.
8. Explain the YARN architecture with two clients, with a diagram.
9. Explain the Hadoop ecosystem with a diagram.
10. How is Spark different from MapReduce?
11. Explain how Flume and Storm are different.
12. Explain the different views available in Ambari.
13. How are Hadoop services managed using Ambari?
14. How are Hadoop properties changed using Ambari?
15. Explain basic Hadoop YARN administration.
16. Explain basic HDFS administration.
17. Explain how to configure the NFSv3 gateway to HDFS.

MODULE – III
1. Define business intelligence. Explain the BIDM cycle with a neat diagram.
2. How does business intelligence play a major role in decision making? Explain its
types.
3. List and explain the tools available for business intelligence.
4. List the skills required for business intelligence.

Dept. of CS&E, Canara Engineering College Page 1


5. How does business intelligence play a major role in customer relationship
management applications?
6. Explain how business intelligence is applied to health care and wellness
applications.
7. How do retail industries have an impact on BI?
8. List and explain the key factors imposed by BI on banking, financial services, and
insurance.
9. How is BI useful in telecom services?
10. Define DW. List and explain the design constraints for DW.
11. Compare the development approaches of DW.
12. Explain DW architecture with a neat diagram.
13. Define ETL cycle. Explain with neat diagram.
14. Define Data Mining. Explain how to gather and select data for mining.
15. List and explain the considerations for data cleansing and preparation.
16. Explain with a neat diagram how to evaluate DM results.
17. List and explain the DM techniques available.
18. Compare the different available DM platforms.
19. Explain the CRISP-DM cycle with a diagram.
20. List the myths about DM.
21. Explain the common DM techniques.
22. Define data visualization.
23. Explain how to excel in visualization.
24. List and explain the types of charts available for data visualization.
25. List and explain data visualization tips.

MODULE – IV
1. How are decision trees used to solve a problem? Explain with an example.
2. Compare decision trees with table lookup.
3. Write and explain the decision tree algorithm with its key elements.
4. Compare the most popular decision tree algorithms.
5. What are the key steps for regression? Explain its correlation and relationships.
6. With an example, explain linear regression statistics and coefficients.
7. With a neat diagram, explain the model for a single ANN and a multi-layer ANN.
8. List and explain the steps for developing an ANN.
9. List the advantages and disadvantages of ANN.
10. Define Clustering analysis. List and explain its applications.
11. Define Cluster. Explain how to represent a cluster.
12. Write a short note on Clustering technique with a pseudo-code.
13. Explain K-means algorithm for clustering with an example.
14. List the advantages and disadvantages of K-means algorithm.
15. Define Association rule mining and explain business applications of it.
16. Write and explain the Apriori algorithm with an example.

17. How is an association rule created? Explain with an example.

MODULE – V
1. Define text mining. List and explain its applications.
2. Explain the process of text mining with a neat diagram.
3. Explain the use of the term-document matrix in text mining, with an example.
4. Compare text mining and data mining, and highlight the practices
for text mining.
5. Explain the Naïve Bayes model with an example.
6. Discuss the advantages and disadvantages of the Naïve Bayes model.
7. Explain the SVM model with respective diagrams.
8. Explain the use of the kernel method in SVM.
9. List the advantages and disadvantages of SVM.
10. Define web mining. List and explain its characteristics with a neat diagram.
11. How are web contents mined? Explain with the structure (2 components).
12. With a neat diagram, explain web usage mining.
13. Write a short note on web mining algorithms.
14. Define social network analysis. List and explain its applications.
15. What are the topologies used in SNA? Explain with neat diagram.
16. Explain the techniques and algorithms used for SNA with an example.
17. Explain the PageRank algorithm.
18. Differentiate between SNA and traditional data analysis. Explain the practical
considerations for SNA.
