Big Data Analytics QB Bank
Question Bank
MODULE - I
1. What is HDFS? Explain its importance and describe all the components of HDFS.
2. Explain, with a neat diagram, the necessity of NameNode High Availability.
3. Explain the necessity of checkpoint, backup and snapshot. How do they differ?
4. Explain the benefits of NameNode federation with a neat diagram.
5. Write any 8 general HDFS commands with examples.
6. Explain the Hadoop parallel MapReduce data flow with a neat diagram.
7. Explain the importance of fault tolerance and speculative execution.
8. Write a short note on HDFS safe mode and rack awareness.
9. Explain how to compile and run the Hadoop WordCount example, including MapReduce debugging.
10. Write an HDFS Java program that reads from and writes to HDFS.
11. Write a MapReduce Java word count program and explain its important functions.
MODULE - II
1. What is Pig? Explain its usage modes.
2. Explain the importance of Hive in Hadoop.
3. Explain how data is imported and exported in Hadoop using Sqoop, with a diagram.
4. What is the difference between Sqoop version 1 and version 2?
5. Explain how Flume agents are used to acquire data streams in Hadoop, with all
configurations.
6. How does Hadoop manage workflows using Oozie? Explain with a diagram.
7. What is HBase? Explain its features.
8. Explain the YARN architecture with two clients, with a diagram.
9. Explain the Hadoop ecosystem with a diagram.
10. How is Spark different from MapReduce?
11. Explain how Flume and Storm are different.
12. Describe the different views available in Ambari.
13. How are Hadoop services managed using Ambari?
14. How are Hadoop properties changed using Ambari?
15. Explain basic Hadoop YARN administration.
16. Explain basic HDFS administration.
17. Explain how to configure the NFSv3 gateway to HDFS.
MODULE - III
1. Define business intelligence. Explain the BDMI cycle with a neat diagram.
2. How does business intelligence play a major role in decision making? Explain its
types.
3. List and explain the tools available for business intelligence.
4. List the skills required for business intelligence.
MODULE - IV
1. How are decision trees used to solve a problem? Explain with an example.
2. Compare decision tree with table lookup.
3. Write and explain the decision tree algorithm with its key elements.
4. Compare the most popular decision tree algorithms.
5. What are the key steps for regression? Explain its correlation and relationships.
6. With an example, explain linear regression statistics and coefficients.
7. With a neat diagram, explain the model for a single ANN and a multi-layer ANN.
8. List and explain the steps for developing an ANN.
9. List the advantages and disadvantages of ANN.
10. Define Clustering analysis. List and explain its applications.
11. Define Cluster. Explain how to represent a cluster.
12. Write a short note on Clustering technique with a pseudo-code.
13. Explain the K-means algorithm for clustering with an example.
14. List the advantages and disadvantages of the K-means algorithm.
15. Define association rule mining and explain its business applications.
16. Write and explain the Apriori algorithm with an example.
MODULE - V
1. Define text mining. List and explain its applications.
2. Explain the process of text mining with neat diagram.
3. Explain the use of the term-document matrix in text mining, with an example.
4. Compare text mining and data mining, and highlight the practices
for text mining.
5. Explain the Naïve Bayes model with an example.
6. Discuss the advantages and disadvantages of the Naïve Bayes model.
7. Explain the SVM model with appropriate diagrams.
8. Explain the use of the kernel method in SVM.
9. List the advantages and disadvantages of SVM.
10. Define web mining. List and explain its characteristics with a neat diagram.
11. How is web content mined? Explain with the structure (2 components).
12. With a neat diagram, explain web usage mining.
13. Write a short note on web mining algorithms.
14. Define social network analysis. List and explain its applications.
15. What are the topologies used in SNA? Explain with a neat diagram.
16. Explain the techniques and algorithms used for SNA with an example.
17. Explain the PageRank algorithm.
18. What is the difference between SNA and traditional data analysis? Explain the practical
considerations for SNA.