Super Important Questions for BDA-18CS72

By The TIE review team GAT, RNSIT, KSIT, JSSATE, RRCE, BNMIT

Module-1

1. What is Big Data? Mention its characteristics and its classification.
2. With a neat diagram explain big data architecture design
3. With a neat diagram, explain data store export to cloud and cloud services
4. Discuss various case studies and applications of Big data
5. Define: (i) Hadoop (ii) Mesos (iii) SQL and NoSQL (with features) (iv) DDBMS (v) In-memory
column and row format data
6. Explain various techniques for data pre-processing
7. Differentiate between distributed computing, grid computing, and cluster computing

Module-2

1. Describe the Hadoop core components with a diagram
2. Explain, using a diagram, the Hadoop ecosystem components and their features
3. What is HDFS? List the different commands in HDFS
4. Write a note on YARN Based Execution Model
5. Describe the Hadoop physical organization. List the features of Hadoop.
6. Explain any three Essential Hadoop tools
7. Classify the MapReduce framework (also learn process placement) and programming model

Module-3

1. What is NoSQL? What are the advantages of NoSQL? Explain the types of NoSQL
databases
2. What is a NoSQL database? List the classification of NoSQL databases and explain
Key-value stores.
3. Explain the use of NoSQL to manage Big Data
4. List and explain NoSQL Data Architecture Patterns
5. What are the ways of handling big data problems? Explain
6. Write a note on (i) MongoDB database (ii) Cassandra database
7. Write a short note on (i) CAP theorem (ii) ACID properties (iii) Column stores

Module-4

1. What is Hive in Big Data? List the features of Hive. Also, explain the Hive architecture with
relevant diagrams
2. With the help of a neat diagram, briefly explain the MapReduce programming model. Also,
explain, with a neat diagram, the MapReduce process when a client submits a job
3. Write a short note on the Pig data model (Pig architecture) along with its features
4. Describe in detail Hive data manipulation, queries, data definition and data types.
5. Discuss composing MapReduce for calculations and algorithms
6. Write a note on HiveQL and its queries.
7. Explain how Hive interacts with Hadoop (VBQ)

Module-5

1. Demonstrate probability distributions and correlations
2. Describe the structure of the Web and analyse a Web graph
3. Explain K-Nearest Neighbour Regression Analysis. What are the different ways by which
KNN distance can be obtained? Explain.
4. Differentiate between (i) Linear and non-linear relationships
(ii) Standard deviation and standard error
5. Explain in detail web content mining and the different phases of web usage mining
6. Design the Text Analytics Process Pipeline and explain it in brief.
7. Illustrate the use of Text Mining, Web Mining, Web Content and Web Usage Analytics

All the best, Paper Tumbsi Banni
