Viva Questions

### Unit 1: Understanding Big Data


1. What are the key characteristics of big data?
2. Why is big data important in today's context?
3. Discuss the challenges posed by big data.
4. Can you classify big data analytics? Explain.
5. Provide examples of big data applications in healthcare, banking, advertising, and other industries.

### Unit 2: Hadoop Distributed File System (HDFS)


1. Explain the components of the Hadoop ecosystem.
2. Describe the architecture of Hadoop.
3. What are the key concepts of HDFS?
4. Differentiate between the NameNode and DataNodes in HDFS.
5. How do you read, write, and delete data in HDFS?

### Unit 3: NoSQL Data Management


1. What is NoSQL and why is it used?
2. Discuss the aggregate data models in NoSQL.
3. Explain the key-value and document data models.
4. What are graph databases and schema-less databases?
5. Describe the concepts of sharding and MapReduce in NoSQL.

### Unit 4: MapReduce and YARN


1. Explain the MapReduce paradigm in Hadoop.
2. Differentiate between Mapper and Reducer tasks.
3. What are the JobTracker and TaskTracker in Hadoop?
4. Discuss the components and functions of YARN.
5. How does YARN address the failures encountered in classic MapReduce?

### Unit 5: Pig and Hive


1. How do you install and run Pig? Provide an example.
2. Compare Pig with traditional databases.
3. What is Pig Latin and how is it used for data processing?
4. Explain the concepts of Hive and its shell.
5. Discuss the similarities and differences between HiveQL and traditional SQL.

These questions should cover the main topics outlined in your syllabus.
