To understand and apply the fundamental concepts of Big Data Technology and YARN
Questions:
1) The types of NoSQL databases are key-value and schema-less.
2) Schema-less databases are column-oriented (Cassandra), document-oriented (CouchDB), and graph (Neo4j).
3) Auto-sharding automatically spreads data across an arbitrary number of servers.
4) Some NoSQL databases support strong consistency, for example MongoDB.
5) JSON stands for JavaScript Object Notation.
6) Hadoop supports structured, semi-structured, and unstructured data formats.
7) Hadoop 1.0 supports only batch processing.
8) YARN stands for Yet Another Resource Negotiator.
9) One of the core aspects of Hadoop is MapReduce.
10) The master node is also known as the NameNode.
11) Hadoop has a distributed master/slave architecture.
12) Clickstream data helps you understand the purchasing behavior of customers.
13) Business analysts can use Apache Pig or Apache Hive for website analysis.
Big Data and Analytics- 2
14) The MapReduce master decides and schedules computation tasks on the slave nodes.
15) Mahout is a scalable machine learning and data mining library.
16) The expansion of BASE is Basically Available, Soft state, Eventual consistency.
17) MongoDB is consistent and partition tolerant (CP under the CAP theorem).
18) NewSQL is a robust database that supports ACID properties.
19) Linux is the official development and production platform for Hadoop.
20) JobTracker and TaskTracker are the daemons associated with MapReduce programming.
21) Pig is a data flow system for Hadoop, and it uses Pig Latin to specify data flows.
22) An application is a job submitted to the framework.
23) A block report contains a list of all blocks on a DataNode.
24) The DataNode is responsible for file read/write operations.
25) The NameNode is a single point of failure in a Hadoop cluster.
26) Hadoop 1.0 has two main parts: a data storage framework and a data processing framework.
27) There is a single TaskTracker per slave node.
28) A typical block size used by HDFS is 64 MB (the default in Hadoop 1.x; Hadoop 2.x defaults to 128 MB).
29) The key distinctions of Hadoop are that it is accessible, robust, and scalable.
30) Under SQL we have query statements.
31) Under MapReduce we have scripts and code.
31) A compiler translates Pig Latin into MapReduce programs.
32) HBase is used to store billions of rows and millions of columns.
33) The famous example of MapReduce programming is Word Count.
34) To create a directory in HDFS, use the hadoop fs -mkdir command followed by the directory path.
35) To get a complete list of directories and files in HDFS, use the hadoop fs -ls -R / command.
36) The global ResourceManager's main responsibility is to distribute resources among the various applications in the system.
37) The NodeManager's responsibility is launching the application containers for application execution.
38) Hadoop 2.0 is based on the YARN architecture.
39) RDBMS supports the structured data format.
40)File is used for updating Mapreduce setting used mapred-