Dr. G.R. Damodaran College of Science


(Autonomous, affiliated to the Bharathiar University, recognized by the UGC)
Re-accredited at the 'A' Grade Level by the NAAC and ISO 9001:2008 Certified
CRISIL rated 'A' (TN) for MBA and MIB Programmes

II MCA [2018-2021 Batch]


Semester IV
Elective II: Big Data Analytics - 454U8
Multiple Choice Questions.

1. Facebook Tackles Big Data With _______ based on Hadoop


A. Project Prism
B. Prism
C. Project Data
D. Project Bid
ANSWER: A

2. What are the 3 Vs of Big Data?


A. Volume
B. Variety
C. Velocity
D. all the above
ANSWER: D

3. What license is Hadoop distributed under?


A. Apache License 2.0
B. Mozilla
C. Shareware
D. Middleware
ANSWER: A

4. Sun also has the Hadoop Live CD ________ project, which allows running a fully functional Hadoop
cluster using a live CD
A. OpenOffice.org
B. OpenSolaris
C. GNU
D. Linux
ANSWER: B

5. Which of the following genres does Hadoop produce?


A. Distributed file system
B. JAX-RS
C. Java Message Service
D. JSP
ANSWER: A

6. What was Hadoop written in?


A. C
B. C++
C. Java
D. JSP
ANSWER: C

7. Which of the following platforms does Hadoop run on?


A. Bare metal
B. Debian
C. Cross-platform
D. Unix-Like
ANSWER: C

8. Hadoop achieves reliability by replicating the data across multiple hosts, and hence does not require
________ storage on hosts.
A. RAID
B. ZFS
C. Operating System
D. DFS
ANSWER: A

9. Above the file systems comes the ________ engine, which consists of one JobTracker, to which client
applications submit MapReduce jobs.
A. MapReduce
B. Google
C. Functional Programming
D. Facebook
ANSWER: A

10. The Hadoop list includes the HBase database, the Apache Mahout ________ system, and matrix
operations.
A. Machine learning
B. Pattern recognition
C. Statistical classification
D. Artificial intelligence
ANSWER: A

11. ________ is a platform for constructing data flows for extract, transform, and load (ETL) processing
and analysis of large datasets.
A. Pig Latin
B. Oozie
C. Pig
D. Hive
ANSWER: C

12. Point out the correct statement


A. Hive is not a relational database, but a query engine that supports the parts of SQL specific to
querying data
B. Hive is a relational database with SQL support
C. Pig is a relational database with SQL support
D. All of the mentioned
ANSWER: A

13. _________ hides the limitations of Java behind a powerful and concise Clojure API for Cascading.
A. Scalding
B. HCatalog
C. Cascalog
D. All of the mentioned
ANSWER: C

14. Hive also supports custom extensions written in:


A. C
B. C++
C. C#
D. Java
ANSWER: D

15. Point out the wrong statement


A. Amazon Web Services Elastic MapReduce (EMR) is Amazon's packaged Hadoop offering
B. Elastic MapReduce (EMR) is Facebook's packaged Hadoop offering
C. Scalding is a Scala API on top of Cascading that removes most Java boilerplate
D. All of the mentioned
ANSWER: B

16. ________ is the most popular high-level Java API in Hadoop Ecosystem
A. Scalding
B. HCatalog
C. Cascalog
D. Cascading
ANSWER: D

17. ___________ is a general-purpose computing model and runtime system for distributed data analytics.
A. MapReduce
B. Drill
C. Oozie
D. None of the mentioned
ANSWER: A

18. The Pig Latin scripting language is not only a higher-level data flow language but also has operators
similar to:
A. JSON
B. XML
C. XSL
D. SQL
ANSWER: D

19. _______ jobs are optimized for scalability but not latency
A. MapReduce
B. Drill
C. Hive
D. Chukwa
ANSWER: C

20. ______ is a framework for performing remote procedure calls and data serialization.
A. MapReduce
B. Drill
C. Avro
D. Chukwa
ANSWER: C

21. As companies move past the experimental phase with Hadoop, many cite the need for additional
capabilities, including
A. Improved data storage and information retrieval
B. Improved extract, transform and load features for data integration
C. Improved data warehousing functionality
D. Improved security, workload management and SQL support
ANSWER: D

22. Point out the correct statement


A. Hadoop does need specialized hardware to process the data
B. Hadoop 2.0 allows live stream processing of real time data
C. In Hadoop programming framework output files are divided into lines or records
D. None of the mentioned
ANSWER: B

23. According to analysts, for what can traditional IT systems provide a foundation when they are
integrated with big data technologies like Hadoop?
A. Big data management and data mining
B. Data warehousing and business intelligence
C. Management of Hadoop clusters
D. Collecting and storing unstructured data
ANSWER: A

24. Hadoop is a framework that works with a variety of related tools. Common cohorts include
A. MapReduce, MySQL and Google Apps
B. MapReduce, Hive and HBase
C. MapReduce, Hummer and Iguana
D. MapReduce, Heron and Trumpet
ANSWER: B

25. Which of the following is not an input format in Hadoop?


A. TextInputFormat
B. ByteInputFormat
C. SequenceFileInputFormat
D. KeyValueTextInputFormat
ANSWER: B

26. What was Hadoop named after?


A. Creator Doug Cutting's favorite circus act
B. Cutting's high school rock band
C. The toy elephant of Cutting's son
D. A sound Cutting's laptop made during Hadoop development
ANSWER: C

27. All of the following accurately describe Hadoop, EXCEPT


A. Open source
B. Real-time
C. Java-based
D. Distributed computing approach
ANSWER: B

28. __________ can best be described as a programming model used to develop Hadoop-based
applications that can process massive amounts of data.
A. MapReduce
B. Mahout
C. Oozie
D. All of the mentioned
ANSWER: A

29. __________ has the world's largest Hadoop cluster.


A. Apple
B. Datamatics
C. Facebook
D. None of the mentioned
ANSWER: C

30. Facebook Tackles Big Data With _______ based on Hadoop.


A. Prism
B. Project Prism
C. Project Big
D. Project Data
ANSWER: B

31. A ________ node acts as the Slave and is responsible for executing a Task assigned to it by the
JobTracker.
A. MapReduce
B. Mapper
C. TaskTracker
D. JobTracker
ANSWER: C

32. Point out the correct statement


A. Map Task in MapReduce is performed using the Mapper() function
B. Reduce Task in MapReduce is performed using the Map() function
C. All of the mentioned
D. MapReduce tries to place the data and the compute as close as possible
ANSWER: D

33. ___________ part of the MapReduce is responsible for processing one or more chunks of data and
producing the output results.
A. Maptask
B. Mapper
C. Task execution
D. All of the mentioned
ANSWER: A

34. _________ function is responsible for consolidating the results produced by each of the Map()
functions/tasks.
A. Map
B. Reduce
C. Reducer
D. Reduced
ANSWER: B

35. Point out the wrong statement


A. A MapReduce job usually splits the input data-set into independent chunks which are processed by
the map tasks in a completely parallel manner
B. The MapReduce framework operates exclusively on <key, value> pairs
C. Applications typically implement the Mapper and Reducer interfaces to provide the map and reduce
methods
D. None of the mentioned
ANSWER: D

36. Although the Hadoop framework is implemented in Java, MapReduce applications need not be written
in
A. C
B. C++
C. Java
D. VB
ANSWER: C

37. ________ is a utility which allows users to create and run jobs with any executables as the mapper
and/or the reducer.
A. HadoopStrdata
B. Hadoop Streaming
C. Hadoop Stream
D. None of the mentioned
ANSWER: B
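
As a rough illustration of question 37: Hadoop Streaming feeds input records to the mapper on standard input and reads tab-separated key/value lines from its standard output, so the mapper and reducer can be any executable. The sketch below models a word-count mapper and reducer as plain Python functions over lines of text. The names `map_lines` and `reduce_lines` are hypothetical, and the in-process `sorted()` call only stands in for the framework's shuffle and sort.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' line per word, as a streaming mapper would print."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_lines(sorted_lines):
    """Reducer: sum the counts of consecutive lines that share a key.

    Assumes the input is already sorted by key, which is what the
    framework's shuffle/sort step hands to a streaming reducer.
    """
    pairs = (line.split("\t", 1) for line in sorted_lines)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{key}\t{total}"


# Local stand-in for the cluster: map, sort (shuffle), then reduce.
records = ["big data big cluster", "data node"]
result = list(reduce_lines(sorted(map_lines(records))))
print(result)
```

On a real cluster the two functions would typically live in separate scripts that read `sys.stdin` and are passed to the streaming jar through its `-mapper` and `-reducer` options.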

38. __________ maps input key/value pairs to a set of intermediate key/value pairs.
A. Mapper
B. Reducer
C. Both Mapper and Reducer
D. None of the mentioned
ANSWER: A

39. The number of maps is usually driven by the total size of


A. task
B. output
C. input
D. none
ANSWER: C

40. _________ is the default Partitioner for partitioning key space


A. HashPar
B. Partitioner
C. HashPartitioner
D. None of the mentioned
ANSWER: C
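
The idea behind question 40 can be sketched as follows. HashPartitioner picks a reducer by taking the key's hash code, masking off the sign bit, and reducing the result modulo the number of reduce tasks. The Python sketch below imitates this with a re-implementation of Java's `String.hashCode()` for ASCII strings. The helper names are hypothetical, and real Hadoop key types such as `Text` hash their bytes differently, so the partition numbers are illustrative only.

```python
INT_MAX = 0x7FFFFFFF  # same value as Java's Integer.MAX_VALUE


def java_string_hash(s):
    """Imitate Java's String.hashCode() for ASCII text: h = 31*h + ch, wrapped to 32 bits."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as a signed int, as Java would.
    return h - 0x100000000 if h >= 0x80000000 else h


def hash_partition(key, num_reduce_tasks):
    """Pick a reducer index: (hash & Integer.MAX_VALUE) % numReduceTasks."""
    return (java_string_hash(key) & INT_MAX) % num_reduce_tasks


keys = ["apple", "banana", "apple", "cherry"]
partitions = [hash_partition(k, 4) for k in keys]
print(partitions)
```

Because the partition depends only on the key, every occurrence of the same key lands on the same reducer, which is what lets a single reduce call see all values for that key.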

41. Mapper implementations are passed the JobConf for the job via the ________ method
A. JobConfigure.configure
B. JobConfigurable.configure
C. JobConfigurable.configureable
D. None of the mentioned
ANSWER: B

42. Point out the correct statement


A. Applications can use the Reporter to report progress
B. The Hadoop MapReduce framework spawns one map task for each InputSplit generated by the
InputFormat for the job
C. The intermediate, sorted outputs are always stored in a simple (key-len, key, value-len, value) format
D. All of the mentioned
ANSWER: D

43. Input to the _______ is the sorted output of the mappers.


A. Reducer
B. Mapper
C. Shuffle
D. All of the mentioned
ANSWER: A

44. The right number of reduces seems to be:


A. 0.65
B. 0.55
C. 0.95
D. 0.68
ANSWER: C
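
For context on question 44: the Hadoop MapReduce tutorial gives the guideline as 0.95 or 1.75 multiplied by (number of nodes × maximum containers per node). With 0.95, all reduces can launch at once and start fetching map outputs as the maps finish; with 1.75, faster nodes finish a first round of reduces and launch a second wave. The arithmetic can be sketched as below. The function name, the example cluster size, and the choice to round down are assumptions made for illustration.

```python
def suggested_reduces(nodes, max_containers_per_node, factor_percent=95):
    """Apply the 0.95 / 1.75 guideline using integer arithmetic.

    factor_percent is 95 or 175; the result is rounded down to a whole
    number of reduce tasks (rounding down is an assumption of this sketch).
    """
    if factor_percent not in (95, 175):
        raise ValueError("the guideline uses a factor of 0.95 or 1.75")
    return (nodes * max_containers_per_node * factor_percent) // 100


# Hypothetical cluster: 10 worker nodes with 8 containers each.
single_wave = suggested_reduces(10, 8)        # 0.95 * 80
two_waves = suggested_reduces(10, 8, 175)     # 1.75 * 80
print(single_wave, two_waves)
```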

45. Point out the wrong statement


A. Reducer has 2 primary phases
B. Increasing the number of reduces increases the framework overhead, but increases load balancing and
lowers the cost of failures
C. It is legal to set the number of reduce-tasks to zero if no reduction is desired
D. The framework groups Reducer inputs by keys (since different mappers may have output the same
key) in sort stage
ANSWER: A

46. The output of the _______ is not sorted in the MapReduce framework for Hadoop.
A. Mapper
B. Cascader
C. Scalding
D. None of the mentioned
ANSWER: D

47. Which of the following phases occur simultaneously?
A. Reduce and Sort
B. Shuffle and Sort
C. Shuffle and Map
D. All of the mentioned
ANSWER: B

48. Mapper and Reducer implementations can use the ________ to report progress or just indicate that they
are alive.
A. Partitioner
B. OutputCollector
C. Reporter
D. All of the mentioned
ANSWER: C

49. __________ is a generalization of the facility provided by the MapReduce framework to collect data
output by the Mapper or the Reducer
A. Partitioner
B. OutputCollector
C. Reporter
D. All of the mentioned
ANSWER: B

50. _________ is the primary interface for a user to describe a MapReduce job to the Hadoop framework
for execution.
A. Map Parameters
B. JobConf
C. MemoryConf
D. All of the mentioned
ANSWER: B

51. A ________ serves as the master and there is only one NameNode per cluster
A. Data Node
B. NameNode
C. Data block
D. Replication
ANSWER: B

52. Point out the correct statement


A. DataNode is the slave/worker node and holds the user data in the form of Data Blocks
B. Each incoming file is broken into 32 MB by default
C. Data blocks are replicated across different nodes in the cluster to ensure a low degree of fault
tolerance
D. None of the mentioned
ANSWER: A

53. HDFS works in a __________ fashion


A. master-worker
B. master-slave
C. worker/slave
D. All of the mentioned
ANSWER: A

54. ________ NameNode is used when the Primary NameNode goes down.
A. Rack
B. Data
C. Secondary
D. None
ANSWER: C

55. Point out the wrong statement


A. Replication Factor can be configured at a cluster level (Default is set to 3) and also at a file level
B. Block Report from each DataNode contains a list of all the blocks that are stored on that DataNode
C. User data is stored on the local file system of DataNodes
D. DataNode is aware of the files to which the blocks stored on it belong
ANSWER: D
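
Questions 52 and 55 touch on block size and replication factor. A minimal sketch of the arithmetic follows, assuming a 128 MB block size (the default in Hadoop 2.x and later; Hadoop 1.x used 64 MB) and the default replication factor of 3 mentioned above. The function name `hdfs_footprint` is hypothetical.

```python
MB = 1024 * 1024


def hdfs_footprint(file_size_bytes, block_size_bytes=128 * MB, replication=3):
    """Return (number of blocks, total block replicas, raw bytes stored).

    The last block only occupies as much space as the data it holds, so
    raw storage is file size times replication, not blocks times block size.
    """
    # Integer ceiling division; an empty file yields zero blocks.
    blocks = (file_size_bytes + block_size_bytes - 1) // block_size_bytes
    return blocks, blocks * replication, file_size_bytes * replication


# A 300 MB file with the assumed defaults.
blocks, replicas, raw_bytes = hdfs_footprint(300 * MB)
print(blocks, replicas, raw_bytes // MB)
```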

56. Which of the following scenarios may not be a good fit for HDFS?
A. HDFS is not suitable for scenarios requiring multiple/simultaneous writes to the same file
B. HDFS is suitable for storing data related to applications requiring low latency data access
C. HDFS is suitable for storing data related to applications requiring high latency data access
D. None of the mentioned
ANSWER: A

57. The need for data replication can arise in various scenarios like:
A. Replication Factor is changed
B. DataNode goes down
C. Data Blocks get corrupted
D. All of the mentioned
ANSWER: D

58. ________ is the slave/worker node and holds the user data in the form of Data Blocks
A. DataNode
B. NameNode
C. Data block
D. Replication
ANSWER: A

59. HDFS provides a command line interface called __________ used to interact with HDFS.
A. HDFS Shell
B. FS Shell
C. DFSA Shell
D. None
ANSWER: B

60. HDFS is implemented in _____________ programming language


A. C++
B. Java
C. Scala
D. None
ANSWER: B

61. ___________ is the world's most complete, tested, and popular distribution of Apache Hadoop and
related projects.
A. MDH
B. CDH
C. ADH
D. BDH
ANSWER: B

62. Point out the correct statement


A. Cloudera is also a sponsor of the Apache Software Foundation
B. CDH is 100% Apache-licensed open source and is the only Hadoop solution to offer unified batch
processing, interactive SQL, and interactive search, and role-based access controls
C. More enterprises have downloaded CDH than all other such distributions combined
D. All of the mentioned
ANSWER: D

63. Cloudera ___________ includes CDH and an annual subscription license (per node) to Cloudera
Manager and technical support.
A. Enterprise
B. Express
C. Standard
D. All the above
ANSWER: A

64. Cloudera Express includes CDH and a version of Cloudera ___________ lacking enterprise features
such as rolling upgrades and backup/disaster recovery
A. Enterprise
B. Express
C. Standard
D. Manager
ANSWER: D

65. Point out the wrong statement


A. CDH contains the main, core elements of Hadoop
B. In October 2012, Cloudera announced the Cloudera Impala project
C. CDH may be downloaded from Cloudera's website at no charge
D. None of the mentioned
ANSWER: D

66. Cloudera Enterprise comes in ___________ editions.


A. One
B. Two
C. Three
D. Four
ANSWER: C

67. __________ is an online NoSQL database developed by Cloudera.


A. HCatalog
B. HBase
C. Impala
D. Oozie
ANSWER: B

68. _______ is an open source set of libraries, tools, examples, and documentation engineered.
A. Kite
B. Kize
C. Ookie
D. All of the mentioned
ANSWER: A

69. To configure short-circuit local reads, you will need to enable ____________ on local Hadoop.
A. librayhadoop
B. libhadoop
C. libhad
D. hadoop
ANSWER: B

70. CDH processes and controls sensitive data and facilitates


A. multi-tenancy
B. flexibility
C. scalability
D. reusability
ANSWER: A

71. _______ can change the maximum number of cells of a column family
A. set
B. reset
C. alter
D. connect
ANSWER: C

72. Point out the correct statement


A. You can add a column family to a table using the method addColumn()
B. Using alter, you can also create a column family
C. Using disable-all, you can truncate a column family
D. None of the mentioned
ANSWER: A

73. Which of the following is not a table scope operator?


A. MEMSTORE_FLUSH
B. MEMSTORE_FLUSHSIZE
C. MAX_FILESIZE
D. All of the mentioned
ANSWER: A

74. You can delete a column family from a table using the method _________ of the HBaseAdmin class.
A. delColumn()
B. removeColumn()
C. deleteColumn()
D. All of the mentioned
ANSWER: C

75. Point out the wrong statement
A. To read data from an HBase table, use the get() method of the HTable class
B. You can retrieve data from the HBase table using the get() method of the HTable class
C. While retrieving data, you can get a single row by id, or get a set of rows by a set of row ids, or scan
an entire table or a subset of rows
D. None of the mentioned
ANSWER: D

76. __________ class adds HBase configuration files to its object.


A. Configuration
B. Collector
C. Component
D. None of the mentioned
ANSWER: A

77. The ________ class provides the getValue() method to read the values from its instance
A. Get
B. Result
C. Put
D. Value
ANSWER: B

78. ________ communicate with the client and handle data-related operations.
A. Master Server
B. Region Server
C. HTable
D. All of the mentioned
ANSWER: B

79. _________ is the main configuration file of HBase.


A. hbase.xml
B. hbase-site.xml
C. hbase-site-conf.xml
D. None of the mentioned
ANSWER: B

80. HBase uses the _______ File System to store its data
A. Hive
B. Impala
C. Hadoop
D. Scala
ANSWER: C

81. Which of the following is a principle of analytic graphics?


A. Don't plot more than two variables at a time
B. Make judicious use of color in your scatterplots
C. Show box plots (univariate summaries)
D. Show causality, mechanism, explanation
ANSWER: D

82. What is the role of exploratory graphs in data analysis?


A. They are made for formal presentations.
B. They are typically made very quickly.
C. Axes, legends, and other details are clean and exactly detailed.
D. They are used in place of formal modeling.
ANSWER: B

83. Which of the following is true about the base plotting system?
A. Margins and spacings are adjusted automatically depending on the type of plot and the data
B. Plots are typically created with a single function call
C. Plots are created and annotated with separate functions
D. The system is most useful for conditioning plots
ANSWER: C

84. Which of the following is an example of a valid graphics device in R?


A. A socket connection
B. A Microsoft Word document
C. A PDF file
D. A file folder
ANSWER: C

85. Which of the following is an example of a vector graphics device in R?


A. JPEG
B. GIF
C. PNG
D. SVG
ANSWER: D

86. Bitmapped file formats can be most useful for


A. Plots that may need to be resized
B. Plots that require animation or interactivity
C. Plots that are not scaled to a specific resolution
D. Scatterplots with many many points
ANSWER: D

87. Which of the following functions is typically used to add elements to a plot in the base graphics system?
A. lines()
B. hist()
C. plot()
D. boxplot()
ANSWER: A

88. Which function opens the screen graphics device for the Mac?
A. bitmap()
B. quartz()
C. pdf()
D. png()
ANSWER: B

89. What does the 'pch' option to par() control?


A. the size of the plotting symbol in a scatterplot
B. the line width in the base graphics system
C. the orientation of the axis labels on the plot
D. the plotting symbol/character in the base graphics system
ANSWER: D

90. MapReduce was devised by ______________


A. Apple
B. Google
C. Facebook
D. Samsung
ANSWER: B

91. _____ programming language is a dialect of S.


A. B
B. C
C. R
D. K
ANSWER: C

92. Point out the WRONG statement


A. Early versions of the S language contain functions for statistical modeling
B. The book Programming with Data by John Chambers documents S version of the language
C. In 1993 Bell Labs gave StatSci (later Insightful Corp.) an exclusive license to develop and sell the S
language
D. All of the mentioned
ANSWER: A

93. In 2004, ________ purchased the S language from Lucent for $2 million
A. Insightful
B. Amazon
C. IBM
D. All the above
ANSWER: A

94. In 1991, R was created by Ross Ihaka and Robert Gentleman in the Department of Statistics at the
University of _________.
A. Johns Hopkins
B. California
C. Harvard
D. Auckland
ANSWER: D

95. Point out the wrong statement


A. R is a language for data analysis and graphics
B. K is a language for statistical modelling and graphics
C. One key limitation of the S language was that it was only available in a commercial package, S-PLUS
D. None of the mentioned
ANSWER: B

96. Finally, in _________ R version 1.0.0 was released to the public.


A. 2000
B. 2005
C. 2010
D. 2012
ANSWER: A

97. R is technically much closer to the Scheme language than it is to the original _____ language.
A. B
B. C
C. R
D. S
ANSWER: D

98. The R-help and _____ mailing lists have been highly active for over a decade now
A. R-mail
B. R-devel
C. R-dev
D. R-d
ANSWER: B

99. Which of the following describes the R language?


A. Free
B. Paid
C. Available for free trial only
D. None of the above
ANSWER: A

100. The copyright for the primary source code for R is held by the ______ Foundation.
A. A
B. C
C. C++
D. R
ANSWER: D

101. The primary R system is available from the ______


A. CRWO
B. CRAN
C. CZOO
D. GNU
ANSWER: B

102. Point out the wrong statement


A. Key feature of R was that its syntax is very similar to S
B. R has been reported to be running on modern tablets, phones, PDAs, and game consoles
C. R runs only on Windows computing platform and operating system
D. None of the mentioned
ANSWER: C

103. R functionality is divided into a number of ________


A. Packages
B. Functions
C. Domains
D. Sub Domains
ANSWER: A

104. The _________ R system contains, among other things, the base package which is required to run R
A. root
B. child
C. base
D. none of the above
ANSWER: C

105. Point out the wrong statement


A. One nice feature that R shares with many popular open source projects is frequent releases
B. R has sophisticated graphics capabilities
C. S's base graphics system allows for very fine control over essentially every aspect of a plot or graph.
D. All the above
ANSWER: C

106. Which of the following is a base package for the R language?


A. util
B. lang
C. tools
D. all the above
ANSWER: C

107. Which of the following is a "Recommended" package in R?


A. util
B. lang
C. stats
D. spatial
ANSWER: D

108. How many packages exist in the R language for statistics?


A. 2000
B. 3000
C. 4000
D. all the above
ANSWER: C

109. Advanced users can write ___ code to manipulate R objects directly.
A. C
B. C++
C. Java
D. None of the mentioned
ANSWER: A

110. Which of the following is used for statistical analysis in the R language?


A. RStudio
B. Studio
C. Heck
D. None of the mentioned
ANSWER: A

111. The most convenient way to use R is at a graphics workstation running a ________ system.
A. windowing
B. running
C. interfacing
D. All of the mentioned
ANSWER: A

112. Point out the wrong statement


A. Setting up a workstation to take full advantage of the customizable features of R is a straightforward
thing
B. q() is used to quit the R program
C. R has an inbuilt help facility similar to the man facility of UNIX
D. None of the mentioned
ANSWER: D

113. Which of the following is the default R prompt in a UNIX environment?


A. >
B. >>
C. <<
D. <
ANSWER: A

114. Point out the wrong statement


A. Windows versions of R have other optional help system also
B. The help.search command (alternatively ??) allows searching for help in various ways
C. R is case insensitive as are most UNIX based packages, so A and a are different symbols and would
refer to different variables
D. All of the mentioned
ANSWER: C

115. Which of the following statements is an alternative to ?solve


A. help(solve)
B. man(solve)
C. hel(solve)
D. All of the mentioned
ANSWER: A

116. Elementary commands in R consist of either _______ or assignments.


A. utilstats
B. language
C. expressions
D. None of the mentioned
ANSWER: C

117. If a command is not complete at the end of a line, R will give a different prompt, by default it is:
A. *
B. -
C. +
D. All the above
ANSWER: C

118. Command lines entered at the console are limited to about ________ bytes
A. 3000
B. 4095
C. 5000
D. None
ANSWER: B

119. The _____ text editor provides more general support mechanisms via ESS for working interactively with
R.
A. EAC
B. Emacs
C. Shell
D. None
ANSWER: B

120. What would be the result of the following R code?
> x <- 1
> print(x)
A. 1
B. 2
C. 3
D. 4
ANSWER: A

121. Point out the wrong statement


A. The grammar of the language determines whether an expression is complete or not
B. The <- symbol is the assignment operator in R
C. The ## character indicates a comment
D. None of the mentioned
ANSWER: C

122. ___________ is the world's largest Hadoop Cluster.


A. Apple
B. Datamatics
C. Facebook
D. None of the above
ANSWER: C

123. Files containing R scripts end with the extension:


A. S
B. R
C. B
D. P
ANSWER: B

124. Point out the wrong statement


A. : operator is used to create integer sequences
B. The numbers in the square brackets are part of the vector itself
C. There is a difference between the actual R object and the manner in which that R object is printed to
the console
D. The numbers in the curly brackets are part of the vector itself
ANSWER: B

125. If commands are stored in an external file, say commands.R in the working directory work, they may
be executed at any time in an R session with the command:
A. source("commands.R")
B. exec("commands.R")
C. execute("commands.R")
D. All of the mentioned
ANSWER: A

126. _______ will divert all subsequent output from the console to an external file.
A. sink
B. div
C. dip
D. exp
ANSWER: A

127. The entities that R creates and manipulates are known as ________
A. task
B. objects
C. function
D. expression
ANSWER: B

128. Which of the following can be used to display the names of (most of) the objects which are currently
stored within R?
A. object()
B. objects()
C. list()
D. none of the above
ANSWER: B

129. The collection of objects currently stored in R is called the:


A. package
B. workspace
C. list
D. array
ANSWER: B

130. What will be the output of the following code snippet?
> paste("a", "b", se = ":")
A. a+b
B. a-b
C. ab
D. none
ANSWER: D

131. Point out the correct statement


A. In R, a function is an object which has the mode function
B. R interpreter is able to pass control to the function, along with arguments that may be necessary for
the function to accomplish the actions that are desired
C. Functions are also often written when code must be shared with others or the public
D. All of the mentioned
ANSWER: D

132. The __________ function returns a list of all the formal arguments of a function
A. formals()
B. funct()
C. formal()
D. All of the mentioned
ANSWER: A

133. Point out the wrong statement


A. A formal argument can be a symbol, a statement of the form 'symbol = expression', or the special
formal argument '...'
B. The first component of the function declaration is the keyword function
C. The value returned by the call to function is not a function
D. None of the mentioned
ANSWER: C

134. You can check to see whether an R object is NULL with the _________ function.
A. is.nullobj()
B. null()
C. is.null()
D. obj.null()
ANSWER: C

135. Which of the following code snippets will print NULL?


A. > args(pastebin)
B. > args(paste)
C. > arg(paste)
D. > argc(paste)
ANSWER: B

136. What are the main components of Big Data?


A. MapReduce
B. HDFS
C. YARN
D. all the above
ANSWER: D

137. What are the different features of Big Data Analytics?


A. Open Source
B. Data Recovery
C. Scalability
D. all of the above
ANSWER: D

138. For YARN, the ___________ Manager UI provides host and port information.
A. Data Node
B. NameNode
C. Resource
D. Replication
ANSWER: C

139. Point out the correct statement
A. The Hadoop framework publishes the job flow status to an internally running web server on the
master nodes of the Hadoop cluster
B. Each incoming file is broken into 32 MB by default
C. Data blocks are replicated across different nodes in the cluster to ensure a low degree of fault
tolerance
D. None of the mentioned
ANSWER: A

140. For ________, the HBase Master UI provides information about the HBase Master uptime.
A. Oozie
B. HBase
C. Kafka
D. Afka
ANSWER: B

141. __________ is a standard Java API for monitoring and managing applications.
A. JVM
B. JVN
C. JMX
D. JMY
ANSWER: C

142. __________ Manager's Service feature monitors dozens of service health and performance metrics
about the services and role instances running on your cluster.
A. Microsoft
B. Cloudera
C. Amazon
D. None of the above
ANSWER: B

143. The IBM _____________ Platform provides all the foundational building blocks of trusted
information, including data integration, data warehousing, master data management, big data and
information governance.
A. InfoStream
B. InfoSphere
C. InfoSurface
D. InfoSurface
ANSWER: B

144. Point out the correct statement


A. IBM InfoSphere DataStage is an ETL tool
B. IBM InfoSphere DataStage is an ETL tool
C. InfoSphere uses a graphical notation to construct data integration solutions
D. All of the mentioned
ANSWER: D

145. InfoSphere DataStage has __________ levels of Parallelism


A. 1
B. 2
C. 3
D. 4
ANSWER: C

146. InfoSphere DataStage uses a client/server design where jobs are created and administered via a
________ client against central repository on a server
A. Ubuntu
B. Windows
C. Debian
D. Solaris
ANSWER: B

147. ___________ is used for processing complex transactions and messages.


A. PS
B. Server Edition
C. MVS Edition
D. TX
ANSWER: D

148. DataStage originated at __________, a company that developed two notable products: UniVerse
database and the DataStage ETL tool.
A. VMark
B. Vzen
C. Hatez
D. SMark
ANSWER: A

149. DataStage RTI is a real time integration pack for:


A. STD
B. ISD
C. EXD
D. FSD
ANSWER: B

150. NameNode is monitored and upgraded in a __________ transition.


A. safemode
B. securemode
C. servicemode
D. servicemonitor
ANSWER: B

Staff Name
Suguna M.