CS3ED04 Big Data Engineering

* Required

Sqoop uses ______ to fetch data from RDBMS and stores that on HDFS. * (1 point)
○ [illegible]
○ Hive
○ BigTop
○ Map reduce

A data warehouse is which of the following? * (1 point)
○ Organized around important subject areas.
○ Can be updated by end users.
○ Contains numerous naming conventions and formats.
○ Contains only current data.

______ describes the data contained in the data warehouse. * (1 point)
○ Informational data
○ Operational data
○ Metadata
○ Relational data

What are the advantages of using Flume? * (1 point)
○ Using Apache Flume we can store the data into any of the centralized stores (HBase, HDFS).
○ When the rate of incoming data exceeds the rate at which data can be written to the destination, Flume acts as a mediator between data producers and the centralized stores and provides a steady flow of data between them.
○ Both a and b
○ None

Which of the following batch processing instances is NOT an example of Big Data batch processing? * (1 point)
○ Web crawling app
○ Trending topic analysis of tweets for last 15 minutes
○ Processing 10 GB sales data every [illegible] hours

Define Clustering. * (1 point)
Clustering involves the grouping of similar obj [truncated]

ETL stands for * (1 point)
○ Extract, Transmission, Loading
○ Extract, Transformation and Loading
○ Explore, Transformation, Loading
○ Export, Transformation, Loading

Briefly describe the meaning of a coordinator job in Oozie. *
Your answer

Which of the following is true for RDD? * (1 point)
○ RDDs are similar to the table in a relational database
○ We can operate Spark RDDs in parallel with a low-level API
○ It allows processing of a large amount of structured data
○ It has a built-in optimization engine

Point out the correct statement. * (1 point)
○ Version 1.5.2 is the eighth Flume release as an Apache top-level project
○ Flume 1.5.2 is production-ready software for integration with Hadoop
○ Flume is a distributed, reliable, and available service
○ All of the mentioned

A ______ is an operation on the stream that can transform the stream. * (1 point)
○ All of the mentioned
○ Sinks
○ Decorator
○ Source

Which of the following is required by K-means clustering? * (1 point)
○ Initial guess as to cluster centroids
○ Number of clusters
○ All of the mentioned
○ Defined distance metric

Which of the following statements is true? * (1 point)
○ The operational data are used as a source for the data warehouse
○ All of the above
○ The data warehouse is used as a source for the operational data
○ The data warehouse consists of data marts and operational data

Where does Sqoop ingest data from? * (1 point)
○ Hive
○ Linux file directory
○ [illegible]
○ MongoDB

Streaming is a better fit when you have *
○ Processing needs single pass over data
○ Processing needs multiple passes over data
○ Processing needs to access recent data (temporal locality)
○ Both [illegible]

For a multiclass classification problem, which algorithm is not the solution? * (1 point)
○ Naive Bayes
○ Decision Trees
○ Logistic Regression
○ Random Forests

Consider a scenario in which some insights into the data are more valuable shortly after it has happened and the value of the data diminishes very fast with time. Name the kind of data processing to be used. * (1 point)
Your answer

Define Flume Agent. * (1 point)
Your answer

Zoho Analytics is a Big Data tool for * (1 point)
○ Data transformation
○ Data streaming
○ All of the above
○ Data visualization

Which of the following is a tool of Apache Spark Machine Learning Library? * (1 point)
○ Persistence
○ Pipelines
○ All of the above
○ Utilities like linear algebra, statistics

Which of the following is false for Apache Spark? * (1 point)
○ It can be integrated with Hadoop and can process existing Hadoop HDFS data
○ Spark is 100 times faster than Bigdata Hadoop
○ Spark is an open source framework which is written in Java
○ It provides high-level API in Java, Python, R, Scala

Which of these is not an application of stream processing? * (1 point)
○ Smart patient care
○ None of these
○ Traffic monitoring
○ Monitoring a production line

Give the name of the core abstraction in Apache Storm. * (1 point)
Your answer

______ is a distributed real-time computation system for processing large volumes of high-velocity data. * (1 point)
○ Storm
○ [illegible]
○ Lucene
○ Kafka

Which of the following is true for Spark MLlib? * (1 point)
○ Provides an execution platform for all the Spark applications
○ Enables powerful interactive and data analytics application across live streaming data
○ All of the above
○ It is the scalable machine learning library which delivers efficiencies

Which among the following are core components of Flume? * (1 point)
○ Source
○ Event
○ Sink
○ All of the above

Clustering is a type of * (1 point)
○ Reinforcement learning
○ Supervised learning
○ Unsupervised learning
○ None of the above

By default the records from databases imported to HDFS by Sqoop are * (1 point)
○ Tab separated
○ Concatenated columns
○ Comma separated
○ Space separated

Define Regression. * (1 point)
Your answer

Point out the correct statement. * (1 point)
○ All of the mentioned
○ Storm integrates with the querying and database technologies you already use
○ Apache Storm is a free and open source distributed realtime computation system
○ A Storm topology consumes streams of data and processes those streams in arbitrarily complex ways

This form was created inside of Medi-Caps University, Indore.
