repository or cloud vendor proprietary data store.
Examples of big data sources are Amazon Redshift, HP Vertica, and MongoDB.

7. What is web data?
Web data is data that comes from a large and diverse number of sources. Web data is developed with the help of Semantic Web technologies such as RDF, OWL, and SPARQL. Web data also allows information to be shared over HTTP or through a SPARQL endpoint.

8. List the data analytics tools.
Trifacta, RapidMiner, Rattle GUI, QlikView, Weka, KNIME, Orange.

9. What are the challenges of big data?
Data challenges: volume, velocity, veracity, variety, data discovery and comprehensiveness, scalability.
Process challenges: capturing data; aligning data from different sources; transforming data into a form suitable for analysis; modeling data (mathematically, by simulation); understanding output, visualizing results, and handling display issues on mobile devices.
Management challenges: security, privacy, governance, ethical issues.

10. What are the trends in big data analytics?
1. Big data analytics in the cloud
2. Hadoop: the new enterprise data operating system
3. Big data lakes
4. More predictive analytics
5. SQL on Hadoop: faster, better
6. More, better NoSQL
7. Deep learning
8. In-memory analytics

Unit II Part A

1. What is Hadoop YARN?
Apache Hadoop YARN (Yet Another Resource Negotiator) is a cluster management technology. It is one of the key features of Hadoop 2, the second generation of the Apache Software Foundation's open source distributed processing framework. YARN was created to overcome performance issues in Hadoop's original design, in which MapReduce was responsible for both resource management and data processing. MapReduce Version 2 is a rewrite of the original MapReduce code that runs as an application on top of YARN.
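
To illustrate the point that MapReduce runs as one application among others on top of YARN, here is a minimal Python sketch. It assumes a JSON payload shaped like the response of the ResourceManager REST endpoint /ws/v1/cluster/apps. The sample payload, the application ids and names, and the helper apps_by_type are hypothetical and made up for illustration; no cluster is contacted.

```python
import json

# Hypothetical sample payload, shaped like the YARN ResourceManager REST
# response for GET /ws/v1/cluster/apps. All values are invented.
SAMPLE_RESPONSE = """
{"apps": {"app": [
  {"id": "application_0001", "name": "word count",
   "applicationType": "MAPREDUCE", "state": "RUNNING"},
  {"id": "application_0002", "name": "etl job",
   "applicationType": "SPARK", "state": "FINISHED"}
]}}
"""


def apps_by_type(payload: str) -> dict:
    """Group application ids by their applicationType field."""
    data = json.loads(payload)
    # "apps" may be null when the cluster has no applications.
    apps = (data.get("apps") or {}).get("app", [])
    grouped = {}
    for app in apps:
        grouped.setdefault(app["applicationType"], []).append(app["id"])
    return grouped


if __name__ == "__main__":
    # A MapReduce job appears as just another YARN application type.
    print(apps_by_type(SAMPLE_RESPONSE))
```

In this sketch the MapReduce job is listed next to an application of another type, which reflects the idea that YARN manages cluster resources while MapReduce is only one of the frameworks scheduled on it.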