3408 Shiv Big Data Presentation
HADOOP ECO-SYSTEM
3408
SHIVPRAKASH VISHWAKARMA
INTRODUCTION
In simple words, a pipeline in data science is “a set of actions which
changes the raw (and confusing) data from various sources (surveys,
knowledge of data science. dabl is inspired by the Scikit-learn library and it tries
HADOOP ECO-SYSTEM
and solutions.
• There are four major elements of Hadoop, i.e. HDFS,