Data Science Training Course Syllabus
Data Science-GIIT
Course Content
1. Introduction to Data Science
Learning Objectives - This module gives you an understanding of Big Data and of the roles and
responsibilities of a Data Scientist. You will learn how Hadoop and R are used in Big Data Analytics and
which methodologies are used in the analysis. The module covers common Big Data and non-Big Data
problems and the Data Science methods available to solve them. We will also work through a few
real-life data sets of the kind a Data Scientist encounters in day-to-day work, using R, Hadoop and
Mahout.
Topics - Introduction to Big Data; roles played by a Data Scientist; analyzing Big Data using Hadoop and
R; methodologies used for analysis; the architecture and methodologies used to solve Big Data
problems, for example data acquisition from various sources, data preparation, data transformation
using MapReduce (RMR), application of Machine Learning techniques and data visualization; problem
statements of a few data science problems that we will solve during the course.
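As a rough illustration of the map and reduce steps behind the data transformation topic above, here is a minimal word-count sketch in plain Python. The course itself uses R with RMR on Hadoop; Python is used here only for illustration, the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and no Hadoop or RMR API is involved.

```python
from collections import defaultdict


def map_phase(records):
    """Emit a (word, 1) pair for every word in every input record."""
    for record in records:
        for word in record.lower().split():
            yield word, 1


def shuffle(pairs):
    """Group values by key, as a MapReduce framework does between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the counts collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(records):
    """Run the three steps in sequence on an in-memory list of text records."""
    return reduce_phase(shuffle(map_phase(records)))


if __name__ == "__main__":
    print(word_count(["big data", "Big Data analytics"]))
```

On a real cluster the map and reduce functions run in parallel on different nodes and the framework performs the grouping step; the sketch only shows the shape of the computation.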
4. Machine Learning Techniques Using R Part-2
Learning Objectives - In this module, you will learn Unsupervised Machine Learning Techniques and the
implementation of different algorithms, for example, K-Means Clustering, TF-IDF and Cosine Similarity.
Topics - Understanding K-Means Clustering, understanding TF-IDF and Cosine Similarity and their
application to the Vector Space Model, implementing Association Rule Mining in R.
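To make the TF-IDF and Cosine Similarity topics above more concrete, here is a minimal sketch in plain Python (the course itself works in R). It assumes one common TF-IDF variant, term frequency divided by document length multiplied by log(N / document frequency); other weightings exist. The function names tf_idf_vectors and cosine_similarity are hypothetical.

```python
import math
from collections import Counter


def tf_idf_vectors(documents):
    """Return (vocabulary, vectors): one TF-IDF vector per document."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for an empty document
        vectors.append([
            (counts[term] / total) * math.log(n_docs / doc_freq[term])
            for term in vocabulary
        ])
    return vocabulary, vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    docs = ["hadoop mahout", "hadoop mahout", "r clustering"]
    _, vecs = tf_idf_vectors(docs)
    print(cosine_similarity(vecs[0], vecs[1]))  # identical documents
    print(cosine_similarity(vecs[0], vecs[2]))  # documents with no shared terms
```

In the Vector Space Model each document becomes one such vector, so documents with similar vocabulary point in similar directions and score close to 1, while documents with no terms in common score 0.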
8. Mahout Introduction and Algorithm Implementation
Learning Objectives - In this module, you will get to know the Apache Mahout Machine Learning Library
and gain insight into how parallel processing is achieved with the algorithms in Mahout.
Topics - Implementing Machine Learning Algorithms on larger Data Sets with Apache Mahout.
10. Project
Learning Objectives - In this module, you will learn various approaches to solving a Data Science problem
and how different technologies and tools (R, Hadoop, Mahout) work together in a typical Data Science
project.