Day 1

R Intro and Installation:
Downloading and Installing R, Getting Help on a function, Viewing Documentation,
General issues in R, Package Management

Data Input/Output and Munging in R
Data Types, Subsetting, Writing data, Reading tabular data files, Reading from csv
files, Creating a vector and vector operations, Initializing a data frame, Control
structures, Selecting data frame columns by position and name, Changing directories,
Redirecting R output, Basics of SQL, RODBC and DBI packages, Performing queries,
Advanced Data Handling, Combining and restructuring data frames

Data Visualization in R
Creating a bar chart, dot plot, Creating a scatter plot, pie chart, Creating a histogram
and box plot, Other plotting functions, Plotting with base graphics, Plotting with
Lattice graphics, Plotting and coloring in R, Trellis, Lattice, ggplot2, ggplot, googleVis

Functions and Programming in R
Flow Control: For loop, If condition, While conditions and repeat loop, Debugging
tools, Concatenation of Data, Combining Vars, cbind, rbind, sapply, apply, tapply

Univariate Statistics
Summarizing data, Measures of central tendency, Measures of variability

Hypothesis Testing and ANOVA
Introducing statistical inference, Estimators and confidence intervals, Central Limit
theorem, Parametric and non-parametric statistical tests, Analysis of variance
(ANOVA), Case studies

Predictive Modeling
Correlation, Simple linear regression, Multiple linear regression, Model diagnostics
and validation, Logistic regression, Hierarchical Clustering, K-Means Clustering,
PCA for Dimensionality Reduction, Case study

Logistic Regression
Moving from linear to logistic, Model assumptions and Odds ratio, Logistic
regression, Real-world case study, Model assessment and gains table, ROC curve
and KS statistic

Machine Learning and Advanced Data Science
Recommendation Engine, Clustering, Hierarchical clustering, K-means clustering,
Discriminant Analysis, Canonical Correspondence Analysis, Association Rules,
Classification Tree, Binary and Regression Tree, Outlier Detection, Location-based
analysis, Twitter and Facebook feeds, Word Cloud and other network analysis, Social
Network Sentiment Analysis, Social network based location analysis

Time Series Analysis
Moving Average based forecasting, ARIMA, ARMA models, Exponential Smoothing,
Winters' method

Big Data in R
Hadoop connection to R and analytics under RHadoop, MapReduce in R: rmr2 package,
RHive

R to Tableau connection and modeling

R Shiny and knitr introduction

Day 2

1. Using Python for Data Science and Visualization

2. Intro to Python

3. Libraries and packages for data analysis – pandas, NumPy, SciPy, scikit-learn

4. Performing basic data analysis
Regression Analysis – Linear and Logistic
Time Series Modeling
Supervised and Unsupervised Learning
Decision Trees
Cluster Analysis – K Nearest Neighbors and K Means
Spam Filtering
Bayesian Analysis – Naive Bayes
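
As a taste of the "Regression Analysis – Linear" topic listed above, the following is a minimal sketch of an ordinary least-squares fit written in plain Python, with no third-party libraries. The function name `fit_simple_linear_regression` and the sample data are hypothetical and chosen only for illustration. The course itself presumably uses the libraries named in item 3.

```python
# Illustrative sketch: ordinary least-squares fit of y = intercept + slope * x.
# The function name and the sample data are hypothetical, not course material.

def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) that minimise the sum of squared errors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data that lies exactly on the line y = 1 + 2x.
hours = [1.0, 2.0, 3.0, 4.0]
scores = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_simple_linear_regression(hours, scores)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

The closed-form solution is used here because it keeps the arithmetic visible. With real data, a library routine would normally be preferred for numerical robustness and for diagnostics.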


Graphics with Matplotlib