
learn data science by building

ACADEMY
SYLLABUS
Menara Kadin Lvl. 4 (Parking Building)
Jl. H.R. Rasuna Said Blok X-5
No. Kav 3, RT.1/RW.2, Kuningan Timur,
Kecamatan Setia Budi,
DKI Jakarta 12950

community@algorit.ma
www.algorit.ma
+62 877-7835-3007
teamalgoritma
Team Algoritma
Our Journey

Modern workforces are under-equipped to solve the kind of problems we face as a collective in the digital era. While the complexity and enormity of data have grown explosively in the past few years, the tools we use to solve these problems have seen only incremental improvements every decade or so.

Data science, the field dedicated to the study of insights extraction, statistical modeling, and artificial intelligence, can empower professionals in profound ways and it will continue to play an increasingly integral role in our interaction with technology.

We were founded with the vision of democratizing data science skills and equipping every professional with a set of core skills across the various domains of data visualization, regression, data modeling, machine learning, and statistical programming literacy. Whether you’re a marketing executive, a business analyst, an entrepreneur, or a financial market professional, we want to help you be a rockstar and a highly effective executive.

Algoritma’s pedagogical excellence is recognized regionally, an achievement backed by our illustrious track record in year-long corporate consulting projects and other shorter consultative, customized training engagements.

June 2017
- Algoritma was founded in Jakarta by Samuel Chan and Nayoko Wicaksono
- Our first workshop: Kickstart Your Data Science Career at Rework, Setiabudi

October 2017
- First Corporate Training, PT Salam Pacific Indonesia Lines

January 2018
- Algoritma Data Science Academy launched for the very first time. Students are taught data science using the R programming language in 3 months

May 2018
- Student Track was initiated for the first time. Algoritma gave scholarships to chosen students from 6 universities (UI, UGM, ITB, Telkom University, Prasetya Mulya, and Bina Nusantara University) to learn data science every Saturday in the Algoritma Weekend Track program

June 2018
- Algoritma held the very first Data Career Day. 6 chosen alumni from the first and second cohorts (Agon and Bifrost) showcased their projects to the Hiring Partners and Algoritma’s corporate network

July 2019
- Data Analytics Specialization is introduced to the public. Students will learn how to use Python and SQL for Data Analysis in one month, using real business data donated by Algoritma’s Corporate Network

Our Core Products

Data Science Academy
Non-stop, highly practical learning. Takes a learn-by-building approach in 14 modules and 2 capstone projects to become an employable data scientist.

ACADEMY REGULAR
Bootcamp using R in just 11 weeks

COURSES:
- Data Visualization Specialization
- Machine Learning Specialization
- 2 Capstone Projects
- Data Science Communications & Presentation Workshop
- Demo Day Coaching

DURATION & FREQUENCY:
4-5 days a week for 3 months
Day Class: 13:00 – 16:00
Night Class: 18:00 – 21:00

ACADEMY FULL-STACK
Bootcamp using R, Python, & SQL in just 15 weeks

COURSES:
- Data Visualization Specialization
- Machine Learning Specialization
- Data Analytics Specialization
- 2 Capstone Projects
- Data Science Communications & Presentation Workshop
- Demo Day Coaching

DURATION & FREQUENCY:
4-5 days a week for 4 months
Night Class: 18:00 – 21:00

Career Support & Algoritma Data Career Day
After graduating from the Algoritma Data Science Academy Program, students from each cohort who have fulfilled the criteria can join the Data Career Day. This event is held for students to showcase their projects to the attendees, who consist of our Hiring Partners, Corporate Clients, and the public.

Students are given a 4-week period after they finish their project to prepare for Data Career Day. During Data Career Day, participants will also get the chance to be interviewed by our Hiring Partners in the Speed Dating session.

Other Workshops & Events

Kickstart Series
A collection of 3-4 hour introductory seminars for professionals who are curious about what data science is and how it impacts their career and company.

Corporate Training
In-house corporate training for industries from a wide range of domains. We provide training to enable your organization to capitalize on the potential already working under your roof.


DATA VISUALIZATION SPECIALIZATION

A fun, hands-on, and project-based specialization that helps students gain full proficiency in data visualization systems and tools. Create compelling narratives by combining charting elements with custom aesthetics under the guidance of our instructors.

The learn-by-building module in all the workshops carries our project-based learning philosophy into this specialization. The course capstone requires that the student build a real-world application under stringent criteria modeled after real business scenarios.

https://algorit.ma/data-visualization-specialization/
Programming for Data Science (P4DS)
3-day workshop

Programming for Data Science is a course that covers the important programming paradigms and tools used by data analysts and data scientists today. You will be guided through a series of coding exercises designed to maximize your familiarity with data science programming in RStudio, an integrated development environment for the statistical computing language R.

Upon completion of this workshop, you will be familiar with the programming language, popular tools, libraries (data science packages) and tool kits required to excel in your data analysis and statistical computing projects.

For more information:
https://algorit.ma/course/programming-for-data-science

Module 1: Data Science in R

Data Science in R
- R Programming Basics
- Why Learn R?
- R Studio Interface
- Data Structures in R

Working with Data
- Reading & Extracting Data
- Understanding Statistics
- Exploratory Data Analysis

Data Manipulation
- Working with your Global Environment
- Getting familiar with your Workspace
- Continuous and Categorical Data

Module 2: Data Manipulation

Data Manipulation II
- Vector Types and Classes
- List and Objects
- Matrix and Data Frames

Practical Data Cleansing
- The Data Transformation Process
- Reproducible Data Science Projects
- Reading and Writing from your IDE

R in Practice
- Programming Exercise: e-Commerce Retail Datasets
- In-depth review of Data Frame subsetting
- Sampling and Randomization
- Cross-Tabulations
- Aggregations
Academy Modules
Graded Quiz
Working with R
R Scripts and Functions
R Markdown
Why Care about Reproducibility

Learning-by-Building Module (2 points)


Retail Sales Pre-Diagnostic Cleanup Script
Build a programming script that reads data into our
workspace, performs various data cleansing tasks, and
saves the result in the appropriate formats for data
science work.

Reproducible Data Science


Create an R Markdown file that combines data
transformation code with explanatory text. Add
formatting styles and hierarchical structure using
Markdown.

Practical Statistics (PS)
2-day workshop

Pave the statistical foundation for more advanced machine learning theories later on in the specialization by picking up the key ideas in statistical thinking. Learn to interpret correlations, construct confidence intervals and other statistical principles that form the basis of many common machine learning models.

The 2-day course is optional for participants of the Data Science and Machine Learning Specialization and intended for learners without prior experience in statistics.

For more information:
https://algorit.ma/course/practical-statistics

Module 1: Descriptive Statistics

5-Number Summary
- Mean, Median and Mode
- Measures of Central Tendency
- Quantiles in R

Central Tendency & Variability
- Probability Distribution Function
- Visualizing Central Tendency
- Variation, Variance and Covariance

Standard Score and z-Score
- Standard Normal Curve
- Central Limit Theorem
- z-Score Calculation & Student's T-test

Module 2: Inferential Statistics

Probabilities
- Probability Mass Function
- Probability Density Function
- Expected Values

Intervals
- Confidence Intervals
- Prediction Intervals
- p-Values

Inferential Statistics in Practice
- Hypothesis Testing
- Deriving Scientific Truths from Data
- Case Study

Academy Modules
Tips & Techniques: R for Statisticians
Density Plots
Interpreting Box Plots (Box-and-Whisker)
Better summary statistics with skimr()
Pairs Matrix

Learning-by-Building Module (Not Graded)


Statistical Treatment of Retail Dataset
Using what you’ve learned, formulate a question and
derive a statistical hypothesis test to answer the question.
You have to demonstrate that you’re able to make
decisions using data in a scientific manner.
Examples of questions can be:
- Is there a difference in profitability between standard
  shipment and same-day shipment?
- Supposing there is no difference in profitability
  between the different product segments, what is the
  probability that we obtain the current observation due
  to pure chance alone?
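
The second example question can be approached in more than one way. The sketch below uses a permutation test, which is a swapped-in technique chosen because it needs no distribution tables; the workshop itself works in R and lists Student's T-test, so this is not the course's own method. The function name and the profit figures are invented for illustration.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=42):
    """Estimate how often chance alone yields a gap in means
    at least as large as the one actually observed."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # relabel observations at random (the null hypothesis)
        gap = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if gap >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations


# Hypothetical profit figures, invented purely for illustration.
standard_profit = [12.0, 15.5, 11.2, 14.8, 13.1, 12.9]
same_day_profit = [9.4, 10.1, 8.7, 11.0, 9.9, 10.5]

p_value = permutation_p_value(standard_profit, same_day_profit)
print(f"estimated p-value: {p_value:.4f}")
```

A small estimated p-value suggests that a gap this large would rarely arise if shipment type had no effect on profitability.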

Data Visualization in R (DVinR)
3-day / 4-day workshop

A fun, hands-on, and project-based workshop that helps students gain full proficiency in data visualization systems and tools. Create compelling narratives by combining charting elements with custom aesthetics under the guidance of our instructors.

The 3-day / 4-day course follows our learn-by-building approach, in that students are tasked to reproduce a series of plots applying what they’ve learned. While it covers the three main plotting systems in R, its particular focus is on ggplot2 and the additional libraries centered around it that bring interactivity and enhanced aesthetic options to the art of creating rich, powerful visualizations.

For more information:
https://algorit.ma/course/data-visualization

Module 1: Plotting Essentials

Base Plotting I
- Plots and Lines
- Built-in Plot Types
- Legends and Annotations
- Other built-in Plotting Functionalities

Base Plotting II
- Histograms and Curves
- Cleveland's Dot Plot
- Axis, Titles, Subtitles and Panel Styles
- The notorious pie chart

Working with ggplot2
- Grammar of Graphics System
- Mapping aesthetics
- Working with Geometries
- Background image

Enhancing ggplot2
- Axis, titles and scales
- Adding themes to your plots
- Custom aesthetics and styles
- Working with Legends

Module 2: Richer Visualization Techniques

Enhancing ggplot2 II
- Flipping coordinates and Axis Rotation
- Multi-dimensional Faceting
- Text Layers and Label Layers

Enhancing ggplot2 III
- Enriching: Scatterplots and bubble plots
- Enriching: Jitterplots
- Enriching: Boxplots and violin plots
- Layer transparency
- Expected Values

Enhancing ggplot2 IV
- Enriching: Column Plots
- Enriching: Texts and Labels
- Enriching: Horizontal and Vertical Lines
- Fills and Colors

Other Visualization Toolset
- Discrete, Continuous, and Gradient colors
- Facet with wraps and grids
- Visualizing Spatial Data
- Working with Leaflet and Maps

Academy Modules
Project: Mining Trending Videos on YouTube
Hands-on data visualization
Identifying temporal patterns in trending videos
Combining aesthetics and geometries

Learning-by-Building Module (2 points)


Creating a Publication-Grade Plot
Applying what you’ve learned, create an economics- or
social-related plot that is polished with the appropriate
annotations, aesthetics and some simple commentary.
You may use the same "YouTube Trending Videos" dataset
or any other dataset for this practice.
Creating an Interactive Map
Applying what you’ve learned, create a web page with an
interactive map embedded on it. Use a custom icon for the
map markers to represent business locations, and show
details about each location pin (“markers”) upon user’s
interaction with it.

Interactive Plotting & Web Dashboards (IP&WD)
3-day / 4-day workshop

Building on the foundation from previous classes, we will create a series of interactive plots and gadgets that render multiple visualization elements based on the user’s input. This is the final workshop leading up to the data visualization capstone project.

The 3-day / 4-day course follows our learn-by-building approach, in that students are tasked to reproduce a series of plots applying what they’ve learned. It covers an exhaustive list of techniques that add interactivity to an R document and sets the stage for the data science capstone project.

For more information:
https://algorit.ma/course/web-dashboards

Module 1: Interactive Visualization

Working with Plotly
- Refresher on dplyr
- ggplotly function
- Visualization as a HTML widget
- Range Slider and other interactivity

Publication & Layout Options
- Multiple Plots Arrangement
- More export functions
- Subplots
- Tips and Techniques for Layouts

Module 2: Web Dashboard Development

Flex Dashboard
- Creating Flex Dashboard from RStudio
- Layouts
- Hands-on Practice: Text, Plots, Tables
- Demonstration and Practical Advice

Interactive Document
- Inputs and Outputs
- The renderPlot() function
- Embedded Application
- Demonstration and Practical Advice

Shiny Web App
- Shiny Dashboard
- Tabs and Pagination
- UI, Server and Shiny Functions
- Custom Styles, Structure

Academy Modules
Tips on Web Dashboard Deployment
Working with live data
App deployment solutions
Tips for live dashboard performance

Learning-by-Building Module (4 points)


Building an Interactive Dashboard
Applying what you’ve learned, create a paginated web
dashboard with a rich set of UI elements coupled with
the appropriate server logic. The web dashboard can be
of any theme, using any dataset, but must feature an
input panel that accepts end-user inputs and renders the
output accordingly.

Data Visualization Capstone Project

After having learned and explored appropriate techniques for visualizing data, students are required to deploy an interactive dashboard web application using a Shiny server. The application should contain plotting objects such as ggplot and/or leaflet that display useful insights. In addition, students are given the freedom to use their own dataset or past datasets from previous classes.

The project is marked out of 30 points; the rubrics for assessment and grading will be discussed in class.

Visualize your success. Then take action.
MACHINE LEARNING SPECIALIZATION


A careful combination of statistical theory, hands-on coding and programming exercises to help students understand and implement some of the most widely used and fundamental machine learning algorithms.

By building regressors and classifier algorithms from scratch, the student will go beyond applying machine learning models to actually developing their own models, and learn the right approach to fine-tuning the model performance as well as evaluating model fit against unseen data. Upon completion of the workshop, the student will be well versed in an array of important, versatile machine learning algorithms and equipped with the right knowledge to apply them to future datasets in their daily job.

https://algorit.ma/machine-learning-specialization/
Programming for Data Science (P4DS)
3-day workshop

Programming for Data Science is a course that covers the important programming paradigms and tools used by data analysts and data scientists today. You will be guided through a series of coding exercises designed to maximize your familiarity with data science programming in RStudio, an integrated development environment for the statistical computing language R.

Upon completion of this workshop, you will be familiar with the programming language, popular tools, libraries (data science packages) and tool kits required to excel in your data analysis and statistical computing projects.

For more information:
https://algorit.ma/course/programming-for-data-science

Module 1: Data Science in R

Data Science in R
- R Programming Basics
- Why Learn R?
- R Studio Interface
- Data Structures in R

Working with Data
- Reading & Extracting Data
- Understanding Statistics
- Exploratory Data Analysis

Data Manipulation
- Working with your Global Environment
- Getting familiar with your Workspace
- Continuous and Categorical Data

Module 2: Data Manipulation

Data Manipulation II
- Vector Types and Classes
- List and Objects
- Matrix and Data Frames

Practical Data Cleansing
- The Data Transformation Process
- Reproducible Data Science Projects
- Reading and Writing from your IDE

R in Practice
- Programming Exercise: e-Commerce Retail Datasets
- In-depth review of Data Frame subsetting
- Sampling and Randomization
- Cross-Tabulations
- Aggregations

Academy Modules
Graded Quiz: Working with R
R Scripts and Functions
R Markdown
Why Care about Reproducibility

Learning-by-Building Module (2 points)


Retail Sales Pre-Diagnostic Cleanup Script
Build a programming script that reads data into our
workspace, performs various data cleansing tasks, and
saves the result in the appropriate formats for data
science work.

Reproducible Data Science


Create an R Markdown file that combines data
transformation code with explanatory text. Add
formatting styles and hierarchical structure using
Markdown.

Practical Statistics (PS)
2-day workshop

Pave the statistical foundation for more advanced machine learning theories later on in the specialization by picking up the key ideas in statistical thinking. Learn to interpret correlations, construct confidence intervals and other statistical principles that form the basis of many common machine learning models.

The 2-day course is optional for participants of the Data Science and Machine Learning Specialization and intended for learners without prior experience in statistics.

Module 1: Descriptive Statistics

5-Number Summary
- Mean, Median and Mode
- Measures of Central Tendency
- Quantiles in R

Central Tendency & Variability
- Probability Distribution Function
- Visualizing Central Tendency
- Variation, Variance and Covariance

Standard Score and z-Score
- Standard Normal Curve
- Central Limit Theorem
- z-Score Calculation & Student's T-test

Module 2: Inferential Statistics

Probabilities
- Probability Mass Function
- Probability Density Function
- Expected Values

Intervals
- Confidence Intervals
- Prediction Intervals
- p-Values

Inferential Statistics in Practice
- Hypothesis Testing
- Deriving Scientific Truths from Data
- Case Study

Academy Modules
Tips & Techniques: R for Statisticians
Density Plots
Interpreting Box Plots (Box-and-Whisker)
Better summary statistics with skimr()
Pairs Matrix

Learning-by-Building Module (Not Graded)


Statistical Treatment of Retail Dataset
Using what you’ve learned, formulate a question and
derive a statistical hypothesis test to answer the question.
You have to demonstrate that you’re able to make
decisions using data in a scientific manner.
Examples of questions can be:
- Is there a difference in profitability between standard
  shipment and same-day shipment?
- Supposing there is no difference in profitability
  between the different product segments, what is the
  probability that we obtain the current observation due
  to pure chance alone?

Regression Models (RM)
4-day / 3-day workshop

This course strives for a fine balance between business applications and mathematical rigor in its treatment of regression models, one of the most essential statistical techniques in the field of machine learning. Its aim is to equip you with the knowledge to investigate relationships between variables in a dataset effectively and rigorously.

We strongly recommend that you complete Practical Statistics prior to taking this course. Upon completion of this workshop, you will acquire a rigorous statistical understanding of machine learning models, allowing you to extrapolate the same ideas into other, more advanced machine learning models.

For more information:
https://algorit.ma/course/regression-models

Module 1: Regression Models I

OLS Regression
- Understanding Least Squares
- Outliers I
- Simple Linear Regression

Linear Models in R
- Understanding Coefficients
- Plotting Regression
- Model Construction

Interpreting Linear Models
- Residuals Manually
- Coefficients Manually
- R-Squared Manually

Module 2: Regression Models II

Interpreting Linear Models
- Estimates and Standard Errors
- t-value and P-value
- Adjusted R-Squared
- Outliers II: Leverage and Influence

Multiple Regression
- Multicollinearity and VIF
- Model Assumptions
- Bias-Variance Trade-off

Dive Deeper: Regression Models
- Model Selection and Specification
- Step-wise Regression
- All-possible Regressions
- Residual Plots
- Model Diagnostics
- Limitations of Regression Models
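
The "Residuals Manually", "Coefficients Manually" and "R-Squared Manually" topics in Module 1 can be illustrated with a minimal sketch of simple least-squares regression. The workshop teaches this in R; Python is used here only as an assumption for illustration, and the function names and data points are hypothetical.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Return (intercept, slope) minimising the sum of squared residuals."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    y_bar = mean(y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical data lying exactly on the line y = 1 + 2x.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

intercept, slope = fit_simple_ols(x, y)
fit_quality = r_squared(x, y, intercept, slope)
print(intercept, slope, fit_quality)
```

Because the invented points sit exactly on a line, the residuals vanish and R-squared reaches its maximum of 1; real data would give a lower value.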

Academy Modules
Graded Quiz

Learning-by-Building Module (3 points)


Recommendation on Lowering Crime Rates
Write a regression analysis report applying what you’ve
learned in the workshop. Using the dataset provided to
you, write your findings on the socioeconomic
variables most highly correlated with crime rates.

Explain your recommendations where appropriate.

Classification in Machine Learning I (CIML1)
4-day / 3-day workshop

Learn to solve binary and multi-class classification problems using machine learning algorithms that are easily understood and readily interpretable. You will learn to write a classification algorithm from scratch, and appreciate the mathematical foundations underpinning logistic regressions and nearest neighbors algorithms.

We strongly recommend that you complete the regression models workshop prior to taking this course. Upon completion of this workshop, you will acquire the depth to develop, apply, and evaluate two highly versatile algorithms widely used today.

For more information:
https://algorit.ma/course/classification-1

Module 1: Logistic Regression

Relating Probabilities to Odds
- Understanding Odds
- Understanding Log of Odds
- Plotting Odds and Log of Odds

Logistic Regression from First Principles
- Sigmoidal Logistic Function
- Key Assumptions of Sigmoid Function
- Extra Proof: Intuition behind the Sigmoid Function

Logistic Regression in Action
- Binary Logistic Regression
- Interpreting Coefficients
- Interpretation Against Continuous & Discrete Variables

Practical Tips and Case Study
- Flight Delay Prediction Examples
- Customer Churn and Attrition Examples
- Risk Modeling on Loans from Quarter 4, 2017

Performance Evaluation and Model Selection
- AIC (Akaike Information Criteria)
- Null Deviance and Residual Deviance
- Hauck Donner Effect
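
The relationship between probabilities, odds, log of odds and the sigmoid function covered in Module 1 can be sketched in a few lines. The workshop works in R; Python is an assumption made here for illustration, and the function names are hypothetical.

```python
from math import exp, log


def odds(p):
    """Odds of an event with probability p (0 < p < 1)."""
    return p / (1 - p)


def log_odds(p):
    """Log of odds (logit), the scale on which logistic regression is linear."""
    return log(odds(p))


def sigmoid(z):
    """Inverse of log_odds: maps any real number back to a probability."""
    return 1 / (1 + exp(-z))


for p in (0.1, 0.5, 0.8):
    z = log_odds(p)
    print(f"p={p}  odds={odds(p):.3f}  log-odds={z:.3f}  back to p={sigmoid(z):.3f}")
```

Since the model is linear on the log-odds scale, a negative coefficient lowers the log of odds, and exponentiating it gives an odds ratio below 1.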

Module 2: Nearest Neighbours Algorithm

Closer Look at Classification
- Probabilities vs Class Response
- Confusion Matrix
- Sensitivity, Specificity & Precision

k-NN in Action
- Characteristics of k-NN
- Positives and Negatives
- Diagnosing Breast Cancer with k-NN

Building Blocks of k-NN
- Distance Function (Euclidean, Minkowski)
- The k parameter
- Standardization vs Min-Max Normalization

k-NN from First Principles
- Classifying Customer Segments with k-NN
- Writing your own k-NN Classifier
- Predicting using your own k-NN Classifier

Academy Modules
Graded Quiz

Learning-by-Building Module (3 points)

Logistic Regression on Credit Risk
Applying what you’ve learned, present a simple R Markdown document in which you demonstrate the use of logistic regression on the lbb_loans.csv dataset. Explain your findings wherever necessary and show the necessary data preparation steps. To help you through the exercise, consider the following questions throughout the document:
- How do we correctly interpret the negative coefficients obtained from your logistic regression?
- How do we know which of the variables are more statistically significant as predictors?
- What are some strategies to improve your model?

Customer Segment Prediction
Applying what you’ve learned, present a simple R Markdown document in which you demonstrate the use of k-NN on the wholesale.csv dataset. Compare the k-NN to the logistic regression model and answer the following questions throughout the document:
- What is your accuracy? Was the logistic regression better than k-NN in terms of accuracy? (Recall the lesson on obtaining an unbiased estimate of the model’s accuracy.)
- Was the logistic regression better than our k-NN model at explaining which of the variables are good predictors of a customer’s industry?
- List 1 disadvantage and 1 strength of each approach (k-NN and logistic regression).
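
As a rough illustration of the "Building Blocks of k-NN" and "Writing your own k-NN Classifier" topics, the sketch below combines min-max normalization, Euclidean distance and a majority vote. The workshop builds its classifier in R; Python is an assumption here, and the function names and toy points are hypothetical rather than taken from wholesale.csv.

```python
from collections import Counter
from math import dist


def min_max_scale(rows):
    """Rescale every column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]


def knn_predict(train_x, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest neighbours."""
    by_distance = sorted(zip(train_x, train_y),
                         key=lambda pair: dist(pair[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, already comparable features for two customer segments.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["retail", "retail", "retail", "horeca", "horeca", "horeca"]

print(knn_predict(train_x, train_y, (0.2, 0.1)))
print(min_max_scale([(0, 10), (5, 20), (10, 30)]))
```

Scaling matters because Euclidean distance is dominated by whichever feature has the largest numeric range. In practice the query point must be scaled with the same minimum and maximum as the training data.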

Classification in Machine Learning II (CIML2)
4-day / 3-day workshop

Learn to apply the law of probabilities, boosting, bootstrap aggregation, k-fold cross validation, ensembling methods, and a variety of other techniques as we build some of the most widely used machine learning algorithms today. Learn to add performance to your models using mathematically sound principles you’ll learn in this course.

We strongly recommend that you complete the Machine Learning: Classification 1 workshop prior to taking this course. Some concepts presented throughout the lecture may be less-than-ideal for practitioners who have not completed the pre-requisite courses.

For more information:
https://algorit.ma/course/classification-2

Module 1: Naive Bayes

Law of Probability
- Dependent and Independent Events
- Bayes Theorem
- Formula for Posterior Probability

Naive Bayes Classifier
- Characteristics of a Naive Bayes Classifier
- The "naive" assumptions
- Customer Churn example

Practical and Performance Considerations
- The Case for Smoothing
- Laplace (Add-One)
- Thinking about Training vs Prediction Speed

Naive Bayes in Action
- Spam Classification
- Predicting on Text (Corpus)
- Predicting Political Party Affiliation

Module 2: Tree-Based Methods and Ensembles

Decision Trees
- Advantages and Model Characteristics
- Information Gain and Splitting Criterion
- Pruning and Tree Size

Decision Trees in Action
- Predicting Diabetes from Diagnostics Measurement
- AUC Curve
- Key Considerations and Practical Advice

Machine Learning Theories
- Logistic Regression, Naive Bayes and Decision Trees have more in common than you think
- Industrial Applications
- Thinking about Decision Boundaries

High-Performance Machine Learning
- Bias-Variance Tradeoff revisited
- k-Fold Cross Validation
- Predicting Exercise Form with Fitness Tracker Data
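
"The Case for Smoothing" and "Laplace (Add-One)" in Module 1 refer to a word never seen in a class otherwise getting a likelihood of zero, which wipes out the whole Naive Bayes product. A minimal sketch follows, in Python as an assumption (the workshop uses R), with a hypothetical function name and a made-up four-word vocabulary. It assumes every word in the class already belongs to the vocabulary.

```python
from collections import Counter


def smoothed_likelihoods(class_words, vocabulary, alpha=1):
    """P(word | class) with add-alpha smoothing (alpha=1 is Laplace / add-one)."""
    counts = Counter(class_words)
    total = len(class_words) + alpha * len(vocabulary)
    return {word: (counts[word] + alpha) / total for word in vocabulary}


# Made-up words observed in spam messages, and the full vocabulary.
spam_words = ["free", "win", "free", "cash"]
vocabulary = ["cash", "free", "meeting", "win"]

likelihoods = smoothed_likelihoods(spam_words, vocabulary)
for word in vocabulary:
    print(word, likelihoods[word])
```

"meeting" never appears in the spam words, yet it still receives a small non-zero likelihood, and the likelihoods across the vocabulary continue to sum to 1.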

Academy Modules
Graded Quiz

Learning-by-Building Module (3 points)

Identifying Risky Bank Loans
Use any of the 3 classification algorithms you’ve learned in this lesson to predict the risk status of a bank loan. The variable default in the dataset indicates whether the applicant did default on the loan issued by the bank.

Use an R Markdown document to lay out your process, and explain the methodology in 1 or 2 brief paragraphs. The student should be awarded the full (3) points when:
- The preprocessing steps are done, and the student shows an understanding of holding out a test / cross validation set for an estimate of the model’s performance on unseen data
- The model’s performance is sufficiently explained (accuracy may not be the most helpful metric here! Recall what you’ve learned regarding specificity and sensitivity)
- The student demonstrates extra effort in evaluating his/her model, and proposes ways to improve the accuracy obtained from the initial model

Unsupervised Machine Learning (UML)
3-day / 4-day workshop

Learn PCA (Principal Component Analysis), Clustering, and other algorithms to work with unsupervised machine learning tasks where the target variable is not known or defined. Applying what you’ll learn from this workshop, you will be tasked to develop an anomaly detection or an e-commerce product recommendation model that can be related to real-life business scenarios.

We strongly recommend that you complete the pre-requisite courses prior to taking this course. Some concepts presented throughout the lecture may be less-than-ideal for practitioners who are new to the field of machine learning.

For more information:
https://algorit.ma/course/unsupervised-ml

Module 1: Dimensionality Reduction

Background
- Understanding Unsupervised Learning
- The "dimensionality" problem
- Industrial Use of PCA

Principal Component Analysis
- Rethinking about Covariances
- The Case for PCA
- Eigenvalues and Eigenvectors

PCA from First Principles
- Just enough Matrix Algebra
- Mathematical Proof
- Visualization and Visual Proof

PCA in Action
- Dubious Property Sales in NYC
- PCA on US Arrests data
- Biplot and the variables factor map

PCA in Action II
- Eigenfaces
- PCA on credit loan data
- Deconstructing and Reconstructing Faces with PCA
- Principal Components by hand

Module 2: k-Means Clustering

Understanding Clustering
- Centroid-based Clustering Algorithms
- The k-Means Procedure
- Mathematical Details

k-Means Clustering in Action
- Cluster-based Product Recommendation
- Scaling and Implementation Details
- Visualizing Clusters

Evaluating k-Means
- Between sum-of-squares
- Within sum-of-squares
- Combining k-Means with PCA
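
To give a flavour of the "Principal Components by hand" topic, here is a minimal sketch in plain Python for the two-variable case, where the eigenvalues of the covariance matrix have a closed form. The workshop itself is taught in R; the function name `pca_2d` and the ten data points are illustrative assumptions, not course material.

```python
import math


def pca_2d(xs, ys):
    """PCA on two variables via the eigen-decomposition of their covariance matrix."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Eigenvalues of a symmetric 2x2 matrix from its characteristic polynomial.
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    lam1, lam2 = half_trace + root, half_trace - root
    # Eigenvector for the largest eigenvalue (direction of the first component).
    if sxy != 0:
        v = (sxy, lam1 - sxx)
    else:
        v = (1.0, 0.0) if sxx >= syy else (0.0, 1.0)
    norm = math.hypot(v[0], v[1])
    pc1 = (v[0] / norm, v[1] / norm)
    # Share of total variance captured by the first component
    # (assumes the data are not all identical, so the total is non-zero).
    explained = lam1 / (lam1 + lam2)
    return (lam1, lam2), pc1, explained


# Small illustrative dataset of two correlated variables.
xs = [2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1]
ys = [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]

eigenvalues, pc1, explained = pca_2d(xs, ys)
print(eigenvalues, pc1, round(explained, 3))
```

The proportion of variance explained is the kind of quantity one might use when deciding how many dimensions to retain.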

Academy Modules
Graded Quiz

Learning-by-Building Module (3 points)


Diving into Wholesale Transactions
Using either of the two unsupervised learning algorithms you’ve learned, produce a simple R Markdown document where you demonstrate an exercise of either clustering or dimensionality reduction on the wholesale.csv data provided to you.

Digging Deep into NYC Property Sales
Using either of the two unsupervised learning algorithms you’ve learned, produce a simple R Markdown document where you demonstrate an exercise of either clustering or dimensionality reduction on the nyc data provided to you.

Explain your choice of parameters (how you choose k for k-means clustering, or how you choose to retain n dimensions for PCA) from the original data. What is the business utility of the unsupervised model you’ve developed? The R Markdown document should be no longer than 4 paragraphs and contain one or two visualizations.
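
One common way to justify a choice of k is to compare the within sum-of-squares across several values of k. The sketch below is a plain-Python illustration under stated assumptions: the nine points are invented, and `kmeans`, `dist2` and `mean_point` are hypothetical helper names. The graded submission itself is expected in R Markdown.

```python
import random


def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def kmeans(points, k, iters=50, seed=0):
    """Lloyd's k-means; returns centroids, clusters and within sum-of-squares."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: dist2(p, centroids[j]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [mean_point(c) if c else centroids[j] for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    wss = sum(dist2(p, centroids[j]) for j, c in enumerate(clusters) for p in c)
    return centroids, clusters, wss


# Invented data: three loose groups in two dimensions.
points = [
    (1.0, 1.1), (1.2, 0.9), (0.8, 1.0),
    (5.0, 5.2), (5.1, 4.9), (4.8, 5.0),
    (9.0, 1.0), (9.2, 1.1), (8.9, 0.8),
]

wss_by_k = {k: kmeans(points, k)[2] for k in range(1, 5)}
for k, wss in wss_by_k.items():
    print(k, round(wss, 3))
```

Plotting these values against k and looking for the point where the decrease flattens is one possible argument for a particular k; because the starting centroids are random, results can vary with the seed.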

Time Series & Forecasting (TS&F)
3-day / 4-day workshop

Decomposition of time series allows us to learn about the underlying seasonality, trend and random fluctuations in a systematic fashion. In this workshop, we learn the methods to account for seasonality and trend, work with autocorrelation models and create industry-scale forecasts using modern tools and frameworks.

We strongly recommend that you complete the pre-requisite workshops prior to taking this course. Some concepts presented throughout the lecture may be less-than-ideal for practitioners who have not completed the pre-requisite courses.

For more information:
https://algorit.ma/course/forecasting

Module 1: Time Series I

Working with Time Series
- Application of Time Series
- Definition of a ts object
- Functions to work with time series

Time Series in Action
- Indonesia's gas emissions, 1970-2012
- Frequency, Start and End
- Time Series Plots

Classical Decomposition
- Trend, Seasonality and Residuals
- Understanding Lags
- Additive vs Multiplicative
- Understanding Smoothing

Classical Decomposition in Action
- Monthly Airline Passenger, 1949-1960
- The decompose function

Techniques to work with Time Series
- Adjusting for Seasonality
- Detrending
- Decomposing Non-Seasonal Time Series
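
The idea behind classical additive decomposition (trend from a centered moving average, seasonality from the averaged detrended values, residual as what is left) can be sketched in plain Python. This is an illustration written for this syllabus, not the implementation of R's decompose function; the function name `decompose_additive` and the synthetic quarterly series are assumptions.

```python
def decompose_additive(x, period):
    """Split a series into trend, seasonal and residual parts (additive model)."""
    n = len(x)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = x[t - half:t + half + 1]
        if period % 2 == 0:
            # Centered moving average: the two end points get half weight.
            total = sum(window) - 0.5 * window[0] - 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period
    # Average the detrended values for each position in the seasonal cycle.
    buckets = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(x[t] - trend[t])
    raw = [sum(b) / len(b) for b in buckets]
    centre = sum(raw) / period
    pattern = [r - centre for r in raw]  # seasonal effects sum to zero
    seasonal = [pattern[t % period] for t in range(n)]
    residual = [
        None if trend[t] is None else x[t] - trend[t] - seasonal[t]
        for t in range(n)
    ]
    return trend, seasonal, residual


# Synthetic quarterly series: linear trend plus a repeating pattern.
true_pattern = [2.0, -1.0, -3.0, 2.0]
series = [10 + 0.5 * t + true_pattern[t % 4] for t in range(16)]

trend, seasonal, residual = decompose_additive(series, period=4)
print(trend[2:6], seasonal[:4])
```

A multiplicative variant would divide by the trend instead of subtracting it; the series needs at least two full periods for every seasonal position to have data.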

Module 2: Forecasting

Forecasting I
- Simple Moving Average
- Simple Moving Average from First Principles
- Log-transformation

Forecasting II
- Forecasting using One-sided SMA
- Forecasting using Exponential Smoothing
- Holt's Exponential Smoothing

Forecasting III
- The beta and gamma coefficients
- Mathematical Details
- Holt-Winters Exponential Smoothing

Advanced Time Series
- ACF and PACF
- ARMA and ARIMA Models
- Stationarity and Differencing

Advanced Time Series II
- Augmented Dickey-Fuller (ADF) test
- Seasonal ARIMA
- Tips to work with xts
- Facebook's Prophet
- Quantmod for quantitative traders

Academy Modules
Graded Quiz

Learning-by-Building Module (3 points)

Forecasting the Crime rate in Chicago
Download the dataset from Chicago Crime Portal, and use a sample of these data to build a forecasting project where you inspect the seasonality and trend of crime in Chicago. Submit your project in R Markdown (RMD) format, and address the following questions:
- Is crime generally rising in Chicago in the past decade (last 10 years)?
- Is there a seasonal component to the crime rate?
- Which time series method seems to capture the variation in your time series better? Explain your choice of algorithm and its key assumptions

Students should be awarded the full (3) points if they address at least 2 of the above questions.
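
As a pointer to what Holt's Exponential Smoothing (listed under Forecasting II) computes, here is a minimal plain-Python sketch. The initialisation (first value as level, first difference as trend) is one common convention and an assumption here; the smoothing weights and the series are made up, and `holt_forecast` is a hypothetical name.

```python
def holt_forecast(x, alpha, beta, horizon):
    """Holt's linear exponential smoothing: smooth a level and a trend, then extrapolate."""
    # Assumed initialisation: level = first value, trend = first difference.
    level, trend = x[0], x[1] - x[0]
    for value in x[1:]:
        previous_level = level
        # Blend the new observation with the previous one-step-ahead forecast.
        level = alpha * value + (1 - alpha) * (level + trend)
        # Blend the latest change in level with the previous trend estimate.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]


# Made-up series that grows by exactly 2 each period.
series = [3.0, 5.0, 7.0, 9.0, 11.0]
forecast = holt_forecast(series, alpha=0.5, beta=0.3, horizon=3)
print(forecast)
```

Holt-Winters extends the same recursion with a third smoothed component for seasonality, controlled by the gamma coefficient.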

Neural Network & Deep Learning (NN&DL)
3-day / 4-day workshop

Develop artificial neural networks that can recognize faces and handwriting patterns and are at the core of some of the most cutting-edge cognitive models in the AI landscape. We will learn to create a backpropagation neural network from scratch, and use our neural network for classification tasks. This class is the final course in the Machine Learning Specialization.

We strongly recommend that you complete the pre-requisite workshops prior to taking this course. Some concepts presented throughout the lecture may be less-than-ideal for practitioners who have not completed the pre-requisite courses.

For more information:
https://algorit.ma/course/neuralnet

Module 1: Neural Network

Artificial Neural Networks
- The biological brain inspiration
- Cost function
- The building blocks of neural networks

Neural Network Architecture
- Layers, Nodes and Signals
- Network Topology
- Direction of signal

Neural Network Architecture II
- Hidden Layers
- Computing with Neural Network
- Mathematical Details

Multi-Layer Perceptrons (MLP)
- Backpropagation of error
- Feed-forward vs Recurrent
- Mathematical Details

Module 2: Deep Learning

Neural Networks from First Principles
- Sum of Squared Errors
- Cross-Entropy Error
- The Gradient Descent Algorithm

Neural Networks from Scratch
- Gradient Descent by hand
- Neural Network by hand
- Learning Rate and Implementation Details

Neural Networks in Action
- Putting it all together
- Parameterization and Practical Advice

Deep Learning in Action
- Theorizing with Effect of Depth
- Activation Functions
- Visualizing Logarithmic Loss
- Deep Learning for Classification and Regression

Deep Learning in Action II
- Predicting Bank Telemarketing Campaign
- Visualizing tricks for Deep Neural Networks
- Parameterization and Practical Advice

MXNet in Action
- Thinking about Parallelism
- MNIST handwritten digit recognition
- Predictions with MXNet and Practical Advice
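
The "Gradient Descent by hand" topic can be previewed with the smallest possible case: a single sigmoid neuron trained with cross-entropy error. This is a plain-Python sketch, not a multi-layer network with backpropagation; the learning rate, epoch count, OR-gate data and the names `train_neuron` and `predict` are illustrative assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_neuron(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent on one sigmoid neuron with cross-entropy error."""
    w = [0.0] * len(X[0])
    b = 0.0
    losses = []
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        loss = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            loss += -(yi * math.log(p) + (1 - yi) * math.log(1 - p))
            # For sigmoid + cross-entropy the gradient w.r.t. the pre-activation is p - y.
            err = p - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        # Step against the average gradient, scaled by the learning rate.
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
        losses.append(loss / n)
    return w, b, losses


def predict(w, b, xi):
    p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
    return 1 if p >= 0.5 else 0


# Toy data: the logical OR of two binary inputs.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 1]

w, b, losses = train_neuron(X, y)
predictions = [predict(w, b, xi) for xi in X]
print(predictions, round(losses[0], 3), round(losses[-1], 3))
```

Adding hidden layers means repeating the same chain-rule step backwards through each layer, which is what backpropagation of error refers to.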
Academy Modules
Graded Quiz

Learning-by-Building Module (3 points)


Image Classification using Neural Network
Build a neural network capable of classifying images into
one of many classes and explain the choice of your
architecture. Test your neural network using unseen
images – can your algorithm correctly classify 80%
of images?

Machine Learning Capstone Project

After having learned various machine learning methods and their applications, students are required to choose one project that challenges them to construct an optimal model from the dataset given. The selection of methods includes Forecasting, Regression and Classification.

The project is marked out of 24 points; the rubrics for assessment and grading will be discussed in class.

LEARN MORE AND DIVE DEEPER

DATA ANALYTICS SPECIALIZATION

COURSES:
- Python for Data Analysts
- Exploratory Data Analysis
- Data Wrangling & Group By Aggregation
- SQL & Data Visualization with Pandas in Python

Light in theory and perfect for beginners/non-programmers who are looking to learn data analysis. The course has an easier learning curve and takes a more accessible approach by getting participants to understand the “how” part first, rather than a detailed breakdown of the “why”. It is modeled after real-world analytics apps, with elements of storing and querying from SQL, preprocessing with pandas, reshaping data and producing visualizations.

In this program, students will learn how to use the Python programming language for data analysis. Students are prompted to write short snippets of code at frequent intervals, before being offered an explanation of the underlying theoretical frameworks.

https://algorit.ma/data-analytics-specialization/
Python for Data Analysts (P4DA)
5-day workshop

In this 15-hour course, we will cover a comprehensive practice in applying Python’s data analysis library: pandas. This library is a core member of many Python-based scientific computing environments. You will be guided through a gentle introduction to Python programming using Jupyter Notebook.

Upon completion of this course, you will be familiar with Python programming and with using pandas for simple exploratory data analysis.

Module 1: Python for Data Analysis
- Python Programming Basics
- Working with Jupyter Notebook
- Python Syntaxes and Jargons

Module 2: Introduction to DataFrames
- Importing pandas Library
- Reading csv Data
- Python Data Types
- Data Frame Structure

Module 3: Exploratory Data Analysis Tools I
- Understanding Data Frame Attributes
- Categorical and Numerical Variables
- Using pandas' Built-in Statistics Summary
- Indexing and Subsetting in pandas

Module 4: Learn-by-Building
Use a small sample of a larger CRM dataset to perform an exploratory data analysis. Apply what you have learned using pandas! Try to extract insights from the data in order to understand which customers offer a better value proposition to the business.
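
The pandas operations named in Modules 2 and 3 might look roughly like the sketch below. The CRM-style rows and column names are invented for illustration; the actual course dataset and its columns are not shown in this syllabus.

```python
import io

import pandas as pd

# Invented CRM-style sample, read from an in-memory CSV for self-containment.
csv_text = """customer_id,segment,city,orders,revenue
1,Retail,Jakarta,3,120.5
2,Corporate,Bandung,10,980.0
3,Retail,Jakarta,1,35.0
4,Corporate,Jakarta,7,640.0
5,Retail,Surabaya,2,80.0
"""
crm = pd.read_csv(io.StringIO(csv_text))

# Data frame attributes: dimensions and column types.
print(crm.shape)
print(crm.dtypes)

# Built-in statistics summary for the numerical variables.
summary = crm[["orders", "revenue"]].describe()

# Frequency of each level of a categorical variable.
segment_counts = crm["segment"].value_counts()

# Label-based subsetting with a condition, and position-based subsetting.
high_value = crm.loc[crm["revenue"] > 500, ["customer_id", "segment", "revenue"]]
first_rows = crm.iloc[:2]

print(summary)
print(segment_counts)
print(high_value)
```

With a real file, `pd.read_csv` would take the file path instead of the in-memory buffer used here.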

Exploratory Data Analysis (EDA)
4-day workshop

Go more in depth on the exploratory data analysis practices you can perform using pandas in this 12-hour course. Pick up the essential exploratory tools in this library to cover more of the statistical capabilities of pandas. We will also guide you through one of the most demanding, yet important, processes in data analytics: data cleansing. Upon completion of this course, you will uncover more possibilities in working with data using pandas.

Module 1: Exploratory Data Analysis Tools II
- Frequency Table in pandas
- Higher Dimensional Table
- Data Aggregation
- Using Pivot Table

Module 2: Working with Data Types
- Working with Date Time Data
- Working with Categorical Data

Module 3: Dealing with Untidy Data
- Not a Number (NaN)
- Checking NaN Values
- Missing Values Treatment
- Removing Duplicate Values

Module 4: Learn-by-Building
Use a dataset of items listed on a popular e-commerce website to perform the exploratory data analysis process you have learned. We will explore different product categories in terms of various price and scale ranges. This dataset was gathered as part of a larger research work where the analyst wanted to study the price convergence of Indonesian essential household items.
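
A compact sketch of the cleansing and tabulation steps listed in Modules 1 to 3 follows. The rows, prices and column names are made up; filling a missing price with its category mean is only one possible treatment and is an assumption here, not the course's prescribed method.

```python
import numpy as np
import pandas as pd

# Made-up household item listings with one missing price and one duplicate row.
items = pd.DataFrame({
    "date": ["2019-01-01", "2019-01-01", "2019-01-02", "2019-01-02", "2019-01-02"],
    "category": ["Rice", "Sugar", "Rice", "Sugar", "Sugar"],
    "format": ["minimarket", "supermarket", "minimarket", "supermarket", "supermarket"],
    "unit_price": [12000.0, np.nan, 12500.0, 14000.0, 14000.0],
})

# Date time data: parse strings into proper timestamps.
items["date"] = pd.to_datetime(items["date"])

# Checking NaN values per column.
missing_per_column = items.isna().sum()

# Removing duplicate rows (the last two rows are identical).
items = items.drop_duplicates()

# One possible missing-value treatment: fill with the mean price of the same category.
category_mean = items.groupby("category")["unit_price"].transform("mean")
items["unit_price"] = items["unit_price"].fillna(category_mean)

# Frequency table and pivot table.
freq = pd.crosstab(items["category"], items["format"])
avg_price = items.pivot_table(
    index="category", columns="format", values="unit_price", aggfunc="mean"
)

# Categorical data: store the repeated labels as a category type.
items["category"] = items["category"].astype("category")

print(missing_per_column)
print(freq)
print(avg_price)
```

Whether to drop, fill or flag missing values depends on the analysis; the fill above simply keeps the example short.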

Data Wrangling & Group by Aggregation (DW&GA)
4-day workshop

Reshaping data is an important component of any data wrangling toolkit as it allows the analyst to “massage” the data into the desired shape for further processing.

As we go through a comprehensive process of uncovering pandas capabilities, we will learn more adept techniques for working with data in this 12-hour course.

Module 1: Data Wrangling and Reshaping
- Stack and Unstack
- Working with Multiindex Data Frame
- Data Melt
- Using Group By

Module 2: Learn-by-Building
Use a stock dataset to compare stocks' price variance. Use the data wrangling techniques you have learned to answer questions related to stock data analysis. Find out which stock has the lowest volume, is the most funded, and has the highest average daily closing price difference!
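
The reshaping tools in Module 1 (melt, group by, multi-index, stack and unstack) can be sketched on a tiny invented price table. The tickers `AAA` and `BBB` and all prices are hypothetical; the real stock dataset used in class is not described here.

```python
import pandas as pd

# Invented closing prices in "wide" form: one column per ticker.
wide = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06"]),
    "AAA": [10.0, 11.0, 10.5],
    "BBB": [20.0, 19.0, 22.0],
})

# Melt: wide to long, one row per (date, ticker) pair.
long = wide.melt(id_vars="date", var_name="ticker", value_name="close")

# Group by: mean and variance of the closing price per ticker.
stats = long.groupby("ticker")["close"].agg(["mean", "var"])

# Average absolute day-to-day change in closing price per ticker.
long = long.sort_values(["ticker", "date"])
long["abs_change"] = long.groupby("ticker")["close"].diff().abs()
avg_change = long.groupby("ticker")["abs_change"].mean()
most_volatile = avg_change.idxmax()

# Multi-index series, then unstack back to wide and stack to long again.
indexed = long.set_index(["ticker", "date"])["close"]
back_to_wide = indexed.unstack("ticker")
stacked = back_to_wide.stack()

print(stats)
print(avg_change)
print(most_volatile)
```

The same pattern (reshape to long, group, aggregate, pick the extreme) applies to questions such as which stock has the lowest volume.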

SQL & Data Visualization with Pandas (SQL&DVWP)
4-day workshop

This 12-hour course guides students in using Python’s grammar of graphics for data visualization. We will also cover an important part of the data analytics stack, i.e. SQL, and how to integrate it with Python.

Upon completion of this course, students will be able to gain proficiency in an end-to-end process for data analysis using Python.

Module 1: Data Visualization in Python
- Grammar of Graphics
- Using matplotlib
- Plotting Using pandas Object

Module 2: Working with SQL Database
- Creating Database Connection
- SQL Basic Queries
- SQL for Table Join
- SQL for Conditional Statement

Module 3: Learn-by-Building
Use a music store's operations database to query all sales relating to particular artists or albums. Using the relational schema of the database, we will learn how to query invoice data from the database in order to analyze the store’s most valuable customers. We will also apply the visualization techniques we have learned to produce an accompanying chart for the analysis.
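
The Module 2 topics (connection, basic query, table join, conditional statement) can be previewed with Python's built-in sqlite3 module. The two-table schema, the customer names and the spending threshold of 20 are invented for illustration; the course's music store database has its own schema, which this syllabus does not spell out.

```python
import sqlite3

# Creating a database connection (in memory, so nothing is written to disk).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE invoices (invoice_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'Ayu'), (2, 'Budi'), (3, 'Citra');
INSERT INTO invoices VALUES (1, 1, 15.0), (2, 1, 5.0), (3, 2, 30.0), (4, 3, 2.5);
""")

# Join invoices to customers, total the spending, and label each customer
# with a conditional (CASE) expression.
query = """
SELECT c.name,
       SUM(i.total) AS total_spent,
       CASE WHEN SUM(i.total) >= 20 THEN 'high' ELSE 'regular' END AS tier
FROM invoices AS i
JOIN customers AS c ON c.customer_id = i.customer_id
GROUP BY c.customer_id, c.name
ORDER BY total_spent DESC
"""
top_customers = conn.execute(query).fetchall()
conn.close()

for row in top_customers:
    print(row)
```

To continue in pandas, the same query string and an open connection can be passed to `pandas.read_sql_query`, and the resulting DataFrame plotted with its `plot` method.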

ENROLL NOW IN OUR ACADEMY!
bit.ly/algo_academy

“Learning how to do data science is like learning to ski. You have to do it.”
~ Claudia Perlich, Chief Scientist, Dstillery.

How to Apply:
1. Go to bit.ly/algo_academy (or scan the QR CODE)
2. Click ENROLL NOW! and fill in the form.
3. One of our Education Consultants will reach out to you within 1x24 hours (working days).
4. Arrange payment
5. Congratulations! You're on your way to start your Data Science journey!
