ACADEMY
SYLLABUS
learn data science by building
community@algorit.ma
www.algorit.ma
+62 877-7835-3007
teamalgoritma
Team Algoritma
Our Journey
Modern workforces are under-equipped to solve the kind of problems we face as a collective in the digital era. While the complexity and enormity of data have grown explosively in the past few years, the tools we use to solve these problems have seen only incremental improvements every decade or so.
Data science, the field dedicated to the study of insights extraction, statistical modeling, and artificial intelligence, can empower professionals in profound ways and it will continue to play an increasingly integral role in our interaction with technology.
We were founded with the vision of democratizing data science skills and equipping every professional with a set of core skills across the various domains of data visualization, regression, data modeling, machine learning, and statistical programming literacy. Whether you’re a marketing executive, a business analyst, an entrepreneur, or a financial market professional, we want to help you be a rockstar and a highly effective executive.
Algoritma’s pedagogical excellence is recognized regionally, an achievement backed by our illustrious track record in year-long corporate consulting projects and other shorter consultative, customized training engagements.

Our Core Products
Data Science Academy
Non-stop, highly practical learning. Takes a learn-by-building approach in 14 modules and 2 capstone projects to become an employable data scientist.

ACADEMY REGULAR
Bootcamp using R in just 11 weeks
COURSES:
- Data Visualization Specialization
- Machine Learning Specialization
- 2 Capstone Projects
- Data Science Communications & Presentation Workshop
- Demo Day Coaching
DURATION & FREQUENCY:
4-5 days a week for 3 months
Day Class: 13:00 – 16:00
Night Class: 18:00 – 21:00

ACADEMY FULL-STACK
Bootcamp using R, Python, & SQL in just 15 weeks
COURSES:
- Data Visualization Specialization
- Machine Learning Specialization
- Data Analytics Specialization
- 2 Capstone Projects
- Data Science Communications & Presentation Workshop
- Demo Day Coaching
DURATION & FREQUENCY:
4-5 days a week for 4 months
Night Class: 18:00 – 21:00
June 2017
- Algoritma was founded in Jakarta by Samuel Chan and Nayoko Wicaksono
- Our first workshop: Kickstart Your Data Science Career at Rework, Setiabudi

October 2017
- First Corporate Training, PT Salam Pacific Indonesia Lines

January 2018
- Algoritma Data Science Academy launched for the very first time. Students are taught data science using the R programming language in 3 months

May 2018
- Student Track was initiated for the first time. Algoritma gave scholarships to chosen students from 6 universities (UI, UGM, ITB, Telkom University, Prasetya Mulya, and Bina Nusantara University) to learn data science every Saturday in the Algoritma Weekend Track program

- Algoritma held the very first Data Career Day. 6 chosen alumni from the first and second cohorts (Agon and Bifrost) showcased their projects

- Data Analytics Specialization is introduced to the public. Students will learn how to use Python and SQL for Data Analysis in one month

Career Support & Algoritma Data Career Day
After graduating from the Algoritma Data Science Academy Program, students from each cohort who have fulfilled the criteria can join the Data Career Day. This event is held so that students can showcase their projects to the attendees, who consist of our Hiring Partners, Corporate Clients, and the public.
Students are given a 4-week period after they finish their project to prepare for Data Career Day. During Data Career Day, participants will also get the chance to be interviewed by our Hiring Partners in the Speed Dating session.

Other Workshops & Events
Kickstart Series
A collection of 3-4 hour introductory seminars for professionals who are curious about what data science is and how it impacts their career and company.
DATA VISUALIZATION SPECIALIZATION
A fun, hands-on, and project-based specialization that helps students gain full proficiency in data visualization systems and tools. Create compelling narratives by combining charting elements with custom aesthetics under the guidance of our instructors.
The learn-by-building module in every workshop carries our project-based learning philosophy through this specialization. The course capstone requires that the student build a real-world application under stringent criteria modeled after real business scenarios.
https://algorit.ma/data-visualization-specialization/
Programming for Data Science
3-day workshop
P4DS

Programming for Data Science is a course that covers the important programming paradigms and tools used by data analysts and data scientists today. You will be guided through a series of coding exercises designed to maximize your familiarity with data science programming in RStudio, an integrated development environment for the statistical computing language R.
Upon completion of this workshop, you will be familiar with the programming language, popular tools, libraries (data science packages) and tool kits required to excel in your data analysis and statistical computing projects.
For more information:
https://algorit.ma/course/programming-for-data-science

Module 1: Data Science in R
Data Science in R
- R Programming Basics
- Why Learn R?
- R Studio Interface
- Data Structures in R
Working with Data
- Reading & Extracting Data
- Understanding Statistics
- Exploratory Data Analysis
Data Manipulation
- Working with your Global Environment
- Getting familiar with your Workspace
- Continuous and Categorical Data

Module 2: Data Manipulation
Data Manipulation II
- Vector Types and Classes
- List and Objects
- Matrix and Data Frames
Practical Data Cleansing
- The Data Transformation Process
- Reproducible Data Science Projects
- Reading and Writing from your IDE
R in Practice
- Programming Exercise: e-Commerce Retail Datasets
- In-depth review of Data Frame subsetting
- Sampling and Randomization
- Cross-Tabulations
- Aggregations
Academy Modules
Graded Quiz
Working with R
R Scripts and Functions
R Markdown
Why Care about Reproducibility
Practical Statistics
2-day workshop
PS

Pave the statistical foundation for more advanced machine learning theories later on in the specialization by picking up the key ideas in statistical thinking. Learn to interpret correlations, construct confidence intervals and other statistical principles that form the basis of many common machine learning models.
The 2-day course is optional for participants of the Data Science and Machine Learning Specialization and intended for learners without prior experience in statistics.

Module 1: Descriptive Statistics
5-Number Summary
- Mean, Median and Mode
- Measures of Central Tendency
- Quantiles in R
Central Tendency & Variability
- Probability Distribution Function
- Visualizing Central Tendency
- Variation, Variance and Covariance
Standard Score and z-Score
- Standard Normal Curve
- Central Limit Theorem
- z-Score Calculation & Student's T-test

Module 2: Inferential Statistics
Probabilities
- Probability Mass Function
- Probability Density Function
- Expected Values
Intervals
- Confidence Intervals
- Prediction Intervals
- p-Values
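As an illustration of two of the descriptive-statistics topics in this workshop, the sketch below computes a 5-number summary and z-scores. It is written in Python for brevity, while the workshop itself uses R. The sample values and the helper names `five_number_summary` and `z_scores` are hypothetical, and quartiles are computed with the "inclusive" interpolation method as one reasonable choice among several.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) of a list of numbers."""
    ordered = sorted(values)
    # n=4 yields the three quartile cut points; "inclusive" treats the
    # data as containing the minimum and maximum of the population.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


def z_scores(values):
    """Standardize values: subtract the mean, divide by the sample sd."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    return [(v - mean) / sd for v in values]


# Hypothetical sample, used only for illustration.
sample = [2, 4, 4, 5, 7, 9, 11]
summary = five_number_summary(sample)
scores = z_scores(sample)
print(summary)
print([round(z, 2) for z in scores])
```

After standardization the scores have mean 0 and standard deviation 1, which is what makes values from differently scaled variables comparable.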
Academy Modules
Tips & Techniques: R for Statisticians
Density Plots
Interpreting Box Plots (Box-and-Whisker)
Better summary statistics with skimr()
Pairs Matrix
Data Visualization
in R
3-day workshop
4-day
DVinR
Base Plotting I
- Plots and Lines
- Built-in Plot Types
- Legends and Annotations
- Other built-in Plotting Functionalities
Base Plotting II
- Histograms and Curves
- Cleveland's Dot Plot
- Axis, Titles, Subtitles and Panel Styles
- The notorious pie chart
Academy Modules
Project: Mining Trending Videos on YouTube
Hands-on data visualization
Identifying temporal patterns in trending videos
Combining aesthetics and geometries
Interactive Plotting &
Web Dashboard
Dashboards
3-day
4-day workshop
IP&WD
Building on the foundation from previous classes, we will create a series of interactive plots and gadgets that render multiple visualization elements based on the user’s input. This is the final workshop leading up to the data visualization capstone project.
The course follows our learn-by-building approach, in that students are tasked to reproduce a series of plots applying what they’ve learned. It covers an exhaustive list of techniques that add interactivity to an R document and set the stage for the data science capstone project.
For more information:
https://algorit.ma/course/web-dashboards

Module 1: Interactive Visualization
Working with Plotly
- Refresher on dplyr
- ggplotly function
- Visualization as a HTML widget
- Range Slider and other interactivity
Publication & Layout Options
- Multiple Plots Arrangement
- More export functions
- Subplots
- Tips and Techniques for Layouts

Module 2: Web Dashboard Development
Flex Dashboard
- Creating Flex Dashboard from RStudio
- Layouts
- Hands-on Practice: Text, Plots, Tables
- Demonstration and Practical Advice
Interactive Document
- Inputs and Outputs
- The renderPlot() function
- Embedded Application
- Demonstration and Practical Advice
Shiny Web App
- Shiny Dashboard
- Tabs and Pagination
- UI, Server and Shiny Functions
- Custom Styles, Structure
Academy Modules
Tips on Web Dashboard Deployment
Working with live data
App deployment solutions
Tips for live dashboard performance
Data Visualization
Capstone Project
After having learned and explored appropriate techniques for visualizing data, students are required to deploy an interactive dashboard web application using a Shiny server. The application should contain plotting objects such as ggplot and/or leaflet that display useful insights. In addition, students are given the freedom to use their own dataset or past datasets from previous classes.
Assessment and grading will be discussed in class.

VISUALIZE YOUR SUCCESS
Then take action
Regression
Models
4-day
3-day workshop
RM
Module 1: Regression Models I
OLS Regression
- Understanding Least Squares
- Outliers I
- Simple Linear Regression
Linear Models in R
- Understanding Coefficients
- Plotting Regression
- Model Construction
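To illustrate the least-squares idea behind this module, the following minimal sketch fits a simple linear regression with one predictor from first principles. It is written in Python for brevity, whereas the workshop works in R. The function name `fit_simple_ols` and the data points are hypothetical and chosen so the fitted line is easy to verify by eye.

```python
def fit_simple_ols(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the ratio of the x-y co-variation to the variation in x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data lying exactly on y = 1 + 2x.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
intercept, slope = fit_simple_ols(xs, ys)
print(intercept, slope)
```

The slope is read as the expected change in y for a one-unit increase in x, which is the interpretation the "Understanding Coefficients" topic refers to.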
Academy Modules
Graded Quiz
Classification in
Machine Learning I
CIML1
4-day
3-day workshop
Academy Modules
Graded Quiz
Learning-by-Building Module (3 points)
Logistic Regression on Credit Risk
Applying what you’ve learned, present a simple R Markdown document in which you demonstrate the use of logistic regression on the lbb_loans.csv dataset. Explain your findings wherever necessary and show the necessary data preparation steps. To help you through the exercise, consider the following questions throughout the document:
- How do we correctly interpret the negative coefficients obtained from your logistic regression?
- How do we know which of the variables are more statistically significant as predictors?
- What are some strategies to improve your model?

Module 2: Nearest Neighbours Algorithm
Closer Look at Classification
- Probabilities vs Class Response
- Confusion Matrix
- Sensitivity, Specificity & Precision
k-NN in Action
- Characteristics of k-NN
- Positives and Negatives
- Diagnosing Breast Cancer with k-NN
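The confusion-matrix metrics named in this workshop can be sketched as follows. This is a minimal Python illustration, not the course's own R material; the helper name `confusion_counts` and the label vectors are hypothetical, with 1 standing for the positive class.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1  # predicted positive, actually negative
        elif a == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1
    return tp, fp, tn, fn


# Hypothetical labels: 4 actual positives, 6 actual negatives.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(actual, predicted)
sensitivity = tp / (tp + fn)  # share of actual positives that were found
specificity = tn / (tn + fp)  # share of actual negatives that were found
precision = tp / (tp + fp)    # share of positive predictions that were right
accuracy = (tp + tn) / len(actual)
print(sensitivity, specificity, precision, accuracy)
```

Reporting sensitivity and specificity alongside accuracy matters most when the classes are imbalanced, since accuracy alone can look good while most positives are missed.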
Classification in
Machine Learning II
4-day
3-day workshop
CIML2
Law of Probability
- Dependent and Independent Events
- Bayes Theorem
- Formula for Posterior Probability
Naive Bayes Classifier
- Characteristics of a Naive Bayes Classifier
- The "naive" assumptions
- Customer Churn example
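As a small worked example of the posterior-probability formula covered here, the sketch below applies Bayes' theorem, P(A|B) = P(B|A) P(A) / P(B), to a churn-style question. All probabilities are made-up numbers for illustration only, and the function name `posterior` is hypothetical.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """Return P(A | B) from P(A), P(B | A) and P(B | not A)."""
    # Law of total probability gives the evidence term P(B).
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical inputs: 20% of customers churn; 60% of churners filed a
# complaint, compared with 10% of customers who stayed.
p_churn = 0.2
p_complaint_given_churn = 0.6
p_complaint_given_stay = 0.1

p_churn_given_complaint = posterior(
    p_churn, p_complaint_given_churn, p_complaint_given_stay
)
print(round(p_churn_given_complaint, 3))
```

A Naive Bayes classifier repeats this update for several predictors at once, multiplying their likelihoods under the "naive" assumption that the predictors are independent given the class.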
Academy Modules
Graded Quiz
Learning-by-Building Module (3 points)
Identifying Risky Bank Loans
Use any of the 3 classification algorithms you’ve learned in this lesson to predict the risk status of a bank loan. The variable default in the dataset indicates whether the applicant did default on the loan issued by the bank. Use an R Markdown document to lay out your process, and explain the methodology in 1 or 2 brief paragraphs.
The student should be awarded the full (3) points when:
- The preprocessing steps are done, and the student shows an understanding of holding out a test / cross validation set for an estimate of the model’s performance on unseen data
- The model’s performance is sufficiently explained (accuracy may not be the most helpful metric here! Recall what you’ve learned regarding specificity and sensitivity)
- The student demonstrates extra effort in evaluating his/her model, and proposes ways to improve the accuracy obtained from the initial model
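The rubric above asks for a held-out test set. A minimal sketch of that idea in Python follows (the assignment itself is done in R Markdown). The function name `train_test_split`, the 80/20 ratio, the seed, and the stand-in rows are all assumptions made for illustration.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle rows reproducibly and hold out a fraction for testing."""
    rng = random.Random(seed)  # fixed seed so the split can be repeated
    shuffled = list(rows)      # copy, so the caller's data is untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical stand-in for 100 loan records.
rows = list(range(100))
train, test = train_test_split(rows)
print(len(train), len(test))
```

The model is fitted on `train` only, and its metrics are reported on `test`, so the estimate reflects performance on data the model has not seen.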
Unsupervised
Machine Learning
4-day
3-day workshop
UML
Learn PCA (Principal Component Analysis), Clustering, and other algorithms to work with unsupervised machine learning tasks where the target variable is not known or defined. Applying what you’ll learn from this workshop, you will be tasked to develop an anomaly detection or an e-commerce product recommendation model that can be related to real-life business scenarios.
We strongly recommend that you complete the pre-requisite courses prior to taking this course. Some concepts presented throughout the lecture may be less-than-ideal for practitioners who are new to the field of machine learning.
For more information:
https://algorit.ma/course/unsupervised-ml

Background
- Understanding Unsupervised Learning
- The "dimensionality" problem
- Industrial Use of PCA
Principal Component Analysis
- Rethinking about Covariances
- The Case for PCA
- Eigenvalues and Eigenvectors
PCA from First Principles
- Just enough Matrix Algebra
- Mathematical Proof
- Visualization and Visual Proof
PCA in Action
- Dubious Property Sales in NYC
- PCA on US Arrests data
- Biplot and the variables factor map
PCA in Action II
- Eigenfaces
- PCA on credit loan data
- Deconstruction and Reconstructing Faces with PCA
- Principal Components by hand

Module 2: k-Means Clustering
Understanding Clustering
- Centroid-based Clustering Algorithms
- The k-Means Procedure
- Mathematical Details
k-Means Clustering in Action
- Cluster-based Product Recommendation
- Scaling and Implementation Details
- Visualizing Clusters
Evaluating k-Means
- Between sum-of-squares
- Within sum-of-squares
- Combining k-Means with PCA
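The within and between sum-of-squares used to evaluate k-Means can be sketched as below. This Python example does not run k-Means itself; it assumes a cluster assignment is already given and only computes the evaluation quantities. The points and the helper names are hypothetical.

```python
def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sum_of_squares(clusters):
    """Return (within, between, total) sum-of-squares for given clusters."""
    all_points = [p for cluster in clusters for p in cluster]
    grand = centroid(all_points)
    within = sum(
        sq_dist(p, centroid(cluster)) for cluster in clusters for p in cluster
    )
    between = sum(
        len(cluster) * sq_dist(centroid(cluster), grand) for cluster in clusters
    )
    total = sum(sq_dist(p, grand) for p in all_points)
    return within, between, total


# Hypothetical assignment: two well-separated groups of 2-D points.
clusters = [
    [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0)],
    [(8.0, 8.0), (9.0, 8.0), (8.0, 9.0)],
]
within, between, total = sum_of_squares(clusters)
print(within, between, total)
```

The total sum-of-squares splits exactly into the within and between parts, so a good clustering is one where the between share is large relative to the total.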
Academy Modules
Graded Quiz
Time Series &
Forecasting
TS&F
4-day
3-day workshop
Module 2: Forecasting
Forecasting I
- Simple Moving Average
- Simple Moving Average from First Principles
- Log-transformation
Forecasting II
- Forecasting using One-sided SMA
- Forecasting using Exponential Smoothing
- Holt's Exponential Smoothing
Forecasting III
- The beta and gamma coefficients
- Mathematical Details
- Holt-Winters Exponential Smoothing
Advanced Time Series
- ACF and PACF
- ARMA and ARIMA Models
- Stationarity and Differencing
Advanced Time Series II
- Augmented Dickey-Fuller (ADF) test
- Seasonal ARIMA
- Tips to work with xts
- Facebook's Prophet
- Quantmod for quantitative traders

Academy Modules
Graded Quiz
Learning-by-Building Module (3 points)
Forecasting the Crime rate in Chicago
Download the dataset from Chicago Crime Portal, and use a sample of these data to build a forecasting project where you inspect the seasonality and trend of crime in Chicago. Submit your project as an RMD file, and address the following questions:
- Is crime generally rising in Chicago in the past decade (last 10 years)?
- Is there a seasonal component to the crime rate?
- Which time series method seems to capture the variation in your time series better? Explain your choice of algorithm and its key assumptions
Students should be awarded the full (3) points if they address at least 2 of the above questions.
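Two of the smoothing methods listed in the Forecasting module can be written from first principles in a few lines. The sketch below is in Python rather than the workshop's R; the function names, the window size, the smoothing weight `alpha`, and the series are hypothetical choices for illustration.

```python
def simple_moving_average(series, window):
    """One-sided (trailing) SMA: mean of the last `window` observations."""
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def exponential_smoothing(series, alpha):
    """Simple exponential smoothing, seeded with the first observation."""
    smoothed = [series[0]]
    for value in series[1:]:
        # New level = weighted blend of the newest value and the old level.
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical series with a steady upward trend.
series = [10, 12, 14, 16, 18, 20]
sma = simple_moving_average(series, window=3)
ses = exponential_smoothing(series, alpha=0.5)
print(sma)
print(ses)
```

Both outputs lag behind the rising series, because neither method models trend. Handling trend and seasonality is the role of the additional coefficients in Holt's and Holt-Winters smoothing.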
Neural Network &
Deep Learning
3-day workshop
4-day
NN&DL
- The biological brain inspiration
- Cost function
- The building blocks of neural networks
- Layers, Nodes and Signals
- Network Topology
- Direction of signal
Machine Learning
Capstone Project
After having learned various machine learning methods and their applications, students are required to choose one project that challenges them to construct an optimal model from the dataset given. The selection of methods includes Forecasting, Regression and Classification.

LEARN MORE AND DIVE DEEPER
DATA ANALYTICS SPECIALIZATION
https://algorit.ma/data-analytics-specialization/
Python for Data Analysts
5-day workshop
P4DA
Python Programming Basic
- Working with Jupyter Notebook
- Python Syntaxes and Jargons
Module 4: Learn-by-Building
Use a small sample of a larger CRM dataset to perform an exploratory data analysis. Apply what you have learned using pandas! Try to extract insights from the data in order to understand which customers offer a better value proposition to the business.
Exploratory Data Analysis
4-day workshop
EDA
- Frequency Table in pandas
- Higher Dimensional Table
- Data Aggregation
- Using Pivot Table
Data Wrangling & Group by Aggregation
4-day workshop
DW&GA
Reshaping data is an important component of any data wrangling toolkit as it allows the analyst to “massage” the data into the desired shape for further processing.
- Stack and Unstack
- Working with Multiindex Data Frame
- Data Melt
- Using Group By
Module 2: Learn-by-Building
Use a stock dataset to compare stocks’ price variance. Use the data wrangling techniques you have learned to answer the questions related to stock data analysis. Find out which stock has the lowest volume, which is the most funded, and which has the highest closing price difference each day on average!
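The reshaping and group-by operations named in this workshop might look like the following in pandas. This is a minimal sketch on a tiny made-up price table; the column names, symbols, and values are hypothetical and are not the course's stock dataset.

```python
import pandas as pd

# Hypothetical long-format price table: one row per date and symbol.
prices = pd.DataFrame(
    {
        "date": ["2019-01-02", "2019-01-02", "2019-01-03", "2019-01-03"],
        "symbol": ["AAA", "BBB", "AAA", "BBB"],
        "open": [10.0, 20.0, 11.0, 19.0],
        "close": [11.0, 19.5, 10.5, 21.0],
    }
)

# Group by: average absolute open-to-close difference per symbol.
prices["diff"] = (prices["close"] - prices["open"]).abs()
avg_diff = prices.groupby("symbol")["diff"].mean()

# Pivot to wide format: one row per date, one column per symbol.
wide = prices.pivot(index="date", columns="symbol", values="close")

# Melt goes back from wide to long format.
long = wide.reset_index().melt(
    id_vars="date", var_name="symbol", value_name="close"
)

# Stack moves the column level into the row index (a MultiIndex Series).
stacked = wide.stack()

print(avg_diff)
print(wide)
print(long)
```

Here `avg_diff.idxmax()` names the symbol with the largest average daily price difference, which is the kind of question the exercise above asks.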
SQL & Data Visualization with Pandas
4-day workshop
SQL&DVWP
- Plotting Using pandas Object
ENROLL NOW TO OUR ACADEMY!
bit.ly/algo_academy
“Learning how to do
data science is like
learning to ski.
4 Arrange payment