
Machine Learning

How do organisations come up with and implement new AI features?


Which teams are involved and who does what?
A Real World Example

A new Ridesharing Company


Ridesharing Company
Book a Ride

Notice something missing?
Finding out the missing feature.
problem customers are

SOCIAL MEDIA

FORUMS APP SUPPORT TEAM


Customers don't know when they'll reach their destination.

Who's going to get involved?

CEO CTO CFO

DATA SCIENTIST
ML ENGINEER

PRODUCT MANAGER

APP DEVELOPER DATABASE ADMIN


The First Step
Formulate the problem statement.

Show the user a predicted arrival time once they have selected the destination.

PRODUCT MANAGER
Problem Statement
Unfortunately, it's not enough to simply formulate the problem statement.
Add Success Criteria

Show the user a predicted arrival time once they have selected the destination.
Prediction is 95% accurate.
PRODUCT MANAGER
Problem Statement and Success Criteria

Show the user a predicted arrival time once they have selected the destination.
Prediction is 95% accurate.
Calculated in under 1 s.
PRODUCT MANAGER
Needs to include Internal Constraints as well.


Show the user a predicted arrival time once they have selected the destination.
Prediction is 95% accurate.
Calculated in under 1 s.
Should cost the company < $100K & take 3 months to develop.
PRODUCT MANAGER, CFO
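
The success criteria and internal constraints above can be written down as explicit thresholds. This is only a sketch: the numbers come from the slide, while the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuccessCriteria:
    # Thresholds from the slide; field names are illustrative assumptions.
    min_accuracy: float = 0.95      # "Prediction is 95% accurate"
    max_latency_s: float = 1.0      # "Calculated in under 1 s"
    max_cost_usd: float = 100_000   # "< $100K"
    max_months: float = 3.0         # "3 months to develop"


@dataclass(frozen=True)
class ProjectEstimate:
    accuracy: float
    latency_s: float
    cost_usd: float
    months: float


def unmet_criteria(est, crit=SuccessCriteria()):
    """Return the names of the criteria the estimate fails (empty list = all met)."""
    failures = []
    if est.accuracy < crit.min_accuracy:
        failures.append("accuracy")
    if est.latency_s >= crit.max_latency_s:
        failures.append("latency")
    if est.cost_usd >= crit.max_cost_usd:
        failures.append("cost")
    if est.months > crit.max_months:
        failures.append("timeline")
    return failures
```

An estimate that passes everything returns an empty list, which gives the PM and CFO a single yes/no answer to sign off on.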
Should you use ML?

Does this problem even need Artificial Intelligence?
PM, CTO

Regular algorithms can't predict times, so we need Machine Learning here.
PM, CTO
PM with MLE & DS

PM CTO

Data Scientist ML Engineer


Data Scientist Takes Over

Deliverable for Data Scientist:

DATA SCIENTIST ML ENGINEER

MODEL
Myth: Data Scientist is just a fancy term for a
Software Engineer working in AI

DATA SCIENTIST
Researches and experiments.
ML is a relatively new field.

RESEARCH

DATA ENGINEER
DATA SCIENTIST
EXPERIMENT
Problem Statement & Success Criteria

Show the user a predicted arrival time once they have selected the destination.
Prediction is 95% accurate.
Calculated in under 1 s.
Should cost the company < $100K & take 3 months to develop.
PM
DATA SCIENTIST
Don't reinvent the wheel.

Existing Research

DATA SCIENTIST
Convert Problem Statement to Specific AI Category

CLASSIFICATION REGRESSION REINFORCEMENT PREDICTION

DATA SCIENTIST

Comes with experience.
Up-to-date know-how of what's happening in the ML industry.

PREDICTION: TTP (TRAVEL TIME PREDICTION)
Paper compares many ML algorithms for TTP.

XGBoost, Random Forest, Decision Tree, LSTM

DATA ALGORITHMS MODELS

Data Scientist chooses the best two: Random Forest and LSTM.

DATA SCIENTIST



Feature Engineering

FEATURES ALGORITHM

What are Features?
SUPERVISED MACHINE LEARNING

FEATURES ALGORITHM

PAST DATA: Trip to the Liquor Store took 69 minutes yesterday.
NEW DATA
PREDICTION: Most probably, today's trip to the Liquor Store is going to take 69 mins.
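
The past data to prediction idea can be sketched with the simplest possible "model": average the past trips on the same route. The ride list and the function name are hypothetical, and a real model would learn from many more features than the route alone.

```python
from statistics import mean

# Hypothetical past rides: (source, destination, travel time in minutes).
past_rides = [
    ("Home", "Liquor Store", 69),
    ("Home", "Liquor Store", 71),
    ("Home", "Office", 42),
]


def predict_minutes(source, destination, history):
    """Predict travel time as the mean of past trips on the same route."""
    times = [t for s, d, t in history if (s, d) == (source, destination)]
    if not times:
        return None  # no past data for this route, so no prediction
    return mean(times)
```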
What are Features?
Should this past data
Feature Engineering
from this and build a

ALGORITHM
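
As a rough illustration of feature engineering, one raw ride record can be turned into values an algorithm can use. The feature names and the rush-hour definition are assumptions made for this sketch, not something the slides specify.

```python
from datetime import datetime

WEEKEND_DAYS = {"SAT", "SUN"}
RUSH_HOURS = {7, 8, 9, 17, 18, 19}  # assumed definition of rush hour


def extract_features(source, destination, day, clock):
    """Turn one raw ride record into a dictionary of features.

    `day` is the day abbreviation stored with the record (e.g. "FRI");
    `clock` is a 12-hour time string such as "6:00AM".
    """
    t = datetime.strptime(clock, "%I:%M%p")
    return {
        "route": f"{source}->{destination}",
        "hour": t.hour,
        "is_weekend": day.upper() in WEEKEND_DAYS,
        "is_rush_hour": t.hour in RUSH_HOURS,
    }
```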
Where do we get this data?
Data Acquisition Strategies

Internal External
Internal Data for ML
Ride data from the past.

SERVER LOGS

DATABASE ADMIN

APP DATABASE
SOURCE     DESTINATION  TIMESTAMP             TRAVEL TIME
Home       Office       18/8/2010 FRI 6:00AM  420 mins
Office     Wine Shop    19/8/2010 SAT 8:00PM  69 mins
Wine Shop  Hospital     20/8/2010 SUN 1:00AM  666 mins
Hospital   Home         21/8/2010 MON 5:00PM  007 mins

DATABASE ADMIN
External Data for ML
Data outside the company

Budgeting for Data Acquisition

CFO
Data should ideally look like
Data in the Real World
Data Preparation
Data Processing

SOURCE  DESTINATION  TIMESTAMP  TRAVEL TIME  DATE  TEMP  RAINFALL
Data Preparation
Removing outliers

SOURCE  DESTINATION  TIMESTAMP             TRAVEL TIME
Office  Wine Shop    18/8/2010 FRI 6:00AM  69 mins
Office  Wine Shop    19/8/2010 SAT 8:00PM  69 mins
Office  Wine Shop    20/8/2010 SUN 1:00AM  1666 mins
Office  Wine Shop    21/8/2010 MON 5:00PM  69 mins
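
One simple way to drop a value like 1666 mins is to compare each travel time with the median for that route. The ratio rule and its default of 3.0 are illustrative assumptions; other outlier rules would work as well.

```python
from statistics import median


def drop_outliers(minutes, max_ratio=3.0):
    """Keep travel times no larger than `max_ratio` times the median."""
    mid = median(minutes)
    return [m for m in minutes if m <= max_ratio * mid]


travel_times = [69, 69, 1666, 69]  # Office -> Wine Shop rides from the table
cleaned = drop_outliers(travel_times)
```

The median is used instead of the mean because a single extreme value barely moves it.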
Data Preparation
Data Inconsistency

SOURCE  DESTINATION  TIMESTAMP             TRAVEL TIME
Office  Wine Shop    18/8/2010 FRI 6:00AM  69 min
Office  Wine Shop    21/8/2010 Mon 6:00AM  20 km
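
A small validation step can catch rows where the travel time column holds something that is not a duration, such as "20 km". This is a sketch under the assumption that values arrive as text with a unit suffix; the function name is hypothetical.

```python
import re

VALUE_WITH_UNIT = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*$")
MINUTE_UNITS = {"min", "mins", "minute", "minutes"}


def parse_travel_minutes(raw):
    """Return the travel time in minutes, or None if the value is not in minutes."""
    match = VALUE_WITH_UNIT.match(raw)
    if match is None:
        return None
    value, unit = match.groups()
    if unit.lower() not in MINUTE_UNITS:
        return None  # e.g. "20 km" is a distance, not a travel time
    return float(value)
```

Rows that come back as None can then be fixed or dropped before training.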
Data Preparation
Data Interoperability

SOURCE  DESTINATION  TIMESTAMP             TRAVEL TIME
Office  Wine Shop    18/8/2010 FRI 6:00AM  69 mins
Office  Wine Shop    18/8/2010 FRI 6:15PM  96 mins

DATE                TEMP  RAINFALL
8/18/2010 FRI 0600  28°C  10 mm
8/18/2010 FRI 1615  34°C  5 mm
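
The two sources write the same moment differently (day/month/year with a 12-hour clock versus month/day/year with a 24-hour clock), so both need to be parsed into one common representation before they can be joined. A minimal sketch, with hypothetical function names:

```python
from datetime import datetime


def parse_ride_timestamp(date_text, clock_text):
    """Ride data: day/month/year and a 12-hour clock, e.g. 18/8/2010 6:00AM."""
    return datetime.strptime(f"{date_text} {clock_text}", "%d/%m/%Y %I:%M%p")


def parse_weather_timestamp(date_text, clock_text):
    """Weather data: month/day/year and a 24-hour clock, e.g. 8/18/2010 0600."""
    return datetime.strptime(f"{date_text} {clock_text}", "%m/%d/%Y %H%M")


ride_time = parse_ride_timestamp("18/8/2010", "6:00AM")
weather_time = parse_weather_timestamp("8/18/2010", "0600")
same_moment = ride_time == weather_time
```

Once both sides are `datetime` values, the ride row and the weather row can be matched on that key.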
Data Preparation
Looks simple, but takes a long time.
Dividing Data into Training & Testing

TRAINING
TESTING
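
Dividing the data can be as simple as shuffling the records and cutting the list in two. The 80/20 ratio and the fixed seed below are common choices assumed for this sketch, not values given in the slides.

```python
import random


def split_rides(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of `rows` and split it into (training, testing) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


rides = list(range(10))  # stand-in for 10 ride records
training, testing = split_rides(rides)
```

The testing rows are held back so the model is later scored on data it has never seen.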
Training Data is fed to the Algorithm to make the model

ALGORITHM

MODEL
Test data and result
Test data goes into the model.
Model makes a prediction.
Prediction is compared against the actual value.
Correct predictions turn green.
Overall accuracy
ACCURACY
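
Since travel time is a number, "correct" needs a definition. The sketch below assumes a prediction counts as correct when it is within 10% of the actual time; that tolerance is an assumption for illustration only.

```python
def accuracy(predicted, actual, tolerance=0.10):
    """Share of predictions within `tolerance` (relative error) of the actual time."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(
        abs(p - a) <= tolerance * a
        for p, a in zip(predicted, actual)
    )
    return hits / len(actual)


predicted_minutes = [70, 30, 45, 100]  # made-up model outputs
actual_minutes = [69, 40, 44, 98]      # made-up test labels
score = accuracy(predicted_minutes, actual_minutes)
```

Here three of the four predictions fall within the tolerance, so the score is 0.75, well short of the 95% success criterion.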
Hyperparameter Tuning

ALGORITHM

ACCURACY

MODEL
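
A hyperparameter is a setting chosen before training rather than learned from the data. As a toy illustration, assume the "model" predicts each trip as the average of the previous k trips; tuning then means trying several values of k and keeping the one with the lowest error. The data and function names are made up, and for brevity the error is measured on the same series, whereas a real workflow would score on held-out data.

```python
from statistics import mean


def moving_average_error(history, k):
    """Mean absolute error when each trip is predicted from the previous k trips."""
    errors = [
        abs(mean(history[i - k:i]) - history[i])
        for i in range(k, len(history))
    ]
    return mean(errors)


def tune_window(history, candidates):
    """Try every candidate k and return the one with the lowest error."""
    return min(candidates, key=lambda k: moving_average_error(history, k))


trip_minutes = [60, 80, 60, 80, 60, 80, 60, 80]  # made-up alternating trips
best_k = tune_window(trip_minutes, candidates=[1, 2, 3])
```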
Training Data is usually HUGE

100s of GBs and even TBs!

Or we can just use the cloud

Even then, training can take days or even weeks


After a few weeks

new for the IT Sector

Microbiologist Astrophysicist
Models trained simultaneously
Candidate Models
Show the user a predicted arrival time once they have selected the destination.
Prediction is 95% accurate.
Calculated in under 1 s.
Should cost the company < $100K & take 3 months to develop.
Discusses with App Developer

When will data be needed?

What should be the format of the data?

APP DEVELOPER
ML ENGINEER
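
One possible outcome of that discussion is an agreed JSON format for the request the app sends and the reply the model returns. Every field name below is hypothetical and only illustrates the kind of contract the two roles would settle on.

```python
import json


def build_request(source, destination, requested_at):
    """Serialize what the app would send once the user picks a destination."""
    return json.dumps({
        "source": source,
        "destination": destination,
        "requested_at": requested_at,  # ISO 8601 string
    })


def parse_response(body):
    """Read the predicted travel time (in minutes) from the model's reply."""
    data = json.loads(body)
    return float(data["predicted_minutes"])
```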
ML Engineer Implements

Back End
Front End

ML ENGINEER
MODEL
MLE Makes sure the Model is Performant

Book a Ride

ML ENGINEER
MODEL
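
Checking that the model is performant can start with timing a single prediction against the 1 s budget from the success criteria. The `dummy_model` below is a placeholder standing in for the real model.

```python
import time


def timed_call(predict, *args):
    """Run `predict(*args)` and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = predict(*args)
    return result, time.perf_counter() - start


def dummy_model(source, destination):
    return 69.0  # placeholder prediction in minutes


prediction, elapsed = timed_call(dummy_model, "Office", "Wine Shop")
within_budget = elapsed < 1.0  # success criterion: calculated in under 1 s
```

In practice the measurement would be repeated many times and under load, since a single call says little about worst-case latency.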
MLE Makes sure the Model is Error-free

Book a Ride

ML ENGINEER
MODEL
Model needs Traffic & Weather in Realtime

ML ENGINEER
MODEL
But Maps API already being used

ML ENGINEER
MODEL
Multiple Models are Deployed

ML ENGINEER
MODEL MODEL

A/B Testing
Final Apps

ML ENGINEER

MODEL MODEL
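
For A/B testing, each user has to be sent to the same model every time so the two groups can be compared fairly. One common approach, sketched here with a hypothetical function name, is to hash the user ID into a bucket between 0 and 1.

```python
import hashlib


def assign_model(user_id, share_a=0.5):
    """Deterministically route a user to model "A" or "B"."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return "A" if bucket < share_a else "B"
```

Because the assignment depends only on the user ID, no lookup table is needed, and changing `share_a` shifts how much traffic each model receives.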

CTO Staging & Production


MODEL MODEL
