David Chappell
PRINCIPAL, CHAPPELL & ASSOCIATES
@DChappellAssoc www.davidchappell.com
Machine Learning Concepts
Training data
Supervised and unsupervised learning
Classifying machine learning problems and algorithms
Training a model
Testing a model
Using a model
Terminology
Training data: The prepared data used to create a model. Creating a model is called training a model.
Supervised learning: The value you want to predict is in the training data. The data is labeled. Most common.
Unsupervised learning: The value you want to predict is not in the training data. The data is unlabeled.
Data Preprocessing with Supervised Learning
[Diagram labels: Data Source 2, ..., Data Source N; Data Preprocessing Module(s); Available Data Preprocessing Modules; 2) Create training data; Training Data, made up of Features and a Target Value]
Categorizing Machine Learning Problems:
Regression
[Clustering diagram: data points grouped into Cluster 1, Cluster 2, and Cluster 3]
For unsupervised learning
Example question:
o What are our customer segments?
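The customer-segment question can be sketched in a few lines of Python. This is a minimal illustration that assumes k-means as the clustering algorithm (the slide does not name one); the customer values, the choice of three clusters, and the starting centroids are all made up for the example.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Tiny k-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    centroids = list(centroids)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: recompute each centroid from its members
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids

# Hypothetical unlabeled customers: (annual spend in $1000s, visits per month)
customers = [(1, 2), (2, 1), (1, 1), (10, 10), (11, 9), (10, 11), (20, 1), (21, 2), (19, 1)]
# Assumed starting centroids: one customer picked from each apparent group
initial = [customers[0], customers[3], customers[6]]
labels, centroids = kmeans(customers, initial)
```

Note that no target value appears anywhere in the data: the algorithm only groups similar rows, which is what makes this unsupervised.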
Styles of Machine Learning Algorithms: Examples
$$P(A \mid B) = \frac{P(A)\,P(B \mid A)}{P(B)}$$
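A worked example may make the formula concrete. The sketch below plugs numbers into Bayes' theorem; the spam scenario and every probability in it are hypothetical values chosen for illustration, not figures from the slides.

```python
def posterior(prior_a, likelihood_b_given_a, evidence_b):
    """Bayes' theorem: P(A|B) = P(A) * P(B|A) / P(B)."""
    if evidence_b <= 0:
        raise ValueError("P(B) must be positive")
    return prior_a * likelihood_b_given_a / evidence_b

# Hypothetical events: A = "message is spam", B = "message contains the word 'offer'"
p_spam = 0.2                # P(A)
p_offer_given_spam = 0.5    # P(B|A)
p_offer_given_ham = 0.05    # P(B|not A)

# P(B) via the law of total probability
p_offer = p_spam * p_offer_given_spam + (1 - p_spam) * p_offer_given_ham

# P(A|B): probability a message is spam given that it contains 'offer' (about 0.71)
p_spam_given_offer = posterior(p_spam, p_offer_given_spam, p_offer)
```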
[Diagram, training a model: Training Data, Chosen Learning Algorithm, Candidate Model]
2) Input training data (75% of all data for features 1, 3, and 6)
3) Generate candidate model
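The 75% figure in step 2 implies that the remaining data is held back. A minimal sketch of such a split follows; the helper name `split_rows`, the shuffle before splitting, the fixed seed, and the sample rows are all assumptions for illustration, since the slides do not say how the 75% is selected.

```python
import random

def split_rows(rows, train_fraction=0.75, seed=0):
    """Shuffle a copy of the rows, then split them into training and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # assumed: random selection, fixed seed for repeatability
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical rows: (feature 1, feature 3, feature 6, target value)
rows = [(i, i * 2, i % 3, i % 2) for i in range(20)]
train_rows, test_rows = split_rows(rows)
```

With 20 rows this yields 15 training rows for the learning algorithm and 5 rows kept aside for testing.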
Testing a Model with Supervised Learning
[Diagram: test data from the Training Data, with columns Feature 1, Feature 3, Feature 6, and Target Value, is fed to the Candidate Model]
1) Input test data (remaining 25% of all data for features 1, 3, and 6)
2) Generate target values from test data
3) Compare target values generated from test data with actual target values
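The three testing steps can be sketched as a short comparison loop. The candidate model here is a hypothetical threshold rule, the test rows are made up, and accuracy (the fraction of matching target values) is an assumed way to summarize the comparison; the slides do not prescribe a metric.

```python
def accuracy(model, test_rows):
    """Fraction of test rows where the generated target equals the actual target."""
    correct = 0
    for *features, actual in test_rows:
        generated = model(features)   # step 2: generate a target value from test data
        if generated == actual:       # step 3: compare with the actual target value
            correct += 1
    return correct / len(test_rows)

# Hypothetical candidate model: predicts 1 when feature 1 is greater than 5
def candidate_model(features):
    return 1 if features[0] > 5 else 0

# Hypothetical test rows: (feature 1, feature 3, feature 6, actual target value)
test_rows = [(2, 0.1, 7, 0), (8, 0.4, 3, 1), (6, 0.9, 1, 0), (9, 0.2, 5, 1)]

# Step 1: input the test data; the model gets 3 of the 4 rows right
score = accuracy(candidate_model, test_rows)
```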
Improving a Model: Some Options
1) Choose different features
[Diagram: Chosen Learning Algorithm, Candidate Model]