Machine learning enables computers to learn and infer from data.
Machine Learning in Our Daily Lives
Spam Filtering
Types of Machine Learning
Types of Supervised Learning
Supervised Learning Overview
Data with answers + Model → Fit model
Regression: Numerical Answers
Movie data (unknown revenue) + model → Predict → Predicted revenue
Classification: Categorical Answers
Labeled data + Model → Fit model
Unlabeled data + model → Predict → Labels
For example: emails labeled as spam/not spam + Model → Fit model; unlabeled emails + model → Predict → spam or not spam
Machine Learning Vocabulary
Target: the predicted category or value of the data (the column to predict)
[Table: the iris data set, with columns Sepal length, Sepal width, Petal length, Petal width, Species]
What is Classification?
A flower shop wants to predict which flower a customer is most likely to purchase, based on its similarity to the customer's previous purchase.
What is Needed for Classification?
Model data with:
Features that can be quantified
Labels that are known
K Nearest Neighbors Classification
[Figure: a scatter plot of Age versus Number of Malignant Nodes, with each patient labeled survived or did not survive. A new point is classified by a vote of its K nearest neighbors; as K grows from 1 to 4, the vote in this example goes from 0–1 to 1–1, 2–1, and 3–1.]
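The neighbor vote illustrated above can be sketched in plain Python. The points and labels below are made up for illustration; they are not the data from the figure.

```python
import math

# Hypothetical training points: (number of malignant nodes, age) -> outcome
points = [(2, 25), (3, 30), (10, 45), (12, 50), (15, 55)]
labels = ["survived", "survived", "did not survive",
          "did not survive", "did not survive"]

def knn_predict(query, k):
    """Classify `query` by a majority vote of its k nearest neighbors."""
    # Euclidean distance from the query to every stored point
    dists = sorted(
        (math.dist(query, p), lbl) for p, lbl in zip(points, labels)
    )
    votes = [lbl for _, lbl in dists[:k]]
    # Majority vote among the k closest labels
    return max(set(votes), key=votes.count)

print(knn_predict((4, 28), k=3))  # two of the three nearest are "survived"
```

Note that "fitting" this model is just storing the data; all the work happens at prediction time.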
What is Needed to Select a KNN Model?
The correct value for K
How to measure the closeness of neighbors
[Figure: scatter plot of Age versus Number of Malignant Nodes]
K Nearest Neighbors Decision Boundary
[Figure: the decision boundary on the Age versus Number of Malignant Nodes plot, shown for K = 1 and for K = All]
Value of 'K' Affects Decision Boundary
[Figure: side-by-side decision boundaries for K = 1 and K = All on the Age versus Number of Malignant Nodes plot]
Measurement of Distance in KNN
[Figure: scatter plot of Age versus Number of Malignant Nodes]
Euclidean Distance (L2 Distance)
[Figure: the straight-line distance d between two points, with legs ΔNodes and ΔAge]
d = √(ΔNodes² + ΔAge²)
Manhattan Distance (L1 or City Block Distance)
[Figure: the path between two points along the axes, with legs ΔNodes and ΔAge]
d = |ΔNodes| + |ΔAge|
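The two distance formulas can be compared directly on a pair of hypothetical patients (the coordinates are invented for illustration):

```python
import math

# Two hypothetical patients: (number of malignant nodes, age)
a = (3, 30)
b = (10, 45)

d_nodes = abs(a[0] - b[0])  # ΔNodes
d_age = abs(a[1] - b[1])    # ΔAge

# Euclidean (L2): straight-line distance
euclidean = math.sqrt(d_nodes**2 + d_age**2)

# Manhattan (L1): sum of the per-axis differences
manhattan = d_nodes + d_age

print(euclidean, manhattan)  # Manhattan is never smaller than Euclidean
```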
Scale is Important for Distance Measurement
[Figure: Age (about 18–60) versus Number of Surgeries (1–5). On the raw data, Age spans a much wider range, so it dominates the distance calculation and determines which points are the nearest neighbors.]
"Feature Scaling"
[Figure: after feature scaling, both features contribute comparably to the distance, and a different set of points becomes the nearest neighbors.]
Comparison of Feature Scaling Methods
Standard Scaler: mean-center the data and scale to unit variance
Minimum-Maximum Scaler: scale the data to a fixed range (usually 0–1)
Maximum Absolute Value Scaler: scale so the maximum absolute value of each feature is 1
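The three formulas can be applied by hand to one hypothetical feature column to see what each one does:

```python
import numpy as np

# A single hypothetical feature column (e.g. age)
x = np.array([18.0, 20.0, 22.0, 24.0, 60.0])

# Standard Scaler: mean-center, then divide by the standard deviation
standard = (x - x.mean()) / x.std()

# Min-Max Scaler: map the observed range onto [0, 1]
min_max = (x - x.min()) / (x.max() - x.min())

# MaxAbs Scaler: divide by the largest absolute value
max_abs = x / np.abs(x).max()

print(standard.round(2), min_max.round(2), max_abs.round(2))
```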
Feature Scaling: The Syntax
Import the class containing the scaling method
from sklearn.preprocessing import StandardScaler
Create an instance of the class
StdSc = StandardScaler()
Fit the scaler on the data and then transform it
StdSc = StdSc.fit(X_data)
X_scaled = StdSc.transform(X_data)
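A runnable version of this pattern, on a small made-up data set (the array values are invented for illustration):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical data: columns are number of surgeries and age
X_data = np.array([[1, 18], [2, 24], [3, 40], [5, 60]], dtype=float)

scaler = StandardScaler()
scaler.fit(X_data)            # learn each column's mean and variance
X_scaled = scaler.transform(X_data)

# fit_transform performs both steps in one call
X_same = scaler.fit_transform(X_data)

print(X_scaled.mean(axis=0))  # each column now has mean ~0
print(X_scaled.std(axis=0))   # and unit variance
```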
Multiclass KNN Decision Boundary
[Figure: K = 5 decision boundaries for three classes (full remission, partial remission, did not survive) on the Age versus Number of Malignant Nodes plot]
Regression with KNN
[Figure: KNN regression fits for K = 20, K = 3, and K = 1]
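KNN also works for numerical targets: the prediction is the average of the K nearest neighbors' values. A minimal sketch with scikit-learn's KNeighborsRegressor, on made-up data:

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Hypothetical 1-D data: samples of y = 2x
X = np.arange(10).reshape(-1, 1).astype(float)
y = 2 * X.ravel()

# A small K follows the training data closely; a large K smooths it out
for k in (1, 3, 9):
    model = KNeighborsRegressor(n_neighbors=k).fit(X, y)
    print(k, model.predict([[4.0]]))
```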
Characteristics of a KNN Model
Fast to create the model, because it simply stores the data
Slow to predict, because of the many distance calculations
Can require a lot of memory if the data set is large
K Nearest Neighbors: The Syntax
Import the class containing the classification method
from sklearn.neighbors import KNeighborsClassifier
Create an instance of the class
KNN = KNeighborsClassifier(n_neighbors=3)
Fit the instance on the data and then predict the expected value
KNN = KNN.fit(X_data, y_data)
y_predict = KNN.predict(X_data)
The fit and predict/transform syntax will show up throughout the course.
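Putting the pieces together, here is an end-to-end sketch on the iris data from the vocabulary slides, combining feature scaling with a KNN classifier (the train/test split and K = 5 are illustrative choices, not from the slides):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Iris data: sepal/petal measurements (features) and species (target)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Scale first so no feature dominates the distance calculation;
# fit the scaler on the training data only
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

KNN = KNeighborsClassifier(n_neighbors=5)
KNN = KNN.fit(X_train_s, y_train)
accuracy = KNN.score(X_test_s, y_test)
print(accuracy)
```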