Jia-Bin Huang
ECE-5424G / CS-5824 Virginia Tech Spring 2019
• Thank you all for participating in this class!
• SPOT survey!
Continuous: Regression, Dimensionality reduction
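k-NN (Classification/Regression)
• Model
Training data $\{(x^{(1)}, y^{(1)}), \dots, (x^{(m)}, y^{(m)})\}$
• Cost function
None
• Learning
Do nothing
• Inference
$\hat{y} = h(x^{\text{test}}) = y^{(k)}$, where $k = \arg\min_i D(x^{\text{test}}, x^{(i)})$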
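The nearest-neighbor inference rule can be sketched in a few lines. This is a minimal illustration, assuming Euclidean distance for $D$ and a majority vote over the $k$ nearest points; the function name `knn_predict` and the toy data are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=1):
    """Classify `query` by majority vote among its k nearest training points."""
    # Rank training indices by Euclidean distance to the query (assumed metric).
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Toy data: two points near the origin (class "a"), two near (5, 5) (class "b").
train_x = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
train_y = ["a", "a", "b", "b"]
label = knn_predict(train_x, train_y, (5.5, 4.0), k=3)
```

There is no training step: all work happens at query time, which matches the "Do nothing" learning entry.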
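Linear regression (Regression)
• Model
$h_\theta(x) = \theta_0 + \theta_1 x_1 + \dots + \theta_n x_n = \theta^\top x$
• Cost function
$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$
• Learning
1) Gradient descent: Repeat { $\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}$ }
2) Solving the normal equation: $\theta = (X^\top X)^{-1} X^\top y$
• Inference
$\hat{y} = h_\theta(x^{\text{test}}) = \theta^\top x^{\text{test}}$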
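As a rough sketch of the gradient-descent option, the loop below fits a single-feature model $h(x) = \theta_0 + \theta_1 x$ by batch updates on the squared-error cost. The learning rate, iteration count, and toy data are assumptions chosen for illustration.

```python
def fit_linear_gd(xs, ys, alpha=0.1, iters=2000):
    """Fit h(x) = theta0 + theta1 * x by batch gradient descent on squared error."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iters):
        # Residuals h(x_i) - y_i under the current parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update of both parameters.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 1 + 2x
theta0, theta1 = fit_linear_gd(xs, ys)
```

On this noise-free data the estimates approach the generating intercept and slope; with more features the normal equation is an alternative when the number of features is small.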
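Naïve Bayes (Classification)
• Model
$h_\theta(x) = P(Y \mid X_1, \dots, X_n) \propto P(Y) \prod_i P(X_i \mid Y)$
• Cost function
Maximum likelihood estimation: $J(\theta) = -\log P(\text{Data} \mid \theta)$
Maximum a posteriori estimation: $J(\theta) = -\log \left[ P(\text{Data} \mid \theta)\, P(\theta) \right]$
• Learning
$\pi_k = P(Y = y_k)$
(Discrete $X_i$) $\theta_{ijk} = P(X_i = x_{ij} \mid Y = y_k)$
(Continuous $X_i$) mean $\mu_{ik}$, variance $\sigma_{ik}^2$, $P(X_i \mid Y = y_k) = \mathcal{N}(X_i \mid \mu_{ik}, \sigma_{ik}^2)$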
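For the continuous-feature case, learning reduces to estimating a class prior plus a per-class mean and variance for each feature. The sketch below assumes maximum-likelihood estimates and a small variance floor; the function names and toy data are hypothetical.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate class priors and per-class, per-feature mean/variance (MLE)."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        variances = [
            sum((v - mu) ** 2 for v in col) / n + 1e-9  # assumed floor, avoids zero variance
            for col, mu in zip(zip(*rows), means)
        ]
        model[label] = (n / len(X), means, variances)
    return model

def predict_gaussian_nb(model, x):
    """Return the class with the highest log posterior (up to a shared constant)."""
    def log_post(prior, means, variances):
        total = math.log(prior)
        for v, mu, var in zip(x, means, variances):
            total += -0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var)
        return total
    return max(model, key=lambda label: log_post(*model[label]))

X = [(1.0, 2.0), (1.2, 1.8), (0.8, 2.2), (5.0, 6.0), (5.2, 5.8), (4.8, 6.2)]
y = [0, 0, 0, 1, 1, 1]
model = fit_gaussian_nb(X, y)
```

Working in log space avoids underflow when many per-feature likelihoods are multiplied.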
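Logistic regression (Classification)
• Model
$h_\theta(x) = P(y = 1 \mid x; \theta) = \frac{1}{1 + e^{-\theta^\top x}}$
• Cost function
$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right]$
• Learning
Gradient descent: Repeat { $\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}$ }
• Inference
$\hat{y} = h_\theta(x^{\text{test}}) = \frac{1}{1 + e^{-\theta^\top x^{\text{test}}}}$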
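A minimal sketch of logistic regression trained by batch gradient descent follows. It assumes each input row already carries the intercept feature $x_0 = 1$; the learning rate, iteration count, and toy data are illustrative choices.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic_gd(X, y, alpha=0.5, iters=5000):
    """Batch gradient descent on the cross-entropy cost; rows of X include x0 = 1."""
    m, n = len(X), len(X[0])
    theta = [0.0] * n
    for _ in range(iters):
        preds = [sigmoid(sum(t * v for t, v in zip(theta, row))) for row in X]
        # Gradient has the same form as in linear regression, with a sigmoid hypothesis.
        grad = [
            sum((p - yi) * row[j] for p, yi, row in zip(preds, y, X)) / m
            for j in range(n)
        ]
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta

X = [[1.0, 0.0], [1.0, 1.0], [1.0, 3.0], [1.0, 4.0]]
y = [0, 0, 1, 1]
theta = fit_logistic_gd(X, y)
p_low = sigmoid(theta[0] + theta[1] * 0.5)   # probability of class 1 at x = 0.5
p_high = sigmoid(theta[0] + theta[1] * 3.5)  # probability of class 1 at x = 3.5
```

Thresholding the predicted probability at 0.5 gives the class label.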
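Hard-margin SVM formulation
$\min_\theta \frac{1}{2} \sum_{j=1}^{n} \theta_j^2$
s.t. $\theta^\top x^{(i)} \ge 1$ if $y^{(i)} = 1$, and $\theta^\top x^{(i)} \le -1$ if $y^{(i)} = 0$
[Figure: linearly separable data in the $(x_1, x_2)$ plane with the decision boundary and its margin]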
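SVM with kernels
• Hypothesis: Given $x$, compute features $f = [f_0, f_1, f_2, \dots, f_m]^\top$
• Predict $y = 1$ if $\theta^\top f \ge 0$
• Training (original)
$\min_\theta C \sum_{i=1}^{m} \left[ y^{(i)} \text{cost}_1(\theta^\top x^{(i)}) + (1 - y^{(i)}) \text{cost}_0(\theta^\top x^{(i)}) \right] + \frac{1}{2} \sum_{j=1}^{n} \theta_j^2$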
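The feature-computation step of the kernelized hypothesis can be sketched as below. The slide does not name the kernel here, so this assumes a Gaussian (RBF) similarity to a set of landmark points with $f_0 = 1$; the landmarks, `sigma`, and the weight vector are hypothetical and not learned in this example.

```python
import math

def kernel_features(x, landmarks, sigma=1.0):
    """Return f = [1, f_1, ..., f_m] with f_i = exp(-||x - l_i||^2 / (2 sigma^2))."""
    feats = [1.0]  # f_0 = 1 is the intercept feature
    for landmark in landmarks:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, landmark))
        feats.append(math.exp(-sq_dist / (2 * sigma ** 2)))
    return feats

def predict(theta, x, landmarks, sigma=1.0):
    """Predict 1 when theta^T f >= 0, else 0."""
    f = kernel_features(x, landmarks, sigma)
    return 1 if sum(t * v for t, v in zip(theta, f)) >= 0 else 0

landmarks = [(0.0, 0.0), (3.0, 3.0)]
theta = [-0.5, 1.0, 0.0]  # hypothetical weights: fire near the first landmark only
near = predict(theta, (0.0, 0.0), landmarks)
far = predict(theta, (3.0, 3.0), landmarks)
```

Each feature is close to 1 when $x$ sits near its landmark and decays toward 0 with distance, so the weights select which neighborhoods vote for the positive class.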
[Figure: three-layer neural network. Layer 1: inputs $x_1, x_2, x_3$; Layer 2: activations $a_1^{(2)}, a_2^{(2)}, a_3^{(2)}$; Layer 3: "Output" $h_\Theta(x)$]
Slide credit: Andrew Ng
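Neural network
[Figure: the three-layer network with bias units $x_0$ and $a_0^{(2)}$ added, output $h_\Theta(x)$]
$a_i^{(j)}$ = "activation" of unit $i$ in layer $j$
$\Theta^{(j)}$ = matrix of weights controlling the function mapping from layer $j$ to layer $j+1$
$s_j$ units in layer $j$
$s_{j+1}$ units in layer $j+1$
Size of $\Theta^{(j)}$? $s_{j+1} \times (s_j + 1)$
Slide credit: Andrew Ng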
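Neural network "Pre-activation"
[Figure: the three-layer network with inputs $x_0, \dots, x_3$, hidden activations $a_0^{(2)}, \dots, a_3^{(2)}$, and output $h_\Theta(x)$, repeated across slide builds]
$z^{(2)} = \Theta^{(1)} a^{(1)}$, with $a^{(1)} = x$
$a^{(2)} = g(z^{(2)})$
Add $a_0^{(2)} = 1$
$z^{(3)} = \Theta^{(2)} a^{(2)}$
$h_\Theta(x) = a^{(3)} = g(z^{(3)})$
Slide credit: Andrew Ng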
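Forward propagation through such a network can be sketched as repeated "add bias, multiply by weights, apply activation". The example assumes a sigmoid activation $g$ and hypothetical weights for a 3-3-1 network; each weight matrix has one row per unit in the next layer and one extra column for the bias unit.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(theta, a_prev):
    """theta has s_next rows and (s_prev + 1) columns; a_prev excludes the bias unit."""
    a_with_bias = [1.0] + list(a_prev)  # add the bias unit a_0 = 1
    # Pre-activation z: one weighted sum per unit in the next layer.
    z = [sum(w * a for w, a in zip(row, a_with_bias)) for row in theta]
    return [sigmoid(v) for v in z]      # activation a = g(z)

def forward(thetas, x):
    a = list(x)
    for theta in thetas:
        a = layer_forward(theta, a)
    return a

# Hypothetical weights for a 3-3-1 network.
theta1 = [[0.0, 1.0, 0.0, 0.0],
          [0.0, 0.0, 1.0, 0.0],
          [0.0, 0.0, 0.0, 1.0]]   # 3 x 4
theta2 = [[0.0, 1.0, 1.0, 1.0]]   # 1 x 4
output = forward([theta1, theta2], [0.0, 0.0, 0.0])
```

With all-zero inputs every hidden unit outputs 0.5, so the output unit sees a pre-activation of 1.5.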
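Bias / Variance Trade-off
• Training error: $J_{\text{train}}(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$
• Cross-validation error: $J_{\text{cv}}(\theta) = \frac{1}{2m_{\text{cv}}} \sum_{i=1}^{m_{\text{cv}}} \left(h_\theta(x_{\text{cv}}^{(i)}) - y_{\text{cv}}^{(i)}\right)^2$
[Figure: loss versus degree of polynomial for the training and cross-validation errors]
Source: Andrew Ng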
Bias / Variance Trade-off with Regularization
• Training error
• Cross-validation error
[Figure: loss versus regularization parameter $\lambda$ for the training and cross-validation errors]
Source: Andrew Ng
K-means algorithm
Randomly initialize $K$ cluster centroids $\mu_1, \mu_2, \dots, \mu_K \in \mathbb{R}^n$
Repeat {
    Cluster assignment step
    for $i$ = 1 to $m$
        $c^{(i)}$ := index (from 1 to $K$) of cluster centroid closest to $x^{(i)}$
    Centroid update step
    for $k$ = 1 to $K$
        $\mu_k$ := average (mean) of points assigned to cluster $k$
}
Slide credit: Andrew Ng
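The two alternating steps translate almost directly into code. In this sketch the initial centroids are passed in explicitly (rather than drawn at random) so the result is reproducible; that choice, the fixed iteration count, and the empty-cluster handling are assumptions of the example.

```python
import math

def kmeans(points, centroids, iters=10):
    """Alternate assignment and update steps from the given initial centroids."""
    centroids = [list(c) for c in centroids]
    assignments = [0] * len(points)
    for _ in range(iters):
        # Cluster assignment step: index of the closest centroid for each point.
        assignments = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Centroid update step: mean of the points assigned to each cluster.
        for k in range(len(centroids)):
            members = [p for p, c in zip(points, assignments) if c == k]
            if members:  # assumed policy: keep the old centroid if a cluster is empty
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignments

points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
centroids, assignments = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

In practice the initial centroids are usually sampled from the data and the procedure is rerun from several initializations, since the result depends on the starting point.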
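Expectation Maximization (EM) Algorithm
• Goal: Find $\theta$ that maximizes the log-likelihood $\sum_i \log p(x^{(i)}; \theta)$
$\sum_i \log p(x^{(i)}; \theta) = \sum_i \log \sum_{z^{(i)}} Q_i(z^{(i)}) \frac{p(x^{(i)}, z^{(i)}; \theta)}{Q_i(z^{(i)})} \ge \sum_i \sum_{z^{(i)}} Q_i(z^{(i)}) \log \frac{p(x^{(i)}, z^{(i)}; \theta)}{Q_i(z^{(i)})}$
Jensen's inequality: $f(E[X]) \ge E[f(X)]$ for concave $f$
• The bound is tight when $\frac{p(x^{(i)}, z^{(i)}; \theta)}{Q_i(z^{(i)})}$ is constant in $z^{(i)}$, i.e. $Q_i(z^{(i)}) \propto p(x^{(i)}, z^{(i)}; \theta)$
• $\sum_{z} Q_i(z) = 1$ (because it is a distribution), so $Q_i(z^{(i)}) = p(z^{(i)} \mid x^{(i)}; \theta)$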
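EM algorithm
Repeat until convergence {
    (E-step) For each $i$, set $Q_i(z^{(i)}) := p(z^{(i)} \mid x^{(i)}; \theta)$ (Probabilistic inference)
    (M-step) Set $\theta := \arg\max_\theta \sum_i \sum_{z^{(i)}} Q_i(z^{(i)}) \log \frac{p(x^{(i)}, z^{(i)}; \theta)}{Q_i(z^{(i)})}$
}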
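One concrete instance of the E-step/M-step loop is fitting a one-dimensional Gaussian mixture, where the latent variable is the component that generated each point. The slides do not fix a model here, so the mixture, the initial parameters, the fixed iteration count, and the variance floor are all assumptions of this sketch.

```python
import math

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(xs, mus, variances, weights, iters=50):
    """EM for a 1-D Gaussian mixture; the latent z is the component index."""
    k = len(mus)
    mus, variances, weights = list(mus), list(variances), list(weights)
    for _ in range(iters):
        # E-step: posterior over components for each point (probabilistic inference).
        resp = []
        for x in xs:
            joint = [weights[j] * normal_pdf(x, mus[j], variances[j]) for j in range(k)]
            total = sum(joint)
            resp.append([v / total for v in joint])
        # M-step: re-estimate parameters as responsibility-weighted averages.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(xs)
            mus[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            variances[j] = (
                sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, xs)) / nj + 1e-6
            )  # assumed floor keeps the variance strictly positive
    return mus, variances, weights

xs = [-0.2, 0.0, 0.2, 9.8, 10.0, 10.2]
mus, variances, weights = em_gmm_1d(
    xs, mus=[1.0, 8.0], variances=[1.0, 1.0], weights=[0.5, 0.5]
)
```

For this model the M-step maximization has a closed form, which is why it appears as simple weighted averages rather than an explicit optimizer call.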
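Anomaly detection algorithm
1. Choose features $x_j$ that you think might be indicative of anomalous examples
2. Fit parameters $\mu_1, \dots, \mu_n, \sigma_1^2, \dots, \sigma_n^2$:
$\mu_j = \frac{1}{m} \sum_{i=1}^{m} x_j^{(i)}$, $\sigma_j^2 = \frac{1}{m} \sum_{i=1}^{m} \left(x_j^{(i)} - \mu_j\right)^2$
3. Given a new example $x$, compute $p(x) = \prod_{j=1}^{n} p(x_j; \mu_j, \sigma_j^2)$
Anomaly if $p(x) < \epsilon$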
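A short sketch of this density-based detector: fit one Gaussian per feature, multiply the per-feature densities, and flag an example when the product falls below a threshold. The threshold value and the toy data are hypothetical; in practice the threshold would be tuned on labeled validation examples.

```python
import math

def fit_gaussians(X):
    """Per-feature mean and variance (maximum-likelihood estimates)."""
    m = len(X)
    mus = [sum(col) / m for col in zip(*X)]
    variances = [
        sum((v - mu) ** 2 for v in col) / m for col, mu in zip(zip(*X), mus)
    ]
    return mus, variances

def density(x, mus, variances):
    """p(x) as a product of independent univariate Gaussian densities."""
    p = 1.0
    for v, mu, var in zip(x, mus, variances):
        p *= math.exp(-(v - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
    return p

X = [(1.0, 10.0), (1.1, 10.2), (0.9, 9.8), (1.0, 10.1), (1.0, 9.9)]
mus, variances = fit_gaussians(X)
epsilon = 1e-3  # hypothetical threshold
is_anomaly = density((3.0, 10.0), mus, variances) < epsilon
is_normal_flagged = density((1.0, 10.0), mus, variances) < epsilon
```

The first query is far from the training data in its first feature, so its density is tiny; the second sits at the fitted means and is not flagged.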
Problem motivation
Movie                  Alice (1)  Bob (2)  Carol (3)  Dave (4)  $x_1$ (romance)  $x_2$ (action)
Love at last           5          5        0          0         0.9              0
Romance forever        5          ?        ?          0         1.0              0.01
Cute puppies of love   ?          4        0          ?         0.99             0
Nonstop car chases     0          0        5          4         0.1              1.0
Swords vs. karate      0          0        5          ?         0                0.9
Problem motivation
Movie                  Alice (1)  Bob (2)  Carol (3)  Dave (4)  $x_1$ (romance)  $x_2$ (action)
Love at last           5          5        0          0         ?                ?
Romance forever        5          ?        ?          0         ?                ?
Cute puppies of love   ?          4        0          ?         ?                ?
Nonstop car chases     0          0        5          4         ?                ?
Swords vs. karate      0          0        5          ?         ?                ?
$\theta^{(1)} = [0, 5, 0]^\top$, $\theta^{(2)} = [0, 5, 0]^\top$, $\theta^{(3)} = [0, 0, 5]^\top$, $\theta^{(4)} = [0, 0, 5]^\top$, $x^{(1)} = [?, ?, ?]^\top$
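Collaborative filtering optimization objective
• Given $x^{(1)}, \dots, x^{(n_m)}$, estimate $\theta^{(1)}, \dots, \theta^{(n_u)}$
• Given $\theta^{(1)}, \dots, \theta^{(n_u)}$, estimate $x^{(1)}, \dots, x^{(n_m)}$
• For a user with parameter $\theta$ and a movie with (learned) feature $x$, predict a star rating of $\theta^\top x$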
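The prediction rule and a squared-error objective over observed ratings can be sketched as follows. The regularized cost shown is one common form and is an assumption here (it penalizes every parameter and feature, including the intercept term); the dictionary layout for ratings and the function names are hypothetical.

```python
def predict_rating(theta, x):
    """Predicted star rating: inner product of user parameters and movie features."""
    return sum(t * f for t, f in zip(theta, x))

def squared_error_cost(thetas, xs, ratings, lam=0.0):
    """Regularized squared error over observed ratings only.

    ratings maps (movie_index, user_index) -> observed rating; missing
    entries (the '?' cells) are simply absent from the dictionary.
    """
    cost = 0.0
    for (i, j), y in ratings.items():
        cost += (predict_rating(thetas[j], xs[i]) - y) ** 2
    reg = sum(t ** 2 for th in thetas for t in th) + sum(f ** 2 for x in xs for f in x)
    return 0.5 * cost + 0.5 * lam * reg

# Features [x0 = 1, romance, action] and user parameters in the style of the movie table.
x_love_at_last = [1.0, 0.9, 0.0]
theta_alice = [0.0, 5.0, 0.0]
rating = predict_rating(theta_alice, x_love_at_last)
cost = squared_error_cost([theta_alice], [x_love_at_last], {(0, 0): 5.0})
```

Alternating between minimizing this cost over the user parameters and over the movie features (or minimizing over both jointly) is what lets the features be learned rather than hand-labeled.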
Semi-supervised Learning
Problem Formulation
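• Labeled data: $D_l = \{(x^{(1)}, y^{(1)}), \dots, (x^{(m_l)}, y^{(m_l)})\}$
• Unlabeled data: $D_u = \{x^{(1)}, \dots, x^{(m_u)}\}$
• Compute the class posterior $P(y \mid x)$ for each unlabeled example using a Naïve Bayes classifier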