V4.3
2014
haiguang2000@qq.com
qq10822884
2017-06-08
2014
Machine Learning()
Web
10 18
ppt
2014 2014
https://www.coursera.org/course/ml
potplayer
ppt
_V
http://pan.baidu.com/s/1pKLATJl xn4w
2017-6-7
Version  Date
1.0      2014.12.16
1.1      2014.12.31
2.0      2015.02.17
2.1      2015.02.23
2.2      2015.03.02
2.3      2015.03.14
2.4      2015.05.02
2.5      2015.05.13
3.1      2016.01.15
3.2      2016.02.15
3.3      2016.02.19
4.0      2016.02.24
4.1      2016.03.20
4.2      2016.03.28
4.3      2017.06.08
Contents
Week 1
(Introduction)
1.1
1.2
1.3
1.4
(Linear Regression with One Variable)
2.1
2.2
2.3 I
2.4 II
2.5
2.6
2.7
2.8
(Linear Algebra Review)
3.1
3.2
3.3
3.4
3.5
3.6
Week 2
(Linear Regression with Multiple Variables)
4.1
4.2
4.3 1-
4.4 2-
4.5
4.6
4.7
Octave (Octave Tutorial)
5.1
5.2
5.3
5.4
5.5 for, while, if
5.6
5.7
Week 3
(Logistic Regression)
6.1
6.2
6.3
6.4
6.5
6.6
6.7
(Regularization)
7.1
7.2
7.3
7.4
Week 4
(Neural Networks: Representation)
8.1
8.2
8.3 1
8.4 2
8.5 1
8.6 II
8.7
Week 5
(Neural Networks: Learning)
9.1
9.2
9.3
9.4
9.5
9.6
9.7
9.8
Week 6
(Advice for Applying Machine Learning)
10.1
10.2
10.3
10.4
10.5 /
10.6
10.7
(Machine Learning System Design)
11.1
11.2
11.3
11.4
11.5
Week 7
(Support Vector Machines)
12.1
12.2
12.3
12.4 1
12.5 2
12.6
Week 8
(Clustering)
13.1
13.2 K-
13.3
13.4
13.5
(Dimensionality Reduction)
14.1
14.2
14.3
14.4
14.5
14.6
14.7
Week 9
(Anomaly Detection)
15.1
15.2
15.3
15.4
15.5
15.6
15.7
15.8
(Recommender Systems)
16.1
16.2
16.3
16.4
16.5
16.6
Week 10
(Large Scale Machine Learning)
17.1
17.2
17.3
17.4
17.5
17.6
(Application Example: Photo OCR)
18.1
18.2
18.3
18.4
(Conclusion)
19.1
- 1 -(Introduction)
(Introduction)
1.1
Reference video: 1 - 1 - Welcome (7 min).mkv
AI
A B
web
web
DNA
AI
AI
12 IT HR
1.2
Arthur Samuel
Samuel 50
Samuel
Samuel
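Tom Mitchell gave a more recent definition: a computer program is said to learn from experience E with respect to some task T and some performance measure P if its performance on T, as measured by P, improves with experience E. In the checkers example, the experience E is the experience of playing many games, the task T is playing checkers, and the performance measure P is the probability that the program wins its next game.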
1.3
750
$150,000
$200,000
1 0
5 1 5
0 1
012
30 1 2 3
X O X
2 3 5
5 3
3 5
1.
2.
0 1
0 1
0 1
1.4
URL
news.google.com
DNA
email Facebook +
12345678910,
12345678910
JAVA
[W,s,v] = svd((repmat(sum(x.*x,1),size(x,1),1).*x)*x');
Octave Octave
Matlab Matlab
Octave Octave
SVM
C++ Java
Octave Octave
Octave
C++ Java
C++
Octave
Octave
- 1 -(Linear Regression with One Variable)
2.1
1250
220000()
0/1
m
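Training Set
x: the feature / input variable
y: the target variable / output variable
(x, y): one training example
$(x^{(i)}, y^{(i)})$: the i-th training example
h: the hypothesis, a function that maps an input x to a predicted output y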
$h_\theta(x) = \theta_0 + \theta_1 x$
2.2
m is the number of training examples, for example m = 47
$h_\theta(x) = \theta_0 + \theta_1 x$
parameters: $\theta_0$ and $\theta_1$
modeling error
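$$J(\theta_0,\theta_1) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$$
The goal is to find the $\theta_0$ and $\theta_1$ that minimize the cost function $J(\theta_0,\theta_1)$.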
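As a minimal sketch of the squared-error cost function, the snippet below evaluates J for a given pair of parameters. It is written in Python rather than Octave, and the function names and the three-point toy data set are assumptions made purely for illustration.

```python
def hypothesis(theta0, theta1, x):
    """Univariate linear hypothesis h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def compute_cost(theta0, theta1, xs, ys):
    """Squared-error cost J = 1/(2m) * sum((h(x_i) - y_i)^2)."""
    m = len(xs)
    total = sum((hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys))
    return total / (2 * m)


# Hypothetical toy data lying exactly on the line y = x.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

cost_perfect = compute_cost(0.0, 1.0, xs, ys)  # the line y = x fits every point
cost_zero = compute_cost(0.0, 0.0, xs, ys)     # h(x) = 0 misses every point
```

A perfect fit gives a cost of zero, and any other parameter choice gives a strictly larger value, which is why minimizing J selects the best-fitting line.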
2.3 I
2.4 II
$J(\theta_0,\theta_1)$
0 1
0 1
J 0 1
2.5
$J(\theta_0,\theta_1)$
$\theta_0,\theta_1,\ldots,\theta_n$
local minimum
global minimum
360
learning rate
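Gradient descent repeats the following update until convergence, for j=0 and j=1:
$$\theta_j := \theta_j - \alpha\frac{\partial}{\partial\theta_j}J(\theta_0,\theta_1)$$
$\theta_0$ and $\theta_1$ must be updated simultaneously: compute both right-hand sides from the current values first, and only then assign the results to $\theta_0$ and $\theta_1$.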
2.6
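$$\theta_j := \theta_j - \alpha\frac{\partial}{\partial\theta_j}J(\theta)$$
Here $\alpha$ is the learning rate and $\frac{\partial}{\partial\theta_j}J(\theta)$ is the derivative of the cost function $J(\theta)$ with respect to $\theta_j$.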
1 1 1
J()
2.7
Reference video: 2 - 7 - GradientDescentForLinearRegression (6 min).mkv
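j=0: $\frac{\partial}{\partial\theta_0}J(\theta_0,\theta_1) = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)$
j=1: $\frac{\partial}{\partial\theta_1}J(\theta_0,\theta_1) = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x^{(i)}$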
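The two partial derivatives for j=0 and j=1 can be turned into a short batch gradient descent loop. The sketch below is in Python rather than Octave; the learning rate, the iteration count and the toy data are assumptions chosen so that the loop converges, not values from the course.

```python
def gradient_descent(xs, ys, alpha=0.1, iterations=5000):
    """Batch gradient descent for h(x) = theta0 + theta1 * x."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        # Prediction errors h(x_i) - y_i under the current parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m                                # case j = 0
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m     # case j = 1
        # Simultaneous update: both gradients were computed before assigning.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Hypothetical data generated from y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
theta0, theta1 = gradient_descent(xs, ys)
```

Every step uses all m training examples, which is what makes this the batch form of gradient descent.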
""
""
(normal equations)
2.8
- 1 -(Linear Algebra Review)
3.1
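A 4×2 matrix has 4 rows and 2 columns. In general a matrix with m rows and n columns is an m×n matrix, so this one has dimension 4×2.
$A_{ij}$ denotes the entry in row i, column j.
A 4×1 matrix is a vector.
Vectors can be 1-indexed or 0-indexed.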
3.2
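$$\begin{bmatrix}1 & 0\\2 & 5\\3 & 1\end{bmatrix} + \begin{bmatrix}4 & 0.5\\2 & 5\\0 & 1\end{bmatrix} = \begin{bmatrix}5 & 0.5\\4 & 10\\3 & 2\end{bmatrix}$$
$$3 \times \begin{bmatrix}1 & 0\\2 & 5\\3 & 1\end{bmatrix} = \begin{bmatrix}3 & 0\\6 & 15\\9 & 3\end{bmatrix} = \begin{bmatrix}1 & 0\\2 & 5\\3 & 1\end{bmatrix} \times 3$$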
3.3
An m×n matrix multiplied by an n×1 vector gives an m×1 vector.
3.4
An m×n matrix multiplied by an n×o matrix gives an m×o matrix.
A B
3.5
$AB \neq BA$ (matrix multiplication is not commutative)
$A(BC) = (AB)C$ (matrix multiplication is associative)
1,
I E I
1 0
AI=IA=A
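The dimension rule and the multiplication properties can be checked on small matrices. This is a plain-Python sketch using nested lists; the helper name `matmul` and the example matrices are assumptions for illustration.

```python
def matmul(a, b):
    """Multiply an m x n matrix by an n x o matrix (nested lists), giving m x o."""
    rows_a, cols_a = len(a), len(a[0])
    rows_b, cols_b = len(b), len(b[0])
    if cols_a != rows_b:
        raise ValueError("inner dimensions must match")
    return [
        [sum(a[i][k] * b[k][j] for k in range(cols_a)) for j in range(cols_b)]
        for i in range(rows_a)
    ]


A = [[1, 3], [2, 5]]
B = [[0, 1], [3, 2]]
I = [[1, 0], [0, 1]]            # 2 x 2 identity matrix

AB = matmul(A, B)
BA = matmul(B, A)               # differs from AB: multiplication is not commutative
AI = matmul(A, I)               # equals A
IA = matmul(I, A)               # equals A

M = [[1, 3], [4, 0], [2, 1]]    # 3 x 2 matrix
v = [[1], [5]]                  # 2 x 1 vector
Mv = matmul(M, v)               # 3 x 1 result
```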
3.6
A mm
OCTAVE MATLAB
A mn m n i j a(i,j)
A=a(i,j)
j A j i AT=B( A'=B
A 1 1 45
$(A \pm B)^T = A^T \pm B^T$
$(AB)^T = B^T A^T$
$(A^T)^T = A$
$(KA)^T = K A^T$
matlab
x=y'
- 2 -(Linear Regression with Multiple Variables)
4.1
x1,x2,...,xn
x(i) i i vector
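For example, $x^{(2)} = \begin{bmatrix}1416\\3\\2\\40\end{bmatrix}$.
$x_j^{(i)}$ denotes feature j of the i-th training example, that is, the entry in row i, column j of the feature matrix. For example, $x_3^{(2)} = 2$.
The multivariate hypothesis h is $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n$.
This formula has n+1 parameters and n variables. Setting $x_0 = 1$ makes both the parameter vector and each training example (n+1)-dimensional vectors, so the feature matrix X has dimension m*(n+1) and the hypothesis simplifies to $h_\theta(x) = \theta^T X$.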
4.2
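$$J(\theta_0,\theta_1,\ldots,\theta_n) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$$
where $h_\theta(x) = \theta^T X = \theta_0 x_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n$.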
n>=1
4.3 1-
0-
2000 0-5
-1 1
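$x_n = \frac{x_n - \mu_n}{s_n}$, where $\mu_n$ is the mean of feature n and $s_n$ is its standard deviation.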
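A minimal Python sketch of this scaling step follows. Using the population standard deviation for $s_n$ is an assumption (the range of the feature is another common choice), and the house sizes are hypothetical.

```python
import statistics


def mean_normalize(values):
    """Scale one feature column to zero mean and unit standard deviation."""
    mu = statistics.mean(values)
    s = statistics.pstdev(values)  # assumption: population standard deviation as s_n
    if s == 0:
        raise ValueError("feature is constant and cannot be scaled")
    return [(v - mu) / s for v in values]


# Hypothetical house sizes in square feet.
sizes = [2104.0, 1416.0, 1534.0, 852.0]
scaled = mean_normalize(sizes)
```

After scaling, features that originally had very different ranges take comparable values, which usually helps gradient descent converge faster.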
4.4 2-
0.001
$\alpha$ = 0.01, 0.03, 0.1, 0.3, 1, 3, 10
4.5
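$x_1$ = frontage, $x_2$ = depth, x = frontage * depth = area
$h_\theta(x) = \theta_0 + \theta_1 x$
Quadratic model: $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2^2$
Cubic model: $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2^2 + \theta_3 x_3^3$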
4.6
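The normal equation finds the parameters that minimize the cost function by solving $\frac{\partial}{\partial\theta_j}J(\theta_j) = 0$.
With X the feature matrix of the training set (including $x_0 = 1$) and y the vector of training targets, the solution is
$$\theta = (X^T X)^{-1} X^T y$$
The superscript T denotes the matrix transpose and the superscript -1 the matrix inverse: with $A = X^T X$, $(X^T X)^{-1} = A^{-1}$.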
Octave
pinv(X'*X)*X'*y
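Computing $(X^T X)^{-1}$ costs about $O(n^3)$ in the number of features n, so the normal equation is usually acceptable when n is below about 10000.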
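For a single feature plus the intercept term $x_0 = 1$, the matrix $X^T X$ is only 2×2, so the normal equation can be written out by hand. The sketch below is plain Python and covers only that special case; the function name and data are assumptions, and a real implementation would call a linear algebra routine such as Octave's pinv.

```python
def normal_equation_2d(xs, ys):
    """Solve theta = (X^T X)^-1 X^T y when each row of X is [1, x]."""
    m = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # X^T X = [[m, sx], [sx, sxx]] and X^T y = [sy, sxy].
    det = m * sxx - sx * sx
    if det == 0:
        raise ValueError("X^T X is singular, so the plain inverse does not exist")
    # Multiply the explicit 2 x 2 inverse by X^T y.
    theta0 = (sxx * sy - sx * sxy) / det
    theta1 = (m * sxy - sx * sy) / det
    return theta0, theta1


# Hypothetical data generated from y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
theta0, theta1 = normal_equation_2d(xs, ys)
```

Unlike gradient descent, this needs no learning rate and no iterations, at the price of inverting a matrix whose size grows with the number of features.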
4.7
( normal equation )
$$\theta = (X^T X)^{-1} X^T y$$
=inv(X'X ) X'y X'X
X'X Octave
Octave
pinv() inv()
pinv() X'X
pinv() inv() ?
inv() x1
x2 1
3.28 ( )
$x_1 = x_2 \times (3.28)^2$
X'X
X'X
m n m 10 n
10 101
10
100 101
100 101
X'X
x1 x2
X'X Octave
pinv ( ) X'X
XTX
- 2 -Octave (Octave Tutorial)
5.1
Octave
C++JavaPythonNumpy
Octave Octave
Octave
C++ Java
Octave C++
Java
OctaveMATLABPythonNumPy
Octave MATLAB
matlabmatlab Octave D
matlab MATLAB
PythonNumPy R R
PythonNumPy
Octave NumPy
R Octave
Octave
Octave
Octave
Octave Octave
Octave
5 + 6 11
3 - 2, 5 * 8, 1 / 2, 2 ^ 6
1==2 false ( )
1==2 0
( ~= )
( != )
1 0 1 || 0
XOR ( 1, 0 ) 1
Octave 324.x
Octave
Octave
Octave
A 3 A 3
b "hi"
C 3 1 C
A A
DISP
C C
sprintf 0.6%f ,a 6
V 1 2 3V 3 ( )1 ( )
1;2;3 3 1
v = 1:0.1:2
V 1 0.1
2 V 1 11
V 1:6 V 1 6
ones(2, 3)
w A
W Rand Rand
rand(3, 3) 33
0 1 0 1
W N 0
hist
help
Octave
Octave
Octave
5.2
Octave Octave
Octave
Octave
A A
A = [1 2; 3 4; 5 6]
3 2 Octave size()
size(A) 3 2
size() 12 sz
sz = size(A)
sz 12 3 2
size(sz) sz 1 2 12 1
2 sz
size(A, 1) 3 A A
size(A, 2) 2 A
v v = [1 2 3 4] length(v)
length(A) A 32 3
length length
length([1;2;3;4;5]) 5
Octave Octave
pwd Octave
cd C:\Users\ang\Desktop
featuresX.dat priceY.dat
featuresX
47 2104 3
1600 3
Octave featuresX.dat
featuresX priceY.dat
load('featuresX.dat')
Octave
who Octave
featuresX featuresX
size(featuresX) 47 2 472
size(priceY) 47 1 47
who whos
double
whos featuresX
v= priceY(1:10)
Y 10 v
save hello.mat v v
hello.mat hello.mat
MATLAB MATLAB
MATLAB
clear
hello.mat v v
hello.mat save
ascii
hello.txt
32 A(3,2)
A (3,2) A 32 3 2
A(2,:)
A(:,2) A 2 4 6
A([1 3],:)
A 1 3 A
A A(:,2)
A 10 11 12
A [10;11;12] A
1 3 5 10 11 12
A A
A(:) A
91
CC = [A B] A
B C A B
C = [A; B][A; B]
A B
C 62
C A
[A B] [A, B]
Octave
Octave
Octave
5.3
Octave
Octave A 32
B 3 2 C 2 2
A C AC 32
22 32
A .*B Octave A
A .* B 1 11 11 2 12 24
Octave
A A .^ 2 A
V V [1; 2; 3] 1 ./ V
1 ./ A A
e e
abs v v
v V
-1 v -v -1*v
v 1
3 1 1 1 v [1 2 3]
ones(3,1) v + ones(3,1) v 1 v
v+1v + 1 v 1
A A, A
(A) A A
A 15
15 2 ind ind 2
max(A) A
3 1 0
[1 1 0 1] a 3
3 1 0
find(a<3) a 3
3 3
c 7
7 7
find
help help
help find
sum(a) a
prod(a)prod product()
floor(a) a 0.5 0
ceil(a) 0.5 1
33
max(A,[],1)
8 9 7 1 A
max(A,[],2)
8 7 9
max(A) A
max(max(A)) A max(A(:)) A
A 9 9
99 sum(A,1)
99 369
sum(A,2) A
369
A 99
eye(9)
I9
A 0
sum(sum(A.*eye(9)))
369
369
flipud: flips a matrix up/down
pinv(A)
A A
1 0
5.4
J()
Octave
Octave
plot(t, y1)
t y1
y2
Octave
cos(x) 1
y2plot(t, y2)
r r
ylabel('value')
legend('sin', 'cos')
title('myplot')
close
Octave
plot(t, y1)
x y 0.5
1-1 1
axis
Octave help
clf
A 55 magic
imagesc(A) 5*5
5*5 A
imagesc(A), colorbar, colormap gray
This runs three commands in sequence: imagesc, colorbar, and colormap gray.
imagesc(magic(15)), colorbar, colormap gray
15*15 magic
a=1,b=2,c=3 Enter
a=1; b=2;c=3;
colormap
Octave Octave
if while for
5.5 for, while, if
for
v 10 1
2 i end
v 2 2 i
1 10 i 1 10
indices () 1 10
indices 1 10
i = indices i 1 10 disp(i)
for
Octave
while
i 1 v(i) 100 i 1
i 5
100
while
break () (end)
if i
i 6 while
v 5 999
if while end
if-else
Octave exit
Octave quit
(functions)
squarethisnumber.m Octave
Windows
Octave
y Octave
x y x
search path ()
Octave
addpath C:\Users\ang\desktop
Octave Octave
Users\ang\desktop
SquareThisNumber
cd
Octave
SquareAndCubeThisNumber(x) (x x )
y1 y2
y1 y2
C C++
Octave
125
Octave J() J
Octave X = [1 1; 1 2; 1 3];
0 x [1;2;3] y [1;2;3] 0 01
1 45
theta [0; 0] 0
0 = 01 0 2.333
1 2 3 2m
2.33
X y
if
Octave Octave
5.6
Octave
Octave a b
$h_\theta(x) = \sum_{j=0}^{n}\theta_j x_j$
$h_\theta(x) = \theta^T x$
012 n =2 x x0x1x2
0 012 MATLAB
1 MATLAB 0 theta(1)
for j 1 n+1 0 n
for n
x prediction theta x
for
Octave x
Octave
C++
C++
C++
j 012 j 012
n 2 012
for j 0
1 2 j
for
n+1
x(i)
u = 2v +5w u 2 v 5 w
for 012
Octave
C++Java
5.7
'ml-class-ex1'
warmUpExercise.m
55 A = eye(5)
55 warmUpExercise()
5x5
Octave C:\Users\ang\Desktop\ml-class-ex1
'warmUpExercise()'
5x5
submit()
'1'
1 1
- 3 -(Logistic Regression)
(Logistic Regression)
6.1
Reference video: 6 - 1 - Classification (8 min).mkv
y (Logistic Regression)
y 0 1
1 0
y 0 1 0 1
1 0
0 1
y 1 0 0 1
6.2
0 1
0 1
0 1
When $h_\theta(x) \geq 0.5$, predict y=1
When $h_\theta(x) < 0.5$, predict y=0
0.5
[0,1]
0 1
$h_\theta(x) = g(\theta^T X)$
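g is the logistic function, an S-shaped function also called the Sigmoid function:
$$g(z) = \frac{1}{1+e^{-z}}$$
so that
$$h_\theta(x) = \frac{1}{1+e^{-\theta^T X}}$$
For a given input, $h_\theta(x)$ gives the estimated probability that the output is 1: $h_\theta(x) = P(y=1 \mid x;\theta)$.
For example, if for a given x we compute $h_\theta(x) = 0.7$, there is a 70% chance that y belongs to the positive class, and the probability that y belongs to the negative class is 1-0.7=0.3.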
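A minimal Python sketch of the Sigmoid function and of the resulting hypothesis is given below. The function names are assumptions, and the branch on the sign of z is only there to avoid overflow in `math.exp` for large negative inputs.

```python
import math


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)), evaluated without overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # z is negative here, so exp(z) cannot overflow
    return e / (1.0 + e)


def predict_probability(theta, x):
    """Estimated P(y = 1 | x; theta) = g(theta^T x); x must include x0 = 1."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)


p_boundary = predict_probability([0.0, 0.0], [1.0, 3.0])  # z = 0
```

The output always lies between 0 and 1, and it equals exactly 0.5 when z = 0.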
6.3
(decision boundary)
When $h_\theta(x) \geq 0.5$, predict y=1
When $h_\theta(x) < 0.5$, predict y=0
z=0 g(z)=0.5
z>0 g(z)>0.5
z<0 g(z)<0.5
$z = \theta^T X$
When $\theta^T X \geq 0$, predict y=1
When $\theta^T X < 0$, predict y=0
y=1
x1+x2=3 1
y=0 y=1
$h_\theta(x) = g(\theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_1^2 + \theta_4 x_2^2)$ with $\theta$ = [-1 0 0 1 1]
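With the parameter vector [-1 0 0 1 1], the argument of g is $x_1^2 + x_2^2 - 1$, so the decision boundary is the unit circle centred at the origin. The Python sketch below applies that rule directly; the function name and the test points are assumptions for illustration.

```python
def predict(theta, x1, x2):
    """Predict y for the quadratic model; g(z) >= 0.5 exactly when z >= 0."""
    features = [1.0, x1, x2, x1 ** 2, x2 ** 2]
    z = sum(t * f for t, f in zip(theta, features))
    return 1 if z >= 0 else 0


theta = [-1.0, 0.0, 0.0, 1.0, 1.0]

inside = predict(theta, 0.5, 0.5)       # inside the circle
outside = predict(theta, 2.0, 0.0)      # outside the circle
on_boundary = predict(theta, 1.0, 0.0)  # exactly on the circle, z = 0
```

Points outside or on the circle are labelled y=1 and points inside are labelled y=0, which is the non-linear boundary that the squared features make possible.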
6.4
$$h_\theta(x) = \frac{1}{1+e^{-\theta^T X}}$$
non-convex function
$$J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\frac{1}{2}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$$
$$J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\mathrm{Cost}\left(h_\theta(x^{(i)}), y^{(i)}\right)$$
h(x) Cost(h(x),y)
h 1 h y=0 h 0 0 y=0
h 0 h
Cost(h(x),y)
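The per-example cost is not legible in these notes. The sketch below assumes the usual logistic regression choice, Cost = -log(h) when y=1 and Cost = -log(1-h) when y=0, and averages it over the training set. It is written in Python, and the names and the two-example data set are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function, evaluated without overflow for negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logistic_cost(theta, X, y):
    """J = 1/m * sum(Cost(h(x_i), y_i)) with the assumed log-loss Cost."""
    m = len(X)
    total = 0.0
    for row, label in zip(X, y):
        h = sigmoid(sum(t * v for t, v in zip(theta, row)))
        # Note: math.log fails if h saturates to exactly 0 or 1.
        total += -math.log(h) if label == 1 else -math.log(1.0 - h)
    return total / m


# Hypothetical data: rows are [x0 = 1, x1], labels are 0 or 1.
X = [[1.0, 1.0], [1.0, 2.0]]
y = [0, 1]

cost_at_zero = logistic_cost([0.0, 0.0], X, y)    # every h is 0.5
cost_good = logistic_cost([-15.0, 10.0], X, y)    # separates the two points
```

A confident correct prediction costs almost nothing, while a confident wrong prediction is penalised heavily, which is the behaviour the fragments about h near 0 and h near 1 describe.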
$J(\theta)$
$h_\theta(x) = g(\theta^T X)$
octave
fminunc
end
initialTheta = zeros(2,1);
6.5
J()
p(y=1|x;) x
y=1 y=1
J()
(gradient descent)
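$$\frac{\partial}{\partial\theta_j}J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$$
The parameter vector is $\theta = \begin{bmatrix}\theta_0 & \theta_1 & \theta_2 & \cdots & \theta_n\end{bmatrix}^T$.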
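For linear regression $h_\theta(x) = \theta^T X$, whereas for logistic regression
$$h_\theta(x) = \frac{1}{1+e^{-\theta^T X}}$$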
0 n
6.6
J()
J()
J() J
01 n
J()
J()
J()
J() J
j
J() J
j
BFGS () L-BFGS (
) J()
(line search)
BFGS L-BFGS
L-BFGS BFGS
Octave MATLAB
Octave
C, C++, Java
L-BFGS
0 1
$J(\theta) = (\theta_1 - 5)^2 + (\theta_2 - 5)^2$, which is minimized at $\theta_1 = 5$, $\theta_2 = 5$.
J()
Octave
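function [jVal, gradient] = costFunction(theta)
  % Cost J(theta) = (theta1 - 5)^2 + (theta2 - 5)^2
  jVal = (theta(1)-5)^2 + (theta(2)-5)^2;
  % Partial derivatives of J with respect to theta(1) and theta(2)
  gradient = zeros(2,1);
  gradient(1) = 2*(theta(1)-5);
  gradient(2) = 2*(theta(2)-5);
end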
21
costFunction
fminunc Octave
options = optimset('GradObj', 'on', 'MaxIter', 100);
initialTheta = zeros(2,1);
[optTheta, functionVal, exitFlag] = fminunc(@costFunction, initialTheta, options);
GradObj On(on)
100
21 fminunc@
costFunction
Octave
gradientgradient
theta(1) theta(2)
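The same pattern, a cost function that returns both the cost value and its gradient and is handed to an optimizer, can be sketched in Python. Plain gradient descent stands in here for fminunc, which uses a more sophisticated method; the `minimize` helper, the step size and the iteration count are assumptions made for illustration.

```python
def cost_function(theta):
    """Return (jVal, gradient) for J = (theta1 - 5)^2 + (theta2 - 5)^2."""
    j_val = (theta[0] - 5) ** 2 + (theta[1] - 5) ** 2
    gradient = [2 * (theta[0] - 5), 2 * (theta[1] - 5)]
    return j_val, gradient


def minimize(cost, initial_theta, alpha=0.1, max_iter=100):
    """Hypothetical stand-in for an optimizer: fixed-step gradient descent."""
    theta = list(initial_theta)
    for _ in range(max_iter):
        _, grad = cost(theta)
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta, cost(theta)[0]


opt_theta, function_val = minimize(cost_function, [0.0, 0.0])
```

Keeping the cost and gradient together in one function is what lets the optimizer be swapped without touching the model code.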
6.7
(logistic regression)
"" (one-vs-all)
y=1y=2y=3y=4
"" 1 31 4
0 1 2 3 1 2 3 4 1
""
y=1
y=2 y=3
1 ""
2 3 1
y=1
y=2 ,
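We train one classifier $h_\theta^{(i)}(x) = P(y=i \mid x;\theta)$ for each class i. To classify a new input x, we evaluate every classifier on x and pick the class i that attains $\max_i h_\theta^{(i)}(x)$.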
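The prediction step of one-vs-all can be sketched in Python as below. The three parameter vectors are hypothetical values chosen by hand rather than trained, and the function names are assumptions.

```python
import math


def sigmoid(z):
    """Logistic function, evaluated without overflow for negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_one_vs_all(thetas, x):
    """Return the class whose classifier gives the highest probability for x."""
    scores = {
        label: sigmoid(sum(t * xi for t, xi in zip(theta, x)))
        for label, theta in thetas.items()
    }
    return max(scores, key=scores.get)


# Hypothetical, hand-picked parameters for classes 1, 2 and 3; x = [1, x1, x2].
thetas = {
    1: [0.0, 4.0, -4.0],
    2: [0.0, -4.0, 4.0],
    3: [1.0, -4.0, -4.0],
}

label_a = predict_one_vs_all(thetas, [1.0, 2.0, 0.0])
label_b = predict_one_vs_all(thetas, [1.0, 0.0, 2.0])
label_c = predict_one_vs_all(thetas, [1.0, 0.0, 0.0])
```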
- 3 -(Regularization)
(Regularization)
7.1
(over-fitting)
(regularization)
1.
PCA
2. magnitude
116
- 3 -(Regularization)
7.2
3 4 3 4
3 4
3 4
Regularization Parameter 0
h(x)=0
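$$J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda\sum_{j=1}^{n}\theta_j^2\right]$$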
Cost Function
0 0
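A minimal Python sketch of a regularized squared-error cost follows, assuming the standard form in which the penalty $\lambda\sum_{j=1}^{n}\theta_j^2$ is added to the squared error and $\theta_0$ is left unpenalized. The function name and the data are hypothetical.

```python
def regularized_cost(theta, X, y, lam):
    """J = 1/(2m) * (sum of squared errors + lam * sum(theta_j^2 for j >= 1))."""
    m = len(X)
    squared_error = sum(
        (sum(t * v for t, v in zip(theta, row)) - target) ** 2
        for row, target in zip(X, y)
    )
    penalty = lam * sum(t ** 2 for t in theta[1:])  # theta_0 is not penalized
    return (squared_error + penalty) / (2 * m)


# Hypothetical data: rows are [x0 = 1, x1].
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 2.0, 3.0]

cost_no_reg = regularized_cost([0.0, 1.0], X, y, lam=0.0)
cost_with_reg = regularized_cost([0.0, 1.0], X, y, lam=3.0)
```

With a perfect fit the first term vanishes, so the second value shows the penalty alone; a very large regularization parameter pushes every penalized parameter towards zero.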
7.3
j=1,2,...,n
(n+1)*(n+1)
7.4
J()
J()
h(x)=g(TX)
Octave fminunc
1.
h(x)
2. 0
- 4 -(Neural Networks: Representation)
8.1
x1x2
100 100
x1x2+x1x3+x1x4+...+x2x3+x2x4+...+x99x100, 5000
RGB
50x50
2500 25002/2
8.2
90
BrainPort
FDA ()
YouTube
8.3 1
input/Dendrite/output/Axon
activation unit
weight
a1,a2,a3
h(x)
3 Input Layer
bias unit
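$a_i^{(j)}$ denotes the activation of unit i in layer j. $\theta^{(j)}$ denotes the weight matrix that maps layer j to layer j+1; for example, $\theta^{(1)}$ maps the first layer to the second. Its dimension is the number of activation units in layer j+1 by the number of activation units in layer j plus one; here $\theta^{(1)}$ is 3*4.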
a x x
( FORWARD PROPAGATION )
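$$x = \begin{bmatrix}x_0\\x_1\\x_2\\x_3\end{bmatrix},\quad \theta = \begin{bmatrix}\theta_{10} & \cdots & \cdots & \cdots\\ \cdots & \cdots & \cdots & \cdots\\ \cdots & \cdots & \cdots & \theta_{33}\end{bmatrix},\quad a = \begin{bmatrix}a_1\\a_2\\a_3\end{bmatrix}$$
$$\theta \cdot X = a$$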
8.4 2
( FORWARD PROPAGATION )
Neural Networks
[x1~x3][a(2)1~a(2)3],
a0,a1,a2,a3 x0,x1,x2,x3
x a
8.5 1
x1,x2,...,xn
AND
OR
AND output
sigmoid
AND
h(x)
g(x)
AND
OR
OR AND
8.6 II
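Weights -30, 20, 20: AND
Weights -10, 20, 20: OR
Weights 10, -20: NOT
XNOR outputs 1 only when both inputs are the same (both 1 or both 0): XNOR = (x1 AND x2) OR ((NOT x1) AND (NOT x2))
First build a neuron that computes (NOT x1) AND (NOT x2).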
AND (NOTx1)AND(NOTx2) OR
XNOR
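The XNOR construction can be checked with a small Python sketch. The AND and OR weights are the ones listed in this section; the weights 10, -20, -20 for (NOT x1) AND (NOT x2) are an assumption that follows the same pattern as the NOT neuron, and the function names are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(weights, inputs):
    """Single sigmoid unit; weights[0] is the weight of the bias unit."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], inputs))
    return sigmoid(z)


AND_WEIGHTS = [-30.0, 20.0, 20.0]
OR_WEIGHTS = [-10.0, 20.0, 20.0]
NOT_BOTH_WEIGHTS = [10.0, -20.0, -20.0]  # assumed: (NOT x1) AND (NOT x2)


def xnor(x1, x2):
    """Two-layer network: hidden units AND and NOT-both, output unit OR."""
    a1 = neuron(AND_WEIGHTS, [x1, x2])
    a2 = neuron(NOT_BOTH_WEIGHTS, [x1, x2])
    return neuron(OR_WEIGHTS, [a1, a2])


truth_table = {(x1, x2): round(xnor(x1, x2)) for x1 in (0, 1) for x2 in (0, 1)}
```

The hidden layer computes two intermediate features and the output layer combines them, which is how stacking simple units yields a function that no single unit can represent.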
8.7
y=1,2,3.
1 0
x 4 4
[a b c d]T a,b,c,d 1
- 5 -(Neural Networks: Learning)
9.1
m x yL
SL=1, y=0 or 1
K SL=K, yi = 1 i K>2
scalar y
h(x) K
K K
K y
0 j
sl +1 i sl
h(x)-
regularization bias
9.2
h(x)
x(1),y(1)
K = 4, $S_L$ = 4, L = 4
yk
k=1:K
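$g'(z^{(3)})$ is the derivative of the S-shaped function, $g'(z^{(3)}) = a^{(3)} \mathbin{.*} (1 - a^{(3)})$, and $(\theta^{(3)})^T\delta^{(4)}$ is the weighted sum of the errors from the next layer.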
=0
j j
i i
l i j
Octave fminunc
1*11
9.3
9.4
9.5
- +
0.001
Octave
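Gradient checking compares an analytically derived gradient with a numerical estimate. The sketch below assumes the usual two-sided difference, (J(θ+ε) - J(θ-ε)) / (2ε), with ε = 0.001 as mentioned in this section. It is written in Python, and the toy cost function is hypothetical.

```python
def numerical_gradient(cost, theta, epsilon=0.001):
    """Estimate each partial derivative of cost at theta by a two-sided difference."""
    grad = []
    for j in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[j] += epsilon
        minus[j] -= epsilon
        grad.append((cost(plus) - cost(minus)) / (2 * epsilon))
    return grad


def toy_cost(theta):
    """Hypothetical cost J = theta0^2 + 3 * theta0 * theta1."""
    return theta[0] ** 2 + 3 * theta[0] * theta[1]


theta = [1.0, 2.0]
analytic = [2 * theta[0] + 3 * theta[1], 3 * theta[0]]  # hand-derived gradient
numeric = numerical_gradient(toy_cost, theta)
```

The numerical estimate is slow because it evaluates the cost twice per parameter, so it is normally used only to verify a backpropagation implementation and then switched off.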
9.6
1011
9.7
1.
2. h(x)
3. J
4.
5.
6.
9.8
Dean Pomerleau
ALVINN NavLab
ALVINN
ALVINN ALVINN
30x32 ALVINN
ALVINN
ALVINN 12
- 6 -(Advice for Applying Machine Learning)
10.1
x1x2 x3
x1 x2 x1x2
lambda
1.
2.
3.
4.
5.
6.
""
10.2
h(x)
70%
30%
1. J
2.
10.3
10
1. 10
2. 10
3.
4. 3
10.4
d d
d d
10.5 /
0-10 2
0, 0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56, 5.12, 10 (12 values)
1. 12
2. 12
3.
4. 3
10.6
sanity check
100 1
10.7
1.1
1.
2.
3.
4.
5.
6.
- 6 -(Machine Learning System Design)
11.1
100
1 0 1001
1.
2.
3.
4. watch w4tch
11.2
error analysis
24
1.
2.
3.
discount/discounts/discounted/discounting
11.3
skewed classes
0.5%
0.5% 1%
Precision and Recall
1. True Positive (TP)
2. True Negative (TN)
3. False Positive (FP)
4. False Negative (FN)
Precision = TP/(TP+FP)
Recall = TP/(TP+FN)
11.4
0-1 0.5
Precision = TP/(TP+FP)
Recall = TP/(TP+FN)
0.5 0.70.9
0.5 0.3
F1 F1 Score
F1
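Precision, recall and the F1 score can be computed from the confusion counts as in the Python sketch below. The F1 formula used is the standard definition, 2PR/(P+R), which is assumed here because the formula itself is not legible in these notes. The counts are hypothetical.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 8 true positives, 2 false positives, 8 false negatives.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
```

Because F1 is low whenever either precision or recall is low, it gives a single number for comparing thresholds on skewed classes, where plain accuracy is misleading.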
11.5
__ (to,two,too)
2001
"
0.1 1000 10
""
""
"
"
x y
twototoo x
__ (two)
to too
x y
y x
- 7 -(Support Vector Machines)
12.1
A B
Support Vector Machine (SVM)
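$h_\theta(x) = \frac{1}{1+e^{-z}}$, where $z = \theta^T x$
If y=1, we want $h_\theta(x) \approx 1$, which requires $\theta^T x \gg 0$ (z far greater than 0).
If y=0, we want $h_\theta(x) \approx 0$, which requires $\theta^T x \ll 0$ (z far less than 0).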
(x, y)
1/m
1/m
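Each example (x, y) contributes the following term to the cost:
$-y\log\frac{1}{1+e^{-\theta^T x}} - (1-y)\log\left(1-\frac{1}{1+e^{-\theta^T x}}\right)$
Write $z = \theta^T x$. When y=1, the factor (1-y) is 0 and only the first term remains.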
For y=1 the remaining term is $-\log\frac{1}{1+e^{-z}}$; for y=0 it is $-\log\left(1-\frac{1}{1+e^{-z}}\right)$.
z=1
()
y=1
y=1 y=0
y=0 0
z z
The two replacement cost functions are named $\text{cost}_1(z)$ (used when y=1) and $\text{cost}_0(z)$ (used when y=0).
1/m
1/m 1/m
1/m
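For example, (u-5)^2+1 is minimized at u=5.
Multiplying this objective by the constant 10 gives 10(u-5)^2+10, which is still minimized at u=5.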
m m
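Logistic regression's objective has the form $A+\lambda B$, where A is the cost on the training set and B is the regularization term; $\lambda$ sets the trade-off between them.
The SVM instead uses the form $C \times A+B$, where the parameter C weights A rather than B.
C plays a role similar to $1/\lambda$: a large C corresponds to a small $\lambda$.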
1/
SVM C
The hypothesis predicts y equal to 1 or 0 directly: $h_\theta(x) = 1$ if $\theta^T x \ge 0$, and 0 otherwise.
SVM
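The SVM objective built from cost1, cost0 and C can be written out directly. This is a sketch under stated assumptions: the lecture only fixes where the two cost functions reach zero, so the slope of 1 used here (the standard hinge loss) is an assumption, and the parameter values and examples are hypothetical. Each x includes the constant feature x0 = 1.

```python
def cost1(z):
    """Cost for y=1: zero once z >= 1, growing linearly as z decreases (assumed slope 1)."""
    return max(0.0, 1.0 - z)

def cost0(z):
    """Cost for y=0: zero once z <= -1, growing linearly as z increases (assumed slope 1)."""
    return max(0.0, 1.0 + z)

def svm_objective(theta, examples, C):
    """C * sum(y*cost1(z) + (1-y)*cost0(z)) + 0.5 * sum(theta_j^2 for j >= 1)."""
    data_term = 0.0
    for x, y in examples:
        z = sum(t * xi for t, xi in zip(theta, x))
        data_term += y * cost1(z) + (1 - y) * cost0(z)
    regularization = 0.5 * sum(t * t for t in theta[1:])  # theta_0 is not regularized
    return C * data_term + regularization

def predict(theta, x):
    """The hypothesis outputs 1 or 0 directly rather than a probability."""
    return 1 if sum(t * xi for t, xi in zip(theta, x)) >= 0 else 0

theta = [0.0, 2.0]                                 # hypothetical parameters
examples = [([1.0, 1.0], 1), ([1.0, -1.0], 0)]     # hypothetical (x, y) pairs
value = svm_objective(theta, examples, C=1.0)      # both costs are 0, so only 0.5 * 2^2 remains
```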
12.2
SVM
z cost1(z)
z cost0(z) z
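If y=1, we want $\theta^T x \ge 1$ (not merely $\theta^T x \ge 0$), so that $\text{cost}_1(z) = 0$.
If y=0, we want $\theta^T x \le -1$ (not merely $\theta^T x < 0$), so that $\text{cost}_0(z) = 0$.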
C 100000
C 0
y=1 0
0 T x <=-1 0
0 0
C 0 C 0
(margin)
C 100000
y=1 y=0
(outlier)
C plays a role similar to $1/\lambda$ ($C = 1/\lambda$).
12.3
u v
u T v u T v u v
For a vector u with components u1 and u2, the norm (length) of u is $\|u\| = \sqrt{u_1^2 + u_2^2}$.
v v
v1 v2 v u v
v u 90
u p p
v u p v u
$u^T v = p \cdot \|u\|$, where p is the signed length of the projection of v onto u
$u^T v = u_1 v_1 + u_2 v_2 = v^T u$
u v u v v u u
v u
p u T v
p p
u v u v
90 v u p
u T v p u p
u v 90 p
90 p
90
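The relation between the inner product and the projection, including the sign change once the angle exceeds 90 degrees, can be verified with a few numbers. The vectors below are hypothetical and the helper names are chosen for this sketch.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def signed_projection_length(v, u):
    """Signed length p of the projection of v onto u, so that dot(u, v) == p * norm(u)."""
    return dot(u, v) / norm(u)

u = [3.0, 4.0]       # norm(u) is 5
v = [2.0, 1.0]       # angle with u is below 90 degrees
w = [-4.0, -3.0]     # angle with u is above 90 degrees

p_v = signed_projection_length(v, u)   # (6 + 4) / 5 = 2.0
p_w = signed_projection_length(w, u)   # (-12 - 12) / 5 = -4.8, negative
```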
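Assume $\theta_0 = 0$ and n=2, so that each example has only two features x1 and x2. The objective becomes
$\min_\theta \frac{1}{2}\sum_{j=1}^{n}\theta_j^2 = \frac{1}{2}\left(\theta_1^2+\theta_2^2\right) = \frac{1}{2}\left(\sqrt{\theta_1^2+\theta_2^2}\right)^2 = \frac{1}{2}\|\theta\|^2$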
T x
x? u T v
x(i) u v
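$\theta^T x^{(i)} = p^{(i)} \cdot \|\theta\| = \theta_1 x_1^{(i)} + \theta_2 x_2^{(i)}$, where $p^{(i)}$ is the projection of $x^{(i)}$ onto $\theta$
The constraints $\theta^T x^{(i)} \ge 1$ (if $y^{(i)}=1$) and $\theta^T x^{(i)} \le -1$ (if $y^{(i)}=0$) can therefore be written as $p^{(i)} \cdot \|\theta\| \ge 1$ and $p^{(i)} \cdot \|\theta\| \le -1$.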
1
2
2
0 =0
90
, 0 =0
(0,0)
x(1)
p(1) x(2)
p(2)
p(2)
p(2) 90 p(2) 0
p(i)
i
p >=1, p(i) ,
1
. p(1) , p >=1,
2
p(1) p <= -
1 p(2)
x(2)p(2)
(1)
p(1) p(2) p >1
p(1)
p(3)
p(i)
0 =0
0 = 0
0 0
0 0
C 0 0
12.4 1
... $h_\theta(x) = \theta_0 + \theta_1 f_1 + \theta_2 f_2 + \dots + \theta_n f_n$
f1,f2,f3
x
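Given x, choose landmarks l(1), l(2), l(3)
f1, f2, f3
For example, f1 measures how close x is to l(1):
$f_1 = \text{similarity}(x, l^{(1)}) = \exp\left(-\frac{\|x - l^{(1)}\|^2}{2\sigma^2}\right)$ (Gaussian kernel)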
Kernel
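If x is close to landmark L, the distance is approximately 0 and $f \approx e^{-0} = 1$; if x is far from L, $f \approx e^{-(\text{large number})} = 0$.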
x1x2 f x l(1) f
x f 2
l(2) y=1
y=0
f1,f2,f3
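The behaviour of the Gaussian kernel features near and far from a landmark can be checked with a small sketch. The landmark coordinates and the value of sigma are hypothetical, chosen only for illustration.

```python
import math

def gaussian_kernel(x, landmark, sigma=1.0):
    """similarity(x, l) = exp(-||x - l||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, landmark))
    return math.exp(-squared_distance / (2 * sigma ** 2))

def kernel_features(x, landmarks, sigma=1.0):
    """Map x to the feature vector (f1, f2, ...), one feature per landmark."""
    return [gaussian_kernel(x, l, sigma) for l in landmarks]

landmarks = [[3.0, 2.0], [1.0, 4.0], [6.0, 6.0]]   # hypothetical l(1), l(2), l(3)

near = kernel_features([3.0, 2.0], landmarks)      # x sits exactly on l(1), so f1 == 1
far = kernel_features([30.0, 30.0], landmarks)     # x is far from every landmark
```

A larger sigma makes each feature fall off more slowly with distance; a smaller sigma makes it fall off more sharply.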
12.5 2
m l(1)=x(1),l(2)=x(2),...,l(m)=x(m)
Given x, compute the features f and predict y=1 when $\theta^T f \ge 0$
TM T M
liblinear,libsvm
linear kernel
12.6
SVM
SVM
SVM
liblinear libsvm
Polynomial Kernel
String kernel
chi-square kernel
...
Mercer's
k k k
SVM
1 C /
SVM SVM
n m
(1) m n
SVM
SVM
SVM SVM
1 10,000
5 50,000
SVM
SVM SVM
SVM
SVM
SVM
SVM
SVM
SVM
- 8 -(Clustering)
(Clustering)
13.1
x(1),x(2)..
x(m) y
Facebook Google+
13.2 K-
K-
K- n :
K cluster centroids
2-4
10
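$\mu_1, \mu_2, \dots, \mu_K$ denote the cluster centroids, and $c^{(1)}, c^{(2)}, \dots, c^{(m)}$ store the index of the centroid closest to example i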
K-
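Repeat {
  for i = 1 to m
    c(i) := index (from 1 to K) of the cluster centroid closest to x(i)
  for k = 1 to K
    μk := average (mean) of the points assigned to cluster k
}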
for i
for k
K-
K-
T-
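The two alternating steps of K-means (cluster assignment, then moving the centroids) can be sketched in plain Python. This is an illustrative version, not the course's implementation: the function names, the fixed iteration count, the seed, and the toy points are assumptions, and an empty cluster simply keeps its previous centroid.

```python
import random

def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def k_means(points, K, iterations=10, seed=0):
    rng = random.Random(seed)
    # Initialise the centroids to K distinct, randomly chosen training examples.
    centroids = [list(p) for p in rng.sample(points, K)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Cluster assignment step: index of the closest centroid for each point.
        assignments = [
            min(range(K), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Move centroid step: mean of the points assigned to each cluster.
        for k in range(K):
            members = [p for p, c in zip(points, assignments) if c == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignments

# Hypothetical data: two well-separated groups of three points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, assignments = k_means(points, K=2)
```

On data this cleanly separated any initialization converges to the same two clusters; on harder data the result can depend on the initial centroids.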
13.3
K-
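K-means minimizes a cost function known as the distortion function (Distortion function):
$J(c^{(1)},\dots,c^{(m)},\mu_1,\dots,\mu_K) = \frac{1}{m}\sum_{i=1}^{m}\|x^{(i)} - \mu_{c^{(i)}}\|^2$
where $\mu_{c^{(i)}}$ is the centroid of the cluster to which $x^{(i)}$ is assigned.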
K- c(i)
13.4
K-
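1. Choose K<m, so that there are fewer centroids than training examples.
2. Randomly pick K training examples and set the K centroids equal to those K examples.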
K-
K-
K- K
2--10 K
13.5
K-
JK
1 2 2 3 3
K 3 K
T- 3
S,M,L 5 XS,S,M,L,XL T-
- 8 -(Dimensionality Reduction)
(Dimensionality Reduction)
14.1
x1:X2
x1 X2
X1
X2
1000
100
14.2
50 GDP
GDP 50
14.3
PCA
n k u(1),u(2),...,u(k)
Projected
Error
PCA n k 100 10
PCA
PCA PCA
14.4
PCA n k
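Mean normalization: replace each $x_j$ with $x_j - \mu_j$
Compute the covariance matrix (covariance matrix): $\Sigma = \frac{1}{m}\sum_{i=1}^{m} x^{(i)} (x^{(i)})^T$
Compute the eigenvectors (eigenvectors) of $\Sigma$:
[U, S, V] = svd(sigma)
U is an n×n matrix
To reduce the data from n to k dimensions, take the first k columns of U
This gives an n×k matrix Ureduce
$z^{(i)} = U_{reduce}^T x^{(i)}$
x is n×1, so z is k×1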
14.5
1% 99%
95%
K=1 Ureduce z
1% K=2 1% K
K Octave svd
[U, S, V] = svd(sigma)
S nn 0
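Once the diagonal of S is available, the smallest K that retains a given share of the variance can be found with one pass over the singular values. The sketch below assumes the diagonal entries have already been extracted into a list sorted in decreasing order (as svd returns them); the function name and the example values are hypothetical.

```python
def smallest_k(singular_values, variance_to_retain=0.99):
    """Smallest k with sum(S[0:k]) / sum(S) >= variance_to_retain."""
    total = sum(singular_values)
    running = 0.0
    for k, s in enumerate(singular_values, start=1):
        running += s
        if running / total >= variance_to_retain:
            return k
    return len(singular_values)

# Hypothetical diagonal of S for n = 4 features.
s_diagonal = [6.0, 3.0, 0.8, 0.2]

k_85 = smallest_k(s_diagonal, 0.85)   # first two values hold 90% of the total
k_95 = smallest_k(s_diagonal, 0.95)   # first three values hold 98% of the total
k_99 = smallest_k(s_diagonal, 0.99)   # all four values are needed
```

This avoids re-running PCA for K=1, K=2, and so on: a single call to svd provides everything needed to evaluate every candidate K.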
14.6
PCA 1000
100
i i
Z 100 x 1000
PCA x(1),X(2)
Z(1)
Z(1)
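With x 2-dimensional and z 1-dimensional, $z = U_{reduce}^T x$, and the approximate reconstruction is $x_{approx} = U_{reduce} \cdot z$, with $x_{approx} \approx x$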
PCA X Z
PCA PCA
14.7
100×100 pixel images, i.e. 10,000 features
1. 1000
2.
3. Ureduce x z
Ureduce
- 9 -(Anomaly Detection)
(Anomaly Detection)
15.1
(Anomaly detection)
QA
()
x(1) x(m) m
xtest
x(1),x(2),..,x(m) xtest
p(x)
X(i) = i
p(x) =
$p(x) < \varepsilon$
CPU
15.2
x follows a Gaussian distribution: $x \sim N(\mu, \sigma^2)$
m m-1
1/m 1/(m-1)
1/m
15.3
x(1),x(2),...,x(m) 2
p(x)
$p(x) < \varepsilon$
z p(x)
p(x)= p(x)>
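The anomaly detection algorithm (fit a Gaussian to each feature, multiply the per-feature densities to get p(x), then compare p(x) with a threshold) can be sketched as below. The variance uses the 1/m form. The function names, the training data, and the threshold value 0.02 are hypothetical.

```python
import math

def fit_gaussian(X):
    """Estimate mu_j and sigma_j^2 for every feature j, using the 1/m form."""
    m, n = len(X), len(X[0])
    mu = [sum(x[j] for x in X) / m for j in range(n)]
    sigma2 = [sum((x[j] - mu[j]) ** 2 for x in X) / m for j in range(n)]
    return mu, sigma2

def density(x, mu, sigma2):
    """p(x) as the product of the univariate Gaussian densities of each feature."""
    p = 1.0
    for xj, mj, s2 in zip(x, mu, sigma2):
        p *= math.exp(-(xj - mj) ** 2 / (2 * s2)) / math.sqrt(2 * math.pi * s2)
    return p

def is_anomaly(x, mu, sigma2, epsilon):
    return density(x, mu, sigma2) < epsilon

# Hypothetical training set of normal examples with two features.
X_train = [[1.0, 10.0], [2.0, 12.0], [3.0, 11.0], [2.0, 11.0]]
mu, sigma2 = fit_gaussian(X_train)     # mu = [2.0, 11.0], sigma2 = [0.5, 0.5]
epsilon = 0.02                         # hypothetical threshold

typical = is_anomaly([2.0, 11.0], mu, sigma2, epsilon)   # p(x) = 1/pi, not flagged
unusual = is_anomaly([8.0, 20.0], mu, sigma2, epsilon)   # p(x) is tiny, flagged
```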
p(x) x
15.4
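10000 normal examples and 20 anomalous examples
Training set: 6000 normal examples
Cross-validation set: 2000 normal examples and 10 anomalous examples
Test set: 2000 normal examples and 10 anomalous examples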
1. p(x)
2. F1
3. F1
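Choosing the threshold by F1 on the cross-validation set might look like the sketch below. It assumes p(x) has already been computed for every cross-validation example; the function names, the density values, and the candidate thresholds are hypothetical.

```python
def f1_score(predictions, labels):
    tp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(predictions, labels) if p == 0 and y == 1)
    if tp == 0:
        return 0.0  # precision or recall is zero (or undefined)
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def select_epsilon(p_cv, y_cv, candidates):
    """Return the candidate threshold with the highest F1 on the cross-validation set."""
    best_epsilon, best_f1 = None, -1.0
    for epsilon in candidates:
        predictions = [1 if p < epsilon else 0 for p in p_cv]  # 1 means anomaly
        score = f1_score(predictions, y_cv)
        if score > best_f1:
            best_epsilon, best_f1 = epsilon, score
    return best_epsilon, best_f1

# Hypothetical p(x) values and labels (y=1 marks an anomaly).
p_cv = [0.30, 0.25, 0.20, 0.001, 0.28, 0.0005, 0.04]
y_cv = [0, 0, 0, 1, 0, 1, 0]
best_epsilon, best_f1 = select_epsilon(p_cv, y_cv, [0.0001, 0.01, 0.1])
```

F1 is used rather than accuracy because the classes are heavily skewed: predicting "normal" for everything would already be almost always correct.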
15.5
y=1,
y=0
1. 1.
2. 2.
3. 3.
15.6
$x = \log(x + c)$, where c is a non-negative constant
$x = x^c$, where c is a fraction between 0 and 1
p(x)
CPU
15.7
X p(x)
p(x)
p(x)
p(x):
$|\Sigma|$ is the determinant of $\Sigma$, computed in Octave with det(sigma)
1.
2. 1 2
3. 2 1
4.
5.
1233
m>n
m>10n
15.8
n n n
PCA
P(x)
- 9 -(Recommender Systems)
(Recommender Systems)
16.1
iTunes Genius
5 4
Alice Bob
Carol Dave
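$n_u$: the number of users
$n_m$: the number of movies
r(i,j) = 1 if user j has rated movie i
y(i,j): the rating that user j gave to movie i
$m_j$: the total number of movies rated by user j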
16.2
x1 x2
x(1)[0.9 0]
(1)
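$\theta^{(j)}$: the parameter vector of user j
$x^{(i)}$: the feature vector of movie i
For user j and movie i, the predicted rating is $(\theta^{(j)})^T x^{(i)}$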
i:r(i,j) j
1/2m m 0
16.3
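1. Initialize $x^{(1)}, x^{(2)}, \dots, x^{(n_m)}, \theta^{(1)}, \theta^{(2)}, \dots, \theta^{(n_u)}$ to small random values
2. Minimize the cost function with gradient descent
3. After training, predict $(\theta^{(j)})^T x^{(i)}$ as the rating user j gives to movie i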
x(i) x(j)
||x(i)-x(j)||
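Both uses of the learned vectors, predicting a rating and finding the most similar movie by feature distance, fit in a few lines. In this sketch only the feature vector [0.9, 0] for the first movie and the two movie titles come from the notes; the other feature values, the third movie, and the user's parameter vector are hypothetical, and no intercept term is used.

```python
import math

def predict_rating(theta_j, x_i):
    """Predicted rating of user j for movie i: the inner product of theta(j) and x(i)."""
    return sum(t * x for t, x in zip(theta_j, x_i))

def most_similar(title, features):
    """Title of the movie whose feature vector is closest to that of `title`."""
    target = features[title]
    distances = [
        (math.sqrt(sum((a - b) ** 2 for a, b in zip(target, x))), other)
        for other, x in features.items()
        if other != title
    ]
    return min(distances)[1]

features = {
    "Love at last": [0.9, 0.0],
    "Romance forever": [1.0, 0.01],          # hypothetical values
    "Hypothetical action film": [0.1, 1.0],  # hypothetical movie
}
theta_user = [5.0, 0.0]                       # hypothetical user who likes romance

rating = predict_rating(theta_user, features["Love at last"])   # 5 * 0.9 = 4.5
neighbour = most_similar("Love at last", features)
```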
16.4
16.5
1.
2.
Y 5 4
Love at last 5 5 0 0
Romance forever 5 ? ? 0
i x(i)
j x(i) x(j) i j
i j
i 5
5 j
16.6
Eve Eve
Eve
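The prediction becomes $(\theta^{(j)})^T x^{(i)} + \mu_i$, so for Eve the model predicts each movie's mean rating $\mu_i$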
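Mean normalization can be sketched as follows. The two rating rows reuse the values shown for "Love at last" and "Romance forever", with one extra column for Eve, who has rated nothing; `None` marks a missing rating. The helper names and the assumption that Eve's learned parameter vector is all zeros (the effect of regularization when a user has no ratings) are illustrative.

```python
def movie_means(Y):
    """Mean rating of each movie (row), computed over the users who rated it."""
    means = []
    for row in Y:
        rated = [r for r in row if r is not None]
        means.append(sum(rated) / len(rated) if rated else 0.0)
    return means

def normalize(Y, means):
    """Subtract each movie's mean from its known ratings; missing entries stay None."""
    return [[None if r is None else r - mu for r in row] for row, mu in zip(Y, means)]

def predict(theta_j, x_i, mu_i):
    """Add the movie mean back after taking the inner product."""
    return sum(t * x for t, x in zip(theta_j, x_i)) + mu_i

Y = [
    [5, 5, 0, 0, None],        # Love at last; the last column is Eve
    [5, None, None, 0, None],  # Romance forever
]
means = movie_means(Y)                      # [2.5, 2.5]
Y_normalized = normalize(Y, means)

theta_eve = [0.0, 0.0]                      # assumed: no ratings, so theta stays at zero
eve_prediction = predict(theta_eve, [0.9, 0.0], means[0])   # falls back to the mean, 2.5
```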
- 10 -(Large Scale Machine Learning)
10
17.1
100
20
1000
17.2
17.3
b 2-100
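Mini-batch gradient descent with batch size b can be sketched for one-variable linear regression. The values b=5, the learning rate, the number of passes, and the noiseless data are hypothetical choices for this example; in practice the data would usually be shuffled before each pass.

```python
def mini_batch_gradient_descent(examples, b=10, alpha=0.1, epochs=2000):
    """Fit y = theta0 + theta1 * x, updating the parameters after every b examples."""
    theta0, theta1 = 0.0, 0.0
    for _ in range(epochs):
        for start in range(0, len(examples), b):
            batch = examples[start:start + b]
            # Gradient of the squared-error cost averaged over this batch only.
            errors = [(theta0 + theta1 * x - y, x) for x, y in batch]
            grad0 = sum(e for e, x in errors) / len(batch)
            grad1 = sum(e * x for e, x in errors) / len(batch)
            theta0 -= alpha * grad0
            theta1 -= alpha * grad1
    return theta0, theta1

# Hypothetical noiseless data generated from y = 1 + 2x.
examples = [(i / 10.0, 1.0 + 2.0 * (i / 10.0)) for i in range(20)]
theta0, theta1 = mini_batch_gradient_descent(examples, b=5)
```

With b=1 this reduces to stochastic gradient descent and with b=m to batch gradient descent; intermediate values let each update be computed with vectorized operations.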
17.4
X X
1000
17.5
A B
$50 $20
A B
y=1y=0
p(y=1)
2 3 3
17.6
CPU
400 4
CPU
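The idea of splitting 400 examples across 4 machines can be simulated in a single process: each "machine" computes the gradient sum over its own chunk, and a central step adds the partial sums. The sketch assumes a one-parameter linear model, and the function names and data are hypothetical; a real system would run the partial sums in parallel.

```python
import math

def partial_sum(theta, chunk):
    """Work done on one machine: the un-normalised gradient sum over its chunk."""
    return sum((theta * x - y) * x for x, y in chunk)

def distributed_gradient(theta, examples, machines=4):
    size = math.ceil(len(examples) / machines)
    chunks = [examples[k * size:(k + 1) * size] for k in range(machines)]
    partials = [partial_sum(theta, chunk) for chunk in chunks]  # would run in parallel
    return sum(partials) / len(examples)                        # combined centrally

# Hypothetical data set of 400 examples generated from y = 3x.
examples = [(i / 100.0, 3.0 * i / 100.0) for i in range(400)]
split_gradient = distributed_gradient(0.5, examples)
full_gradient = sum((0.5 * x - y) * x for x, y in examples) / len(examples)
```

The split works because the gradient is a sum over examples, so the partial sums can be computed independently and added afterwards.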
- 10 -(Application Example: Photo OCR)
18.1
1. Text detection
2. Character segmentation
3. Character classification
18.2
18.3
1.
2.
3.
18.4
100%
72%
100% 72%
89%
100%
1%
100%
10%
- 10 -(Conclusion)
(Conclusion)
19.1
x(i)y(i)
K-
x(i)
F1
Andrew Ng