Data Mining - Chapter 3
Semester 1, 2009-2010
Contents
3.1.
3.2.
3.3.
3.4. Applications
3.5. Issues with regression
3.6. Summary
[Figure: the line y = x + 1, with labels X1 and Y1]
A survey of the factors influencing the tendency to use online advertising in Vietnam
Perceived reliability
Interactivity (+0.373)
Regression
Types
X: the predictor (independent) variables
Y: the responses (dependent variables)
The relationship describes the influence of X on Y.
Types of regression
Linear and nonlinear regression
Single: X = (X1)
Multiple: X = (X1, X2, ..., Xk)
- Straight-line form
- Parabolic form
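Estimate the model parameters to obtain the simple (single-variable) linear regression model.
Residual: the difference between an observed response and the value predicted by the model.
Sum of squared residuals: the quantity to be minimized.
Estimated values of the parameters: the values that minimize the sum of squared residuals.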
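The least-squares idea for the single-variable case can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own procedure: the function names `fit_simple_linear` and `sum_squared_residuals`, the parameter names `b0` (intercept) and `b1` (slope), and the sample points on the line y = x + 1 are all assumptions made for this example.

```python
def fit_simple_linear(xs, ys):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b1 = sxy / sxx               # slope that minimizes the squared residuals
    b0 = y_mean - b1 * x_mean    # the fitted line passes through the means
    return b0, b1


def sum_squared_residuals(xs, ys, b0, b1):
    """Sum of (observed - predicted)^2 for the line y = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Illustrative data (assumed): four points lying exactly on y = x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0, 4.0]
b0, b1 = fit_simple_linear(xs, ys)
print(b0, b1, sum_squared_residuals(xs, ys, b0, b1))
```

Because the sample points lie exactly on a line, the fitted residuals are zero; with noisy data the same code returns the line with the smallest sum of squared residuals.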
Estimated value of the parameter set b
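$$y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_k x_k$$

$$b = (X^{T} X)^{-1} X^{T} Y$$

$$
Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}, \quad
X = \begin{bmatrix}
1 & x_{1,1} & x_{1,2} & \dots & x_{1,k} \\
1 & x_{2,1} & x_{2,2} & \dots & x_{2,k} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{n,1} & x_{n,2} & \dots & x_{n,k}
\end{bmatrix}, \quad
b = \begin{bmatrix} b_0 \\ b_1 \\ \vdots \\ b_k \end{bmatrix}
$$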
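The least-squares estimate $b = (X^{T} X)^{-1} X^{T} Y$ can be derived by minimizing the sum of squared residuals in matrix form. The steps below are a standard sketch, assuming $X^{T} X$ is invertible; the residual vector $e$ and the objective $S(b)$ are notation introduced here for illustration.

```latex
\begin{aligned}
e &= Y - Xb \\
S(b) &= e^{T} e = (Y - Xb)^{T} (Y - Xb) \\
\frac{\partial S}{\partial b} &= -2\, X^{T} (Y - Xb) = 0 \\
X^{T} X\, b &= X^{T} Y \\
b &= (X^{T} X)^{-1} X^{T} Y
\end{aligned}
```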
x1      Population (thousands), x2    Toy sales (thousands of dollars), y
1.0     200                           100
5.0     700                           300
8.0     800                           400
6.0     400                           200
3.0     100                           100
10.0    600                           400
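As a hedged illustration, the toy sales data can be fitted with the normal equations $X^{T} X\, b = X^{T} Y$ using only the Python standard library. The sketch assumes that the three values in each row are, in order, a first predictor x1, the population x2, and the toy sales y. The helper names (`transpose`, `matmul`, `solve`, `fit_multiple_linear`) are hypothetical, and the system is solved by Gaussian elimination rather than by forming the inverse explicitly.

```python
def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Matrix product of two matrices given as lists of rows."""
    bt = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in bt] for row in a]


def solve(a, rhs):
    """Solve a @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in row] + [float(r)] for row, r in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular; the predictors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_multiple_linear(rows, ys):
    """Return [b0, b1, ..., bk] solving the normal equations X^T X b = X^T Y."""
    X = [[1.0] + list(r) for r in rows]  # leading 1 gives the intercept b0
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    XtY = [sum(v * y for v, y in zip(col, ys)) for col in Xt]
    return solve(XtX, XtY)


# Rows are (x1, x2); the column order is an assumption about the table.
rows = [(1.0, 200), (5.0, 700), (8.0, 800), (6.0, 400), (3.0, 100), (10.0, 600)]
sales = [100, 300, 400, 200, 100, 400]
b = fit_multiple_linear(rows, sales)
print(b)
```

Under that column assumption, the fitted coefficients come out at roughly b0 = 6.4, b1 = 20.5 and b2 = 0.28.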
Y = f(X, β)
Examples: exponential, logarithmic, and Gaussian functions, ...
Local optimization
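To make the role of local optimization concrete, here is a minimal sketch that fits an exponential model y = a * exp(b * x) with Gauss-Newton iterations. Gauss-Newton is one common local method chosen for this example; the slides do not say which optimizer they have in mind. The function name `gauss_newton_exponential`, the parameter names `a` and `b`, the starting guess, and the noise-free sample data are all assumptions.

```python
import math


def gauss_newton_exponential(xs, ys, a, b, iterations=50, tol=1e-12):
    """Fit y = a * exp(b * x) by Gauss-Newton from the starting guess (a, b)."""
    for _ in range(iterations):
        # Accumulate the 2x2 matrix J^T J and the vector J^T r, where r holds
        # the residuals and J the partial derivatives of the model w.r.t. (a, b).
        jtj_aa = jtj_ab = jtj_bb = 0.0
        jtr_a = jtr_b = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            r = y - a * e          # residual at the current parameters
            da = e                 # d(model)/da
            db = a * x * e         # d(model)/db
            jtj_aa += da * da
            jtj_ab += da * db
            jtj_bb += db * db
            jtr_a += da * r
            jtr_b += db * r
        det = jtj_aa * jtj_bb - jtj_ab * jtj_ab
        if det == 0.0:
            raise ValueError("J^T J is singular; try another starting guess")
        # Solve (J^T J) * step = J^T r for the 2x2 case.
        step_a = (jtj_bb * jtr_a - jtj_ab * jtr_b) / det
        step_b = (jtj_aa * jtr_b - jtj_ab * jtr_a) / det
        a += step_a
        b += step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    return a, b


# Noise-free illustrative data generated from a = 2, b = 0.5 (assumed values).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a_hat, b_hat = gauss_newton_exponential(xs, ys, a=1.5, b=0.4)
print(a_hat, b_hat)
```

Because the method is local, the result depends on the starting guess: a poor guess can converge to a different local optimum or fail to converge at all.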
3.4. Applications
3.5. Issues with regression
The assumptions that accompany the regression problem.
The amount of data processed.
Evaluating the regression model.
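One widely used way to evaluate a fitted regression model is the coefficient of determination, R². The slides do not name a specific measure, so the sketch below is only one possible choice; the function name `r_squared` and the sample values are assumptions.

```python
def r_squared(ys, predictions):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    if len(ys) != len(predictions) or not ys:
        raise ValueError("ys and predictions must be non-empty and equal length")
    y_mean = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("all responses are identical; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


observed = [1.0, 2.0, 3.0]
perfect = r_squared(observed, [1.0, 2.0, 3.0])    # predictions match exactly
mean_only = r_squared(observed, [2.0, 2.0, 2.0])  # always predicts the mean
print(perfect, mean_only)
```

A value near 1 means the model accounts for most of the variation in the responses, while a value near 0 means it does no better than always predicting the mean.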
3.6. Summary
Regression
Questions & Answers