
Lec1: Population, Sample, and Statistic

Ü•²

February 14, 2012

1 Population and Sample

First, let us look at several definitions of statistics:

"A branch of mathematics dealing with the collection, analysis, interpretation and presentation of masses of numerical data" (Webster's New Collegiate Dictionary)

"The branch of the scientific method which deals with the data obtained by counting or measuring the properties of populations" (Fraser, 1958)

"The entire science of decision making in the face of uncertainty" (Freund and Walpole, 1987)

"The technology of the scientific method concerned with (1) the design of experiments and investigations, (2) statistical inference" (Mood, Graybill and Boes, 1974)

All of these definitions imply that statistics is a theory whose purpose is inference.

1.1 Population

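Simply put, a population is the set of all the individuals we are interested in. For example:

Example 1.1. Suppose a batch contains 10000 products, some non-defective and some defective. To estimate the defective rate, we draw part of the batch, say 100 products, for inspection. Each product is an individual, the batch of 10000 products is called the population, and the number of products in the batch, 10000, is called the population size.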

If the number of individuals in the population is finite, the population is called a finite population; otherwise it is called an infinite population.
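A subset of the population is called a subpopulation. If different subpopulations have different characteristics (or properties), then separating them at the start of a study leads to a better understanding of the population. For example, if a particular drug has different effects on particular subpopulations and these subpopulations are not distinguished, the effect of the drug may be masked and go undetected.

Distinguishing the subpopulations, on the other hand, yields more precise estimates (inferences). For example, to study the distribution of people's heights, dividing the people into subpopulations by age, ethnicity and so on allows more precise inference.

Accordingly, depending on whether a population contains subpopulations with markedly different properties, populations can be classified as homogeneous or heterogeneous. The vast majority of statistical methods require the population to be homogeneous. Here we consider only homogeneous populations.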
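Furthermore, what people really care about is not the individuals themselves but one (or several) numerical characteristics of each individual, such as the lifetime of a fluorescent lamp or the size of a machine part. In the example above, if a non-defective product is coded 0 and a defective product is coded 1, we care whether the value attached to an individual is 0 or 1. This gives the following alternative definition of a population:

A population can be regarded as the set formed by a certain numerical characteristic of all the individuals, so it is a set of numbers.

Since each individual appears at random, the corresponding numerical characteristic also appears at random. We can therefore regard this numerical characteristic as a random variable (abbreviated r.v.), and the distribution of the characteristic in the population is the distribution of this random variable. To illustrate with the example above, suppose that 100 of the 10000 products are defective and the rest are non-defective, so that the defective rate is 0.01. Define the random variable $X$ by

\[
X = \begin{cases} 1, & \text{defective}, \\ 0, & \text{non-defective}, \end{cases}
\]

whose probability distribution is the 0-1 distribution $B(1, 0.01)$. The numerical characteristic of a particular individual is therefore an observation of the r.v. $X$.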
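The coding of Example 1.1 as a 0-1 random variable can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the list `population`, the helper `defective_rate`, and the seed are hypothetical names and choices, not part of the notes.

```python
import random

# Hypothetical encoding of Example 1.1: 10000 products, 100 of them defective.
population = [1] * 100 + [0] * 9900   # 1 = defective, 0 = non-defective

def defective_rate(items):
    """Proportion of ones, i.e. P(X = 1) when X is drawn uniformly from items."""
    return sum(items) / len(items)

rng = random.Random(0)                 # arbitrary fixed seed
observation = rng.choice(population)   # one observation of the r.v. X

print(defective_rate(population))      # 0.01
print(observation)                     # either 0 or 1
```

Drawing one item uniformly at random gives a value of X, and the proportion of ones in the list is the parameter 0.01 of the 0-1 distribution.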


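In this way, a population can be described by a random variable and its distribution, which gives the following definition:

Definition 1.1. A population is a probability distribution.

Since a population can be regarded as a probability distribution, if that distribution is called the "xx distribution", the population is often called an "xx population". For example, if the distribution is normal we speak of a "normal population"; if it is exponential we speak of an "exponential population".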

1.2 Sample

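A sample is a set consisting of some of the individuals drawn from the population (according to some rule). The number of individuals contained in the sample is called the sample size. The process of drawing individuals to form a sample is called sampling. There are two kinds of sampling: probability sampling and nonprobability sampling.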

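• Probability sampling. In probability sampling, every individual in the population is drawn into the sample with a probability that can be determined (exactly) in advance. Methods of probability sampling include simple random sampling, systematic sampling, stratified sampling and multistage sampling (for details, see a text on survey sampling). The most commonly used is simple random sampling, which has two features:
(1) Every individual is drawn with the same probability. This means that every individual is representative.
(2) The values of the individuals are mutually independent.
Hence, if the sample is denoted by $X_1, \dots, X_n$ and the population by $X$, then under simple random sampling $X_1, \dots, X_n$ are independent and identically distributed with the same distribution as $X$, usually written

\[
X_1, \dots, X_n \ \text{i.i.d.} \sim X.
\]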

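A sample $(X_1, \dots, X_n)$ obtained by simple random sampling is called a simple random sample. In mathematical terms, the definition reads as follows:

Definition 1.2. Let $F$ be a population and let $X_1, \dots, X_n$ be a sample of size $n$ drawn from $F$. If
(i) $X_1, \dots, X_n$ are mutually independent, and
(ii) $X_1, \dots, X_n$ are identically distributed, with common distribution $F$,
then $X_1, \dots, X_n$ is called a simple random sample, sometimes abbreviated to simple sample or random sample.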

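Let the population be $F$ and let $X_1, \dots, X_n$ be a simple random sample drawn from it. Then the joint distribution function of $X_1, \dots, X_n$ is

\[
F(x_1) \cdot F(x_2) \cdots F(x_n) = \prod_{i=1}^{n} F(x_i).
\]

If $F$ has a density $f$, then the joint density is

\[
f(x_1) \cdot f(x_2) \cdots f(x_n) = \prod_{i=1}^{n} f(x_i).
\]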
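As a small illustration of the product form of the joint density, the sketch below evaluates it for an exponential population. The function names `exp_density` and `joint_density`, the rate 1.0, and the sample values are assumptions made for this example only.

```python
import math

def exp_density(x, rate=1.0):
    """Density of an exponential population (rate chosen for illustration)."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0

def joint_density(xs, density):
    """Joint density of a simple random sample: the product of f(x_i)."""
    return math.prod(density(x) for x in xs)

sample_values = [0.5, 1.0, 1.5]
# For rate 1 the product equals exp(-(0.5 + 1.0 + 1.5)) = exp(-3).
print(joint_density(sample_values, exp_density))
print(math.exp(-3.0))
```

The two printed numbers agree up to floating-point rounding, because the product of the individual densities collapses to a single exponential of the sum.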

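When the population size is finite, a simple random sample can be obtained only by sampling with replacement. When the population size is very large, or the sample makes up only a very small proportion of the population, a sample obtained by sampling without replacement can be regarded approximately as a simple random sample.
Here we consider only simple random samples, referred to below simply as samples.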

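• Nonprobability sampling. This refers to sampling methods in which some individuals in the population have no chance of being drawn, or in which the probability that an individual is drawn cannot be determined exactly. The criteria by which individuals are selected therefore rest on assumptions about the population of interest. Because individuals are not drawn at random, nonprobability sampling does not allow the sampling error to be estimated. Methods of nonprobability sampling include accidental sampling, quota sampling and purposive sampling.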

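Before the individuals are drawn, a sample of size $n$ can be regarded as random variables and is therefore usually written $X_1, \dots, X_n$. After the individuals are drawn, the sample appears as concrete numbers $x_1, \dots, x_n$ (called the observed values of the sample, or a set of sample values). The set of values the sample can take is usually called the sample space, denoted $\mathcal{X}$. A sample can therefore be viewed both as random variables (before sampling) and as concrete numbers (after sampling).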

1.3 Distribution of the Sample

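Since a sample can be regarded as random variables, it has a probability distribution, called the distribution of the sample. To determine the distribution of the sample, one must rely on the nature of the characteristic being observed (which often involves subject-matter knowledge) and on an understanding of how the sampling and the experiment are carried out; in addition, some assumptions often have to be imposed. Here are some examples: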

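Example 1.2. A large batch contains $N$ products, of which $M$ are defective; $N$ is known and $M$ is unknown. We draw $n$ products and count the defective ones among them in order to estimate $M$, or the defective rate $p = M/N$. The sampling scheme is sampling without replacement: one product is drawn at a time, in sequence, until $n$ have been drawn. Find the distribution of the sample.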

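First we quantify the problem. Let $X_i$ denote the result of the $i$-th draw, and set

\[
X_i = \begin{cases} 1, & \text{the product drawn is defective}, \\ 0, & \text{the product drawn is non-defective}. \end{cases}
\]

Each of $X_1, \dots, X_n$ can take only the values 0 and 1. Given a set of sample values $x_1, \dots, x_n$, each $x_i$ is 0 or 1. The distribution of the sample we seek is $P(X_1 = x_1, \dots, X_n = x_n)$. Writing $A_i = \{X_i = x_i\}$ and using the multiplication rule for probabilities,

\[
P(A_1 \cdots A_n) = P(A_1) P(A_2 | A_1) \cdots P(A_n | A_1 A_2 \cdots A_{n-1}),
\]

the distribution of the sample is not hard to find. For ease of discussion, first take $n = 3$ with $x_1 = 1$, $x_2 = 0$, $x_3 = 1$. Then

\begin{align*}
P(X_1 = 1, X_2 = 0, X_3 = 1)
&= P(X_1 = 1) P(X_2 = 0 | X_1 = 1) P(X_3 = 1 | X_1 = 1, X_2 = 0) \\
&= \frac{M}{N} \cdot \frac{N-M}{N-1} \cdot \frac{M-1}{N-2}
 = \frac{M}{N} \cdot \frac{M-1}{N-1} \cdot \frac{N-M}{N-2}.
\end{align*}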
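In the general case, write $\sum_{i=1}^{n} x_i = a$. The multiplication rule readily gives

\[
P(X_1 = x_1, X_2 = x_2, \dots, X_n = x_n)
= \frac{M}{N} \cdot \frac{M-1}{N-1} \cdots \frac{M-a+1}{N-a+1} \cdot \frac{N-M}{N-a} \cdots \frac{N-M-n+a+1}{N-n+1}, \tag{1.1}
\]

which holds when $x_1, \dots, x_n$ are all 0 or 1 and $\sum_{i=1}^{n} x_i = a$ (the probability is 0 in all other cases).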

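This calculation shows that $X_1, \dots, X_n$ are not mutually independent; the distribution of the sample is obtained from the multiplication rule by way of conditional probabilities.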
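The multiplication-rule computation for sampling without replacement can be checked numerically. The following is a minimal sketch, assuming $n \le N$; the function names and the values $N = 10$, $M = 3$ are hypothetical and chosen only for illustration. It uses exact rational arithmetic, so the step-by-step conditional product and the closed form (1.1) can be compared for equality.

```python
from fractions import Fraction

def prob_without_replacement(xs, N, M):
    """P(X1=x1,...,Xn=xn) via the multiplication rule, drawing without replacement."""
    prob = Fraction(1)
    defective_left, total_left = M, N
    for x in xs:
        if x == 1:   # conditional probability of drawing a defective product
            prob *= Fraction(defective_left, total_left)
            defective_left -= 1
        else:        # conditional probability of drawing a non-defective product
            prob *= Fraction(total_left - defective_left, total_left)
        total_left -= 1
    return prob

def formula_1_1(xs, N, M):
    """Closed form (1.1), which depends on xs only through n and a = sum(xs)."""
    n, a = len(xs), sum(xs)
    prob = Fraction(1)
    for j in range(a):
        prob *= Fraction(M - j, N - j)
    for j in range(n - a):
        prob *= Fraction(N - M - j, N - a - j)
    return prob

print(prob_without_replacement((1, 0, 1), N=10, M=3))  # 7/120
print(formula_1_1((1, 0, 1), N=10, M=3))               # 7/120
```

Reordering the sequence, for example to (1, 1, 0), gives the same probability, which is why (1.1) depends only on the number of ones.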

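Example 1.3. Consider the previous example again, but change the sampling scheme to sampling with replacement: after each draw the result is recorded and the product is put back before the next one is drawn, until $n$ draws have been made. Find the distribution of the sample.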

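Under sampling with replacement, at each draw every one of the $N$ products is drawn with probability $1/N$, so $P(X_i = 1) = M/N$ and $P(X_i = 0) = (N - M)/N$. Hence

\[
P(X_1 = x_1, \dots, X_n = x_n) = \left( \frac{M}{N} \right)^{a} \left( \frac{N-M}{N} \right)^{n-a}, \tag{1.2}
\]

which holds when $x_1, \dots, x_n$ are all 0 or 1 and $\sum_{i=1}^{n} x_i = a$ (the probability is 0 in all other cases).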

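This example is simpler than the previous one, because here $X_1, \dots, X_n$ are independent and identically distributed, whereas in the previous example $X_1, \dots, X_n$ are not independent. When $n/N$ is very small, (1.1) and (1.2) differ very little. Hence, when $n/N$ is very small, the sampling without replacement of the previous example can be treated as sampling with replacement.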
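How close (1.1) and (1.2) are for small $n/N$ can be seen numerically. The sketch below is illustrative only: the function names and the two parameter settings (both with defective rate 0.25) are assumptions, not values from the notes.

```python
from fractions import Fraction

def p_without_replacement(a, n, N, M):
    """Formula (1.1): probability of one fixed 0/1 sequence containing a ones."""
    prob = Fraction(1)
    for j in range(a):
        prob *= Fraction(M - j, N - j)
    for j in range(n - a):
        prob *= Fraction(N - M - j, N - a - j)
    return prob

def p_with_replacement(a, n, N, M):
    """Formula (1.2) for the same sequence under sampling with replacement."""
    return Fraction(M, N) ** a * Fraction(N - M, N) ** (n - a)

def ratio(a, n, N, M):
    """Ratio (1.1)/(1.2); values near 1 mean the two schemes nearly agree."""
    return float(p_without_replacement(a, n, N, M) / p_with_replacement(a, n, N, M))

print(ratio(1, 10, 20, 5))          # n/N = 0.5: clearly different from 1
print(ratio(1, 10, 100000, 25000))  # n/N = 0.0001: very close to 1
```

With half the batch sampled, the two probabilities differ substantially; with a sampling fraction of 0.0001 they are practically indistinguishable.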

1.4 Statistical Inference

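The method of drawing a sample of a certain size from a population in order to infer the probability distribution of the population is called statistical inference.
When the form of the population distribution $F$ is known but it contains finitely many unknown parameters, the problem under study usually takes the form of some inference about the parameters. For example: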
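Example 1.4. Suppose the measurement error follows the normal distribution $N(0, \sigma^2)$, and an object is weighed $n$ times. Then the joint density of the sample $X_1, \dots, X_n$ is

\[
f(x_1, \dots, x_n) = (2\pi\sigma^2)^{-n/2} \exp\left\{ -\frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - a)^2 \right\},
\]

where $a$ denotes the weight of the object. Statistical inference for this experiment may consist of estimating the weight of the object (using $\bar{X}$ as the estimate), or estimating the precision of the weighing. Problems of this kind are therefore called parametric statistics.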

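Statistical inference carried out when the form of the population distribution is unknown is called nonparametric statistical inference; its main purpose is to make inferences about the population distribution.
Statistical inference covers the following three aspects:
(1) Proposing various methods of statistical inference.
(2) Computing numerical measures of the performance of an inference method. In the preceding example, when $\bar{X}$ is used to estimate $a$ in $N(a, \sigma^2)$, the probability $P(|\bar{X} - a| > c)$ serves as a numerical measure of performance.
(3) Finding the optimal method of statistical inference under given conditions and optimality criteria, or proving that a certain method is optimal.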

2 Statistics

Definition 2.1. A quantity computed from the sample is called a statistic; in other words, a statistic is a function of the sample.

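We make the following remarks on this definition:
(1) A statistic depends only on the sample and must not depend on unknown parameters. For example, let $X \sim N(a, \sigma^2)$ and let $X_1, \dots, X_n$ be an i.i.d. sample drawn from the population $X$. Then $\sum_{i=1}^{n} X_i$ and $\sum_{i=1}^{n} X_i^2$ are both statistics, whereas, when $a$ and $\sigma^2$ are unknown parameters, neither $\sum_{i=1}^{n} (X_i - a)$ nor $\sum_{i=1}^{n} X_i^2 / \sigma^2$ is a statistic.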

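(2) A sample has a dual nature: it can be viewed both as concrete numbers and as random variables. A statistic is a function of the sample, so it has the same dual nature. Because a statistic can be regarded as a random variable (or random vector), it has a probability distribution, and this is the basis on which we use statistics for statistical inference.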

I. Some Commonly Used Statistics

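1. Sample mean

Let $X_1, \dots, X_n$ be a sample drawn from a population $X$. Then

\[
\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i
\]

is called the sample mean. It reflects information about the population expectation.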

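2. Sample variance

Let $X_1, \dots, X_n$ be a sample drawn from a population $X$. Then

\[
S_n^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2
\]

is called the sample variance. It reflects information about the population variance, and $S_n$ reflects information about the population standard deviation.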
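A short numerical sketch of the sample mean and the sample variance with divisor $n - 1$ follows. The data set and function names are made up for illustration; the comparison against Python's standard `statistics` module is only a sanity check.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observed sample values

def sample_mean(xs):
    """X-bar = (1/n) * sum of the observations."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """S_n^2 with divisor n - 1, as in the definition above."""
    mean = sample_mean(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

print(sample_mean(data))                   # 5.0
print(sample_variance(data))               # 32/7 as a float
print(math.sqrt(sample_variance(data)))    # S_n, the sample standard deviation
print(statistics.variance(data))           # the standard library uses n - 1 too
```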

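3. Sample moments

Let $X_1, \dots, X_n$ be a sample drawn from a population $F$. Then

\[
a_{n,k} = \frac{1}{n} \sum_{i=1}^{n} X_i^k, \qquad k = 1, 2, \dots
\]

is called the $k$-th sample raw moment (moment about the origin). In particular, for $k = 1$, $a_{n,1} = \bar{X}$ is the sample mean. Likewise,

\[
m_{n,k} = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^k, \qquad k = 2, 3, \dots
\]

is called the $k$-th sample central moment. In particular, for $k = 2$, $m_{n,2} = \frac{n-1}{n} S_n^2$, which agrees with the sample variance up to the factor $(n-1)/n$. Sample raw moments and sample central moments are together called sample moments.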
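The raw and central sample moments can be computed directly from their definitions. In this sketch the data and function names are hypothetical. Note that with the divisor $1/n$ used for $m_{n,k}$ and the divisor $1/(n-1)$ used for $S_n^2$, the second central moment equals $\frac{n-1}{n} S_n^2$, which the last lines check.

```python
import math

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observed sample values

def raw_moment(xs, k):
    """a_{n,k} = (1/n) * sum of x_i^k."""
    return sum(x ** k for x in xs) / len(xs)

def central_moment(xs, k):
    """m_{n,k} = (1/n) * sum of (x_i - mean)^k."""
    mean = raw_moment(xs, 1)
    return sum((x - mean) ** k for x in xs) / len(xs)

n = len(data)
s2 = sum((x - raw_moment(data, 1)) ** 2 for x in data) / (n - 1)  # S_n^2

print(raw_moment(data, 1))        # 5.0, the sample mean
print(central_moment(data, 2))    # 4.0
print((n - 1) / n * s2)           # equals m_{n,2} up to rounding
```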

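4. Sample moments of a two-dimensional random vector

Let $(X_1, Y_1), \dots, (X_n, Y_n)$ be a sample drawn from a two-dimensional population $F(x, y)$. Then

\begin{align*}
\bar{X} &= \frac{1}{n} \sum_{i=1}^{n} X_i, & S_X^2 &= \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2, \\
\bar{Y} &= \frac{1}{n} \sum_{i=1}^{n} Y_i, & S_Y^2 &= \frac{1}{n-1} \sum_{i=1}^{n} (Y_i - \bar{Y})^2, \\
S_{XY} &= \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})
\end{align*}

are called, respectively, the sample means of $X$ and $Y$, the sample variances of $X$ and $Y$, and the sample covariance of $X$ and $Y$.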
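A minimal sketch of the sample covariance, using the divisor $1/n$ exactly as in the definition of $S_{XY}$ above (many software packages divide by $n - 1$ instead). The paired data and the function name are hypothetical.

```python
import math

xs = [1.0, 2.0, 3.0, 4.0]   # hypothetical paired observations
ys = [2.0, 4.0, 6.0, 8.0]

def sample_covariance(xs, ys):
    """S_XY = (1/n) * sum of (x_i - x_bar)(y_i - y_bar), divisor n as in the text."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n

print(sample_covariance(xs, ys))   # 2.5
```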

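5. Order statistics and related statistics

Let $X_1, \dots, X_n$ be a sample drawn from a population $F$, and arrange the observations in increasing order as $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$. Then $(X_{(1)}, X_{(2)}, \dots, X_{(n)})$ is called the order statistic, and any part of $(X_{(1)}, \dots, X_{(n)})$ is also called an order statistic.
The following statistics can be defined in terms of order statistics:
(1) Sample median:

\[
m_{1/2} = \begin{cases} X_{\left(\frac{n+1}{2}\right)}, & n \text{ odd}, \\[4pt] \frac{1}{2}\left[ X_{(n/2)} + X_{(n/2+1)} \right], & n \text{ even}. \end{cases} \tag{2.1}
\]

The sample median reflects information about the population median. When the population distribution is symmetric about some point, the center of symmetry is both the population median and the population mean, so in that case $m_{1/2}$ also reflects information about the population mean.
(2) Sample extremes: $X_{(1)}$ and $X_{(n)}$ are called the sample minimum and the sample maximum; together they are called the extreme values of the sample. Extreme-value statistics are commonly used in the statistical analysis of natural disasters and of materials testing.
(3) Sample $p$-quantile ($0 < p < 1$): this can be defined as $X_{([(n+1)p])}$, where $[a]$ denotes the integer part of the real number $a$. When $p = 1/2$ and $n$ is odd, this definition coincides with the sample median in (1). The sample $p$-quantile (sample $p$-fractile) reflects information about the population $p$-quantile.
(4) Sample range: $R = X_{(n)} - X_{(1)}$ is called the sample range. It reflects information about the dispersion of the population distribution.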
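The statistics built from order statistics can be sketched as follows. The function names and data are hypothetical, and the quantile uses the definition $X_{([(n+1)p])}$ from item (3); raising an error when the index $[(n+1)p]$ falls outside $1, \dots, n$ is a choice made for this sketch, not something the notes specify.

```python
import math

def sample_median(sample):
    """Formula (2.1): middle order statistic, or the average of the two middle ones."""
    xs = sorted(sample)
    n = len(xs)
    if n % 2 == 1:
        return xs[(n + 1) // 2 - 1]          # X_((n+1)/2), shifted to 0-based index
    return 0.5 * (xs[n // 2 - 1] + xs[n // 2])

def sample_p_quantile(sample, p):
    """X_([(n+1)p]), where [a] is the integer part of a."""
    xs = sorted(sample)
    n = len(xs)
    k = math.floor((n + 1) * p)
    if not 1 <= k <= n:
        raise ValueError("[(n+1)p] must lie between 1 and n")
    return xs[k - 1]

def sample_range(sample):
    """R = X_(n) - X_(1)."""
    return max(sample) - min(sample)

data = [7, 1, 5, 3, 9]
print(sorted(data))                    # the order statistics: [1, 3, 5, 7, 9]
print(sample_median(data))             # 5
print(sample_p_quantile(data, 0.5))    # 5, same as the median since n is odd
print(sample_range(data))              # 8
```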

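6. Sample coefficient of variation

Let $X_1, \dots, X_n$ be a sample drawn from a population $F$. Then

\[
\hat{V} = S_n / \bar{X} \tag{2.2}
\]

is called the sample coefficient of variation. It reflects information about the population coefficient of variation $c_\nu$, which is defined as $c_\nu = \sqrt{\mathrm{Var}(X)} / E(X)$. This is a measure of the dispersion of the population distribution, with the dispersion measured in units of the population mean.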

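7. Sample skewness

Let $X_1, \dots, X_n$ be a sample drawn from a population $F$. Then

\[
\hat{\beta}_1 = \frac{m_{n,3}}{m_{n,2}^{3/2}} = \sqrt{n} \sum_{i=1}^{n} (X_i - \bar{X})^3 \bigg/ \left[ \sum_{i=1}^{n} (X_i - \bar{X})^2 \right]^{3/2} \tag{2.3}
\]

is called the sample skewness. It reflects information about the population skewness, which is defined as $\beta_1 = \mu_3 / \mu_2^{3/2}$, where $\mu_i$ ($i = 2, 3$) is the $i$-th central moment of the population. $\beta_1$ is a measure of the asymmetry, or "skew", of the population distribution. The skewness of the normal distribution $N(a, \sigma^2)$ is zero.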

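8. Sample kurtosis

Let $X_1, \dots, X_n$ be a sample drawn from a population $F$. Then

\[
\hat{\beta}_2 = \frac{m_{n,4}}{m_{n,2}^{2}} - 3 = n \sum_{i=1}^{n} (X_i - \bar{X})^4 \bigg/ \left[ \sum_{i=1}^{n} (X_i - \bar{X})^2 \right]^{2} - 3 \tag{2.4}
\]

is called the sample kurtosis. It reflects information about the population kurtosis $\beta_2$, which is defined as $\beta_2 = \mu_4 / \mu_2^2 - 3$, where $\mu_i$ ($i = 2, 4$) is the $i$-th central moment of the population, as before. $\beta_2$ is a measure of how sharply peaked the density curve of the population distribution is near its mode. The kurtosis of the normal distribution $N(a, \sigma^2)$ is zero.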
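The coefficient of variation, skewness and kurtosis of a sample, formulas (2.2) to (2.4), can be computed from the central moments. The sketch below uses made-up data and hypothetical function names, and assumes the sample mean is nonzero and the observations are not all equal.

```python
import math

def central_moment(xs, k):
    """m_{n,k} = (1/n) * sum of (x_i - mean)^k."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** k for x in xs) / len(xs)

def coefficient_of_variation(xs):
    """Formula (2.2): S_n / X-bar, with S_n^2 using divisor n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    s_n = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    return s_n / mean

def sample_skewness(xs):
    """Formula (2.3): m_{n,3} / m_{n,2}^(3/2)."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5

def sample_kurtosis(xs):
    """Formula (2.4): m_{n,4} / m_{n,2}^2 - 3."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observed sample values
print(coefficient_of_variation(data))
print(sample_skewness(data))      # positive: the longer tail is on the right
print(sample_kurtosis(data))
```

For a sample that is symmetric about its mean, such as 1, 2, 3, the third central moment vanishes and the sample skewness is zero.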

II. Empirical Distribution Function

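Definition 2.2. Let $X_1, \dots, X_n$ be an i.i.d. sample drawn from a population $F(x)$, and arrange the observations in increasing order as $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$. For any real number $x$, the function

\[
F_n(x) = \begin{cases} 0, & x < X_{(1)}, \\[2pt] \dfrac{k}{n}, & X_{(k)} \le x < X_{(k+1)}, \quad k = 1, 2, \dots, n-1, \\[6pt] 1, & X_{(n)} \le x \end{cases} \tag{2.5}
\]

is called the empirical distribution function.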

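It is easy to see that the empirical distribution function is a nondecreasing, right-continuous function and has the basic properties of a distribution function. It has discontinuities at $x = X_{(k)}$, $k = 1, 2, \dots, n$, and it is a step function with a jump of size $1/n$ at each discontinuity. Introducing the indicator function

\[
I_{[A]}(x) = \begin{cases} 1, & x \in A, \\ 0, & \text{otherwise}, \end{cases}
\]

$F_n(x)$ can be written as

\[
F_n(x) = \frac{1}{n} \sum_{i=1}^{n} I_{[X_i \le x]}. \tag{2.6}
\]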
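Representation (2.6) says that $F_n(x)$ is the fraction of observations not exceeding $x$. A minimal sketch follows; the function name `ecdf` and the data are hypothetical, and `bisect_right` on the sorted sample counts the observations with $X_i \le x$.

```python
from bisect import bisect_right

def ecdf(sample):
    """Return the function x -> F_n(x) = (1/n) * #{i : X_i <= x}."""
    ordered = sorted(sample)
    n = len(ordered)

    def F_n(x):
        # bisect_right gives the number of sorted values that are <= x
        return bisect_right(ordered, x) / n

    return F_n

F_n = ecdf([3, 1, 2, 2])
print(F_n(0))     # 0.0, below the smallest observation
print(F_n(1))     # 0.25
print(F_n(2))     # 0.75, the tied value 2 contributes a jump of 2/n
print(F_n(3))     # 1.0
```

With distinct observations every jump has size exactly $1/n$, as stated above; tied observations merge into a single larger jump.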

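By definition, $F_n(x)$ is a function that depends only on the sample $X_1, X_2, \dots, X_n$, so it is a statistic. Its possible values are $0, 1/n, 2/n, \dots, (n-1)/n, 1$. Writing $Y_i = I_{[X_i \le x]}$, $i = 1, 2, \dots, n$, we have $P(Y_i = 1) = F(x)$ and $P(Y_i = 0) = 1 - F(x)$, and $Y_1, Y_2, \dots, Y_n$ are i.i.d. $\sim b(1, F(x))$, so that $n F_n(x) = \sum_{i=1}^{n} Y_i \sim b(n, F(x))$. Therefore

\[
P\left( F_n(x) = k/n \right) = P\left( \sum_{i=1}^{n} Y_i = k \right) = \binom{n}{k} [F(x)]^k [1 - F(x)]^{n-k}.
\]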
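The binomial law of $n F_n(x)$ gives the exact distribution of $F_n(x)$ at a fixed point $x$. In the sketch below the function name and the values $n = 4$, $F(x) = 0.5$ are assumptions chosen for illustration.

```python
from math import comb

def ecdf_value_pmf(k, n, F_x):
    """P(F_n(x) = k/n) when n * F_n(x) ~ b(n, F(x)); F_x is the number F(x)."""
    return comb(n, k) * F_x ** k * (1 - F_x) ** (n - k)

n, F_x = 4, 0.5
for k in range(n + 1):
    print(k / n, ecdf_value_pmf(k, n, F_x))

# The probabilities over k = 0, ..., n add up to 1.
print(sum(ecdf_value_pmf(k, n, F_x) for k in range(n + 1)))
```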

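Using properties of the binomial distribution, $F_n(x)$ has the following large-sample properties:
(1) By the central limit theorem, for every $x$ with $0 < F(x) < 1$, as $n \to \infty$,

\[
\frac{\sqrt{n} \left( F_n(x) - F(x) \right)}{\sqrt{F(x)(1 - F(x))}} \xrightarrow{\ L\ } N(0, 1).
\]

(2) By Bernoulli's law of large numbers, as $n \to \infty$,

\[
F_n(x) \xrightarrow{\ P\ } F(x).
\]

(3) By Borel's strong law of large numbers, as $n \to \infty$,

\[
P\left( \lim_{n \to \infty} F_n(x) = F(x) \right) = 1.
\]

(4) Going further, we have the following Glivenko-Cantelli Theorem (1933):

Theorem 2.1. Let $F(x)$ be the distribution function of the r.v. $X$, let $X_1, \dots, X_n$ be a simple random sample from the population $F(x)$, and let $F_n(x)$ be its empirical distribution function. Write $D_n = \sup_{-\infty < x < \infty} |F_n(x) - F(x)|$. Then

\[
P\left( \lim_{n \to \infty} D_n = 0 \right) = 1.
\]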
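The uniform convergence in Theorem 2.1 can be illustrated by simulation. The sketch below assumes a Uniform(0, 1) population, for which $F(x) = x$ on $[0, 1]$; the function name, the seed and the sample sizes are arbitrary choices. Because $F_n$ is a step function, the supremum is attained at a jump point, so it suffices to compare $F$ with the value of $F_n$ just before and at each order statistic.

```python
import random

def sup_distance_uniform(sample):
    """D_n = sup_x |F_n(x) - F(x)| for the Uniform(0, 1) cdf F(x) = x."""
    ordered = sorted(sample)
    n = len(ordered)
    d = 0.0
    for i, x in enumerate(ordered, start=1):
        # F_n jumps from (i-1)/n to i/n at x, so check both sides of the jump.
        d = max(d, i / n - x, x - (i - 1) / n)
    return d

rng = random.Random(2012)   # arbitrary fixed seed for reproducibility
for n in (100, 1000, 10000):
    sample = [rng.random() for _ in range(n)]
    print(n, round(sup_distance_uniform(sample), 4))
```

The printed distances typically shrink as $n$ grows, in line with the theorem; a single simulated run illustrates the statement but of course does not prove it.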
