

Abstract: Comparing the Efficiency of Maximum Likelihood Estimation and Bayesian Estimation in Estimating the Ability Parameter Using the Three-Parameter Logistic Model.
The aim of this study was to compare the efficiency of maximum likelihood
estimation and Bayesian estimation in estimating the ability parameter under the
three-parameter logistic model. A mental ability test developed in a previous
study was used, and the data were analyzed with the Bilog-MG software.
The findings were as follows:
1. For samples truncated at high ability and samples truncated at low ability,
the Bayesian method gave more accurate ability estimates than maximum
likelihood estimation, regardless of the number of items.
2. The Bayesian method gave more accurate ability estimates than maximum
likelihood estimation when the mean item difficulty matched the mean
ability of the respondents.
3. Maximum likelihood estimation gave more accurate ability estimates than
the Bayesian method when the whole item pool was used, whereas the
Bayesian method became more efficient when a shorter test was employed.
4. When the mean item difficulty and the mean respondent ability differed
significantly, maximum likelihood estimation became more efficient than the
Bayesian method, provided that the respondents' ability matched the item
difficulty.
The study recommends further research on the efficiency of maximum
likelihood estimation and Bayesian estimation based on other designs.
Keywords: Maximum Likelihood Estimation, Bayesian Estimation, Three-Parameter
Logistic Model.

2. Bayesian estimation (prior information).
3. Joint maximum likelihood estimation.
4. Conditional maximum likelihood estimation (likelihood).
5. Marginal maximum likelihood estimation (marginal likelihood function) (Hambleton & Swaminathan, 1985).

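(Lord, 1980): relative efficiency (RE) and the item information function (IIF).

The information function is:

\[
I(\theta) = \sum_{i=1}^{n} \frac{\left[P_i'(\theta)\right]^2}{P_i(\theta)\,Q_i(\theta)}
\]

where:
I(\theta): the information at ability level \theta.
n: the number of items.
P_i(\theta): the probability of a correct response to item i at ability level \theta, with Q_i(\theta) = 1 - P_i(\theta).
P_i'(\theta): the first derivative of P_i(\theta) with respect to \theta.

The relative efficiency of A compared with B is:

\[
RE(\theta) = \frac{I_A(\theta)}{I_B(\theta)}
\]

If RE(\theta) < 1, B is more efficient than A; if RE(\theta) > 1, A is more efficient than B; if RE(\theta) = 1, A and B are equally efficient. In terms of standard errors:

\[
RE(\theta) = \frac{SE_B(\theta)^2}{SE_A(\theta)^2}
\]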





(Huang, Lohss, Lin & Shin, 2001)
(Bilog, Bilog-MG, PLC)
(root mean squared differences, RMSD)

BILOG.
Bilog-MG 3.0 (Scientific Software International, Inc.):
(Expected A Posteriori, EAP)
(Maximum A Posteriori, MAP)


) (EAP
.

(Hambleton & Swaminathan, 1985)

.
(Stocking, 1990)

(Hambleton & Cook, 1983)

(Lord, 1980)

(Ree & Jensen, 1983).
( )2004



.
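(Maximum Likelihood Estimation, MLE).

(Likelihood function) For examinee a (a = 1, ..., N) with response vector u_a = (u_{a1}, u_{a2}, ..., u_{an}) on n items, the log-likelihood is:

\[
\ln L_a(\theta) = \sum_{j=1}^{n} \left[ u_{aj} \ln P_j(\theta) + (1 - u_{aj}) \ln\left(1 - P_j(\theta)\right) \right]
\]

where P_j(\theta) is the probability of a correct response to item j. The estimate is the value of \theta that satisfies:

\[
\frac{\partial \ln L_a(\theta)}{\partial \theta} = \sum_{j=1}^{n} \frac{u_{aj} - P_j(\theta)}{P_j(\theta)\left(1 - P_j(\theta)\right)}\, P_j'(\theta) = 0
\]

The equation is solved iteratively:

\[
\theta_{t+1} = \theta_t + \frac{\left.\partial \ln L_a(\theta)/\partial \theta\,\right|_{\theta_t}}{I(\theta_t)}
\]

and the standard error of the estimate is:

\[
S.E.(\hat{\theta}) = \frac{1}{\sqrt{I(\hat{\theta})}}
\]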
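The iterative solution of the likelihood equation can be sketched as follows. This is a minimal illustration under stated assumptions, not the Bilog-MG routine: the function names, the step limit of one logit per iteration, the starting value and the three item-parameter triples are all hypothetical choices made for the example.

```python
import math

D = 1.7  # scaling constant of the logistic model


def p3pl(theta, a, b, c):
    """Probability of a correct response under the three-parameter logistic model."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def score_and_information(theta, responses, items):
    """First derivative of the log-likelihood and the test information at theta."""
    score, info = 0.0, 0.0
    for u, (a, b, c) in zip(responses, items):
        p = p3pl(theta, a, b, c)
        q = 1.0 - p
        dp = D * a * (p - c) * q / (1.0 - c)  # P'(theta)
        score += (u - p) * dp / (p * q)
        info += dp * dp / (p * q)
    return score, info


def mle_theta(responses, items, theta=0.0, tol=1e-8, max_iter=100):
    """Maximum likelihood ability estimate and its standard error."""
    for _ in range(max_iter):
        score, info = score_and_information(theta, responses, items)
        # Limit the step size (an assumption of this sketch) to keep the iteration stable.
        step = max(-1.0, min(1.0, score / info))
        theta += step
        if abs(step) < tol:
            break
    _, info = score_and_information(theta, responses, items)
    return theta, 1.0 / math.sqrt(info)


# Hypothetical (a, b, c) triples and a response pattern, used only for illustration.
items = [(1.0, -1.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
theta_hat, se = mle_theta([1, 1, 0], items)
print(round(theta_hat, 3), round(se, 3))
```

For a response pattern that is all correct or all incorrect the likelihood has no finite maximum, so this sketch simply keeps stepping until `max_iter` is reached; that case is one reason a prior distribution on ability can help.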



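x_1, x_2, ..., x_n

(Prior distribution)

(Posterior distribution)
(Bayesian Statistics).

Assuming \theta_a \sim N(0, 1), a = 1, ..., N, Bayes' theorem gives:

\[
f(\theta_a \mid u) = \frac{f(u \mid \theta_a)\, f(\theta_a)}{f(u)}
\]

Since f(u) does not depend on \theta_a, and f(u \mid \theta_a) is the likelihood function L(u \mid \theta_a):

\[
f(\theta_a \mid u) \propto L(u \mid \theta_a)\, f(\theta_a)
\]

For N examinees and n items:

\[
L(u \mid \theta) = L(u_1, u_2, \ldots, u_N \mid \theta_1, \ldots, \theta_N) = \prod_{a=1}^{N} L(u_a \mid \theta_a) = \prod_{a=1}^{N} \prod_{i=1}^{n} P_{ia}^{u_{ia}}\, Q_{ia}^{1 - u_{ia}}
\]

where P_{ia} is the probability that examinee a answers item i correctly and Q_{ia} = 1 - P_{ia}. The posterior distribution is therefore:

\[
f(\theta_1, \theta_2, \ldots, \theta_N \mid u) \propto L(u \mid \theta_1, \ldots, \theta_N)\, \exp\left(-\tfrac{1}{2} \sum_{a=1}^{N} \theta_a^2\right)
\]

Taking logarithms:

\[
\ln f(\theta_1, \theta_2, \ldots, \theta_N \mid u) = \text{constant} + \ln L(u \mid \theta_1, \ldots, \theta_N) - \tfrac{1}{2} \sum_{a=1}^{N} \theta_a^2
\]

The estimates are obtained by solving:

\[
\frac{\partial}{\partial \theta_a} \ln f(\theta_1, \theta_2, \ldots, \theta_N \mid u) = 0, \quad a = 1, \ldots, N
\]

that is:

\[
\frac{\partial}{\partial \theta_a} \ln f(\theta_a \mid u) = \sum_{i=1}^{n} k_i u_{ia} - \sum_{i=1}^{n} k_i P_{ia} - \theta_a = 0
\]

\[
\sum_{i=1}^{n} k_i (u_{ia} - P_{ia}) = \theta_a
\]

where k_i depends on the model:

\[
k_i = D \quad \text{(one-parameter model)}, \qquad
k_i = D a_i \quad \text{(two-parameter model)}, \qquad
k_i = \frac{D a_i (P_{ia} - c_i)}{P_{ia} (1 - c_i)} \quad \text{(three-parameter model)}
\]

where:
D = 1.7
a_i = the discrimination parameter of item i
b_i = the difficulty parameter of item i
P_{ia} = P_i(\theta_a)

so that:

\[
\theta_a = \sum_{i=1}^{n} k_i (u_{ia} - P_{ia})
\]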
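The modal equation, the sum of k_i (u_ia - P_ia) set equal to theta_a, can be solved numerically. The sketch below is a minimal illustration, not the Bilog-MG MAP routine: it assumes a normal prior with mean 0 (standard deviation 1 by default, as in the derivation), uses the expected information in the update step, and the function names, the `prior_sd` argument, the step limit and the item parameters are hypothetical.

```python
import math

D = 1.7  # scaling constant of the logistic model


def p3pl(theta, a, b, c):
    """Probability of a correct response under the three-parameter logistic model."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def posterior_gradient(theta, responses, items, prior_precision):
    """Gradient of the log-posterior and the information including the prior."""
    grad = -theta * prior_precision  # contribution of the N(0, sd^2) prior
    info = prior_precision
    for u, (a, b, c) in zip(responses, items):
        p = p3pl(theta, a, b, c)
        k = D * a * (p - c) / (p * (1.0 - c))  # k_i for the three-parameter model
        grad += k * (u - p)
        info += k * k * p * (1.0 - p)  # equals P'^2 / (P * Q)
    return grad, info


def map_theta(responses, items, prior_sd=1.0, theta=0.0, tol=1e-8, max_iter=100):
    """Bayesian modal ability estimate and an approximate posterior standard deviation."""
    prior_precision = 1.0 / prior_sd ** 2
    for _ in range(max_iter):
        grad, info = posterior_gradient(theta, responses, items, prior_precision)
        # Limit the step size (an assumption of this sketch) to keep the iteration stable.
        step = max(-1.0, min(1.0, grad / info))
        theta += step
        if abs(step) < tol:
            break
    _, info = posterior_gradient(theta, responses, items, prior_precision)
    return theta, 1.0 / math.sqrt(info)


# Hypothetical (a, b, c) triples, used only for illustration.
items = [(1.0, -1.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
print(map_theta([1, 1, 0], items))  # pulled toward the prior mean of 0
print(map_theta([1, 1, 1], items))  # finite even for an all-correct pattern
```

The prior term pulls every estimate toward 0 and keeps the estimate finite for all-correct or all-incorrect patterns, which matters most for short tests and extreme ability levels.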
i 1

:


:
.1

.2

.3

:



) (Uniform

.






.
:
-1
.
-2
.
:

(Pelton, 2002)

.



(999) (33)

.
(Farish, 1984)
(2000)



.

(Garre & Vermunt, 2006)



.
(Ban, Hanson, Wang, Qing & Harris, 2000)
(Computerized Adaptive Testing, CAT):
(One Estimation Method Cycles)
(Stocking's Method) A B
Bilog (300, 1000, 3000)

(Standard Error of Estimation, SE)

B
(Anchor Items).
(Wang & Vispoel, 1998)
:
(Owen's Method), (EAP), (MAP):

.
(Linden, 1998)

(true posterior) 300

(mean-squared error).
(Kim, 2001)



.
:
.

( )2004 71
:

)0.92(
.
( )2004
1000
.
:
1. Bilog-MG: items 2, 3, 7, 10, 11, 18, 23, 42, 46, 55, 56, 59, 60, 63, 65, 67 and 69 were excluded, leaving 54 items.
.2
:

.
:
-


500
.
-


500
.
-


.
.
30 items: 1, 6, 8, 9, 14, 19, 22, 24, 25, 27, 30, 32, 34, 37, 39, 41, 43, 44, 45, 47, 48, 50, 52, 57, 58, 61, 62, 64, 68, 70.
20 items: 1, 6, 9, 12, 14, 15, 20, 24, 31, 40, 41, 43, 48, 49, 57, 58, 64, 66, 68, 70.
-

20 .

35 8 14 29 39 48 28 33 38 57 15 30 19 31 24 16 :
.61 51 12 58
20
52 :
.37 64 5 4 21 25 6 45 43 34 9 54 49 52 70 53 47 36 1

:
.3 ( )2

:
-1.98, -1.62, -0.04, 0.71, 1.08, 1.82, 2.09.
.4

(.)3
5. Bilog-MG: (Expected A Posteriori)

6. (Maximum A Posteriori)

.


( )2004
.


1.297

1.297
. 1
.
.1

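θ        SEML     SEB      REB/ML
-1.98    0.6554   0.5490   1.425
-1.62    0.6637   0.5530   1.440
-0.04    0.6153   0.5238   1.380
0.71     0.4336   0.3854   1.266
1.08     0.2446   0.2583   0.897
1.82     0.2786   0.2384   1.366
2.09     0.3641   0.3187   1.305
Mean     0.4650   0.4038   1.297

where:
SEML: standard error of the maximum likelihood ability estimate.
SEB: standard error of the Bayesian ability estimate.
REB/ML: relative efficiency of the Bayesian method relative to maximum likelihood.

4.322
0.654 -0.04 = 0.4366
0.71 =
. 2
.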
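The REB/ML column of the table with mean 1.297 can be recomputed from its two standard-error columns. The sketch below assumes that REB/ML is the squared ratio of SEML to SEB and that the last row is the arithmetic mean over the seven ability levels; both are readings of the table rather than statements recovered from the text, and the variable names are chosen for the example.

```python
# Values copied from the table with mean relative efficiency 1.297.
thetas = [-1.98, -1.62, -0.04, 0.71, 1.08, 1.82, 2.09]
se_ml = [0.6554, 0.6637, 0.6153, 0.4336, 0.2446, 0.2786, 0.3641]
se_b = [0.5490, 0.5530, 0.5238, 0.3854, 0.2583, 0.2384, 0.3187]
reported_re = [1.425, 1.440, 1.380, 1.266, 0.897, 1.366, 1.305]


def relative_efficiency(se_ml_value, se_b_value):
    """RE of the Bayesian estimate relative to ML: (SE_ML / SE_B) squared."""
    return (se_ml_value / se_b_value) ** 2


re_values = [relative_efficiency(m, b) for m, b in zip(se_ml, se_b)]
mean_re = sum(re_values) / len(re_values)

for theta, re_value, reported in zip(thetas, re_values, reported_re):
    print(f"theta={theta:5.2f}  recomputed={re_value:.3f}  reported={reported:.3f}")
print(f"mean recomputed={mean_re:.3f}")
```

Under this reading, a value above 1 means the Bayesian estimate has the smaller standard error at that ability level, and a value below 1 means the maximum likelihood estimate does.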

θ        SEML     SEB      REB/ML
-1.98    2.6128   0.6406   6.636
-1.62    2.0061   0.6791   8.914
-0.04    0.4138   0.5115   0.654
0.71     0.3946   0.4366   0.871
1.08     0.4301   0.4024   1.142
1.82     0.4445   0.4412   1.015
2.09     0.4337   0.4184   1.074
Mean     0.9622   0.5032   4.322

.2


0.821
0.821

1
3
.

θ        SEML     SEB      REB/ML
-1.98    0.8515   0.5719   2.2168
-1.62    0.6157   0.4862   1.604
-0.04    0.2849   0.4416   0.416
0.71     0.2671   0.4096   0.425
1.08     0.2185   0.4179   0.273
1.82     0.1953   0.4098   0.227
2.09     0.2615   0.3430   0.581
Mean     0.3849   0.4400   0.82

.3


54 1000

0.761
0.761 4
.

θ        SEML     SEB      REB/ML
-1.98    0.8526   0.5804   2.158
-1.62    0.6488   0.5250   1.527
-0.04    0.2485   0.4445   0.3130
0.71     0.2354   0.3705   0.4037
1.08     0.2138   0.3996   0.2860
1.82     0.1510   0.4176   0.1310
2.09     0.2397   0.3359   0.509
Mean     0.3700   0.4391   0.761

.4

30 54

30
1000
1.505
1.505 .
5 .


θ        SEML     SEB      REB/ML
-1.98    1.795    0.5800   4.1356
-1.62    0.9904   0.5861   2.855
-0.04    0.3941   0.4585   0.73881
0.71     0.4029   0.4407   0.8360
1.08     0.3627   0.3980   0.830
1.82     0.2919   0.4541   0.4130
2.09     0.3657   0.4287   0.728
Mean     0.5696   0.4780   1.505

. 5 33

20
20

2.316 6 20.

θ        SEML     SEB      REB/ML
-1.98    1.5905   0.5900   7.267
-1.62    1.3045   0.6098   4.576
-0.04    0.5183   0.5268   0.9680
0.71     0.5104   0.4776   1.142
1.08     0.4142   0.7172   0.986
1.82     0.2939   0.4601   0.408
2.09     0.3668   0.3939   0.867
Mean     0.7141   0.4965   2.316

.6 23



20
13.304
.

θ        SEML     SEB      REB/ML
-1.98    5.3870   0.7154   56.7017
-1.62    4.0784   0.7243   31.706
-0.04    1.0912   0.6890   2.508
0.71     0.6034   0.6852   0.775
1.08     0.4049   0.5467   0.548
1.82     0.2783   0.4462   0.389
2.09     0.2858   0.4045   0.499
Mean     1.7327   0.6016   13.304

.7

20

2.680
8 .

θ        SEML     SEB      REB/ML
-1.98    1.4319   0.5627   6.475
-1.62    0.9188   0.4466   4.233
-0.04    0.3070   0.4451   0.476
0.71     0.4162   0.3980   1.094
1.08     0.4816   0.4942   0.950
1.82     0.7728   0.5010   2.380
2.09     1.0410   0.5867   3.148
Mean     0.7670   0.4906   2.680

.8




4.32 1.297



(Garre & Vermunt, 2006).


( )54





30
1.505
20 2.316
(Linden, 1998)



(Wang & Vispoel, 1998).




2.680 13.304




-0.31 0.21
1.63

.

.


.


Ban, J., Hanson, B., Wang, T., Qing, Y. & Harris, D. (2000). A Comparative
Study of Online Pretest Item Calibration/Scaling Methods in CAT.
Paper presented at the Annual Meeting of the American
Educational Research Association, New Orleans. [On-line]. Available:
http://eric.ed.gov.
Farish, S. J. (1984). Investigating Item Stability: An Empirical
Investigation into the Variability of Item Statistics Under Conditions of
Varying Sample Design and Sample Size. Occasional Paper No. 18.
Australia.
Garre, G. & Vermunt, K. (2006). Avoiding Boundary Estimation in Latent Class
Analysis by Bayesian Posterior Mode Estimation. Behaviormetrika,
Vol. 33, No. 1.
Hambleton, R. K. & Cook, L. L. (1983). Robustness of Item Response Models and
Effects of Test Length and Sample Size in the Precision of Ability
Estimates. In D. J. Weiss (Ed.), New Horizons in Testing
(pp. 31-49). New York.
Hambleton, R. K. & Swaminathan, H. (1985). Item Response Theory: Principles
and Applications. Boston: Kluwer-Nijhoff Publishing.
Huang, C. Y., Lohss, W. E., Lin, C. & Shin, D. (2001). Item Calibration of
Licensure Test with Multiple Specialty Components. Submitted to Division D1:
Educational Measurements, Psychometrics and Assessment. Enabled
Tiger, a web-based Manuscripts Processing System. Michigan State
University.
Kim, S. (2001). An Evaluation of a Markov Chain Monte Carlo Method for the
Rasch Model. Applied Psychological Measurement, Vol. 25, No. 2,
163-176.


Item    a        b        c
1       0.709    -1.150   0.207
4       1.080    0.727    0.379
5       1.771    0.624    0.373
6       1.118    0.192    0.290
8       0.901    1.186    0.193
9       0.873    -0.149   0.283
12      1.297    1.020    0.228
13      0.606    1.017    0.301
14      1.275    1.513    0.268
15      3.385    1.468    0.171
16      0.940    2.107    0.105
17      1.119    0.969    0.255
19      0.836    1.702    0.073
20      0.801    0.956    0.227
21      0.835    0.351    0.290
22      0.644    0.510    0.235
24      1.004    2.286    0.126
25      1.104    0.594    0.389
26      0.707    1.221    0.318
27      0.711    0.900    0.190
28      0.714    2.048    0.150
29      0.692    1.214    0.139
30      3.674    1.427    0.136
31      2.887    1.751    0.141
32      0.527    1.268    0.250
33      4.908    1.471    0.193
34      0.554    0.130    0.351
35      0.756    1.406    0.211
36      1.126    -0.746   0.287
37      1.020    0.002    0.089
38      2.433    1.564    0.184
39      0.637    2.491    0.225
40      0.696    1.661    0.303
41      0.978    0.473    0.201
43      1.353    -0.240   0.166
44      1.362    1.244    0.324
45      1.030    -0.302   0.147
47      1.655    -0.639   0.252
48      0.625    2.029    0.169
49      1.476    -0.415   0.189
50      0.803    0.634    0.084
51      1.624    -0.936   0.163
52      0.964    -0.547   0.230
53      1.573    -0.744   0.132
54      1.865    -0.496   0.091
57      2.566    1.447    0.164
58      0.724    1.744    0.091
61      2.838    1.370    0.164
62      0.943    0.817    0.269
64      0.599    0.484    0.352
66      1.037    1.143    0.311
68      1.037    0.980    0.250
70      0.697    -0.906   0.365
71      0.911    0.514    0.372
