Introduction

Item analysis is a process which examines student responses to individual test
items (questions) in order to assess the quality of those items and of the test
as a whole. This test item analysis report is based on a 20-item
multiple-choice test administered to 25 students (see Appendix B). The quality
of individual items is assessed by comparing students' item responses to their
total test scores. Statistical analysis is used to summarize the performance of
the test as a whole (see Appendix A). Four types of graphical representation of
the statistical data are also shown in the report: a histogram, a frequency
polygon, a normal distribution curve and an Ogive.

The purpose of this report is to disseminate information based on the

descriptive statistics on 20 multiple-choice test items administered to 25

students.

3. Test analysis

Descriptive statistics describe the basic features of the data; that is, they
describe what the data show. They provide simple summaries about the sample and
the measures, and present quantitative descriptions in a manageable form. A set
of test scores was used to calculate the mean, mode, median and standard
deviation, and a normal distribution graph for a distribution with a mean of
65, a median of 65 and a mode of 65 is drawn (see Figure 1).

Statistic                     Value
Mean                          65.79
Mode                          65.00
Median                        65.00
Variance (STDEV squared)      479.57
Standard deviation (STDEV)    21.90

For these scores, the mean (65.79), the median (65.00) and the mode (65.00) are
approximately the same. If a distribution is truly normal (i.e., bell-shaped),
the mean, the median and the mode are all equal to each other.

Figure 1: Normal distribution curve (labels: Mean, Median, Mode; vertical axis
from 0.0 to 0.4)

The numbers on the x-axis represent the standard deviations from the mean; the
axis is marked from -4SD to +4SD around the central value of 65. The points
where the curvature changes are one standard deviation on either side of the
mean. Dark blue marks the region less than one standard deviation from the
mean. For the normal distribution, this region accounts for about 68% of the
set (dark blue), while two standard deviations from the mean (medium and dark
blue) account for about 95% and three standard deviations (light, medium, and
dark blue) account for about 99.7%. The curve is symmetric. This is a
heterogeneous distribution because the values are spread far from the mean
(the standard deviation is 21.90).
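The 68%, 95% and 99.7% figures can be checked numerically. The following is a minimal sketch in Python, not part of the original analysis; the function names `within_k_sd` and `score_band` are chosen here for illustration. It uses the standard normal relation P(|Z| < k) = erf(k / sqrt(2)) and lists the score bands implied by the reported mean (65.79) and standard deviation (21.90), assuming the scores were normally distributed.

```python
import math

MEAN = 65.79   # reported mean of the test scores
STDEV = 21.90  # reported standard deviation

def within_k_sd(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

def score_band(k, mean=MEAN, stdev=STDEV):
    """Scores lying exactly k standard deviations below and above the mean."""
    return (round(mean - k * stdev, 2), round(mean + k * stdev, 2))

for k in (1, 2, 3):
    share = round(within_k_sd(k) * 100, 1)
    low, high = score_band(k)
    print(f"within {k} SD: about {share}% of scores, band {low} to {high}")
```

With a standard deviation this large, the upper band for two standard deviations already lies above the maximum possible score of 100, which fits the description of the scores as widely spread.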

Highest score (H)      100
Lowest score (L)       15
Range (H - L)          85
Number of intervals    10
Size of interval       8.5
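The range and interval size above follow from simple arithmetic. As a small illustrative sketch in Python (the function name `class_interval_size` is an assumption made for this example), the range is the highest score minus the lowest score, and the interval size is the range divided by the number of intervals:

```python
def class_interval_size(highest, lowest, n_intervals):
    """Return (range, interval size) for a set of scores."""
    score_range = highest - lowest
    return score_range, score_range / n_intervals

score_range, size = class_interval_size(highest=100, lowest=15, n_intervals=10)
print(score_range, size)
```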

The cumulative frequency of a class interval is the sum of its own frequency
and the frequencies of the preceding intervals. This is shown in Table 3 for
the 25 students:

Table 3: Cumulative frequency distribution

Lower    Upper    Class       Mid-     Frequency   Cumulative
limit    limit    interval    value                frequency
15.00    24       15-24       19.5     1           1
25.00    34       25-34       29.5     2           3
35.00    44       35-44       39.5     0           3
45.00    54       45-54       49.5     4           7
55.00    64       55-64       59.5     3           10
65.00    74       65-74       69.5     6           16
75.00    84       75-84       79.5     1           17
85.00    94       85-94       89.5     6           23
95.00    104      95-104      99.5     2           25
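The mid-value and cumulative frequency columns of Table 3 can be derived from the class limits and frequencies. The sketch below is one way to do this in Python; the variable names are illustrative, and the point lists at the end are simply the coordinate pairs that a frequency polygon (mid-value, frequency) and an Ogive (upper limit, cumulative frequency) would use.

```python
from itertools import accumulate

# Class limits and frequencies as listed in Table 3
lower_limits = [15, 25, 35, 45, 55, 65, 75, 85, 95]
upper_limits = [24, 34, 44, 54, 64, 74, 84, 94, 104]
frequencies = [1, 2, 0, 4, 3, 6, 1, 6, 2]

# Mid-value of each class interval: average of its lower and upper limit
mid_values = [(lo + hi) / 2 for lo, hi in zip(lower_limits, upper_limits)]

# Cumulative frequency: running total of the frequencies
cumulative = list(accumulate(frequencies))

# Coordinate pairs for the two line graphs
polygon_points = list(zip(mid_values, frequencies))
ogive_points = list(zip(upper_limits, cumulative))

for row in zip(lower_limits, upper_limits, mid_values, frequencies, cumulative):
    print(*row)
```

The last cumulative frequency equals the number of students, which is a quick consistency check on the frequency column.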

Figure 2: Frequency histogram (x-axis: Interval, from 15-24 to 95-104; y-axis:
Frequency, from 0 to 7)

The histogram represents the frequency of each class interval in the form of a
rectangle. The class intervals are marked on the horizontal axis (X-axis) and
the frequency is marked on the vertical axis (Y-axis). The intervals are equal;
therefore the height of each rectangle is proportional to the corresponding
frequency.

Figure 3: Frequency polygon (x-axis: class mid-values, from 19.5 to 99.5;
y-axis: Frequency, from 0 to 7)

In Figure 3, the mid-values of the class intervals are plotted against the
corresponding frequencies and the points obtained are joined by straight lines.
The points plotted are (19.5, 1), (29.5, 2), (39.5, 0), (49.5, 4), (59.5, 3),
(69.5, 6), (79.5, 1), (89.5, 6) and (99.5, 2).

Figure 4: Ogive (x-axis: Upper Values, from 24 to 104; y-axis: Cumulative
Frequency, from 0 to 30)

An Ogive is characterized by an ‘S’-shaped curve (see Figure 4). The graph is
drawn by plotting points whose abscissae (X-axis) are the upper class limits
and whose ordinates (Y-axis) are the cumulative frequencies. The coordinates of
the points are (24, 1), (34, 3), (44, 3), (54, 7), (64, 10), (74, 16),
(84, 17), (94, 23) and (104, 25).

The reliability of a test refers to the extent to which the test is likely to
produce consistent scores. It theoretically ranges in value from zero (no
reliability) to 1.00 (perfect reliability). The KR20 (Kuder-Richardson Formula
20) measures test reliability in terms of inter-item consistency. A higher
value indicates a stronger relationship between the items on the test.

Table 4: KR20 calculation

k (number of items)    20
k - 1                  19
Total pq               3.83
Stdev                  21.90
(Stdev)^2              479.57
KR20                   1.04

From Table 4, the computed KR20 is 1.04. Since reliability theoretically cannot
exceed 1.00, a value at or slightly above this maximum is read here as
indicating that the test is highly reliable.
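The KR20 figure follows from the formula KR20 = (k / (k - 1)) * (1 - (sum of pq) / variance), where k is the number of items, pq is the product of the proportions correct and incorrect for each item, and the variance is that of the total scores. The sketch below (Python; the function name `kr20` is illustrative) reproduces the value in Table 4. It also shows, purely as an assumption that the report does not state, what the result would be if the variance of 479.57 were on a percentage scale and were rescaled to raw scores out of 20 (dividing by 5 squared), so that it matches the scale of the pq values.

```python
def kr20(k, sum_pq, variance):
    """Kuder-Richardson Formula 20 from summary values."""
    return (k / (k - 1)) * (1 - sum_pq / variance)

# Values as reported in Table 4
reported = kr20(k=20, sum_pq=3.83, variance=479.57)

# ASSUMPTION (not stated in the report): scores are percentages of 20 items,
# so one item is worth 5 points and the raw-score variance is 479.57 / 5**2.
rescaled = kr20(k=20, sum_pq=3.83, variance=479.57 / 5**2)

print(round(reported, 2), round(rescaled, 2))
```

Under that assumption the coefficient falls inside the theoretical range of 0 to 1.00 while still indicating a reliable test.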

4. Item analysis

Item analysis assesses the effectiveness of individual test items. Using the
questions and results from the test, the degree of difficulty of each question
and the corresponding discrimination index were calculated (see Table 5 and
Table 8).


Table 5: Difficulty index (p)

Item    #Correct    #Responses    Difficulty index (p)
q1      21          25            0.84
q2      22          25            0.88
q3      17          25            0.68
q4      12          25            0.48
q5      21          25            0.84
q6      17          25            0.68
q7      11          25            0.44
q8      12          23            0.52
q9      13          25            0.52
q10     8           24            0.33
q11     23          25            0.92
q12     19          25            0.76
q13     15          25            0.60
q14     21          25            0.84
q15     20          25            0.80
q16     22          24            0.92
q17     15          24            0.63
q18     8           24            0.33
q19     13          25            0.52
q20     16          25            0.64

The difficulty index (p) is the proportion of students who answered the item
correctly. It is a measure of how difficult the question was to answer. The
higher the difficulty index, the easier the question is. A value of 1.000 means
that all of the students chose the correct response, and the question may be
too easy. If the p-value is greater than 0.75, the item is too easy; if the
p-value is less than 0.25, the item is too difficult; values in between are
acceptable.
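As a brief illustration of how the difficulty index is obtained and interpreted, the Python sketch below divides the number of correct answers by the number of responses and applies the cut-offs 0.75 and 0.25. The function names are hypothetical, and treating values above 0.75 as "Too easy" follows the labels used in Table 6.

```python
def difficulty_index(n_correct, n_responses):
    """Proportion of responding students who answered the item correctly."""
    return n_correct / n_responses

def interpret_difficulty(p, easy_above=0.75, hard_below=0.25):
    """Label an item using the cut-offs 0.75 (too easy) and 0.25 (too difficult)."""
    if p > easy_above:
        return "Too easy"
    if p < hard_below:
        return "Too difficult"
    return "Fine"

# Three items from Table 5: (item, number correct, number of responses)
for item, correct, responses in [("q1", 21, 25), ("q4", 12, 25), ("q10", 8, 24)]:
    p = difficulty_index(correct, responses)
    print(item, round(p, 2), interpret_difficulty(p))
```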


Table 6: Interpretation of the difficulty level of questions

Item    p       Acceptability    Difficulty level
q1      0.84    Unacceptable     Too easy
q2      0.88    Unacceptable     Too easy
q3      0.68    Acceptable       Fine
q4      0.48    Acceptable       Fine
q5      0.84    Unacceptable     Too easy
q6      0.68    Acceptable       Fine
q7      0.44    Acceptable       Fine
q8      0.52    Acceptable       Fine
q9      0.52    Acceptable       Fine
q10     0.33    Acceptable       Fine
q11     0.92    Unacceptable     Too easy
q12     0.76    Unacceptable     Too easy
q13     0.60    Acceptable       Fine
q14     0.84    Unacceptable     Too easy
q15     0.80    Acceptable       Fine
q16     0.92    Unacceptable     Too easy
q17     0.63    Acceptable       Fine
q18     0.33    Acceptable       Fine
q19     0.52    Acceptable       Fine
q20     0.64    Acceptable       Fine

Table 6 shows that 35% of the questions (1, 2, 5, 11, 12, 14 and 16) are
unacceptable because they were too easy, while 65% (3, 4, 6, 7, 8, 9, 10, 13,
15, 17, 18, 19 and 20) are acceptable, meaning that their difficulty was fine.

The discrimination index was used to measure the ability of the items
(questions) to distinguish between the lower and the upper group of students
taking the test (see Table 7).

Table 7: Upper and lower groups

Upper group    15 students
Lower group    10 students

The discrimination index shows how well an item distinguishes between students
who have a high score on the test and those who get a low score on the test. It
is the difference between the percentage of correct responses in the upper
group and the percentage of correct responses in the lower group. Calculation
procedures have been used to compare item responses to total test scores using
the upper and lower groups of students.

Table 8: Discrimination index

Item    #U    #L    D
q1      15    6     0.60
q2      15    7     0.53
q3      14    3     0.79
q4      8     4     0.50
q5      15    6     0.60
q6      12    5     0.58
q7      9     2     0.78
q8      10    2     0.80
q9      10    3     0.70
q10     8     0     1.00
q11     14    9     0.36
q12     14    5     0.64
q13     12    3     0.75
q14     15    6     0.60
q15     14    6     0.57
q16     15    7     0.53
q17     12    3     0.75
q18     5     3     0.40
q19     12    1     0.92
q20     11    5     0.55

The discrimination indices are all positive; therefore the items'
discrimination ability is adequate. A positive value for this index means that
higher-scoring students tended to select the correct response more often.
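The definition given earlier (the difference between the proportions of correct responses in the upper and lower groups) can be sketched in Python as follows. The function names are illustrative, and the group sizes of 15 and 10 are taken from the upper and lower group counts in the report. The tabulated D values appear to be consistent with a different normalisation, (#U - #L) / #U, so the second function is included as an inference from the numbers and not as the stated method.

```python
def discrimination_index(upper_correct, lower_correct, upper_size=15, lower_size=10):
    """Difference between the proportions correct in the upper and lower groups."""
    return upper_correct / upper_size - lower_correct / lower_size

def tabulated_d(upper_correct, lower_correct):
    """ASSUMPTION inferred from the D column: (#U - #L) / #U."""
    return (upper_correct - lower_correct) / upper_correct

# First three items of the discrimination table: (#U, #L)
for upper, lower in [(15, 6), (15, 7), (14, 3)]:
    print(upper, lower,
          round(discrimination_index(upper, lower), 2),
          round(tabulated_d(upper, lower), 2))
```

Both versions give positive values for these items, so the conclusion that higher-scoring students chose the correct response more often does not depend on which normalisation is used here.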

5. Conclusion


The KR20 of 1.04 indicates that the test is reliable, which means that the
questions of the test tended to “pull together”: students who answered a given
question correctly were more likely to answer other questions correctly. If a
parallel test were developed by using similar items, the relative scores of
students would show little change.


7. Appendices


7.1 Appendix A

Question    #Correct    #Incorrect    Prop. correct (p)    Prop. incorrect (q)    pq
q1          21          4             0.84                 0.16                   0.13
q2          22          3             0.88                 0.12                   0.11
q3          17          8             0.68                 0.32                   0.22
q4          12          13            0.48                 0.52                   0.25
q5          21          4             0.84                 0.16                   0.13
q6          17          8             0.68                 0.32                   0.22
q7          11          14            0.44                 0.56                   0.25
q8          12          11            0.52                 0.48                   0.25
q9          13          12            0.52                 0.48                   0.25
q10         8           16            0.33                 0.67                   0.22
q11         23          2             0.92                 0.08                   0.07
q12         19          6             0.76                 0.24                   0.18
q13         15          10            0.60                 0.40                   0.24
q14         21          4             0.84                 0.16                   0.13
q15         20          5             0.80                 0.20                   0.16
q16         22          2             0.92                 0.08                   0.08
q17         15          9             0.63                 0.38                   0.23
q18         8           16            0.33                 0.67                   0.22
q19         13          12            0.52                 0.48                   0.25
q20         16          9             0.64                 0.36                   0.23
Total                                                                             3.83


7.2 Appendix B

Key C B D D B C D A C B A C B D A A C D B C

St No Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18 Q19 Q20

1 C B B A C D A D D A D A A A A C B D B

2 C B D D B D A A C B A C B D A A C D B C

3 C B D D B C D A C B A C B D A A C B D C

4 C B D B B C B A C B A C A D C A C B C C

5 C B D C B C B A C D A C B D A A A B B C

6 C A D D C C A D C D A C A D A A A B D C

7 B B A B B C B B D D A C B D C A A D D C

8 C B D B B C B D B C A C B D A A C A B A

9 C B D A B C D D B D A C B D A A C B D A

10 C B B A B C D C D C A B A D D A C D B C

11 C B D D B C D A C B A C B D A A C D B C

12 C B D D B C D D D A A C A D A A C B B D

13 C B D A B C D A C B A C B D A A A B B C

14 C B D A B C D A C B A C B D A A A B C

15 C B D D B B A A B D A C D A A C B B D D

16 C B D D B C D A C B A C B D A A C D B C

17 B B C C B A D D C A D B D A C A D

18 C B B D B A D D D D A C A D A A C B B C

19 D C A D B A B A D C C D A A D B B B A B

20 C B D D B C D A C A C D B D A A C D B C

21 C A D D C C A D C D A C A D A A A B D C

22 B B A B B C B B D D A C B D C A A D D C

23 C B D B B C B D B C A C B D A A C A B A

24 C B B A C D A D D A D A A A A C B D B

25 C B D D B D A A C B A C B D A A C D B C
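To illustrate how a total score can be derived from the responses above, the Python sketch below compares one student's row with the answer key position by position. The function name `score` is illustrative. Rows with fewer than 20 recorded answers are rejected, on the assumption that the position of the missing answer cannot be recovered from the table.

```python
KEY = "C B D D B C D A C B A C B D A A C D B C".split()

def score(responses, key=KEY):
    """Return (number correct, percentage) for one student's row of answers."""
    answers = responses.split()
    # ASSUMPTION: a row must contain one answer per question to be scored
    if len(answers) != len(key):
        raise ValueError("row does not contain one answer per question")
    correct = sum(a == k for a, k in zip(answers, key))
    return correct, 100 * correct / len(key)

student_2 = "C B D D B D A A C B A C B D A A C D B C"
student_11 = "C B D D B C D A C B A C B D A A C D B C"

print(score(student_2))
print(score(student_11))
```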


