
What is reliability?

Methods of Testing Reliability

 1. Test-retest
 2. Parallel Forms
 3. Split-Half
 4. Test of Internal Consistency (using the Kuder-Richardson and Cronbach’s Alpha methods)
 5. Inter-rater Reliability
1. Test-retest
• Administer the test once, then administer it again at a later time to the same group of examinees. The interval between the two administrations should be at least 30 minutes and not more than 6 months.
• It is applicable to tests that measure stable variables, such as aptitude and psychomotor measures (e.g., a typing test or tasks in physical education).
• The responses on the test should be more or less the same across the two points in time.
• Use the Pearson product-moment correlation (Pearson r) as the statistical analysis to measure reliability.
2. Parallel Forms
• The two versions of a test are called “forms”. The items in both forms must measure exactly the same skill.
• Administer one form at one time and the other form at another time to the same group of participants.
• It is applicable if there are two versions of a test. This is usually done when the test is used repeatedly with different groups, as in entrance examinations and licensure examinations, where different versions of the test are given to different groups of examinees.
• The correlation between the two forms can also be measured with the Pearson product-moment correlation (Pearson r).
3. Split-Half

• Administer a test to a group of examinees. The items are then split into halves, usually using the odd-even technique.
• This method of testing reliability is applicable when the test has a large number of items.
• Step 1: Compute Pearson r between the scores on the two halves to obtain the correlation coefficient.
• Step 2: Apply another formula, the Spearman-Brown coefficient, to estimate the internal consistency reliability of the whole test.
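The two steps above can be sketched in a few lines of Python. This is only an illustration: the response matrix is hypothetical (the slides give no item-level data), the function names are made up for this example, and the Spearman-Brown step assumes the usual two-halves form, 2r / (1 + r).

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r computed from raw-score sums."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

def odd_even_half_scores(responses):
    """Return (odd_totals, even_totals), one half-test score per examinee.

    Items are numbered from 1, so list index 0 is an odd-numbered item.
    """
    odd = [sum(row[0::2]) for row in responses]
    even = [sum(row[1::2]) for row in responses]
    return odd, even

def spearman_brown(r_half):
    """Step the half-test correlation up to a full-length estimate."""
    return 2 * r_half / (1 + r_half)

# Hypothetical data: 5 examinees x 6 items, 1 = correct, 0 = wrong.
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
]

odd, even = odd_even_half_scores(responses)
r_half = pearson_r(odd, even)          # Step 1: correlate the two halves
reliability = spearman_brown(r_half)   # Step 2: Spearman-Brown correction
print(round(r_half, 2), round(reliability, 2))
```

The correction is needed because each half is only half as long as the real test, so the raw correlation between halves tends to understate the reliability of the full test.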
4. Test of Internal Consistency
(Using the Kuder-Richardson and Cronbach’s Alpha methods)

• It involves determining whether the examinees’ scores on each item are consistent with one another.
• Cronbach’s alpha or the Kuder-Richardson formula is used to determine the internal consistency of the items.
• This technique works well when the assessment tool has a large number of items. It is also applicable to scales and inventories, such as those in Likert-scale form.
5. Inter-rater Reliability

• This procedure is used to determine the consistency of multiple raters when they use rating scales and rubrics to judge performance.
• Inter-rater reliability is applicable when the assessment requires the use of multiple raters.
• Kendall’s coefficient of concordance (Kendall’s W; called Kendall’s tau coefficient of concordance here) is used to determine whether the ratings provided by multiple raters agree with each other.
Note: Statistical analysis is required to determine the reliability of a measure. The very basis of the statistical analysis used to determine reliability is linear regression.
Statistical Analysis
1. Linear Regression

• It is demonstrated when two variables are measured. The paired values are plotted on a graph with X and Y axes, where the points tend to form a straight line.
• When a straight line is formed, we can say there is a correlation between the two sets of scores.
Two sets of scores on a test taken at two different times by the same participants:

Monday test (X)   Tuesday test (Y)
10                20
9                 15
6                 12
10                18
12                19
4                 8
5                 7
7                 10
16                17
8                 13

[Figure: scatterplot of the Monday test and Tuesday test scores]
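To make the "straight line" idea concrete, here is a minimal sketch that fits a least-squares line through the ten Monday/Tuesday score pairs. The function name fit_line is illustrative, and ordinary least squares is an assumption about which straight line is meant.

```python
monday = [10, 9, 6, 10, 12, 4, 5, 7, 16, 8]       # X scores
tuesday = [20, 15, 12, 18, 19, 8, 7, 10, 17, 13]  # Y scores

def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared X deviations.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(monday, tuesday)
print(f"Tuesday = {slope:.2f} * Monday + {intercept:.2f}")
```

A positive slope means that examinees who scored higher on Monday also tended to score higher on Tuesday, which is the pattern a reliable test should show.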
2. Computation of Pearson r Correlation

• The index of linear regression is called the correlation coefficient. It is used to measure how strong the relationship between two variables is.
• If the points in the scatterplot tend to fall along a straight line, the correlation is said to be strong. If the relationship is directly proportional, the correlation coefficient has a positive value; if the relationship is inverse, it has a negative value.
Computation of Pearson r Correlation

Formula

N = number of participants (pairs of scores)
∑XY = sum of all the products of X and Y
∑X = sum of all the X scores
∑Y = sum of all the Y scores
X² = square of each X score
Y² = square of each Y score
XY = product of each X and Y pair
∑X² = sum of all the squares of X
∑Y² = sum of all the squares of Y
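The formula itself is not legible here. Assuming the usual raw-score form of Pearson r, which needs exactly the quantities listed above, it would read:

```latex
r = \frac{N\sum XY - \left(\sum X\right)\left(\sum Y\right)}
         {\sqrt{\left[N\sum X^{2} - \left(\sum X\right)^{2}\right]
                \left[N\sum Y^{2} - \left(\sum Y\right)^{2}\right]}}
```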
Example: A 20-item spelling test of two-syllable words, given on Monday and again on Tuesday.

Participant   Monday test (X)   Tuesday test (Y)   X²          Y²           XY
1             10                20                 100         400          200
2             9                 15                 81          225          135
3             6                 12                 36          144          72
4             10                18                 100         324          180
5             12                19                 144         361          228
6             4                 8                  16          64           32
7             5                 7                  25          49           35
8             7                 10                 49          100          70
9             16                17                 256         289          272
10            8                 13                 64          169          104
Total         ∑X = 87           ∑Y = 139           ∑X² = 871   ∑Y² = 2125   ∑XY = 1328
Pearson r = 0.80
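As a check on the result, the computation can be reproduced with a short script. This is a sketch that assumes the standard raw-score formula for Pearson r; the function name pearson_r is illustrative.

```python
from math import sqrt

monday = [10, 9, 6, 10, 12, 4, 5, 7, 16, 8]       # X scores
tuesday = [20, 15, 12, 18, 19, 8, 7, 10, 17, 13]  # Y scores

def pearson_r(x, y):
    """Pearson r from the sums used in the worked table."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)               # 87 and 139
    sum_xy = sum(a * b for a, b in zip(x, y))   # 1328
    sum_x2 = sum(a * a for a in x)              # 871
    sum_y2 = sum(b * b for b in y)              # 2125
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

r = pearson_r(monday, tuesday)
print(round(r, 2))  # 0.8
```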
3. Difference between Positive and Negative Correlation
[Figure: example scatterplots of a positive correlation and a negative correlation]

Positive correlation: When the value of the coefficient is positive, the higher the scores in X, the higher the scores in Y.
Negative correlation: When the value of the coefficient is negative, the higher the scores in X, the lower the scores in Y.
4. Determining the Strength of a Correlation
The strength of the correlation also indicates the strength of the reliability of the test. This is indicated by the value of the correlation coefficient: the closer the value is to 1 or -1, the stronger the relationship.

• 0.80 – 1.00 Very strong relationship
• 0.60 – 0.79 Strong relationship
• 0.40 – 0.59 Substantial/marked relationship
• 0.20 – 0.39 Weak relationship
• 0.00 – 0.19 Negligible relationship

Pearson r = 0.80: very strong relationship
5. Determining the Significance of the Correlation

To determine whether the correlation between the two variables is free from certain errors, it is tested for significance.

• Critical value: the correlation coefficient value expected at a given probability level, against which the computed coefficient is compared.
• When the computed value is greater than the critical value, it means that the correlation obtained has a more than 95% chance of being real, so it is considered significant.

To determine the internal consistency of the test, we can use the statistical analysis called Cronbach’s alpha.
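The comparison against a critical value described above can be sketched as follows. The slides do not give a table of critical values, so this example makes two assumptions: a two-tailed test at the 0.05 level, and a critical t of 2.306 for 8 degrees of freedom (N = 10 pairs), taken from a standard t table. The function names are illustrative.

```python
from math import sqrt

# Assumption: two-tailed critical t at alpha = 0.05 for df = 8,
# taken from a standard t table (not from the slides).
T_CRITICAL_DF8 = 2.306

def critical_r(t_critical, n_pairs):
    """Convert a critical t value into the equivalent critical Pearson r."""
    df = n_pairs - 2
    return t_critical / sqrt(t_critical ** 2 + df)

def is_significant(r, n_pairs, t_critical):
    """True when the computed r exceeds the critical value in magnitude."""
    return abs(r) > critical_r(t_critical, n_pairs)

r_critical = critical_r(T_CRITICAL_DF8, 10)
print(round(r_critical, 3))
print(is_significant(0.80, 10, T_CRITICAL_DF8))
```

Under these assumptions the Pearson r of 0.80 from the Monday/Tuesday example exceeds the critical value, so it would be judged significant.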
EXAMPLE: Suppose that five students answered a checklist about their hygiene on a scale of 1 to 5, where the scores correspond to: 5 = always, 4 = often, 3 = sometimes, 2 = rarely, 1 = never. The teacher wanted to determine whether the items have internal consistency.
Cronbach’s alpha Example

Student                    Item 1   Item 2   Item 3   Item 4   Item 5   Total for each case (X)   Score – Mean   (Score – Mean)²
Student A                  5        5        4        4        1        19                        2.8            7.84
Student B                  3        4        3        3        2        15                        -1.2           1.44
Student C                  2        5        3        3        3        16                        -0.2           0.04
Student D                  1        4        2        3        3        13                        -3.2           10.24
Student E                  3        3        4        4        4        18                        1.8            3.24
Total for each item (∑X)   14       21       16       17       13       Mean of case totals = 16.2
∑X²                        48       91       54       59       39
SDᵢ²                       2.2      0.7      0.7      0.3      1.3      ∑SDᵢ² = 5.2

Cronbach’s alpha    Internal consistency
α ≥ 0.9             Excellent
0.9 > α ≥ 0.8       Good
0.8 > α ≥ 0.7       Acceptable
0.7 > α ≥ 0.6       Questionable
0.6 > α ≥ 0.5       Poor
0.5 > α             Unacceptable

• n = the number of scale items
• Cronbach’s α = 0.10
• A scale is considered to have internal consistency when Cronbach’s α is 0.60 and above.
• Conclusion: the internal consistency of the responses is 0.10, indicating low internal consistency.
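The hygiene checklist example can be reproduced with a short script. This sketch assumes the standard form of Cronbach’s alpha, α = (n / (n − 1)) × (1 − ∑SDᵢ² / SDₜ²), where SDᵢ² is the variance of each item and SDₜ² is the variance of the students’ total scores, both computed with N − 1 in the denominator (this matches the item variances shown in the table). The function names are illustrative.

```python
scores = [
    [5, 5, 4, 4, 1],  # Student A
    [3, 4, 3, 3, 2],  # Student B
    [2, 5, 3, 3, 3],  # Student C
    [1, 4, 2, 3, 3],  # Student D
    [3, 3, 4, 4, 4],  # Student E
]

def sample_variance(values):
    """Variance with N - 1 in the denominator."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)

def cronbach_alpha(rows):
    """Cronbach's alpha for a students-by-items score matrix."""
    n_items = len(rows[0])
    item_variances = [
        sample_variance([row[i] for row in rows]) for i in range(n_items)
    ]
    total_variance = sample_variance([sum(row) for row in rows])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)

alpha = cronbach_alpha(scores)
print(round(alpha, 4))
```

Under these assumptions the sum of the item variances is 5.2 and the variance of the totals is 5.7, giving an alpha just under 0.11. This is close to the 0.10 reported above; the small difference presumably comes from rounding or truncation in the hand computation. Either way the value falls in the unacceptable band.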
The Kendall’s tau Coefficient of Concordance
The consistency of ratings can also be obtained using the coefficient of concordance.

Example: A performance task demonstrated by five students was rated by three raters. The rubric scale runs from 4 to 1, where 4 is the highest and 1 is the lowest.

Demonstrator   Rater 1   Rater 2   Rater 3   Sum of ratings   D      D²
A              4         4         3         11               2.6    6.76
B              3         2         3         8                -0.4   0.16
C              3         4         4         11               2.6    6.76
D              3         3         2         8                -0.4   0.16
E              1         1         2         4                -4.4   19.36

Mean of the sums of ratings = 8.4
∑D² = 33.2

In this sample problem, the Kendall’s tau coefficient is 0.37, which indicates the degree of agreement of the three raters across the five demonstrators. The concordance among the three raters is only moderate because the value is far from 1.00.
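The formula behind the 0.37 is not legible here. The sketch below assumes the usual coefficient of concordance formula, W = 12∑D² / (m²N(N² − 1)), with m raters and N demonstrators, applied to the sums of the raw ratings as in the table; that assumption reproduces the reported value. The function name is illustrative.

```python
ratings = {
    "A": [4, 4, 3],
    "B": [3, 2, 3],
    "C": [3, 4, 4],
    "D": [3, 3, 2],
    "E": [1, 1, 2],
}

def concordance(ratings_by_subject):
    """Coefficient of concordance from each subject's sum of ratings."""
    sums = [sum(r) for r in ratings_by_subject.values()]      # 11, 8, 11, 8, 4
    n_subjects = len(sums)                                    # N = 5
    n_raters = len(next(iter(ratings_by_subject.values())))   # m = 3
    mean_sum = sum(sums) / n_subjects                         # 8.4
    sum_d2 = sum((s - mean_sum) ** 2 for s in sums)           # about 33.2
    return 12 * sum_d2 / (n_raters ** 2 * n_subjects * (n_subjects ** 2 - 1))

w = concordance(ratings)
print(round(w, 2))  # 0.37
```

Textbook treatments of this coefficient normally convert each rater’s scores to ranks first; the version here follows the table and works on the raw rubric scores.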
Thank You
