Qualitative
includes the consideration of content validity (content and form of items), as well as the evaluation of items in terms of effective item-writing procedures.
Quantitative
includes principally the measurement of item difficulty and item discrimination.
Quality test?
Criteria?
Item Analysis
01 Validity
✓Valid
✓The test measures what it is supposed to measure.
✓Measures according to what is to be measured
✓Use of the right measuring tools
✓Internal & external validity
02 Reliability
✓Reliability/consistency
✓Level of constancy or consistency of the measurement results of a test
✓Consistency relates to the error rate of the results of a test in the form of scores
03 Distinguishing Power
✓A measure of the extent to which an item can distinguish students who have mastered a competence from students who have not, based on certain criteria
04 Level of Difficulty
✓A measure of the degree of difficulty of a question. If a question has a balanced level of difficulty, it can be said that the question is good
05 Distractor
✓Distractor answer choices
Validity
Content Validity
Expert Judgment
Logic
Construct Validity
Expert judgment
Factor Analysis
Face Validity
How to measure:
Matching the test material with the syllabus and the test blueprint (grid), conducting discussions with fellow educators, and re-examining the substance of the concept to be measured
Method:
Aiken's V technique
Lawshe technique
Maslach technique
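The slide names the Aiken's V technique without showing how it is computed. A minimal sketch, assuming the commonly cited form V = Σ(r − lo) / (n(c − 1)), where r is each expert's rating, lo is the lowest rating category, c is the number of categories, and n is the number of experts. The function name and the ratings are hypothetical and only illustrate the arithmetic.

```python
def aiken_v(ratings, lo, hi):
    """Aiken's V for one item.

    ratings: expert ratings on an integer scale from lo to hi.
    With consecutive integer categories, hi - lo equals c - 1.
    """
    n = len(ratings)
    if n == 0 or hi <= lo:
        raise ValueError("need at least one rating and hi > lo")
    s = sum(r - lo for r in ratings)  # distance of each rating from the lowest category
    return s / (n * (hi - lo))


# Hypothetical ratings from five experts on a 1-5 relevance scale.
example_ratings = [4, 5, 4, 5, 3]
print(round(aiken_v(example_ratings, lo=1, hi=5), 2))
```

V ranges from 0 (every expert chose the lowest category) to 1 (every expert chose the highest).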
Empirical Validity
01 This validity looks for the relationship between test scores and a certain criterion, which is a benchmark outside the test in question.
This empirical validity is also known as criterion validity or statistical validity.
The types are predictive validity, concurrent validity, and similar validity.
04 Similar validity applies when the standard test used as the criterion is of the same kind.
Example: math content with math content
05 Empirical validity testing:
1. Product moment correlation with deviation scores
2. Product moment correlation with raw scores
3. Rank-difference correlation
4. Scatter diagram technique
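The rank-difference correlation in the list above is usually computed with Spearman's formula ρ = 1 − 6Σd² / (n(n² − 1)), where d is the difference between the two ranks of each test taker. The sketch below assumes that formula and assumes there are no tied scores. The function names and the data are hypothetical.

```python
def ranks(values):
    """Rank 1 goes to the largest value; assumes no ties."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]


def rank_difference_correlation(x, y):
    """Spearman's rho from rank differences (no tied scores allowed)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    if len(set(x)) != len(x) or len(set(y)) != len(y):
        raise ValueError("this simple formula assumes no tied scores")
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical test scores and criterion scores for five students.
test_scores = [80, 65, 90, 70, 55]
criterion = [75, 72, 85, 60, 58]
print(rank_difference_correlation(test_scores, criterion))
```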
Product Moment Correlation with Deviation Scores
Product Moment Correlation with Raw Scores
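The formulas for these two slides did not survive extraction. A minimal sketch, assuming the standard Pearson forms: with deviation scores x = X − mean(X) and y = Y − mean(Y), r = Σxy / √(Σx² Σy²); with raw scores, r = (NΣXY − ΣXΣY) / √((NΣX² − (ΣX)²)(NΣY² − (ΣY)²)). Both give the same coefficient. The function names and the data are hypothetical.

```python
import math


def pearson_deviation(x, y):
    """Product moment correlation from deviation scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator  # fails if either variable has zero variance


def pearson_raw(x, y):
    """Product moment correlation from raw scores."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical test scores (X) and criterion scores (Y).
X = [60, 70, 80, 90, 100]
Y = [65, 68, 82, 85, 95]
print(pearson_deviation(X, Y), pearson_raw(X, Y))
```

The raw-score form avoids computing the means first, which is convenient in a spreadsheet. The deviation form makes the link to variance and covariance easier to see.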
Interpretation of r value
The interpretation of the correlation coefficient (r) is conventionally given by Guilford (1956) as follows:
r Coefficient    Criteria
0.80 – 1.00      Very good
0.60 – 0.80      Good
0.40 – 0.60      Sufficient
0.20 – 0.40      Low
0.00 – 0.20      Very low
The value of r can also be interpreted by comparing the calculated r with the table r. If calculated r > table r, the instrument is considered valid.
Construct Validity
➢A construct is a concept that can be observed and measured.
➢Construct validity is also known as logical validity.
➢Construct validity relates to the extent to which the question/test can measure and observe a psychological function, which is a description of behavior.
Hoyt
Kuder-Richardson (KR-20)
Cronbach's alpha (subjective test)
Reliability Testing by Excel
Internal Consistency Reliability
A single test is a test consisting of one set of items given to one group of subjects on one occasion, so that the results yield only one set of data.
Split-Half Technique
The test is divided into two relatively equal parts (with the same number of questions), so that each test taker has two scores: the score on the first part (first-half or odd-numbered questions) and the score on the second part (second-half or even-numbered questions).
The half-test reliability coefficient is denoted r1/2 1/2 and can be calculated using Pearson's raw-score correlation formula.
Non Split-Half Technique
The split-half technique for computing the reliability coefficient has two weaknesses: (1) the number of items must be even, and (2) the split can be done in different ways, producing different values, as seen in examples c.1 and c.2. The non split-half technique instead uses the Kuder-Richardson formulas (KR-20 and KR-21).
Reliability for subjective test
Non split-half technique using the Cronbach's alpha formula
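The split-half procedure described above can be sketched as follows: score the odd-numbered and even-numbered items separately, then correlate the two half scores with Pearson's raw-score formula. The final Spearman-Brown step, r11 = 2r / (1 + r), is not shown on the slide; it is the usual way to estimate full-test reliability from the half-test correlation and is included here as an assumption. The function names and the 0/1 scores are hypothetical.

```python
import math


def pearson_raw(x, y):
    """Pearson correlation from raw scores."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


def split_half_reliability(item_scores):
    """item_scores: one row per student, one column per item.

    Returns (half-test correlation, Spearman-Brown full-test estimate).
    """
    odd_half = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_raw(odd_half, even_half)
    r_full = 2 * r_half / (1 + r_half)  # Spearman-Brown correction (assumed step)
    return r_half, r_full


# Hypothetical data: 6 students, 4 items scored 1 (correct) or 0 (incorrect).
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
r_half, r_full = split_half_reliability(scores)
print(round(r_half, 3), round(r_full, 3))
```

Splitting by first half and second half instead of odd and even items generally gives a different coefficient, which is the second weakness noted above.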
reliability testing instruments
✓Reliability coefficient 0 – 1
The categories of reliability coefficients (Guilford, 1956: 145) are as
follows:
0.80 < r11 ≤ 1.00 very high reliability
0.60 < r11 ≤ 0.80 high reliability
0.40 < r11 ≤ 0.60 moderate reliability
0.20 < r11 ≤ 0.40 low reliability
-1.00 ≤ r11 ≤ 0.20 very low reliability (unreliable)
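The coefficient r11 being categorized can come from any of the techniques named earlier. A minimal sketch of two of them, assuming the standard forms KR-20 = k/(k − 1) · (1 − Σpq / s²) for items scored 0/1 and Cronbach's alpha = k/(k − 1) · (1 − Σs_i² / s²), where k is the number of items, s_i² the variance of item i, and s² the variance of total scores. Population variances are used throughout, which is an assumption. The function names and data are hypothetical, and assigning a boundary value to the lower category is also an assumption.

```python
def pvar(values):
    """Population variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def cronbach_alpha(item_scores):
    """Cronbach's alpha; rows are students, columns are items."""
    k = len(item_scores[0])
    item_vars = [pvar([row[j] for row in item_scores]) for j in range(k)]
    total_var = pvar([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def kr20(item_scores):
    """KR-20 for items scored 0 or 1."""
    n, k = len(item_scores), len(item_scores[0])
    p = [sum(row[j] for row in item_scores) / n for j in range(k)]
    sum_pq = sum(pj * (1 - pj) for pj in p)
    total_var = pvar([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum_pq / total_var)


def guilford_category(r11):
    """Map a reliability coefficient to the categories listed above."""
    if r11 > 0.80:
        return "very high reliability"
    if r11 > 0.60:
        return "high reliability"
    if r11 > 0.40:
        return "moderate reliability"
    if r11 > 0.20:
        return "low reliability"
    return "very low reliability"


# Hypothetical data: 6 students, 4 items scored 1 or 0.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
r11 = kr20(scores)
print(round(r11, 3), guilford_category(r11))
```

For 0/1 items the two formulas coincide, because the population variance of a 0/1 item is pq. Cronbach's alpha also handles items with partial credit, which is why the slides pair it with subjective tests.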
ITEMAN, others
Rasch model
Excel SPSS
Difficulty Level
The probability of answering a question correctly at a certain level of ability, usually expressed in the form of an index
01 Objective test
02 Subjective test
►$TK = \dfrac{\text{Mean}}{\text{Maximum score set in the scoring guidelines}}$
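A minimal sketch of both difficulty indices. The subjective-test version follows the TK formula above. The objective-test formula is not legible in the source, so the sketch assumes the commonly used proportion of test takers who answer correctly. The function names and data are hypothetical.

```python
def difficulty_objective(item_responses):
    """Proportion answering correctly (assumed formula); responses are 1 or 0."""
    if not item_responses:
        raise ValueError("no responses")
    return sum(item_responses) / len(item_responses)


def difficulty_subjective(scores, max_score):
    """TK = mean score on the item / maximum score in the scoring guidelines."""
    if not scores or max_score <= 0:
        raise ValueError("need scores and a positive maximum score")
    return (sum(scores) / len(scores)) / max_score


# Hypothetical data.
print(difficulty_objective([1, 1, 0, 1, 0]))                # five students, one multiple-choice item
print(difficulty_subjective([8, 6, 10, 4], max_score=10))   # four students, one essay item
```

In both versions a higher index means an easier question.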
Question number BA BB DP
1 5 5 0.00
2 5 0 1.00
3 5 1 0.90
4 5 1 0.90
5 3 5 -0.30
6 5 0 1.00
7 5 0 0.90
8 4 1 0.70
9 0 5 -1.00
10 0 0 0.00
BA 5
BB 5
E.g. subjective test
$$DP = \frac{\text{Mean BA} - \text{Mean BB}}{\text{Maximum score}}$$
DP formula using point-biserial correlation
In addition to that formula, the discriminating power of multiple-choice items can be determined with the point-biserial correlation ($r_{pbis}$) and the biserial correlation ($r_{bis}$), as follows:
$$r_{pbis} = \frac{\bar{X}_b - \bar{X}_s}{SD_t}\sqrt{pq} \quad\text{or}\quad r_{bis} = \frac{\bar{Y}_b - \bar{Y}_s}{SD_t}\cdot\frac{n_b\,n_s}{u\,n\sqrt{n^2 - n}}$$
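A minimal sketch of the point-biserial formula and the subjective-test DP. It reads the subscripts b and s as the test takers who answered the item correctly and incorrectly, SDt as the population standard deviation of total scores, and p and q as the proportions answering correctly and incorrectly. These readings are assumptions, since the slide does not define the symbols. The biserial form is left out because it needs the normal-curve ordinate u. Function names and data are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def point_biserial(item, totals):
    """r_pbis = (mean_correct - mean_incorrect) / SD_t * sqrt(p * q).

    item: 1/0 scores on one item; totals: total test score per student.
    """
    n = len(item)
    correct = [t for i, t in zip(item, totals) if i == 1]
    incorrect = [t for i, t in zip(item, totals) if i == 0]
    if not correct or not incorrect:
        raise ValueError("item must have both correct and incorrect answers")
    p = len(correct) / n
    q = 1 - p
    mean_total = mean(totals)
    sd_total = math.sqrt(sum((t - mean_total) ** 2 for t in totals) / n)
    return (mean(correct) - mean(incorrect)) / sd_total * math.sqrt(p * q)


def dp_subjective(upper_scores, lower_scores, max_score):
    """DP = (mean of upper group - mean of lower group) / maximum score."""
    return (mean(upper_scores) - mean(lower_scores)) / max_score


# Hypothetical data: one item and the total scores of six students.
item = [1, 1, 1, 1, 0, 0]
totals = [4, 3, 3, 1, 1, 0]
print(round(point_biserial(item, totals), 2))

# Hypothetical essay item with a maximum score of 10.
print(dp_subjective([9, 8, 10], [4, 5, 3], max_score=10))
```

With the population standard deviation, the point-biserial value equals the ordinary Pearson correlation between the 0/1 item score and the total score.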
No  Student group              A    B    C    D    E*
1   BA 27% = 40 students       4   12   16    8    0
    BB 27% = 40 students       0   12   16   12    0
Question number 1 is really bad because both the upper and lower groups are confused and both groups mostly choose C. In addition, choice E does not work because no one chooses it.
No  Student group                   A*   B    C    D    E
2   Upper/high 27% = 40 students    40    0    0    0    0
    Lower/low 27% = 40 students      0    8   12   10    0
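The reading of these two tables can be sketched in code: flag any option that nobody chose, find the most popular option in each group, and check whether the key (marked with *) attracts more upper-group than lower-group students. The counts are taken from the tables above. The function name and the report fields are hypothetical.

```python
def distractor_report(options, key, upper, lower):
    """Summarize option counts for the upper and lower groups of one item."""
    key_index = options.index(key)
    return {
        # options chosen by no one in either group
        "unchosen": [o for o, u, l in zip(options, upper, lower) if u + l == 0],
        # most frequently chosen option in each group (first one on ties)
        "top_upper": options[upper.index(max(upper))],
        "top_lower": options[lower.index(max(lower))],
        # a working key should attract more upper-group students
        "key_favours_upper": upper[key_index] > lower[key_index],
    }


options = ["A", "B", "C", "D", "E"]

# Question 1: key is E.
q1 = distractor_report(options, "E", upper=[4, 12, 16, 8, 0], lower=[0, 12, 16, 12, 0])
# Question 2: key is A.
q2 = distractor_report(options, "A", upper=[40, 0, 0, 0, 0], lower=[0, 8, 12, 10, 0])

print(q1)
print(q2)
```

For question 1 the report shows both groups favouring C and nobody choosing E, which matches the comment above. For question 2 the key separates the groups, but option E still attracts no one.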