
PAPER ON TESTING THE VALIDITY AND RELIABILITY OF ASSESSMENT TOOLS

CREATED BY:

DIMAS ABDILLAH ABADI 4181111057

ELKANA INDAH SARI PAKPAHAN 4183312004

EVRI SARIATI PAKPAHAN 4183111061

TRYANI CINTYA SIAHAAN 4183312002

Lecturer:

Prof. Dr. Bombok Sinaga, M.Pd

Teaching Assistant:

Fevi Rahmawati

MATHEMATICS DEPARTMENT

FACULTY OF MATHEMATICS AND SCIENCES

UNIVERSITAS NEGERI MEDAN

2019

PREFACE

First of all, we want to thank God for His love and grace, which allowed us to finish
this paper on time. The purpose of this paper is to fulfill the assignment given by
Fevi Rahmawati as the lecturer in the Teaching and Learning Evaluation course.

As humans, we realize that there are still many mistakes in this paper.
Therefore, we apologize for any errors it may contain.

We also welcome criticism and suggestions for future improvement, and we hope this
paper provides readers with benefits and knowledge. Thank you.

September 19th, 2019

Group VI

TABLE OF CONTENTS

PREFACE ......................................................................................................................... i

TABLE OF CONTENTS .................................................................................................. ii

CHAPTER I : INTRODUCTION ..................................................................................... 1

1.1 Background.......................................................................................................... 1

1.2 Formulation of Problem ..................................................................................... 2

1.3 Aim ....................................................................................................................... 3

CHAPTER II : DISCUSSION .......................................................................................... 4

2.1 Validity ................................................................................................................. 4

2.2 Reliability ............................................................................................................. 8

CHAPTER III : CLOSING ............................................................................................... 11

3.1 Conclusion ........................................................................................................... 11

3.2 Suggestion ............................................................................................................ 11

BIBLIOGRAPHY ............................................................................................................. 12

CHAPTER I

INTRODUCTION

1.1 Background

Teaching and learning interactions prioritize teacher professionalism and student
achievement by emphasizing the integrity of teaching resources. The process of
knowledge transfer requires the learning components to work together in synergy to
build the cognitive, affective, and psychomotor domains that form the foundation of
student knowledge. The links between learning components are maintained in harmony
with the learning context and are guided by the objectives to be achieved. Learning
objectives designed in accordance with the syllabus serve as benchmarks for student
learning success.

In line with the achievement of learning objectives, periodic evaluation of the
development of student learning outcomes is needed. Student learning outcomes serve
as evaluation material to measure the extent of student mastery of the teaching
material that has been delivered. Evaluation, as a whole educational appraisal
process, covers all the achievements of the education unit and shows whether the
effort pursued succeeds in line with the objectives of education, that is, producing
output in line with the field being studied. One concrete and numerical form of
educational evaluation can be seen from student learning outcomes, which are obtained
through assessment.

Assessment is a process to find out whether the process and results of a program of
activities are in accordance with the objectives or criteria that have been set
(Suwandi, 2009: 6). Inseparable from evaluation, assessment is closely related to
measurement, and measurements produce the data for the assessment process. As stated
by Kelvin (2009: 6), the quantitative aspects of assessment are obtained through
measurement, while the qualitative aspects come from the interpretation and
consideration of the quantitative measurement data. The measurement results thus
produce descriptive data that are interpreted according to predetermined assessment
criteria.

Questions that meet the criteria of good test items are expected to provide
information that can be accounted for. In other words, teachers are required to
prepare and conduct assessments well so that the learning objectives that have been
set can be achieved optimally. At the end of a lesson, the teacher is expected to be
able to construct test instruments that can be accounted for. As Tuckman expressed
(in Nurgiyantoro, 2010: 150), test instruments must be accountable in terms of
appropriateness, validity, reliability, interpretability, and usability. Thus, the
main purpose of assessment activities is to determine the extent to which basic
competencies have been mastered by students after participating in a series of
lessons.

In line with Tuckman, Purwanto (2011: 114) also agreed that as a measuring
instrument, the THB (Tes Hasil Belajar, or learning achievement test) must meet the
requirements of a good measurement tool. A good measuring instrument must meet two
requirements, namely validity and reliability. Purwanto explained that a valid THB
is one that measures exactly the condition it is intended to measure. Conversely, a
THB is said to be invalid when it is used to measure a condition that it does not
precisely measure.

Validity is closely related to reliability. Reliability, or consistency of
measurement, is needed to obtain valid results, but a test can be reliable without
being valid (Nurgiyantoro, 2010: 150). While validity concerns the appropriateness
of the interpretation of test results, reliability concerns the consistency of those
results. When repeated testing yields relatively stable results, the results can be
said to be reliable or trustworthy, in the sense that the competencies tested are in
harmony with student mastery.

1.2 Formulation of Problem

• What is the nature of the test's validity?
• How to calculate the validity of the test?
• What is the nature of test reliability?
• How to calculate the reliability of the test?
1.3 Aim

• Describe the nature of test validity, which includes the understanding of test
validity, the types of test validity, and how to calculate test validity.
• Describe the nature of test reliability, which includes the understanding of test
reliability, the types of test reliability, and how to calculate test reliability.

CHAPTER II

DISCUSSION

2.1 Validity

Validity is the principle that requires that the instruments used actually
measure the intended concept (construct validity), that hypothesized causes
actually do produce the effects observed (internal validity), and that results
obtained with one population can be generalized to another (external validity).

Validity is a quality that shows the relationship between a measurement (diagnosis)
and the meaning or purpose of the learning criteria or behavior (Purwanto, 2002:
137). Meanwhile, according to Sukardi (2011: 3), validity is the degree to which a
test measures what it is intended to measure. A measuring instrument is said to be
valid when it is fit to measure the object that should be measured and accords with
certain criteria. This means that the measuring instrument matches both its
measurement function and its measurement target. The validity of an evaluation
instrument is thus nothing other than the degree to which a test measures what it is
trying to measure (Fathorrasik, 2016).

There are two kinds of validity, namely logical validity and empirical validity. To
make this clearer, we discuss the first kind of validity below:

• Logical validity (internal)

Logical validity, or validity by reasoning, describes an instrument whose conditions
meet requirements established through reasoning. This means that logical validity
can be achieved if the instrument is prepared based on the existing provisions.
There are two kinds of logical validity, namely:
a) Content validity

Content validity indicates that an instrument is compiled based on the content of
the subject matter to be evaluated. An instrument's content validity must therefore
be checked by someone skilled in the relevant field; the person who performs this
check is called a validator. To find out whether a test instrument is valid or not,
the instrument grid must be reviewed to ensure that the items represent or reflect,
proportionally, the entire content or material that should be mastered. The aspects
assessed by the validator are (1) the appropriateness between indicators and items,
(2) the clarity of the language or pictures in the questions, (3) the
appropriateness of the questions to the students' level of ability, and (4) the
correctness of the material or concept.

b) Construct validity

Construct validity, meanwhile, is based on an instrument being compiled around the
psychological aspects that should be evaluated. A construct is a conceptual
framework that cannot be observed directly, yet this framework is essential in the
preparation and development of measurement and assessment instruments. According to
Thorndike (1997: 175), construct validity refers to a psychological framework built
on an unobservable concept, while the concept itself is used in preparing
instruments that capture observable behavior.

The purpose of testing construct validity is to obtain evidence of the extent to
which the measurement results reflect the construct being measured. In developing an
instrument, construct validity analysis is carried out in two stages: (1) a
theoretical stage and (2) an empirical stage. At the theoretical stage, the design
of the instrument is assessed by a number of reviewers who are familiar with
instrument development, supported by a literature review. At the empirical stage,
the instrument is tried out on a number of respondents. The construct validation
process through a panel is intended to (1) examine the instrument from the construct
down to the construction of its items, and (2) assess the items themselves.

• Empirical validity (external)

The external or empirical validity of an instrument is tested by comparing the
criteria contained in the instrument with empirical facts found in the field. There
are two kinds of empirical validity: the first is concurrent validity and the second
is predictive validity. These two kinds of validity are described below, following
Zein and Darto (2012: 53-54):

a) Concurrent validity

Concurrent validity describes an instrument whose condition is in accordance with
criteria that are already available. According to Suharsimi Arikunto (1999: 68), a
test is said to have empirical validity if its results are in accordance with
experience. To compare the results of a test, we therefore need a criterion or a
comparison instrument.

b) Predictive validity

Predictive validity describes an instrument whose condition is in accordance with
criteria that are predicted to occur.

Validity test formula

The first and most popular technique is the product moment correlation technique
proposed by Pearson. There are two forms of the product moment correlation formula:

• product moment correlation with deviation scores, and
• product moment correlation with rough (raw) numbers.

To find test validity with deviations, the item scores and total scores are first
arranged in a preparation table, and the resulting sums are then entered into the
formula.
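For reference, the deviation-score form of the Pearson product moment correlation is
conventionally written as

\[
r_{xy} = \frac{\sum xy}{\sqrt{\left(\sum x^{2}\right)\left(\sum y^{2}\right)}},
\qquad x = X - \bar{X}, \quad y = Y - \bar{Y},
\]

where r_xy is the item's validity coefficient, X is the score on the item, Y is the
total test score, and x and y are the deviations of each score from its mean.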

For the rough-number (raw score) form, the preparation table instead lists the raw
scores together with their squares and cross-products, and the sums are entered
directly into the formula.
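This raw-score form is conventionally written as

\[
r_{xy} = \frac{N\sum XY - \left(\sum X\right)\left(\sum Y\right)}
{\sqrt{\left\{N\sum X^{2} - \left(\sum X\right)^{2}\right\}
\left\{N\sum Y^{2} - \left(\sum Y\right)^{2}\right\}}},
\]

where N is the number of test takers, X is the item score, and Y is the total score.

As an illustration only, the following minimal Python sketch applies this raw-score
formula to hypothetical item and total scores (the numbers are invented for the
example, not taken from this paper):

```python
import math

def pearson_r(x, y):
    """Pearson product moment correlation, raw-score ("rough number") form."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(v * v for v in x)
    sum_y2 = sum(v * v for v in y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical data: scores of 6 students on one item (X) and on the whole test (Y).
item_scores = [1, 0, 1, 1, 0, 1]
total_scores = [8, 5, 9, 7, 4, 10]

r_xy = pearson_r(item_scores, total_scores)
print(f"item validity coefficient r_xy = {r_xy:.3f}")
# The item is usually judged valid when r_xy exceeds the critical r-table value
# for the given number of test takers and significance level.
```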

2.2 Reliability

Reliability refers to whether a study, if replicated, will achieve the same results
as the first iteration. It also refers to whether tests yield the same results if
retested with a similar population, or if two sets of coders will code a given set
of data the same way. Obviously, since case studies are designed to document and
explore relatively unknown or unique phenomena, it is difficult to replicate them
exactly, if only because historical effects—the passage of time—will render a site
or population different in its second exploration (Secolsky, 2017).

The Indonesian term reliabilitas comes from the English word reliability, which
refers to the consistency of measurement (Walizer, 1987). Sugiharto and Situnjak
(2006) stated that reliability refers to the understanding that the instruments used
in research to obtain information can be trusted as data collection tools and are
able to reveal the real information in the field. Ghozali (2009) states that
reliability is a measure of a questionnaire that serves as an indicator of variables
or constructs. A questionnaire is said to be reliable if a person's answers to its
statements are consistent or stable over time. The reliability of a test refers to
its degree of stability, consistency, predictive power, and accuracy. Measurements
with high reliability are measurements that produce dependable data.

According to Masri Singarimbun, reliability is an index that shows the extent to
which a measuring instrument can be trusted or relied upon. If a measuring
instrument is used twice to measure the same phenomenon and the measurement results
obtained are relatively consistent, then the instrument is reliable. In other words,
reliability indicates the consistency of a measuring instrument in measuring the
same phenomenon.

According to Sumadi Suryabrata (2004: 28), reliability shows the extent to which the
results of measurement with an instrument can be trusted. The measurement results
must be reliable in the sense that they have an adequate level of consistency and
stability.

Reliability is the consistency of a series of measurements or of a measuring
instrument. It can take the form of the same measuring instrument giving the same
results when the test is repeated (test-retest), or, for more subjective
measurements, of two assessors giving similar scores (inter-rater reliability).
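As a rough illustration of the test-retest idea (using invented scores, not data
from this paper), the stability coefficient is simply the correlation between the
two administrations:

```python
from statistics import correlation  # Pearson's r, available in Python 3.10+

# Hypothetical data: the same 8 students take the same test twice, two weeks apart.
first_administration = [72, 65, 80, 58, 90, 77, 61, 84]
second_administration = [70, 68, 82, 55, 88, 75, 64, 86]

# Test-retest reliability: the correlation between the two sets of scores.
r_tt = correlation(first_administration, second_administration)
print(f"test-retest reliability coefficient = {r_tt:.3f}")
# A coefficient close to 1 (commonly >= 0.70) suggests the scores are stable over time.
```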

Reliability is not the same as validity: a reliable measurement measures
consistently, but does not necessarily measure what should be measured. In research,
reliability is the extent to which a test's measurements remain consistent when
repeated on the same subjects under the same conditions. Research is considered
reliable when it provides consistent results for the same measurements; it cannot be
relied on if repeated measurements give different results.

Whether reliability is high or low is shown empirically by a number called the
reliability coefficient. High reliability is indicated by a value of rxx approaching
1, and by general agreement reliability is considered satisfactory if rxx ≥ 0.700.

The reliability of an instrument is commonly tested with the Cronbach Alpha formula
when the instrument takes the form of a questionnaire or a multilevel (rating)
scale. The Cronbach Alpha formula is as follows:
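The Cronbach Alpha coefficient referred to here is conventionally written as

\[
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum \sigma_{i}^{2}}{\sigma_{t}^{2}}\right),
\]

where k is the number of items, σ_i² is the variance of item i, and σ_t² is the
variance of the total scores.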

If the value of alpha is greater than 0.70, reliability is sufficient, while an
alpha greater than 0.80 suggests that all items are reliable and the whole test is
consistently strong. Alternatively, some interpret the coefficient as follows:

• If alpha > 0.90, reliability is perfect.
• If alpha is between 0.70 and 0.90, reliability is high.
• If alpha is between 0.50 and 0.70, reliability is moderate.
• If alpha < 0.50, reliability is low.

If alpha is low, the chances are that one or more items are not reliable
(http://qmc.binus.ac.id, 2014).
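A minimal Python sketch of the Cronbach Alpha computation, assuming a small
hypothetical questionnaire (rows are respondents, columns are items on a 1-4 scale;
none of the numbers come from this paper):

```python
def cronbach_alpha(scores):
    """Cronbach's Alpha for a score matrix: rows = respondents, columns = items."""
    n_items = len(scores[0])

    def variance(values):
        # Population variance: sum((x - mean)^2) / N.
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / len(values)

    item_variances = [variance([row[i] for row in scores]) for i in range(n_items)]
    total_variance = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical data: 5 respondents answering a 4-item questionnaire.
responses = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 3],
    [1, 2, 2, 2],
    [3, 3, 4, 4],
]
print(f"Cronbach's Alpha = {cronbach_alpha(responses):.3f}")  # compare with the 0.70 rule of thumb
```

The same function works for any number of respondents and items; only the score
matrix changes.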

CHAPTER III

CLOSING

3.1 Conclusion

Validity is the degree to which a test measures what it is intended to measure. The
types of validity consist of content validity, construct validity, concurrent
validity, and predictive validity. Validity is calculated using the product moment
correlation formula. The reliability of a measuring instrument is the consistency or
dependability of the instrument in measuring what it is meant to measure.

Validity and reliability are parameters of both research and testing; for research
results to be usable, both aspects must be established for the assessment
instruments. The difference between them is that validity refers to the extent to
which a test measures what it claims to measure, while reliability refers to the
consistency of the test results. When a test is valid, its data are generally also
reliable; however, a reliable test is not necessarily valid.

3.2 Suggestion

A good measuring instrument must meet two requirements, namely validity and
reliability. Validity is closely related to reliability: reliability, or consistency
of measurement, is needed to obtain valid results, but reliability can be achieved
without validity. That is why it is so important to test the validity and
reliability of assessment tools.

BIBLIOGRAPHY

Binus University. 2014. Uji Validitas dan Reliabilitas.
http://qmc.binus.ac.id/2014/11/01/u-j-i-v-a-l-i-d-i-t-a-s-d-a-n-u-j-i-r-e-l-i-a-b-i-l-i-t-a-s/.
Accessed on September 19th, 2019, 07:35 PM.

Fathorrasik. 2016. Validitas dan Reliabilitas Tes.
https://www.kompasiana.com/fathorrasik1/57a0a5e2ae7e611b19e1a4a2/validitas-dan-reliabilitas-tes?page=all.
Accessed on September 17th, 2019, 05:48 PM.

Hidayat, Anwar. 2012. Penjelasan Berbagai Jenis Uji Validitas dan Cara Hitung.
https://www.statistikian.com/2012/08/uji-validitas.html. Accessed on September 17th,
2019, 03:48 PM.

Secolsky, Charles and D. Brian Denison. 2017. Handbook on Measurement, Assessment
and Evaluation in Higher Education. New York: Routledge.

Zein, Mas’ud and Darto. 2012. Evaluasi Pembelajaran Matematika. Riau: Daulat Riau.

