
NAME:

Reg No:
COURSE CODE: 6507
SEMESTER: Spring 2022
Assignment No. 1
QUESTION NO 1

Explain the relationship between measurement and testing. Highlight
the differences between student and program evaluation.

ANSWER:
Measurement is a systematic process of determining the attributes of an object. It
ascertains how fast, tall, dense, heavy, or broad something is. However, measurement
can capture physical attributes only; for attributes that cannot be measured with the
help of tools, the need for evaluation arises. Evaluation helps in passing value
judgements about the policies, performances, methods, techniques, strategies,
effectiveness, etc. of teaching. Measurement provides a solid base for evaluation, as it
gives something concrete on which to compare objects. Further, evaluation has a crucial
role to play in reforming the learning and teaching process and in suggesting changes to
the curriculum.

When a set of numerals is assigned to objects, be they persons or commodities, as per
accepted rules or standards, and described in standard words, units and symbols so as to
characterize the status of those objects, it is called measurement. In education,
measurement implies the quantitative assessment of a student's performance in an exam.
It is a mechanical process, which involves the systematic study of attributes with the
help of appropriate assessment tools. It transforms a variable into a variate, which is
effective in making deductions. For instance, intelligence is measured in terms of IQ,
and the result is expressed as scores.
Further, it is helpful in comparing the performance of various students as well as in
highlighting their positive and negative points.

Types of Measurement

 Physical Measurement: The measurement of an object which materially exists is
called physical measurement. For instance, measuring the height or weight of an
individual using a measuring tape or weighing machine, starting from a zero point.
 Mental Measurement: Otherwise called psychological measurement. It is not
defined in absolute terms; rather, it is relative. It is not measured with the help of
any instrument but on the basis of the individual's responses or critical
observation. For instance, measuring the amount of work done by an individual is
psychological or mental measurement.
Definition of Evaluation
Evaluation can be defined as the act of assigning value to the measure. It is a systematic
and continuous process wherein the analysis of the outcome derived from the
measurement of the characteristic of the object, person or activity is performed as per the
defined standards. Further, the relative position of the person, object or activity is
ascertained, on the basis of the characteristic.
In evaluation, what we do is pass judgement regarding how suitable, desirable or
valuable something is. In education, evaluation alludes to the overall assessment of the
progress of the student, with respect to:
 Defined objectives
 Efficiency of teaching and
 Effectiveness of the curriculum.
It acts like an 'inbuilt monitor' within the system that reviews the learning
progress at various points in time. It also provides feedback on various aspects of the
educational system, such as on teaching to the teachers and on learning to the learners.
So we can conclude that:
Evaluation = Quantitative description + Qualitative Description + Value Judgement
where the quantitative description includes facts and figures and the qualitative
description includes ranking, weightage and value.
1. Measurement can be understood as the process of determining the attributes
and dimensions of a physical object. On the other hand, evaluation is an ongoing
process of measuring and assigning qualitative meaning, by passing value
judgements.
2. Measurement accounts for the observations which can be expressed numerically,
i.e. quantitative observations. Conversely, evaluation includes both quantitative
and qualitative observations.
3. Measurement entails the assignment of numerals to the person or object as per
certain rules. As against, evaluation involves the assignment of grades, levels
or symbols according to established standards.
4. While measurement focuses on one or more attributes or traits of a person or
object, evaluation covers all the aspects including cognitive, affective and
psychomotor learning.
5. Measurement analyses how much, how tall, how fast, how hot, how far or how
small something is and that too in numerical terms. In contrast, evaluation
answers how well something is which is done by adding meaning or value
judgement to the measurement.
6. With measurement, one cannot make logical assumptions about the learner, but
this is not the case with evaluation.
7. Measurement consumes less time and energy as it uses tools or measuring
devices, to serve the purpose. As against, evaluation requires observation and it
passes value judgement, which consumes time and energy.
8. When it comes to scope, measurement has a limited scope because it takes into
account only a few dimensions of a personality or attribute. But evaluation
covers all the dimensions before passing value judgement. Moreover, the
evaluation includes measurement. Hence, its scope is wider.
9. Measurement is content-oriented, whereas evaluation is objective-oriented.
QUESTION NO 2

What factors can influence the test administration process, and how can
scoring problems be addressed at the secondary level?

ANSWER:
Test administration guidelines are a set of policies and procedures that outline how
standardized assessments should be distributed and administered. These guidelines
exist in order to increase consistency, ensure test security, and safeguard the fair
and reliable results of exam scores.

Reliability refers to how dependably or consistently a test measures a characteristic. If a
person takes the test again, will he or she get a similar test score, or a much different
score? A test that yields similar scores for a person who repeats the test is said to
measure a characteristic reliably.

How do we account for an individual who does not get exactly the same test score every
time he or she takes the test? Some possible reasons are the following:
 Test taker's temporary psychological or physical state. Test performance can be
influenced by a person's psychological or physical state at the time of testing. For
example, differing levels of anxiety, fatigue, or motivation may affect the
applicant's test results.
 Environmental factors. Differences in the testing environment, such as room
temperature, lighting, noise, or even the test administrator, can influence an
individual's test performance.
 Test form. Many tests have more than one version or form. Items differ on each
form, but each form is supposed to measure the same thing. Different forms of a
test are known as parallel forms or alternate forms. These forms are designed to
have similar measurement characteristics, but they contain different items.
Because the forms are not exactly the same, a test taker might do better on one
form than on another.
 Multiple raters. In certain tests, scoring is determined by a rater's judgments of
the test taker's performance or responses. Differences in training, experience,
and frame of reference among raters can produce different test scores for the
test taker.
Principle of Assessment: Use only reliable assessment instruments and procedures. In
other words, use only assessment tools that provide dependable and consistent
information.

These factors are sources of chance or random measurement error in the assessment
process. If there were no random errors of measurement, the individual would get the
same test score, the individual's "true" score, each time. The degree to which test scores
are unaffected by measurement errors is an indication of the reliability of the test.
Reliable assessment tools produce dependable, repeatable, and consistent information
about people. In order to meaningfully interpret test scores and make useful
employment or career-related decisions, you need reliable tools. This brings us to the
next principle of assessment.
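
This idea can be stated more formally using the classical test theory decomposition, a
standard result included here as an illustration (the passage above implies it without
writing it out). An observed score X is treated as a true score T plus random error E, and
reliability is the share of observed-score variance that is true-score variance:

X = T + E
reliability = Var(T) / Var(X) = Var(T) / (Var(T) + Var(E))

If there were no random error (Var(E) = 0), reliability would equal 1 and every
administration would return the individual's true score, exactly as described above.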

 Test-retest reliability indicates the repeatability of test scores with the passage
of time. This estimate also reflects the stability of the characteristic or construct
being measured by the test.
Some constructs are more stable than others. For example, an individual's
reading ability is more stable over a particular period of time than that
individual's anxiety level. Therefore, you would expect a higher test-retest
reliability coefficient on a reading test than you would on a test that measures
anxiety. For constructs that are expected to vary over time, an acceptable
test-retest reliability coefficient may be lower than is suggested in Table 1.
 Alternate or parallel form reliability indicates how consistent test scores are
likely to be if a person takes two or more forms of a test. A high parallel form
reliability coefficient indicates that the different forms of the test are very similar,
which means that it makes virtually no difference which version of the test a
person takes. On the other hand, a low parallel form reliability coefficient
suggests that the different forms are probably not comparable; they may be
measuring different things and therefore cannot be used interchangeably.
 Inter-rater reliability indicates how consistent test scores are likely to be if the
test is scored by two or more raters. On some tests, raters evaluate responses to
questions and determine the score. Differences in judgments among raters are
likely to produce variations in test scores. A high inter-rater reliability coefficient
indicates that the judgment process is stable and the resulting scores are reliable.
Inter-rater reliability coefficients are typically lower than other types of reliability
estimates. However, it is possible to obtain higher levels of inter-rater reliabilities
if raters are appropriately trained.
 Internal consistency reliability indicates the extent to which items on a test
measure the same thing.
A high internal consistency reliability coefficient for a test indicates that the items
on the test are very similar to each other in content (homogeneous). It is
important to note that the length of a test can affect internal consistency
reliability. For example, a very lengthy test can spuriously inflate the reliability
coefficient (see the formula sketched after this list).

Tests that measure multiple characteristics are usually divided into distinct
components. Manuals for such tests typically report a separate internal
consistency reliability coefficient for each component in addition to one for the
whole test. Test manuals and reviews report several kinds of internal consistency
reliability estimates. Each type of estimate is appropriate under certain
circumstances. The test manual should explain why a particular estimate is
reported.
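
How strongly test length enters into reliability can be illustrated with the
Spearman-Brown prophecy formula, a standard result offered here purely as an
illustration (the material above does not derive it). If a test with reliability r is
lengthened by a factor k using comparable items, the predicted reliability r_k is:

r_k = (k * r) / (1 + (k - 1) * r)

For example, doubling (k = 2) a test whose reliability is 0.60 predicts
(2 * 0.60) / (1 + 0.60) = 0.75, which shows why a long test of homogeneous items can
carry a high internal consistency coefficient.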

QUESTION NO 3

Explain practical considerations in planning a test. Develop a protocol
for test administration for secondary level examiners.

ANSWER:
After the overall content of the test has been established through a job analysis, the next
step in test development is to create the detailed test specifications. Test specifications
usually include a test description component and a test blueprint component. The test
description specifies aspects of the planned test such as the test purpose, the target
examinee population, the overall test length, and more. The test blueprint, sometimes also
called the table of specifications, provides a listing of the major content areas and
cognitive levels intended to be included on each test form. It also includes the number of
items each test form should include within each of these content and cognitive areas.
The test description component of an exam program's test specifications is a written
document that provides essential background information about the planned exam
program. This information is then used to focus and guide the remaining steps in the test
development process. At a minimum, the test description may simply indicate who will
be tested and what the purpose of the exam program is. More often, the test description
will usually also include elements such as the overall test length, the test administration
time limit, and the item types that are expected to be used (e.g., multiple choice, essay).
In some cases the test description may also specify a test administration mode (e.g.,
paper-and-pencil, performance-based, computer-based). And, if the test will include any
items or tasks that will need to be scored by human raters, the test description may also
include plans for the scoring procedures and scoring rubrics. The content areas listed in
the test blueprint, or table of specifications, are frequently drawn directly from the results
of a job analysis. These content areas comprise the knowledge, skills, and abilities that
have been determined to be the essential elements of competency for the job or
occupation being assessed. In addition to the listing of content areas, the test blueprint
specifies the number or proportion of items that are planned to be included on each test
form for each content area. These proportions reflect the relative importance of each
content area to competency in the occupation.
Most test blueprints also indicate the levels of cognitive processing that the examinees
will be expected to use in responding to specific items (e.g., Knowledge, Application). It
is critical that your test blueprint and test items include a substantial proportion of items
targeted above the Knowledge-level of cognition. A typical test blueprint is presented in a
two-way matrix with the content areas listed in the table rows and the cognitive processes
in the table columns. The total number of items specified for each column indicates the
proportional plan for each cognitive level on the overall test, just as the total number of
items for each row indicates the proportional emphasis of each content area.
The test blueprint is used to guide and target item writing as well as for test form
assembly. Use of a test blueprint improves consistency across test forms as well as
helping ensure that the goals and plans for the test are met in each operational test. An
example of a test blueprint is provided next. In the (artificial) test blueprint for a Real
Estate licensure exam given below, the overall test length is specified as 80 items. This
relatively small test blueprint includes four major content areas for the exam (e.g., Real
Estate Law). Three levels of cognitive processing are specified. These are Knowledge,
Comprehension, and Application.
Each test form written to this table of specifications will include 40% of the total test (or
32 items) in the content area of Real Estate Law. In addressing cognitive levels, 35% of
the overall test (or 28 items) will be included at the Knowledge-level. The interior cells of
the table indicate the number of items that are intended to be on the test from each
content and cognitive area combination. For example, the test form will include 16 items
at the Knowledge-level in the content area of Real Estate Law.

Content                        Knowledge   Comprehension   Application   Total   Percentage
Real Estate Law                16          8               8             32      40%
Real Estate Practices          4           12              -             16      20%
Financing/Mortgage Markets     8           8               8             24      30%
Real Estate Math               -           -               8             8       10%
Total                          28          28              24            80
Percentage                     35%         35%             30%                   100%
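
The arithmetic behind such a blueprint is mechanical, so it is easy to sketch in code. The
Python fragment below is an illustration only: the content areas, weights, and the 80-item
test length are simply copied from the example blueprint above.

# Sketch: derive per-area item counts from blueprint percentage weights.
test_length = 80
content_weights = {
    "Real Estate Law": 0.40,
    "Real Estate Practices": 0.20,
    "Financing/Mortgage Markets": 0.30,
    "Real Estate Math": 0.10,
}

for area, weight in content_weights.items():
    items = round(test_length * weight)
    print(f"{area}: {items} items")

# A full blueprint splits each row further across cognitive levels
# (Knowledge, Comprehension, Application) in the same proportional way.

The printed counts (32, 16, 24, and 8) match the Total column of the blueprint.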

The test specifications for an exam program provide essential planning materials for the
test development process. Thorough, thoughtful test specifications can guide the
remainder of the test development process, especially item writing efforts and test
assembly. An initial test form can be developed according to these specifications to
appropriately reflect the content and cognitive emphases intended. The specifications can
also be used to guide the development of later, additional test forms. Careful linking
between the job analysis, test specifications, and test items will go a long way to
providing strong content validity and legal defensibility for the exam program.

QUESTION NO 4

Why are essay-type items considered easy to administer and difficult
to score? Explain with practical examples.

ANSWER:
An essay test may give full freedom to the students to write any number of pages.
The required response may vary in length. An essay type question requires the pupil to
plan his own answer and to explain it in his own words. The pupil exercises considerable
freedom to select, organise and present his ideas. Essay type tests provide a better
indication of the pupil's real achievement in learning. The answers provide a clue to the
nature and quality of the pupil's thought process.
That is, we can assess how the pupil presents his ideas (whether his manner of
presentation is coherent, logical and systematic) and how he concludes. In other words,
the answer of the pupil reveals the structure, dynamics and functioning of pupil’s mental
life.
The essay questions are generally thought to be the traditional type of questions which
demand lengthy answers. They are not amenable to objective scoring as they give scope
for halo-effect, inter-examiner variability and intra-examiner variability in scoring.

Types of Essay Test:

There can be many types of essay tests. Some of these are given below, with examples
from different subjects:
1. Selective Recall.
e.g. What was the religious policy of Akbar?
2. Evaluative Recall.
e.g. Why did the First War of Independence in 1857 fail?
3. Comparison of two things—on a single designated basis.
e.g. Compare the contributions made by Dalton and Bohr to Atomic theory.
4. Comparison of two things—in general.
e.g. Compare Early Vedic Age with the Later Vedic Age.
5. Decision—for or against.
e.g. Which type of examination do you think is more reliable? Oral or Written. Why?
6. Causes or effects.
e.g. Discuss the effects of environmental pollution on our lives.
7. Explanation of the use or exact meaning of some phrase in a passage or a sentence.
e.g., Joint Stock Company is an artificial person. Explain 'artificial person', bringing out
the concept of a Joint Stock Company.
8. Summary of some unit of the text or of some article.
9. Analysis
e.g. What was the role played by Mahatma Gandhi in India’s freedom struggle?
10. Statement of relationship.
e.g. Why is knowledge of Botany helpful in studying agriculture?
11. Illustration or examples (your own) of principles in science, language, etc.
e.g. Illustrate the correct use of subject-verb position in an interrogative sentence.
12. Classification.
e.g. Classify the following into Physical change and Chemical change with explanation.
Water changes to vapour; Sulphuric Acid and Sodium Hydroxide react to produce
Sodium Sulphate and Water; Rusting of Iron; Melting of Ice.
13. Application of rules or principles in given situations.
e.g. If you sat halfway between the middle and one end of a see-saw, would a person
sitting on the other end have to be heavier or lighter than you in order to make the
see-saw balance in the middle? Why?
14. Discussion.
e.g. Partnership is a relationship between persons who have agreed to share the profits of
a business carried on by all or any of them acting for all. Discuss the essentials of
partnership on the basis of this definition.
15. Criticism—as to the adequacy, correctness, or relevance—of a printed statement or a
classmate’s answer to a question on the lesson.
e.g. What is wrong with the following statement?
The Prime Minister is the sovereign Head of State in India.
16. Outline.
e.g. Outline the steps required in computing the compound interest if the principal
amount, rate of interest and time period are given as P, R and T respectively.
17. Reorganization of facts.
e.g. The student is asked to interview some persons and find out their opinion on the role
of UN in world peace. In the light of data thus collected he/she can reorganise what is
given in the text book.
18. Formulation of questions (problems and questions raised).
e.g. After reading a lesson, the pupils are asked to raise related problems and questions.
19. New methods of procedure
e.g. Can you solve this mathematical problem by using another method?

Advantages of the Essay Tests:

1. It is relatively easier to prepare and administer a six-question extended-response essay
test than to prepare and administer a comparable 60-item multiple-choice test.
2. It is the only means that can assess an examinee’s ability to organise and present his
ideas in a logical and coherent fashion.
3. It can be successfully employed for practically all the school subjects.
4. Some of the objectives, such as the ability to organise ideas effectively, to criticise or
justify a statement, and to interpret, can be best measured by this type of test.
5. Logical thinking and critical reasoning, systematic presentation, etc. can be best
developed by this type of test.
6. It helps to induce good study habits such as making outlines and summaries,
organising the arguments for and against, etc.
7. The students can show their initiative, the originality of their thought and the fertility
of their imagination as they are permitted freedom of response.
8. The responses of the students need not be completely right or wrong. All degrees of
comprehensiveness and accuracy are possible.
9. It largely eliminates guessing.
10. They are valuable in testing the functional knowledge and power of expression of the
pupil.

Limitations of Essay Tests:

1. One of the serious limitations of the essay tests is that these tests do not give scope for
larger sampling of the content. You cannot sample the course content so well with six
lengthy essay questions as you can with 60 multiple-choice test items.
2. Such tests encourage selective reading and emphasise cramming.
3. Moreover, scoring may be affected by spelling, good handwriting, coloured ink,
neatness, grammar, length of the answer, etc.
4. The long-answer type questions are less valid and less reliable, and as such they have
little predictive value.
5. It requires excessive time on the part of students to write; and for the assessor,
reading essays is very time-consuming and laborious.
6. It can be assessed only by a teacher or competent professionals.
7. Improper and ambiguous wording handicaps both the students and valuers.
8. Mood of the examiner affects the scoring of answer scripts.
9. There is a halo effect, i.e. biased judgement arising from previous impressions.
10. The scores may be affected by the examiner's personal bias or partiality for a
particular point of view, his way of understanding the question, his weightage to
different aspects of the answer, favouritism and nepotism, etc.
Thus, the potential disadvantages of essay type questions are:
(i) Poor predictive validity,
(ii) Limited content sampling,
(iii) Score unreliability, and
(iv) Scoring constraints.
The teacher can sometimes, through essay tests, gain improved insight into a student’s
abilities, difficulties and ways of thinking and thus have a basis for guiding his/her
learning.
(A) While Framing Questions:
1. Give adequate time and thought to the preparation of essay questions, so that they can
be re-examined, revised and edited before they are used. This would increase the validity
of the test.
2. The item should be so written that it will elicit the type of behaviour the teacher wants
to measure. If one is interested in measuring understanding, he should not ask a question
that will elicit an opinion; e.g.,
“What do you think of Buddhism in comparison to Jainism?”
3. Use words which themselves give directions e.g. define, illustrate, outline, select,
classify, summarise, etc., instead of discuss, comment, explain, etc.
4. Give specific directions to students to elicit the desired response.
5. Indicate clearly the value of the question and the time suggested for answering it.
6. Do not provide optional questions in an essay test because—
(i) It is difficult to construct questions of equal difficulty;
(ii) Students do not have the ability to select those questions which they will answer best;
(iii) A good student may be penalised because he is challenged by the more difficult and
complex questions.
7. Prepare and use a relatively large number of questions requiring short answers rather
than just a few questions involving long answers.
8. Do not start essay questions with such words as list, who, what or whether. If we
begin questions with such words, they are likely to be short-answer questions and not
essay questions, as we have defined the term.
9. Adapt the length of the response and complexity of the question and answer to the
maturity level of the students.
10. The wording of the questions should be clear and unambiguous.
11. It should be a power test rather than a speed test. Allow a liberal time limit so that the
essay test does not become a test of speed in writing.
12. Supply the necessary training to the students in writing essay tests.
13. Questions should be graded from simple to complex so that all the testees can
answer at least a few questions.
14. Essay questions should provide value points and marking schemes.
(B) While Scoring Questions:
1. Prepare a marking scheme, suggesting the best possible answer and the weightage
given to the various points of this model answer. Decide in advance which factors will be
considered in evaluating an essay response.
2. While assessing the essay response, one must:
a. Use appropriate methods to minimise bias;
b. Pay attention only to the significant and relevant aspects of the answer;
c. Be careful not to let personal idiosyncrasies affect assessment;
d. Apply a uniform standard to all the papers.
3. The examinee's identity should be concealed from the scorer. By this we can avoid the
'halo effect' or bias which may affect the scoring.
4. Check your marking scheme against actual responses.
5. Once the assessment has begun, the standard should not be changed, nor should it vary
from paper to paper or reader to reader. Be consistent in your assessment.
6. Grade only one question at a time for all papers. This will help you in minimising the
halo effect, in becoming thoroughly familiar with just one set of scoring criteria, and in
concentrating completely on them.
7. The mechanics of expression (legibility, spelling, punctuation, grammar) should be
judged separately from what the student writes, i.e. the subject matter content.
8. If possible, have two independent readings of the test and use the average as the final
score.
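
Points 6 and 8 above are easy to mechanise once marks have been recorded. The sketch
below uses entirely hypothetical scores and variable names; it only illustrates the
procedure of grading one question at a time and averaging two independent readings.

# Sketch: combine two independent readings of an essay test.
# Scores are indexed as rater[paper][question]; all data is made up.
rater_a = {"paper1": [7, 5, 8], "paper2": [6, 6, 7]}
rater_b = {"paper1": [8, 4, 8], "paper2": [5, 7, 6]}

final_scores = {}
num_questions = 3
for question in range(num_questions):        # one question at a time, across all papers
    for paper in rater_a:
        avg = (rater_a[paper][question] + rater_b[paper][question]) / 2
        final_scores.setdefault(paper, []).append(avg)

for paper, marks in final_scores.items():
    print(paper, marks, "total:", sum(marks))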

QUESTION NO 5

Explain the qualities of a good test. In which situations can the equivalent
form of reliability be a good measure of reliability?

ANSWER:
One of the major goals of education is to prepare students for the next step in their
future. They have to make sure that their learners have acquired enough knowledge about
the field of study. Only good tests ensure this. A good test is not only a score that learners
struggle to ace.
It is feedback that a student receives to improve his skills and knowledge, and that a
good teacher returns to regularly, to make sure their teaching strategies are on point and
to judge whether they need development.
It is also feedback for decision-makers in educational institutions and governmental
positions, who need good data to plan the next step of the institution's or the State's
education plan.
It should not be something students spend days of anxiety over, wondering how well
they will do in a given test, how well the test questions are actually written, and whether
they are questions they know the answers to or not.
Teachers used to measure students’ knowledge only by how they score in a given exam.
They give students only one chance to show their competencies without discussions or
classroom projects. Online assessment is a way through which teachers can improve
students’ learning, knowledge, beliefs, and skills. Online assessments can be behavioral,
cognitive, or communicative assessments.
Students may take the online assessment in the classroom or at home and this reduces
their stress. New tools are now introduced for instructors to set different types of
assessments.
There are four general classes of reliability estimates, each of which estimates reliability
in a different way. They are:
 Inter-Rater or Inter-Observer Reliability: Used to assess the degree to which
different raters/observers give consistent estimates of the same phenomenon.
 Test-Retest Reliability: Used to assess the consistency of a measure from one
time to another.
 Parallel-Forms Reliability: Used to assess the consistency of the results of two
tests constructed in the same way from the same content domain.
 Internal Consistency Reliability: Used to assess the consistency of results across
items within a test.

Inter-Rater or Inter-Observer Reliability
Whenever you use humans as a part of your measurement procedure, you have to worry
about whether the results you get are reliable or consistent. People are notorious for their
inconsistency. We are easily distractible. We get tired of doing repetitive tasks. We
daydream. We misinterpret.
So how do we determine whether two observers are being consistent in their
observations? You probably should establish inter-rater reliability outside of the context
of the measurement in your study. After all, if you use data from your study to establish
reliability, and you find that reliability is low, you’re kind of stuck. Probably it’s best to
do this as a side study or pilot study. And, if your study goes on for a long time, you may
want to reestablish inter-rater reliability from time to time to assure that your raters aren’t
changing.
There are two major ways to actually estimate inter-rater reliability. If your measurement
consists of categories – the raters are checking off which category each observation falls
in – you can calculate the percent of agreement between the raters. For instance, let’s say
you had 100 observations that were being rated by two raters. For each observation, the
rater could check one of three categories. Imagine that on 86 of the 100 observations the
raters checked the same category. In this case, the percent of agreement would be 86%.
OK, it’s a crude measure, but it does give an idea of how much agreement exists, and it
works no matter how many categories are used for each observation.
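
A minimal sketch of this calculation in Python (the categories and ratings below are
invented for illustration):

# Sketch: percent agreement between two raters over categorical ratings.
rater_1 = ["A", "B", "B", "C", "A", "B"]   # hypothetical category codes
rater_2 = ["A", "B", "C", "C", "A", "A"]

agreements = sum(r1 == r2 for r1, r2 in zip(rater_1, rater_2))
percent_agreement = 100 * agreements / len(rater_1)
print(f"Percent agreement: {percent_agreement:.0f}%")   # 4 of 6 -> 67%
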
The other major way to estimate inter-rater reliability is appropriate when the measure is
a continuous one. There, all you need to do is calculate the correlation between the
ratings of the two observers. For instance, they might be rating the overall level of
activity in a classroom on a 1-to-7 scale. You could have them give their rating at
regular time intervals (e.g., every 30 seconds). The correlation between these ratings
would give you an estimate of the reliability or consistency between the raters.
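
For the continuous case, the corresponding sketch simply correlates the two streams of
ratings (the numbers are invented; numpy's corrcoef computes the Pearson correlation):

import numpy as np

# Sketch: inter-rater reliability for continuous ratings on a 1-to-7 scale.
rater_1 = np.array([3, 5, 4, 6, 2, 7, 5])   # hypothetical ratings at regular intervals
rater_2 = np.array([4, 5, 3, 6, 2, 6, 5])

r = np.corrcoef(rater_1, rater_2)[0, 1]
print(f"Inter-rater correlation: {r:.2f}")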

Test-Retest Reliability
We estimate test-retest reliability when we administer the same test to the same sample
on two different occasions. This approach assumes that there is no substantial change in
the construct being measured between the two occasions. The amount of time allowed
between measures is critical. We know that if we measure the same thing twice, the
correlation between the two observations will depend in part on how much time elapses
between the two measurement occasions. The shorter the time gap, the higher the
correlation; the longer the time gap, the lower the correlation. This is because the two
observations are related over time – the closer in time we get the more similar the factors
that contribute to error. Since this correlation is the test-retest estimate of reliability, you
can obtain considerably different estimates depending on the interval.
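
To make this concrete, here is a small simulation. It is entirely synthetic and simply
assumes the true-score-plus-error model sketched in Question 2, with a larger error
standard deviation standing in for a longer interval between occasions:

import numpy as np

# Sketch: simulated test-retest reliability under a true-score + error model.
rng = np.random.default_rng(0)
true_scores = rng.normal(50, 10, size=500)     # a stable construct

for error_sd in (2, 5, 10):                    # more elapsed time ~ more error (assumed)
    occasion_1 = true_scores + rng.normal(0, error_sd, size=500)
    occasion_2 = true_scores + rng.normal(0, error_sd, size=500)
    r = np.corrcoef(occasion_1, occasion_2)[0, 1]
    print(f"error sd {error_sd}: test-retest r = {r:.2f}")

As the error grows, the test-retest correlation drops, mirroring the pattern described
above.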

Parallel-Forms Reliability
In parallel forms reliability you first have to create two parallel forms. One way to
accomplish this is to create a large set of questions that address the same construct and
then randomly divide the questions into two sets. You administer both instruments to the
same sample of people. The correlation between the two parallel forms is the estimate of
reliability. One major problem with this approach is that you have to be able to generate
lots of items that reflect the same construct. This is often no easy feat. Furthermore, this
approach makes the assumption that the randomly divided halves are parallel or
equivalent. Even by chance this will sometimes not be the case. The parallel forms
approach is very similar to the split-half reliability described below. The major difference
is that parallel forms are constructed so that the two forms can be used independent of
each other and considered equivalent measures. For instance, we might be concerned
about a testing threat to internal validity. If we use Form A for the pretest and Form B
for the posttest, we minimize that problem. It would be even better if we randomly
assigned individuals to receive Form A or B on the pretest and then switched them on
the posttest.
With split-half reliability we have an instrument that we wish to use as a single
measurement instrument and only develop randomly split halves for purposes of
estimating reliability.
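
The random division step described above can be sketched as follows. The responses are
synthetic (generated from a simple one-trait model assumed purely for illustration), and
the procedure follows the passage: divide the items at random, score both halves, and
correlate.

import numpy as np

# Sketch: parallel-forms estimate from a randomly divided item pool.
rng = np.random.default_rng(1)
ability = rng.normal(0, 1, size=(200, 1))                    # one latent trait per person
p_correct = 1 / (1 + np.exp(-ability))                       # simple response model
responses = (rng.random((200, 40)) < p_correct).astype(int)  # 200 people x 40 items

items = rng.permutation(40)                    # randomly divide the item pool
form_a = responses[:, items[:20]].sum(axis=1)  # total score on form A
form_b = responses[:, items[20:]].sum(axis=1)  # total score on form B

r = np.corrcoef(form_a, form_b)[0, 1]
print(f"Parallel-forms reliability estimate: {r:.2f}")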

Internal Consistency Reliability
In internal consistency reliability estimation we use our single measurement instrument
administered to a group of people on one occasion to estimate reliability. In effect we
judge the reliability of the instrument by estimating how well the items that reflect the
same construct yield similar results. We are looking at how consistent the results are for
different items for the same construct within the measure. There are a wide variety of
internal consistency measures that can be used.
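
One widely used coefficient is Cronbach's alpha, offered here as a representative example
(the passage above deliberately does not single out any one measure). Given a
persons-by-items score matrix, it compares the sum of the item variances with the
variance of the total score:

import numpy as np

# Sketch: Cronbach's alpha for a persons-by-items score matrix.
def cronbach_alpha(scores: np.ndarray) -> float:
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical data: 5 people answering 4 items scored 1-5.
scores = np.array([[3, 4, 3, 4],
                   [2, 2, 3, 2],
                   [4, 5, 5, 4],
                   [1, 2, 1, 2],
                   [3, 3, 4, 3]])
print(f"alpha = {cronbach_alpha(scores):.2f}")

The closer alpha is to 1, the more homogeneous the items are, matching the
interpretation of internal consistency given earlier.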
