
Course: Educational Assessment and Evaluation (8602)

Level: M.Ed/ M.A. Semester: Autumn, 2022

Assignment No.2

Level: B.Ed. (1.5 Years)

Name: Ahmed Raza

ID/Roll No: 0000059227



ASSIGNMENT No. 2

Q. No. 1 Define the validity of a test.

ANS.

Reliability is the extent to which test scores are consistent with respect to one or more sources of inconsistency: the selection of specific questions, the choice of raters, or the day and time of testing.

Reliability refers to how dependably or consistently a test measures a characteristic. If a person takes the test again, will he or she get a similar test score, or a much different score? A test that yields similar scores for a person who repeats the test is said to measure a characteristic reliably.

How do we account for an individual who does not get exactly the same test score
every time he or she takes the test? Some possible reasons are the following:

Test taker's temporary psychological or physical state. Test performance can be influenced by a person's psychological or physical state at the time of testing. For example, differing levels of anxiety, fatigue, or motivation may affect the applicant's test results.

Environmental factors. Differences in the testing environment, such as room temperature, lighting, noise, or even the test administrator, can influence an individual's test performance.

Test form. Many tests have more than one version or form. Items differ on each form,
but each form is supposed to measure the same thing. Different forms of a test are
known as parallel forms or alternate forms. These forms are designed to have similar
measurement characteristics, but they contain different items. Because the forms are
not exactly the same, a test taker might do better on one form than on another.
Course: Educational Assessment and Evaluation (8602)
Level: M.Ed/ M.A. Semester: Autumn, 2022

Multiple raters. In certain tests, scoring is determined by a rater's judgments of the test taker's performance or responses. Differences in training, experience, and frame of reference among raters can produce different test scores for the same test taker.

These factors are sources of chance or random measurement error in the assessment process. If there were no random errors of measurement, the individual would get the same test score, the individual's "true" score, each time. The degree to which test scores are unaffected by measurement errors is an indication of the reliability of the test.

Reliable assessment tools produce dependable, repeatable, and consistent information about people. In order to meaningfully interpret test scores and make useful employment or career-related decisions, you need reliable tools. This brings us to the next principle of assessment.

Interpretation of reliability information from test manuals and reviews

Test manuals and independent review of tests provide information on test reliability.
The following discussion will help you interpret the reliability information about any
test.

Types of reliability estimates

There are several types of reliability estimates, each influenced by different sources of
measurement error. Test developers have the responsibility of reporting the reliability
estimates that are relevant for a particular test. Before deciding to use a test, read the
test manual and any independent reviews to determine if its reliability is acceptable.
The acceptable level of reliability will differ depending on the type of test and the
reliability estimate used.
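Several of these reliability estimates reduce to a correlation between two sets of scores. As a rough sketch (the scores and helper function below are invented for illustration, not taken from any test manual), a test-retest reliability estimate can be computed as the Pearson correlation between two administrations of the same test:

```python
# Test-retest reliability as a Pearson correlation.
# All scores below are invented example data.

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x ** 0.5 * var_y ** 0.5)

first_attempt = [72, 85, 60, 90, 78, 66]    # scores on administration 1
second_attempt = [70, 88, 62, 91, 75, 68]   # same examinees, administration 2

reliability = pearson_r(first_attempt, second_attempt)
print(round(reliability, 2))  # a value near 1 indicates consistent scores
```

A coefficient close to 1 suggests that random measurement error is small; as noted above, the acceptable level depends on the type of test and the reliability estimate used.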

Validity.
Validity is the most important issue in selecting a test. Validity refers to what
characteristic the test measures and how well the test measures that characteristic.

Validity tells you if the characteristic being measured by a test is related to job
qualifications and requirements.

Validity gives meaning to the test scores. Validity evidence indicates that there is
linkage between test performance and job performance. It can tell you what you may
conclude or predict about someone from his or her score on the test. If a test has been
demonstrated to be a valid predictor of performance on a specific job, you can
conclude that persons scoring high on the test are more likely to perform well on the
job than persons who score low on the test, all else being equal.

Validity also describes the degree to which you can make specific conclusions or
predictions about people based on their test scores. In other words, it indicates the
usefulness of the test.

It is important to understand the differences between reliability and validity. Validity will tell you how good a test is for a particular situation; reliability will tell you how trustworthy a score on that test will be. You cannot draw valid conclusions from a test score unless you are sure that the test is reliable. Even when a test is reliable, it may not be valid. You should be careful that any test you select is both reliable and valid for your situation.

A test's validity is established in reference to a specific purpose; the test may not be
valid for different purposes. For example, the test you use to make valid predictions
about someone's technical proficiency on the job may not be valid for predicting his or
her leadership skills or absenteeism rate. This leads to the next principle of
assessment.

Similarly, a test's validity is established in reference to specific groups. These groups are called the reference groups. The test may not be valid for different groups. For example, a test designed to predict the performance of managers in situations requiring problem solving may not allow you to make valid or meaningful predictions about the performance of clerical employees. If, for example, the kind of problem-solving ability required for the two positions is different, or the reading level of the test is not suitable for clerical applicants, the test results may be valid for managers, but not for clerical employees.

Test developers have the responsibility of describing the reference groups used to
develop the test. The manual should describe the groups for whom the test is valid,
and the interpretation of scores for individuals belonging to each of these groups. You
must determine if the test can be used appropriately with the particular type of people
you want to test. This group of people is called your target population or target group.

Using validity evidence from outside studies

Conducting your own validation study is expensive, and, in many cases, you may not have enough employees in a relevant job category to make it feasible to conduct a study. Therefore, you may find it advantageous to use professionally developed assessment tools and procedures for which documentation on validity already exists. However, care must be taken to make sure that validity evidence obtained from an "outside" test study can be suitably "transported" to your particular situation, taking into account factors such as reading level, cultural differences, and language barriers.

Q. No. 2 What are the general considerations in constructing essay-type test items?

ANS.

An essay test is a test containing questions that require the examinee to write several paragraphs in his or her own words. Generally, essay tests are designed to measure different abilities of examinees, such as factual knowledge, language proficiency, legible handwriting, organising answers, and time management.

Introduction to Essay Test:



Essay tests are still commonly used tools of evaluation, despite the increasingly wide applicability of short-answer and objective-type questions.

There are certain outcomes of learning (e.g., organising, summarising, integrating ideas and expressing them in one's own way) which cannot be satisfactorily measured through objective-type tests. The importance of essay tests lies in the measurement of such instructional outcomes.

An essay test may give full freedom to the students to write any number of pages.
The required response may vary in length. An essay type question requires the pupil to
plan his own answer and to explain it in his own words. The pupil exercises
considerable freedom to select, organise and present his ideas. Essay type tests
provide a better indication of pupil’s real achievement in learning. The answers
provide a clue to nature and quality of the pupil’s thought process.

That is, we can assess how the pupil presents his ideas (whether his manner of
presentation is coherent, logical and systematic) and how he concludes. In other
words, the answer of the pupil reveals the structure, dynamics and functioning of
pupil’s mental life.

The essay questions are generally thought to be the traditional type of questions which
demand lengthy answers. They are not amenable to objective scoring as they give
scope for halo-effect, inter-examiner variability and intra-examiner variability in
scoring.

Types of Essay Test:

There can be many types of essay tests:

Some of these are given below with examples from different subjects:

1. Selective Recall.

e.g. What was the religious policy of Akbar?

2. Evaluative Recall.

e.g. Why did the First War of Independence in 1857 fail?

3. Comparison of two things—on a single designated basis.

e.g. Compare the contributions made by Dalton and Bohr to Atomic theory.

4. Comparison of two things—in general.

e.g. Compare Early Vedic Age with the Later Vedic Age.

5. Decision—for or against.

e.g. Which type of examination do you think is more reliable, oral or written? Why?

6. Causes or effects.

e.g. Discuss the effects of environmental pollution on our lives.

7. Explanation of the use or exact meaning of some phrase in a passage or a sentence.

e.g., A Joint Stock Company is an artificial person. Explain ‘artificial person’, bringing out the concept of a Joint Stock Company.

8. Summary of some unit of the text or of some article.

9. Analysis

e.g. What was the role played by Mahatma Gandhi in India’s freedom struggle?

10. Statement of relationship.

e.g. Why is knowledge of Botany helpful in studying agriculture?

11. Illustration or examples (your own) of principles in science, language, etc.

e.g. Illustrate the correct use of subject-verb position in an interrogative sentence.

12. Classification.

e.g. Classify the following into Physical change and Chemical change with
explanation. Water changes to vapour; Sulphuric Acid and Sodium Hydroxide react to
produce Sodium Sulphate and Water; Rusting of Iron; Melting of Ice.

13. Application of rules or principles in given situations.

e.g. If you sat halfway between the middle and one end of a see-saw, would a person sitting on the other end have to be heavier or lighter than you in order to make the see-saw balance in the middle? Why?

14. Discussion.

e.g. Partnership is a relationship between persons who have agreed to share the profits of a business carried on by all or any of them acting for all. Discuss the essentials of partnership on the basis of this definition.

15. Criticism—as to the adequacy, correctness, or relevance—of a printed statement or a classmate’s answer to a question on the lesson.

e.g. What is wrong with the following statement?

The Prime Minister is the sovereign Head of State in India.

16. Outline.

e.g. Outline the steps required in computing the compound interest if the principal
amount, rate of interest and time period are given as P, R and T respectively.

17. Reorganization of facts.

e.g. The student is asked to interview some persons and find out their opinion on the
role of UN in world peace. In the light of data thus collected he/she can reorganise
what is given in the text book.

18. Formulation of questions: problems and questions raised.

e.g. After reading a lesson the pupils are asked to raise related problems- questions.

19. New methods of procedure.

e.g. Can you solve this mathematical problem by using another method?

Advantages of the Essay Tests:

1. It is relatively easier to prepare and administer a six-question extended-response essay test than to prepare and administer a comparable 60-item multiple-choice test.

2. It is the only means that can assess an examinee’s ability to organise and present his
ideas in a logical and coherent fashion.

3. It can be successfully employed for practically all the school subjects.

4. Some objectives, such as the ability to organise ideas effectively, the ability to criticise or justify a statement, and the ability to interpret, can be best measured by this type of test.

5. Logical thinking and critical reasoning, systematic presentation, etc. can be best developed by this type of test.

6. It helps to induce good study habits such as making outlines and summaries,
organising the arguments for and against, etc.

7. The students can show their initiative, the originality of their thought and the
fertility of their imagination as they are permitted freedom of response.

8. The responses of the students need not be completely right or wrong. All degrees of
comprehensiveness and accuracy are possible.

9. It largely eliminates guessing.

10. They are valuable in testing the functional knowledge and power of expression of
the pupil.

Limitations of Essay Tests:



1. One of the serious limitations of the essay tests is that these tests do not give scope
for larger sampling of the content. You cannot sample the course content so well with
six lengthy essay questions as you can with 60 multiple-choice test items.

2. Such tests encourage selective reading and emphasise cramming.

3. Moreover, scoring may be affected by spelling, good handwriting, coloured ink, neatness, grammar, length of the answer, etc.

4. The long-answer type questions are less valid and less reliable, and as such they
have little predictive value.

5. It requires excessive time on the part of students to write answers; while assessing, reading essays is very time-consuming and laborious for the examiner.

6. It can be assessed only by a teacher or competent professionals.

7. Improper and ambiguous wording handicaps both the students and valuers.

8. Mood of the examiner affects the scoring of answer scripts.

9. There is a halo effect: biased judgement based on previous impressions.

10. The scores may be affected by the examiner's personal bias or partiality for a particular point of view, his way of understanding the question, his weightage to different aspects of the answer, favouritism and nepotism, etc.

Thus, the potential disadvantages of essay type questions are:

(i) Poor predictive validity,

(ii) Limited content sampling,

(iii) Score unreliability, and

(iv) Scoring constraints.

Suggestions for Improving Essay Tests:



The teacher can sometimes, through essay tests, gain improved insight into a student’s
abilities, difficulties and ways of thinking and thus have a basis for guiding his/her
learning.

(A) While Framing Questions:

1. Give adequate time and thought to the preparation of essay questions, so that they
can be re-examined, revised and edited before they are used. This would increase the
validity of the test.

2. The item should be so written that it will elicit the type of behaviour the teacher
wants to measure. If one is interested in measuring understanding, he should not ask a
question that will elicit an opinion; e.g.,

“What do you think of Buddhism in comparison to Jainism?”

3. Use words which themselves give directions e.g. define, illustrate, outline, select,
classify, summarise, etc., instead of discuss, comment, explain, etc.

4. Give specific directions to students to elicit the desired response.

5. Indicate clearly the value of the question and the time suggested for answering it.

6. Do not provide optional questions in an essay test because—

(i) It is difficult to construct questions of equal difficulty;

(ii) Students do not have the ability to select those questions which they will answer
best;

(iii) A good student may be penalised because he is challenged by the more difficult
and complex questions.

7. Prepare and use a relatively large number of questions requiring short answers
rather than just a few questions involving long answers.

8. Do not start essay questions with such words as list, who, what, whether. If we begin the questions with such words, they are likely to be short-answer questions and not essay questions, as we have defined the term.

9. Adapt the length of the response and complexity of the question and answer to the
maturity level of the students.

10. The wording of the questions should be clear and unambiguous.

11. It should be a power test rather than a speed test. Allow a liberal time limit so that
the essay test does not become a test of speed in writing.

12. Supply the necessary training to the students in writing essay tests.

13. Questions should be graded from simple to complex so that all the testees can answer at least a few questions.

14. Essay questions should provide value points and marking schemes.

(B) While Scoring Questions:

1. Prepare a marking scheme, suggesting the best possible answer and the weightage
given to the various points of this model answer. Decide in advance which factors will
be considered in evaluating an essay response.

2. While assessing the essay response, one must:

a. Use appropriate methods to minimise bias;

b. Pay attention only to the significant and relevant aspects of the answer;

c. Be careful not to let personal idiosyncrasies affect assessment;

d. Apply a uniform standard to all the papers.

3. The examinee’s identity should be concealed from the scorer. In this way we can avoid the "halo effect" or bias which may affect the scoring.

4. Check your marking scheme against actual responses.



5. Once the assessment has begun, the standard should not be changed, nor should it
vary from paper to paper or reader to reader. Be consistent in your assessment.

6. Grade only one question at a time for all papers. This will help you in minimising the halo effect, in becoming thoroughly familiar with just one set of scoring criteria, and in concentrating completely on them.

7. The mechanics of expression (legibility, spelling, punctuation, grammar) should be judged separately from what the student writes, i.e. the subject matter content.

8. If possible, have two independent readings of the test and use the average as the
final score.
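Suggestion 8 above, two independent readings averaged into a final score, can be sketched in a few lines. The question labels and marks below are invented for illustration:

```python
# Two independent raters mark each essay question against the same
# marking scheme; the final mark is the average of the two readings.
# Question labels and marks are invented example data.
rater_a = {"Q1": 7, "Q2": 5, "Q3": 8}   # marks out of 10, first reader
rater_b = {"Q1": 8, "Q2": 4, "Q3": 8}   # marks out of 10, second reader

final = {q: (rater_a[q] + rater_b[q]) / 2 for q in rater_a}
total = sum(final.values())
print(final, total)  # averaging dampens any single reader's bias
```

Averaging two readings dampens the effect of any one examiner's mood, bias, or idiosyncrasies on the final score.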

Q. No. 3 Write the steps of constructing frequency distribution tables.

ANS.

Frequency distribution tables are used to organize and summarize large amounts of data into more manageable chunks. They help to identify patterns, trends, and insights that may not be immediately apparent when looking at the raw data. Constructing a frequency distribution table involves several steps, which we will outline below:

1. Determine the range of values: The first step is to determine the range of values that your data set covers. This can be done by finding the minimum and maximum values in your data set. For example, if your data set consists of the ages of 50 individuals, you might find that the youngest person is 18 years old and the oldest person is 60 years old, giving you a range of 42 years.

2. Determine the class intervals: Once you have the range of values, the next step is to determine the class intervals. Class intervals are ranges of values that will be used to group the data into categories or bins. The size of the class intervals will depend on the range of values and the number of categories you want to create. A common rule of thumb for choosing the number of categories is to take the square root of the total number of data points; for 50 data points this suggests about 7 categories. If your data range from 18 to 60, class intervals of 5 years work well, resulting in 9 categories (18-22, 23-27, 28-32, etc.).

3. Count the frequencies: Once you have established your class intervals, the next step is to count the frequencies. The frequency is the number of data points that fall within each class interval. To do this, you would go through your data set and count how many data points fall into each category. For example, with 50 data points and class intervals of 5 years, you might find that there are 6 people between the ages of 18-22, 12 people between the ages of 23-27, and so on.

4. Create the table: The final step is to create the frequency distribution table. This table will show the class intervals, the frequency (number of data points) that falls within each interval, and the percentage of the total data set that falls within each interval. You may also want to include cumulative frequencies, which show the total number of data points that fall within each interval and all previous intervals. Here is an example of what a frequency distribution table might look like:



Age Group   Frequency   Percentage   Cumulative Frequency
18-22       6           12%          6
23-27       12          24%          18
28-32       10          20%          28
33-37       8           16%          36
38-42       7           14%          43
43-47       4           8%           47
48-52       2           4%           49
53-57       1           2%           50
58-60      0           0%           50

In this example, we can see that the most common age range is 23-27, which accounts for 24% of the total data set. We can also see that 72% of the data (a cumulative frequency of 36 out of 50) falls within the ages 18 to 37.
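The four steps above can be sketched in code. The ages below are randomly generated example data, and the 5-year interval width follows the worked example:

```python
# A minimal sketch of the four steps, using generated example ages.
import random

random.seed(1)
ages = [random.randint(18, 60) for _ in range(50)]  # hypothetical data set

# Step 1: determine the range of values.
low, high = min(ages), max(ages)

# Step 2: choose class intervals (width 5, as in the example above).
width = 5
intervals = [(start, start + width - 1) for start in range(18, 61, width)]

# Step 3: count the frequency within each interval.
freq = {iv: sum(1 for a in ages if iv[0] <= a <= iv[1]) for iv in intervals}

# Step 4: print the table with percentages and cumulative frequencies.
cumulative = 0
for (start, end), f in freq.items():
    cumulative += f
    print(f"{start}-{end}\t{f}\t{100 * f / len(ages):.0f}%\t{cumulative}")
```

The cumulative count in the last column should always end at the total number of data points, which is a quick way to check the table for counting errors.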

Q. No. 4 Discuss how to inform parents about their children's performance.

ANS.

When it comes to informing parents about their children's performance, it is important to provide accurate and objective information that can help parents understand their child's strengths and weaknesses. Here are some steps that can be taken to effectively communicate with parents:

1. Schedule a meeting: It is best to schedule a meeting with the parents to discuss their child's performance. This allows for a face-to-face interaction and an opportunity to discuss any concerns or questions they may have. The meeting should be scheduled at a mutually convenient time for both the parents and the teacher.

2. Prepare for the meeting: Before the meeting, the teacher should prepare by gathering information on the child's performance, including test scores, grades, attendance, and any behavioral concerns. It is also helpful to have examples of the child's work to share with the parents.

3. Start with positive feedback: It is important to begin the meeting with positive feedback on the child's strengths and accomplishments. This helps to build a positive relationship with the parents and can increase their receptiveness to any constructive feedback.

4. Share specific areas of concern: After providing positive feedback, the teacher should discuss specific areas of concern. It is important to be specific and objective in discussing any issues, and to provide evidence to support any claims. It is also helpful to discuss any strategies or resources that can be used to help address any issues.

5. Listen to the parents' perspective: It is important to listen to the parents' perspective and any concerns they may have. The teacher should acknowledge their concerns and work with them to develop a plan to address any issues.

6. Develop an action plan: Together with the parents, the teacher should develop an action plan to help the child improve their performance. This plan should include specific goals, strategies, and timelines. The teacher should also provide regular updates to the parents on their child's progress.

7. Follow up: It is important to follow up with the parents after the meeting to ensure that they understand the information that was shared and to answer any additional questions they may have. The teacher should also continue to provide regular updates on the child's progress and any changes to the action plan.

In summary, effective communication with parents about their child's performance involves scheduling a meeting, preparing for the meeting, starting with positive feedback, sharing specific areas of concern, listening to the parents' perspective, developing an action plan, and following up regularly. By following these steps, teachers can help parents understand their child's performance and work together to support their academic and personal growth.

Q. No. 5 Write a note on the advantages and disadvantages of norm-referenced testing.

ANS.

Norm-referenced testing is a type of assessment that compares an individual's performance to the performance of a larger group or normative sample. While norm-referenced testing can be useful in certain situations, it has both advantages and disadvantages.
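The comparison to a normative sample can be made concrete with a percentile rank, a common norm-referenced statistic. The sample scores below are invented for this sketch:

```python
# Norm-referenced interpretation: where does one examinee's raw score
# stand relative to a normative sample? The norm scores are invented.
norm_sample = [48, 52, 55, 57, 60, 61, 63, 65, 68, 72]

def percentile_rank(score, norms):
    """Percentage of the normative sample scoring at or below `score`."""
    return 100 * sum(1 for s in norms if s <= score) / len(norms)

print(percentile_rank(63, norm_sample))  # 7 of 10 norms are <= 63 -> 70.0
```

Note that the percentile rank says only how the examinee stands relative to the reference group; it says nothing about whether any particular skill has been mastered, which is the point taken up under the disadvantages below.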

Advantages of norm-referenced testing:

1. Standardization: Norm-referenced tests are often standardized, which means that they are administered and scored consistently. This can help ensure that the results are reliable and valid.

2. Comparison: The use of a normative sample allows for comparisons to be made between individuals or groups. This can help identify strengths and weaknesses, and can be useful in making decisions about placement or selection.

3. Objectivity: Norm-referenced testing is often based on objective criteria, such as multiple-choice questions or scoring rubrics. This can help reduce bias and ensure that the results are fair and accurate.

Disadvantages of norm-referenced testing:

1. Limited information: Norm-referenced tests typically only provide information on how an individual's performance compares to others in the normative sample. This can be useful in some contexts, but may not provide a complete picture of an individual's abilities or potential.

2. Pressure to perform: Because norm-referenced tests compare individuals to others, there can be pressure to perform well. This can create anxiety and may not accurately reflect an individual's true abilities.

3. Lack of individualization: Norm-referenced tests are typically not designed to assess individual differences or unique strengths and weaknesses. This can limit their usefulness in certain situations.

4. Limited scope: Norm-referenced tests are often focused on specific areas, such as language or math skills. This can be limiting for individuals who have strengths or interests in other areas.



In conclusion, while norm-referenced testing can be useful in certain contexts, it is important to consider its advantages and disadvantages when selecting an assessment tool. It may be more appropriate to use a different type of assessment, such as criterion-referenced testing or performance-based assessment, depending on the situation and the goals of the assessment.
