Assessment 1. Module – 2022
Firdauz
Introduction:
This course focuses on the principles, development, and utilization of conventional assessment tools to
improve the teaching-learning process. It emphasizes the use of assessment of, as, and for learning in
measuring knowledge, comprehension, and other thinking skills in the cognitive, psychomotor, and
affective domains. It allows the students to go through the standard steps in test construction and
development and their application in grading systems.
Course Outcomes:
1. Discuss the characteristics of the different concepts in assessment of learning
2. Distinguish among measurement, evaluation, and assessment in varied classroom settings
3. Characterize assessments that will influence instructional decisions
4. Cite differences between standardized testing and classroom assessments
5. Recall Bloom's taxonomy of educational objectives
6. Distinguish desirable qualities of good test instruments as bases for judging the quality of
classroom assessment
7. Plan a design of a table of specifications for the (major) subject to be tested
8. Describe the differences between norm-referenced and criterion-referenced interpretations of
assessment performance in terms of how the scores are reported
9. Explain the meaning and function of the measures of central tendency and variability
10. Explain the meaning of normal and skewed score distributions
Course Outline:
MODULE 1 – BASIC CONCEPTS IN ASSESSMENT OF LEARNING
MODULE 2 - PRINCIPLES OF HIGH-QUALITY CLASSROOM ASSESSMENT
MODULE 3 – DEVELOPMENT OF CLASSROOM TOOLS FOR MEASURING KNOWLEDGE AND
UNDERSTANDING
MODULE 4: DESCRIPTION OF ASSESSMENT DATA
MODULE 5: INTERPRETATION AND UTILIZATION OF TEST RESULTS
Prelim
Topics
MODULE 1 – BASIC CONCEPTS IN ASSESSMENT OF LEARNING
MODULE 2 - PRINCIPLES OF HIGH-QUALITY CLASSROOM ASSESSMENT
Pre-Assessment Learning Activities:
Answer the following. Write your answers on a clean sheet of paper.
1. What is the concept of Assessment?
2. What is the purpose of assessment?
3. What is assessment in education?
Lesson Discussion:
ASSESSMENT OF LEARNING 1
Assessment – refers to the process of gathering, describing, or quantifying information about student
performance. It includes paper-and-pencil tests, extended responses (e.g., essays), and performance
assessment tasks, usually referred to as "authentic assessment" tasks (e.g., presentation of research work).
Evaluation – refers to the process of examining the performance of students. It also determines whether
or not the student has met the lesson's instructional objectives.
Test – an instrument or systematic procedure designed to measure the quality, ability, skill, or
knowledge of students by giving a set of questions in a uniform manner. Since a test is a form of assessment,
tests also answer the question "How does an individual student perform?"
Testing – a method used to measure the level of achievement or performance of the learners. It also
refers to the administration, scoring, and interpretation of an instrument (procedure) designed to elicit
information about performance in a sample of a particular area of behavior.
Types of Measurement
There are two ways of interpreting student performance in relation to classroom instruction: the
norm-referenced test and the criterion-referenced test.
A norm-referenced test is a test designed to measure the performance of a student compared with other
students. Each individual is compared with other examinees and assigned a score, usually expressed as a
percentile, a grade-equivalent score, or a stanine. The achievement of students is reported for broad skill
areas, although some norm-referenced tests do report student achievement for individual skills.
The purpose is to rank each student with respect to the achievement of others in broad areas of knowledge
and to discriminate between high and low achievers.
A criterion-referenced test is a test designed to measure the performance of students with respect to some
particular criterion or standard. Each individual is compared with a predetermined set of standards for
acceptable achievement; the performance of the other examinees is irrelevant. A student's score is
usually expressed as a percentage, and student achievement is reported for individual skills.
The purpose is to determine whether each student has achieved specific skills or concepts, and to find out
how much students know before instruction begins and after it has finished.
Other terms less often used for criterion-referenced are objective-referenced, domain-referenced,
content-referenced, and universe-referenced.
Robert L. Linn and Norman E. Gronlund (1995) pointed out the common characteristics and
differences of norm-referenced tests and criterion-referenced tests.
TYPES OF ASSESSMENT
There are four types of assessment in terms of their functional role in relation to classroom instruction:
placement assessment, diagnostic assessment, formative assessment, and summative
assessment.
A. Placement Assessment is concerned with the entry performance of the student. The purpose of
placement evaluation is to determine the prerequisite skills, the degree of mastery of the course
objectives, and the best mode of learning.
B. Diagnostic Assessment is a type of assessment given before instruction. It aims to identify the
strengths and weaknesses of the students regarding the topics to be discussed.
Assessment can be made precise, accurate, and dependable only if what is to be achieved is
clearly stated and feasible. The learning targets, involving knowledge, reasoning, skills, products,
and effects, need to be stated in behavioral terms which denote something that can be observed
through the behavior of the students.
Cognitive Targets
Benjamin Bloom (1956) proposed a hierarchy of educational objectives at the cognitive level.
These include:
Application – transfer of knowledge from one field of study to another, or from one concept to
another concept in the same discipline
Analysis – breaking down of a concept or idea into its components and explaining the concept
as a composition of these components
Synthesis – opposite of analysis; entails putting together the components in order to summarize
the concept
Evaluation and Reasoning – valuing and judgment, or putting the "worth" of a concept or
principle
These learning targets fall under three domains: Cognitive, Affective, and Psychomotor.
Written-Response Instruments
Objective tests – appropriate for assessing the various levels of the hierarchy of educational
objectives
Essays – can test the students' grasp of the higher-level cognitive skills
Checklists – lists of several characteristics or activities presented to the subjects of a study, where
they will analyze and place a mark opposite each characteristic. They are used to rate products
like book reports, maps, charts, diagrams, notebooks, and creative endeavors, and need to be
developed to assess various products over the years.
Oral Questioning – appropriate assessment method when the objectives are to:
3. VALIDITY
Types of Validity
MORE ON VALIDITY
Face validity – outward appearance of the test, the lowest form of test validity
4. RELIABILITY
Something reliable is something that works well and that you can trust.
A reliable test is a consistent measure of what it is supposed to measure.
Questions:
Can we trust the results of the test?
Would we get the same results if the tests were taken again and scored by a different
person?
Tests can be made more reliable by making them more objective (controlled items).
Reliability is the extent to which an experiment, test, or any measuring procedure yields
the same result on repeated trials.
Equivalency reliability is the extent to which two items measure identical concepts at an
identical level of difficulty. Equivalency reliability is determined by relating two sets of
test scores to one another to highlight the degree of relationship or association.
Internal consistency is the extent to which tests or procedures assess the same
characteristic, skill or quality. It is a measure of the precision between the observers or of
the measuring instruments used in a study.
Interrater reliability is the extent to which two or more individuals (coders or raters)
agree. Interrater reliability addresses the consistency of the implementation of a rating
system.
Split-half method – the test is split into two halves and the consistency between them is
computed. Reliability is calculated using the Spearman-Brown prophecy formula or the
Kuder-Richardson formulas (KR-20 and KR-21).
Test-retest method – consistency of test results when the same test is administered at two
different time periods, determined by correlating the two sets of test results.
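The split-half idea above can be sketched in a few lines of Python. The student scores below are hypothetical; the Pearson correlation between the two halves is computed directly, then stepped up with the Spearman-Brown formula:

```python
from statistics import mean, pstdev

def pearson_r(x, y):
    # Pearson correlation between the two halves of the test.
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))

def spearman_brown(r_half):
    # Step the half-test correlation up to an estimate of full-test reliability.
    return 2 * r_half / (1 + r_half)

# Hypothetical scores of six students on the odd- and even-numbered items.
odd_half = [5, 7, 6, 8, 4, 9]
even_half = [6, 8, 5, 9, 5, 8]

r = pearson_r(odd_half, even_half)
print(round(spearman_brown(r), 2))
```

The closer the result is to 1, the more internally consistent the test.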
5. FAIRNESS
Opportunity to learn
Prerequisite knowledge and skills
Avoiding teacher stereotype
Avoiding bias in assessment tasks and procedures
6. POSITIVE CONSEQUENCES
Learning assessments provide students with effective feedback and potentially improve their
motivation and/or self-esteem. Moreover, assessments of learning gives students the tools to
assess themselves and understand how to improve positive consequence on students, teachers,
parents, and other stakeholders
7. PRACTICALITY
Tests can be made more practical by making them more objective (more controlled items).
8. ETHICS
Informed consent
Anonymity and Confidentiality
1. Gathering data
2. Recording Data
3. Reporting Data
Learning References:
[1] Navarro, R.L., Santos, R.G., and Corpuz, B.B. (2019). Assessment of Learning, OBE & PPST Based, Fourth Edition. Lorimar Publishing, Inc.
[2] Bartlett, J. (2015). Outstanding Assessment for Learning in the Classroom. Routledge, Taylor & Francis Group.
[3] Frey, N. and Fisher, D. (2011). The Formative Assessment Action Plan. USA: ASCD.
[4] Lewin, L. and Shoemaker, B. (2011). Great Performances: Creating Classroom-Based Assessment Tasks, 2nd Edition. USA: ASCD.
[5] Ecclestone, K. et al. (2010). Transforming Formative Assessment in Lifelong Learning. UK: McGraw-Hill Open University Press.
[6] Airasian, P.W. (2005). Classroom Assessment: Concepts and Applications. New York: McGraw-Hill Companies, Inc.
[7] Shermis, M.D. and Di Vesta, F.J. (2011). Classroom Assessment in Action. Rowman & Littlefield Publishers, Inc.
[8] Anderson, L.W. (2003). Classroom Assessment: Enhancing the Quality of Teacher Decision Making. Lawrence Erlbaum Associates, Publishers.
Post-Assessment Learning Activities: Answer the following. Write your answers on a clean sheet of paper.
1. Summarize the topics covered in the Prelim period
2. In your own understanding, what is assessment all about?
3. What is the difference between norm-referenced and criterion-referenced testing?
4. Differentiate validity and reliability
5. Give at least 2 common characteristics of norm-referenced and criterion-referenced tests, then discuss
Midterm
Topics: MODULE 3 – DEVELOPMENT OF CLASSROOM TOOLS FOR MEASURING KNOWLEDGE AND
UNDERSTANDING
Pre-Assessment Learning Activities: Answer the following. Write your answers on a clean sheet of paper.
1. What are classroom tools? Discuss your answer
Lesson Discussion:
CLASSIFICATION OF TESTS

Purpose. A psychological test aims to measure students' intelligence or mental ability to a large degree
without reference to what the students have learned; it measures the intangible characteristics of an
individual (e.g., aptitude tests, personality tests, intelligence tests). An educational test aims to measure
the results of instruction and learning (e.g., performance tests).

Scope. A survey test covers a broad range of content, and its result is interpreted by comparing one
student's performance with that of others. A mastery test covers a specific objective, measures
fundamental skills and abilities, is typically constructed by the teacher, and is criterion-referenced: its
result is interpreted by comparing the student's performance against a predefined performance standard.

Interpretation.
Norm-referenced: some will really pass; there is competition for a limited percentage of high scores;
describes a pupil's performance compared to others.
Criterion-referenced: all or none may pass; there is no competition for a limited percentage of high
scores; describes a pupil's mastery of the course objectives.

Language Mode.
Verbal: words are used by students in attaching meaning to or responding to test items.
Non-verbal: students do not use words in attaching meaning to or in responding to test items
(e.g., graphs, numbers, 3-D objects).

Construction.
Standardized: constructed by a professional item writer; covers a broad range of content in a subject
area; uses mainly multiple-choice items; the items written are screened and the best items are chosen
for the final instrument; can be scored by machine; interpretation of results is usually norm-referenced.
Informal: constructed by a classroom teacher; covers a narrow range of content; various types of items
are used; the teacher picks or writes items as needed for the test; scored manually by the teacher;
interpretation is usually criterion-referenced.

Manner of Administration. Individual or group.

Time Emphasis. Power test or speed test.
the initial structure is limited and not detailed, unlike the completion test. For example, two cartoons
are given and a dialogue is to be written.
4. Expression Techniques – people are asked to express the feelings or
attitudes of other people.
d. At least 4 glasses
Better: What is the daily minimum required amount of milk a 10-year-old child
should drink?
a. 1 glass
b. 2 glasses
c. 3 glasses
d. 4 glasses
9. When possible, present alternatives in some logical order (chronological, most to
least, alphabetical).
At 7 a.m. two trucks leave a diner and travel north. One truck averages 42 miles
per hour and the other truck averages 38 miles per hour. At what time will they be 24 miles apart?
Undesirable Desirable
a. 6 p.m. a. 1 a.m.
b. 9 a.m. b. 6 a.m.
c. 1 a.m. c. 9 a.m.
d. 1 p.m. d. 1 p.m.
e. 6 a.m. e. 6 p.m.
10. Be sure there is only one correct or best response to the item.
Poor: The two most desired characteristics in a classroom test are validity and
a. Precision
b. Reliability*
c. Objectivity
d. Consistency*
Best: The two most desired characteristics in a classroom test are validity and
a. Precision
b. Reliability*
c. Objectivity
d. Standardization
11. Make alternatives approximately equal in length.
Poor: The most general cause of low individual incomes in the US is
a. Lack of valuable productive services to sell*
b. Unwillingness to work
c. Automation
d. Inflation
Better: What is the most general cause of low individual incomes in the US?
a. A lack of valuable productive services to sell*
b. The population’s overall unwillingness to work
c. The nation's increased reliance on automation
In general, matching items consist of a column of stimuli presented on the left side
of the exam page and a column of responses placed on the right side of the page. Students are
required to match the response associated with a given stimulus.
Advantages of Using Matching Test Items
1. Require a short period of reading and response time, allowing the teacher to cover
more content.
2. Provide objective measurement of student achievement or ability.
3. Provide highly reliable test scores.
4. Provide scoring efficiency and accuracy.
Disadvantages of Using Matching Test Items
1. Have difficulty measuring learning objectives requiring more than simple recall
of information.
2. Are difficult to construct due to the problem of selecting a common set of stimuli
and responses.
Suggestions for Writing Matching Test items
1. Include directions which clearly state the basis for matching the stimuli with the
responses. Explain whether or not the response can be used more than once and indicate
where to write the answer.
Poor: Directions: Match the following.
Better: Directions: On the line to the left of each identifying location and
characteristic in Column I, write the letter of the country in Column II that is best defined.
Each country in Column II may be used more than once.
2. Use only homogeneous material in matching items.
Poor: Directions: Match the following.
1. Water A. NaCl
2. Discovered Radium B. Fermi
3. Salt C. NH3
4. Year of the First Nuclear Fission by man D. 1942
5. Ammonia E. Curie
Better: Directions: On the line to the left of each compound in Column I, write
the letter of the compound's formula presented in Column II. Use each formula only once.
Column I Column II
1. Water A. H2SO4
2. Salt B. HCl
3. Ammonia C. NaCl
4. Sulfuric Acid D. H2O
E. NH3
Principle 3: Balanced
- A balanced assessment sets targets in all domains of learning (cognitive,
affective, and psychomotor) or domains of intelligence (verbal-linguistic, logical-mathematical,
bodily-kinesthetic, visual-spatial, musical-rhythmic, interpersonal-social,
intrapersonal-introspection, physical world-natural-existential-spiritual).
- A balanced assessment makes use of both traditional and alternative assessment.
Principle 4: Validity
Validity is the degree to which the assessment instrument measures what it intends
to measure.
It also refers to the usefulness of the instrument for a given purpose.
It is the most important criterion of a good assessment instrument.
Ways in Establishing Validity
1. Face Validity- is done by examining the physical appearance of the
instrument.
2. Content Validity- is done through a careful and critical examination of the
objectives of assessment so that it reflects the curricular objectives.
3. Criterion-related Validity – is established statistically, such that the set of scores
revealed by the measuring instrument is correlated with the scores obtained in another
external predictor or measure.
It has two purposes:
a. Concurrent Validity – describes the present status of the individual by correlating the sets of
scores obtained from two measures given concurrently.
Example: Relate the reading test result with pupils’ average grades in reading given by the
teacher.
b. Predictive Validity – describes the future performance of an individual by
correlating the sets of scores obtained from two measures given at a longer time
interval.
Example: The entrance examination scores of a freshman class at the beginning of the school
year are correlated with the students' average grades at the end of the school year.
4. Construct Validity – established by analyzing the activities and processes
that correspond to a particular concept; it is established statistically by comparing psychological
traits or factors that theoretically influence scores in a test.
a. Convergent validity helps to establish construct validity when you use two different
measurement procedures and research methods (e.g., participant observation and a survey) in
your study to collect data about a construct (e.g., anger, depression, motivation, task
performance).
b. Divergent validity helps to establish construct validity by demonstrating that the construct
you are interested in (e.g., anger) is different from other constructs that might be present in
your study (e.g., depression).
A table of specifications is a device for describing test items in terms of the content and process
dimensions, that is, what a student is expected to know and what he or she is expected to do with that
knowledge. Each item is described by a combination of content and process in the table of specifications.
The number of items allotted to a topic is proportional to the number of class sessions spent on it:

Number of items = (sessions on the topic × total number of items) ÷ total number of sessions

Example: For a 40-item test covering 20 class sessions, a topic taught for 2 sessions gets

Number of items = (2 × 40) ÷ 20 = 4

Sessions distribution (partial):
Topic | Sessions | Number of Items
1. Definition of linear function | 2 | 4
2. Slope of a line | 2 | 4
TOTAL | 20 | 40
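The item allocation above is a single proportion, which can be sketched directly (the function name is ours, chosen for illustration):

```python
def items_for_topic(topic_sessions, total_sessions, total_items):
    # Allocate test items in proportion to the class sessions spent on the topic.
    return topic_sessions * total_items / total_sessions

# 2 of 20 sessions on a 40-item test -> 4 items for the topic.
print(items_for_topic(2, 20, 40))
```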
Learning References: same as references [1]–[8] listed under Prelim.
Final
Topics: MODULE 4: DESCRIPTION OF ASSESSMENT DATA
MODULE 5: INTERPRETATION AND UTILIZATION OF TEST RESULTS
Lesson Discussion:
MODULE 4: DESCRIPTION OF ASSESSMENT DATA
ITEM ANALYSIS
Item analysis refers to the process of examining the students' responses to each item in the test.
According to Abubakar S. Asaad and William M. Hailaya (Measurement and Evaluation: Concepts and
Principles, Rex Bookstore, 2004 Edition), there are two characteristics of an item: desirable
and undesirable. An item that has desirable characteristics can be retained for subsequent
use; one with undesirable characteristics is either revised or rejected.
a. Difficulty of an item
b. Discriminating power of an item
c. Measures of attractiveness
The difficulty index refers to the proportion of the number of students in the upper and lower groups who
answered an item correctly. In a classroom achievement test, the desired indices of difficulty are not lower
than 0.20 nor higher than 0.80, with an average index of difficulty from 0.30 or 0.40 to a maximum of 0.60.

DF = (PUG + PLG) / 2

where PUG and PLG are the proportions of the upper and lower groups who answered the item correctly.

0.00 – 0.20 Very Difficult
0.21 – 0.40 Difficult
0.41 – 0.60 Moderately Difficult
0.61 – 0.80 Easy
0.81 – 1.00 Very Easy
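A minimal sketch of the difficulty index and its verbal label; the cutoffs follow the scale above, and the function names are ours:

```python
def difficulty_index(pug, plg):
    # DF = (PUG + PLG) / 2, the average proportion answering correctly.
    return (pug + plg) / 2

def difficulty_level(df):
    # Map a difficulty index onto the verbal scale above.
    if df <= 0.20:
        return "Very Difficult"
    if df <= 0.40:
        return "Difficult"
    if df <= 0.60:
        return "Moderately Difficult"
    if df <= 0.80:
        return "Easy"
    return "Very Easy"

# 27% of the upper group and 18% of the lower group got the item right.
print(difficulty_level(difficulty_index(0.27, 0.18)))
```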
Index of Discrimination
The discrimination index is the difference between the proportion of high-performing students who got the
item right and the proportion of low-performing students who got the item right. The high- and low-performing
groups are usually defined as the upper 27% and the lower 27% of the students based on the total
examination score. Discrimination is classified as positive discrimination if the proportion of students
who got the item right in the upper group is greater than that in the lower group, negative discrimination
if the proportion in the upper group is less than that in the lower group, and zero discrimination if the
proportions in the upper and lower groups are equal.
Maximum discrimination is the sum of the proportions of the upper and lower groups who answered the
item correctly. The maximum possible discrimination occurs when half or less of the combined upper and
lower groups answered the item correctly.
Formulas:
Di = PUG − PLG
DM = PUG + PLG
DE = Di / DM
where Di is the discrimination index, DM the maximum discrimination, and DE the
discriminating efficiency.
Example: Eighty students took an examination in Algebra. For item number 6, six students in the upper
group and four students in the lower group got the correct answer. Find the discriminating efficiency.

Given: 27% of 80 = 21.6, or 22, which means that there are 22 students in the upper performing group and
22 students in the lower performing group. Thus PUG = 6/22 ≈ 27% and PLG = 4/22 ≈ 18%.

Di = PUG − PLG = 27% − 18% = 9%
DM = PUG + PLG = 27% + 18% = 45%
DE = Di / DM = 0.09 / 0.45 = 0.20 or 20%
This can be interpreted to mean that, on the average, the item is discriminating at 20% of the potential of
an item of its difficulty.
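The worked example above can be reproduced in a short Python sketch; the proportions are rounded to whole percents, as in the example:

```python
def discrimination_stats(upper_correct, lower_correct, group_size):
    # Proportions of the upper and lower 27% groups answering correctly,
    # rounded to two decimals (whole percents) as in the worked example.
    pug = round(upper_correct / group_size, 2)
    plg = round(lower_correct / group_size, 2)
    di = pug - plg  # discrimination index
    dm = pug + plg  # maximum discrimination
    de = di / dm    # discriminating efficiency
    return di, dm, de

# 6 of 22 upper-group and 4 of 22 lower-group students got item 6 right.
di, dm, de = discrimination_stats(6, 4, 22)
print(round(di, 2), round(dm, 2), round(de, 2))
```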
Measures of Attractiveness
To measure the attractiveness of the incorrect options (distracters) in multiple-choice tests, we count the
number of students who selected each incorrect option in both the upper and lower groups. An incorrect
option is said to be an effective distracter if more students in the lower group chose that
incorrect option than students in the upper group.
1. Rank the scores of the students from highest score to lowest score.
2. Select 27% of the papers within the upper performing group and 27% of the papers within the
lower performing group.
3. Set aside the remaining 46% of the papers because they will not be used for the item analysis.
4. Tabulate the number of students in the upper group and lower group who selected each
alternative.
5. Compute the difficulty of each item.
6. Compute the discriminating power of each item.
7. Evaluate the effectiveness of the distracters.
We shall discuss the different statistical techniques used in describing and analyzing test results,
including the measures of central tendency, the measures of variability, and skewness.
A measure of central tendency is a single value that is used to identify the center of the data; it is thought
of as the typical value in a set of scores. It tends to lie within the center when the data are arranged from
lowest to highest or vice versa. There are three measures of central tendency commonly used: the mean,
the median, and the mode.
The Mean
The mean is the most common measure of center and is also known as the arithmetic average.

Sample Mean = ∑x / n

where x = individual scores and n = number of scores.

Example: Find the mean of the scores of 10 students in an algebra quiz:
45, 35, 48, 60, 44, 39, 47, 55, 58, 54

∑x = 485, n = 10
Mean = ∑x / n = 485 ÷ 10 = 48.5
Properties of the Mean
1. It is easy to compute.
2. It may be an actual observation in the data set.
3. It can be subjected to numerous mathematical computations.
4. It is the most widely used measure of center.
5. Every score in the distribution affects its value.
6. It is easily affected by extreme values.
7. It is applied to interval-level data.
The Median
The median is the point that divides the scores in a distribution into two equal parts when the scores are
arranged according to magnitude, that is, from lowest to highest or from highest to lowest. If the number
of scores is odd, the median is the middle score. When the number of scores is even, the median is the
average of the two middle scores.
Example: Find the median of the same 10 scores. First, arrange the scores from lowest to highest and
find the average of the two middlemost scores, since the number of cases is even.
35, 39, 44, 45, 47, 48, 54, 55, 58, 60

Median = (47 + 48) / 2 = 47.5

The two middlemost scores are the 5th and 6th scores, 47 and 48, so the median is 47.5, which means
that 50% of the scores fall below 47.5.
Properties of Median
The Mode
The mode refers to the score or scores that occur most often in the distribution. There are three
classifications of mode: a) unimodal, a distribution that consists of only one mode; b) bimodal, a
distribution of scores that consists of two modes; c) multimodal, a score distribution that consists of more
than two modes.
Properties of Mode
Example 1. Find the mode of the scores of students in an algebra quiz: 34, 36, 45, 65, 34, 45, 55, 61, 34, 46
Mode = 34, because it appeared three times. The distribution is unimodal.
Example 2. Find the mode of the scores of students in an algebra quiz: 34, 36, 45, 61, 34, 45, 55, 61, 34, 45
Mode = 34 and 45, because both appeared three times. The distribution is called bimodal.
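The three measures of center can be checked with Python's standard `statistics` module, using the data sets from the examples above:

```python
from statistics import mean, median, multimode

# Algebra quiz scores from the mean and median examples.
quiz = [45, 35, 48, 60, 44, 39, 47, 55, 58, 54]
print(mean(quiz))    # arithmetic average
print(median(quiz))  # average of the two middle scores

# Scores from the bimodal mode example.
algebra = [34, 36, 45, 61, 34, 45, 55, 61, 34, 45]
print(multimode(algebra))  # lists every mode, so bimodal data returns two values
```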
Measures of Variability
A measure of variability is a single value that is used to describe the spread of the scores in a
distribution, that is, above or below the measure of central tendency. There are three commonly used
measures of variability: the range, the quartile deviation, and the standard deviation.
The Range
The range is the difference between the highest and lowest scores in the data
set: R = HS − LS.
Properties of Range
Example: Below are the scores of 10 students in Mathematics and Science. Find the range for each
subject. Which subject has greater variability?
Mathematics Science
35 35
33 40
45 25
55 47
62 55
34 35
54 45
36 57
47 39
40 52
Mathematics Science
HS = 62 HS =57
LS= 33 LS= 25
R = HS-LS R= HS-LS
R= 62-33 R= 57-25
R= 29 R= 32
Based on the computed values of the range, the scores in Science have greater variability; that is, the
scores in Science are more scattered than the scores in Mathematics.
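The range comparison above reduces to a max-minus-min per subject, using the scores from the example:

```python
mathematics = [35, 33, 45, 55, 62, 34, 54, 36, 47, 40]
science = [35, 40, 25, 47, 55, 35, 45, 57, 39, 52]

# R = HS - LS for each subject.
r_math = max(mathematics) - min(mathematics)
r_sci = max(science) - min(science)
print(r_math, r_sci)  # the larger range indicates greater variability
```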
The quartile deviation is half of the difference between the third quartile (Q3) and the first quartile (Q1).
It is based on the middle 50% of the range, instead of the range of the entire data set.

QD = (Q3 − Q1) / 2 = (50.25 − 25.4) / 2
QD = 12.4
The value of QD =12.4 which indicates the distance we need to go above or below the median to include
approximately the middle 50% of the scores.
The standard deviation is the most important and useful measure of variation; it is the square root of the
variance. It is an average of the degree to which each score in the distribution deviates from the
mean value. It is a more stable measure of variation than the range and the quartile deviation because it
involves all the scores in a distribution.

SD = √(∑(x − mean)² / (n − 1))
Example 1. Find the standard deviation of the scores of 10 students in an algebra quiz, using the given
data below.
x (x − mean)²
45 12.25
35 182.25
48 0.25
60 132.25
44 20.25
39 90.25
47 2.25
55 42.25
58 90.25
54 30.25
n = 10
Mean = ∑x / n = 485 / 10 = 48.5

SD = √(∑(x − mean)² / (n − 1)) = √(602.5 / 9) = √66.94
SD = 8.18
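The same sample standard deviation is available in the standard library, which also divides by n − 1:

```python
from statistics import stdev  # sample standard deviation, divides by n - 1

quiz = [45, 35, 48, 60, 44, 39, 47, 55, 58, 54]
print(round(stdev(quiz), 2))
```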
Example 2. Find the standard deviations of the scores of the same 10 students in Mathematics and
Science. Which subject has greater variability?
Mathematics Science
35 35
33 40
45 25
55 47
62 55
34 35
54 45
36 57
47 39
40 52
For Mathematics, mean = 441 / 10 = 44.1.
x (x − mean)²
35 82.81
33 123.21
45 0.81
55 118.81
62 320.41
34 102.01
54 98.01
36 65.61
47 8.41
40 16.81

SD = √(∑(x − mean)² / (n − 1)) = √(936.9 / 9) = √104.1
SD = 10.20 for the Mathematics subject
For Science, mean = 430 / 10 = 43.
x (x − mean)²
35 64
40 9
25 324
47 16
55 144
35 64
45 4
57 196
39 16
52 81
SD = √(∑(x − mean)² / (n − 1)) = √(918 / 9) = √102
SD = 10.10 for the Science subject
The standard deviation for the Mathematics subject is 10.20 and the standard deviation for the Science
subject is 10.10, which means that the Mathematics scores have greater variability than the Science
scores. In other words, the scores in Mathematics are more scattered than those in Science.
When the value of the standard deviation is large, the scores are, on the average, far from the mean. On
the other hand, when the value of the standard deviation is small, the scores are, on the average, close to
the mean.
Coefficient of Variation
The coefficient of variation is a measure of relative variation expressed as a percentage of the arithmetic
mean. It is used to compare the variability of two or more sets of data even when the observations are
expressed in different units of measurement. The coefficient of variation can be solved using the formula

CV = (SD / Mean) × 100%

The lower the value of the coefficient of variation, the more the overall data approximate the mean, or
the more homogeneous the performance of the group.
Example: A study showed the performance of two groups, A and B, in a certain test. Group A obtained a
mean score of 87 points with a standard deviation of 8.5 points; Group B obtained a mean score of 90
points with a standard deviation of 10.25 points. Which of the two groups has the more homogeneous
performance?

Group Mean SD
A 87 8.5
B 90 10.25

CV (Group A) = (8.5 / 87) × 100% = 9.77%
CV (Group B) = (10.25 / 90) × 100% = 11.39%

The CV of Group A is 9.77% and the CV of Group B is 11.39%, which means that Group A has the more
homogeneous performance.
Percentile Rank
The percentile rank of a score is the percentage of scores in the frequency distribution which are lower,
that is, the percentage of the examinees in the norm group who scored below the score of interest.
Percentile ranks are commonly used to clarify the interpretation of scores on standardized tests.
Z-SCORE
The z-score (also known as the standard score) measures how many standard deviations an observation
is above or below the mean. A positive z-score gives the number of standard deviations a score is above
the mean, and a negative z-score gives the number of standard deviations a score is below the
mean.
z = (x − mean) / SD, where x is the raw score.

Example: James Mark obtained the following grades:

Subject Grade (x) Mean SD
Math Analysis 95 88 10
Natural Science 80 85 5

z (Math Analysis) = (95 − 88) / 10 = 0.70
z (Natural Science) = (80 − 85) / 5 = −1.0

His grade in Labor Management corresponds to z = 0.27.
James Mark's grade in Math Analysis was 0.70 standard deviations above the mean of the Math
Analysis grades, while in Natural Science he was 1.0 standard deviation below the mean of the Natural
Science grades. He also had a grade in Labor Management that was 0.27 standard deviations above the
mean of the Labor Management grades. Comparing the z-scores, James Mark performed best in
Math Analysis and poorest in Natural Science in relation to the group
performance.
T-score
The T-score is obtained by multiplying the z-score by 10 and adding 50. In symbols,
T-score = 10z + 50.
Using the same exercise, compute the T-scores of James Mark in Math Analysis, Natural Science, and
Labor Management:
T (Math Analysis) = 10(0.70) + 50 = 57
T (Natural Science) = 10(−1.0) + 50 = 40
T (Labor Management) = 10(0.27) + 50 = 52.7
Since the highest T-score is in Math Analysis (57), we can conclude that James Mark performed better in
Math Analysis than in Natural Science and Labor Management.
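The z-score and T-score conversions above can be sketched together, using James Mark's grades from the example:

```python
def z_score(x, mean, sd):
    # Number of standard deviations x lies above (or below) the mean.
    return (x - mean) / sd

def t_score(z):
    # T-score = 10z + 50
    return 10 * z + 50

z_math = z_score(95, 88, 10)  # 0.70
z_sci = z_score(80, 85, 5)    # -1.0
print(t_score(z_math), t_score(z_sci), t_score(0.27))
```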
Stanine
The stanine, also known as the standard nine, is a simple type of normalized standard score that
illustrates the process of normalization. Stanines are single-digit scores ranging from 1 to 9.
Skewness
Skewness describes the degree of departure of the distribution of the data from symmetry.
The degree of skewness is measured by the coefficient of skewness, denoted as SK and computed as

SK = 3(mean − median) / SD

The normal curve is a symmetrical bell-shaped curve; its end tails are continuous and asymptotic, and
the mean, median, and mode are equal. The scores are normally distributed if the computed value of SK = 0.
A distribution is positively skewed when the curve is skewed to the right: it has a long tail extending off
to the right but a short tail to the left. It indicates the presence of a small proportion of relatively large
extreme values, and SK > 0.
When the computed value of SK is positive, most of the scores of the students are very low, meaning
that they performed poorly in the examination.
A distribution is negatively skewed when it is skewed to the left: it has a long tail extending off to the
left but a short tail to the right. It indicates the presence of a small proportion of relatively low extreme
values, and SK < 0. When the computed value of SK is negative, most of the students got very high
scores, meaning that they performed very well in the examination.
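The coefficient of skewness can be computed directly from the formula above; the data set is the algebra quiz used in the earlier examples:

```python
from statistics import mean, median, stdev

def coefficient_of_skewness(scores):
    # SK = 3(mean - median) / SD
    return 3 * (mean(scores) - median(scores)) / stdev(scores)

quiz = [45, 35, 48, 60, 44, 39, 47, 55, 58, 54]
sk = coefficient_of_skewness(quiz)
print(round(sk, 2))  # positive, so the distribution is slightly skewed to the right
```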
Learning References: same as references [1]–[8] listed under Prelim.