
DEPARTMENT OF EARLY CHILDHOOD STUDIES

P.O. Box 342-01000


Thika
Email: Info@mku.ac.ke
Web: www.mku.ac.

UNIT CODE: BEM 4102

INTRODUCTION TO STATISTICS,
MEASUREMENTS, TESTS AND EVALUATION

BEM 4102: Introduction to Statistics, Measurements and Evaluation
Credit Hours: 3
Pre-requisites: None
Purpose: To use tests and measurements appropriately
Course objectives:
By the end of the unit the learner should be able to:
i) Analyze data using appropriate statistical measures in education
ii) Discuss various statistical methods available
iii) Explain the nature of educational tests
iv) Use locally acceptable measurement techniques
v) Appreciate the use of central tendency measures in tests and measurement
Course content
Meaning of educational measurement; Philosophy and nature of educational testing and
measurement; Reliability and validity; Forms of evaluation; Discrete and non-discrete data;
Central tendency measurement; Measuring variance; The concept of the normal curve;
Correlation and regression tests.

COURSE OUTLINE
WEEK 1
CHAPTER ONE: INTRODUCTION TO MEASUREMENT, TESTS AND EVALUATION
• Meaning and purpose of Tests and Measurement
• Meaning of Evaluation
• Purpose/Functions of Educational Evaluation
• Principles of Evaluation
• Types of Evaluation used in classroom instruction
WEEK 2 & 3
CHAPTER TWO: INSTRUCTIONAL OBJECTIVES, TEST DEVELOPMENT AND
TEST IMPROVEMENT
• Definitions
• Type of objectives
• Test Development
• Planning the test and steps to ensure successful test planning by the teacher
• General guidelines in test construction
• Test improvement

WEEK 4
CHAPTER THREE: TYPES OF TESTS
• The Essay tests
• Merits (advantages) and Demerits (limitations) of essay tests
• Suggestions to reduce limitations of essay tests
• Objective test
• Advantages and Disadvantages of objective tests
• Supply item tests;
• Selection item tests; True – False tests; Matching – item tests; Multiple-choice tests
and Pictorial – item tests
• Rank – order test items
WEEK 5
CHAPTER FOUR: CHARACTERISTICS OF GOOD TESTS
• Reliability of a Test
• Factors impinging on test reliability
• Methods of Assessing Reliability
• Validity of a Test
• Types of test validity
• Factors threatening test validity
• Other characteristics of a good test
WEEK 6
CHAPTER FIVE: INTRODUCTION TO STATISTICS
• Introduction; Importance and Limitations of Statistics
• Subdivisions in statistics
• Scales of Measurements
• Variables and their Classification
• Discrete and Non-discrete data
• Sources of data

WEEK 7
CHAPTER SIX: DATA COLLECTION, ORGANIZATION AND PRESENTATION
• Collection and Presentation of Data
• Organizing Data
• Presentation of Data;
• Bar charts; Multiple Bar Charts; Composite Bar Charts and Pie Charts
• General Rules of Forming Frequency Distribution
• Histograms
• Cumulative Frequency
WEEK 8 & 9
CHAPTER SEVEN: MEASURES OF CENTRAL TENDENCY
• The Mean and its computation
• Median and its computation
• Mode and its computation
• Merits and demerits of the Measures of Central Tendency
WEEK 10
CHAPTER EIGHT: MEASURES OF DISPERSION
• Definition of Dispersion
• Properties of a good measure of dispersion
• Significance of measures of dispersion
• Range and its computation
• Standard Deviation and its computation
• Variance and its computation
• Importance of variance and standard deviation
• Relative Dispersion (Coefficient of Variation)
WEEK 11
CHAPTER NINE: SKEWNESS AND KURTOSIS

• Symmetrical distribution
• Skewness
• Skewed to the right
• Skewed to the left
• Kurtosis

WEEK 12 & 13
CHAPTER TEN: CORRELATION AND REGRESSION ANALYSIS

• Introduction to Correlation Analysis
• Pearson’s product moment correlation coefficient
• Coefficient of Determination
• Spearman Rank Correlation coefficient
• Regression Analysis
• Simple Linear Regression Model
• Formulas for the Regression line
• Method of Least Squares
• Differences between Correlation and Regression
Teaching / Learning Methodologies
Group discussions; Lecturing; Individual assignment; Micro-teaching
Instructional Materials and Equipment
Chalk board; Overhead Projectors
Course Assessment
Examination - 70%; Continuous Assessments (Exercises and Tests) - 30%; Total - 100%
Recommended Text Books for further Reading

Gronlund, N.E. & Linn, R.L. (1990). Measurement and evaluation in teaching (6th ed.). New
York: Macmillan Publishing Company.
Kithuka, M. (2004). Educational measurement and evaluation: A guide to teachers. Egerton,
Kenya: Egerton University Press.
Ministry of Education (1987). A handbook for teachers of English in secondary school. Nairobi:
Jomo Kenyatta Foundation.
Saleemi, N.A. (1997). Statistics simplified. Nairobi, Kenya: Acme Press (Kenya) Ltd.
Johnson, R.A. & Bhattacharyya, G.K. (2010). Statistics: Principles and methods (6th ed.). USA:
John Wiley and Sons Inc.

TABLE OF CONTENTS

COURSE OUTLINE
CHAPTER ONE: INTRODUCTION TO MEASUREMENT, TESTS AND EVALUATION
1.1 Meaning of Tests
1.2 Measurement
1.3 Purpose of Tests and Measurements
1.4 Meaning of Evaluation
1.5 Purpose/Functions of Educational Evaluation
1.6 Principles of Evaluation
1.7 Types of Evaluation used in classroom instruction
1.7.1 Formative Evaluation
1.7.2 Summative Evaluation
Review Questions
CHAPTER TWO: INSTRUCTIONAL OBJECTIVES, TEST DEVELOPMENT AND
TEST IMPROVEMENT
2.1 Introduction
2.2 Definitions
2.3 Type of objectives
2.4 Test Development
2.4.1 Planning the test
2.4.2 Steps to ensure successful test planning by the teacher
2.5 General guidelines in test construction
2.6 Test improvement
2.6.1 Test Tryout
2.6.2 Establishing Test Reliability and Validity
Review Questions
CHAPTER THREE: TYPES OF TESTS
3.1 Introduction
3.2 The Essay tests
3.2.1 Merits (advantages) of essay tests
3.2.2 Demerits (limitations) of essay tests
3.2.3 Suggestions to reduce limitations of essay tests
3.3 Objective test
3.3.1 Advantages of objective tests
3.3.2 Disadvantages of objective tests
3.4 Supply item tests
3.5 Selection item tests
3.5.1 True – False tests
3.5.2 Matching – item tests
3.5.3 Multiple-choice tests
3.5.4 Pictorial – item tests
3.6 Rank – order test items
3.7 Summary of the suggestions for all testing techniques
Review Questions
CHAPTER FOUR: CHARACTERISTICS OF GOOD TESTS
4.1 Introduction
4.2 Reliability of a Test
4.2.1 Factors impinging on test reliability
4.2.2 Methods of Assessing Reliability
4.3 Validity of a Test
4.3.1 Types of test validity
4.3.2 Factors threatening test validity
4.4 Other characteristics of a good test
4.4.1 Administrability
4.4.2 Scorability
Review Questions
CHAPTER FIVE: INTRODUCTION TO STATISTICS
5.1 Introduction
5.2 Importance of Statistics
5.3 Limitations of Statistics
5.4 Subdivisions in statistics
5.5 Scales of Measurements
5.6 Variables: meaning and classification
5.6.1 Classification of variables
5.6.2 Discrete and Non-discrete data
5.7 Sources of data
Review Questions
CHAPTER SIX: DATA COLLECTION, ORGANIZATION AND PRESENTATION
6.1 Collection and Presentation of Data
6.2 Organizing Data
6.3 Presentation of Data
6.3.1 Bar charts
6.3.2 Multiple Bar Charts
6.3.3 Composite Bar Charts
6.3.4 Pie Charts
6.4 General Rules of Forming Frequency Distribution
6.5 Histograms
6.6 Cumulative Frequency
Review Questions
CHAPTER SEVEN: MEASURES OF CENTRAL TENDENCY
7.1 Introduction
7.2 The Arithmetic Mean
7.2.1 Mean from Ungrouped data
7.2.2 Mean from Frequency Distribution
7.2.3 Mean from Grouped Data
7.2.4 Weighted mean
7.2.5 Combined mean
7.2.6 Adjusting mean for a wrong entry
7.3 The Median – Meaning and computation
7.4 The Mode – meaning and computation
7.5 Merits and demerits of the Measures of Central Tendency
Review Questions
CHAPTER EIGHT: MEASURES OF DISPERSION
8.1 Definition of Dispersion
8.2 Properties of a good measure of dispersion
8.3 Significance of measures of dispersion
8.4 Range: Meaning and computation
8.5 Variance as a measure of dispersion
8.6 Standard Deviation
8.7 Relative Dispersion (Coefficient of Variation)
8.8 Importance of variance and standard deviation
Review Questions
CHAPTER NINE: THE CONCEPT OF THE NORMAL CURVE
9.1 Introduction
9.2 The concept of Skewness
9.2.1 Skewed to the right
9.2.2 Skewed to the left
9.3 The concept of Kurtosis
Review Questions
CHAPTER TEN: CORRELATION AND REGRESSION ANALYSIS
10.1 Introduction to Correlation Analysis
10.1.1 Pearson’s product moment correlation coefficient
10.1.2 Coefficient of Determination
10.1.3 Spearman Rank Correlation coefficient
10.2 Regression Analysis
10.2.1 Simple Linear Regression Model
10.2.2 Formulas for the Regression line
10.2.3 Method of Least Squares
10.3 Differences between Correlation and Regression
Review Questions
Appendix 1: Sample Test Papers

1 CHAPTER ONE: INTRODUCTION TO MEASUREMENT, TESTS AND
EVALUATION

Concern for the quality of education pupils are receiving in relation to the money being spent on
education has been a major factor in the current demand for accountability. Procedures for
holding educators accountable for effective educational programmes tend to be supported by
citizens but opposed by educators.
Parents complain that their children are unable to read and write effectively after primary
education. Many secondary school leavers find it difficult to write or express themselves well in
their language of communication. These problems point to the need to overhaul the educational
system as a whole; but, before doing so, we must be well acquainted with the measurement and
evaluation procedures through which reliable data about the status of the educational system can
be objectively determined. In this sense, measurement and evaluation imply both the process of
collecting and ordering information and the results obtained from that information.

Learning Objectives

By the end of this chapter the learner should be able to:


i) Define the terms measurement, tests and evaluation
ii) Distinguish between Measurements and Evaluation
iii) Discuss the purpose of Tests and Measurements in education
iv) Outline the Purpose/Functions of Educational Evaluation
v) Explain the types of Evaluation used in classroom instruction

1.1 Meaning of Tests


A test is an assessment tool used to measure a learner’s attribute, such as academic achievement.
It is a device for obtaining a sample of an individual’s behaviour. In the narrowest sense, a test
connotes the presentation of a standard set of questions to be answered. As a result of a person’s
answers to such a series of questions, we obtain a measure of an attribute of that person. A useful
test measures some property or behaviour accurately. To evaluate the usefulness of a test, we
need to understand the meaning of the term measurement.
Testing is usually associated with student achievement relative to specified classroom objectives.
In other words, it is a measuring device concerned with specific achievement of a student in
terms of given instructional objectives.

1.2 Measurement
Measurement refers to the use of a scale to determine the degree or level of a learner’s
achievement or other attribute. Through measurement you may be able to establish a
learner’s intelligence, reading readiness and level, ability to comprehend and ability to use
language. In education, the tools used for measurement are tests, experiments and examinations.
Numbers are assigned to learners according to a carefully prescribed, repeatable procedure. The
numbers are also assigned so that the differences between scores represent differences in the
property or characteristic being measured.
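
The idea of assigning numbers by a prescribed, repeatable procedure can be illustrated with a short Python sketch. This is only an illustration: the answer key, the learner names and the function name score_test are hypothetical and do not come from this module. The procedure is simply to count the items that agree with the key, so anyone who applies it to the same responses obtains the same number.

```python
# Hypothetical answer key for a five-item objective test (illustrative data only).
ANSWER_KEY = ["B", "D", "A", "C", "A"]


def score_test(responses, key=ANSWER_KEY):
    """Return the number of responses that match the key, item by item."""
    if len(responses) != len(key):
        raise ValueError("responses and key must have the same number of items")
    return sum(1 for given, correct in zip(responses, key) if given == correct)


# Hypothetical responses from two learners.
learners = {
    "Learner 1": ["B", "D", "A", "C", "A"],
    "Learner 2": ["B", "C", "A", "C", "B"],
}

# The same rule is applied to every learner, so the scores are comparable.
scores = {name: score_test(answers) for name, answers in learners.items()}
print(scores)
```

Because every learner is scored by the same rule, a difference of two marks between the two learners reflects a difference of two correctly answered items.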

Measurement has one main goal: the ability to describe, explain and predict the performance of a
person, process or system in a precise manner. To a large extent it is concerned with finding out
how well students are performing in terms of specific objectives. In other words, the process of
measurement is secondary to that of defining objectives. The ends to be achieved must first be
formulated clearly. Then measurement procedures can be sought as tools for appraising the
extent to which those ends have been achieved.

NOTE: A test given to determine how much the students have learnt is generally referred to as
an achievement test or an attainment test. The daily, weekly, end of term and end of year tests
are all examples of achievement tests. An achievement test is therefore only relevant if it
determines how much the students have learnt. If it does not, then it has been misused. The
following principles should be considered by a teacher or examiner in order for an achievement
test to be valid:
i. Questions should be set from all parts of the syllabus;
ii. The number of questions set in each section of the syllabus must reflect the relative
importance of that section.
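
The second principle can be sketched in Python. The section names, the importance weights and the function name allocate_questions are assumptions made for illustration; the module does not prescribe a particular allocation method. The sketch shares a fixed number of questions among sections in proportion to their weights and uses the largest-remainder rule so that the counts add up to the required total.

```python
def allocate_questions(weights, total_questions):
    """Share total_questions among syllabus sections in proportion to weights.

    The largest-remainder rule is used so the counts always sum to the total.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")

    # Exact (possibly fractional) share of questions for each section.
    exact = {s: total_questions * w / total_weight for s, w in weights.items()}

    # Start from the whole-number part of each share.
    counts = {s: int(share) for s, share in exact.items()}

    # Give the remaining questions to the sections with the largest fractions.
    leftover = total_questions - sum(counts.values())
    by_remainder = sorted(exact, key=lambda s: exact[s] - counts[s], reverse=True)
    for section in by_remainder[:leftover]:
        counts[section] += 1
    return counts


# Hypothetical weights, e.g. the relative teaching time given to each section.
syllabus_weights = {"Section A": 3, "Section B": 2, "Section C": 1}
plan = allocate_questions(syllabus_weights, 40)
print(plan)
```

Every section receives at least some questions as long as its weight is large enough relative to the total, which also supports the first principle of sampling all parts of the syllabus.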

1.3 Purpose of Tests and Measurements


1. Instructional Decisions
Tests and Measurements can help both the teacher and the learner. They can help the teacher by:
a) Providing knowledge concerning the students’ entry behaviour.
b) Setting, refining and clarifying realistic goals for each learner.
c) Evaluating the degree to which the objectives have been achieved.
d) Determining, evaluating and refining the teacher’s instructional techniques.
Tests and Measurements aid the learner by:
a) Communicating the teacher’s goals and determining how much the learner has learnt and
where the learner has difficulties.
b) Increasing motivation.
c) Encouraging good study habits.
d) Providing feedback that identifies strengths and weaknesses.
NOTE: The goals/objectives of instruction should be communicated to learners in advance
before any evaluation.

2. Guidance Decisions
Students need to be guided on their vocational choices, in their educational programmes and in
their personal problems. For students to make sound decisions in these areas, they need accurate
information. Tests provide students with data about significant characteristics which can help
them understand themselves better. Results of tests help teachers to guide students on subject
choices that determine long-term career placement.

3. Administrative decisions
Administrative decisions include selection, classification and placement decisions. In selection
decisions, one decides whether to accept or reject a person for a particular programme. In
classification, one decides the type of programme suitable for a person, for example when
enrolling in college programmes such as Arts, Law, Medicine or Engineering, and the level
at which the programme is offered.
Other Administrative/Supervisory functions of tests and measurement include:

• To maintain standards and to set up norms of performance
• To classify or select for special purposes
• To determine teachers’ efficiency, the effectiveness of the methods and strategies used
(strengths, weaknesses, needs) and standards of instruction
• To serve as a basis or guide for curriculum making and development
• To serve as a guide in educational planning for administrators and supervisors
• To inform parents of their children’s progress in school


4. Research Decisions
Research decisions are made whenever information is gathered as a prelude to decision making.
Tests can provide this necessary information/data.

1.4 Meaning of Evaluation


Evaluation is the process of gathering the information needed to judge the performance,
suitability, effectiveness and impact of a programme, a project, a curriculum or an institution. It
involves judging the worth of something, often in terms of its cost, adequacy or effectiveness.
In brief, evaluation is a measuring device used to determine the value or worth of a programme
under prevailing conditions (i.e. instructional conditions) relative to specified objectives.

Evaluation is likely to use tests and measurements as tools, and also to include other informal
types of evidence, and it undertakes to integrate these into a value judgment of the effectiveness
of an educational enterprise. Since evaluative judgments are usually data-based, measurement is
included in the evaluation process as a functional sub-component; the credibility of an
evaluation therefore depends on the credibility of the measures used.

1.5 Purpose/Functions of Educational Evaluation


Educational evaluation is carried out from time to time for the following purposes:

a) To assess or appraise educational objectives, programmes, curricula, instructional materials
and facilities.
b) To make reliable decisions about educational planning
c) To ascertain the worth of time, energy and resources invested in a programme.
d) To determine the effectiveness of the programme in terms of student behavioural output.
e) To predict the general trend in the development of the teaching-learning process.
f) To provide a just basis for determining at what level of education a possessor of a certain
certificate should enter a career.
g) To provide an objective basis for determining the promotion of students from one
class/grade to another as well as the award of certificate.
h) To ensure an economical and efficient management of scarce educational resources.
i) To identify students’ growth or lack of growth in acquiring desirable knowledge, skills,
attitude and social values.
j) To help motivate students to want to learn more as they discover their progress or lack of
progress in given tasks or areas of study.
k) To help teachers determine the effectiveness of their teaching techniques and learning
resources.
l) To acquaint parents or guardians with their children’s performance.
m) To identify problems that might hinder or prevent the achievement of set goals.
n) To encourage students to develop a sense of discipline and systematic study habits.
o) To conduct research.

1.6 Principles of Evaluation


Evaluation should be:

a) Based on clearly stated objectives
b) Comprehensive
c) Cooperative
d) Continuous and an integral part of the teaching – learning process

1.7 Types of Evaluation used in classroom instruction
1.7.1 Formative Evaluation
This refers to the evaluation that continues as the project implementation goes on. It is conducted
throughout the stages of project implementation. It is diagnostic in nature for the purpose of
improving the effectiveness and appropriateness of the whole project.

In education, formative evaluation helps a teacher to identify learners’ weaknesses and thus
enables the implementation of remedial measures. It provides feedback regarding the student’s
performance in attaining instructional objectives. It identifies learning errors that need to be
corrected and it provides information to make instruction more effective.

Formative evaluation aims at ensuring acquisition and development of knowledge and skills by
students. The purpose is to find out whether, after a learning experience, students are able to do
what they were previously unable to do. Formative evaluation therefore provides the evaluator
with useful information about the strengths or weaknesses of the student within an educational
context.
Common forms of Formative evaluation used in many educational institutions include the
Continuous Assessment tests and the End of term examination.

1.7.2 Summative Evaluation


This refers to the type of evaluation carried out at the end of the project. The main purpose is to
ascertain whether the objectives were achieved or not. It determines the extent to which
objectives of instruction have been attained and is used for assigning grades/marks and to
provide feedback to students.
Summative evaluation is primarily concerned with purposes, progress and outcomes of the
teaching – learning process. It attempts as far as possible to find out to what extent the broad
objectives of a programme (in this case the curriculum) have been achieved.

In the Kenyan education system, summative evaluation is administered at the end of an
educational cycle. Examples include the Kenya Certificate of Primary Education administered at
the end of eight years of pupils’ Primary Education and the Kenya Certificate of Secondary
Education administered at the end of four years of students’ Secondary Education.

Note: Formative evaluation is guidance – oriented while summative evaluation is judgmental in
nature.

Review Questions
Activity 1
Categorize the following as either measurement or evaluation.

1. A standard VI teacher constructs a mathematics test, gives it to her class and
determines the number of correct responses of each pupil.
2. In view of a student’s score on a science test and his score on an academic
aptitude test reported in the record file, the teacher decided the student’s
achievement was below his potential.
3. Based on her observations of the students’ work with balances, a teacher
concludes that a majority of the pupils have learnt to determine satisfactorily the
mass of an object.
4. A standard VIII teacher reports to a parent that during the first three months of
the school year, his child had become more willing to express his judgments freely
with less fear of the peers’ possible disapproval.

Activity 2
1. Define the following terms
i. Measurement
ii. Tests
iii. Evaluation
2. Distinguish between Measurements and Evaluation
3. Discuss the purpose of Tests and Measurements as related to educational setting
4. Outline the Purpose/Functions of Educational Evaluation
5. Explain the types of Evaluation used in classroom instruction

References for further reading


Gronlund, N. E., & Linn, R. L. (1990). Measurement and evaluation in teaching (6th ed.). New
York: Macmillan Publishing Company.
Good, C. V. (1973). Dictionary of education. New York: McGraw-Hill Company.
Ogunniyi, M. B. (1991). Educational measurement and evaluation. Ikeja, Nigeria: Longman.

2 CHAPTER TWO: INSTRUCTIONAL OBJECTIVES, TEST DEVELOPMENT AND
TEST IMPROVEMENT

2.1 Introduction
Traditionally, educational measurement has been very helpful in determining the degree to which
certain objectives have been achieved. If education is to be effective, frequent assessment must
be made of the extent to which the desired behavioral changes have been produced. This
evaluation of students’ achievement is based on clearly defined instructional objectives. There is
a spiral relationship between objectives, instruction and evaluation. This means that any testing
programme needs to be based on the existing educational objectives.

Learning Objectives
By the end of this chapter the learner should be able to:
i. State the types of instructional objectives and explain the importance of writing
objectives.
ii. Describe Bloom’s Taxonomy for classifying educational objectives
iii. Classify cognitive behaviour into six levels
iv. Explain the steps in test development
v. Prepare a table of specification for a test of a given content area
vi. State the guidelines in test construction
vii. Describe the criteria for test improvement

2.2 Definitions

• An instructional objective is a statement of performance to be demonstrated by each
student in class, which is derived from the instructional goal and stated in measurable
and observable terms.
• An instructional goal is defined as a statement of performance expected of each student in
a class, stated in general terms without criteria of achievement.
Teachers are always encouraged to state instructional objectives whenever they are planning.

2.3 Type of objectives


In the taxonomy of educational objectives developed by Bloom and his associates, educational
objectives are classified into three categories; in other words, instructional objectives, including
behavioral objectives, can be written for any of the three domains of instruction.

1. Cognitive domain
This is rational learning that calls for thinking. Its emphasis is upon knowledge, using the mind,
and intellectual abilities. Objectives in this domain are usually written as instructional or
behavioral objectives that begin with verbs. Its classification is what we know as Bloom’s Taxonomy.

2. Affective domain
This deals with emotional learning and has much to do with feelings. It is concerned with
attitudes, appreciations, interests, values and adjustments.

3. Psychomotor domain
This is physical learning that is characterized by doing. It emphasizes speed, accuracy,
dexterity (agility), and physical skills.

The cognitive taxonomy


Bloom and his associates developed a taxonomy for classifying educational objectives in the
cognitive domain. This taxonomy is widely used. Cognitive
learning is classified into the following six major categories/levels:

1. Knowledge
This involves recall or recognition, in an appropriate context, of material, whether it is specific
facts, universal principles, methods, processes, patterns, structures or settings. Little is required
besides bringing to mind appropriate material; e.g. recall of major facts about particular
cultures. Applicable verbs include: Arrange, Define, List.

2. Comprehension
This is the lowest level of what is commonly called “understanding” and requires that the
individual be able to paraphrase knowledge accurately, to explain or summarize it in his own
words, or to show logical extensions in terms of implications or corollaries; e.g. skill in
translating verbal descriptions of mathematical material into symbolic statements and vice versa.
Applicable verbs include: Classify, Describe, Discuss.

3. Application
This is the ability to select a given abstraction (idea, rule of procedure, or generalized method)
appropriate for a new situation and to apply it correctly; e.g. the ability to predict the probable
effect of a change in a factor, such as an educational programme, on a social situation previously
at equilibrium. Applicable verbs include: Apply, Choose, Write.

4. Analysis
This is the ability to break apart a communication or concept into its constituent elements to
show the hierarchy or other internal relation of ideas, to show the basis for organization, and to
indicate how it conveys its effects; e.g. the ability to recognize form and pattern in literary and
artistic works as a way of understanding their meaning. Applicable verbs include: Compare,
Contrast, Analyze.

5. Synthesis
This is the arrangement and combination of pieces, parts, elements, etc., in such a way as to
constitute a pattern or structure not there before; e.g. the ability to tell a personal experience
effectively. Applicable verbs include: Construct, Create, Design.

6. Evaluation
This is the qualitative and quantitative judgment about the extent to which material and methods
satisfy criteria determined by the teacher or student; e.g. the ability to compare a work with the
highest known standards in its field, especially with other work of recognized excellence.
Applicable verbs include: Appraise, Defend, Judge.

2.4 Test Development


Teachers cannot do without measuring and evaluating the progress of their instruction. For that
purpose, they usually make their own tests and evaluation measures, which constitute
the major basis for evaluating the students’ progress in school. Indeed, it is difficult to think of
an educational system where learners are not exposed to teacher-made tests.
Although the specific purpose of the tests and the intended use of the results may vary
from one school to another or from one teacher to another, it is essential to recognize the value
that tests can have in the lives of students, parents, teachers, counselors, school administrators
and other educators. Teachers should therefore provide their students with the best evaluation.
This implies that they must have some procedures whereby they can reliably and validly evaluate
how effectively their students have been taught. The classroom achievement test is one such tool.
How will the teacher develop such a tool?

2.4.1 Planning the test


Good tests do not just happen. They require adequate and extensive planning so that the goals of
instruction, the teaching strategy to be employed, the textual materials and the evaluative
procedure are all related in some meaningful fashion.

Although most teachers recognize the importance of having some systematic procedure for
ascertaining the extent to which their instructional objectives have been realized, they still
commit one major error, that of inadequate planning. Too often, teachers feel that they need not
begin thinking about tests until the last possible moment. This is not favourable because:

• The test produced that way generally contains items that are poorly conceived, poorly
worded, ambiguous, and sometimes grammatically incorrect.

• Furthermore, the test may contain items that are either not scorable or have more than one
correct answer.

• It may be either too easy or too difficult and may be measuring trivial details rather than
the more important pervasive outcomes of learning.

Writing items that are valid, reliable and objectively scorable requires time, energy, and
adequate planning. Professional item writers are seldom able to write more than ten good items
per day. So it is unrealistic to expect the ordinary classroom teacher to be able to prepare a
100-item test if he begins writing the test only a few days before it is scheduled. The solution to the
problem lies in adequate planning and in spreading out the item writing over a long period of
time.

Ideally, every test should be reviewed critically by other teachers to minimize deficiencies. In
that case, the teacher should prepare the test in sufficient time to permit a critical, independent
review.

2.4.2 Steps to ensure successful test planning by the teacher


Step I: Determining the objectives of the test
Classroom achievement tests serve a variety of purposes, such as;
i. Judging the students’ mastery of certain essential skills and knowledge;
ii. Measuring growth over time;
iii. Ranking students in terms of their achievement of particular instructional objectives;
iv. Diagnosing students’ difficulties;
v. Evaluating the teacher’s instructional method;
vi. Ascertaining the effectiveness of the curriculum; and
vii. Motivating students.
For instance, if the teacher wishes to use his test to ascertain whether each of his students has
mastered certain essential knowledge and skills, the properties of the items might differ from
those of the items used when he is interested in ranking his students on those same objectives.
For the former, the teacher would be more interested in having items that most students can
answer correctly and he would want his test to have a rather narrow or restricted sampling of
content. For the latter, the teacher would want the majority of the items to be of average
difficulty and to have a test that samples a wider range of subject-matter content.

Undoubtedly, the most difficult step in the test planning is the specification of objectives, yet,
this is essential; for without objectives, the teacher will not know what is to be measured.

Step II: Preparing the Table of Specification
The second major question that the classroom teacher (who has become a test constructor) must
ask him/herself is;
“What is it that I wish to measure?”

Thus, the teacher must know what he wants to measure. For instance, should the teacher test for
factual knowledge or should he test the extent to which students are able to apply their factual
knowledge? The answer to this question depends upon the teacher’s instructional objectives and
what has been stressed in class. If the teacher emphasized the recall of names, places, and dates,
he should test for this. On the other hand, if in chemistry, he had stressed the interpretation of
data, then his test, in order to be a valid measure of his teaching, should emphasize the
measurement of interpretation of data.

In this stage of thinking about the test, the teacher must consider the relationships among his
objectives, teaching, and testing. Once the course content and instructional objectives have been
specified, the teacher is ready to integrate them in some meaningful way so that the test, when
completed, will be a measure of the student’s knowledge.
Kithuka (2004) defines a table of specification as “a two-dimensional table that describes the
nature of items (to be included in a test). It shows whether the item will be testing knowledge,
comprehension, application, analysis, synthesis, or evaluation.”

There are different ways of preparing a table of specifications, depending on the areas being
tested. Generally, tables of specifications have some commonalities. Among them are course
content, behaviour, number of test items and percentage of items.
Kithuka (2004) suggests the following as the steps in constructing a table of specifications:
i. List the general behavioral objectives at the top of the matrix table;
ii. List the content taught on the left hand side of the matrix;
iii. Decide on the length of the test in terms of the number of questions.
iv. Decide on the weighting of the objectives guided by the level of learners.
v. Decide on the weighting of the content taught guided by the amount of time spent on it.

vi. Distribute items in the different cells based on the weighting. Instead of the number of
questions in each cell, the particular item numbers (e.g. Q1, Q2, Q3 etc.) can be written in
the cells. This helps other users of the test to determine whether each item
classified against a cognitive skill truly belongs where it is placed.
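
Steps iii to vi above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name allocate_items, the topic names and all the weights are hypothetical, and the rounding rule (give leftover items to the cells with the largest remainders) is one reasonable choice rather than a procedure prescribed by Kithuka (2004).

```python
def allocate_items(content_pct, level_pct, total_items):
    """Distribute total_items over a content-by-level grid.

    content_pct and level_pct map names to whole-number percentages
    that each add up to 100 (the weightings from steps iv and v).
    """
    # Ideal share of each cell is total * content% * level% / 10000.
    # divmod keeps the whole part and the remainder exactly (no float error).
    shares = {
        (topic, level): divmod(total_items * c * l, 10_000)
        for topic, c in content_pct.items()
        for level, l in level_pct.items()
    }
    # Start every cell at the whole-number part of its ideal share.
    table = {cell: whole for cell, (whole, _) in shares.items()}
    # Give the items still unassigned to the cells with the largest remainders.
    leftover = total_items - sum(table.values())
    by_remainder = sorted(shares, key=lambda cell: shares[cell][1], reverse=True)
    for cell in by_remainder[:leftover]:
        table[cell] += 1
    return table


# Hypothetical weightings: time spent per topic and emphasis per cognitive level.
content = {"Topic A": 60, "Topic B": 40}
levels = {"Knowledge": 50, "Comprehension": 30, "Application": 20}

spec = allocate_items(content, levels, total_items=20)
for (topic, level), count in spec.items():
    print(f"{topic:8} {level:14} {count}")
```

The teacher would still review the resulting grid and adjust individual cells by judgment, since the weightings themselves are estimates.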

The table below is an illustrative example of the table of specifications

Content Category (Bloom Taxonomy)


Knowledge Comprehension Application Analysis Evaluation Total Percent
Synthesis number of items
of items
Measurement and evaluation                 10    7     3     4     -     24    24%
Characteristics of a good test             17    10    7     5     3     42    42%
Preparation of instructional objectives    7     14    7     4     2     34    34%
Total number of items                      34    31    17    13    5     100
Percent of items                           34%   31%   17%   13%   5%    -     100%

Step III: Selecting the appropriate item format


When the teacher has decided on the purpose of the test and what he is interested in measuring
both in terms of objectives and content, he must decide on the best way of measuring his
instructional objectives.

There are various item formats to select from. Some are less appropriate than others for
measuring certain objectives. For instance, if the objective to be measured is stated as
“the student will be able to organize his ideas and write them in a logical and coherent manner”,
it would be inappropriate to have him select his answer from a series of possible answers. If the
objectives are about recalling names, places, dates and events, it would not be efficient to use a
lengthy essay question. Although there are instances where the instructional objectives can be
measured by different item formats, the teacher should use the least complicated one.
In making the final decision as to the item format(s) to be used, the test constructor should be
governed by such factors as:
i. The purpose of the test;
ii. The time available to prepare and score;
iii. The number of students to be tested;
iv. The physical facilities available for reproducing the test; and
v. The teacher’s skill in writing the different types of items.

2.5 General guidelines in test construction
Zulueta (2006) suggests that the following fundamental principles should be observed to guide
teachers when they construct their evaluation tests:
1. Measure all instructional objectives; teachers should construct tests to measure clearly
the prescribed learning objectives that have been communicated and imparted to the
learners. The test is designed as an operational control to guide the learning sequences
and experience and should be in harmony with the teacher’s instructional objectives.

2. Cover all important learning tasks; a good test focuses on and measures a representative
sample of learned tasks.

3. Use appropriate test items; a good test usually includes items that are most appropriate
for a particular objective to check on learner achievement. Some test questions are better
for measuring recall of specific information while other types are good for tapping higher
level thinking processes and skills.
4. Make tests reliable and valid; tests that are clearly written and minimize guessing are
more reliable than ambiguous ones. Tests that contain a fairly large number of items
or questions are generally more reliable than those with just a few questions or items.
Tests that are well planned and cover a wide range of objectives and topics and that are
well executed will most likely ensure validity. No matter what type of test the teacher
may use, it should be reliable and valid.

5. Use tests to improve learning; this principle reminds teachers that even though tests may
be used primarily to diagnose or evaluate learners’ achievement, in effect they can also
be a learning experience.
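
The remark under guideline 4 that longer tests tend to be more reliable can be made concrete with the general Spearman-Brown formula. The formula is standard in measurement texts but is not stated in this section, so the symbols are introduced here for illustration only: r is the reliability of the existing test, k is the factor by which the test is lengthened with comparable items, and r_k is the predicted reliability of the lengthened test. The figures in the example are hypothetical.

```latex
r_k = \frac{k\,r}{1 + (k - 1)\,r}
\qquad \text{e.g. } r = 0.60,\ k = 2: \quad
r_k = \frac{2(0.60)}{1 + 0.60} = 0.75
```

The prediction holds only if the added items are of the same kind and quality as the original ones.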

2.6 Test improvement


Very often, teachers prepare, administer, and score a test, return the test papers to their students
(some teachers avoid doing the last for unknown reasons), possibly discuss the test, and then
either file or discard the test.

However, one of the common mistakes of teachers is that they do not check on the effectiveness
of their tests. The probable reasons for the behavior include:-
a) Teachers feel that test analysis is too time-consuming;
b) Teachers are not aware of the methods of analyzing tests;
c) Teachers do not always understand the importance of accurate evaluation.
This section presents some procedures in analyzing test items and interpreting the results.

In improving the quality of tests, two main steps are generally followed: trying out the test and
establishing the test reliability and validity.

2.6.1 Test Tryout


A test cannot be considered good unless it is tried out. The main purpose of the tryout is item
analysis, that is, the process of examining the students’ responses to each test item.
Specifically, what one looks for is the difficulty and discriminating ability of the item as well as
the effectiveness of each alternative.
Good (1973) defines item analysis as any one of several methods used in test validation or
improvement to determine how well a given question or item discriminates among individuals of
different degrees of ability, or among individuals differing in some other characteristics.

There are a variety of item analysis procedures, but most of the procedures provide essentially
the same information. One method that can be used for item analysis is the U-L Index Method.
The steps of the U-L Index Method are:
1. Score and rank the papers from the highest to lowest according to the total score.
2. Separate the top 27% and the bottom 27% of the papers
3. Tally responses made to each test item by each individual in the upper 27% group.
4. Tally responses made to each test item by each individual in the lower 27% group.
5. Compute the percentage of the upper group that got the item right and call it U.
6. Compute the percentage of the lower group that got the item right and call it L.
7. Average the U and L percentages; the result is the difficulty index of the item.
8. Subtract the L percentage from the U percentage; the result is the discrimination
index of the item.
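
The eight steps above can be sketched in Python as follows. This is a minimal illustration, not a prescribed implementation: the function name ul_item_analysis and the sample data are hypothetical, U and L are expressed as proportions between 0 and 1 (as in the index tables of this section) rather than as percentages, and the group size is assumed to be 27% of the papers rounded to a whole number.

```python
def ul_item_analysis(scores, item_correct, fraction=0.27):
    """Return (difficulty, discrimination) for one item by the U-L Index Method.

    scores:       total test score of each student
    item_correct: True/False per student, in the same order, for the item studied
    """
    # Step 1: rank the papers from highest to lowest total score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Step 2: separate the top and bottom groups (27% of the papers each).
    n = max(1, round(fraction * len(scores)))
    upper, lower = order[:n], order[-n:]
    # Steps 3 to 6: proportion of each group that got the item right.
    u = sum(item_correct[i] for i in upper) / n
    low = sum(item_correct[i] for i in lower) / n
    # Step 7: the average is the difficulty index.
    # Step 8: U minus L is the discrimination index.
    return (u + low) / 2, u - low


# Hypothetical tryout: 20 students, already listed from highest to lowest score.
scores = list(range(20, 0, -1))
item_correct = (
    [True, True, True, True, False]        # upper group: 4 of 5 correct
    + [True] * 5 + [False] * 5             # middle papers (not used)
    + [True, False, False, False, False]   # lower group: 1 of 5 correct
)

difficulty, discrimination = ul_item_analysis(scores, item_correct)
print(f"difficulty = {difficulty:.2f}, discrimination = {discrimination:.2f}")
```

With 20 papers each group holds 5 papers, so U = 0.80 and L = 0.20, which gives a difficulty index of 0.50 and a discrimination index of 0.60.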

The difficulty index is the percentage of students who got the item right. It can
also be interpreted as how easy or how difficult an item is.

Good (1973) states that a discrimination index is an indication of the degree to which
individual test items discriminate among students in designated criterion groups. It is
sometimes called the differential index or validity index. A good test item therefore
separates the bright students from the poor ones.

After item analysis, the following table of equivalents can be used in interpreting the
difficulty indexes:

Index range      Decision on item difficulty
0.00 – 0.20      Very difficult
0.21 – 0.80      Moderately difficult
0.81 – 1.00      Very easy

Likewise, after item analysis, the following table of equivalents can be used in interpreting
the discrimination indexes:

Index range      Decision on item discrimination
0.00 – 0.20      Poor discrimination
0.21 – 0.80      Moderate discrimination
0.81 – 1.00      Too discriminating

When a teacher prepares test items, he/she aims for items of average difficulty. Item analysis
helps in selecting the items that are of average difficulty; thus, the results of an item analysis
tell the teacher whether to revise items that are too difficult or too easy.
However, care and caution must be taken in using the above tables in interpreting the results
of an item analysis. Judgment of the test constructor is still very important. For example,
what will be done with an item having a difficulty index of 0.16 and a discrimination index of
0.11? According to the tables above, that item is very difficult and discriminates poorly. It
should nevertheless be revised rather than discarded when it is the only item left to test a very
important concept; in that case, the teacher has no other choice but to revise or improve it.
On the other hand, what will be done with an item having a difficulty index of 0.50 and a
discrimination index of 0.48? Normally that item should be retained because it has very good
indices. But there will also be an instance when that kind of item may be rejected or
discarded. That will happen if there are already enough items to test the particular concept or
skill that is assessed.
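
The two tables of equivalents can be expressed as simple lookups, applied here to the two items just discussed. This is a sketch only: the function names are hypothetical, and the label for negative discrimination values is an assumption, since the table starts at 0.00. The labels describe the item; as the text stresses, the decision to retain, revise or reject it still rests with the test constructor.

```python
def describe_difficulty(index):
    """Label a difficulty index using the ranges in the difficulty table."""
    index = round(index, 2)  # the tables list indices to two decimal places
    if index <= 0.20:
        return "Very difficult"
    if index <= 0.80:
        return "Moderately difficult"
    return "Very easy"


def describe_discrimination(index):
    """Label a discrimination index using the ranges in the discrimination table."""
    index = round(index, 2)
    if index < 0:
        # Assumption: the table does not cover negative values, so flag them.
        return "Negative (not covered by the table)"
    if index <= 0.20:
        return "Poor discrimination"
    if index <= 0.80:
        return "Moderate discrimination"
    return "Too discriminating"


# The two items discussed in the text above.
for difficulty, discrimination in [(0.16, 0.11), (0.50, 0.48)]:
    print(describe_difficulty(difficulty), "/", describe_discrimination(discrimination))
```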

Example: The table below shows the results of a tryout of a 10-item test in mathematics taken
by sixty (60) students.

Table of results of the item analysis

Decision: A-Rejected, B-Retained, C-Revised

Item     Upper   Lower   Difficulty   Discrimination   Decision   Justification of
number   27%     27%     index        index                       decision taken
1        14      12      0.81         0.13             C
2        10      6       0.51         0.25             B
3        11      7       0.57         0.25             B
4        9       2       0.35         0.43             B
5        12      6       0.57         0.37             B
6        6       14      0.63         -0.50            A
7        13      4       0.53         0.56             B
8        3       10      0.41         -0.44            A
9        13      12      0.78         0.06             A
10       8       6       0.44         0.12             C

Assignment;

1. In the table above ;

a) What do the numbers in column ‘Upper 27%’ and ‘Lower 27%’ mean?

b) How many items were Rejected, Retained or Revised? Give your justification in the
table above

2. Complete the table below (100 students tested)

Decision: A-Rejected, B-Retained, C-Revised

Item     Upper   Lower   Difficulty   Discrimination   Decision   Justification of
number   27%     27%     index        index                       decision taken
1        20      18
2        19      12
3        17      11
4        10      20
5        21      11
6        9       2
7        24      14
8        18      13
9        9       19
10       22      15
11       26      24
12       25      13
13       6       3
14       23      12
15       11      19

After analyzing the results of the first tryout, test items are usually revised for improvement.
After revising those items which need revision, another tryout is necessary. The revised form of
the test is administered to a new set of samples. The same conditions as in the first tryout are
followed. After the tryout, another item analysis is done. This is to find out if the test items
revised improved in terms of difficulty and discrimination indexes.

Usually, after two revisions, the test is considered ready to be in its final form. The test is now
good in terms of the difficulty and discrimination indices and, therefore, it is also ready to be
tested for reliability and validity.

2.6.2 Establishing Test Reliability and Validity


This step of establishing test reliability and validity is very crucial. Methods of establishing
reliability and validity have been developed in chapter two. Here below is a brief summary of
those methods.

a. Establishing Test Reliability

Method of estimating reliability | Type of reliability measure | Procedure

Test-retest method | Measure of stability | Give a test twice to the same group with a time interval between tests. Correlate the test results using Pearson r.

Equivalent forms method | Measure of equivalence | Give two forms of a test to the same group in close succession. Correlate the test results using Pearson r.

Test-retest with equivalent forms | Measure of stability and equivalence | Give two forms of the test to the same group with an increased time interval between forms. Correlate the test results using Pearson r.

Split-half method | Measure of internal consistency | Give a test once. Score equivalent halves of the test (e.g. odd- and even-numbered items). Correlate the two sets of scores (on odd- and even-numbered items) using Pearson r. Correct the reliability coefficient to fit the whole test by the Spearman-Brown formula.
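
The split-half procedure can be sketched in Python as follows. The function names and the six students’ half-test scores are hypothetical and serve only as an illustration. pearson_r follows the usual product-moment definition, and spearman_brown applies the standard correction for a test twice the length of each half, r_full = 2r / (1 + r); neither formula is written out in this section.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step up a half-test correlation to an estimate for the whole test."""
    return 2 * r_half / (1 + r_half)


# Hypothetical scores of six students on the odd- and even-numbered items.
odd_scores = [10, 8, 7, 6, 4, 3]
even_scores = [9, 8, 6, 7, 5, 2]

r_half = pearson_r(odd_scores, even_scores)
r_full = spearman_brown(r_half)
print(f"half-test r = {r_half:.2f}, corrected whole-test reliability = {r_full:.2f}")
```

The same pearson_r function serves the other three methods in the table: the two lists are then the scores from the two administrations or the two forms, and no Spearman-Brown correction is applied.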

b. Establishing Test Validity

Type of validity | Meaning | Procedure

Content validity | How well the sample of test tasks represents the domain of tasks to be measured | Compare the test tasks with the test specifications describing the task domain under consideration (non-statistical).

Criterion-related validity | How well test performance predicts future performance or estimates current performance on some valued measure other than the test itself | Compare test scores with a measure of performance (e.g. grades) obtained at a later date (for prediction) or with another measure of performance obtained concurrently (for estimating present status) (primarily statistical). Correlate test results with the outside criterion using Pearson r.

Construct validity | How test performance can be described psychologically | Experimentally determine what factors influence scores on the test. The procedure may be logical and statistical, using correlations and other statistical methods.

Review Questions
1. State and explain the types of instructional objectives
2. Describe Bloom’s Taxonomy for classifying educational objectives
3. Explain the steps involved in test development
4. Prepare a table of specification for a test in a given content area of your choice
5. State the guidelines in test construction
6. Describe the criteria for test improvement

References for further reading


Good, C. V. (1973). Dictionary of education. New York: McGraw-Hill Company.
Kithuka, M. (2004). Educational measurement and evaluation: A guide to teachers. Egerton,
Kenya: Egerton University Press.
Zulueta, F. M. (2006). Principles and methods of teaching. Philippines: National Book Store.

3 CHAPTER THREE: TYPES OF TESTS

Learning Objectives
By the end of this chapter the learner should be able to:

1. Identify and describe the various types of tests

2. Outline the merits and demerits of the various types of tests

3. Give suggestions for reducing the limitations of the various types of tests

3.1 Introduction
There are basically two broad categories of tests: essay and objective. Essay tests allow students
to express themselves freely in their answers to particular questions. To a large extent, the
emphasis is on students’ overall understanding of the subject in question. In an objective test,
however, students’ responses are restricted to a number of symbols, words, phrases or simple
sentences, one of which is considered to be the best answer out of several possible alternatives.

3.2 The Essay tests


Essay tests used to be the main traditional way of testing students’ understanding of a subject.
Essay tests consist of a list of questions for which students are required to write out the answers
for all or some of the questions. Generally there are two types of essay tests: long and short
essays.

• A Long Essay is commonly used with older and more mature students who have
mastered the language to be used to a certain degree of proficiency. Terms used in a long
essay include discuss, explain, apply, express your view etc.
• In a Short Essay test, the student is required to treat the subject as briefly as possible.
Terms generally associated with short essay tests include describe, define,
compare and contrast, classify, illustrate etc.

3.2.1 Merits (advantages) of essay tests


Essay tests have a number of advantages. They provide opportunities for the student to:
a. Demonstrate the degree to which he can analyze the problem.
b. Creatively select relevant information.
c. Present evidence of substantial understanding of the subject in question.
d. Organize his answer in a logical and comprehensive manner.
e. Demonstrate as much as he knows where there is no absolutely wrong or right answer.
f. Improve his skills in writing and logical organization of thoughts.

3.2.2 Demerits (limitations) of essays tests
Very often essay tests suffer from various limitations:
1. They measure no more than the ability of the student to recall the information.

2. They suffer from low content validity because of inadequate sampling of the course content.
Very often the test contains only a limited number of questions, with many vital areas
being excluded.
3. They are highly subjective;

i. Very often they suffer from scoring unreliability, that is, scoring depends
on the scorer’s state of mind and is hence highly subjective and inconsistent.
ii. Essay tests are also highly subjective because scoring generally involves
extraneous and irrelevant factors such as the teacher’s moods or his
impression of the literary skill, handwriting, spelling, grammar and
composition fluency of the student.
4. They are time consuming both in answering and grading.
5. They produce responses which can only be effectively graded by a competent scorer.
6. Kithuka (2004) underlines also the carryover effects essay tests suffer from;
i. The item-to-item carryover effect: that is if the examiner marks script by script
rather than item by item, he is likely to acquire some impression of the students’
knowledge on the initial item and this will contaminate his judgment on the
second item.

ii. The test-to-test carryover effect: that is, the fact that essays of average quality
often are rated higher when preceded by poor essays and rated lower when
preceded by very good essays.
iii. The order effect: that is, the fact that the order in which papers are marked affects
the score. Some researchers found that papers read earlier tend to receive higher
ratings than those read nearer the end of the sequence.

3.2.3 Suggestions to reduce limitations of essay tests


In order to improve validity and quality of essay tests, the following guidelines are suggested:
1) Limit essay tests to objectives that are best achievable through an essay testing technique.
2) Reserve essay tests, as much as possible, for:
(i) Evaluating complex or controversial areas of a subject, or situations where
students are expected to apply the acquired knowledge to novel situations.
(ii) Measuring the content and objectives of higher order skills.
3) Questions should be structured in such a way that an overall understanding of students can be
assessed. Structure items that will measure the ability to apply generalization or principles,
identify relationships by starting with such terms as explain why, show relationship,
compare, associate, interpret, give reasons for, analyze how, distinguish etc.
4) Prepare in advance a scoring scheme based on criteria
5) Score one question at a time for all who attempted it.
6) Allow sufficient time for students to answer the questions.
7) Score every objective that is to be measured independently.
8) Mark an essay test when you are physically sound and mentally alert and in an environment
with fewest distractions that is conducive to an intellectual work
9) Give instructions for the test that are explicit and well written out.

10) Mark students’ copies using students’ numbers instead of their names and score the answers
question by question instead of script by script. This helps increase the objectivity of
marking.

11) Provide enough questions and question variety. This practice tends to improve both the
validity and reliability of the test.

3.3 Objective test


The main idea behind the introduction of objective tests into the classroom or as part of public
examinations is to overcome the weaknesses (demerits) of essay tests.
Objective tests can be divided into at least three main groups;
i. Supply item tests
ii. Selection item tests
iii. Rank order item tests
The selection item test can further be sub divided into sub-categories such as;
i. True-false item
ii. Matching item
iii. Multiple-choice item
iv. Pictorial item

These sub-categories are shown below:

Objective tests
    Supply item
    Rank order item
    Selection item: True-false, Matching, Multiple choice, Pictorial

Critics of objective tests mention the following advantages and demerits of objective tests.

3.3.1 Advantages of objective tests


1. This type of test permits extensive coverage of topics
2. Answers to this form of test can be scored quickly and more objectively than other forms
3. Students are able to respond quickly to questions
4. These tests encourage students to pay close attention to what they are studying

3.3.2 Disadvantages of objective test


1. Objective tests tend to measure partial and superficial knowledge rather than broad
conceptual understanding.
2. Setting the test is time consuming and costly in terms of paper and manpower.
3. They give room for guessing

4. They are not effective in testing students’ ability to organize their thoughts or to write
coherently

5. They tend to test recall of factual information items and do not provide for self-
expression, creativity and comments on the part of the student.

3.4 Supply item tests


In supply item (completion or fill-in-the-blank) tests, the students are required to provide missing
information with a word, a phrase or a symbol. The purpose of this form of test is to determine a
student’s ability to recall or recognize the appropriate term, concept or phrase to complete a
statement. At times, a number of words or phrases may be placed below the question, from
which the student will pick the most suitable to complete the sentence in question.
If not well structured a supply item question can be confusing to the student. This is the case for
the following questions:
i. _________ is the capital of ________ in East Africa.
ii. _________ is the square root of ________

Questions of this kind will elicit different types of answers among students. Some students may
correctly fill the two blanks in question (i) with the capital city and name of any East African country.
Likewise, question (ii) admits infinitely many correct pairs of numbers, such as (1; 1), (2; 4), (-3;
9) and so on.
Suggestions for designing and improving supply item tests include the following:
1. The wording must be clear and specific enough to avoid ambiguous and unexpected
responses.
2. There should be only one possible correct answer.
3. Too many blank spaces in the same sentence should be avoided since they tend to
confuse students.
4. Specify in what unit (kg, m, inches, etc.) or value a numerical answer is to be given.
5. Do not make the answer too obvious.

6. Instructions for each sentence should be brief and clearly stated. In addition giving
examples of how information is to be supplied reduces students’ anxiety and saves time.
The direction “fill in the blanks”, is usually sufficient, but the student should be informed
about how detailed the answer should be.
7. The use of lengthy and tortuous statements and highly technical terms should be avoided.

8. Do not use statements that are copied from the text book or workbook, since this
encourages memorization.

3.5 Selection item tests


In selection item tests, the student is required to select the most suitable response from several
plausible (probable) alternatives presented. As stated above, selection item tests can be
categorized into True-false item, Matching item, Multiple-choice item and Pictorial item tests.

3.5.1 True – False tests


In this case items in the test require alternative responses. True/False questions are generally
used to assess knowledge level thinking. One advantage is that they can be prepared and graded
relatively quickly.
Some of the limitations of this type of test include;
a. The tester too often copies statements word for word from the text book or course
content, with perhaps insertion of some negative terms to falsify the items.
b. Generally students and teachers feel that a True/False test does not reflect a true picture
of what they know about the topic.

c. The chance of guessing the correct answer is very high, and this impinges on the primary
purpose of the test, which is to measure what the students know and not how lucky they are.
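To put a figure on the guessing problem in limitation (c), the following minimal sketch uses the binomial distribution to estimate how likely a student is to reach a pass mark by guessing alone. The 10-item test, the pass mark of 5 and the function name `prob_at_least` are illustrative assumptions, not part of the course text.

```python
from math import comb

def prob_at_least(n_items: int, pass_mark: int, p_guess: float = 0.5) -> float:
    """Probability of getting at least `pass_mark` items right by pure guessing.

    Each item is treated as an independent guess that is correct with
    probability `p_guess` (binomial model).
    """
    return sum(
        comb(n_items, k) * p_guess**k * (1 - p_guess) ** (n_items - k)
        for k in range(pass_mark, n_items + 1)
    )

# Hypothetical 10-item test with a pass mark of 5 out of 10.
true_false = prob_at_least(10, 5, p_guess=0.5)    # two alternatives per item
four_option = prob_at_least(10, 5, p_guess=0.25)  # four alternatives per item

print(f"True/false items, guessing only:  {true_false:.3f}")
print(f"Four-option items, guessing only: {four_option:.3f}")
```

Under these assumptions a pure guesser passes the true/false test about 62% of the time, against about 8% for four-option items, which is why true/false scores need cautious interpretation.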
Suggestions for designing and improving true/false tests;
1. Use statements that are absolutely true or false in the student’s environment.

2. Do not use items that will provide clues about the right response. These include words such
as never, always, purely, none, all in statements that are likely to be false and sometimes,
may, usually, could, perhaps in statements that are likely to be true.
3. The intended correct answer should be clear only to a knowledgeable student.
4. Avoid using rhetorical statements, e.g. Water boils at 100°C, does it not? true/false
5. Avoid long and tortuous statements and those that are partly true and partly false.

6. Avoid statements with double negations, e.g. Is it not true that a negative number cannot
be a square root of a positive number? true/false

7. Use synonyms with care; unless carefully worded, they tend to make the choice of answer unnecessarily
difficult.
8. Avoid the use of highly technical terms and overlapping statements. These tend to
distract the students unnecessarily.
3.5.2 Matching – item tests
In Matching-item tests, a choice is to be made from among the same set of alternatives. Students
are required to associate/match two or more related words or phrases. Thus the items might
consist of several terms to be defined, while the alternatives could consist of definitions of such
terms. Usually this test has two columns of items which are to be associated directly. The items
could be simple or complex depending on the level of the students.
Limitations of this type of test include;
a. Often too many items are included in each column, thus requiring too much scrutiny on
the part of the student. The student wastes time and makes more mistakes as he becomes
fatigued. The test does not thus evaluate cognitive ability but endurance.

b. It is restricted to a limited area of the subject in question, since it is difficult to find a
sufficient number of related items in that subject.
c. Good items that are not too obvious are quite difficult to construct.

Suggestions for designing and improving Matching-item tests;
1. Specify the directions as clearly as possible in order to avoid confusion. Directions
should state the basis for matching the items in the two columns.
2. The items in the two columns should be randomly distributed and should give no clues.

3. The entire matching question should appear on a single page. Running the questions in
two pages could be confusing and distracting to students.
4. The wording of items in column A should be shorter than that of items in column B. This permits the student to
scan the test quickly.

5. Column A should contain no fewer than 5 and no more than 10 items. Longer lists
confuse students.

6. Column A items should be numbered as they will be graded as individual questions and
Column B items should be lettered.

7. Column A items should be presented in a logical order, say alphabetically or
chronologically. This ensures easy scanning by the students.

8. Items in both columns should be similar in terms of content, form, grammar and length.
Dissimilar alternatives in column B result in irrelevant clues that can be used to eliminate
items or guess answers by the test-wise student.
9. Negative statements in either column should be avoided.

3.5.3 Multiple-choice tests


These are the most popular objective test items. One of the basic characteristics is that the
introductory statement itself contains the criterion by which the best answer is to be selected out
of the several alternatives. In most cases four or five alternatives are given.
The multiple-choice item has two parts;
a) The stem (introductory statement) which poses the problem

b) The alternatives or choices of which one is the correct answer and the others are incorrect
answers known as distractors or decoys
The stem is usually in the form of a complete problem or a question in which the central issue is
stated. While one of the options is the answer, the purpose of the distractors is to
discriminate between knowledgeable students and less knowledgeable ones.

Some merits of this type of test include;
a) This type of test has the capacity to test not only knowledge and comprehension but also
high level thinking abilities.
b) They can be adapted to a variety of subject matter content and
c) They can be scored easily and objectively.

According to Kithuka (2005) and Zulueta (2006), the following are essential guidelines in the
construction of multiple-choice item tests;

1. Ensure clarity of the task: the statement of the stem must be worded carefully in order to
avoid vagueness and permit direct interpretation.
2. Strive for absolute rather than relative correctness of the answer. The intended response
should admit no difference of opinion from experts. The stem must have a definite
answer.
3. Avoid inserting unnecessary information/preamble in the question.
4. Avoid introducing unintended hints about the correct response by repetition of key
words, synonyms or response length.

5. Avoid writing items in the negative form unless there is absolutely no other way of
testing the concept.

6. Include in the stem any word that might otherwise be repeated in the alternative
responses.
7. Use numbers to label the stem and letters for choices.

8. Avoid using items directly from the text books or past papers since this practice
encourages memorization.

9. Alternatives should be parallel in content, form, length and grammar. Avoid making the
correct answer different from others in form, length or grammar.

10. Correct answers should be in random order. Do not use one letter more than others or
create a pattern.
11. All decoys should be plausible and attractive to students who do not know the answer, yet
should be clearly incorrect.
12. Avoid absolute terms (always, never, none) especially in alternatives. A test-wise person
usually avoids answers that include them.
13. The alternatives ‘all of the above’, ‘none of the above’ should be used sparingly.

14. Arrange alternatives in some logical order such as alphabetically or chronologically.
15. A multiple choice must not be too long, or else it becomes an endurance test rather than a
test of ability.
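Guidelines 7 and 10 (numbered stems, lettered choices, correct answers in random positions) can be applied mechanically when a paper is assembled. The sketch below is one possible way to do it; the item bank, the function name `shuffle_alternatives` and the seed are hypothetical.

```python
import random
from string import ascii_uppercase

def shuffle_alternatives(correct: str, distractors: list[str], rng: random.Random):
    """Return (lettered alternatives, key letter) with the answer in a random position."""
    options = [correct] + list(distractors)
    rng.shuffle(options)  # the correct answer lands in an unpredictable slot
    labelled = dict(zip(ascii_uppercase, options))
    key = next(letter for letter, text in labelled.items() if text == correct)
    return labelled, key

rng = random.Random(2024)  # fixed seed only so the same paper can be regenerated

# Hypothetical item bank: (stem, correct answer, distractors).
items = [
    ("Which measure of central tendency is the most frequent score?",
     "Mode", ["Mean", "Median", "Range"]),
    ("Which reliability technique needs only one testing session?",
     "Split-half", ["Test-retest", "Equivalent-form", "Re-marking"]),
]

answer_key = []
for number, (stem, correct, distractors) in enumerate(items, start=1):
    labelled, key = shuffle_alternatives(correct, distractors, rng)
    answer_key.append(key)
    print(f"{number}. {stem}")
    for letter, text in labelled.items():
        print(f"   {letter}. {text}")

print("Answer key:", answer_key)
```

A random shuffle does not guarantee an even spread of key letters on a short paper, so the resulting key should still be inspected for accidental patterns.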

3.5.4 Pictorial – item tests


This type of objective test is used to measure such objectives as ability to recall information,
complete parts missing in an object or process, interpret information, identify relationships or
dissimilarities, apply generalizations, organize thought, etc. For example, pictorial representation
can be very useful in helping Biology students to identify structures or functions of organisms.
Several studies have measured the cognitive, affective and psychomotor development of pupils
through pictures and models.
Pictorial representation is particularly useful for students with limited reading ability since very
few words are needed to understand the items. Such a test can stimulate students’ interest, clarify
ambiguity and make learning more concrete and realistic.
Limitations of Pictorial – item tests include the following;

a) Because students especially at primary level may have difficulty in perceiving depth, they
tend to see two instead of the three dimensions that the picture represents.
b) Students from disadvantaged socio-economic backgrounds may have less contact with materials
used in the school and hence may be disadvantaged.

c) Complex pictures tend to confuse students, and sketchy diagrams might not contain
enough information to answer a given set of questions.
Suggestions for designing Pictorial – item tests include the following;

1) Pictures used in test must be clear so as to enable the students to recognize the item in
question.
2) Avoid shading as this tends to complicate the diagram beyond recognition.
3) Use pictures that portray the object or event in its simplest form.
4) Teachers with poor drawing ability can obtain good pictures from books and magazines,
which they can either copy or trace. Many biological, chemical and physical processes
are available in form of charts which can be very useful.

3.6 Rank – order test items
In this case students are expected to indicate the appropriate order (serial, chronological, logical,
etc.) of the items presented to them. Examples;
1. Arrange the names below in alphabetical order;
(a) Ojo (b) Ade (c) Mulopo (d) Aquab (e) Wahab (f) Josef
2. What is the chronological order of the following countries in getting independence?
(a) Angola (b) Kenya (c) Togo (d) Ghana (e) Namibia (f) Somalia

Ability to order things or events depends to a large extent on exposure to and familiarity with
relevant learning materials and the level of logical development. It also involves both the ability
to identify similarities or differences among objects as well as the ability to discriminate between
relevant and irrelevant attributes of such objects.
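As an illustration, a marking key for the two example items above could be prepared as follows. The independence dates are the commonly cited ones and should be checked before use in a real key; the position-by-position scoring rule in `score_ranking` is an assumption, one of several ways a rank-order item might be scored.

```python
from datetime import date

# Item 1: alphabetical order of the names in the example.
names = ["Ojo", "Ade", "Mulopo", "Aquab", "Wahab", "Josef"]
alphabetical_key = sorted(names)

# Item 2: commonly cited independence dates (verify before using in a real key).
independence = {
    "Angola": date(1975, 11, 11),
    "Kenya": date(1963, 12, 12),
    "Togo": date(1960, 4, 27),
    "Ghana": date(1957, 3, 6),
    "Namibia": date(1990, 3, 21),
    "Somalia": date(1960, 7, 1),
}
chronological_key = sorted(independence, key=independence.get)

def score_ranking(response: list[str], key: list[str]) -> int:
    """Count the items placed in exactly the position the key expects."""
    return sum(given == expected for given, expected in zip(response, key))

print("Item 1 key:", alphabetical_key)
print("Item 2 key:", chronological_key)

# A hypothetical response that swaps Togo and Somalia.
response = ["Ghana", "Somalia", "Togo", "Kenya", "Angola", "Namibia"]
print("Score:", score_ranking(response, chronological_key), "out of", len(chronological_key))
```

Togo and Somalia both became independent in 1960, so item 2 can only be marked fairly if the key uses full dates or the question is reworded.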

3.7 Summary of the suggestions for all testing techniques


Here are important principles to apply when using testing techniques:
1. Check test items in relation to the objective of the unit.

2. Gauge item difficulty to ensure that the items are suitable for the students for whom they
are intended. It is essential to know the background of the students. It must not be
assumed that because the items are clear to the teacher or to some students, they are also clear
to the rest of the students.

3. Check the literacy level of each item so that the students who are being evaluated in
physics, for instance, are not penalized for reading problems.

4. In order to save students’ time and to encourage them to complete the test, arrange the test items in
order of difficulty, or else give enough time.
5. State the instructions as clearly as possible and avoid the use of unfamiliar terms. The
students must be made aware by reading the instructions what they are supposed to do.
6. Develop weighting for the test that reflects the objectives.

7. Keep records of tests scores as a reference for evaluating students’ progress or difficulties
and as a means for monitoring the effectiveness of teaching methods as well as the test
itself.

8. Avoid irrelevant sources of difficulty. Instead of using compound words like
“socio-politic-economical equilibrium”, simply state “social, political and economic stability”.
9. The duration of the test should not be above students’ attention span.

Review Questions
1) Distinguish between Essay and Objective tests
2) Describe the various types of objective tests giving the merits and demerits of each
3) Discuss the essay tests and outline their advantages and disadvantages
4) Give suggestions for reducing the limitations of the various types of tests
5) Distinguish between the Rank – order type of test items and the supply item tests

References for further reading


Gay, L.R., Mills, G.E. & Airasian, P. (2006). Educational research: Competencies for analysis
and application (8th Ed.). Columbus, Ohio: Pearson, Merrill Prentice Hall.

Kithuka, M. (2004). Educational measurement and evaluation: A guide to teachers. Egerton,
Kenya: Egerton University Press.

Zulueta, F.M. (2006). Principles and methods of teaching. Philippines: National Book Store.

4 CHAPTER FOUR: CHARACTERISTICS OF GOOD TESTS

Learning Objectives
By the end of this chapter the learner should be able to:
1. Distinguish between reliability and validity of a test
2. Outline the factors that impinge on test reliability
3. Explain the methods of assessing the reliability of a test
4. Describe the types of test validity
5. Outline the factors that threaten the validity of a test
6. Describe administrability and scorability as characteristics of good tests.

4.1 Introduction
There are a number of factors that affect a student’s performance in any test. These include
socio-economic background, anxiety, interest, mood, cultural values, teacher
characteristics, nature of learning materials, instructional techniques, time of the day, etc.
In addition to these factors, other variables must be considered. The evaluator must seek answers
to the following questions:

i. What is the objective of the test? Will it bring out a student’s understanding,
ability, industry, skill?

ii. Does the test cover the subject adequately? Is the content of the test adequate and
relevant to what has been taught?

iii. What acceptable criteria guide the tester in the identification, selection, and
weightings of items?
iv. Is the test reliable, valid, administrable, scorable, interpretable and economical?

This chapter pays particular attention to the last question and discusses reliability, validity,
administrability and scorability as characteristics of good tests.

4.2 Reliability of a Test


Reliability is a measure of the degree to which instruments yield consistent results or data after
repeated trials. It refers to the consistency of the scores obtained by a learner when re-examined
with the same test on different occasions or with a different set of equivalent items.
Synonyms for reliability are: dependability, stability, consistency, predictability, accuracy. Test
reliability deals with the stability, accuracy or consistency of scores (data) collected from a given
test. The test scores of students should be reproducible and dependable. Thus, if repeated more
than once under similar conditions, a reliable test should yield the same or very similar
results. If a student scores 90% on a mathematics test on Monday and 40% on the same or a similar test
on Friday, neither score can be relied upon.

4.2.1 Factors impinging on test reliability


There are several factors that may affect the reliability of a test. These factors include; (the list is
not exhaustive)
1) The relationship between the objective of the tester and that of the student
2) The clarity and specificity of the items of the test
3) The certainty and information sought by the tester
4) The significance of the test to the student
5) The consistency of the cognitive processes of the tester (i.e. marker’s reliability) and
testee over time.
6) Familiarity of the testee with the subject matter
7) The tester’s ability to identify and monitor the cognitive processes in question, and
current ability to translate such processes into question form
8) Interest of the testee in the subject
9) Disposition of the testee at the time he or she is tested
10) Chance factors

11) Length of the test (generally, the longer the test, the more reliable it is and the closer
its reliability index is to 1)
12) Level of difficulty of the test items
13) Socio-cultural variables.
14) Practice and fatigue effects.
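Factor 11 (test length) can be illustrated with the Spearman-Brown prophecy formula, a standard result that these notes do not name. It predicts the reliability of a test lengthened by a factor k, assuming the added items are of comparable quality to the existing ones. The starting reliability of 0.60 is a hypothetical value.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when a test is lengthened by `length_factor`.

    r_new = k * r / (1 + (k - 1) * r), assuming the added items are
    comparable in quality to the existing ones.
    """
    k, r = length_factor, reliability
    return k * r / (1 + (k - 1) * r)

original = 0.60  # hypothetical reliability of a short test
for k in (1, 2, 3, 5, 10):
    predicted = spearman_brown(original, k)
    print(f"{k:>2} x original length -> predicted reliability {predicted:.2f}")
```

The predicted reliability rises with length but never exceeds 1, consistent with the remark that the index approaches 1 as the test gets longer.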

4.2.2 Methods of assessing Reliability


1) The test – retest technique
This involves administering the same test twice to the same group of learners. There is usually a
time lapse between the two tests which should not be too long.
Steps involved
a) Select the group of learners

b) Administer the test to the learners
c) Keeping all the initial conditions constant, administer the same test to the same learners,
say after two to four weeks.
d) Correlate the scores from both tests.

The correlation coefficient (Pearson product moment or Spearman rank correlation coefficient)
obtained is referred to as the “coefficient of reliability”. If the coefficient is high (equal to or higher
than 0.7), the instrument is said to be reliable.
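The final step (correlating the two sets of scores) can be sketched as below, using the Pearson product-moment formula and the 0.7 guideline quoted above. The scores for the eight learners are hypothetical.

```python
from math import sqrt

def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores of the same eight learners on two sittings of one test.
first_sitting = [45, 62, 70, 55, 80, 38, 66, 74]
second_sitting = [48, 60, 73, 52, 84, 41, 63, 78]

reliability = pearson_r(first_sitting, second_sitting)
print(f"Coefficient of reliability: {reliability:.2f}")
print("Meets the 0.7 guideline" if reliability >= 0.7 else "Below the 0.7 guideline")
```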

Disadvantages: - The subjects may be influenced by the first test and hence tend to remember
their responses during the second test. In addition students do change or get bored with
repetition.

2) The Equivalent - form technique


This technique uses two equivalent tests. Specific items in each instrument are different but
designed to measure the same concept. They are the same in number, structure and level of
difficulty.
Steps involved
a) Sample different items in the test.
b) Divide the items sampled into two groups.
c) Administer one form to a group of learners randomly selected.
d) After a period of time, the second form is administered to the same learners.
e) Correlate the scores obtained from the two forms.
If the correlation coefficient is high, the test is said to yield data that have a high reliability.
Disadvantage: Construction of two different tests measuring the same concept is difficult.

3) Split – Half Technique


This requires only one testing session. The test is designed in such a way that it has two parts.
The scores from Part 1 are correlated with the scores from Part 2.
Steps involved
a) Sample items that measure the variable.
b) Administer the test to the sample group.
c) Divide the scored items into two groups, at random or, e.g., into odd-numbered and even-numbered items.
d) Get each learner’s total score on each of the two groups.
e) Correlate the two sets of total scores.

Advantage; This method eliminates chance error due to different test conditions as in the first
two methods.

Disadvantage; The reliability computed may not be for the whole test since the method
correlates one half of the items against the other half in the same test.
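A minimal sketch of the split-half steps follows, using an odd/even split of hypothetical item marks for six learners. Because the raw correlation describes only half-length tests (the disadvantage noted above), the sketch also applies the Spearman-Brown step-up, a standard correction that these notes do not mention, to estimate the reliability of the full test.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical marks (1 = correct, 0 = wrong): one row per learner,
# one column per item of a 10-item test.
responses = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 0, 1, 1, 0, 1],
    [1, 1, 1, 0, 1, 1, 0, 0, 0, 1],
    [1, 0, 1, 1, 0, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
]

odd_totals = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, 7, 9
even_totals = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, 8, 10

half_r = pearson_r(odd_totals, even_totals)
full_r = 2 * half_r / (1 + half_r)  # Spearman-Brown step-up to full length

print(f"Correlation between halves:       {half_r:.2f}")
print(f"Estimated full-test reliability:  {full_r:.2f}")
```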

4.3 Validity of a Test


Validity is the accuracy and meaningfulness of inferences which are based on the test results. In
other words, validity is the degree to which results obtained from the analysis of the data actually
represent what the test is intended to measure. Validity is, in fact, the most important characteristic of a good
test. The commonest definition of validity is epitomized by the question:
“Are we measuring what we think we are measuring?”
Validity therefore refers to the extent to which the test serves its purpose or the efficiency with
which it measures what it intends to measure. In other words, a valid test is a test which actually
measures what it is supposed to measure. The validity of a test shows the relationship between the
data obtained and the purpose for which the data was collected. In other words, it shows
whether the test accomplishes what it is supposed to accomplish.
NOTE:
Reliability is a necessary condition for quality measurement, but it alone is not sufficient.
It is important to note that a test may reveal accurate or consistent scores; but if it is not
useful for the purpose it is intended then it is not valid. On the other hand, if it is not
consistent in measuring whatever it is measuring then it cannot be valid for any purpose.
Therefore a test may measure with perfect consistency, but if it does not serve the purpose for which it
is intended then it has no validity. In other words, in order to obtain a valid test one must
have a reliable test.
In summary a valid test is reliable but a reliable test is not necessarily valid.

4.3.1 Types of test validity


There are four kinds of validity in tests:
i. Face validity
ii. Construct validity,
iii. Content validity and
iv. Criterion-related validity

a) Face validity
Face validity means that a test appears valid on the face of it. It is established when on the
examination of a test a person concludes that it measures the relevant trait. It is sometimes called
expert validity or validation by consensus. The examiner of the test may be an expert or a novice
in test construction. However, every test must have face validity. This is particularly important
for tests used for screening job applicants. If such tests lack face validity, there can be an
outcry against the firm using them. Face validity is important from a public relations point of
view.

b) Construct validity
Construct validity is a measure of the degree to which data obtained from a test meaningfully
and accurately reflect or represent a theoretical concept. For example, would a score of 90
points on a reading test actually reflect the true reading ability of a pupil, or would a score on a
series of mathematics items truthfully reflect the mathematical aptitude of a student? This
approach is often used where no criteria or domain of content is generally accepted as an
adequate measure of a concept. Concepts such as level of management, creativity, self-esteem,
motivation, etc. are all abstract, hypothetical concepts. They cannot be directly observed but their
effects on the behavior of learners can be observed.

c) Content validity
Content validity is a measure of the degree to which data collected using a particular test
represents a specific domain of indicators or content of a particular concept. For example, a test
of arithmetic for standard four pupils would not yield content valid data if items do not include
all four operations – addition, subtraction, multiplication, and division. In designing a test that
will yield content-valid data, the teacher must first specify the domain of indicators which are
relevant to the concept being measured. Theoretically, a content-valid measure should contain
all possible items that should be used in measuring the concept. The usual procedure in
assessing the content validity of a test is to use professionals or experts in the particular field.
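For the arithmetic example, a quick coverage check can accompany expert review. The sketch below assumes each item has been tagged with the operation it assesses; the item numbers and tags are hypothetical, and the check supplements, rather than replaces, expert judgment.

```python
from collections import Counter

# Domain of indicators specified by the teacher for standard four arithmetic.
required_domain = {"addition", "subtraction", "multiplication", "division"}

# Hypothetical draft test: item number -> operation the item assesses.
item_tags = {
    1: "addition",
    2: "addition",
    3: "subtraction",
    4: "multiplication",
    5: "subtraction",
}

items_per_operation = Counter(item_tags.values())
missing = required_domain - set(item_tags.values())

for operation in sorted(required_domain):
    print(f"{operation:<15}{items_per_operation[operation]} item(s)")

if missing:
    print("Not covered:", ", ".join(sorted(missing)))
else:
    print("Every operation in the domain is covered by at least one item.")
```

Here the draft contains no division items, so by the criterion in the text it would not yield content-valid data.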

d) Criterion-related validity
Criterion-related validity refers to the use of a test in assessing learners’ behavior in specific
situations. If a test purports to measure performance in a job, the subjects who score high on the
test must also perform well in their jobs. Two types of criterion-related validity are recognized:
predictive and concurrent.

Predictive validity refers to the degree to which obtained data predict the future behavior of
subjects. An engineering firm, for example, may advertise posts for two graduates in mechanical
engineering. Ten graduates may apply for the posts and, during the interview, they are given a test.
The test scores are supposed to assess the graduates’ performance on the job once they are
employed. The extent to which such measures predict the future job performance of the
selected graduates is the predictive validity of the instrument. If the data obtained
using the tool have predictive validity, the graduates’ scores on the test will correlate highly
with a measure of their future performance on the job.
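A predictive validity check of this kind could be sketched with the Spearman rank correlation coefficient. The sketch departs from the example above by assuming that all ten graduates were later rated on the job, since two selected graduates are too few to correlate. All scores and ratings are hypothetical, and the simple ranking function assumes there are no tied values.

```python
def rank_descending(scores: list[float]) -> list[int]:
    """Rank 1 = highest score. Assumes no tied scores."""
    order = sorted(scores, reverse=True)
    return [order.index(s) + 1 for s in scores]

def spearman_rho(x: list[float], y: list[float]) -> float:
    """Spearman rank correlation: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(x)
    rank_x, rank_y = rank_descending(x), rank_descending(y)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n**2 - 1))

# Hypothetical selection-test scores and later supervisor ratings (out of 10).
test_scores = [78, 65, 90, 55, 82, 70, 60, 88, 74, 50]
job_ratings = [7.5, 6.0, 9.0, 5.5, 7.0, 6.8, 6.2, 8.5, 8.0, 4.5]

rho = spearman_rho(test_scores, job_ratings)
print(f"Rank correlation between test score and job rating: {rho:.2f}")
```

A coefficient close to 1 would suggest that the test ranks candidates much as their later performance does.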
Concurrent validity, on the other hand, refers to the degree to which data are able to
predict the behavior of subjects in the present and not in the future. An example of this may be
found in medical studies, particularly in psychiatry. A psychiatrist might use a measure to
establish whether a patient is schizophrenic. In this case, a patient’s scores on a psychiatric test
would correlate highly with his or her present behavior if the instrument does indeed yield data
that accurately represent this type of mental illness.
4.3.2 Factors threatening test validity
Once again, a test may show no validity, some validity, or perfect validity. There are several
factors that may threaten or diminish the validity of tests and instruments. These factors as
outlined by Gronlund (1985) include;
1. Unclear test direction
2. Confusing and ambiguous test items
3. Appropriateness of test items
4. Difficult items
5. Objectivity of the test
6. Using vocabulary too difficult for test takers
7. Overly difficult and complex sentence structures
8. Inconsistent and subjective scoring methods
9. Untaught items included on achievement tests
10. Failure to follow (standardized) test administration procedures
11. Cheating, either by participants or someone teaching the correct answers to the specific
test items; or identifiable pattern of answers
12. Improper length of the test (too short or too long) and poor arrangement of items
13. Inappropriate level of difficulty of the test items
14. Poorly constructed test items
15. Test items inappropriate for the outcomes being measured

4.4 Other characteristics of a good test
Other characteristics of a good test include administrability, scorability, interpretability and
economy.

4.4.1 Administrability
Administrability is a characteristic of a measuring instrument involving:

• The ease with which an examiner may understand and present the instructions for the test

• The availability of alternate forms of the measuring instrument

• The ease with which the individuals tested comprehend how they are to proceed

• The efficiency with which the test may be scored (Good, 1973)
A good test is administered with ease, clarity and uniformity. Test procedures are standardized so
as to achieve uniformity in administering the test. Testing conditions are controlled
in such a way that they are the same for all examinees. This is done so that the scores obtained by
them are comparable.
In order to secure uniformity of testing conditions, the test constructor provides detailed directions
for administering the test. Time limits, oral instructions to the examinees and sample items for
demonstration are specified. Definite provision is made for the preparation, distribution and collection of test
materials and for other factors.
To ensure administrability of a test, directions should be made simple, clear and concise. Test
items should be introduced by sample items and illustrated by practice exercises. The test format
should not make it difficult for examinees to read the items, record their answers or move from one page or part of the
test to the next. The size of the page, length of line, size and style of type and illustrations should be
such that they facilitate test administration.

4.4.2 Scorability
Scorability is a criterion used in judging tests. It refers to the degree of objectivity possible, the
directions provided, the time involved and the simplicity of procedures (Good, 1973, p. 519).

A good test is easy to score. Test results should be easily available to both the student and the
teacher so that proper remedial and follow up measures and curricular adjustment can be made.
However if test results become available only after a considerable time they lose their usefulness
for both learner and teacher. Tests are easy to score when the directions for scoring are simple
and clear.

Review Questions
1. Distinguish between reliability and validity of a test
2. Outline the factors that impinge on test reliability
3. Explain the methods of assessing the reliability of a test
4. Describe the types of test validity
5. Outline the factors that threaten the validity of a test
6. Describe administrability and scorability as characteristics of good tests.

References for further reading


Gay, L.R., Mills, G.E. & Airasian, P. (2006). Educational research: Competencies for analysis
and application (8th Ed.). Columbus, Ohio: Pearson, Merrill Prentice Hall.

Gronlund, N.E. & Linn, R.L. (1990). Measurement and evaluation in teaching (6th Ed.). New
York: Macmillan Publishing Company.
Oriondo, L.L. & Dallo-Antonio, E.M. (1989). Evaluating educational outcomes: Tests,
measurement and evaluation. Manila, Philippines: Rex Book Store.

5 CHAPTER FIVE: INTRODUCTION TO STATISTICS

Learning Objectives
By the end of this chapter the learner should be able to:
1. Give the meaning of statistics and state the importance and limitations of statistics
2. Explain the sub-divisions of statistics

3. Identify the scales of measurement and determine which one is applicable for given
variables
4. Give the meaning of Discrete and Non-discrete data/variables citing relevant examples.
5. Identify Primary and Secondary data and describe the methods of data collection

5.1 Introduction
Definitions of statistics
1) Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data
to assist in making more effective decisions.

2) Statistics is also the science of data which involves collecting, classifying, summarizing,
organizing, analyzing and interpreting numerical information.

3) Statistics is the branch of scientific inquiry, which provides for collecting data (sampling and
experimental design), organizing and summarizing data (graphs and tables), and statistical
inference (making generalizations to a larger population based on observations from a
sample).
Statistical approach to a problem may be broadly summarized as follows:

a) Collection of facts – this is the first stage in the statistical treatment of the problem.
Assembling of facts is thus a very important process. Always ensure that data collected is
accurate, reliable and thorough.
b) Organization of facts – at times data can be large and therefore there is need to condense
it through organization, classification, tabulation and presentation of the data in a suitable
form.

c) Analysis of facts – statistical analysis includes measures of central tendency, e.g.
mean, mode and median, and measures of dispersion, e.g. variance, range, standard
deviation, etc.

d) Interpretation of facts – this is done through judgment and inference drawn from the
sample.
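The analysis stage can be illustrated with Python's standard `statistics` module. The marks are hypothetical, and the population forms of the variance and standard deviation are used on the assumption that the ten marks make up the whole class rather than a sample.

```python
import statistics

scores = [56, 72, 64, 72, 81, 49, 72, 64, 90, 60]  # hypothetical test marks

# Measures of central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of dispersion (population forms: the marks are treated as the whole class)
score_range = max(scores) - min(scores)
variance = statistics.pvariance(scores)
std_dev = statistics.pstdev(scores)

print(f"Mean {mean}, median {median}, mode {mode}")
print(f"Range {score_range}, variance {variance:.1f}, standard deviation {std_dev:.2f}")
```

If the marks were a sample drawn from a larger group, `statistics.variance` and `statistics.stdev` would be the appropriate counterparts.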

5.2 Importance of Statistics
i. Statistics permits summarization and presentation of large quantities of information
ii. It helps to undertake and understand research in our areas of interest

iii. It is used in government to formulate policies and guide administration, e.g. the National
Population and Housing census.
iv. It helps businesses in decision making by providing future estimates and expectations
v. It enables us to formulate and test a hypothesis (statistically assess a statement)
vi. Can you think of more?

For example, when two people are in love and taking time to know each other (courtship), they
are collecting data of each other to arrive at a conclusion. We may find our partner to be kind,
honest hearted, etc. This is all data, which comes in different forms and it is collected for various
reasons and purposes.

5.3 Limitations of Statistics


a) Statistics deals with only those subjects of inquiry which are capable of being
quantitatively measured and numerically expressed. Not all subjects can be expressed in
numbers e.g. poverty, health, intelligence, skin colour, colour of eyes etc. are not suitable
for statistical analysis.

b) Statistics deals with aggregates of facts and no importance is attached to individual items.
It is therefore suited to problems where group characteristics are to be studied.

c) Statistical data is only approximately, not mathematically, correct. Sampling techniques
allow observation of a limited number of items and hence give only an estimate of the desired
results. Thus statistics fails when exactness is essential.

d) Statistics can be used to establish wrong conclusions and thus can be used only by
experts.
e) Liable to be misused – for example, opinion polls

5.4 Subdivisions in statistics


The study of statistics is usually divided into two broad areas namely;
1. Descriptive and inferential statistics
2. Deductive and inductive statistics

1. Descriptive statistics and inferential statistics.
Descriptive statistics involves description of data. It deals with methods of organizing,
summarizing, and presenting data (by use of graphs, charts such as bar charts and pie charts) in
an informative way.

Inferential statistics deals with methods used to find out something about a population, based on
results from a sample. Generalizations are made about the entire population on the basis of the
sample results. Inferential statistics is emphasized more than descriptive statistics because it
is important in decision making and because it acknowledges the potential for error in making
generalizations.

2. Deductive and inductive statistics


Deductive statistics involves use of probability to determine the chance of getting a particular
kind of sample result.
Inductive statistics involves the logic of making statistically valid conclusions about the
population on the basis of a sample.

5.5 Scales of Measurements


The field of statistics deals with measurement of both quantitative and qualitative variables. The
measurements are the actual numerical values of a variable. In this section, we shall consider the
four generally used scales of measurements.

1. Nominal Scale
In the nominal scale of measurement, numbers are used simply as labels for groups or classes. If
our data set consists of red, orange, yellow, green and blue items, we may designate red as 1,
orange as 2, yellow as 3, green as 4 and blue as 5. In this case the numbers 1, 2, 3, 4 and 5 stand
only for the category to which a data point belongs. “Nominal” stands for “name” of category.
The nominal scale of measurement is used for qualitative rather than quantitative data: red,
orange, yellow, green, blue; male, female; professional classification; geographic classification,
and so on.

2. Ordinal Scale
In the ordinal scale of measurement, data elements may be ordered according to their relative
size or quality. For example five products may be ranked by a consumer as 1, 2, 3, 4 and 5;
where 5 is the best and 1 is the worst. In this scale, we do not know how much better one product
is than others, we only know that it is better.

3. Interval Scale
In the interval scale of measurement, we can assign meaning to distances between any two
observations. The data are on an interval of numbers, and distances between elements can be
measured in units. For example, in 2003 the mean KCSE score for Kamwamu Secondary School
was 8.345. In 2004 it was 8.764. These numbers are on an interval scale since they provide a
ranking of the performance and the arithmetic operations of addition and subtraction are
meaningful.

4. Ratio Scale:
The ratio scale is the strongest scale of measurement. Here not only do the distances between
pairs of observations have a meaning, but there is also a meaning to ratios of distances. Salaries
are measured on a ratio scale; a salary of Kshs. 85,000 is twice as large as a salary of Kshs.
42,500. Such a comparison is not possible with temperatures, which are on an interval scale but
not a ratio scale (we cannot say that 30°C is twice as warm as 15°C). The ratio scale contains a
meaningful zero (0°C is not a meaningful zero in this respect). Typical business data, such as
revenue, cost, and profit fall into the group of ratio data.
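
The difference between the interval and ratio scales can be checked numerically. The short Python sketch below is only an illustration: the Kelvin conversion (K = °C + 273.15) is brought in as an assumption from outside the text, simply to show that the ratio of two Celsius readings changes when the zero point moves, while the ratio of two salaries does not depend on any such choice.

```python
# Ratio scale: salaries have a true zero, so the ratio is meaningful.
salary_ratio = 85_000 / 42_500

# Interval scale: Celsius has an arbitrary zero. Expressing the same two
# temperatures in kelvin (assumed conversion: K = C + 273.15) changes the ratio.
def celsius_to_kelvin(c):
    return c + 273.15

celsius_ratio = 30 / 15
kelvin_ratio = celsius_to_kelvin(30) / celsius_to_kelvin(15)

# Differences, by contrast, are the same on both temperature scales.
celsius_diff = 30 - 15
kelvin_diff = celsius_to_kelvin(30) - celsius_to_kelvin(15)

print("salary ratio:", salary_ratio)
print("Celsius ratio:", celsius_ratio, "Kelvin ratio:", round(kelvin_ratio, 3))
print("Celsius difference:", celsius_diff, "Kelvin difference:", round(kelvin_diff, 2))
```

The Celsius ratio comes out as 2 but the Kelvin ratio does not, which is why "twice as warm" is not a valid statement on an interval scale.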

Self-test exercise
1. What is the scale of measurement for each of the following variables?
a) Student grade point averages
b) Distance students travel to class
c) Students’ scores on the first statistics test
d) A classification of students by district of birth
e) A ranking of students by year of study
2. What is the scale of measurement for these items related to the Newspaper business?
a) The number of papers sold each Sunday during 2004
b) The number of employees in each of the departments, such as; editorial, advertising,
sports, etc.
c) A summary of the number of papers sold by District
d) The number of years each employee has been working for the institution.

5.6 Variables: meaning and classification


A variable is a symbol such as X, Y, Z etc. which assumes any of a prescribed set of values. The
set of values which a variable can assume is called the domain of the variable. For example,
poverty has research variables such as residence, income/employment, level of education etc.

A variable is a characteristic or property of an individual population unit. The name “variable”
is derived from the fact that any particular characteristic may vary among the units in a
population.

If a variable can assume only one value then it is called a constant.

5.6.1 Classification of variables


Variables can either be:

 Qualitative

- A qualitative variable is a symbol whose range consists of attributes or non-quantitative
characteristics of people, objects, or events. For example, sex (male, female), race
(African, other), grades in an exam (A, B, C, D, E), etc.

- Categories of a qualitative variable are non-overlapping (an element cannot be in more
than one category). They may or may not suggest an order or rank.

 Quantitative

- Observations are numbers representing an amount or count of a certain characteristic like
height, weight, etc. For example, the weight of 10 packets of maize flour.

- Quantitative variables can be discrete or continuous

5.6.2 Discrete and Non-discrete data


A discrete variable (data) arises out of enumeration or counting and assumes only a countable
number of values. It assumes only specific values.

For example, family size, number of students in a class, number of passengers in a bus, number
of worshippers in a church etc.

A continuous variable assumes values on an interval. It arises out of measurement or
experiment. It assumes any value between two given values.
For example, the weight of a baby born at the hospital: 0 < x < 4 kg, meaning that the weight of
the baby is within the range between 0 kg and 4 kg. The actual weight can be determined through
measurement.

A summary of the classification of data is shown in Figure I below:

Data
   Qualitative
      Examples: Gender, race, colour of cars, etc.
   Quantitative
      Discrete
         Examples: No. of children in a class, number of employees, etc.
      Continuous
         Examples: Weight of a person, temperature on a certain day, etc.

Figure I: Classification of data

5.7 Sources of data


There are two sources of data:
1) Primary data
2) Secondary data
Primary data is data in its raw form or data as collected by a researcher directly from the
respondents. It can be collected through an experiment being performed to obtain necessary data.

Secondary data refers to data collected from existing published or unpublished sources such as
official document, government publications, text books, journals etc.
There are 3 classes of secondary data:

- Continuous or regular data - this refers to data from regular publications such as monthly
data on rainfall, monthly data on treasury bills, monthly data on inflation etc.

- Periodical data – this is data collected and published over a period of time, e.g. a
population census.

- Irregular data – this data cannot be predicted on the basis of time, e.g. publication of
journals, thesis, books etc.

Comparison between primary and secondary data
Primary data is preferred to secondary data due to the following:
i) Compiling errors might be made while collecting secondary data.
ii) In case of secondary data, one may use data out of context.

Reasons for collecting data


o To assist in making informed decisions
o To provide needed information in a study or survey
o To satisfy our curiosity
o To measure performance of an ongoing service or production process

How to collect primary data


1. Indirect oral investigation, e.g. through interviewing friends, correspondents etc.
2. Direct oral investigation – interviewing the respondents directly
3. Mailed questionnaire method – this is posting questionnaires to respondents to fill in
and return.
4. Participant observation – here the researcher makes all the observations
5. Experiments – this involves actual testing in a laboratory setup

Review Questions
1. Give the meaning of statistics and state its importance and limitations.

2. Five ice cream flavors are rank ordered by preference. What is the scale of
measurement?
3. Which of the following variables are discrete and which are continuous?
a). Time taken to complete a project
b). Length of a safari to a game reserve
c). Number of rooms in a house
d). Age of a building
e). Volume of water in a tank
4. Explain the various sub-divisions of statistics
5. Identify the scales of measurement and determine their applicability for given
variables
6. Give the meaning of Discrete and Non-discrete data/variables citing relevant
examples.
7. Identify Primary and Secondary data and describe the methods of data collection

References for further reading


Allan G. Bluman (1995). Elementary statistics: A step by step approach (2nd Edition) Wn. C.
Brown Publishers Melbourne, Australia

Harper W. M (1991). Statistics (6th Ed.) Harlow England: Longman Group UK Ltd.

Saleemi N. A. (1997). Statistics simplified. Acme Press (Kenya) Ltd. Nairobi, Kenya

6 CHAPTER SIX: DATA COLLECTION, ORGANIZATION AND PRESENTATION

Learning Objectives
By the end of this chapter the learner should be able to:

1. Appreciate the significance of organizing and presenting data using appropriate
techniques;
2. Organize and Present data using appropriate techniques;
3. State the rules of forming frequency distributions;
4. Present data by use of histograms, frequency polygons and ogive.

6.1 Collection and Presentation of Data


Once data has been collected they have to be organized and analysed. Data can be collected in
many ways e.g. scientists obtain data through observation or results of experiments. Data can
also be collected from records e.g. registrar’s office. Economists and social workers obtain data
by use of questionnaires.

6.2 Organizing Data


When data is first collected it is usually in raw form and hence there is need to organize it. One
way is to arrange the data in either ascending or descending order, e.g. the weights of five
students were measured to be 50kg, 72kg, 65kg, 45kg and 64kg.

Arrange in ascending and descending order;

Ascending order; 45, 50, 64, 65, 72

Descending order; 72, 65, 64, 50, 45

This method is not convenient when data involves too many values to be listed.

Another method is using a Tally table. For example, in a statistics exam the performance of 40
students was as follows:

53, 52, 51, 56, 58, 60, 52, 54, 52, 63, 52, 56, 56,

66, 70, 52, 52, 70, 68, 52, 56, 52, 51, 59, 62, 63, 76,

52, 54, 56, 52, 50, 51, 56, 52, 60, 68, 72, 52, 54

Tally table

Marks Tally No. of students


50 / 1
51 /// 3
52 //// //// // 12
53 / 1
54 /// 3
56 //// / 6
58 / 1
59 / 1
60 // 2
62 / 1
63 // 2
66 / 1
68 // 2
70 // 2
72 / 1
76 / 1
Total 40
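
The tally table above can also be built mechanically. The following is a minimal Python sketch, assuming the 40 marks are typed in exactly as listed; `Counter` from the standard library does the counting, and one stroke is printed per student (the hand-drawn bundles of five are not reproduced).

```python
from collections import Counter

# Marks of the 40 students, copied from the listing above.
marks = [
    53, 52, 51, 56, 58, 60, 52, 54, 52, 63, 52, 56, 56,
    66, 70, 52, 52, 70, 68, 52, 56, 52, 51, 59, 62, 63, 76,
    52, 54, 56, 52, 50, 51, 56, 52, 60, 68, 72, 52, 54,
]

# Counter maps each distinct mark to the number of students who scored it.
tally = Counter(marks)

print(f"{'Marks':<6} {'Tally':<13} No. of students")
for mark in sorted(tally):
    count = tally[mark]
    print(f"{mark:<6} {'/' * count:<13} {count}")
print("Total", sum(tally.values()))
```

Sorting the keys before printing gives the same ascending order of marks as the table.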

Exercise

The data below shows marks out of 30 for 50 students in a History Examination’s Continuous
Assessment Test. Construct a tally table for this data

10 12 11 15 16 15 14 20 21 19 18 17 16 14 15 13 15

11 21 19 12 14 15 20 19 18 16 16 17 18 18 16 14

15 14 12 11 17 19 13 12 9 10 17 16 18 19 16 17 9

6.3 Presentation of Data
Once data has been collected, the next step is to present it in tables. For example, the marks
obtained by 40 students in a Statistics examination can be presented in table form as follows:

Marks             50  51  52  53  54  56  58  59  60  62  63  66  68  70  72  76

No. of students    1   3  12   1   3   6   1   1   2   1   2   1   2   2   1   1

However, one easier way to understand data is through diagrammatic presentation which
indicates very clearly any trends and patterns in the data. The most common diagrams are pie
and bar charts.

6.3.1 Bar charts


Bar charts are the most common type of diagram in practice. A bar is a thick line whose width is
shown merely for attention. The bar chart makes comparisons by means of parallel bars of equal
width placed side by side either horizontally or vertically.

The gap between one bar and another should be uniform. In bar charts, it is the length of the bar
that matters and not the width.
Example;

The table below shows production of coffee in thousands of tonnes for the period 1970 to 1976.
Construct a bar chart;

Year                   1970  1971  1972  1973  1974  1975  1976

No. of tonnes ('000)    120   140   160   200   170   180   150

6.3.2 Multiple Bar Charts
This type of bar chart is used when comparisons are to be made between more than one
characteristic. Bars representing the different characteristics are placed side by side.
Example;

The table below shows foreign exchange earnings in million shillings for the Tourism and
Agriculture sectors of the economy for the period 2000 to 2005.
Present this information by use of a multiple bar chart.

Year 2000 2001 2002 2003 2004 2005

Agriculture 300 320 270 250 280 320

Tourism 350 380 150 200 240 300

6.3.3 Composite Bar Charts
This type of chart is used when we want to compare the same characteristics from different
sources. In this case a bar chart for the total is drawn and then broken down into components.
Example;

The table below shows earnings in million shillings for Domestic and Foreign Tourism. Present
this information in a composite bar chart.

Year 2002 2003 2004 2005

Domestic (Kshs millions) 50 80 140 130

Foreign (Kshs millions) 150 170 260 190

Total 200 250 400 320

6.3.4 Pie Charts
Information in a pie chart is represented in a circular figure with angles representing the various
values.
Example: A student spent Kshs. 1000 within a term in the following manner

Item            Soap   Fare   Sugar   Toothpaste   Beverage

Amount (Kshs)    190    350     280           80        100

Represent the information in a pie chart.

Total amount spent = Kshs. 1000. A circle has a total of 360°.
The portions of the circle represented by the different items are as follows;

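$$
\begin{aligned}
\text{Soap} &= \frac{190}{1000} \times 360^\circ = 68.4^\circ \\
\text{Fare} &= \frac{350}{1000} \times 360^\circ = 126^\circ \\
\text{Sugar} &= \frac{280}{1000} \times 360^\circ = 100.8^\circ \\
\text{Toothpaste} &= \frac{80}{1000} \times 360^\circ = 28.8^\circ \\
\text{Beverage} &= \frac{100}{1000} \times 360^\circ = 36^\circ
\end{aligned}
$$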
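
The same angle calculation can be written as a short Python sketch. It assumes only the spending figures given in the example; the variable names are chosen for illustration.

```python
# Term spending in Kshs, taken from the example above.
spending = {"Soap": 190, "Fare": 350, "Sugar": 280, "Toothpaste": 80, "Beverage": 100}

total = sum(spending.values())

# Each item's angle is its share of the total, scaled to the 360 degrees of a circle.
angles = {item: amount / total * 360 for item, amount in spending.items()}

for item, angle in angles.items():
    print(f"{item:<11} {angle:6.1f} degrees")
print("Sum of angles:", round(sum(angles.values()), 1))
```

Checking that the angles add up to 360 is a quick way to catch an arithmetic slip before drawing the chart.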

The pie chart is as given below;

Grouping Data
One way of organizing data is by means of classification. Classification is the grouping of related
facts into different classes.
Facts in one class differ from those in another class with respect to some characteristic called
the basis of classification.
There are four bases of classification, namely:

i) Geographical
Data is classified on the basis of geographical or locational differences between the various items
e.g. production of maize may be classified according to counties or districts.

ii) Chronological
Refers to data that has been observed over a period of time e.g. a company may classify sales
figures according to years.

iii) Qualitative
Data is classified on the basis of some characteristic that is not measurable on a quantitative
scale or cannot be expressed numerically, e.g. sex, marital status, colour of hair etc.

iv) Quantitative
Refers to classification of data according to some characteristic that can be measured or
enumerated, e.g. height, weight, income, sales, age etc.

Grouped data (Numerical illustration)


The table below shows the frequency distribution for salaries of 80 employees at Mountain View
Academy. The salary is given in thousands shillings;

Salary ( Kshs ‘000) No. of Employees ( Frequency )

5-9 12

10 - 14 16

15 - 19 20

20 - 24 14

25 - 29 10

30 - 34 6

35 - 39 2

TOTAL 80

A symbol defining a class, such as 25-29 in the table above, is called a class interval. The end
numbers, i.e. 25 and 29, are referred to as the class limits.
Since the salaries are in thousands of shillings, a salary of Kshs 24,501 belongs to the class
25-29, as does a salary of Kshs 24,600 or Kshs 24,800; that is, salaries from Kshs 24,501 to
Kshs 29,500 belong to the class 25-29.

The dividing point between the classes 10-14 and 15-19 is 14.5. Dividing points such as 14.5,
19.5, 24.5 and 29.5 are called class boundaries, or true class limits, and can be obtained as
follows:

$$\frac{14 + 15}{2} = \frac{29}{2} = 14.5 \qquad \frac{29 + 30}{2} = \frac{59}{2} = 29.5$$

The lower class boundary of class 25-29 is 24.5 while the upper class boundary is 29.5. The class
size or width of any class interval is the difference between the upper and the lower class
boundaries of the class.
For the class 25-29 the upper and the lower class boundaries are;
Lower = 24.5 upper = 29.5
The difference between the upper and lower class boundaries is the class width; that is

29.5 − 24.5 = 5.0

The class mark or midpoint is the middle value of the class. For the 25-29 and 10-14 class
intervals, the midpoints are obtained as follows:

$$\frac{25 + 29}{2} = \frac{54}{2} = 27 \qquad \frac{10 + 14}{2} = \frac{24}{2} = 12$$

Data organized and summarized as in table above is referred to as grouped data.
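
The class boundaries, class width and class mark described above can be computed for every class in the salary table. The sketch below is an illustration under the assumption that class limits are whole numbers one unit apart, as in the table; the helper name `describe` is hypothetical.

```python
# Class limits (Kshs '000) from the salary table above.
classes = [(5, 9), (10, 14), (15, 19), (20, 24), (25, 29), (30, 34), (35, 39)]

# Gap between the upper limit of one class and the lower limit of the next (1 here).
gap = classes[1][0] - classes[0][1]

def describe(lower, upper):
    """Return (lower boundary, upper boundary, width, midpoint) for one class."""
    lower_boundary = lower - gap / 2
    upper_boundary = upper + gap / 2
    width = upper_boundary - lower_boundary
    midpoint = (lower + upper) / 2
    return lower_boundary, upper_boundary, width, midpoint

for lower, upper in classes:
    print(f"{lower}-{upper}:", describe(lower, upper))
```

For the class 25-29 this gives boundaries 24.5 and 29.5, width 5 and midpoint 27, matching the hand calculation.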

6.4 General Rules of Forming Frequency Distribution


Consider the example of marks obtained by 40 students in a Statistics examination as presented
in table below:

Marks             50  51  52  53  54  56  58  59  60  62  63  66  68  70  72  76

No. of students    1   3  12   1   3   6   1   1   2   1   2   1   2   2   1   1

1. Determine the largest and the smallest numbers in the raw data, in this case 76 and 50,
and compute the range as follows:
Range = largest value - smallest value = 76 - 50 = 26

2. Divide the range into a convenient number of class intervals having the same size. In
normal practice we use between 5 and 20 classes. With a class size of 5:

$$\text{Number of intervals} = \frac{26}{5} = 5.2 \approx 6 \quad \text{(rounded up)}$$

3. Determine the number of observations falling into each class interval and form a grouped
frequency distribution using a class size of 5.

Marks      Tally                  No. of students

50 - 54    //// //// //// ////    20

55 - 59    //// ///               8

60 - 64    ////                   5

65 - 69    ///                    3

70 - 74    ///                    3

75 - 79    /                      1
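
Steps 1 to 3 can be sketched in Python as follows. This is a minimal illustration, assuming the same 40 marks as before, a class size of 5 and a first class starting at 50; integer division assigns each mark to its class.

```python
import math
from collections import Counter

marks = [
    53, 52, 51, 56, 58, 60, 52, 54, 52, 63, 52, 56, 56,
    66, 70, 52, 52, 70, 68, 52, 56, 52, 51, 59, 62, 63, 76,
    52, 54, 56, 52, 50, 51, 56, 52, 60, 68, 72, 52, 54,
]

class_size = 5
start = min(marks)                      # lower limit of the first class (50)

# Step 1: range of the raw data.
data_range = max(marks) - min(marks)

# Step 2: number of classes, rounded up to a whole number.
n_classes = math.ceil(data_range / class_size)

# Step 3: integer division maps each mark to a class index
# (50-54 -> 0, 55-59 -> 1, and so on), and Counter counts them.
counts = Counter((mark - start) // class_size for mark in marks)

for index in range(n_classes):
    lower = start + index * class_size
    upper = lower + class_size - 1
    print(f"{lower} - {upper}: {counts[index]}")
```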

Having organized data in a frequency distribution as shown above then the next presentation of
data is by use of diagrams namely histograms and frequency polygons.

6.5 Histograms
A frequency histogram consists of a set of rectangles having:

a) Bases on a horizontal axis, with centres at the class marks and widths equal to the class
interval sizes.

b) Areas proportional to the class frequencies.

For example, the table below gives a distribution of masses of 64 students in a town college.
Present this by use of a histogram and a frequency polygon.

Masses 50-54 55-59 60-64 65-69 70-74

frequency 8 20 22 10 4

I; determine the mid-points of the masses as follows;

Masses frequency Mid-points

50-54 8 52

55-59 20 57

60-64 22 62

65-69 10 67

70-74 4 72

II: The resulting histogram and frequency polygon are as shown below;

[Figure: histogram with the frequency polygon superimposed. Vertical axis: Frequency;
horizontal axis: Mass (kg), with classes 50-54, 55-59, 60-64, 65-69 and 70-74.]

Amended frequency

The table below gives a frequency distribution for wages of 45 employees of KKV Company.
Present the information by use of a histogram and a frequency polygon

Wages (‘000) 15 - 19 20 - 29 30 - 34 35 - 39

frequency 5 18 8 4

Class 20-29 is of size 10 while the other classes have size 5. Since its size is bigger, we have
to amend the frequency for this class.

$$\text{Amended frequency} = \frac{\text{frequency of class to be amended} \times \text{least class size}}{\text{size of class to be amended}}$$

$$\text{Amended frequency} = \frac{18 \times 5}{10} = \frac{90}{10} = 9$$
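
The amendment can be applied to every class at once, which leaves classes of the least size unchanged. The Python sketch below assumes whole-number class limits, so that a class such as 20-29 has size 29 - 20 + 1 = 10; the names are illustrative only.

```python
# (lower limit, upper limit, frequency) for each wage class, from the table above.
wage_classes = [(15, 19, 5), (20, 29, 18), (30, 34, 8), (35, 39, 4)]

def class_size(lower, upper):
    # With whole-number limits one unit apart, the size is upper - lower + 1.
    return upper - lower + 1

least_size = min(class_size(lower, upper) for lower, upper, _ in wage_classes)

# Amended frequency = frequency x least class size / size of the class.
amended = {
    f"{lower}-{upper}": freq * least_size / class_size(lower, upper)
    for lower, upper, freq in wage_classes
}

for label, value in amended.items():
    print(label, value)
```

Only the 20-29 class changes (from 18 to 9), so bar areas stay proportional to the original frequencies.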

Wages frequency Mid-points

15 – 19 5 17

20 – 29 9 25

30 – 34 8 32

35 - 39 4 37

The resulting histogram and frequency polygon will thus be as shown below

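[Figure: histogram and frequency polygon drawn with the amended frequencies. Vertical axis:
Frequency; horizontal axis: Wages ('000), with classes 15-19, 20-29, 30-34 and 35-39.]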

6.6 Cumulative Frequency
The total frequency of all values less than the upper class boundary of a given class interval is
called the cumulative frequency up to and including that class interval.
From the previous example involving 45 employees of KKV Company, cumulative frequency
can be tabulated as shown below;

Wages (‘000) 15-19 20-29 30-34 35-39

Frequency 5 20 15 5

Cumulative frequency 5 25 40 45

The table presenting cumulative frequencies is called a cumulative frequency distribution, a
cumulative frequency table, or simply a cumulative frequency.

Ogive (Cumulative frequency polygon)


A graph showing the cumulative frequency less than an upper class boundary plotted against the
upper class boundary is called a cumulative frequency polygon or ogive.

Refer to the previous example of 64 students in a town college as shown below;

Masses 50-54 55-59 60-64 65-69 70-74

frequency 8 20 22 10 4

A cumulative frequency table can be obtained as below.

Masses    Frequency    Upper class boundary    Cumulative frequency

45-49     0            49.5                    0

50-54     8            54.5                    8

55-59     20           59.5                    28

60-64     22           64.5                    50

65-69     10           69.5                    60

70-74     4            74.5                    64
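
The cumulative frequencies, and the points to be plotted for the ogive, can be generated with a running total. This is a small Python sketch assuming the mass distribution above; `accumulate` comes from the standard library.

```python
from itertools import accumulate

# Classes and frequencies for the masses of the 64 students.
classes = [(50, 54), (55, 59), (60, 64), (65, 69), (70, 74)]
frequencies = [8, 20, 22, 10, 4]

# Upper class boundary of each class (limits are whole numbers, so add 0.5).
upper_boundaries = [upper + 0.5 for _, upper in classes]

# Running total of the frequencies.
cumulative = list(accumulate(frequencies))

# The ogive starts at the lower boundary of the first class with a
# cumulative frequency of 0, then passes through each (boundary, total) pair.
ogive_points = [(classes[0][0] - 0.5, 0)] + list(zip(upper_boundaries, cumulative))

for boundary, total in ogive_points:
    print(boundary, total)
```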

The resultant cumulative frequency polygon or ogive is as shown below;

Review Questions
1. The table below shows the areas in millions of square kilometers of the oceans of the world.
Graph the data using a bar chart.
Ocean                 Pacific   Atlantic   Indian   Antarctic   Arctic
Area (million km²)      183.4      106.7     73.8        19.7     12.4

2. The following table shows the numbers of agricultural and non-agricultural workers in ADC
Farm for the years 2000–2007.
Year                                   2000   2001   2002   2003   2004   2005   2006   2007
Agricultural workers (millions)         3.7    4.9    6.2    6.9    8.6    9.9   10.9   11.6
Non-agricultural workers (millions)     1.7    2.8    4.3    6.1    8.8   13.4   18.2   25.8
Graph the data using
i) Bar charts
ii) A composite bar chart.
3. The following table shows the birth and death rates per 1000 people in Copa land.
Graph the data using an appropriate type of graph.
Year 2002 2003 2004 2005 2006 2007
Birth rate per 1000 people 25.0 25.0 23.7 21.3 18.9 16.9
Death rate per 1000 people 13.2 13.2 13.0 11.7 11.3 10.9
4. Based on sales during a recent year, the following data represent the market shares (in
percent) held by the leading producers of soft drinks sold in mango Republic.
Soft Drink Producers Market Share (%)
Coca-Cola 39.6
Pepsi-Cola 29.4
7 – Up 6.0
D. Pepper 6.1
Royal Crown 4.5
Crush 1.4
Softa 0.9
All others 12.1
Present this information in a pie chart.

5. The data below shows the ages (in years) of 50 employees of a small company
26 57 41 38 19 20 37 58 33 37 24 29 40 30 23 27 27 25 48 32 28 43 62 27 54 42 23 35 18 31
49 34 46 47 52 36 28 36 19 29 40 44 42 37 21 31 39 34 32 39
Construct a frequency table with classes of size 5 starting from 15 years.
6. The monthly salaries of 87 employees of a supermarket were rounded to the nearest Kenya
pounds. They ranged from a low of Kenya pound 1,041 to a high of Kenya pounds 2,348.
(a) Suppose we want to condense the data into 7 classes, using the same interval for each class.
Determine a suggested class interval.
(b) What class interval would be easier to work with?
(c) What are the class limits for the first class? The next class?
7. The following table shows the diameters in millimeters of a sample of 60 ball bearings
manufactured by a company.
(a) Construct a frequency distribution of the diameters using appropriate class intervals.
7.38 7.28 7.45 7.33 7.35 7.32 7.31 7.39 7.32 7.25
7.29 7.37 7.36 7.30 7.32 7.37 7.36 7.34 7.35 7.26
7.43 7.36 7.42 7.32 7.35 7.31 7.33 7.27 7.44 7.36
7.40 7.35 7.40 7.30 7.27 7.46 7.39 7.36 7.40 7.35
7.36 7.24 7.28 7.39 7.34 7.35 7.41 7.30 7.41 7.29
7.33 7.38 7.34 7.32 7.35 7.34 7.37 7.35 7.42 7.38
(b) Construct
i. A histogram
ii. A frequency polygon
iii. A relative frequency histogram
iv. A relative frequency polygon
v. A cumulative frequency distribution

References for further reading


Prem S. M and Christopher J. L (2010). Introducing statistics (7th Edition), John Wiley and
Sons Inc. USA

Roger E. K. (2008). Statistics: An introduction, Thomson Wadworth, Belmont, USA

Saleemi N. A. (1997). Statistics simplified. Acme Press (Kenya) Ltd. Nairobi, Kenya

7 CHAPTER SEVEN: MEASURES OF CENTRAL TENDENCY

Learning Objectives
By the end of this chapter the learner should be able to:
1. Define the three measures of central tendency
2. Compute the mean, median and mode for ungrouped, frequency distributions and
grouped data
3. Determine the weighted and combined mean from given data.
4. Outline the merits and demerits of each of the three measures of central tendency

7.1 Introduction
For the purpose of statistical decision making, it is essential to extract from the data important
facts which summarize essential information. A summarizing number of this kind is called a
statistic if the information is from a sample.
Consider the following;

Let $X_1, X_2, X_3, \ldots, X_k$ be k observations. Then,

$X_1$ is the 1st observation

$X_2$ is the 2nd observation

⋮

$X_k$ is the kth observation

The sum of these k observations is obtained by adding the 1st, 2nd, 3rd etc. up to the last (kth)
observation. The sum is written as $\sum X_k$.

This is read as the summation of the observations $X_k$, with the index running from 1 to k:

$$\sum X_k = X_1 + X_2 + X_3 + \cdots + X_k$$

∑ is the Greek capital letter sigma, meaning summation.

A measure of central tendency is a single value within the data that is used to represent all the
values in the population.
There are three common measures of central tendency, namely; Mean, Mode and Median.

7.2 The Arithmetic Mean


Definition

The arithmetic mean, or simply the mean, of a set of N values $X_1, X_2, X_3, \ldots, X_N$ is
denoted by $\bar{X}$ (read “X bar”) and is defined as:

$$\text{Mean} = \bar{X} = \frac{\text{sum of the observations}}{\text{number of observations}} = \frac{X_1 + X_2 + X_3 + \cdots + X_N}{N}$$

7.2.1 Mean from Ungrouped data

For ungrouped data the arithmetic mean, $\bar{X}$, of a set of N observations is:

$$\text{Mean} = \bar{X} = \frac{X_1 + X_2 + X_3 + \cdots + X_N}{N} = \frac{\sum X}{N}$$
Examples:
Determine the mean of the following set of numbers
a) 40 45 70 65 68
b) 21 25 33 45 12 54 34

Solution;

a) $\bar{X} = \dfrac{40 + 45 + 70 + 65 + 68}{5} = \dfrac{288}{5} = 57.6$

b) $\bar{X} = \dfrac{21 + 25 + 33 + 45 + 12 + 54 + 34}{7} = \dfrac{224}{7} = 32$
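
The two solutions can be checked with a few lines of Python. This is only a sketch: the mean is computed directly from the definition (sum divided by count), and the standard library function `statistics.mean` is used as a cross-check.

```python
from statistics import mean

set_a = [40, 45, 70, 65, 68]
set_b = [21, 25, 33, 45, 12, 54, 34]

# Mean from the definition: sum of the observations / number of observations.
mean_a = sum(set_a) / len(set_a)
mean_b = sum(set_b) / len(set_b)

print("a)", sum(set_a), "/", len(set_a), "=", mean_a)
print("b)", sum(set_b), "/", len(set_b), "=", mean_b)

# statistics.mean applies the same definition.
print("library check:", mean(set_a), mean(set_b))
```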
7.2.2 Mean from Frequency Distribution

In the case of a frequency distribution, suppose the observations $x_1, x_2, \ldots, x_k$ occur
with frequencies $f_1, f_2, \ldots, f_k$ respectively. Then their mean is given by:

$$\text{Mean} = \bar{x} = \frac{\sum xf}{n} = \frac{\sum xf}{\sum f}$$
Example 1
The masses of 20 students were found as follows:

Mass ( x ) 58 59 60
No. of students ( f ) 6 10 4

Calculate the mean

x f xf
58 6 348
59 10 590
60 4 240

∑ 20 1178

$$\text{Mean} = \bar{x} = \frac{\sum xf}{n} = \frac{1178}{20} = 58.9$$

Example 2
Find the mean, x for the data in the table below;
x 455 560 490 516 534 552
f 9 10 11 8 20 5

Solution;
x f xf
455 9 4095
560 10 5600
490 11 5390
516 8 4128
534 20 10680
552 5 2760

∑ 63 32653

$$\text{Mean} = \bar{x} = \frac{\sum xf}{n} = \frac{32653}{63} = 518.3016$$
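
Both examples follow the same pattern, so they can be checked with one small function. The sketch below assumes the values and frequencies given in the two tables; the function name `frequency_mean` is chosen for illustration.

```python
def frequency_mean(values, frequencies):
    """Mean of a frequency distribution: sum(x * f) / sum(f)."""
    total_xf = sum(x * f for x, f in zip(values, frequencies))
    return total_xf / sum(frequencies)

# Example 1: masses of 20 students.
mean_1 = frequency_mean([58, 59, 60], [6, 10, 4])

# Example 2.
mean_2 = frequency_mean([455, 560, 490, 516, 534, 552], [9, 10, 11, 8, 20, 5])

print(round(mean_1, 4), round(mean_2, 4))
```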

7.2.3 Mean from Grouped Data


In case of a grouped frequency distribution, all the formulas defined above are used except that
the term ‘x’ will be the class mark or the mid-point.

Illustrations;
1. The table below shows the heights of 100 seedlings in millimeters.

Height in mm ( x ) 0–4 5–9 10 - 14 15 - 19


No. of seedlings ( f ) 40 48 10 2

Calculate the mean

Height in mm Mid-point x f xf
0–4 2 40 80
5–9 7 48 336
10 - 14 12 10 120
15 - 19 17 2 34

∑ 100 570

$$\text{Mean} = \bar{x} = \frac{\sum xf}{n} = \frac{570}{100} = 5.7$$

2. The frequency distribution for marks obtained by 45 pupils of Yala Pre-school is as
shown below;

Marks (x) 60-64 65-69 70-74 75-79 80-84

No. of pupils (f) 6 15 12 8 4

Determine the mean of the class;


Marks Mid-point x f xf
60-64 62 6 372
65-69 67 15 1005
70-74 72 12 864
75-79 77 8 616
80-84 82 4 328

∑ 45 3185

$$\text{Mean} = \bar{X} = \frac{\sum Xf}{N} = \frac{3185}{45} = 70.78$$

3. The temperature outside a factory was monitored at regular interval on 80 occasions. The
frequency distribution is as follows

Temperature, x (°C) Frequency, f


30.0 – 30.2 6
30.3 – 30.5 12
30.6 – 30.8 15
30.9 – 31.1 20
31.2 – 31.4 13
31.5 – 31.7 9
31.8 – 32.0 5

Calculate the Mean;

Solution
Temperature (°C)    Frequency, f    Mid-point, x    xf
30.0 – 30.2         6               30.1            180.6
30.3 – 30.5         12              30.4            364.8
30.6 – 30.8         15              30.7            460.5
30.9 – 31.1         20              31.0            620.0
31.2 – 31.4         13              31.3            406.9
31.5 – 31.7         9               31.6            284.4
31.8 – 32.0         5               31.9            159.5
∑                   80                              2476.7

$$\text{Mean} = \bar{x} = \frac{\sum xf}{n} = \frac{2476.7}{80} = 30.959$$
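
For grouped data the only extra step is replacing each class by its mid-point. The following Python sketch, offered as an illustration rather than a prescribed method, recomputes the three illustrations from their class limits and frequencies; the function name `grouped_mean` is hypothetical.

```python
def grouped_mean(classes, frequencies):
    """Estimate the mean of grouped data, using each class mid-point as x."""
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    total_xf = sum(x * f for x, f in zip(midpoints, frequencies))
    return total_xf / sum(frequencies)

# Illustration 1: heights of 100 seedlings (sum of xf is 570, n is 100).
seedlings = grouped_mean([(0, 4), (5, 9), (10, 14), (15, 19)], [40, 48, 10, 2])

# Illustration 2: marks of 45 pupils.
pupil_marks = grouped_mean(
    [(60, 64), (65, 69), (70, 74), (75, 79), (80, 84)], [6, 15, 12, 8, 4]
)

# Illustration 3: temperature readings on 80 occasions.
temperature = grouped_mean(
    [(30.0, 30.2), (30.3, 30.5), (30.6, 30.8), (30.9, 31.1),
     (31.2, 31.4), (31.5, 31.7), (31.8, 32.0)],
    [6, 12, 15, 20, 13, 9, 5],
)

print(round(seedlings, 2), round(pupil_marks, 2), round(temperature, 3))
```

Because the mid-point stands in for every value in its class, the result is an estimate of the mean, not the exact mean of the raw data.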
7.2.4 Weighted mean

Sometimes we associate the numbers $x_1, x_2, x_3, \ldots, x_k$ with certain weighting factors or
weights $w_1, w_2, w_3, \ldots, w_k$, depending on the significance or importance attached to the
numbers. In such a case,

$$\text{Mean} = \bar{X} = \frac{w_1 x_1 + w_2 x_2 + w_3 x_3 + \cdots + w_k x_k}{w_1 + w_2 + w_3 + \cdots + w_k}$$

is called the weighted mean.

Illustration
The table below shows the performance of a candidate in an end of semester examination at a
local University. The associated weight of each unit is also given.

Unit Math 213 Math 214 Math 216 Phy. 201 Phy. 204 Educ. 262 Educ. 266

Marks 72 64 82 75 80 88 90

Weight 1 0.8 0.6 0.5 0.7 0.4 0.4

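Calculate the weighted mean.

$$\text{Mean} = \bar{X} = \frac{w_1 x_1 + w_2 x_2 + w_3 x_3 + \cdots + w_k x_k}{w_1 + w_2 + w_3 + \cdots + w_k}$$

$$\bar{X} = \frac{1 \times 72 + 0.8 \times 64 + 0.6 \times 82 + 0.5 \times 75 + 0.7 \times 80 + 0.4 \times 88 + 0.4 \times 90}{1 + 0.8 + 0.6 + 0.5 + 0.7 + 0.4 + 0.4} = \frac{337.1}{4.4} = 76.614$$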
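
The weighted mean for the candidate can be reproduced with a short Python sketch, assuming the marks and weights in the table above; the function name is illustrative.

```python
# Marks and weights for the seven units, in the order given in the table.
marks = [72, 64, 82, 75, 80, 88, 90]
weights = [1, 0.8, 0.6, 0.5, 0.7, 0.4, 0.4]

def weighted_mean(values, weights):
    """Sum of (weight x value) divided by the sum of the weights."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)

result = weighted_mean(marks, weights)
print(round(result, 3))

# For comparison, the unweighted mean treats every unit as equally important.
print(round(sum(marks) / len(marks), 3))
```

The two figures differ because the units with the highest marks carry the smallest weights.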
7.2.5 Combined mean
Quite often we may wish to compute the mean for a large group from the means of small groups,
which make up the large group. For example in a school having several streams per class, if we
know the mean mark in a particular subject for each stream, we may wish to compute the mean
mark in that subject for all the students in the whole class.
The combined mean of two populations is given by

$$\bar{X}_C = \frac{N_A \bar{X}_A + N_B \bar{X}_B}{N_A + N_B}$$

Where
$N_A$ is the size of population A
$N_B$ is the size of population B
$\bar{X}_A$ is the mean of population A
$\bar{X}_B$ is the mean of population B
$\bar{X}_C$ is the combined mean of populations A and B
$N_A + N_B$ is the total size of the two populations A and B


Example 1
Three Standard I classes (A, B and C) at Heshima Primary had populations of 45, 49 and 56. In a
given test the classes scored means of 62, 66 and 45 respectively.

Calculate the combined mean of the entire Standard I class.

$$\text{Combined mean} = \bar{X} = \frac{N_A \bar{X}_A + N_B \bar{X}_B + N_C \bar{X}_C}{N_A + N_B + N_C}$$

$$\bar{X} = \frac{45 \times 62 + 49 \times 66 + 56 \times 45}{45 + 49 + 56} = \frac{8544}{150} = 56.96$$

Example 2
A class of 250 students was divided into groups A and B. The 2 groups were given the same test.
The mean of group A of 100 students was 15, but the mean of the whole group was 15.6.
What was the mean of group B?

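Solution

Group B has 250 − 100 = 150 students.

$$15.6 = \frac{(100 \times 15) + (150 \times \bar{X}_B)}{100 + 150}$$

$$(15.6 \times 250) - 1500 = 150\,\bar{X}_B$$

$$2400 = 150\,\bar{X}_B$$

Thus, $\bar{X}_B = 16$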

Example 3
A school has 4 streams in its Form One class, i.e. A, B, C and D, with the following
populations: A = 45, B = 40, C = 40 and D = 50. The average performance (mean score) of each
stream in a term test was A = 82, B = 72, C = 76 and D = 50.
Calculate the mean performance of the entire Form One class.
Solution;
Class    Mean, x̄    n      n·x̄
A        82         45     3690
B        72         40     2880
C        76         40     3040
D        50         50     2500
∑                   175    12110

$$\text{Combined mean} = \bar{X} = \frac{N_A \bar{X}_A + N_B \bar{X}_B + N_C \bar{X}_C + N_D \bar{X}_D}{N_A + N_B + N_C + N_D} = \frac{12110}{175} = 69.2$$

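NOTE: If $N_1$ numbers have mean $\bar{X}_1$, $N_2$ numbers have mean $\bar{X}_2$, and so on up to $N_K$
numbers with mean $\bar{X}_K$, then the overall mean of all the numbers is;

$$\bar{X} = \frac{N_1 \bar{X}_1 + N_2 \bar{X}_2 + \dots + N_K \bar{X}_K}{N_1 + N_2 + \dots + N_K} = \frac{\sum N \bar{X}}{\sum N}$$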
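
The combined-mean formula can be checked with a few lines of Python. The sketch below is only an illustration; the function name `combined_mean` is our own choice and is not part of the course text. It reproduces the figures of Example 1 and Example 3.

```python
def combined_mean(sizes, means):
    """Combined mean of several groups: sum(N * mean) / sum(N)."""
    if len(sizes) != len(means) or not sizes:
        raise ValueError("sizes and means must be non-empty and of equal length")
    weighted_total = sum(n * m for n, m in zip(sizes, means))
    return weighted_total / sum(sizes)


# Example 1: three classes of 45, 49 and 56 pupils
example_1 = combined_mean([45, 49, 56], [62, 66, 45])
# Example 3: four streams of 45, 40, 40 and 50 students
example_3 = combined_mean([45, 40, 40, 50], [82, 72, 76, 50])
print(round(example_1, 2), round(example_3, 2))  # 56.96 69.2
```

Note that the group means are weighted by the group sizes; simply averaging the means would give a different (and wrong) answer whenever the groups differ in size.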
Activity

The mean monthly salary paid to all employees in a company was Kshs 5,000. The monthly
salaries of male and female employees averaged Kshs 5,200 and Kshs 4,200 respectively.
Determine the percentage of males and females employed by the company.

7.2.6 Adjusting mean for a wrong entry


Illustration;

Mrs. Akelo, a primary school teacher, was hired by Mt. Kenya University to compute the mean
mark of 32 second year students in a Statistics Examination. She found the mean to be 72.24.
However, she later realized that she had erred by entering 90 and 88 marks instead of 40 and 38
respectively.

Calculate the correct mean mark;

$$\bar{x} = \frac{\sum x}{n}, \qquad n = 32$$

$$72.24 = \frac{\sum X}{32}$$

Wrong sum = 72.24 × 32 = 2311.68

Correct sum = ∑X = [72.24 × 32] − [90 + 88] + [40 + 38] = 2211.68

Thus; correct mean, $\bar{X} = \frac{2211.68}{32} = 69.115$
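
The same adjustment can be written as a small Python function. This is a minimal sketch; the name `corrected_mean` and its parameters are illustrative assumptions, not part of the course text. It recovers the wrong total from the reported mean, removes the wrong entries, adds the right ones and divides by n again.

```python
def corrected_mean(reported_mean, n, wrong_values, right_values):
    """Adjust a mean that was computed with some wrongly entered values."""
    wrong_sum = reported_mean * n
    correct_sum = wrong_sum - sum(wrong_values) + sum(right_values)
    return correct_sum / n


# Mrs. Akelo's illustration: 90 and 88 were entered instead of 40 and 38
mean_mark = corrected_mean(72.24, 32, [90, 88], [40, 38])
print(round(mean_mark, 3))
```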

Self Exercise
1. Compute the mean hourly wage paid to carpenters who earned the following wages
Kshs. 154.00, Kshs. 201.00, Kshs. 187.50, Kshs. 227.60, Kshs. 306.70, and Kshs. 180.00.

2. The Kaimba Power and Light Company selected 20 residential customers at random. The
following are the amounts in Kshs, the customers were charged for electricity last month
547.50, 482.30, 587.35, 507.25, 252.70,
475.80, 758.40, 464.60, 601.20, 706.85,
679.10, 682.30, 395.65, 355.40, 561.20,
659.90, 329.80, 628.15, 654.85, 673.35
Compute the mean.

3. The Toro Construction Company pays its hourly employees Kshs. 65, Kshs. 75 or Kshs. 85 per
hour. There are 26 hourly employees, 14 are paid at the Kshs. 65 rate, 10 at the Kshs. 75 rate,
and 2 at the Kshs. 85 rate. What is the mean hourly rate paid to the 26 employees?
4. The net monthly incomes of a sample of large importers of motor vehicles were organized
into the following table:
Net income in Kshs. Millions    Number of Importers
2 – 5                           1
6 – 9                           4
10 – 13                         10
14 – 17                         3
18 – 21                         2
(a) What is the table called?
(b) Based on the distribution, what is the estimate of the arithmetic mean net income?

7.3 The Median – Meaning and computation


The median is the midpoint of the values after they have been ordered from the smallest to the
largest, or the largest to the smallest. Fifty percent of the observations are above the median and
fifty percent below the median.

It is therefore the middle value of observations when arranged either in ascending or descending
order if the number of observations N is odd.
If N is even, then median is the arithmetic mean of the two middle values
For example the set
a) 4, 5, 7, 7, 9, 10, 11, 12, 15 has median 9

b) 6, 7, 8, 10, 11, 12 has median $= \frac{8 + 10}{2} = 9$
Now the median is the value below which half of the values lie and above which the other half of
the values lie.
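
The rule for odd and even N can be sketched in Python as follows. The helper `median_by_hand` is a name chosen here for illustration; Python's standard `statistics.median` applies the same rule and is used only as a cross-check.

```python
from statistics import median


def median_by_hand(values):
    """Middle value of the ordered data, or the mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:          # N is odd: take the single middle value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # N is even


odd_set = [4, 5, 7, 7, 9, 10, 11, 12, 15]
even_set = [6, 7, 8, 10, 11, 12]
print(median_by_hand(odd_set), median_by_hand(even_set))  # 9 9.0
print(median(odd_set), median(even_set))                  # 9 9.0
```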
For grouped data, where the raw data have been organized into a frequency distribution, the
median is obtained from the formula;

$$\text{Median} = L_1 + \left( \frac{\frac{N}{2} - (\sum f)_m}{f_m} \right) c$$

Where
$N$ is the total frequency
$L_1$ is the lower class boundary of the median class
$c$ is the size of the median class interval
$f_m$ is the frequency of the median class
$(\sum f)_m$ is the sum of the frequencies of all classes lower than the median class

Illustration 1

The table below gives a frequency distribution for salaries of 50 workers at K & K Company.

Salary (000’ Shillings) 10-19 20-29 30-39 40-49 50-59

frequency 4 12 20 9 5

Cumulative frequency 4 16 36 45 50

Compute the median salary.

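$$\text{Median} = L_1 + \left( \frac{\frac{N}{2} - (\sum f)_m}{f_m} \right) c$$

The median class is 30 - 39, because the cumulative frequency first reaches N/2 = 25 in this class.

$N = 50$; $L_1 = 29.5$; $c = 10$; $f_m = 20$; $(\sum f)_m = 16$

$$\text{Median} = 29.5 + \left( \frac{\frac{50}{2} - 16}{20} \right) \times 10 = 34$$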

Illustration 2
The temperature outside a factory was monitored at regular intervals on 80 occasions. The
frequency distribution is as follows;
Temperature, x (°C)   30.0-30.2  30.3-30.5  30.6-30.8  30.9-31.1  31.2-31.4  31.5-31.7  31.8-32.0
Frequency, f          6          12         15         20         13         9          5

State the median class: 30.9 – 31.1 (the class containing the N/2 = 40th observation, which is
not necessarily the class in the middle of the table)
Solution;

Temperature, x (°C) Frequency, f Cumulative Frequency

30.0 – 30.2 6 6

30.3 – 30.5 12 18

30.6 – 30.8 15 33

30.9 – 31.1 20 53

31.2 – 31.4 13 66

31.5 – 31.7 9 75

31.8 – 32.0 5 80

80

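$$\text{Median} = L_1 + \left( \frac{\frac{N}{2} - (\sum f)_m}{f_m} \right) c$$

Median class is 30.9 – 31.1; N = 80; c = 0.3 (that is, 31.15 – 30.85);

$f_m = 20$; $L_1 = 30.85$; $(\sum f)_m = 33$

$$\text{Median} = 30.85 + \left( \frac{\frac{80}{2} - 33}{20} \right) \times 0.3$$

= 30.85 + 0.105
= 30.955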
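
The grouped-median formula can be sketched in Python as shown below. The function name `grouped_median` and the way the classes are passed in (a list of lower class boundaries, a list of frequencies and a common class width) are assumptions made for this illustration only. The median class is taken as the first class whose cumulative frequency reaches N/2.

```python
def grouped_median(lower_boundaries, frequencies, class_width):
    """Median = L1 + ((N/2 - cumulative frequency below) / f_m) * c."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("total frequency must be positive")
    half = n / 2
    cumulative = 0              # sum of frequencies below the current class
    for lower, f in zip(lower_boundaries, frequencies):
        if cumulative + f >= half:      # this is the median class
            return lower + (half - cumulative) / f * class_width
        cumulative += f


# Illustration 1: salaries of 50 workers (thousands of shillings)
salary_median = grouped_median([9.5, 19.5, 29.5, 39.5, 49.5],
                               [4, 12, 20, 9, 5], 10)
# Illustration 2: temperatures on 80 occasions
temp_median = grouped_median([29.95, 30.25, 30.55, 30.85, 31.15, 31.45, 31.75],
                             [6, 12, 15, 20, 13, 9, 5], 0.3)
print(round(salary_median, 3), round(temp_median, 3))  # 34.0 30.955
```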

Self Exercise
1. The following is the percentage change in net income from 1997 to 1998 for a sample of 12
construction companies in Kenya 5, 1, -10, -6, 5, 12, 7, 8, 2, 5, -1, 11. Determine the median
change.

2. A sample of daily production of meat at KMC was organized into the following distribution.
Estimate the median daily production
Daily Production Frequency
80 – 89 5
90 – 99 9
100 - 109 20
110 -119 8
120 - 129 6
130 - 139 2

7.4 The Mode – meaning and computation


The mode of a set of numbers is that value which occurs with the greatest frequency, i.e. the
most common value. The mode may not exist, and even if it does it may not be unique.
For example the set
a) 2, 2, 5, 9, 8, 9, 11, 9, 10 has one mode 9 and is called unimodal

b) 10, 12, 13.2, 13.2, 14, 16, 16, 17, 19 has two modes 13.2 and 16 and is called
bimodal.
c) 4, 8, 5, 8, 9, 5, 6, 2, 9, 3, 2, 7 has four modes 2, 5, 8 and 9 (each occurs twice) and is called
multimodal; a set with exactly three modes is called trimodal
d) 3, 5, 6, 7, 8, 9 has no mode
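
Modes of ungrouped data can be found by counting how often each value occurs. The sketch below uses `multimode` from Python's standard `statistics` module (available from Python 3.8). The wrapper `modes` is a name chosen here, and its convention of reporting "no mode" as an empty list when every value occurs equally often is an assumption of this sketch.

```python
from statistics import multimode


def modes(values):
    """Values with the highest frequency; [] when all values are equally frequent."""
    found = multimode(values)
    if len(found) == len(set(values)):   # every distinct value ties: no mode
        return []
    return sorted(found)


print(modes([2, 2, 5, 9, 8, 9, 11, 9, 10]))              # [9]
print(modes([10, 12, 13.2, 13.2, 14, 16, 16, 17, 19]))   # [13.2, 16]
print(modes([3, 5, 6, 7, 8, 9]))                         # []
```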
In the case of grouped frequency data the mode is obtained using the formula

$$\text{Mode} = L_1 + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) c$$

Where;

$L_1$ is the lower class boundary of the modal class

$\Delta_1$ is the excess (difference) of the modal class frequency over the frequency of the next lower class

$\Delta_2$ is the excess of the modal class frequency over the frequency of the next higher class

$c$ is the size of the modal class interval

Example
The temperature outside a factory was monitored at regular intervals on 80 occasions. The
frequency distribution is as follows

Temperature, x (°C) Frequency, f Cumulative Frequency


30.0 – 30.2 6 6
30.3 – 30.5 12 18
30.6 – 30.8 15 33
30.9 – 31.1 20 53
31.2 – 31.4 13 66
31.5 – 31.7 9 75
31.8 – 32.0 5 80
Calculate the mode
The modal class is 30.9 – 31.1 (with the highest frequency)

$c = 0.3$; $L_1 = 30.85$; $\Delta_1 = 5$ (20 − 15); $\Delta_2 = 7$ (20 − 13)

$$\text{Mode} = L_1 + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) c$$

$$= 30.85 + \frac{5}{12} \times 0.3 = 30.85 + 0.125 = 30.975$$

Note: Mode value should lie in the modal class.
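
The grouped-mode formula can be sketched in Python in the same style. The function name `grouped_mode` is an illustrative choice. The sketch assumes a single modal class, and it treats a missing neighbouring class (when the modal class is the first or last one) as having frequency 0, which is an assumption not stated in the text.

```python
def grouped_mode(lower_boundaries, frequencies, class_width):
    """Mode = L1 + (d1 / (d1 + d2)) * c for the class with the highest frequency."""
    i = max(range(len(frequencies)), key=lambda k: frequencies[k])
    f_modal = frequencies[i]
    f_lower = frequencies[i - 1] if i > 0 else 0                      # assumed 0 if absent
    f_higher = frequencies[i + 1] if i < len(frequencies) - 1 else 0  # assumed 0 if absent
    d1 = f_modal - f_lower
    d2 = f_modal - f_higher
    if d1 + d2 == 0:
        raise ValueError("no distinct modal class")
    return lower_boundaries[i] + d1 / (d1 + d2) * class_width


temp_mode = grouped_mode([29.95, 30.25, 30.55, 30.85, 31.15, 31.45, 31.75],
                         [6, 12, 15, 20, 13, 9, 5], 0.3)
print(round(temp_mode, 3))  # 30.975
```

The result, 30.975, lies inside the modal class 30.85 to 31.15, as the note above requires.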

Self exercise
1. The net monthly salary of lecturers in 15 selected universities in a certain country is as shown
below in Kshs.
35,000, 49,100, 60,000, 60,000, 40,000, 58,000, 60,000, 60,000, 40,000, 65,000, 50,000, 60,000,
71,400, 60,000, 55,000.
What is the modal monthly salary?
2. The number of work stoppages in the automobile industry for selected months is 6, 0, 10, 14, 8
and 0. What is the modal number of work stoppages?

3. Listed below are the total automobile sales in (thousands) in Kenya for the last 14 years. What
are the modes during this period; 9.0, 8.5, 9.1, 10.3, 11.0, 11.5, 10.3, 10.5, 9.8, 9.3, 8.2, 8.2, and
8.5?
4. An automatic machine that fills containers appears to be performing erratically. A check of
weights of the contents of a number of cans revealed
Weight (grams) Number of cans
130 - < 140 2
140 - < 150 5
150 - < 160 20
160 - < 170 15
170 - < 180 9
180 - < 190 7
190 - < 200 3
200 - < 210 2
Obtain an estimate of the modal weight.

7.5 Merits and demerits of the Measures of Central Tendency


Merits of arithmetic mean
1. It is easy to calculate and understand
2. It is based on all observations
3. Of the three averages, mean is least affected by fluctuations of sampling
4. Suitable for further mathematical treatment
5. It is unique i.e. every distribution has only one mean

Demerits
1. It is affected by extreme observations
E.g. the salaries of 3 employees are as follows; 45,000, 60,000 and 940,000. The mean salary is

$$\text{Mean salary} = \frac{45{,}000 + 60{,}000 + 940{,}000}{3} = 348{,}333$$

which is far above what two of the three employees actually earn.
2. Mean cannot be used in case of open ended classes because of difficulties in determining
class midpoints
3. Mean cannot be used when dealing with qualitative characteristics e.g. intelligence

Merits of median
1. It is easy to identify and calculate.
2. It is not affected by extreme values
3. It is suitable for distribution with open ended classes
4. Can be used for qualitative characteristics

Demerits
1. Sometimes the median is not exact, e.g. when the number of observations is even
2. It is not based on each and every item in the distribution
3. It is affected by fluctuations in sampling

Merits of mode
1. Not affected by extreme observations unlike the arithmetic mean
2. Can be conveniently obtained in case of open ended class
3. It is easy to understand and calculate

Demerits
1. It is not always unique, depending on the distribution, e.g. a distribution may be bimodal or
trimodal.
2. It is not based on all the observations
3. Compared with the median, it is more affected by fluctuations in sampling

Clearly then, the choice of the average in any given case must be determined by the nature of the
data and the purpose to be served by the average. Remember that a single average value is
designed to replace the detail, yet at the same time to provide an outline of that detail. The
selection of the average will thus depend on which measure fulfills this requirement most
adequately. Since the three averages (mean, mode, median) comprise rather different concepts,
the data may be such as to warrant the use of all three. In practice, the mean is a firm favorite
in so far as it is so readily computed and understood; generally speaking it should be used instead
of the others. But either the median or even the mode will be preferable if the assumption that
class midpoints represent their classes, on which the calculation of the mean for grouped data
rests, is unjustified, or if the mean is seriously affected by extreme values.

Review Questions
1. The reaction time of an individual to a certain stimulus was measured by a psychologist to be
0.53, 0.46, 0.50, 0.49, 0.53, 0.44 and 0.55 seconds respectively.
Obtain the mean reaction time of the individual to the stimulus.
2. The mean of 80 observations is found to be 38.80. At the time of computation of the mean, two
observations are wrongly taken as 17 and 69 instead of 71 and 96 respectively. Compute the
correct mean

3. The mean annual salary paid to all employees in a company was 1500 Kenya pounds. The
mean annual salaries paid to male and female employees were 1560 Kenya pounds and 1260
Kenya pounds respectively. Determine the percentages of males and females employed by the
company.
4. Find the mean $\bar{x}$ for the data in the table below;
x 462 480 498 516 534 552 570 588 606 624
f 98 75 56 42 30 21 15 11 6 2
5. Find the mean, median and mode for the set of numbers
(a) 7, 4, 10, 9, 15, 12, 7, 9, 7
(b) 8, 11, 4, 3, 2, 5, 4, 10, 6, 1, 10, 8, 12, 6, 5, 7.

6. The table below shows the distribution of the maximum loads in Kilonewtons supported by
certain cables produced by a company.
Maximum load KN Number of cables
93 – 97 2
98 – 102 5
103 – 107 12
108 – 112 17
113 – 117 14
118 – 122 6
123 – 127 3
128 - 132 1
Determine
(a) The mean maximum load (b) The median maximum load (c) The mode maximum load

References for further reading


Harper W. M (1991). Statistics (6th Ed.) Harlow England: Longman Group UK Ltd.

Roger E. K. (2008). Statistics: An introduction, Thomson Wadsworth, Belmont, USA

Saleemi N. A (1997). Statistics simplified. Acme Press (Kenya) Ltd. Nairobi, Kenya

8 CHAPTER EIGHT: MEASURES OF DISPERSION

Learning Objectives
By the end of this chapter the learner should be able to:

1. Define dispersion and give the properties of a good measure of dispersion

2. Describe the significance of measures of dispersion

3. Define Range, Standard Deviation, Variance and Coefficient of Variation

4. Determine the range, standard deviation, variance and Coefficient of Variation from
given data.

Various measures of central tendency, i.e. mean, mode and median, give us one single value that
represents the entire data. But an average alone cannot adequately describe a set of observations
unless all the observations are alike.

It is therefore necessary to describe the variation or dispersion of the observations in two or
more distributions. The central tendency may be the same but still there can be wide disparities
in the formation of the distributions.
Measures of dispersion help us in studying this important characteristic of a distribution.

8.1 Definition of Dispersion


Dispersion is the extent to which items are scattered around a measure of central tendency, that
is, the variation in the data. There are two types of measures of dispersion:
1. Absolute measures of dispersion

These state the actual amount by which the value of an item deviates from the measure of central
tendency. They are expressed in the same units as the observations.
2. Relative measures of dispersion

A relative measure is the quotient obtained by dividing an absolute measure by the quantity with
respect to which the absolute deviation has been computed. It is expressed in percentage form
and is normally used for comparison purposes.

8.2 Properties of a good measure of dispersion


a) It should be based on all observations
b) It should be readily comprehensible
c) It should be fairly easily calculated
d) Should be affected as little as possible by fluctuations of sampling
e) Should not be unduly affected by extreme items
f) Should be amenable to further mathematical calculations

8.3 Significance of measures of dispersion


a) To determine the reliability of an average
b) To serve as a basis for the control of the variability
c) To compare two or more series with regard to their variability
d) To facilitate the use of other statistical measures
Measures of dispersion include:
i) Range
ii) Quartile deviation (semi-interquartile range)
iii) Mean deviation or average deviation
iv) Standard deviation
v) variance
vi) Co-efficient of variation

For the purpose of this course, only Range, Standard deviation, Variance and Co-efficient of
variation will be covered.

8.4 Range: Meaning and computation


Range is the difference between highest and lowest values in a set of observations;

That is, $\text{Range} = X_{\max} - X_{\min}$

Where;

$X_{\max}$ is the highest observation

$X_{\min}$ is the lowest observation

The range is the simplest and also the least reliable measure of dispersion. This is due to the fact
that it is based only on two extreme values. It thus does not take into account all the observations
and does not say anything about the distribution of values in relation to the central tendency. In
addition, it cannot be computed with open-ended classes.
Nevertheless, range is simple to calculate and understand.

Example
Find the range of the following data set: 7, 8, 9, 8, 9, 11, 15, 14, 16, 6, 15, 14.
We find that the lowest value is 6 and the highest value is 16. Thus the range is
16 – 6 = 10
The major characteristics of the range are:
i) Only two values are used in its calculation.
ii) It is influenced by extreme values
iii) It is easy to compute and to understand.
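
A short Python sketch makes characteristics (i) and (ii) concrete. The function name `value_range` is our own, and the extra observation 60 added in the last line is a hypothetical value used only to show how a single extreme item changes the range.

```python
def value_range(values):
    """Range = highest observation - lowest observation."""
    return max(values) - min(values)


data = [7, 8, 9, 8, 9, 11, 15, 14, 16, 6, 15, 14]
print(value_range(data))          # 10
print(value_range(data + [60]))   # 54, one extreme value changes the range
```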

8.5 Variance as a measure of dispersion


Variance is the arithmetic mean of the squares of the deviations of all observations from the mean.
It is denoted by $s^2$.

For ungrouped data, variance is given by the following formulae;

$$s^2 = \frac{\sum (X_i - \bar{X})^2}{n} \quad \text{or} \quad s^2 = \frac{\sum X^2}{n} - (\bar{X})^2$$

In the case of grouped data or a frequency distribution, the following formulae are used to
determine variance, where $n = \sum f$ is the total frequency.

$$s^2 = \frac{\sum f (X_i - \bar{X})^2}{n} \quad \text{or} \quad s^2 = \frac{\sum f X^2}{n} - (\bar{X})^2$$

8.6 Standard Deviation


It is the most commonly used measure of dispersion.
It is the positive square root of the arithmetic mean of the squares of the deviations of the given
values from their mean. It is also known as the root mean square deviation. When all the numbers
in a sample are close to each other, the standard deviation is close to 0, but if they are dispersed
it tends to be large. A small standard deviation shows a high degree of uniformity and
homogeneity, and vice versa.

For ungrouped data, standard deviation is given by the following formulae;

$$s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n}} \quad \text{or} \quad s = \sqrt{\frac{\sum X^2}{n} - (\bar{X})^2}$$

In the case of grouped data or a frequency distribution, the following formulae are used to
determine standard deviation.

$$s = \sqrt{\frac{\sum f (X_i - \bar{X})^2}{n}} \quad \text{or} \quad s = \sqrt{\frac{\sum f X^2}{n} - (\bar{X})^2}$$

Example 1

Calculate the variance and standard deviation of; 30, 32, 38, 26, 40

Solution

n = 5

$$\bar{X} = \frac{30 + 32 + 38 + 26 + 40}{5} = 33.2$$

Using the first equation

$X$      $X - \bar{X}$    $(X - \bar{X})^2$    $X^2$
30       -3.2             10.24                900
32       -1.2             1.44                 1024
38       4.8              23.04                1444
26       -7.2             51.84                676
40       6.8              46.24                1600
Total                     132.8                5644

$$\text{Variance} = s^2 = \frac{\sum (X_i - \bar{X})^2}{n}$$

Thus; $s^2 = \frac{132.8}{5} = 26.56$

Standard deviation $= s = \sqrt{\text{Variance}} = \sqrt{26.56} = 5.154$

Using the second equation

$$\text{Variance} = s^2 = \frac{\sum X^2}{n} - (\bar{X})^2$$

$$s^2 = \frac{5644}{5} - (33.2)^2 = 26.56$$

Standard deviation $= s = \sqrt{\text{Variance}} = \sqrt{26.56} = 5.154$
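
That the two equations always agree can be checked numerically. The Python sketch below is an illustration only; the function names `variance_definition` and `variance_shortcut` are our own labels for the first and second equations, both dividing by n as in the formulae above.

```python
from math import sqrt


def variance_definition(values):
    """First equation: mean of the squared deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def variance_shortcut(values):
    """Second equation: mean of the squares minus the square of the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(x * x for x in values) / n - mean ** 2


data = [30, 32, 38, 26, 40]
v1, v2 = variance_definition(data), variance_shortcut(data)
print(round(v1, 2), round(v2, 2), round(sqrt(v1), 3))  # 26.56 26.56 5.154
```

The second equation is usually quicker by hand because it avoids computing each deviation, but it is more sensitive to rounding the mean too early.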

For grouped data X is the class mark or the mid-point of the class.

Example 2

Calculate the variance and standard deviation in the following distribution;

Marks x 50 - 54 55 - 59 60 - 64 65 - 69
Frequency f 4 10 15 8

Solution;

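Marks      Mid-point, x    f     fx      $x^2$    $fx^2$
50 - 54    52              4     208     2704     10816
55 - 59    57              10    570     3249     32490
60 - 64    62              15    930     3844     57660
65 - 69    67              8     536     4489     35912
Total                      37    2244             136878

$$\text{Mean} = \bar{X} = \frac{\sum fx}{n} = \frac{2244}{37} = 60.65$$

$$\text{Variance} = s^2 = \frac{\sum f X^2}{n} - (\bar{X})^2$$

$$s^2 = \frac{136878}{37} - (60.6486)^2$$

$$s^2 = 3699.41 - 3678.26 = 21.15$$

Here the unrounded mean 60.6486 is squared; squaring the rounded value 60.65 would give
3678.42 and a noticeably different variance.

Standard deviation, $s = \sqrt{\text{Variance}} = \sqrt{21.15} = 4.5989$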
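
For grouped data the same calculation can be sketched in Python using the class mid-points. The function name `grouped_variance` is an illustrative choice; it implements the second grouped formula with n equal to the total frequency. Because the code never rounds the mean, its standard deviation can differ from the hand calculation in the last decimal places.

```python
from math import sqrt


def grouped_variance(midpoints, frequencies):
    """s^2 = sum(f * x^2) / n - mean^2, with n = sum(f) and x the class mid-point."""
    n = sum(frequencies)
    mean = sum(f * x for x, f in zip(midpoints, frequencies)) / n
    mean_of_squares = sum(f * x * x for x, f in zip(midpoints, frequencies)) / n
    return mean_of_squares - mean ** 2


# Example 2: marks 50-54, 55-59, 60-64, 65-69
variance = grouped_variance([52, 57, 62, 67], [4, 10, 15, 8])
print(round(variance, 2), round(sqrt(variance), 2))  # 21.15 4.6
```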

Example 3

For the data below, compute the range and standard deviation

Observations ( x ) 0 1 2 3 4 5 6 7 8
Frequency, f 1 9 26 59 72 52 29 7 1
Cumulative Frequency 1 10 36 95 167 219 248 255 256

Solution

a) Range;
Extreme values are 0 and 8 Range = 8 – 0 = 8
b) Standard deviation

$x$    $f$    $f_i x_i$    $x_i - \bar{x}$    $f_i |x_i - \bar{x}|$    $f_i (x_i - \bar{x})^2$
0      1      0            -3.973             3.973                    15.785
1      9      9            -2.973             26.757                   79.551
2      26     52           -1.973             51.298                   101.218
3      59     177          -0.973             57.407                   55.873
4      72     288          0.027              1.944                    0.050
5      52     260          1.027              53.404                   54.86
6      29     174          2.027              58.783                   119.161
7      7      49           3.027              21.189                   64.141
8      1      8            4.027              4.027                    16.217
Total  256    1017                            278.782                  506.856

Where $\bar{x} = \frac{\sum fx}{n} = \frac{1017}{256} = 3.973$

$$\text{Standard deviation}, s = \sqrt{\frac{\sum f (X_i - \bar{X})^2}{n}} = \sqrt{\frac{506.856}{256}} = 1.40709 \approx 1.407$$

Variance $= s^2 = (1.40709)^2 = 1.9799023 \approx 1.98$

Example 4

We have two sets of data with the same mean as shown below. Let us calculate the standard
deviation of the two sets and see how measures of dispersion help us to check the dispersion or
spread of data.

a). The set 26, 27, 28, 29, 30 has mean = 28 and n = 5. Here the data are treated as a sample, so
the sample standard deviation formula, which divides by n − 1 rather than n, is used:

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

We calculate the value of $\sum (x_i - \bar{x})^2$, where $\bar{x} = 28$, as

$$(26 - 28)^2 + (27 - 28)^2 + (28 - 28)^2 + (29 - 28)^2 + (30 - 28)^2$$

$$= 4 + 1 + 0 + 1 + 4 = 10$$

Thus, $s = \sqrt{\frac{10}{4}} = \sqrt{2.5} = 1.581$

b). The set 5, 19, 23, 33, 60 also has mean = 28, with n = 5.
We calculate the value of $\sum (x_i - \bar{x})^2$, where $\bar{x} = 28$, as

$$(5 - 28)^2 + (19 - 28)^2 + (23 - 28)^2 + (33 - 28)^2 + (60 - 28)^2$$

$$= 529 + 81 + 25 + 25 + 1024 = 1684$$

Thus, $s = \sqrt{\frac{1684}{4}} = \sqrt{421} = 20.518$

What can we conclude from Example 4 above?

Even though the two sets of data have the same mean value, the 1st has a much smaller standard
deviation than the 2nd because the values in the 2nd set are more spread out. The data in the
second set lie much further from the measure of central tendency than the data in the first set.

The standard deviation measures how data are spread or dispersed about the mean value. In
calculating the standard deviation, the mean is subtracted from each value of x and the
difference is squared, that is, $(x_i - \bar{x})^2$.
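
Example 4 divides by n − 1, whereas the formulae in section 8.6 divide by n. Python's standard `statistics` module provides both versions, so the difference can be seen directly: `stdev` divides by n − 1 (sample standard deviation) and `pstdev` divides by n (population standard deviation). The sketch below simply applies both to the two sets of Example 4.

```python
from statistics import mean, pstdev, stdev

set_a = [26, 27, 28, 29, 30]
set_b = [5, 19, 23, 33, 60]

for name, data in (("a", set_a), ("b", set_b)):
    # stdev: divisor n - 1 (as in Example 4); pstdev: divisor n (as in section 8.6)
    print(name, mean(data), round(stdev(data), 3), round(pstdev(data), 3))
# a 28 1.581 1.414
# b 28 20.518 18.352
```

Whichever divisor is used, the second set has by far the larger standard deviation, so the conclusion of Example 4 does not depend on the choice.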

8.7 Relative Dispersion (Coefficient of Variation)
o The mean is a measure of central tendency or “centered-ness”
o Standard deviation is a measure of dispersion or “spread-ness”

Suppose we want to know how spread out the data is relative to the mean. Then a quantity,
Coefficient of Variation (CV) is used. It is defined as;

$$C.V = \frac{\text{Standard deviation}}{\text{mean}} \times 100\%$$
o A large C.V means that data is relatively spread out from the mean while a small C.V means
that data is relatively concentrated closely around the mean
o If C.V = 0, all the data values are the same and are exactly at the mean.

o The C.V indicates the relative magnitude of the standard deviation as compared with the mean
of the distribution measurements

Example; you are supplied with the following data about the performance of two schools
(Bahati and Tumaini) in the 2012 KCSE examination.

                      Bahati    Tumaini
Number of students    120       92
Mean score            9.82      8.40
Variance of scores    81        36

In which school is there greater variability in performance?

Solution

$$C.V = \frac{\text{Standard deviation}}{\text{mean}} \times 100\%$$

Bahati: standard deviation $= \sqrt{81} = 9$, thus $C.V = \frac{9}{9.82} \times 100\% = 91.65\%$

Tumaini: standard deviation $= \sqrt{36} = 6$, thus $C.V = \frac{6}{8.40} \times 100\% = 71.43\%$

There is thus greater variability in performance in Bahati than in Tumaini. In other words, the
marks obtained at Bahati are less homogeneous or less consistent.
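
The comparison can be reproduced with a short Python sketch. The function name `coefficient_of_variation` is our own; it takes the variance and the mean, as supplied in the example, and returns the C.V as a percentage.

```python
from math import sqrt


def coefficient_of_variation(variance, mean):
    """C.V = (standard deviation / mean) x 100%."""
    return sqrt(variance) / mean * 100


bahati = coefficient_of_variation(81, 9.82)
tumaini = coefficient_of_variation(36, 8.40)
print(round(bahati, 2), round(tumaini, 2))  # 91.65 71.43
```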

8.8 Importance of variance and standard deviation

Standard deviation is used for drawing statistical inferences or conclusions, e.g. the percentage
of observations lying within a certain range of the mean. Its merits are that it is based on all
observations and that it is less affected by fluctuations of sampling than other measures of
dispersion. The standard deviation, however, gives more weight to extreme values because the
deviations are squared, and it is more complicated to calculate than the other measures.

Self-Exercise

1. Compute the variance and the standard deviation for the data
(a) 11, 6, 10, 6 and 7
(b) 28, 32, 24, 46, 44, 40, 54, 38, 32 and 42.
2. A sample of eight companies in the aerospace industry had the following returns on
investment last year; 10.6, 12.6, 14.8, 18.2, 14.8, 12.2 and 15.6. Compute the mean return and
the standard deviation of the returns.

3. Trout, Inc. feeds fingerling trout in special ponds and markets them when they attain a certain
weight. Samples of 10 trout were isolated in a pond and fed a special food mixture, designated
RT – 10. At the end of the experimental period, the weights of the trout were (in grams):
124, 125, 125, 123, 120, 124, 127, 125, 126 and 121
Determine the sample variance and standard deviation

Review Questions
1. Give definitions of the following terms used in statistics
a). Range; Standard deviation;
2. For 108 randomly selected Nairobi residents, the weight frequency distribution is

Weight (Kg) Frequency (f)

40 – 48 6

49 – 57 22

58 – 66 43

67 – 75 28

76 – 84 9

a). Find the Standard deviation and variance


3. Calculate the mean and the standard deviation of the following distribution

x f

2.5 - < 12.5 40

12.5 - < 22.5 186

22.5 - < 32.5 373

32.5 - < 42.5 296

42.5 - < 52.5 93

52.5 - < 62.5 12

4. The mean of 5 observations is 4.4 and the variance is 8.24. If three of the five observations are
1, 2 and 6, find the other two.
5. (a) The arithmetic mean and variance of a set of ten figures are known to be 17 and 33
respectively. Of the ten figures, one figure (i.e. 26) was subsequently found to be inaccurate, and
was weeded out. What is the resulting
i. Arithmetic mean?
ii. Standard deviation?

6. A distribution consists of three components with frequencies 200, 250 and 300 having means
25, 10, and 15 and standard deviations 3, 4, and 5 respectively.
Calculate
(i) The mean
(ii) The standard deviation.
7. The distribution of life- times of two models of refrigerators in a recent survey are given
below:
Life time Number of Refrigerators
No. of years Model A Model B
0-<2 5 2
2-<4 16 7
4-<6 13 12
6-<8 7 19
8 - < 10 5 9
10 - < 12 4 1
(a) What is the average lifetime of each model of these refrigerators?
(b) Which model has greater uniformity?

References for further reading


Roger E. K. (2008). Statistics: An introduction, Thomson Wadsworth, Belmont, USA

Saleemi N. A. (1997). Statistics simplified. Acme Press (Kenya) Ltd. Nairobi, Kenya

Yamane T. (1967). Statistics: An introductory analysis (2nd Edition). Harper and Row
Publishers, New York

9 CHAPTER NINE: THE CONCEPT OF THE NORMAL CURVE

Learning Objectives
By the end of this chapter the learner should be able to:

1. Describe the meaning of a Symmetrical distribution or a normal curve

2. Define skewness and distinguish between Skewed to the right or positively skewed and
Skewed to the left or negatively skewed

3. Explain the meaning of Kurtosis

4. Distinguish between leptokurtic, platykurtic and mesokurtic distributions

5. Distinguish between dispersion and skewness

9.1 Introduction
The measures of central tendency and dispersion discussed in Chapters Seven and Eight
respectively do not reveal the entire story about a frequency distribution. Further description of a
frequency distribution is necessary and is provided by measures of skewness and measures of
kurtosis. These measures tell us, in a way, what a frequency distribution would look like if
presented diagrammatically by a smoothed frequency polygon.

A distribution is said to be symmetrical about the mean if, when presented diagrammatically, it
has a line of reflective symmetry through the mean. The best-known symmetrical distribution is
the bell-shaped Normal Curve.

[Figure: normal curve, with the mean, mode and median marked at the centre]

When a distribution is symmetrical about the mean, the mean and the median coincide, i.e. they
are equal. The mode, where a single mode exists, may also coincide with the mean. In this case
any of the three measures of central tendency (mean, mode or median) is as good as the others,
and the observations are arranged equally and symmetrically around the measure of central
tendency. Furthermore, in symmetrical distributions the sum of the positive and negative
deviations from the mean, median or mode is zero. The shape of the normal curve is always
bell-shaped.
Another example of a symmetrical distribution is the symmetrical bimodal distribution.

[Figure: symmetrical bimodal curve, with labels marking the modes and the median]
In this case, mean and median are not good as measures of central tendency since not many
values in the distribution cluster around them. The two modes are good as measures of central
tendency.

9.2 The concept of Skewness


Skewness refers to lack of symmetry. A skewed distribution is a frequency distribution that is
asymmetric (not symmetric). In a skewed distribution the values of the mean, median and mode
are not equal. The measure of skewness indicates how the manner in which the observations are
distributed in a particular distribution differs from the normal distribution.

9.2.1 Skewed to the right


When the longer tail of the distribution extends to the right, that is a few observations are
extremely large, the distribution is said to be positively skewed (skewed to the right). When a
distribution is skewed to the right, it contains a large number of relatively low scores and a few
extremely high scores. Generally the mode is less than the median, which in turn is less than the
mean.

[Figure: positively skewed curve, with the mode, median and mean marked from left to right]

9.2.2 Skewed to the left
On the other hand, when the longer tail of a distribution extends to the left, that is a few
observations are extremely small, the distribution is said to be negatively skewed (skewed to the
left). When a distribution is skewed to the left, it contains a large number of relatively high
scores and a few extremely low scores. Generally the mean is less than the median, which in turn
is less than the mode.

[Figure: negatively skewed curve, with the mean, median and mode marked from left to right]

Thus for a skewed distribution, the mean is different from the median, which in turn is different
from the mode.
In terms of skewness, a frequency curve can be

• Symmetrical – where the mean, median and mode have the same value

• Skewed to the right or positively skewed, that is, Mode < median < mean

• Skewed to the left or negatively skewed, that is, Mode > median > mean
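
The ordering of the three averages can be observed on a small data set. In the Python sketch below the two data sets are hypothetical, made up only for illustration, and the helper `describe_skew` is our own name. Comparing the mean with the median is a rough rule of thumb for the direction of skew, not a formal measure of skewness.

```python
from statistics import mean, median, multimode


def describe_skew(values):
    """Suggest the direction of skew by comparing the mean with the median."""
    m, md = mean(values), median(values)
    if m > md:
        return "positively skewed"
    if m < md:
        return "negatively skewed"
    return "no skew indicated"


right_tailed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]    # hypothetical: a few very large values
left_tailed = [15 - x for x in right_tailed]       # mirror image: a few very small values

for data in (right_tailed, left_tailed):
    print(multimode(data), median(data), mean(data), describe_skew(data))
```

For the first set the mode (2) is less than the median (3), which is less than the mean (4.5); for the mirrored set the order is reversed.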
Skewness is different from variation in that;
(i) Variation tells us about the amount of spread while skewness tells us about the direction
of spread.

(ii) In business and economic series, measures of variation have greater practical application
than measures of skewness.

(iii) The dispersion indicates how far the mean is representative of values. Skewness helps in
judging if the distribution is normal.

9.3 The concept of Kurtosis
Kurtosis indicates the “peakedness” of a distribution.

In describing a frequency distribution, we use an average to show the typical value or central
tendency in the distribution, a measure of variation to show the spread of values and a measure
of skewness to show the direction of the spread of values. The measure of kurtosis is the fourth
device in describing a frequency distribution and can be used to show the degree of
concentration of the values. When the values are more concentrated around the mode, we have a
peaked curve and when the values are widely spread from the mode in both directions we have a
flat-topped curve. Kurtosis refers to the convexity of a curve. It enables us to have an idea about
the flatness or the peakedness of the curve. The degree of kurtosis of a distribution is measured
relative to the peakedness of a normal curve.

• If a curve is more peaked than the normal curve, it is called “leptokurtic”;

• If it is more flat-topped than the normal curve, it is called “platykurtic” or “flat-topped”.

• The normal curve is called “mesokurtic”.


The diagram below illustrates the three different curves

[Figure: three curves labelled Leptokurtic, Mesokurtic and Platykurtic]


Review Questions
1. What do you understand by the terms skewness and kurtosis? Point out their roles in
analyzing a frequency distribution.

2. Describe the meaning of a Symmetrical distribution or a normal curve

3. Define skewness and distinguish between Skewed to the right or positively skewed
and Skewed to the left or negatively skewed

4. Explain the meaning of Kurtosis

5. Distinguish between leptokurtic, platykurtic and mesokurtic


6. Distinguish between dispersion and skewness

References for further reading


Allan G. Bluman (1995). Elementary statistics: A step by step approach (2nd Edition) Wm. C.
Brown Publishers Melbourne, Australia

Harper W. M (1991). Statistics (6th Ed.) Harlow England: Longman Group UK Ltd.

Saleemi N. A. (1997). Statistics simplified. Acme Press (Kenya) Ltd. Nairobi, Kenya

10 CHAPTER TEN: CORRELATION AND REGRESSION ANALYSIS

Learning Objectives

By the end of this chapter the learner should be able to:

1. Explain the meaning of correlation and Regression;

2. Compute the Pearson’s product moment correlation coefficient and give an
interpretation of the values obtained;

3. Determine the Coefficient of determination

4. Compute the Spearman Rank Correlation Coefficient and give an interpretation of the
values obtained;

5. Determine the Regression Line Equation

6. Differentiate between Correlation and Regression

10.1 Introduction to Correlation Analysis


If two variables X and Y vary in such a way that changes in one variable are accompanied by
changes in the other variable, then the variables are said to be correlated.
Examples;
a) Increase in rainfall leads to increase in the sale of umbrellas;
b) Increase in inflation leads to increase in the price of commodities;
c) Increase in bank interest rates leads to decrease in the rate of borrowing;
d) Can you think of other examples?

In each of the examples given, changes in one variable are accompanied by changes in the other
variable;
Correlation can be zero, negative or positive

• If an increase in one variable is associated with a corresponding increase in the other
variable, the correlation is Positive.

• If an increase in one variable is associated with a corresponding decrease in the other
variable, the correlation is Negative.

Correlation between two variables is simply the extent to which their values vary together
systematically. The primary objective of investigating the correlation between two variables is to
determine whether there is any relationship between them that might point to a causal
connection; a correlation on its own does not prove that one variable causes the other.
Furthermore, correlation techniques are used in predicting the values of one variable from the
values of another variable.

When only two variables are involved, we speak of simple correlation and when more than two
variables are involved, we have multiple correlation.

A useful method of investigating whether there is any correlation between two variables is to
draw a scatter diagram. The values of one variable, say Y (the dependent variable), are measured
along the y-axis and plotted against the corresponding values of another variable, say X
(the independent variable), along the x-axis.
The following scatter diagrams arise;

10.1.1 Pearson’s product moment correlation coefficient
The degree of association between two variables (correlation) can be described by a visual
representation or by a number (termed a coefficient) indicating the strength of association. The
quantitative computation of the correlation was first derived in 1896 by Karl Pearson and is
referred to as “Pearson’s product moment correlation coefficient.” It is denoted by the letter r.
The correlation coefficient varies between -1 and +1. That is, -1 ≤ r ≤ +1.
When r = 1, there is a perfect positive linear relationship between the variables.
When r = 0, there is no linear relationship between the variables.
When r = -1, there is a perfect negative linear relationship between the variables.
When r is between -1 and 0, it indicates a negative relationship between the variables. In this
case, the closer r is to -1, the stronger the negative relationship; the closer r is to 0, the
weaker the negative relationship.

When r is between 0 and +1, it indicates a positive relationship between the variables. In this
case, the closer r is to +1, the stronger the positive relationship; the closer r is to 0, the
weaker the positive relationship.

Correlation Coefficient
The correlation coefficient computed from the sample data measures the strength and direction
of a linear relationship between two variables. The symbol for the sample correlation coefficient
is r.
The correlation coefficient r is given by;

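$$
r = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\left\{\left(\sum X_i^2 - n\bar{X}^2\right)\left(\sum Y_i^2 - n\bar{Y}^2\right)\right\}^{1/2}}
$$

Or

$$
r = \frac{n\sum xy - \sum x \sum y}{\left[\left\{n\sum x^2 - \left(\sum x\right)^2\right\}\left\{n\sum y^2 - \left(\sum y\right)^2\right\}\right]^{1/2}}
$$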

Where:
n represents the number of pairs of data
∑ denotes the summation of the items indicated
∑X denotes the sum of all X scores
∑X² indicates that each X score should be squared and then those squares summed
(∑X)² indicates that the X scores should be summed and the total squared. [Avoid
confusing ∑X² (the sum of the squared X scores) and (∑X)² (the square of the sum
of the X scores)]
∑Y denotes the sum of all Y scores
∑Y² indicates that each Y score should be squared and then those squares summed
(∑Y)² indicates that the Y scores should be summed and the total squared
∑XY indicates that each X score should first be multiplied by its corresponding Y score
and the products (XY) summed
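
The deviation form and the raw-sums form of the formula for r give the same value. A short sketch of why, writing x̄ = ∑x/n and ȳ = ∑y/n (lower-case symbols are used here purely for brevity):

```latex
\begin{align*}
\sum (x_i - \bar{x})(y_i - \bar{y}) &= \sum x_i y_i - n\bar{x}\bar{y}
  = \frac{n\sum xy - \sum x \sum y}{n}, \\
\sum x_i^2 - n\bar{x}^2 &= \frac{n\sum x^2 - \left(\sum x\right)^2}{n}, \qquad
\sum y_i^2 - n\bar{y}^2 = \frac{n\sum y^2 - \left(\sum y\right)^2}{n}, \\
r &= \frac{\frac{1}{n}\left(n\sum xy - \sum x \sum y\right)}
          {\frac{1}{n}\left[\left\{n\sum x^2 - (\sum x)^2\right\}\left\{n\sum y^2 - (\sum y)^2\right\}\right]^{1/2}}
   = \frac{n\sum xy - \sum x \sum y}
          {\left[\left\{n\sum x^2 - (\sum x)^2\right\}\left\{n\sum y^2 - (\sum y)^2\right\}\right]^{1/2}}.
\end{align*}
```

The factor 1/n cancels between numerator and denominator, so either form may be used; the raw-sums form is usually more convenient for hand calculation from a table.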

Example 1

Compute the Pearson’s product moment correlation coefficient (r) for the height-weight data of
students shown in the table below;
Height, cm (X)    174  175  176  177  178  182  183  186  189  193
Weight, kg (Y)     61   65   67   68   72   74   80   87   92   95

Solution;

The Pearson’s product moment correlation coefficient, r, is given by;

$$
r = \frac{n\sum xy - \sum x \sum y}{\left[\left\{n\sum x^2 - \left(\sum x\right)^2\right\}\left\{n\sum y^2 - \left(\sum y\right)^2\right\}\right]^{1/2}}
$$

Height, cm (X)    Weight, kg (Y)    XY             X²             Y²
174               61                10614          30276          3721
175               65                11375          30625          4225
176               67                11792          30976          4489
177               68                12036          31329          4624
178               72                12816          31684          5184
182               74                13468          33124          5476
183               80                14640          33489          6400
186               87                16182          34596          7569
189               92                17388          35721          8464
193               95                18335          37249          9025
∑X=1813           ∑Y=761            ∑XY=138646     ∑X²=329069     ∑Y²=59177

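Substituting the totals from the table:

$$
\begin{aligned}
r &= \frac{(10 \times 138646) - (1813 \times 761)}{\left[\left\{10 \times 329069 - 1813^2\right\}\left\{10 \times 59177 - 761^2\right\}\right]^{1/2}} \\
  &= \frac{6767}{6860.5342} \\
  &= 0.9864
\end{aligned}
$$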
There is a high positive linear relationship between the variables. This implies that there is a
strong positive relationship between the heights and weights of the students.
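
The hand calculation above can be checked with a few lines of Python. This is a minimal sketch of the raw-sums formula; the function name `pearson_r` and the variable names are chosen here for illustration only.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r using the raw-sums formula from this section."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two pairs")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when one variable is constant")
    return numerator / denominator


# Height-weight data from Example 1
heights = [174, 175, 176, 177, 178, 182, 183, 186, 189, 193]
weights = [61, 65, 67, 68, 72, 74, 80, 87, 92, 95]

r = pearson_r(heights, weights)
print(round(r, 4))  # 0.9864
```

The intermediate sums inside the function correspond to the column totals of the table, so each one can be compared with the hand-computed value if the final answer does not agree.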

Example 2
The table below shows marks obtained by ten students selected randomly in two mathematics
tests done in a certain school term. Use the information to calculate the Pearson’s product
moment correlation coefficient, r and comment;

Student    A    B    C    D    E    F    G    H    I    J
Test 1     86   45   70   66   80   55   50   88   50   90
Test 2     32   76   40   40   30   50   92   28   78   25

Solution

Test 1 (X)    Test 2 (Y)    XY            X²            Y²
86            32            2752          7396          1024
45            76            3420          2025          5776
70            40            2800          4900          1600
66            40            2640          4356          1600
80            30            2400          6400          900
55            50            2750          3025          2500
50            92            4600          2500          8464
88            28            2464          7744          784
50            78            3900          2500          6084
90            25            2250          8100          625
∑X=680        ∑Y=491        ∑XY=29976     ∑X²=48946     ∑Y²=29357

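$$
r = \frac{n\sum xy - \sum x \sum y}{\left[\left\{n\sum x^2 - \left(\sum x\right)^2\right\}\left\{n\sum y^2 - \left(\sum y\right)^2\right\}\right]^{1/2}}
$$

$$
\begin{aligned}
r &= \frac{(10 \times 29976) - (680 \times 491)}{\left[\left\{10 \times 48946 - 680^2\right\}\left\{10 \times 29357 - 491^2\right\}\right]^{1/2}} \\
  &= \frac{-34120}{\sqrt{27060 \times 52489}} \\
  &= \frac{-34120}{37687.562} \\
  &= -0.9053
\end{aligned}
$$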
There is a high negative linear relationship between the variables. This implies that there is a
strong negative relationship in performance of Test 1 and Test 2 among the ten students.
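
As a cross-check, the same value can be obtained from the deviation form of the formula, in which each score is measured from its mean. The sketch below is illustrative; the function name `pearson_r_deviations` is an assumption made for this example.

```python
from math import sqrt


def pearson_r_deviations(xs, ys):
    """Pearson's r computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Marks from Example 2
test1 = [86, 45, 70, 66, 80, 55, 50, 88, 50, 90]
test2 = [32, 76, 40, 40, 30, 50, 92, 28, 78, 25]

r = pearson_r_deviations(test1, test2)
print(round(r, 4))  # -0.9053
```

The negative sign comes entirely from the numerator: students above the Test 1 mean tend to be below the Test 2 mean, so most products of deviations are negative.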

Interpreting Pearson’s Correlation Coefficient


The usefulness of the correlation depends on its size and significance. If r reliably differs from
0.00, the r-value will be statistically significant (i.e. does not result from a chance occurrence)
implying that if the same variables were measured on another set of similar subjects, a similar r-
value would result. If r achieves significance we conclude that the relationship between the two
variables was not due to chance.

How to Evaluate a Correlation


The values of r always fall between -1 and +1 and the value does not change if all values of
either variable are converted to a different scale. For example, if the weights of the students in
example 1 were given in pounds instead of kilograms, the value of r would not change (nor
would the shape of the scatter plot.)
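
The claim that r is unaffected by a change of units can be checked numerically. The sketch below recomputes r for Example 1 with the weights converted to pounds; the conversion factor 2.20462 is an approximate value supplied here, not taken from the text.

```python
from math import sqrt

KG_TO_LB = 2.20462  # approximate conversion factor (assumption for this sketch)


def pearson_r(xs, ys):
    """Pearson's r using the raw-sums formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


heights_cm = [174, 175, 176, 177, 178, 182, 183, 186, 189, 193]
weights_kg = [61, 65, 67, 68, 72, 74, 80, 87, 92, 95]
weights_lb = [w * KG_TO_LB for w in weights_kg]

r_kg = pearson_r(heights_cm, weights_kg)
r_lb = pearson_r(heights_cm, weights_lb)
print(round(r_kg, 4), round(r_lb, 4))  # 0.9864 0.9864
```

Multiplying every weight by the same positive constant scales the numerator and the denominator of r by that constant, so the ratio is unchanged.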
The size of any correlation is generally evaluated as follows:

Correlation Value    Interpretation
≤0.50                Very low
0.51 to 0.79         Low
0.80 to 0.89         Moderate
≥0.90                High (Good)

A high (or low) negative correlation has the same interpretation as a high (or low) positive
correlation. A negative correlation indicates that high scores in one variable are associated with
low scores in the other variable and vice versa.
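
The table can be applied mechanically once r is known. The sketch below is one possible reading of it: the sign is ignored when judging size, and values that fall between the printed bands (for example 0.795) are assigned to the lower band, which is an assumption made here. The function name `describe_correlation` is hypothetical.

```python
def describe_correlation(r):
    """Return the size label from the interpretation table for a given r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # the sign shows direction only, not strength
    if size >= 0.90:
        return "High (Good)"
    if size >= 0.80:
        return "Moderate"
    if size > 0.50:
        return "Low"
    return "Very low"


print(describe_correlation(0.9864))   # High (Good)
print(describe_correlation(-0.9053))  # High (Good)
print(describe_correlation(0.30))     # Very low
```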

10.1.2 Coefficient of Determination

o It is equal to the square of the correlation coefficient, that is, r²

o It shows what proportion of the variation in the dependent variable can be explained by the
independent variable

Example
The following data refers to the amount of money spent by 10 customers who visited a
supermarket in a certain year and their social class index.

Amount spent in supermarket (x), Kshs. in 1,000s    57   54   49   42   38   32   30   24   20   18
Social Class Index (y)                              113  111  107  103  100  96   94   84   74   76

Calculate
i). Correlation coefficient
ii). Coefficient of determination

Solution

X           Y           XY            X²            Y²
57          113         6441          3249          12769
54          111         5994          2916          12321
49          107         5243          2401          11449
42          103         4326          1764          10609
38          100         3800          1444          10000
32          96          3072          1024          9216
30          94          2820          900           8836
24          84          2016          576           7056
20          74          1480          400           5476
18          76          1368          324           5776
∑X=364      ∑Y=958      ∑XY=36560     ∑X²=14998     ∑Y²=93508

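i). We use the Correlation coefficient formula;

$$
r = \frac{n\sum xy - \sum x \sum y}{\left[\left\{n\sum x^2 - \left(\sum x\right)^2\right\}\left\{n\sum y^2 - \left(\sum y\right)^2\right\}\right]^{1/2}}
$$

$$
\begin{aligned}
r &= \frac{(10 \times 36560) - (364 \times 958)}{\left\{\left[(10 \times 14998) - (364 \times 364)\right]\left[(10 \times 93508) - (958 \times 958)\right]\right\}^{1/2}} \\
  &= \frac{16888}{\{17484 \times 17316\}^{1/2}} \\
  &= \frac{16888}{17399.797} \\
  &= 0.9706
\end{aligned}
$$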

There is a strong positive relationship between the amount of money spent in the supermarket and
the social class index.

ii). Coefficient of determination, r²

r² = (0.9706)² = 0.9420

This means that 94.2% of the variation in the social class index (dependent variable) can be
explained by the variation in the amount of money spent in the supermarket (independent
variable), and the remaining 5.8% is determined by other factors.
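
Because r² is the square of r, it can be computed directly from the table totals without taking a square root. The following is a minimal sketch; the function name `coefficient_of_determination` and the list names are illustrative.

```python
def coefficient_of_determination(xs, ys):
    """r squared, computed from the raw sums without a square root."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    return numerator ** 2 / denominator


# Supermarket data from the example above
spent = [57, 54, 49, 42, 38, 32, 30, 24, 20, 18]
social_class = [113, 111, 107, 103, 100, 96, 94, 84, 74, 76]

r_squared = coefficient_of_determination(spent, social_class)
print(round(100 * r_squared, 1))  # 94.2
```

Note that squaring removes the sign, so r² alone does not show whether the relationship is positive or negative.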

10.1.3 Spearman Rank Correlation coefficient
The correlation coefficient r is calculated from the actual values of the variables X and Y in the
sample data. However, in some cases, the relative orders of magnitude of these pairs of values are
more instructive than the values themselves. It is then more useful to assess the relationship
between the ranks of the two variables. In this case we use rank correlation.
The rank correlation is also known as the “Spearman Rank Correlation Coefficient”, applicable
when variables are ranked. It is given by;

$$
R = 1 - \frac{6\sum d^2}{n(n^2 - 1)} \quad \text{and} \quad d = u - v
$$

Where d = difference in rank

u = rank of first variable

v = rank of second variable

n = number of pairs of rankings


Example 1
The table shows the marks of students for Statistics I (Stats 1) and Statistics II (Stats 2).

Stats 1    80   60   65   50   35   30   90
Stats 2    80   50   60   55   45   30   95

Calculate the Spearman Rank Correlation Coefficient, R

Stats 1    Stats 2    Rank (Stats 1), u    Rank (Stats 2), v    d = u - v    d²
80         80         2                    2                    0            0
60         50         4                    5                    -1           1
65         60         3                    3                    0            0
50         55         5                    4                    1            1
35         45         6                    6                    0            0
30         30         7                    7                    0            0
90         95         1                    1                    0            0
                                                                             ∑d² = 2

$$
R = 1 - \frac{6\sum d^2}{n(n^2 - 1)}
$$

With ∑d² = 2 and n = 7:

$$
R = 1 - \frac{6 \times 2}{7(49 - 1)} = 0.9643
$$

There is a high degree of relationship between performances in the two subjects. The marks
obtained by the students in the two tests agree to a large extent or have a high positive
relationship.
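
The ranking and the formula can be sketched in Python as follows. This sketch ranks the highest mark as 1, as in the table, and assumes that no two marks in a list are equal; tied values need a different treatment that is not covered here. The function names are illustrative.

```python
def ranks_descending(values):
    """Rank the values so that the highest gets rank 1 (no ties allowed)."""
    if len(set(values)) != len(values):
        raise ValueError("tied values are not handled in this sketch")
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def spearman_r(xs, ys):
    """Spearman rank correlation, R = 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    n = len(xs)
    u = ranks_descending(xs)
    v = ranks_descending(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Marks from Example 1
stats1 = [80, 60, 65, 50, 35, 30, 90]
stats2 = [80, 50, 60, 55, 45, 30, 95]

print(ranks_descending(stats1))  # [2, 4, 3, 5, 6, 7, 1]
R = spearman_r(stats1, stats2)
print(round(R, 4))  # 0.9643
```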

Example 2
In a Drama festival involving five Primary schools, two adjudicators Jane and John awarded the
following marks.

School    St. Mary’s    Menengai    St. Xaviers    St. John’s    Lenana
Jane      84            80          72             70            78
John      88            76          78             74            82

Calculate the Spearman Rank Correlation Coefficient, R

Jane    John    Rank (Jane), u    Rank (John), v    d = u - v    d²
84      88      1                 1                 0            0
80      76      2                 4                 -2           4
72      78      4                 3                 1            1
70      74      5                 5                 0            0
78      82      3                 2                 1            1
                                                                 ∑d² = 6

$$
R = 1 - \frac{6\sum d^2}{n(n^2 - 1)}, \quad n = 5
$$

$$
R = 1 - \frac{6 \times 6}{5(5^2 - 1)} = 1 - \frac{36}{120} = 0.7
$$

The positive value of R (0.7) implies that the two adjudicators agreed to a fair extent in their
ranking of the schools.

10.2 Regression Analysis
Regression analysis attempts to discover the nature of the relationship between the variables, and
does this in the form of an equation. The equation can be used to predict one variable given that
sufficient information about the other variable is available.

• The variable whose values are to be predicted is called the response variable or
dependent variable and in the scatter diagram it is conventional to plot this variable on
the Y-axis.

• The variable on which the predictions are based is called the explanatory variable or
independent variable and in the scatter diagram it is conventional to plot this variable on
the X-axis.
The purpose of the regression line is to enable the researcher to see the trend and make
predictions on the basis of the data.

Regression analysis is a statistical procedure that can be used to develop a mathematical equation
showing how variables are related.

10.2.1 Simple Linear Regression Model


A single variable is used to predict another variable on the assumption of a linear relationship:

Y = a + bX

Where

Y is the dependent or response variable

X is the independent, explanatory or regressor variable

a represents the Y-intercept

b is the slope of the regression line and indicates the amount of change in the dependent
variable for a unit change in the independent variable

Determination of the Regression Line Equation


In algebra, under the topic of graphs, the equation of a line is usually given
as y = mx + b, where m is the slope of the line and b is the y-intercept.

The equation of the regression line is written as Y = a + bX, so here a is the intercept and b
is the slope. There are several methods for finding the regression line; we begin with one of them.

10.2.2 Formulas for the Regression line

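$$
Y = a + bX
$$

$$
b = \frac{n\left(\sum xy\right) - \left(\sum x\right)\left(\sum y\right)}{n\left(\sum x^2\right) - \left(\sum x\right)^2}
$$

$$
a = \bar{Y} - b\bar{X} \quad \text{or} \quad a = \frac{\sum Y}{n} - b\,\frac{\sum X}{n}
$$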
Example
i). Find the equation of the regression line for the data below which is obtained in the study of
age and blood pressure

Subject    Age (x)    Pressure (y)    xy        x²        y²
A          43         128             5,504     1,849     16,384
B          48         120             5,760     2,304     14,400
C          56         135             7,560     3,136     18,225
D          61         143             8,723     3,721     20,449
E          67         141             9,447     4,489     19,881
F          70         152             10,640    4,900     23,104
Total      345        819             47,634    20,399    112,443

Thus ∑x = 345, ∑y = 819, ∑xy = 47,634, ∑x² = 20,399, ∑y² = 112,443 and n = 6

We compute the values of a and b

$$
b = \frac{(6)(47{,}634) - (345)(819)}{6(20{,}399) - (345)^2} = 0.96438
$$

$$
a = \frac{819}{6} - 0.96438\left(\frac{345}{6}\right) = 81.048
$$

Hence, the equation of the regression line is: Y = 81.048 + 0.964X

The regression equation can be used to estimate the blood pressure given the age.
For example;
ii). Find the blood pressure for a person who is aged 50 years.

This means that the value of x = 50

Y = 81.048 + 0.964(50) = 129.25

A person who is 50 years of age is therefore predicted to have a blood pressure of around 129.
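
The formulas for a and b, and the prediction for a given age, can be sketched in Python. The function name `fit_line` is chosen here for illustration. Because the code keeps b unrounded, the prediction comes out as about 129.27 instead of the 129.25 obtained above with b rounded to 0.964; both round to 129.

```python
def fit_line(xs, ys):
    """Return (a, b) for the regression line Y = a + bX."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * sum_x / n
    return a, b


# Age and blood pressure data from the example
ages = [43, 48, 56, 61, 67, 70]
pressures = [128, 120, 135, 143, 141, 152]

a, b = fit_line(ages, pressures)
print(round(a, 3), round(b, 5))  # 81.048 0.96438

predicted = a + b * 50  # estimated pressure at age 50
print(round(predicted))  # 129
```

Predictions are most trustworthy for ages within the range of the data (43 to 70 here); the line says nothing reliable about ages far outside it.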

Another way of determining the regression equation is the method of least squares,
considered in the next part.
10.2.3 Method of Least square
This is the method of fitting the line of best fit. Our estimates of the true values of a and b
leave an error term or residual at each point, since a straight line can rarely pass exactly
through every point (it can only be the best fit).
The fitted line should pass through the points of the scatter diagram in such a manner that the
sum of the squares of the vertical deviations of these points from the line is a minimum.
Since some deviations are negative and others positive, we eliminate the signs by squaring each
deviation, then use the two normal equations to work out the values of a and b.
We have the normal equations

$$
\sum y = na + b\sum x
$$

$$
\sum xy = a\sum x + b\sum x^2
$$
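
Where the normal equations come from can be sketched briefly. The symbol S for the sum of squared vertical deviations is introduced here for illustration only:

```latex
\begin{align*}
S(a, b) &= \sum (y_i - a - b x_i)^2 \\
\frac{\partial S}{\partial a} &= -2\sum (y_i - a - b x_i) = 0
  \;\Longrightarrow\; \sum y = na + b\sum x \\
\frac{\partial S}{\partial b} &= -2\sum x_i (y_i - a - b x_i) = 0
  \;\Longrightarrow\; \sum xy = a\sum x + b\sum x^2
\end{align*}
```

Setting both partial derivatives to zero is the condition for S to be at its minimum, which is why solving the two equations simultaneously gives the line of best fit.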

Example
Apply the method of least squares to fit a straight line relationship (Regression of Y on X) for the
following points

x    -2.4    -0.8    0.3    1.9    3.2
y    -5.0    -1.5    2.5    6.4    11.0

Solution

To use the normal equations, first find x² and xy for each point

x          y           x²            xy            y²
-2.4       -5.0        5.76          12.0          25.0
-0.8       -1.5        0.64          1.2           2.25
0.3        2.5         0.09          0.75          6.25
1.9        6.4         3.61          12.16         40.96
3.2        11.0        10.24         35.2          121.0
∑x=2.2     ∑y=13.4     ∑x²=20.34     ∑xy=61.31     ∑y²=195.46

From the table, we have

∑x = 2.2, ∑y = 13.4, ∑x² = 20.34, ∑xy = 61.31, ∑y² = 195.46 and n = 5

Using the normal equations,

$$
\sum y = na + b\sum x
$$

$$
\sum xy = a\sum x + b\sum x^2
$$

and substituting the sums from the table, we obtain

5a + 2.2b = 13.4
2.2a + 20.34b = 61.31

The above two equations are simultaneous equations, and on solving them we have

b = 2.861, a = 1.421

The best straight line for the given values is

Y = 1.42 + 2.86X

This is also called the equation of the regression line of Y on X.
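
The two simultaneous equations can also be solved programmatically. The sketch below uses Cramer's rule for a two-by-two system, which is one of several ways to solve them (elimination or substitution by hand gives the same result); the function name `solve_normal_equations` is illustrative.

```python
def solve_normal_equations(n, sum_x, sum_y, sum_x2, sum_xy):
    """Solve  n*a + sum_x*b = sum_y  and  sum_x*a + sum_x2*b = sum_xy."""
    det = n * sum_x2 - sum_x * sum_x  # determinant of the coefficient matrix
    if det == 0:
        raise ValueError("no unique line: all x values are equal")
    a = (sum_y * sum_x2 - sum_x * sum_xy) / det
    b = (n * sum_xy - sum_x * sum_y) / det
    return a, b


# Totals from the table in the example above
a, b = solve_normal_equations(n=5, sum_x=2.2, sum_y=13.4, sum_x2=20.34, sum_xy=61.31)
print(round(a, 2), round(b, 2))  # 1.42 2.86
```

Substituting the returned values back into 5a + 2.2b and 2.2a + 20.34b should reproduce 13.4 and 61.31, which is a quick way to check the solution.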

10.3 Differences between Correlation and Regression


o Regression analysis studies the nature of the relationship between the variables, while the
coefficient of correlation is a measure of the degree of relationship between X and Y.

o Regression analysis studies both linear and non-linear relationships between variables, while
correlation analysis studies only the linear relationship between variables.

o The cause and effect relation is clearly indicated through regression analysis, but in
correlation we cannot say that one variable is the cause and the other the effect.

Review Questions
1. What is meant by the statement that two variables have a negative relationship?
2. Why is correlation important?
3. Define the term correlation coefficient.
4. Given the data of age and amount of money spent on buying music CDs in dollars ($):
Age (x)                    18   26   39   48   53   58
Amount of Money ($) (y)    16   12   9    5    6    2
Find the correlation coefficient and comment.
5. In a science congress competition two judges, Okelo and Mwangi, awarded the following
marks:
School    Menengai   Crater   Tumaini   Lenana   Moi   Umoja   Makini   Nakuru
Okelo     84         80       72        70       78    82      76       74
Mwangi    88         76       78        74       82    72      80       84
Calculate the Spearman rank correlation coefficient.
6. What is the general form of the regression line used in statistics?
7. Given the data of age and amount of money spent on buying music CDs in dollars ($):
Age (x)                    18   26   39   48   53   58
Amount of Money ($) (y)    16   12   9    5    6    2
Find the
a) Equation of the regression line
b) Plot of the regression line in (a) above.
c) Amount of money a 30 year old might spend in buying CDs by using the equation in
(a) above.
d) Regression line using the method of least squares. Do the results tally with those of
(a) above?

References for further reading


Allan G. Bluman (1995). Elementary statistics: A step by step approach (2nd Edition) Wn. C.
Brown Publishers. Melbourne, Australia

Harper W. M (1991). Statistics (6th Ed.) Harlow England: Longman Group UK Ltd.

Saleemi N. A. (1997). Statistics simplified. Acme Press (Kenya) Ltd. Nairobi, Kenya

Appendix 1: Sample Test Papers

DEPARTMENT OF EARLY CHILDHOOD STUDIES


End of Semester Examinations
BEM 4102: INTRODUCTION TO STATISTICS, MEASUREMENTS AND
EVALUATION
Time: 2 Hours
Instructions to Candidates: Answer question 1 (Compulsory) and any other TWO questions.

QUESTION ONE

a) State the meaning of evaluation and give two functions of evaluation (3mks)
b) i) Citing appropriate examples, explain the following qualities of a good test
• Reliability (2mks)
• Validity (2mks)
ii) Outline Four Factors that threaten the validity of a test (4mks)
c) The frequency distribution for marks obtained by 45 pupils of Sunshine Pre-school is as
shown below;
Marks 60-64 65-69 70-74 75-79 80-84
Frequency 6 15 12 8 4

Determine the mean of the class (4mks)

i) Present the information in a Histogram (4mks)


ii) State two differences between a Histogram and a Bar Chart (2mks)
d) Giving appropriate examples, distinguish between Formative and Summative evaluation
(6mks)
e) List the six levels of the Bloom’s Taxonomy that outline the order of learning (3mks)

QUESTION TWO

a) The table below shows the distribution of wages in Kshs. ‘000 of 50 employees of
Uchumi Supermarket
Wages (Kshs. ‘000)    10-14   15-19   20-24   25-29   30-34   35-39
No. of employees      6       16      12      8       6       2

Calculate; i) Mode (4mks)

ii) Median (4mks)

b) The table below shows production of Coffee and Tea for the period 1972-1977 in
thousands tonnes
Year 1972 1973 1974 1975 1976 1977
Coffee (‘000 tonnes) 120 150 180 220 210 160
Tea (‘000 tonnes) 110 130 160 190 200 140

Present the information in a Bar Chart (3mks)

c) State four limitations of statistics (4mks)


d) In a drama festival two judges A and B awarded the following Marks
School Bahati Umoja Union Chania Tana Diani Tiwi Ayani
Judge A 84 80 72 70 78 82 76 74
Judge B 88 76 78 74 82 72 80 84
Calculate the Spearman rank correlation coefficient (5mks)

QUESTION THREE

a) Define the following terms giving an example of each


i) Mode (2mks)
ii) Median (2mks)
b) Outline TWO advantages and TWO disadvantages of the three measures of central
tendency (12mks)
c) Mr. Kamau computed the average mark of 122 pupils in a History Exam. He found out
that the mean was 78. However it was realized that he had wrongly entered one student’s
mark as 80 instead of 50.
Calculate the adjusted mean (4mks)

QUESTION FOUR

a) State the meaning of Educational measurement and give two functions of measurement
(4mks)
b) Identify four guidelines for test construction (4mks)
c) Distinguish between Essay and Objective tests (4mks)
d) Give the advantages and disadvantages of essay tests (8mks)

QUESTION FIVE

a) Define the following terms


i) Range (4mks)
ii) Standard deviation (4mks)

b) Give ONE advantage and ONE disadvantage of Range (2mks)


c) The TSC investigated the number of days teachers in Molo district were absent from
school in the year 2011. Staff returns of 100 teachers selected at random was taken
and the following table was compiled
No. of days absent 5- 9 10-14 15-19 20-24 25-29 30-34 35-39 40-44
No. of Teachers 4 10 17 20 22 16 8 3
Calculate;

i) Variance (4mks)
ii) Standard deviation (4mks)
d) Represent the following diagrammatically showing the relationship between the
Mean, Mode and Median
i) Normal curve (2mks)
ii) Curve skewed to the Right (2mks)
e) Shimo la Tewa High School has five streams in Form 4 (E, W, N, S, C). In a trial
exam the following information was obtained
Stream E W N S C
Mean score 9.211 9.643 8.997 9.000 9.840
No. of candidates 62 55 50 60 42

Calculate the mean of the entire form 4 class (4mks)

DEPARTMENT OF EARLY CHILDHOOD STUDIES
End of Semester Examinations
BECC 425: INTRODUCTION TO STATISTICS, MEASUREMENTS TESTS AND
EVALUATION
Time: 2 Hours
Instructions to Candidates: Answer question 1 (Compulsory) and any other TWO questions.

QUESTION ONE

1. a) Give the meaning of the following terms


i. Tests (2 Marks)
ii. Measurement (2 Marks)
b) Outline the purpose of tests and measurement as related to instructional decisions
c) Categorize the following variables as either discrete or continuous (non-discrete)
i. time taken to complete a project
ii. length of a journey to a game reserve
iii. number of rooms in a house
iv. Volume of water in a tank (2 marks)
d) The frequency distribution for the marks obtained by 50 pupils of a pre- school is as shown
below

Marks 65-69 70-74 75-79 80-84 85-89


Frequency 7 16 13 9 5
Determine the mean of the class (4mks)

i) Present the information in a Histogram (4mks)

e) i) using appropriate illustrations explain the meaning of the term validity of a test (2 marks)

ii) Discuss the following types of test validity

Face Validity (3 marks)

Content validity (3 marks)

QUESTION TWO

a. Give the meaning of the term reliability of a test (2 marks )

b. Outline three factors that could impinge on the reliability of a test (3 marks )

c. Discuss any three methods of assessing the reliability of a test giving steps involved in
each (9 marks)

d. A mathematics teacher administered a test on a certain day. He then re-tested the students
with the same test after two weeks to determine its reliability. The marks obtained by ten
students in the two tests were as follows :

Student A B C D E F G H I J
Test I 7 8 9 9 10 12 14 12 11 12
Test II 16 17 19 21 20 24 22 23 22 20

Determine the Pearson’s Product Moment Correlation Coefficient and comment on the reliability
of the test (6 marks)

QUESTION THREE

a. Explain the types of Evaluation used in classroom instruction (4 Marks)

b. Outline the functions of Educational evaluation (6 marks)

c. The municipal Education Officer in Nakuru investigated the number of days teachers
were absent from school in the year 2010. The staff returns of 100 teachers selected at
random was taken and the results compiled below:

No. of days absent 5- 9 10-14 15-19 20-24 25-29 30-34 35-39 40-44
No. of Teachers 4 10 17 20 22 16 8 3
Calculate;

i) Variance (4mks)
ii) Standard deviation (2mks)
d. Represent the following diagrammatically showing the relationship between mean, mode
and median
i. Normal Curve
ii. Curve skewed to the left

QUESTION FOUR
a) Giving an appropriate example of each, explain the following terms
i) Mode (2mks)
ii) Median (2mks)
b) Outline TWO advantages and TWO disadvantages of the mean as a measure of central
tendency (4mks)
c) The temperature outside a factory was monitored at regular intervals on 80 occasions.
The frequency distribution is as follows:
Temperature 30.0-30.2 30.3-30.5 30.6-30.8 30.9-31.1 31.2-31.4 31.5-31.7 31.8-32.0
Frequency 6 12 15 20 13 9 5

Calculate: The mode (4 marks)


The median (4 Marks)
d) Madam Owiti computed the average performance of 42 students in a Kiswahili test. She
found out that the mean was 72. However, it was realized that she had wrongly entered one
student’s mark as 78 instead of 28. Calculate the adjusted mean. (4 marks)

QUESTION FIVE
a) Briefly discuss each of the six levels of cognitive learning as outlined by the Bloom
taxonomy (12 marks)

b) The two judges, John and Mike, in a county drama competition involving seven primary
schools awarded the following marks to the participating schools.
School Afraha Lanet Flamingo Langalanga Menengai Uhuru Crater

John 58 55 59 46 59 56 50

Mike 55 54 58 53 57 57 57

Calculate the Spearman rank correlation coefficient. Comment on your answer. (5 marks)
c) Utafiti High School has four Streams in Form 1 (E, W, N, and S). In a trial exam the
following information was obtained
Stream E W N S
Mean score 72 78 80 56
No. of candidates 58 55 50 62

Calculate the mean of the entire Form 1 class (3mks)
