INTRODUCTION TO STATISTICS,
MEASUREMENTS, TESTS AND EVALUATION
BEM 4102: Introduction to Statistics, Measurements and Evaluation
Credit Hours: 3
Pre-requisites: None
Purpose: To use tests and measurements appropriately
Course objective:
By the end of the unit the learner should be able to:
i) Analyze data using appropriate statistical measures in education
ii) Discuss various statistical methods available
iii) Explain the nature of educational tests
iv) Use locally acceptable measurement techniques
v) Appreciate the use of central tendency measures in tests and measurement
Course content
Meaning of educational measurement, Philosophy and nature of educational testing and
measurement; Reliability and validity; Forms of evaluation; Discrete and non-discrete data;
Central tendency measurement; Measuring variance; the concept of the normal curve;
Correlation and regression tests.
COURSE OUTLINE
WEEK 1
CHAPTER ONE: INTRODUCTION TO MEASUREMENT, TESTS AND EVALUATION
• Meaning and purpose of Tests and Measurement
• Meaning of Evaluation
• Purpose/Functions of Educational Evaluation
• Principles of Evaluation
• Types of Evaluation used in classroom instruction
WEEK 2 & 3
CHAPTER TWO: INSTRUCTIONAL OBJECTIVES, TEST DEVELOPMENT AND
TEST IMPROVEMENT
• Definitions
• Type of objectives
• Test Development
• Planning the test and steps to ensure successful test planning by the teacher
• General guidelines in test construction
• Test improvement
WEEK 4
CHAPTER THREE: TYPES OF TESTS
• The Essay tests
• Merits (advantages) and Demerits (limitations) of essay tests
• Suggestions to reduce limitations of essay tests
• Objective test
• Advantages and Disadvantages of objective tests
• Supply item tests;
• Selection item tests; True – False tests; Matching – item tests; Multiple-choice tests
and Pictorial – item tests
• Rank – order test items
WEEK 5
CHAPTER FOUR: CHARACTERISTICS OF GOOD TESTS
• Reliability of a Test
• Factors impinging on test reliability
• Methods of Assessing Reliability
• Validity of a Test
• Types of test validity
• Factors threatening test validity
• Other characteristics of a good test
WEEK 6
CHAPTER FIVE: INTRODUCTION TO STATISTICS
• Introduction; Importance and Limitations of Statistics
• Subdivisions in statistics
• Scales of Measurements
• Variables and their Classification
• Discrete and Non-discrete data
• Sources of data
WEEK 7
CHAPTER SIX: DATA COLLECTION, ORGANIZATION AND PRESENTATION
• Collection and Presentation of Data
• Organizing Data
• Presentation of Data;
• Bar charts; Multiple Bar Charts; Composite Bar Charts and Pie Charts
• General Rules of Forming Frequency Distribution
• Histograms
• Cumulative Frequency
WEEK 8 & 9
CHAPTER SEVEN: MEASURES OF CENTRAL TENDENCY
• The Mean and its computation
• Median and its computation
• Mode and its computation
• Merits and demerits of the Measures of Central Tendency
WEEK 10
CHAPTER EIGHT: MEASURES OF DISPERSION
• Definition of Dispersion
• Properties of a good measure of dispersion
• Significance of measures of dispersion
• Range and its computation
• Standard Deviation and its computation
• Variance and its computation
• Importance of variance and standard deviation
• Relative Dispersion (Coefficient of Variation)
WEEK 11
CHAPTER NINE: SKEWNESS AND KURTOSIS
• Symmetrical distribution
• Skewness
• Skewed to the right
• Skewed to the left
• Kurtosis
WEEK 12 & 13
CHAPTER TEN: CORRELATION AND REGRESSION ANALYSIS
REFERENCES
Gronlund, N. E., & Linn, R. L. (1990). Measurement and evaluation in teaching (6th ed.). New
York: Macmillan Publishing Company.
Kithuka, M. (2004). Educational measurement and evaluation: A guide to teachers. Egerton,
Kenya: Egerton University Press.
Ministry of Education (1987). A handbook for teachers of English in secondary school. Nairobi:
Jomo Kenyatta Foundation.
Saleemi, N. A. (1997). Statistics simplified. Nairobi, Kenya: Acme Press (Kenya) Ltd.
Johnson, R. A., & Bhattacharyya, G. K. (2010). Statistics: Principles and methods (6th ed.).
USA: John Wiley and Sons Inc.
TABLE OF CONTENTS
COURSE OUTLINE
CHAPTER ONE: INTRODUCTION TO MEASUREMENT, TESTS AND EVALUATION
1.1 Meaning of Tests
1.2 Measurement
1.3 Purpose of Tests and Measurements
1.4 Meaning of Evaluation
1.5 Purpose/Functions of Educational Evaluation
1.6 Principles of Evaluation
1.7 Types of Evaluation used in classroom instruction
1.7.1 Formative Evaluation
1.7.2 Summative Evaluation
Review Questions
CHAPTER TWO: INSTRUCTIONAL OBJECTIVES, TEST DEVELOPMENT AND
TEST IMPROVEMENT
2.1 Introduction
2.2 Definitions
2.3 Type of objectives
2.4 Test Development
2.4.1 Planning the test
2.4.2 Steps to ensure successful test planning by the teacher
2.5 General guidelines in test construction
2.6 Test improvement
2.6.1 Test Tryout
2.6.2 Establishing Test Reliability and Validity
Review Questions
CHAPTER THREE: TYPES OF TESTS
3.1 Introduction
3.2 The Essay tests
3.2.1 Merits (advantages) of essay tests
3.2.2 Demerits (limitations) of essay tests
3.2.3 Suggestions to reduce limitations of essay tests
3.3 Objective test
3.3.1 Advantages of objective tests
3.3.2 Disadvantages of objective tests
3.4 Supply item tests
3.5 Selection item tests
3.5.1 True – False tests
3.5.2 Matching – item tests
3.5.3 Multiple-choice tests
3.5.4 Pictorial – item tests
3.6 Rank – order test items
3.7 Summary of the suggestions for all testing techniques
Review Questions
CHAPTER FOUR: CHARACTERISTICS OF GOOD TESTS
4.1 Introduction
4.2 Reliability of a Test
4.2.1 Factors impinging on test reliability
4.2.2 Methods of Assessing Reliability
4.3 Validity of a Test
4.3.1 Types of test validity
4.3.2 Factors threatening test validity
4.4 Other characteristics of a good test
4.4.1 Administrability
4.4.2 Scorability
Review Questions
CHAPTER FIVE: INTRODUCTION TO STATISTICS
5.1 Introduction
5.2 Importance of Statistics
5.3 Limitations of Statistics
5.4 Subdivisions in statistics
5.5 Scales of Measurements
5.6 Variables: meaning and classification
5.6.1 Classification of variables
5.6.2 Discrete and Non-discrete data
5.7 Sources of data
Review Questions
CHAPTER SIX: DATA COLLECTION, ORGANIZATION AND PRESENTATION
6.1 Collection and Presentation of Data
6.2 Organizing Data
6.3 Presentation of Data
6.3.1 Bar charts
6.3.2 Multiple Bar Charts
6.3.3 Composite Bar Charts
6.3.4 Pie Charts
6.4 General Rules of Forming Frequency Distribution
6.5 Histograms
6.6 Cumulative Frequency
Review Questions
CHAPTER SEVEN: MEASURES OF CENTRAL TENDENCY
7.1 Introduction
7.2 The Arithmetic Mean
7.2.1 Mean from Ungrouped data
7.2.2 Mean from Frequency Distribution
7.2.3 Mean from Grouped Data
7.2.4 Weighted mean
7.2.5 Combined mean
7.2.6 Adjusting mean for a wrong entry
7.3 The Median – Meaning and computation
7.4 The Mode – meaning and computation
7.5 Merits and demerits of the Measures of Central Tendency
Review Questions
CHAPTER EIGHT: MEASURES OF DISPERSION
8.1 Definition of Dispersion
8.2 Properties of a good measure of dispersion
8.3 Significance of measures of dispersion
8.4 Range: Meaning and computation
8.5 Variance as a measure of dispersion
8.6 Standard Deviation
8.7 Relative Dispersion (Coefficient of Variation)
8.8 Importance of variance and standard deviation
Review Questions
CHAPTER NINE: THE CONCEPT OF THE NORMAL CURVE
9.1 Introduction
9.2 The concept of Skewness
9.2.1 Skewed to the right
9.2.2 Skewed to the left
9.3 The concept of Kurtosis
Review Questions
CHAPTER TEN: CORRELATION AND REGRESSION ANALYSIS
10.1 Introduction to Correlation Analysis
10.1.1 Pearson’s product moment correlation coefficient
10.1.2 Coefficient of Determination
10.1.3 Spearman Rank Correlation coefficient
10.2 Regression Analysis
10.2.1 Simple Linear Regression Model
10.2.2 Formulas for the Regression line
10.2.3 Method of Least Squares
10.3 Differences between Correlation and Regression
Review Questions
Appendix 1: Sample Test Papers
1 CHAPTER ONE: INTRODUCTION TO MEASUREMENT, TESTS AND
EVALUATION
Concern for the quality of education pupils are receiving in relation to the money being spent on
education has been a major factor in the current demand for accountability. Procedures for
holding educators accountable for effective educational programmes tend to be supported by
citizens but opposed by educators.
Parents complain that their children are unable to read and write effectively after primary
education. Many secondary school leavers find it difficult to write or express themselves well in
their language of communication. These problems point to the need to overhaul the educational
system as a whole; but, before doing so, we must be well acquainted with measurement and
evaluation procedures through which reliable data about the status of the educational system can
be objectively determined. In this sense it implies both the process of collecting and ordering
the information and the result of this information.
Learning Objectives
1.2 Measurement
Measurement refers to the use of a scale to determine a learner’s degree or level of
achievement or some other attribute. Through measurement you may be able to establish a
learner’s intelligence, reading readiness and level, ability to comprehend and ability to use
language. In education the tools used for measurement are tests, experiments and examinations.
Numbers are assigned to learners according to a carefully prescribed, repeatable procedure. The
numbers are also assigned so that the differences between scores represent differences in the
property or characteristic being measured.
Measurement has one main goal: the ability to describe, explain and predict the performance of a
person, process or system in a precise manner. To a large extent it is concerned with finding out
how well students are performing in terms of specific objectives. In other words, the process of
measurement is secondary to that of defining objectives. The ends to be achieved must first be
formulated clearly. Then measurement procedures can be sought as tools for appraising the
extent to which those ends have been achieved.
NOTE: A test given to determine how much the students have learnt is generally referred to as
an achievement test or an attainment test. The daily, weekly, end-of-term and end-of-year tests
are all examples of achievement tests. An achievement test is therefore relevant only if it
determines how much the students have learnt; if it does not, it has been misused. For an
achievement test to be valid, the teacher or examiner should observe the following
principles:
i. Questions should be set from all parts of the syllabus;
ii. The number of questions set from each section of the syllabus must reflect the relative
importance of that section.
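The two principles above can be checked mechanically once each question on a draft test is tagged with the syllabus section it comes from. The following is a minimal sketch in Python, not a prescribed procedure: the function name `check_test_coverage`, the section names, the weights and the 10% tolerance are all illustrative assumptions.

```python
from collections import Counter


def check_test_coverage(item_sections, section_weights, tolerance=0.10):
    """Check a draft test against the two principles for a valid achievement test.

    item_sections   -- one syllabus-section label per question on the test
    section_weights -- relative importance of each section (fractions summing to 1)
    tolerance       -- assumed allowable gap between a section's share of the
                       questions and its weight (a hypothetical threshold)
    """
    counts = Counter(item_sections)
    total = len(item_sections)
    report = {}
    for section, weight in section_weights.items():
        share = counts[section] / total if total else 0.0
        report[section] = {
            "items": counts[section],
            # Principle i: every part of the syllabus is examined.
            "covered": counts[section] > 0,
            # Principle ii: the share of questions tracks relative importance.
            "balanced": abs(share - weight) <= tolerance,
        }
    return report


# Hypothetical ten-question draft for a three-section syllabus.
weights = {"Algebra": 0.5, "Geometry": 0.3, "Statistics": 0.2}
draft = ["Algebra"] * 5 + ["Geometry"] * 5
report = check_test_coverage(draft, weights)
for section, result in report.items():
    print(section, result)
```

In this made-up draft, Statistics is not examined at all and Geometry takes a larger share of the questions than its weight suggests, so the test would need revising before use.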
a) Communicating the teacher’s goals and determining how much the learner has learnt and
also his difficulties.
b) Increasing motivation
c) Encouraging good study habits
d) Providing feedback that identifies strengths and weaknesses.
NOTE: The goals/objectives of instruction should be communicated to learners in advance of
any evaluation.
2. Guidance Decisions
Students need to be guided on their vocational choices, in their educational programmes and in
their personal problems. For students to make sound decisions in these areas, they need accurate
information. Tests provide students with data about significant characteristics which can help
them understand themselves better. Results of tests help teachers to guide students on subject
choices that determine long term career placement.
3. Administrative decisions
Administrative decisions include selection, classification and placement decisions. In selection
decisions, one decides whether to accept or reject a person for a particular programme. In
classification decisions, one decides which type of programme is suitable, for example when
enrolling in a college programme such as Arts, Law, Medicine or Engineering, and the level
at which the programme is offered.
Other Administrative/Supervisory functions of tests and measurement include;
Evaluation is likely to use tests and measurements as tools, to include other informal types of
evidence, and to integrate these into a value judgment of the effectiveness of an educational
enterprise. Since evaluative judgments are usually data-based, measurement is included in the
evaluation process as a functional sub-component, and the credibility of an evaluation depends
on the credibility of the measures used.
1.7 Types of Evaluation used in classroom instruction
1.7.1 Formative Evaluation
This refers to the evaluation that continues as the project implementation goes on. It is conducted
throughout the stages of project implementation. It is diagnostic in nature for the purpose of
improving the effectiveness and appropriateness of the whole project.
In education, formative evaluation helps a teacher to identify learners’ weaknesses and thus
enables the implementation of remedial measures. It provides feedback regarding the student’s
performance in attaining instructional objectives. It identifies learning errors that need to be
corrected and it provides information to make instruction more effective.
Formative evaluation aims at ensuring acquisition and development of knowledge and skills by
students. The purpose is to find out whether, after a learning experience, students are able to do
what they were previously unable to do. Formative evaluation therefore provides the evaluator
with useful information about the strengths or weaknesses of the student within an educational
context.
Common forms of formative evaluation used in many educational institutions include
continuous assessment tests and end-of-term examinations.
Review Questions
Activity 1
Categorize the following as either measurement or evaluation.
Activity 2
1. Define the following terms
i. Measurement
ii. Tests
iii. Evaluation
2. Distinguish between Measurements and Evaluation
3. Discuss the purpose of Tests and Measurements as related to educational setting
4. Outline the Purpose/Functions of Educational Evaluation
5. Explain the types of Evaluation used in classroom instruction
2 CHAPTER TWO: INSTRUCTIONAL OBJECTIVES, TEST DEVELOPMENT AND
TEST IMPROVEMENT
2.1 Introduction
Traditionally, educational measurement has been very helpful in determining the degree to which
certain objectives have been achieved. If education is to be effective, frequent assessment must
be made of the extent to which the desired behavioral changes have been produced. This
evaluation of students’ achievement is based on clearly defined instructional objectives. There is
a spiral relationship between objectives, instruction and evaluation. This means that any testing
programme needs to be based on the existing educational objectives.
Learning Objectives
By the end of this chapter the learner should be able to:
i. State the types of instructional objectives and explain the importance of writing
objectives.
ii. Describe Bloom’s Taxonomy for classifying educational objectives
iii. Classify cognitive behaviour in six levels
iv. Explain the steps in test development
v. Prepare a table of specification for a test of a given content area
vi. State the guidelines in test construction
vii. Describe the criteria for test improvement
2.2 Definitions
1. Cognitive domain
This is rational learning that calls for thinking. Its emphasis is upon knowledge, using the mind,
and intellectual abilities. Objectives in this domain are often written as instructional or
behavioral objectives that begin with verbs, and they are classified using what we know as
Bloom’s Taxonomy.
2. Affective domain
This deals with emotional learning and has much to do with feelings. It is concerned with
attitudes, appreciations, interests, values and adjustments.
3. Psychomotor domain
This is physical learning that is characterized by doing. It emphasizes speed, accuracy,
dexterity (agility), and physical skills.
1. Knowledge
This involves the recall or recognition, in an appropriate context, of material, whether specific
facts, universal principles, methods, processes, patterns, structures or settings. Little is required
besides bringing to mind the appropriate material; e.g. recall of major facts about particular
cultures. Verbs applicable include: Arrange, Define, List.
2. Comprehension
This is the lowest level of what is commonly called “understanding” and requires that the
individual be able to paraphrase knowledge accurately, to explain or summarize it in his own
words, or to show logical extensions in terms of implications or corollaries; e.g. skill in
translating verbal descriptions of mathematical material into symbolic statements and vice versa.
Verbs applicable include: Classify, Describe, Discuss.
3. Application
This is the ability to select an abstraction (an idea, rule of procedure, or generalized method)
appropriate for a new situation and to apply it correctly; e.g. the ability to predict the probable
effect of a change in a factor, such as an educational programme, on a social situation previously
at equilibrium. Verbs applicable include: Apply, Choose, Write.
4. Analysis
This is the ability to break apart a communication or concept into its constituent elements to
show the hierarchy or other internal relations of ideas, to show the basis for its organization,
and to indicate how it conveys its effects; e.g. the ability to recognize form and pattern in
literary and artistic works as a way of understanding their meaning. Verbs applicable include:
Compare, Contrast, Analyze.
5. Synthesis
This is the arrangement and combination of pieces, parts, elements, etc., in such a way as to
constitute a pattern or structure not there before; e.g. the ability to tell a personal experience
effectively. Verbs applicable include: Construct, Create, Design.
6. Evaluation
This is the qualitative and quantitative judgment about the extent to which material and methods
satisfy criteria determined by the teacher or student; e.g. the ability to compare a work with the
highest known standards in its field, especially with other works of recognized excellence. Verbs
applicable include: Appraise, Defend, Judge.
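Because each cognitive level is associated with characteristic verbs, a draft objective can be given a first, rough classification from its opening verb. The sketch below is illustrative only: it uses just the three example verbs listed for each level above, and the function name `classify_objective` is hypothetical. An objective whose verb is not in the lists is left unclassified for the teacher to judge.

```python
# Example verbs for each cognitive level, taken from the descriptions above.
BLOOM_VERBS = {
    "Knowledge": ["arrange", "define", "list"],
    "Comprehension": ["classify", "describe", "discuss"],
    "Application": ["apply", "choose", "write"],
    "Analysis": ["compare", "contrast", "analyze"],
    "Synthesis": ["construct", "create", "design"],
    "Evaluation": ["appraise", "defend", "judge"],
}


def classify_objective(objective):
    """Return the cognitive level suggested by the objective's first word, or None."""
    words = objective.strip().split()
    if not words:
        return None
    verb = words[0].lower().strip(".,;:")
    for level, verbs in BLOOM_VERBS.items():
        if verb in verbs:
            return level
    return None  # verb not in the example lists; classify by hand


objectives = [
    "Define the term measurement",
    "Compare essay tests and objective tests",
    "Design a table of specifications for a given content area",
    "Explain the steps in test development",
]
for objective in objectives:
    print(objective, "->", classify_objective(objective))
```

The last objective is returned as unclassified because "Explain" is not among the example verbs, which shows why a verb list can only support, not replace, the teacher's judgment.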
• The test produced that way generally contains items that are poorly conceived, poorly
worded, ambiguous, and sometimes grammatically incorrect.
• Furthermore, the test may contain items that are either not scorable or have more than one
correct answer.
• It may be either too easy or too difficult and may be measuring trivial details rather than
the more important pervasive outcomes of learning.
Writing items that are valid, reliable and objectively scorable requires time, energy, and
adequate planning. Professional item writers are seldom able to write more than ten good items
per day. So it is unrealistic to expect the ordinary classroom teacher to be able to prepare a
100-item test if he begins writing the test only a few days before it is scheduled. The solution
to the problem lies in adequate planning and in spreading out the item writing over a long period
of time.
Ideally, every test should be reviewed critically by other teachers to minimize deficiencies.
The teacher should therefore prepare the test early enough to permit a critical, independent
review.
Undoubtedly, the most difficult step in test planning is the specification of objectives. Yet
this step is essential, for without objectives the teacher will not know what is to be measured.
Step II: Preparing the Table of Specification
The second major question that the classroom teacher (who has become a test constructor) must
ask him/herself is:
“What is it that I wish to measure?”
Thus, the teacher must know what he wants to measure. For instance, should the teacher test for
factual knowledge, or should he test the extent to which students are able to apply their factual
knowledge? The answer to this question depends upon the teacher’s instructional objectives and
what has been stressed in class. If the teacher emphasized the recall of names, places, and dates,
he should test for this. On the other hand, if in chemistry, he had stressed the interpretation of
data, then his test, in order to be a valid measure of his teaching, should emphasize the
measurement of interpretation of data.
In this stage of thinking about the test, the teacher must consider the relationships among his
objectives, teaching, and testing. Once the course content and instructional objectives have been
specified, the teacher is ready to integrate them in some meaningful way so that the test, when
completed, will be a measure of the student’s knowledge.
Kithuka (2004) defines a table of specifications as “a two-dimensional table that describes the
nature of items (to be included in a test). It shows whether the item will be testing knowledge,
comprehension, application, analysis, synthesis, or evaluation.”
There are different ways of preparing a table of specifications, depending on the areas being
tested. Generally, tables of specifications have some commonalities. Among them are course
content, behaviour, number of test items and percentage of items.
Kithuka (2004) suggests the following as the steps in constructing a table of specifications:
i. List the general behavioral objectives at the top of the matrix table;
ii. List the content taught on the left hand side of the matrix;
iii. Decide on the length of the test in terms of the number of questions.
iv. Decide on the weighting of the objectives guided by the level of learners.
v. Decide on the weighting of the content taught guided by the amount of time spent on it.
vi. Distribute items in the different cells based on the weighting. Instead of the number of
questions in each cell, the particular item numbers (e.g. Q1, Q2, Q3 etc.) can be written in
the cells. This better helps other users of the test to determine whether or not the items
classified against each cognitive skill truly belong where they are placed.
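The steps above can be sketched in code. This is only a minimal Python illustration, not part of Kithuka's procedure: the function name allocate_items, the topics, the lesson counts and the objective weights are all hypothetical, and the largest-remainder rounding rule is an assumption made so that the cell counts add up to the chosen test length.

```python
def allocate_items(total_items, content_weights, objective_weights):
    """Spread total_items over a content x objective grid in proportion to the weights.

    Weights are whole numbers (e.g. lessons taught, or relative emphasis).
    Returns a nested dict: table[content][objective] = number of items.
    """
    # Weight of a cell = content weight x objective weight (steps iv and v).
    cells = {(c, o): cw * ow
             for c, cw in content_weights.items()
             for o, ow in objective_weights.items()}
    total_weight = sum(cells.values())

    counts, remainders = {}, {}
    for cell, weight in cells.items():
        # Whole-number share of the test for this cell, plus what is left over.
        counts[cell], remainders[cell] = divmod(total_items * weight, total_weight)

    # Hand out the items lost to rounding, largest remainder first (an assumed rule).
    leftover = total_items - sum(counts.values())
    for cell in sorted(cells, key=lambda k: remainders[k], reverse=True)[:leftover]:
        counts[cell] += 1

    return {c: {o: counts[(c, o)] for o in objective_weights} for c in content_weights}


# Hypothetical 20-item mathematics test.
content = {"Fractions": 5, "Decimals": 3, "Percentages": 2}          # lessons spent
objectives = {"Knowledge": 4, "Comprehension": 4, "Application": 2}  # relative emphasis
table = allocate_items(20, content, objectives)
for topic, row in table.items():
    print(topic, row, "total:", sum(row.values()))
```

Because each cell is rounded to a whole number of questions, a row or column total may differ by one item from the exact weighting; the teacher can adjust such cells by judgment.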
The table below is an illustrative example of the table of specifications
There are various item formats to select from. Some are less appropriate than others for
measuring certain objectives. For instance, if the objective to be measured is stated as
“the student will be able to organize his ideas and write them in a logical and coherent manner”, it
would be inappropriate to have him select his answer from a series of possible answers. If the
objectives are about recalling names, places, dates and events, it would not be efficient to use a
lengthy essay question. Although there are instances where the instructional objectives can be
measured by different item formats, the teacher should use the least complicated one.
In taking the final decision as to the item format(s) to be used, the test constructor should be
governed by such factors as:-
i. The purpose of the test;
ii. The time available to prepare and score;
iii. The number of students to be tested;
iv. The physical facilities available for reproducing the test;
v. The teacher’s skill in writing the different types of items;
2.5 General guidelines in test construction
Zulueta (2006) suggests that the following fundamental principles should be observed to guide
teachers when they construct their evaluation tests:
1. Measure all instructional objectives; teachers should construct tests to measure clearly
the prescribed learning objectives that have been communicated and imparted to the
learners. The test is designed as an operational control to guide the learning sequences
and experience and should be in harmony with the teacher’s instructional objectives.
2. Cover all important learning tasks; a good test focuses and measures a representative
sample of learned tasks.
3. Use appropriate test items; a good test usually includes items that are most appropriate
for a particular objective to check on learner achievement. Some test questions are better
for measuring recall of specific information, while other types are good for tapping higher
level thinking processes and skills.
4. Make test reliable and valid; tests that are clearly written and minimize guessing are
more reliable than ambiguous ones. Tests that contain a fairly large number of items
or questions are generally more reliable than those with just a few questions or items.
Tests that are well planned and cover a wide range of objectives and topics and that are
well executed will most likely ensure validity. No matter what type of test the teacher
may use, it should be reliable and valid.
5. Use tests to improve learning; this principle reminds teachers that even though tests may
be used primarily to diagnose or evaluate learners’ achievement, in effect they can also
be a learning experience.
However, one of the common mistakes of teachers is that they do not check on the effectiveness
of their tests. The probable reasons for the behavior include:-
a) Teachers feel that test analysis is too time-consuming;
b) Teachers are not aware of the methods of analyzing tests;
c) Teachers do not always understand the importance of accurate evaluation.
This section presents some procedures in analyzing test items and interpreting the results.
In improving the quality of tests, two main steps are generally followed: trying out the test and
establishing the test reliability and validity.
There are a variety of item analysis procedures, but most of the procedures provide essentially
the same information. One method that can be used for item analysis is the U-L Index Method.
The steps of the U-L Index Method are:-
1. Score and rank the papers from the highest to lowest according to the total score.
2. Separate the top 27% and the bottom 27% of the papers
3. Tally responses made to each test item by each individual in the upper 27% group.
4. Tally responses made to each test item by each individual in the lower 27% group.
5. Compute the percentage of the upper group that got the item right and call it U.
6. Compute the percentage of the lower group that got the item right and call it L.
7. Average the U and L percentages; the result is the difficulty index of the item.
8. Subtract the L percentage from the U percentage; the result is the discrimination
index of the item.
By difficulty index, it is meant the percentage of the students who got the item right. It can
also be interpreted as how easy or how difficult an item is.
Good (1973) states that a discrimination index is an indication of the degree to which
individual test items discriminate among students in designated criterion groups. It is
sometimes called differential index or validity index. A discrimination index separates the
bright students from the poor ones. Thus a good test item separates the bright from the poor
students.
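The U-L Index Method described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a prescribed implementation: the function names split_groups and item_indices and the 15 student scores are hypothetical, the 27% group size is rounded to the nearest whole student, and the indices are expressed as proportions (0 to 1) rather than percentages.

```python
def split_groups(total_scores, fraction=0.27):
    """Steps 1-2: rank papers by total score and return the positions of the
    students in the upper and lower groups (27% each by default)."""
    ranked = sorted(range(len(total_scores)),
                    key=lambda i: total_scores[i], reverse=True)
    size = max(1, round(len(ranked) * fraction))  # rounding rule is an assumption
    return ranked[:size], ranked[-size:]


def item_indices(item_correct, upper, lower):
    """Steps 3-8 for one item. item_correct[i] is True if student i got it right.
    Returns (difficulty index, discrimination index) as proportions."""
    u = sum(item_correct[i] for i in upper) / len(upper)  # proportion right, upper group
    l = sum(item_correct[i] for i in lower) / len(lower)  # proportion right, lower group
    return (u + l) / 2, u - l


# Hypothetical tryout: 15 students with total scores 15, 14, ..., 1.
total_scores = list(range(15, 0, -1))
# Whether each student answered one particular item correctly.
item_correct = ([True, True, True, False]
                + [True, False, True, False, True, False, True]
                + [True, False, False, False])

upper, lower = split_groups(total_scores)
difficulty, discrimination = item_indices(item_correct, upper, lower)
print(difficulty, discrimination)  # 0.5 0.5 (U = 0.75, L = 0.25)
```

Here three of the four upper-group students and one of the four lower-group students answered the item correctly, so U = 0.75 and L = 0.25.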
After item analysis, the following table of equivalents can be used in interpreting the
difficulty indexes:
Likewise, after item analysis, the following table of equivalents can be used in interpreting
the discrimination indexes:
When a teacher prepares test items, he/she aims to have average difficulty. So item analysis
helps in selecting the items that are of average difficulty; thus, the results of an item analysis
tell if the teacher needs to revise items that are too difficult or too easy.
However, care and caution must be taken in using the above table in interpreting the results
of an item analysis. Judgment of the test constructor is still very important. For example,
what will be done with an item having a difficulty index of 0.16 and a discrimination index of
0.11? Using the table above, that particular item should be revised, especially when it
is the only item left to test a very important concept. In that case, the teacher has no other choice but
to revise or improve it.
On the other hand, what will be done with an item having a difficulty index of 0.50 and a
discrimination index of 0.48? Normally that item should be retained because it has very good
indices. But there will also be an instance when that kind of item may be rejected or
discarded. That will happen if there are already enough items to test the particular concept or
skill that is assessed.
Example: The table below shows the result of a tryout of a 10-item test in mathematics done
by sixty (60) students.
Assignment;
a) What do the numbers in column ‘Upper 27%’ and ‘Lower 27%’ mean?
b) How many items were Rejected, Retained or Revised? Give your justification in the
table above
2. Complete the table below (100 students tested)
After analyzing the results of the first tryout, test items are usually revised for improvement.
After revising those items which need revision, another tryout is necessary. The revised form of
the test is administered to a new set of samples. The same conditions as in the first tryout are
followed. After the tryout, another item analysis is done. This is to find out if the test items
revised improved in terms of difficulty and discrimination indexes.
Usually, after two revisions, the test is considered ready to be in its final form. The test is now
good in terms of the difficulty and discrimination indices and, therefore, it is also ready to be
tested for reliability and validity.
a. Establishing Test Reliability
Review Questions
1. State and explain the types of instructional objectives
2. Describe the Blooms Taxonomy for classifying educational objectives
3. Explain the steps involved in test development
4. Prepare a table of specification for a test in a given content area of your choice
5. State the guidelines in test construction
6. Describe the criteria for test improvement
3 CHAPTER THREE: TYPES OF TESTS
Learning Objectives
By the end of this chapter the learner should be able to:
3. Give suggestions for reducing the limitations of the various types of tests
3.1 Introduction
There are basically two broad categories of tests: essay and objective. Essay tests allow students
to express themselves freely in their answers to particular questions. To a large extent, the
emphasis is on students’ overall understanding of the subject in question. In an objective test,
however, students’ responses are restricted to a number of symbols, words, phrases or simple
sentences, one of which is considered to be the best answer out of several possible alternatives.
A Long Essay is commonly used with older and more mature students who have
mastered the language to be used to a certain degree of proficiency. Terms used in a long
essay include discuss, explain, apply, express your view etc.
In a Short Essay test, the student is required to treat the subject as briefly as possible.
Terms generally associated with short essay tests include such words as describe, define,
compare and contrast, classify, illustrate etc.
3.2.2 Demerits (limitations) of essay tests
Very often essay tests suffer from various limitations:
1. They measure no more than the ability of the student to recall the information.
2. They suffer from low content validity because of inadequate sampling of the course content.
Very often items sampled contain only a limited number of questions, many vital areas
being excluded.
3. They are highly subjective;
ii. The test-to-test carryover effect: that is, the fact that essays of average quality
often are rated higher when preceded by poor essays and rated lower when
preceded by very good essays.
iii. The order effect: that is, the fact that the order in which papers are marked affects
the score. Some researchers found that papers read earlier tend to receive higher
ratings than those read nearer the end of the sequence.
(ii) To the measurement of the content and objectives of higher order skills.
3) Questions should be structured in such a way that an overall understanding of students can be
assessed. Structure items that will measure the ability to apply generalization or principles,
identify relationships by starting with such terms as explain why, show relationship,
compare, associate, interpret, give reasons for, analyze how, distinguish etc.
4) Prepare in advance a scoring scheme based on criteria
5) Score one question at a time for all who attempted it.
6) Allow sufficient time for students to answer the questions.
7) Score every objective that is to be measured independently.
8) Mark an essay test when you are physically sound and mentally alert, and in an environment
with the fewest distractions that is conducive to intellectual work.
9) Give instructions for the test that are explicit and well written out.
10) Mark students’ copies using students’ numbers instead of their names and score the answers
question by question instead of script by script. This helps increase the objectivity of
marking.
11) Provide enough questions and question variety. This practice tends to improve both the
validity and reliability of the test.
These sub-categories are shown in the table below;
Objective tests
   Supply item
   Rank order item
   Selection item
      True-false
      Matching
      Multiple choice
      Pictorial
Critics of objective tests mention the following advantages and demerits of objective tests.
4. They are not effective in testing students’ ability to organize their thoughts or to write
coherently
5. They tend to test recall of factual information items and do not provide for self-
expression, creativity and comments on the part of the student.
Questions of this kind will elicit different types of answers among students. Some students may
correctly fill the two blanks in question (1) with any capital city and any east African country.
Likewise, question (2) admits infinitely many correct pairs of numbers, such as (1; 1), (2; 4), (-3;
9) and so on.
Suggestions for designing and improving supply-item tests include the following:
1. The wording must be clear and specific enough to avoid ambiguous and unexpected
responses.
2. There should be only one possible correct answer.
3. Too many blank spaces in the same sentence should be avoided since they tend to
confuse students.
4. Specify in what unit (kg, m, inches, etc.) or value a numerical answer is to be given.
5. Do not make the answer too obvious.
6. Instructions for each sentence should be brief and clearly stated. In addition giving
examples of how information is to be supplied reduces students’ anxiety and saves time.
The direction “fill in the blanks”, is usually sufficient, but the student should be informed
about how detailed the answer should be.
7. The use of lengthy and tortuous statements and highly technical terms should be avoided.
8. Do not use statements that are copied from the text book or workbook, since this
encourages memorization.
c. The chance of guessing the correct answer is very high, and this impinges on the primary
purpose of the test, which is to measure what the students know and not how lucky they are.
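How large the guessing effect is can be worked out from the binomial distribution. The short Python sketch below is an illustration only: the function name prob_at_least, the 10-item test and the pass mark of 7 are hypothetical, and it assumes that a student guesses every item independently with a probability of 0.5 of being right.

```python
from math import comb


def prob_at_least(n_items, pass_mark, p_guess=0.5):
    """Probability of getting at least pass_mark of n_items right by guessing alone
    (binomial distribution, independent guesses)."""
    return sum(comb(n_items, k) * p_guess**k * (1 - p_guess)**(n_items - k)
               for k in range(pass_mark, n_items + 1))


n_items = 10
print("Expected score from guessing:", n_items * 0.5)                  # 5.0
print("P(at least 7 of 10 by guessing):", prob_at_least(n_items, 7))   # 0.171875
```

Under these assumptions a student who knows nothing still expects to get half the items right, and has roughly a one in six chance of scoring 7 or more out of 10.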
Suggestions for designing and improving true/false tests;
1. Use statements that are absolutely true or false in the student’s environment.
2. Do not use items that will provide clues about the right response. These include words such
as never, always, purely, none, all in statements that are likely to be false and sometimes,
may, usually, could, perhaps in statements that are likely to be true.
3. The intended correct answer should be clear only to a knowledgeable student.
4. Avoid using rhetorical statements. E.g. Water boils at 100°C, does it not? true/false
5. Avoid long and tortuous statements and those that are both partly true and partly false.
6. Avoid statements with double negations, e.g. is it not true that a negative number cannot
be a square root of a positive number? true/false
7. Unless carefully worded, synonyms tend to make the choice of answer unnecessarily
difficult.
8. Avoid the use of highly technical terms and overlapping statements. These tend to
distract the students unnecessarily.
3.5.2 Matching – item tests
In Matching-item tests, a choice is to be made from among the same set of alternatives. Students
are required to associate/match two or more related words or phrases. Thus the items might
consist of several terms to be defined, while the alternatives could consist of definitions of such
terms. Usually this test has two columns of items which are to be associated directly. The items
could be simple or complex depending on the level of the students.
Limitations of this type of test include;
a. Often too many items are included in each column, thus requiring too much scrutiny on
the part of the student. The student wastes time and makes more mistakes as he becomes
fatigued. The test does not thus evaluate cognitive ability but endurance.
Suggestions for designing and improving Matching-item tests;
1. Specify the directions as clearly as possible in order to avoid confusion. Directions
should state the basis for matching the items in the two columns.
2. The items in the two columns should be randomly distributed and should give no clues.
3. The entire matching question should appear on a single page. Running the questions in
two pages could be confusing and distracting to students.
4. The wording of items in column A should be shorter than that of items in B. This permits the student to
scan the test quickly.
5. Column A should contain not less than 5 items and not more than 10 items. Longer lists
confuse students.
6. Column A items should be numbered as they will be graded as individual questions and
Column B items should be lettered.
8. Items in both columns should be similar in terms of content, form, grammar and length.
Dissimilar alternatives in column B result in irrelevant clues that can be used to eliminate
items or guess answers by the test-wise student.
9. Negative statements in either column should be avoided.
b) The alternatives or choices, of which one is the correct answer and the others are incorrect
answers known as distractors or decoys.
The stem is usually in the form of a complete problem or a question in which the central issue is
stated. While one of the options is the answer, the purpose of the distractors is to
discriminate between knowledgeable students and less knowledgeable ones.
Some merits of this type of test include;
a) This type of test has the capacity to test not only knowledge and comprehension but also
high level thinking abilities.
b) They can be adapted to a variety of subject matter content and
c) They can be scored easily and objectively.
According to Kithuka (2005) and Zulueta (2006), the following are essential guidelines in the
construction of multiple-choice item tests;
1. Ensure clarity of the task: the statement of the stem must be worded carefully in order to
avoid vagueness and allow direct interpretation.
2. Strive for absolute rather than relative correctness of the answer. The intended response
should admit no difference of opinion from experts. The stem must have a definite
answer.
3. Avoid inserting unnecessary information/preamble in the question.
4. Avoid introducing unintended hints about the correct response by repetition of key
words, synonyms or response length.
5. Avoid writing items in the negative form unless there is absolutely no other way of
testing the concept.
6. Include in the stem any word that might otherwise be repeated in the alternative
responses.
7. Use numbers to label the stem and letters for choices.
8. Avoid using items directly from the text books or past papers since this practice
encourages memorization.
9. Alternatives should be parallel in content, form, length and grammar. Avoid making the
correct answer different from others in form, length or grammar.
10. Correct answers should be in random order. Do not use one letter more than others or
create a pattern.
11. All decoys should be plausible and attractive to students who do not know the answer, yet
should be clearly incorrect.
12. Avoid absolute terms (always, never, none) especially in alternatives. A test-wise person
usually avoids answers that include them.
13. The alternatives ‘all of the above’, ‘none of the above’ should be used sparingly.
14. Arrange alternatives in some logical order such as alphabetically or chronologically.
15. A multiple-choice test must not be too long, or else it becomes an endurance test rather than a
test of ability.
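Guideline 10 above (no letter used more than the others, and no pattern in the correct answers) can be checked mechanically before a test is printed. The Python sketch below is one possible check, not a method taken from the cited authors: the function names key_balance and longest_run and the 10-item answer key are hypothetical.

```python
from collections import Counter


def key_balance(answer_key, options="ABCD"):
    """Count how often each option letter is the correct answer."""
    counts = Counter(answer_key)
    return {opt: counts.get(opt, 0) for opt in options}


def longest_run(answer_key):
    """Length of the longest run of identical consecutive correct answers."""
    best = run = 0
    previous = None
    for letter in answer_key:
        run = run + 1 if letter == previous else 1
        best = max(best, run)
        previous = letter
    return best


# Hypothetical answer key for a 10-item multiple-choice test.
key = "BADCABDCCA"
print(key_balance(key))   # {'A': 3, 'B': 2, 'C': 3, 'D': 2}
print(longest_run(key))   # 2
```

A key in which one letter dominates, or in which the same letter is correct many times in a row, is worth rearranging before the test is administered.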
a) Because students especially at primary level may have difficulty in perceiving depth, they
tend to see two instead of the three dimensions that the picture represents.
b) Students with limited socio-economic background may have less contact with materials
used in the school and hence may be disadvantaged.
c) Complex pictures tend to confuse students, and sketchy diagrams might not contain
enough information to answer a given set of questions.
Suggestions for designing Pictorial – item tests include the following;
1) Pictures used in test must be clear so as to enable the students to recognize the item in
question.
2) Avoid shading as this tends to complicate the diagram beyond recognition.
3) Use pictures that portray the object or event in its simplest form.
4) Teachers with poor drawing ability can obtain good pictures from books and magazines,
which they can either copy or trace. Many biological, chemical and physical processes
are available in form of charts which can be very useful.
3.6 Rank – order test items
In this case students are expected to indicate the appropriate order (serial, chronological, logical
etc.) of the items presented to them. Examples;
1. Arrange the names below in alphabetical order;
(a) Ojo (b) Ade (c) Mulopo (d) Aquab (e) Wahab (f) Josef
2. What is the chronological order of the following countries in getting independence?
(a) Angola (b) Kenya (c) Togo (d) Ghana (e) Namibia (f) Somalia
Ability to order things or events depends to a large extent on exposure to and familiarity with
relevant learning materials and the level of logical development. It also involves both the ability
to identify similarities or differences among objects as well as the ability to discriminate between
relevant and irrelevant attributes of such objects.
2. Gauge item difficulty to ensure that the items are suitable for the students for whom they
are intended. It is essential to know the background of the students. It must not be
assumed that, simply because the items are clear to the teacher or to some students, they are
also clear to the rest of the students.
3. Check the literacy level of each item so that the students who are being evaluated in
physics, for instance, are not penalized for reading problems.
4. In order to save students’ time and to encourage them to complete the test, arrange the items in
order of difficulty, or else give enough time.
5. State the instructions as clearly as possible and avoid the use of unfamiliar terms. The
students must be made aware by reading the instructions what they are supposed to do.
6. Develop weighting for the test that reflects the objectives.
7. Keep records of tests scores as a reference for evaluating students’ progress or difficulties
and as a means for monitoring the effectiveness of teaching methods as well as the test
itself.
8. Avoid irrelevant sources of difficulty. Instead of using compound words like “socio-
politic-economical equilibrium”, simply state “social, political and economic stability”.
9. The duration of the test should not be above students’ attention span.
Review Questions
1) Distinguish between Essay and Objective tests
2) Describe the various types of objective tests giving the merits and demerits of each
3) Discuss the essay tests and outline their advantages and disadvantages
4) Give suggestions for reducing the limitations of the various types of tests
5) Distinguish between the Rank – order type of test items and the supply item tests
4 CHAPTER FOUR: CHARACTERISTICS OF GOOD TESTS
Learning Objectives
By the end of this chapter the learner should be able to:
1. Distinguish between reliability and validity of a test
2. Outline the factors that impinge on test reliability
3. Explain the methods of assessing the reliability of a test
4. Describe the types of test validity
5. Outline the factors that threaten the validity of a test
6. Describe administrability and scorability as characteristics of good tests.
4.1 Introduction
There are a number of factors that affect a student’s performance in any test. Such factors include
variables such as socio-economic background, anxiety, interest, mood, cultural values, teacher
characteristics, nature of learning materials, instructional techniques, time of the day etc.
In addition to these factors, other variables must be considered. The evaluator must seek answers
to the following questions,
i. What is the objective of the test? Will it bring out a student’s understanding,
ability, industry, skill?
ii. Does the test cover the subject adequately? Is the content of the test adequate and
relevant to what has been taught?
iii. What acceptable criteria guide the tester in the identification, selection, and
weightings of items?
iv. Is the test reliable, valid, administrable, scorable, interpretable and economical?
This chapter pays particular attention to the last question and discusses reliability, validity,
administrability and scorability as characteristics of good tests.
test. The test scores of students should be reproducible and dependable. Thus if repeated more
than one or two times under similar situations, a reliable test should yield consistent
results. If a student scores 90% in a math test on Monday and gets 40% on the same or a similar test
on Friday, neither score can be relied upon.
11) Length of the test (e.g. the longer the test the more reliable it is generally and the more
the test’s reliability index approaches 1)
12) Level of difficulty of the test items
13) Socio-cultural variables.
14) Practice and fatigue effects.
b) Administer the test to the learners
c) Keeping all the initial conditions constant, administer the same test to the same learners
say after two to four weeks.
d) Correlate the scores from both tests.
The correlation coefficient (Pearson product moment or Spearman rank correlation coefficient)
obtained is referred to as the “coefficient of reliability”. If the coefficient is high (equal to or higher
than 0.7), then the instrument is said to be reliable.
Disadvantages: - The subjects may be influenced by the first test and hence tend to remember
their responses during the second test. In addition students do change or get bored with
repetition.
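Step d) above, correlating the scores from the two administrations, can be sketched in Python as follows. This is a minimal illustration: the function name pearson_r and the scores of the six students are hypothetical, and the Pearson product moment coefficient is computed directly from its definition; the 0.7 threshold is the one quoted in the text.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product moment correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores of the same six learners on the two administrations.
first_test = [55, 62, 70, 48, 81, 66]
second_test = [58, 60, 73, 50, 79, 69]

r = pearson_r(first_test, second_test)
print("Coefficient of reliability:", round(r, 2))
print("Reliable (r >= 0.7)?", r >= 0.7)
```

The function assumes that both lists are in the same student order and that the scores are not all identical, since the correlation is undefined when either set of scores has no spread.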
Advantage; This method eliminates chance error due to different test conditions as in the first
two methods.
Disadvantage; The reliability computed may not be for the whole test since the method
correlates one half of the items against the other half in the same test.
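The disadvantage noted above is commonly addressed by adjusting the half-test correlation upward to estimate the reliability of the whole test. The Python sketch below illustrates this under stated assumptions: the Spearman-Brown formula r_full = 2 r_half / (1 + r_half) is a standard correction that is not discussed in this text, the split into odd-numbered and even-numbered items is one common choice among several, and the function names and the item marks of the six students are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product moment correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Estimate whole-test reliability from the correlation between two halves."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_marks):
    """item_marks holds one list of 0/1 marks per student, in item order.
    Returns (half-test correlation, estimated whole-test reliability)."""
    odd_half = [sum(row[0::2]) for row in item_marks]    # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in item_marks]   # items 2, 4, 6, ...
    r_half = pearson_r(odd_half, even_half)
    return r_half, spearman_brown(r_half)


# Hypothetical marks of six students on a six-item test (1 = right, 0 = wrong).
marks = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
r_half, r_full = split_half_reliability(marks)
print(round(r_half, 2), round(r_full, 2))
```

For a positive half-test correlation, the adjusted figure is always higher than the raw one, reflecting the fact that the whole test is twice as long as either half.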
a) Face validity
Face validity means that a test appears valid on the face of it. It is established when on the
examination of a test a person concludes that it measures the relevant trait. It is sometimes called
expert validity or validation by consensus. The examiner of the test may be an expert or a novice
in test construction. However every test must have face validity. This is particularly important
for tests used for screening job applications. If such tests lack Face validity, there can be an
outcry against the firm using them. Face validity is important from a public relations point of
view.
b) Construct validity
Construct validity is a measure of the degree to which data obtained from a test meaningfully
and accurately reflect or represent a theoretical concept. For example, would a score of 90
points on a reading test actually reflect the true reading ability of a pupil, or would a score on a
series of mathematics items truthfully reflect the mathematical aptitude of a student? This
approach is often used where no criteria or domain of content is generally accepted as an
adequate measure of a concept. Concepts such as level of management, creativity, self-esteem,
motivation, etc. are all abstract, hypothetical concepts. They cannot be directly observed but their
effects on the behavior of learners can be observed.
c) Content validity
Content validity is a measure of the degree to which data collected using a particular test
represents a specific domain of indicators or content of a particular concept. For example, a test
of arithmetic for standard four pupils would not yield content valid data if items do not include
all four operations – addition, subtraction, multiplication, and division. In designing a test that
will yield content-valid data, the teacher must first specify the domain of indicators which are
relevant to the concept being measured. Theoretically, a content-valid measure should contain
all possible items that should be used in measuring the concept. The usual procedure in
assessing the content validity of a test is to use professionals or experts in the particular field.
d) Criterion-related validity
Criterion-related validity refers to the use of a test in assessing learners’ behavior in specific
situations. If a test purports to measure performance in a job, the subjects who score high on the
test must also perform well in their jobs. Two types of criterion-related validity are recognized:
predictive and concurrent.
Predictive validity refers to the degree to which obtained data predict future behavior of
subjects. An engineering firm, for example, may advertise posts for two graduates in mechanical
engineering. Ten graduates may apply for the post and during the interview, they are given a test.
The test scores are supposed to assess the graduates’ performance on the job once they are
employed. The extent to which such measures determine the performance on the job of the
selected graduates in the future is the predictive validity of the instrument. If the data obtained
using the tool has predictive validity, the graduates’ scores on the test would correlate highly
with a measure of their future performance on the job.
Concurrent validity, on the other hand, refers to the degree to which data are able to
predict the behavior of subjects in the present and not in the future. An example of this may be
found in medical studies, particularly in psychiatry. A psychiatrist might use a measure to
establish whether a patient is schizophrenic. In this case, a patient’s scores on a psychiatric test
would correlate highly with his or her present behavior if the instrument does indeed yield data
that accurately represent this type of mental illness.
4.3.2 Factors threatening test validity
Once again, a test may show no validity, some validity, or perfect validity. There are several
factors that may threaten or diminish the validity of tests and instruments. These factors, as
outlined by Gronlund (1985) include;
1. Unclear test direction
2. Confusing and ambiguous test items
3. Appropriateness of test items
4. Difficult items
5. Objectivity of the test
6. Using vocabulary too difficult for test takers
7. Overly difficult and complex sentence structures
8. Inconsistent and subjective scoring methods
9. Untaught items included on achievement tests
10. Failure to follow (standardized) test administration procedures
11. Cheating, either by participants or someone teaching the correct answers to the specific
test items; or identifiable pattern of answers
12. Improper length of the test (too short/long) and arrangement of items
13. Inappropriate level of difficulty of the test items
14. Poorly constructed test items
15. Test items inappropriate for the outcomes being measured
4.4 Other characteristics of a good test
Other characteristics of a good test include administrability, scorability, interpretability, and
economy
4.4.1 Administrability
Administrability is a characteristic of measuring instrument involving
• The ease with which an examiner may understand and present the instructions for the test
• The ease with which the individuals tested comprehend how they are to proceed
• The efficiency with which the test may be scored (Good, 1973)
A good test is administered with ease, clarity and uniformity. Test procedures are standardized so
as to achieve uniformity of procedures in administering the test. Testing conditions are controlled
in such a way that they are the same for all examinees. This is done so that the scores obtained by
them are comparable.
In order to secure uniformity of testing conditions, the test constructor provides detailed directions
for administering the test. Time limits, oral instructions to the examinees and sample items for
demonstration are specified. Definite provision for preparation, distribution and collection of test
materials and other factors are made.
To ensure administrability of a test, directions should be made simple, clear and concise. Test
items should be introduced by sample items and illustrated by practice exercises. The test format
should not cause examinees difficulty in reading, recording their answers or moving from one page or part of the
test to the next. The size of the page, the length of line, and the size and style of type or illustration should be
such that they facilitate test administration.
4.4.2 Scorability
Scorability is a criterion used in judging tests. It refers to the degree of objectivity possible,
directions provided, time involved and simplicity of procedures (Good, 1973, p.519).
A good test is easy to score. Test results should be easily available to both the student and the
teacher so that proper remedial and follow up measures and curricular adjustment can be made.
However if test results become available only after a considerable time they lose their usefulness
for both learner and teacher. Tests are easy to score when the directions for scoring are simple
and clear.
Review Questions
1. Distinguish between reliability and validity of a test
2. Outline the factors that impinge on test reliability
3. Explain the methods of assessing the reliability of a test
4. Describe the types of test validity
5. Outline the factors that threaten the validity of a test
6. Describe administrability and scorability as characteristics of good tests.
Gronlund, N.E & Linn, R.L (1990). Measurement and evaluation in teaching. (6th Ed). New
York: Macmillan Publishing Company.
Oriondo, L.L & Dallo-Antonio, E.M. (1989). Evaluating educational outcomes: Tests,
measurement and evaluation. Manila, Philippines: REX book Store.
5 CHAPTER FIVE: INTRODUCTION TO STATISTICS
Learning Objectives
By the end of this chapter the learner should be able to:
1. Give the meaning of statistics and state the importance and limitations of statistics
2. Explain the sub-divisions of statistics
3. Identify the scales of measurement and determine which one is applicable for given
variables
4. Give the meaning of Discrete and Non-discrete data/variables citing relevant examples.
5. Identify Primary and Secondary data and describe the methods of data collection
5.1 Introduction
Definitions of statistics
1) Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data
to assist in making more effective decisions.
2) Statistics is also the science of data which involves collecting, classifying, summarizing,
organizing, analyzing and interpreting numerical information.
3) Statistics is the branch of scientific inquiry, which provides for collecting data (sampling and
experimental design), organizing and summarizing data (graphs and tables), and statistical
inference (making generalizations to a larger population based on observations from a
sample).
Statistical approach to a problem may be broadly summarized as follows:
a) Collection of facts – this is the first stage in the statistical treatment of the problem.
Assembling of facts is thus a very important process. Always ensure that data collected is
accurate, reliable and thorough.
b) Organization of facts – at times data can be large and therefore there is need to condense
it through organization, classification, tabulation and presentation of the data in a suitable
form.
d) Interpretation of facts – this is done through judgment and inference drawn from the
sample.
5.2 Importance of Statistics
i. Statistics permits summarization and presentation of large quantities of information
ii. It helps to undertake and understand research in our areas of interest
iii. Used in government to formulate policies and administration. E.g. the National
Population and Housing census.
iv. Help businesses in decision making by making future estimates and expectations
v. It enables us to formulate and test a hypothesis (statistically assess a statement)
vi. Can you think of more?
For example, when two people are in love and taking time to know each other (courtship), they
are collecting data of each other to arrive at a conclusion. We may find our partner to be kind,
honest hearted, etc. This is all data, which comes in different forms and it is collected for various
reasons and purposes.
b) Statistics deals with aggregate of facts and no importance is attached to individual items.
It is always suitable to problems where group characteristics are desired to be studied.
c) Statistical data is only approximately, and not mathematically, correct. Sampling techniques
allow observation of a limited number of items and hence give only an estimate of the desired
results. Thus statistics fails when exactness is essential.
d) Statistics can be used to establish wrong conclusions and should therefore be used only by
experts.
e) Liable to be misused – for example, opinion polls
1. Descriptive statistics and inferential statistics.
Descriptive statistics involves description of data. It deals with methods of organizing,
summarizing, and presenting data (by use of graphs, charts such as bar charts and pie charts) in
an informative way.
Inferential statistics deals with methods used to find out something about a population, based on
results from a sample. This is where generalizations are made about the entire population on the
basis of the sample results. Inferential statistics is emphasized more than descriptive statistics
because it is important in decision making and because it acknowledges the potential for error
involved in making generalizations.
1. Nominal Scale
In the nominal scale of measurement, numbers are used simply as labels for groups or classes. If
our data set consists of red, orange, yellow, green and blue items, we may designate red as 1,
orange as 2, yellow as 3, green as 4 and blue as 5. In this case the numbers 1, 2, 3, 4 and 5 stand
only for the category to which a data point belongs. “Nominal” stands for “name” of category.
The nominal scale of measurement is used for qualitative rather than quantitative data: red,
orange, yellow, green, blue; male, female; professional classification; geographic classification,
and so on.
2. Ordinal Scale
In the ordinal scale of measurement, data elements may be ordered according to their relative
size or quality. For example five products may be ranked by a consumer as 1, 2, 3, 4 and 5;
where 5 is the best and 1 is the worst. In this scale, we do not know how much better one product
is than others, we only know that it is better.
3. Interval Scale
In the interval scale of measurement, we can assign meaning to distances between any two
observations. The data are numbers, and distances between elements can be measured in units.
For example, in 2003 the mean KCSE score for Kamwamu Secondary School was 8.345; in 2004 it
was 8.764. These numbers are on an interval scale since they provide a ranking of the performance
and the arithmetic operations of addition and subtraction are meaningful.
4. Ratio Scale:
The ratio scale is the strongest scale of measurement. Here not only do the distances between
pairs of observations have a meaning, but there is also a meaning to ratios of distances. Salaries
are measured on a ratio scale; a salary of Kshs. 85,000 is twice as large as a salary of Kshs.
42,500. Such a comparison is not possible with temperatures, which are on an interval scale but
not a ratio scale (we cannot say that 30°C is twice as warm as 15°C). The ratio scale contains a
meaningful zero, whereas 0°C is an arbitrary point and does not mean an absence of temperature.
Typical business data, such as revenue, cost, and profit fall into the group of ratio data.
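The difference between the interval and ratio scales can be checked with a little arithmetic. The sketch below is illustrative only: the helper `celsius_to_kelvin` and the use of the Kelvin scale (which has a true zero) are additions that do not appear in the text above; the salary and temperature figures are the ones quoted.

```python
# Sketch only: celsius_to_kelvin is a hypothetical helper, and the Kelvin
# conversion is an addition used to show what a "true zero" makes possible.

def celsius_to_kelvin(celsius):
    """Convert a Celsius reading to Kelvin, a scale with a true zero."""
    return celsius + 273.15

# Ratio scale: zero salary means "no salary", so the ratio is meaningful.
salary_ratio = 85_000 / 42_500

# Interval scale: 0 °C is an arbitrary point, so this ratio is misleading.
naive_temperature_ratio = 30 / 15

# Compared on a scale with a true zero, 30 °C is only about 5% "warmer".
kelvin_ratio = celsius_to_kelvin(30) / celsius_to_kelvin(15)

print(salary_ratio)             # 2.0
print(naive_temperature_ratio)  # 2.0
print(round(kelvin_ratio, 3))   # 1.052
```

The two ratios of 2.0 look alike, but only the salary ratio survives a change of units.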
Self-test exercise
1. What is the scale of measurement for each of the following variables?
a) Student grade point averages
b) Distance students travel to class
c) Students’ scores on the first statistics test
d) A classification of students by district of birth
e) A ranking of students by year of study
2. What is the scale of measurement for these items related to the Newspaper business?
a) The number of papers sold each Sunday during 2004
b) The number of employees in each of the departments, such as; editorial, advertising,
sports, etc.
c) A summary of the number of papers sold by District
d) The number of years each employee has been working for the institution.
If a variable can assume only one characteristic then it is called a Constant.
Qualitative
Quantitative
For example, family size, Number of students in a class, Number of passengers in a bus, Number
of worshippers in a church etc.
A summary of the classification of data is shown in Figure I below;
Data
  - Qualitative
  - Quantitative
      - Discrete
      - Continuous
Secondary data refers to data collected from existing published or unpublished sources such as
official document, government publications, text books, journals etc.
There are 3 classes of secondary data:
- Continuous or regular data - this refers to data from regular publications such as monthly
data on rainfall, monthly data on treasury bills, monthly data on inflation etc.
- Periodical data – this is data collected and published at intervals over a period of time,
e.g. the population census.
- Irregular data – this is data whose publication cannot be predicted on the basis of time,
e.g. publication of journals, theses, books etc.
Comparison between primary and secondary data
Primary data is preferred to secondary data due to the following:
i) Compiling errors might be made while collecting secondary data.
ii) In case of secondary data, one may use data out of context.
Review Questions
1. Give the meaning of statistics and state its importance and limitations.
2. Five ice cream flavors are rank ordered by preference. What is the scale of
measurement?
3. Which of the following variables are discrete and which are continuous?
a). Time taken to complete a project
b). Length of a safari to a game reserve
c). Number of rooms in a house
d). Age of a building
e). Volume of water in a tank
4. Explain the various sub-divisions of statistics
5. Identify the scales of measurement and determine their applicability for given
variables
6. Give the meaning of Discrete and Non-discrete data/variables citing relevant
examples.
7. Identify Primary and Secondary data and describe the methods of data collection
Harper W. M (1991). Statistics (6th Ed.) Harlow England: Longman Group UK Ltd.
Saleemi N. A. (1997). Statistics simplified. Acme Press (Kenya) Ltd. Nairobi, Kenya
6 CHAPTER SIX: DATA COLLECTION, ORGANIZATION AND PRESENTATION
Learning Objectives
By the end of this chapter the learner should be able to:
This method is not convenient when data involves too many values to be listed.
Another method is using a Tally table. For example, in a statistics exam the performance of 40
students was as follows:
53, 52, 51, 56, 58, 60, 52, 54, 52, 63, 52, 56, 56,
66, 70, 52, 52, 70, 68, 52, 56, 52, 51, 59, 62, 63, 76,
52, 54, 56, 52, 50, 51, 56, 52, 60, 68, 72, 52, 54
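A tally of this kind can also be produced programmatically. The following is a minimal Python sketch, assuming the 40 marks listed above; the variable names are illustrative and the tally strokes are printed as plain `/` characters.

```python
from collections import Counter

# The 40 marks from the statistics exam, as listed in the text.
marks = [
    53, 52, 51, 56, 58, 60, 52, 54, 52, 63, 52, 56, 56,
    66, 70, 52, 52, 70, 68, 52, 56, 52, 51, 59, 62, 63, 76,
    52, 54, 56, 52, 50, 51, 56, 52, 60, 68, 72, 52, 54,
]

# Counter maps each distinct mark to the number of times it occurs.
tally = Counter(marks)

# Print one row per mark: the mark, one stroke per student, and the count.
for mark in sorted(tally):
    count = tally[mark]
    print(mark, "/" * count, count)
```

The counts obtained this way (for example, 12 students scoring 52) are the frequencies used when the data is presented in table form.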
Tally table
Exercise
The data below shows marks out of 30 for 50 students in a History Examination’s Continuous
Assessment Test. Construct a tally table for this data
10 12 11 15 16 15 14 20 21 19 18 17 16 14 15 13 15
11 21 19 12 14 15 20 19 18 16 16 17 18 18 16 14
15 14 12 11 17 19 13 12 9 10 17 16 18 19 16 17 9
6.3 Presentation of Data
Having collected the data, the next step is to present it in tables. For example, the marks obtained
by 40 students in a Statistics examination can be presented in table form as follows:
Marks            50  51  52  53  54  56  58  59  60  62  63  66  68  70  72  76
No. of students   1   3  12   1   3   6   1   1   2   1   2   1   2   2   1   1
However, one easier way to understand data is through diagrammatic presentation which
indicates very clearly any trends and patterns in the data. The most common diagrams are pie
and bar charts.
The gap between one bar and another should be uniform. In bar charts, it is the length of the bar
that matters and not the width.
Example;
The table below shows production of coffee in thousands of tonnes for the period 1970 to 1976.
Construct a bar chart;
6.3.2 Multiple Bar Charts
This type of bar chart is used when comparisons are to be made between more than one
characteristic. Bars representing the different characteristics are placed side by side.
Example;
The table below shows foreign exchange earnings in million shillings between Tourism and
Agriculture sectors of the economy for the period of year 2000 and year 2005.
Present this information by use of multiple bar charts.
6.3.3 Composite Bar Charts
This type of chart is used when we want to compare the same characteristics from different
sources. In this case a bar chart for the total is drawn and then broken down into components.
Example;
The table below shows earnings in million shillings for Domestic and Foreign Tourism. Present
this information in a composite bar chart.
6.3.4 Pie Charts
Information in a pie chart is represented in a circular figure with angles representing the various
values.
Example: A student spent Kshs. 1000 within a term in the following manner
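$$\text{soap} = \frac{190}{1000} \times 360^\circ = 68.4^\circ$$
$$\text{fare} = \frac{350}{1000} \times 360^\circ = 126^\circ$$
$$\text{sugar} = \frac{280}{1000} \times 360^\circ = 100.8^\circ$$
$$\text{toothpaste} = \frac{80}{1000} \times 360^\circ = 28.8^\circ$$
$$\text{beverage} = \frac{100}{1000} \times 360^\circ = 36^\circ$$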
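The angle calculation for each sector follows one rule: the item's share of the total multiplied by 360 degrees. A minimal Python sketch of that rule is given below, using the amounts from the example; the dictionary and variable names are illustrative.

```python
# Amounts (in Kshs.) spent on each item, taken from the example above.
spending = {
    "soap": 190,
    "fare": 350,
    "sugar": 280,
    "toothpaste": 80,
    "beverage": 100,
}

total = sum(spending.values())  # Kshs. 1000

# Each sector's angle is its share of the total multiplied by 360 degrees.
angles = {item: amount / total * 360 for item, amount in spending.items()}

for item, angle in angles.items():
    print(item, round(angle, 1))
```

A quick check on any pie chart calculation is that the sector angles add up to 360 degrees.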
The pie chart is as given below;
Grouping Data
One way of organizing data is by means of classification. Classification is the grouping of related
facts into different classes.
Facts in one class differ from those in another class with respect to some characteristic called a
basis of classification.
There are four bases of classification, namely:
i) Geographical
Data is classified on the basis of geographical or locational differences between the various items
e.g. production of maize may be classified according to counties or districts.
ii) Chronological
Refers to data that has been observed over a period of time e.g. a company may classify sales
figures according to years.
iii) Qualitative
Data is classified on the basis of some characteristic that is not measurable on a quantitative
scale or cannot be expressed numerically e.g. sex, marital status, colour of hair etc.
iv) Quantitative
Refers to classification of data according to some characteristics that can be measured or
enumerated e.g. Height, weight, income, sales, age etc.
5-9 12
10 - 14 16
15 - 19 20
20 - 24 14
25 - 29 10
30 - 34 6
35 - 39 2
TOTAL 80
A symbol defining a class such as 25-29 in the table above is called a class interval. The end
numbers i.e. 25 and 29 are referred to as the class limits.
Since the salaries are in thousands then a salary of 24501 belongs to this class 25-29. Also
somebody earning 24600 or 24800 will also belong to the class of 25-29. i.e. 24501- 29500
belong to 25-29.
The dividing point between the classes 10-14 and 15-19 is 14.5. Dividing points such as 14.5,
19.5, 24.5 and 29.5 are called class boundaries, or true class limits, and can be obtained as
follows;
$$\frac{14 + 15}{2} = \frac{29}{2} = 14.5 \qquad \frac{29 + 30}{2} = \frac{59}{2} = 29.5$$
The lower class boundary of class 25-29 is 24.5 while the upper class boundary is 29.5. The class
size or width of any class interval is the difference between the upper and the lower class
boundaries of the class.
For the class 25-29 the lower and the upper class boundaries are;
Lower = 24.5 upper = 29.5
The difference between the upper and lower class boundaries is the class width; that is
$$29.5 - 24.5 = 5$$
The class mark or midpoint is the middle value of the class. For the 25-29 and 10-14 class
intervals, the midpoints are obtained as follows;
$$\frac{25 + 29}{2} = \frac{54}{2} = 27$$
$$\frac{10 + 14}{2} = \frac{24}{2} = 12$$
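The class boundaries, class width and class mark can all be derived from the two class limits. The sketch below is one way to do this in Python; the function name `class_summary` and its `gap` parameter are hypothetical, and the default `gap=1` assumes class limits stated in whole units, as in the salary table above.

```python
def class_summary(lower_limit, upper_limit, gap=1):
    """Return boundaries, width and midpoint for one class interval.

    gap is the distance between the upper limit of one class and the
    lower limit of the next (1 for classes such as 10-14, 15-19).
    """
    half_gap = gap / 2
    lower_boundary = lower_limit - half_gap
    upper_boundary = upper_limit + half_gap
    return {
        "boundaries": (lower_boundary, upper_boundary),
        "width": upper_boundary - lower_boundary,
        "midpoint": (lower_limit + upper_limit) / 2,
    }

print(class_summary(25, 29))
print(class_summary(10, 14))
```

For the class 25-29 this gives boundaries of 24.5 and 29.5, a width of 5 and a midpoint of 27, matching the hand calculation.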
Marks            50  51  52  53  54  56  58  59  60  62  63  66  68  70  72  76
No. of students   1   3  12   1   3   6   1   1   2   1   2   1   2   2   1   1
1. Determine the largest and the smallest numbers in the raw data. In this case 50 and 76,
and compute the range as follows;
Range = largest value - smallest value = 76 -50 =26
2. Divide the range into a convenient number of class intervals having the same size. In
normal practice we use between 5 and 20 classes. With a class size of 5;
$$\text{Number of intervals} = \frac{26}{5} = 5.2 \approx 6$$
3. Determine the number of observations falling into each class interval and form a grouped
frequency distribution using class size of 5
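Marks      Tally                  No. of students
50 - 54    //// //// //// ////    20
55 - 59    //// ///               8
60 - 64    ////                   5
65 - 69    ///                    3
70 - 74    ///                    3
75 - 79    /                      1
(Each group of four strokes stands for a bundle of five tally marks, the fifth being drawn across
the other four.)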
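The three steps above can be sketched in Python as follows. This is an illustrative sketch, assuming the 40 examination marks used earlier in this chapter, a class size of 5 and a first class starting at 50; the variable names are not from the text.

```python
from collections import Counter

marks = [
    53, 52, 51, 56, 58, 60, 52, 54, 52, 63, 52, 56, 56,
    66, 70, 52, 52, 70, 68, 52, 56, 52, 51, 59, 62, 63, 76,
    52, 54, 56, 52, 50, 51, 56, 52, 60, 68, 72, 52, 54,
]

# Step 1: the range is the largest value minus the smallest value.
data_range = max(marks) - min(marks)

# Step 2: choose a class size and the lower limit of the first class.
class_size = 5
first_lower_limit = 50

# Step 3: count how many marks fall in each class. Integer division
# gives the index of the class (0 for 50-54, 1 for 55-59, and so on).
grouped = Counter((mark - first_lower_limit) // class_size for mark in marks)

for index in sorted(grouped):
    lower = first_lower_limit + index * class_size
    upper = lower + class_size - 1
    print(f"{lower} - {upper}: {grouped[index]}")
```

The class frequencies should always add up to the number of observations, here 40.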
Having organized data in a frequency distribution as shown above then the next presentation of
data is by use of diagrams namely histograms and frequency polygons.
6.5 Histograms
A frequency histogram consists of a set of rectangles having
a) Bases on a horizontal axis with centres at the class marks and lengths equal to the class
interval sizes.
b) Areas proportional to class frequencies.
For example, the table below gives a distribution of masses of 64 students in a town college.
Present this by use of a histogram and a frequency polygon.
Mass (kg)   Frequency   Mid-point
50-54       8           52
55-59       20          57
60-64       22          62
65-69       10          67
70-74       4           72
The resulting histogram and frequency polygon are as shown below;
[Figure: histogram with the frequency polygon superimposed; vertical axis: Frequency]
Amended frequency
The table below gives a frequency distribution for wages of 45 employees of KKV Company.
Present the information by use of a histogram and a frequency polygon
Wages (‘000) 15 - 19 20 - 29 30 - 34 35 - 39
frequency 5 18 8 4
Class 20-29 is of size 10 while other classes have size 5. Since the size is bigger we have to
amend the frequency for this class.
$$\text{Amended frequency} = \frac{18 \times 5}{10} = \frac{90}{10} = 9$$
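The adjustment for an unequal class can be written as a one-line rule: scale the frequency by the ratio of the standard class size to the actual class size. The function below is a hypothetical helper that sketches this rule; its name and parameters are illustrative.

```python
def amended_frequency(frequency, class_size, standard_size):
    """Scale a frequency so that the area of the histogram bar, and not
    its height alone, stays proportional to the class frequency."""
    return frequency * standard_size / class_size

# Class 20-29 has size 10 and frequency 18; the other classes have size 5.
print(amended_frequency(18, class_size=10, standard_size=5))  # 9.0
```

A class of the standard size is left unchanged by this rule, so the same function can be applied to every row of the table.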
Wages frequency Mid-points
15 – 19 5 17
20 – 29 9 25
30 – 34 8 32
35 - 39 4 37
The resulting histogram and frequency polygon will thus be as shown below
[Figure: histogram and frequency polygon using the amended frequency; vertical axis: Frequency]
6.6 Cumulative Frequency
The total frequency of all values less than the upper class boundary of a given class interval is
called the cumulative frequency up to and including that class interval.
From the previous example involving 45 employees of KKV Company, cumulative frequency
can be tabulated as shown below;
Frequency 5 20 15 5
Cumulative frequency 5 25 40 45
Mass (kg)   Frequency   Upper class boundary   Cumulative frequency
45-49       0           49.5                   0
50-54       8           54.5                   8
55-59       20          59.5                   28
60-64       22          64.5                   50
65-69       10          69.5                   60
70-74       4           74.5                   64
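Cumulative frequencies are running totals of the class frequencies. A minimal Python sketch using the masses of the 64 students is given below; the variable names are illustrative.

```python
from itertools import accumulate

# Upper class boundaries and frequencies for the classes 50-54 to 70-74.
upper_boundaries = [54.5, 59.5, 64.5, 69.5, 74.5]
frequencies = [8, 20, 22, 10, 4]

# accumulate() yields the running total after each class.
cumulative = list(accumulate(frequencies))

# Each pair is one point on the ogive: (upper boundary, cumulative frequency).
for boundary, total in zip(upper_boundaries, cumulative):
    print(boundary, total)
```

The last cumulative frequency must equal the total number of observations, which is 64 in this case.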
The resultant cumulative frequency polygon or ogive is as shown below;
Review Questions
1. The table below shows the areas in millions of square kilometers of the oceans of the world.
Graph the data using a bar chart
Ocean Pacific Atlantic Indian Antarctic Arctic
Area (million km²) 183.4 106.7 73.8 19.7 12.4
2. The following table shows the numbers of agricultural and non-agricultural workers in ADC
Farm for the years, 2000–2007.
Year                                   2000  2001  2002  2003  2004  2005  2006  2007
Agricultural workers (millions)        3.7   4.9   6.2   6.9   8.6   9.9   10.9  11.6
Non-Agricultural workers (millions)    1.7   2.8   4.3   6.1   8.8   13.4  18.2  25.8
Graph the data using
i) Bar charts
ii) A composite bar chart.
3. The following table shows the birth and death rates per 1000 people in Copa land.
Graph the data using an appropriate type of graph.
Year 2002 2003 2004 2005 2006 2007
Birth rate per 1000 people 25.0 25.0 23.7 21.3 18.9 16.9
Death rate per 1000 people 13.2 13.2 13.0 11.7 11.3 10.9
4. Based on sales during a recent year, the following data represent the market shares (in
percent) held by the leading producers of soft drinks sold in mango Republic.
Soft Drink Producers Market Share (%)
Coca-Cola 39.6
Pepsi-Cola 29.4
7 – Up 6.0
D. Pepper 6.1
Royal Crown 4.5
Crush 1.4
Softa 0.9
All others 12.1
Present this information in a pie chart.
5. The data below shows the ages (in years) of 50 employees of a small company
26 57 41 38 19 20 37 58 33 37 24 29 40 30 23 27 27 25 48 32 28 43 62 27 54 42 23 35 18 31
49 34 46 47 52 36 28 36 19 29 40 44 42 37 21 31 39 34 32 39
Construct a frequency table with classes of size 5 starting from 15 years.
6. The monthly salaries of 87 employees of a supermarket were rounded to the nearest Kenya
pounds. They ranged from a low of Kenya pound 1,041 to a high of Kenya pounds 2,348.
(a) Suppose we want to condense the data into 7 classes. Using the same interval for each class,
determine a suggested class interval.
(b) What class interval would be easier to work with?
(c) What are the class limits for the first class? The next class?
7. The following table shows the diameters in millimeters of a sample of 60 ball bearings
manufactured by a company. Construct a frequency distribution of the diameters using
appropriate class intervals.
7.38 7.28 7.45 7.33 7.35 7.32 7.31 7.39 7.32 7.25
7.29 7.37 7.36 7.30 7.32 7.37 7.36 7.34 7.35 7.26
7.43 7.36 7.42 7.32 7.35 7.31 7.33 7.27 7.44 7.36
7.40 7.35 7.40 7.30 7.27 7.46 7.39 7.36 7.40 7.35
7.36 7.24 7.28 7.39 7.34 7.35 7.41 7.30 7.41 7.29
7.33 7.38 7.34 7.32 7.35 7.34 7.37 7.35 7.42 7.38
(b) Construct
i. A histogram
ii. A frequency polygon
iii. A relative frequency histogram
iv. A relative frequency polygon
v. A cumulative frequency distribution
Saleemi N. A. (1997). Statistics simplified. Acme Press (Kenya) Ltd. Nairobi, Kenya
7 CHAPTER SEVEN: MEASURES OF CENTRAL TENDENCY
Learning Objectives
By the end of this chapter the learner should be able to:
1. Define the three measures of central tendency
2. Compute the mean, median and mode for ungrouped, frequency distributions and
grouped data
3. Determine the weighted and combined mean from given data.
4. Outline the merits and demerits of each of the three measures of central tendency
7.1 Introduction
For the purpose of statistical decision making, it is essential to extract from the data important
facts which summarize essential information. This summarizing of numbers is called Statistics if
the information is from a sample.
Consider the following;
$$\sum_{i=1}^{k} X_i = X_1 + X_2 + X_3 + \dots + X_k$$
A measure of central tendency is a single value within the data that is used to represent all the
values in the population.
There are three common measures of central tendency, namely; Mean, Mode and Median.
$$\text{Mean} = \bar{X} = \frac{X_1 + X_2 + X_3 + \dots + X_N}{N}$$
$$\text{Mean} = \bar{X} = \frac{\sum X}{N}$$
Examples:
Determine the mean of the following set of numbers
a) 40 45 70 65 68
b) 21 25 33 45 12 54 34
Solution;
a) $\bar{X} = \frac{40 + 45 + 70 + 65 + 68}{5} = \frac{288}{5} = 57.6$
b) $\bar{X} = \frac{21 + 25 + 33 + 45 + 12 + 54 + 34}{7} = \frac{224}{7} = 32$
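The same calculation can be sketched in Python. The function name `arithmetic_mean` is illustrative; Python's standard library also offers `statistics.mean`, which gives the same results for these data.

```python
def arithmetic_mean(values):
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)

set_a = [40, 45, 70, 65, 68]
set_b = [21, 25, 33, 45, 12, 54, 34]

print(arithmetic_mean(set_a))  # 57.6
print(arithmetic_mean(set_b))  # 32.0
```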
7.2.2 Mean from Frequency Distribution
In case of a frequency distribution; suppose the observations; x1, x2,... xk occur with frequencies
f1, f2,..., fk respectively, then their mean is given by;
$$\text{Mean} = \bar{x} = \frac{\sum xf}{n} = \frac{\sum xf}{\sum f}$$
Example 1
The masses of 20 students were found as follows:
Mass ( x ) 58 59 60
No. of students ( f ) 6 10 4
x f xf
58 6 348
59 10 590
60 4 240
∑ 20 1178
$$\text{Mean} = \bar{x} = \frac{\sum xf}{n} = \frac{1178}{20} = 58.9$$
Example 2
Find the mean, x for the data in the table below;
x 455 560 490 516 534 552
f 9 10 11 8 20 5
Solution;
x f xf
455 9 4095
560 10 5600
490 11 5390
516 8 4128
534 20 10680
552 5 2760
∑ 63 32653
$$\text{Mean} = \bar{x} = \frac{\sum xf}{n} = \frac{32653}{63} = 518.3016$$
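Both examples follow the same pattern: multiply each value by its frequency, add the products, and divide by the total frequency. A minimal Python sketch of that pattern is given below; the function name `frequency_mean` is hypothetical.

```python
def frequency_mean(values, frequencies):
    """Mean of a frequency distribution: sum of x*f divided by sum of f."""
    total_xf = sum(x * f for x, f in zip(values, frequencies))
    total_f = sum(frequencies)
    return total_xf / total_f

# Example 1: masses of 20 students.
print(frequency_mean([58, 59, 60], [6, 10, 4]))  # 58.9

# Example 2: six values with a total frequency of 63.
example_2 = frequency_mean([455, 560, 490, 516, 534, 552], [9, 10, 11, 8, 20, 5])
print(round(example_2, 4))  # 518.3016
```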
Illustrations;
1. The table below shows the heights of 100 seedlings in millimeters.
Height in mm Mid-point x f xf
0–4 2 40 80
5–9 7 48 336
10 - 14 12 10 120
15 - 19 17 2 34
∑ 100 570
$$\text{Mean} = \bar{x} = \frac{\sum xf}{n} = \frac{570}{100} = 5.7$$
2. The frequency distribution for marks obtained by 45 pupils of Yala Pre-school is as
shown below;
∑ 45 3185
$$\text{Mean} = \bar{X} = \frac{\sum Xf}{N} = \frac{3185}{45} = 70.78$$
3. The temperature outside a factory was monitored at regular interval on 80 occasions. The
frequency distribution is as follows
Solution
Temperature (°C)   Frequency, f   Mid-point, x   xf
30.0 – 30.2        6              30.1           180.6
30.3 – 30.5        12             30.4           364.8
30.6 – 30.8        15             30.7           460.5
30.9 – 31.1        20             31.0           620.0
31.2 – 31.4        13             31.3           406.9
31.5 – 31.7        9              31.6           284.4
31.8 – 32.0        5              31.9           159.5
∑                  80                            2476.7
$$\text{Mean} = \bar{x} = \frac{\sum xf}{n} = \frac{2476.7}{80} = 30.959$$
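For grouped data the mid-point of each class stands in for every observation in that class. The sketch below applies this to the seedling heights and the factory temperatures from the illustrations above; the function name `grouped_mean` is hypothetical.

```python
def grouped_mean(class_limits, frequencies):
    """Estimate the mean of grouped data from class mid-points.

    class_limits is a list of (lower limit, upper limit) pairs.
    """
    midpoints = [(lower + upper) / 2 for lower, upper in class_limits]
    total_xf = sum(x * f for x, f in zip(midpoints, frequencies))
    return total_xf / sum(frequencies)

# Heights of 100 seedlings (mm).
seedlings = grouped_mean([(0, 4), (5, 9), (10, 14), (15, 19)], [40, 48, 10, 2])
print(round(seedlings, 1))  # 5.7

# Temperature outside a factory on 80 occasions (°C).
temperature_classes = [
    (30.0, 30.2), (30.3, 30.5), (30.6, 30.8), (30.9, 31.1),
    (31.2, 31.4), (31.5, 31.7), (31.8, 32.0),
]
temperature = grouped_mean(temperature_classes, [6, 12, 15, 20, 13, 9, 5])
print(round(temperature, 3))  # 30.959
```

Because mid-points are used in place of the actual observations, the result is an estimate of the mean and not its exact value.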
7.2.4 Weighted mean
$$\text{Mean} = \bar{X} = \frac{w_1 x_1 + w_2 x_2 + w_3 x_3 + \dots + w_k x_k}{w_1 + w_2 + w_3 + \dots + w_k}$$
is called the weighted mean.
Illustration
The table below shows the performance of a candidate in an end of semester examination at a
local University. The associated weight of each unit is also given.
Unit     Math 213  Math 214  Math 216  Phy. 201  Phy. 204  Educ. 262  Educ. 266
Marks    72        64        82        75        80        88         90
Weight   1         0.8       0.6       0.5       0.7       0.4        0.4
$$\text{Mean} = \bar{X} = \frac{1(72) + 0.8(64) + 0.6(82) + 0.5(75) + 0.7(80) + 0.4(88) + 0.4(90)}{1 + 0.8 + 0.6 + 0.5 + 0.7 + 0.4 + 0.4} = \frac{337.1}{4.4} = 76.614$$
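The weighted mean calculation can be sketched in Python as follows, using the marks and the weights that appear in the worked calculation above; the function name `weighted_mean` is illustrative.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    weighted_total = sum(w * x for w, x in zip(weights, values))
    return weighted_total / sum(weights)

marks = [72, 64, 82, 75, 80, 88, 90]
weights = [1, 0.8, 0.6, 0.5, 0.7, 0.4, 0.4]

result = weighted_mean(marks, weights)
print(round(result, 3))  # 76.614
```

When all the weights are equal, the weighted mean reduces to the ordinary arithmetic mean.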
7.2.5 Combined mean
Quite often we may wish to compute the mean for a large group from the means of small groups,
which make up the large group. For example in a school having several streams per class, if we
know the mean mark in a particular subject for each stream, we may wish to compute the mean
mark in that subject for all the students in the whole class. The formula for the combined mean is
given as;
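The combined mean of two populations is given by
$$\bar{X}_C = \frac{N_A \bar{X}_A + N_B \bar{X}_B}{N_A + N_B}$$
Where
$N_A$ is the size of population A and $\bar{X}_A$ is its mean
$N_B$ is the size of population B and $\bar{X}_B$ is its mean
For three populations A, B and C;
$$\text{Combined mean} = \bar{X} = \frac{N_A \bar{X}_A + N_B \bar{X}_B + N_C \bar{X}_C}{N_A + N_B + N_C}$$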
Example 2
A class of 250 students was divided into groups A and B. The 2 groups were given the same test.
The mean of group A of 100 students was 15, but the mean of the whole group was 15.6.
What was the mean of group B?
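Solution
Group B has 250 − 100 = 150 students. Using the combined mean formula;
$$15.6 = \frac{100(15) + 150 \bar{X}_B}{250}$$
$$150 \bar{X}_B = 3900 - 1500 = 2400$$
Thus, $\bar{X}_B = 16$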
Example 3
In a school with 4 streams in Form One Class i.e. A, B, C and D each having the following
population A = 45, B = 40, C = 40 and D =50. The average performance (mean score) of each
stream in a term test was A = 82, B =72, C = 76 and D =50.
Calculate the mean performance of the entire form one class.
Solution;
class Mean, x n n. x
A 82 45 3690
B 72 40 2880
C 76 40 3040
D 50 50 2500
175 12110
$$\text{Combined mean}, \bar{X} = \frac{N_A \bar{X}_A + N_B \bar{X}_B + N_C \bar{X}_C + N_D \bar{X}_D}{N_A + N_B + N_C + N_D} = \frac{12110}{175} = 69.2$$
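NOTE: If $N_1$ numbers have mean $\bar{X}_1$, $N_2$ numbers have mean $\bar{X}_2$, and so on up to
$N_K$ numbers with mean $\bar{X}_K$, then the overall mean of all the numbers is;
$$\bar{X} = \frac{N_1 \bar{X}_1 + N_2 \bar{X}_2 + \dots + N_K \bar{X}_K}{N_1 + N_2 + \dots + N_K} = \frac{\sum N \bar{X}}{\sum N}$$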
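The general formula in the note above can be sketched in Python. The example uses the four Form One streams from Example 3; the function name `combined_mean` is hypothetical.

```python
def combined_mean(sizes, means):
    """Overall mean of several groups from their sizes and group means."""
    grand_total = sum(n * m for n, m in zip(sizes, means))
    return grand_total / sum(sizes)

# Streams A, B, C and D: number of students and mean score per stream.
sizes = [45, 40, 40, 50]
means = [82, 72, 76, 50]

print(combined_mean(sizes, means))  # 69.2
```

Note that the combined mean is not the simple average of the four stream means (which would be 70), because the streams differ in size.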
Activity
The mean monthly salary paid to all employees in a company was Kshs 5,000. The monthly
salaries of male and female employees averaged Kshs 5,200 and Kshs 4,200 respectively.
Determine the percentage of males and females employed by the company.
Mrs. Akelo a primary school teacher was hired by Mt. Kenya University to compute the mean
mark of 32 second year students in a Statistics Examination. She found the mean to be 72.24.
However she later realized that she had erred by entering 90 and 88 marks instead of 40 and 38
respectively.
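$$\bar{x} = \frac{\sum x}{n}, \qquad n = 32$$
$$72.24 = \frac{\sum X}{32}, \text{ so the incorrect total is } \sum X = 72.24 \times 32 = 2311.68$$
Correct total = 2311.68 − (90 + 88) + (40 + 38) = 2211.68
$$\text{Thus; correct mean, } \bar{X} = \frac{2211.68}{32} = 69.115$$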
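The correction above works by recovering the total from the reported mean, replacing the wrong entries with the right ones, and dividing again. A minimal Python sketch of this procedure follows; the function name `corrected_mean` and its parameter names are illustrative.

```python
def corrected_mean(reported_mean, n, wrong_values, right_values):
    """Recompute a mean after some observations were entered incorrectly."""
    reported_total = reported_mean * n
    correct_total = reported_total - sum(wrong_values) + sum(right_values)
    return correct_total / n

# 32 students, reported mean 72.24; 90 and 88 were entered instead of 40 and 38.
result = corrected_mean(72.24, 32, wrong_values=[90, 88], right_values=[40, 38])
print(round(result, 3))  # 69.115
```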
Self Exercise
1. Compute the mean hourly wage paid to carpenters who earned the following wages
Kshs. 154.00, Kshs. 201.00, Kshs. 187.50, Kshs. 227.60, Kshs. 306.70, and Kshs. 180.00.
2. The Kaimba Power and Light Company selected 20 residential customers at random. The
following are the amounts in Kshs, the customers were charged for electricity last month
547.50, 482.30, 587.35, 507.25, 252.70
475.80 758.40, 464.60, 601.20 706.85
679.10 682.30, 395.65, 355.40, 561.20
659.90 329.80 628.15 654.85, 673.35
Compute the mean.
3. The Toro Construction Company pays its hourly employees Kshs. 65, Kshs. 75 or Kshs. 85 per
hour. There are 26 hourly employees, 14 are paid at the Kshs. 65 rate, 10 at the Kshs. 75 rate,
and 2 at the Kshs. 85 rate. What is the mean hourly rate paid to the 26 employees?
4. The net monthly incomes of a sample of large importers of motor vehicles were organized
into the following table:
Net income in Kshs. Millions Number of Importers
2–5 1
6-9 4
10 – 13 10
14 – 17 3
18 - 21 2
(a) What is the table called?
(b) Based on the distribution, what is the estimate of the arithmetic mean net income?
It is therefore the middle value of observations when arranged either in ascending or descending
order if the number of observations N is odd.
If N is even, then median is the arithmetic mean of the two middle values
For example the set
a) 4, 5, 7, 7, 9, 10, 11, 12, 15 has median 9
b) 6, 7, 8, 10, 11, 12 has median $\frac{8 + 10}{2} = 9$
Now the median is the value below which half of the values lie and above which the other half of
the values lie.
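The rule for odd and even numbers of observations can be sketched in Python. The function below is an illustrative implementation; the standard library function `statistics.median` follows the same rule.

```python
def median(values):
    """Middle value of the sorted data, or the mean of the two middle
    values when the number of observations is even."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([4, 5, 7, 7, 9, 10, 11, 12, 15]))  # 9
print(median([6, 7, 8, 10, 11, 12]))            # 9.0
```

Sorting inside the function means the data does not need to be arranged in order beforehand.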
For grouped data where the raw data has been organized into a frequency distribution, the median is
obtained from the formula;
$$\text{Median} = L_1 + \left(\frac{\frac{N}{2} - \left(\sum f\right)_m}{f_m}\right) c$$
Where
$N$ is the total frequency
$L_1$ is the lower class boundary of the median class
$c$ is the size of the median class interval
$f_m$ is the frequency of the median class
$\left(\sum f\right)_m$ is the sum of frequencies of all classes lower than the median class
Illustration 1
The table below gives a frequency distribution for salaries of 50 workers at K & K Company.
frequency 4 12 20 9 5
Cumulative frequency 4 16 36 45 50
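$$\text{Median} = L_1 + \left(\frac{\frac{N}{2} - \left(\sum f\right)_m}{f_m}\right) c$$
Median class is 30 - 39
$$\text{Median} = 29.5 + \left(\frac{\frac{50}{2} - 16}{20}\right) \times 10 = 34$$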
Illustration 2
The temperature outside a factory was monitored at regular interval on 80 occasions. The
frequency distribution is as follows;
Temperature, x (°C)   30.0-30.2  30.3-30.5  30.6-30.8  30.9-31.1  31.2-31.4  31.5-31.7  31.8-32.0
Frequency, f          6          12         15         20         13         9          5
Median class: 30.9 – 31.1 (the class containing the 40th observation, since N/2 = 40)
Solution;
Temperature, x (°C)   Frequency, f   Cumulative frequency
30.0 – 30.2           6              6
30.3 – 30.5           12             18
30.6 – 30.8           15             33
30.9 – 31.1           20             53
31.2 – 31.4           13             66
31.5 – 31.7           9              75
31.8 – 32.0           5              80
∑                     80
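$$\text{Median} = L_1 + \left(\frac{\frac{N}{2} - \left(\sum f\right)_m}{f_m}\right) c$$
$$\text{Median} = 30.85 + \left(\frac{\frac{80}{2} - 33}{20}\right) \times 0.3$$
$$= 30.85 + 0.105$$
$$= 30.955$$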
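The grouped median formula can be sketched in Python as below. The function name `grouped_median` is hypothetical. For Illustration 1 the lower class boundaries 9.5, 19.5, 29.5, 39.5 and 49.5 are an assumption inferred from the stated median class 30 - 39 and class size 10; the temperature boundaries come from the class limits in Illustration 2.

```python
def grouped_median(lower_boundaries, frequencies, class_size):
    """Median of grouped data: L1 + ((N/2 - cumulative before) / fm) * c."""
    half_total = sum(frequencies) / 2
    cumulative_before = 0
    for lower_boundary, frequency in zip(lower_boundaries, frequencies):
        # The median class is the first class whose cumulative frequency
        # reaches N/2.
        if cumulative_before + frequency >= half_total:
            fraction = (half_total - cumulative_before) / frequency
            return lower_boundary + fraction * class_size
        cumulative_before += frequency
    raise ValueError("frequencies must not be empty")

# Illustration 1: salaries of 50 workers (boundaries assumed, see above).
salaries = grouped_median([9.5, 19.5, 29.5, 39.5, 49.5], [4, 12, 20, 9, 5], 10)
print(round(salaries, 3))  # 34.0

# Illustration 2: temperature on 80 occasions.
temperature = grouped_median(
    [29.95, 30.25, 30.55, 30.85, 31.15, 31.45, 31.75],
    [6, 12, 15, 20, 13, 9, 5],
    0.3,
)
print(round(temperature, 3))  # 30.955
```

The loop also identifies the median class, so it does not have to be picked out by eye from the cumulative frequency column.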
Self Exercise
1. The following is the percentage change in net income from 1997 to 1998 for a sample of 12
construction companies in Kenya 5, 1, -10, -6, 5, 12, 7, 8, 2, 5, -1, 11. Determine the median
change.
2. A sample of daily production of meat at KMC was organized into the following distribution.
Estimate the median daily production
Daily Production Frequency
80 – 89 5
90 – 99 9
100 - 109 20
110 -119 8
120 - 129 6
130 - 139 2
b) 10, 12, 13.2, 13.2, 14, 16, 16, 17, 19 has two modes 13.2 and 16 and is called
bimodal.
c) 4, 8, 5, 8, 9, 5, 6, 2, 9, 3.2, 7 has three modes 5, 8 and 9 and is called trimodal
d) 3, 5, 6, 7, 8, 9 has no mode
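Finding the mode of ungrouped data amounts to counting how often each value occurs. The sketch below is illustrative; the function name `modes` is hypothetical, and it returns an empty list when no value is repeated, which is the "no mode" case in d).

```python
from collections import Counter

def modes(values):
    """Return all values with the highest frequency, or an empty list
    when every value occurs exactly once (no mode)."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(value for value, count in counts.items() if count == highest)

print(modes([10, 12, 13.2, 13.2, 14, 16, 16, 17, 19]))  # [13.2, 16]
print(modes([3, 5, 6, 7, 8, 9]))                        # []
```

The length of the returned list tells us whether the data is unimodal, bimodal, trimodal and so on.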
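In the case of grouped frequency data the mode is obtained using the formula
$$\text{Mode} = L_1 + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) c$$
Where;
$L_1$ is the lower class boundary of the modal class and $c$ is the size of the modal class interval
$\Delta_1$ is the excess (difference) of the modal class frequency over the frequency of the next
lower class.
$\Delta_2$ is the excess of the modal class frequency over the frequency of the next higher class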
Example
The temperature outside a factory was monitored at regular interval on 80 occasions. The
frequency distribution is as follows
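The modal class is 30.9 – 31.1 with frequency 20, so $\Delta_1 = 20 - 15 = 5$ and
$\Delta_2 = 20 - 13 = 7$.
$$\text{Mode} = L_1 + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) c$$
$$= 30.85 + \frac{5}{12} \times 0.3 = 30.85 + 0.125 = 30.975$$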
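The grouped mode formula can be sketched in Python as below, using the temperature distribution (frequencies 6, 12, 15, 20, 13, 9, 5 with class size 0.3). The function name `grouped_mode` is hypothetical, and treating a missing neighbouring class as having frequency 0 is an assumption of this sketch.

```python
def grouped_mode(lower_boundaries, frequencies, class_size):
    """Mode of grouped data: L1 + (d1 / (d1 + d2)) * c."""
    # The modal class is the class with the highest frequency.
    modal_index = max(range(len(frequencies)), key=lambda i: frequencies[i])
    modal_frequency = frequencies[modal_index]

    # Neighbouring frequencies; a missing neighbour is treated as 0.
    previous = frequencies[modal_index - 1] if modal_index > 0 else 0
    following = frequencies[modal_index + 1] if modal_index < len(frequencies) - 1 else 0

    delta_1 = modal_frequency - previous
    delta_2 = modal_frequency - following
    return lower_boundaries[modal_index] + delta_1 / (delta_1 + delta_2) * class_size

temperature_mode = grouped_mode(
    [29.95, 30.25, 30.55, 30.85, 31.15, 31.45, 31.75],
    [6, 12, 15, 20, 13, 9, 5],
    0.3,
)
print(round(temperature_mode, 3))  # 30.975
```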
Self exercise
1. The net monthly salary of lecturers in 15 selected universities in a certain country is as shown
below in Kshs.
35,000, 49,100, 60,000, 60,000, 40,000, 58,000, 60,000, 60,000, 40,000, 65,000, 50,000, 60,000,
71,400, 60,000, 55,000.
What is the modal monthly salary?
2. The number of work stoppages in the automobile industry for selected months is 6, 0, 10, 14, 8
and 0. What is the modal number of work stoppages?
3. Listed below are the total automobile sales in (thousands) in Kenya for the last 14 years. What
are the modes during this period; 9.0, 8.5, 9.1, 10.3, 11.0, 11.5, 10.3, 10.5, 9.8, 9.3, 8.2, 8.2, and
8.5?
4. An automatic machine that fills containers appears to be performing erratically. A check of
weights of the contents of a number of cans revealed
Weight (grams) Number of cans
130 - < 140 2
140 - < 150 5
150 - < 160 20
160 - < 170 15
170 - < 180 9
180 - < 190 7
190 - < 200 3
200 - < 210 2
Obtain an estimate of the modal weight.
Demerits
1. It is affected by extreme observations
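E.g. the salaries of 3 employees are as follows; 45,000, 60,000 and 940,000. The mean salary is
(45,000 + 60,000 + 940,000) / 3 = 348,333.33, which is far above what two of the three employees
earn.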
Merits of median
1. It is easy to identify and calculate.
2. It is not affected by extreme values
3. It is suitable for distribution with open ended classes
4. Can be used for qualitative characteristics
Demerits
1. Sometimes the median is not exact e.g. when there is an even number of observations
2. It is not based on each and every item in the distribution
3. It is affected by fluctuations in sampling
Merits of mode
1. Not affected by extreme observations unlike the arithmetic mean
2. Can be conveniently obtained in case of open ended class
3. It is easy to understand and calculate
Demerits
1. It is not always unique depending on the distribution e. g bimodal or trimodal etc.
2. It is not based on all the observations
3. As compared to median it is affected by fluctuations in sampling
Clearly then, the choice of the average in any given case must be determined by the nature of the
data and the purpose to be served by the average. Remember that a single average value is
designed to replace the detail, yet at the same time to provide an outline of that detail. The
selection of the average will thus depend on which measure fulfills this requirement most
adequately. Since the three averages (mean, mode, median) comprise rather different concepts,
the data may be such as to warrant the use of all the three. In practice, the mean is a firm favorite
in so far as it is so readily computed and understood; generally speaking it should be used instead
of the others. But either the median or even the mode will be preferable if the generalization
concerning midpoints in the calculation of the mean is unjustified, or the mean is seriously
affected by extreme values.
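The effect of an extreme value on the mean can be checked numerically. The following is a minimal Python sketch using the three salaries quoted in the demerits of the mean; the variable names are illustrative only.

```python
import statistics

# Salaries from the example above; 940,000 is the extreme observation.
salaries = [45_000, 60_000, 940_000]

mean_salary = statistics.mean(salaries)      # pulled upward by the extreme value
median_salary = statistics.median(salaries)  # middle value, not affected by it

print(f"mean   = {mean_salary:,.2f}")
print(f"median = {median_salary:,.2f}")
```

Here the median (60,000) describes a typical employee far better than the mean, which is why the median may be preferred when extreme values are present.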
Review Questions
1. The reaction time of an individual to a certain stimulus was measured by a psychologist to be
0.53, 0.46, 0.50, 0.49, 0.53, 0.44 and 0.55 seconds respectively.
Obtain the mean reaction time of the individual to the stimulus.
2. The mean of 80 observations is found to be 38.80. At the time of computation of the mean, two
observations are wrongly taken as 17 and 69 instead of 71 and 96 respectively. Compute the
correct mean
3. The mean annual salary paid to all employees in a company was 1500 Kenya pounds. The
mean annual salaries paid to male and female employees were 1560 Kenya pounds and 1260
Kenya pounds respectively. Determine the percentages of males and females employed by the
company.
4. Find the mean x̄ for the data in the table below;
x 462 480 498 516 534 552 570 588 606 624
f 98 75 56 42 30 21 15 11 6 2
5. Find the mean, median and mode for the set of numbers
(a) 7, 4, 10, 9, 15, 12, 7, 9, 7
(b) 8, 11, 4, 3, 2, 5, 4, 10, 6, 1, 10, 8, 12, 6, 5, 7.
6. The table below shows the distribution of the maximum loads in kilonewtons (kN) supported by
certain cables produced by a company.
Maximum load KN Number of cables
93 – 97 2
98 – 102 5
103 – 107 12
108 – 112 17
113 – 117 14
118 – 122 6
123 – 127 3
128 - 132 1
Determine
(a) The mean maximum load (b) The median maximum load (c) The mode maximum load
Saleemi N. A (1997). Statistics simplified. Acme Press (Kenya) Ltd. Nairobi, Kenya
8 CHAPTER EIGHT: MEASURES OF DISPERSION
Learning Objectives
By the end of this chapter the learner should be able to:
4. Determine the range, standard deviation, variance and Coefficient of Variation from
given data.
The various measures of central tendency, i.e. mean, mode and median, each give us one single value that
represents the entire data. But an average alone cannot adequately describe a set of observations
unless all the observations are alike.
1. Absolute measures of dispersion
An absolute measure states the actual amount by which the value of an item deviates from the measure of central
tendency. It is expressed in the same units as the frequency distribution.
2. Relative measures of dispersion
d) Should be affected as little as possible by fluctuations of sampling
e) Should not be unduly affected by extreme items
f) Should be amenable for further mathematical calculations
For the purpose of this course, only Range, Standard deviation, Variance and Co-efficient of
variation will be covered.
Range = Highest value − Lowest value
The range is the simplest and also the least reliable measure of dispersion. This is due to the fact
that it is based only on two extreme values. It thus does not take into account all the observations
and does not say anything about the distribution of values in relation to the central tendency. In
addition, it cannot be computed with open-ended classes.
Nevertheless, range is simple to calculate and understand.
Example
Find the range of the following data set: 7, 8, 9, 8, 9, 11, 15, 14, 16, 6, 15, 14.
We find that the lowest value is 6 and the highest value is 16. Thus the range is
16 – 6 = 10
The major characteristics of the range are:
i) Only two values are used in its calculation.
ii) It is influenced by extreme values
iii) It is easy to compute and to understand.
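For ungrouped data, variance is given by the following formulae;
$$s^2 = \frac{\sum (X_i - \bar{X})^2}{n} \quad \text{or} \quad s^2 = \frac{\sum X^2}{n} - (\bar{X})^2$$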
In case of grouped data or frequency distribution the following formulae are used to determine
variance.
$$s^2 = \frac{\sum f (X_i - \bar{X})^2}{n} \quad \text{or} \quad s^2 = \frac{\sum fX^2}{n} - (\bar{X})^2$$
For ungrouped data, standard deviation is given by the following formulae;
$$s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n}} \quad \text{or} \quad s = \sqrt{\frac{\sum X^2}{n} - (\bar{X})^2}$$
In case of grouped data or frequency distribution the following formulae are used to determine
standard deviation.
$$s = \sqrt{\frac{\sum f (X_i - \bar{X})^2}{n}} \quad \text{or} \quad s = \sqrt{\frac{\sum fX^2}{n} - (\bar{X})^2}$$
Example 1
Calculate the variance and standard deviation of; 30, 32, 38, 26, 40
Solution
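n = 5
$$\bar{X} = \frac{30 + 32 + 38 + 26 + 40}{5} = 33.2$$

X        X − X̄     (X − X̄)²    X²
30       −3.2      10.24       900
32       −1.2      1.44        1024
38       4.8       23.04       1444
26       −7.2      51.84       676
40       6.8       46.24       1600
Total              132.80      5644

$$\text{Variance} = s^2 = \frac{\sum (X_i - \bar{X})^2}{n}$$
Thus;
$$s^2 = \frac{132.8}{5} = 26.56$$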
Using the second equation
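$$\text{Variance} = s^2 = \frac{\sum X^2}{n} - (\bar{X})^2$$
$$s^2 = \frac{5644}{5} - (33.2)^2 = 1128.8 - 1102.24 = 26.56$$
The standard deviation is therefore $s = \sqrt{26.56} = 5.154$.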
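The two variance formulae for ungrouped data can be checked against each other. The sketch below is a minimal Python illustration using the five values of Example 1; the variable names are illustrative and the divisor is n, as in the formulae above.

```python
import math

# Data from Example 1
data = [30, 32, 38, 26, 40]
n = len(data)
mean = sum(data) / n

# Definition formula: average squared deviation from the mean
var_definition = sum((x - mean) ** 2 for x in data) / n

# Computational formula: mean of the squares minus the square of the mean
var_shortcut = sum(x * x for x in data) / n - mean ** 2

std_dev = math.sqrt(var_definition)

print(f"mean = {mean:.1f}")
print(f"variance (definition) = {var_definition:.2f}")
print(f"variance (shortcut)   = {var_shortcut:.2f}")
print(f"standard deviation    = {std_dev:.3f}")
```

Both formulae give the same variance; the second is usually quicker by hand because it avoids computing each deviation.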
For grouped data X is the class mark or the mid-point of the class.
Example 2
Marks x 50 - 54 55 - 59 60 - 64 65 - 69
Frequency f 4 10 15 8
Solution;
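Marks      Mid-point, x    f     fx      x²      fx²
50 - 54    52              4     208     2704    10816
55 - 59    57              10    570     3249    32490
60 - 64    62              15    930     3844    57660
65 - 69    67              8     536     4489    35912
Total                      37    2244            136878

$$\text{Mean} = \bar{X} = \frac{\sum fx}{n} = \frac{2244}{37} = 60.65$$
$$\text{Variance} = s^2 = \frac{\sum fX^2}{n} - (\bar{X})^2$$
$$s^2 = \frac{136878}{37} - (60.65)^2$$
$$s^2 = 3699.41 - 3678.26 = 21.15$$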
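The grouped-data calculation can be sketched in Python as follows. This is an illustration only: each class is represented by its mid-point, as stated above, and the tuple layout `(lower limit, upper limit, frequency)` is an assumption made for the example.

```python
# Classes from Example 2 as (lower limit, upper limit, frequency)
classes = [(50, 54, 4), (55, 59, 10), (60, 64, 15), (65, 69, 8)]

midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]
freqs = [f for _, _, f in classes]

n = sum(freqs)                                           # total frequency
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))    # sum of f * x
sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))  # sum of f * x^2

mean = sum_fx / n
variance = sum_fx2 / n - mean ** 2

print(f"n = {n}, sum fx = {sum_fx:.0f}, sum fx^2 = {sum_fx2:.0f}")
print(f"mean = {mean:.2f}, variance = {variance:.2f}")
```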
Example 3
For the data below, compute the range and standard deviation
Observations ( x ) 0 1 2 3 4 5 6 7 8
Frequency, f 1 9 26 59 72 52 29 7 1
Cumulative Frequency 1 10 36 95 167 219 248 255 256
Solution
a) Range;
Extreme values are 0 and 8 Range = 8 – 0 = 8
b) Standard deviation
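Table columns: x, f, f_i x_i, x_i − x̄, f_i |x_i − x̄|, f_i (x_i − x̄)²
$$\text{Where } \bar{x} = \frac{\sum xf}{n} = \frac{1017}{256} = 3.973$$
$$\text{Standard deviation, } s = \sqrt{\frac{\sum f (X_i - \bar{X})^2}{n}} = \sqrt{\frac{506.856}{256}} = 1.40709 \approx 1.407$$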
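As a cross-check of Example 3, the range and standard deviation can be computed directly from the frequency table. The following Python sketch uses the unrounded mean, so intermediate totals may differ slightly in the last decimal places from a hand calculation that rounds the mean to 3.973.

```python
import math

# Observations and frequencies from Example 3
values = [0, 1, 2, 3, 4, 5, 6, 7, 8]
freqs = [1, 9, 26, 59, 72, 52, 29, 7, 1]

n = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, values)) / n

# Range: difference between the two extreme observed values
data_range = max(values) - min(values)

# Standard deviation with divisor n, as in the grouped-data formula
sum_sq_dev = sum(f * (x - mean) ** 2 for f, x in zip(freqs, values))
std_dev = math.sqrt(sum_sq_dev / n)

print(f"n = {n}, mean = {mean:.3f}")
print(f"range = {data_range}, standard deviation = {std_dev:.3f}")
```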
Example 4
We have two sets of data with same mean as shown below. Let us calculate the standard
deviation of the two sets and see how measures of dispersion help us to check for dispersion or
spread-ness of data.
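a). The set 26, 27, 28, 29, 30 has mean = 28 and n = 5. Using the sample standard deviation formula,
which divides by n − 1 rather than by n as in the formulae given earlier,
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
$$\sum (x_i - \bar{x})^2 = (26 - 28)^2 + (27 - 28)^2 + (28 - 28)^2 + (29 - 28)^2 + (30 - 28)^2 = 4 + 1 + 0 + 1 + 4 = 10$$
Thus,
$$s = \sqrt{\frac{10}{4}} = \sqrt{2.5} = 1.581$$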
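b). The set 5, 19, 23, 33, 60 also has mean = 28, with n = 5.
We calculate the value of $\sum (x_i - \bar{x})^2$, where $\bar{x} = 28$, as
$$(5 - 28)^2 + (19 - 28)^2 + (23 - 28)^2 + (33 - 28)^2 + (60 - 28)^2 = 529 + 81 + 25 + 25 + 1024 = 1684$$
Thus,
$$s = \sqrt{\frac{1684}{4}} = \sqrt{421} = 20.518$$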
Even though the two sets of data have the same mean value, the 1st has a much smaller standard
deviation than the 2nd set because the values in the 2nd set are more spread out. The data in the
second set lie much further from the measure of central tendency than the data in the first set.
The standard deviation measures how data is spread or dispersed from the mean value. In
calculating standard deviation, the mean value is subtracted from each value of x and the
difference is squared, that is, $(x_i - \bar{x})^2$.
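The comparison in Example 4 can be reproduced with Python's standard `statistics` module. In this sketch `statistics.stdev` uses the n − 1 divisor, matching the formula used in Example 4, while `statistics.pstdev` uses the divisor n; the variable names are illustrative.

```python
import statistics

# The two data sets from Example 4; both have mean 28
set_a = [26, 27, 28, 29, 30]
set_b = [5, 19, 23, 33, 60]

mean_a = statistics.mean(set_a)
mean_b = statistics.mean(set_b)

# Sample standard deviation (divisor n - 1)
sd_a = statistics.stdev(set_a)
sd_b = statistics.stdev(set_b)

# Population standard deviation (divisor n), shown for comparison
pop_sd_a = statistics.pstdev(set_a)

print(f"set a: mean = {mean_a}, s = {sd_a:.3f}, divisor-n s = {pop_sd_a:.3f}")
print(f"set b: mean = {mean_b}, s = {sd_b:.3f}")
```

The two divisors give different answers for small samples, so it is worth stating which one is being used when reporting a standard deviation.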
8.7 Relative Dispersion (Coefficient of Variation)
o The mean is a measure of central tendency or “centered-ness”
o Standard deviation is a measure of dispersion or “spread-ness”
Suppose we want to know how spread out the data is relative to the mean. Then a quantity,
Coefficient of Variation (CV) is used. It is defined as;
$$C.V = \frac{\text{Standard deviation}}{\text{Mean}} \times 100\%$$
o A large C.V means that data is relatively spread out from the mean while a small C.V means
that data is relatively concentrated closely around the mean
o If C.V = 0, all the data values are the same and are exactly at the mean.
o The C.V indicates the relative magnitude of the standard deviation as compared with the mean
of the distribution measurements
Example; you are supplied with the following data about the performance of two schools
(Bahati and Tumaini) in the 2012 KCSE examination.
Bahati Tumaini
Variance of Wages 81 36
Solution
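Bahati: standard deviation = $\sqrt{81} = 9$ and mean = 9.82, thus
$$C.V = \frac{9}{9.82} = 0.9165 \text{ (i.e. } 91.65\%)$$
Tumaini: standard deviation = $\sqrt{36} = 6$ and mean = 8.40, thus
$$C.V = \frac{6}{8.40} = 0.7143 \text{ (i.e. } 71.43\%)$$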
There is thus a greater variability in performance in Bahati than in Tumaini. In other words the
marks obtained at Bahati are less homogenous or less consistent.
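The coefficient of variation comparison can be sketched in Python as below. The function name `coefficient_of_variation` is hypothetical; the variances (81 and 36) and means (9.82 and 8.40) are the figures used in the solution above.

```python
import math

def coefficient_of_variation(variance, mean):
    """Return the C.V as a percentage: (standard deviation / mean) * 100."""
    return math.sqrt(variance) / mean * 100

cv_bahati = coefficient_of_variation(variance=81, mean=9.82)
cv_tumaini = coefficient_of_variation(variance=36, mean=8.40)

print(f"Bahati  C.V = {cv_bahati:.2f}%")
print(f"Tumaini C.V = {cv_tumaini:.2f}%")
print("More variable school:", "Bahati" if cv_bahati > cv_tumaini else "Tumaini")
```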
8.8 Importance of variance and standard deviation
Self-Exercise
1. Compute the variance and the standard deviation for the data
(a) 11, 6, 10, 6 and 7
(b) 28, 32, 24, 46, 44, 40, 54, 38, 32 and 42.
2. A sample of eight companies in the aerospace industry had the following returns on
investment last year; 10.6, 12.6, 14.8, 18.2, 14.8, 12.2 and 15.6. Compute the mean return and
the standard deviation of the returns.
3. Trout, Inc. feeds fingerling trout in special ponds and markets them when they attain a certain
weight. Samples of 10 trout were isolated in a pond and fed a special food mixture, designated
RT – 10. At the end of the experimental period, the weights of the trout were (in grams):
124, 125, 125, 123, 120, 124, 127, 125, 126 and 121
Determine the sample variance and standard deviation
Review Questions
1. Give definitions of the following terms used in statistics
a). Range; Standard deviation;
2. For 108 randomly selected Nairobi residents, the weight frequency distribution is
40 – 48 6
49 – 57 22
58 – 66 43
67 – 75 28
76 – 84 9
x f
4. The mean of 5 observations is 4.4 and the variance is 8.24. If three of the five observations are
1, 2 and 6, find the other two.
5. (a) The arithmetic mean and variance of a set of ten figures are known to be 17 and 33
respectively. Of the ten figures, one figure (i.e. 26) was subsequently found to be inaccurate, and
was weeded out. What is the resulting
i. Arithmetic mean?
ii. Standard deviation?
6. A distribution consists of three components with frequencies 200, 250 and 300 having means
25, 10, and 15 and standard deviations 3, 4, and 5 respectively.
Calculate
(i) The mean
(ii) The standard deviation.
7. The distributions of lifetimes of two models of refrigerators in a recent survey are given
below:
Life time Number of Refrigerators
No. of years Model A Model B
0-<2 5 2
2-<4 16 7
4-<6 13 12
6-<8 7 19
8 - < 10 5 9
10 - < 12 4 1
(a) What is the average lifetime of each model of these refrigerators?
(b) Which model has greater uniformity?
Saleemi N. A. (1997). Statistics simplified. Acme Press (Kenya) Ltd. Nairobi, Kenya
Yamane T. (1967). Statistics: An introductory analysis (2nd Edition). Harper and Row
Publishers, New York
9 CHAPTER NINE: THE CONCEPT OF THE NORMAL CURVE
Learning Objectives
By the end of this chapter the learner should be able to:
2. Define skewness and distinguish between Skewed to the right or positively skewed and
Skewed to the left or negatively skewed
9.1 Introduction
The measures of central tendency and dispersion discussed in Chapters Seven and Eight
respectively do not reveal the entire story about a frequency distribution. Further description of a
frequency distribution is necessary and is provided by measures of skewness and measures of
kurtosis. These measures tell us what a frequency distribution would look like if presented
diagrammatically by a smoothed frequency polygon.
[Figure: symmetrical distribution curve with the mean, mode and median at the same point]
When a distribution is symmetrical about the mean, the mean and the median coincide, i.e. they
are equal. Where a single mode exists, it also coincides with the mean. In this case
any of the three measures of central tendency (mean, mode or median) is as good as the others, and
the observations are arranged equally and symmetrically around the measure of central
tendency. Furthermore, in symmetrical distributions the sum of the positive and negative
deviations from the mean, median or mode is zero. The shape of such a curve is typically
bell-shaped.
Another example of a symmetrical distribution is a bimodal distribution whose two peaks mirror each other about the centre.
9.2.2 Skewed to the left
On the other hand, when the longer tail of a distribution extends to the left, that is a few
observations are extremely small, the distribution is said to be negatively skewed (skewed to the
left). When a distribution is skewed to the left, it contains a large number of relatively high
scores and a few extremely low scores. Generally the mean is less than the median, which in turn
is less than the mode.
Thus for a skewed distribution, the mean is different from the median, which in turn is different
from the mode.
In terms of skewness, a frequency curve can be
• Symmetrical – where the mean, median and mode have the same value
• Skewed to the right or positively skewed, that is, Mode < median < mean
• Skewed to the left or negatively skewed, that is, Mode > median > mean
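The ordering of the three averages under skewness can be illustrated numerically. The data in the Python sketch below are hypothetical, chosen only so that a few large values form a long right tail; the left-skewed set is its mirror image.

```python
import statistics

# Hypothetical right-skewed data: a few large values form a long right tail
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]
# Mirror image of the same data: a long left tail
left_skewed = [16 - x for x in right_skewed]

def averages(data):
    """Return (mode, median, mean) for a data set with a single mode."""
    return statistics.mode(data), statistics.median(data), statistics.mean(data)

mode_r, median_r, mean_r = averages(right_skewed)
mode_l, median_l, mean_l = averages(left_skewed)

print(f"right-skewed: mode={mode_r}, median={median_r}, mean={mean_r}")
print(f"left-skewed:  mode={mode_l}, median={median_l}, mean={mean_l}")
```

In the first set the mode is smallest and the mean largest; in the mirrored set the order is reversed, as the list above states.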
Skewness is different from variation in that;
(i) Variation tells us about the amount of spread while skewness tells us about the direction
of spread.
(ii) In business and economic series, measures of variation have greater practical application
than measures of skewness.
(iii) The dispersion indicates how far the mean is representative of values. Skewness helps in
judging if the distribution is normal.
9.3 The concept of Kurtosis
Kurtosis indicates the “peakedness” of a distribution.
In describing a frequency distribution, we use an average to show the typical value or central
tendency in the distribution, a measure of variation to show the spread of values and a measure
of skewness to show the direction of the spread of values. The measure of kurtosis is the fourth
device in describing a frequency distribution and can be used to show the degree of
concentration of the values. When the values are more concentrated around the mode, we have a
peaked curve and when the values are widely spread from the mode in both directions we have a
flat-topped curve. Kurtosis refers to the convexity of a curve. It enables us to have an idea about
the flatness or the peakedness of the curve. The degree of kurtosis of a distribution is measured
relative to the peakedness of a normal curve.
[Figure: leptokurtic (peaked), mesokurtic (normal) and platykurtic (flat-topped) curves]
Review Questions
1. What do you understand by the terms skewness and kurtosis? Point out their roles in
analyzing a frequency distribution.
3. Define skewness and distinguish between Skewed to the right or positively skewed
and Skewed to the left or negatively skewed
Harper W. M (1991). Statistics (6th Ed.) Harlow England: Longman Group UK Ltd.
Saleemi N. A. (1997). Statistics simplified. Acme Press (Kenya) Ltd. Nairobi, Kenya
10 CHAPTER TEN: CORRELATION AND REGRESSION ANALYSIS
Learning Objectives
4. Compute the Spearman Rank Correlation Coefficient and give an interpretation of the
values obtained;
In each of the examples given, changes in one variable are accompanied by changes in the other
variable;
Correlation can be zero, negative or positive
Correlation between two variables is simply the extent to which their values vary together
systematically. The primary objective of investigating the correlation between two variables is to
determine whether there is any causal connection between them. Furthermore, correlation
techniques are used in predicting the values of one variable from the values of another variable.
When only two variables are involved, we speak of simple correlation and when more than two
variables are involved, we have multiple correlation.
A useful method of investigating whether there is any correlation between two variables is to
draw a scatter diagram. The values of one variable, say Y (dependent variable), are measured
along the y-axis and plotted against corresponding values of another variable, say X
(independent variable), along the x-axis.
The following scatter diagrams arise;
10.1.1 Pearson’s product moment correlation coefficient
The degree of association between two variables (correlation) can be described by a visual
representation or by a number (termed a coefficient) indicating the strength of association. The
quantitative computation of the correlation was first derived in 1896 by Karl Pearson and is
referred to as “Pearson’s product moment correlation coefficient.” It is denoted by letter r.
Correlation coefficient varies between -1 and +1. That is -1 ≤ r ≤ +1
When r = 1, then there is a perfect positive linear relationship between the variables.
When r = 0, then there is no linear relationship between the variables.
When r = -1, then there is a perfect negative linear relationship between the variables.
When r is between -1 and 0, it indicates negative relationship between the variables. In this case,
when r is closer to -1, there is a strong negative relationship. Similarly, when r is closer to 0 but
negative, it implies a weak negative relationship.
When r is between 0 and +1, it indicates positive relationship between the variables. In this case,
when r is closer to +1, there is a strong positive relationship. Similarly, when r is closer to 0 but
positive, it implies a weak positive relationship.
Correlation Coefficient
The correlation coefficient computed from the sample data measures the strength and direction
of a linear relationship between two variables. The symbol for the sample correlation coefficient
is r.
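Correlation coefficient r is given by;
$$r = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\left\{ \left( \sum X_i^2 - n\bar{X}^2 \right) \left( \sum Y_i^2 - n\bar{Y}^2 \right) \right\}^{1/2}}$$
Or
$$r = \frac{n \sum xy - \sum x \sum y}{\left[ \left\{ n \sum x^2 - (\sum x)^2 \right\} \left\{ n \sum y^2 - (\sum y)^2 \right\} \right]^{1/2}}$$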
Where:
n represents the number of pairs of data
(∑X)² indicates that the X scores should be summed and the total squared. [Avoid
confusing ∑X² (the sum of the squared X scores) with (∑X)² (the square of the sum
of the X scores)]
∑Y denotes the sum of all Y scores
∑Y² indicates that each Y score should be squared and then those squares summed
(∑Y)² indicates that the Y scores should be summed and the total squared
∑XY indicates that each X score should first be multiplied by its corresponding Y score
and the products (XY) summed
Example 1
Compute the Pearson’s product moment correlation coefficient (r) for the height-weight data of
students shown in the table below;
Height, cm (X)   174  175  176  177  178  182  183  186  189  193
Weight, kg (Y)   61   65   67   68   72   74   80   87   92   95
Solution;
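Height, cm (X)   Weight, kg (Y)   XY       X²       Y²
174              61               10614    30276    3721
175              65               11375    30625    4225
176              67               11792    30976    4489
177              68               12036    31329    4624
178              72               12816    31684    5184
182              74               13468    33124    5476
183              80               14640    33489    6400
186              87               16182    34596    7569
189              92               17388    35721    8464
193              95               18335    37249    9025
Total 1813       761              138646   329069   59177

$$r = \frac{n \sum xy - \sum x \sum y}{\left[ \left\{ n \sum x^2 - (\sum x)^2 \right\} \left\{ n \sum y^2 - (\sum y)^2 \right\} \right]^{1/2}}$$
$$r = \frac{(10 \times 138646) - (1813 \times 761)}{\sqrt{\{10 \times 329069 - 1813^2\}\{10 \times 59177 - 761^2\}}}$$
$$r = \frac{6767}{6860.5342}$$
$$r = 0.9864$$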
There is a high positive linear relationship between the variables. This implies that there is a
strong positive relationship between the heights and weights of the students.
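The calculation in Example 1 can be reproduced with a short Python sketch of the computational formula. The function name `pearson_r` is illustrative; the data are the heights and weights from the table above.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r using the computational (raw sums) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Height (cm) and weight (kg) data from Example 1
heights = [174, 175, 176, 177, 178, 182, 183, 186, 189, 193]
weights = [61, 65, 67, 68, 72, 74, 80, 87, 92, 95]

r_hw = pearson_r(heights, weights)
print(f"r = {r_hw:.4f}")
```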
Example 2
The table below shows marks obtained by ten students selected randomly in two mathematics
tests done in a certain school term. Use the information to calculate the Pearson’s product
moment correlation coefficient, r and comment;
Student A B C D E F G H I J
Test 1 86 45 70 66 80 55 50 88 50 90
Test 2 32 76 40 40 30 50 92 28 78 25
Solution
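$$r = \frac{n \sum xy - \sum x \sum y}{\left[ \left\{ n \sum x^2 - (\sum x)^2 \right\} \left\{ n \sum y^2 - (\sum y)^2 \right\} \right]^{1/2}}$$
$$r = \frac{(10 \times 29976) - (680 \times 491)}{\sqrt{\{10 \times 48946 - 680^2\}\{10 \times 29357 - 491^2\}}}$$
$$r = \frac{-34120}{\sqrt{27060 \times 52489}}$$
$$r = \frac{-34120}{37687.562}$$
$$r = -0.9053$$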
There is a high negative linear relationship between the variables. This implies that there is a
strong negative relationship in performance of Test 1 and Test 2 among the ten students.
A high (or low) negative correlation is interpreted in the same way as a high (or low) positive
correlation as far as strength is concerned; only the direction differs. A negative correlation
indicates that high scores in one variable are associated with low scores in the other variable
and vice versa.
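The same coefficient can be obtained from the deviation form of the formula, in which each score is measured from its own mean. The Python sketch below applies it to the Test 1 and Test 2 marks of Example 2; the function name `pearson_r_deviation` is illustrative.

```python
import math

def pearson_r_deviation(xs, ys):
    """Pearson's r from deviations about the two means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Marks of the ten students in Example 2
test_1 = [86, 45, 70, 66, 80, 55, 50, 88, 50, 90]
test_2 = [32, 76, 40, 40, 30, 50, 92, 28, 78, 25]

r_tests = pearson_r_deviation(test_1, test_2)
print(f"r = {r_tests:.4f}")
```

The negative sign confirms that students who scored highly in Test 1 tended to score poorly in Test 2.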
10.1.2 Coefficient of Determination
Example
The following data refers to the amount of money spent by 10 customers who visited a
supermarket in a certain year and their social class index.
Calculate
i). Correlation coefficient
ii). Coefficient of determination
Solution
X Y XY X2 Y2
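i). We use the Correlation coefficient formula;
$$r = \frac{n \sum xy - \sum x \sum y}{\left[ \left\{ n \sum x^2 - (\sum x)^2 \right\} \left\{ n \sum y^2 - (\sum y)^2 \right\} \right]^{1/2}}$$
$$= \frac{16888}{\{17484 \times 17316\}^{1/2}}$$
$$= \frac{16888}{17399.797}$$
$$= 0.9706$$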
There is a high relationship between amount of money spent in supermarket and the social class
index (a positive relationship).
ii). Coefficient of determination, r² = (0.9706)² = 0.9420
This means that 94.2% of the variation of the social class (dependent variable) can be explained
by the variation of the amount of money spent in the supermarket every year (independent
variable), and 5.8% is determined by other factors
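The step from r to r² can be sketched in Python using the intermediate totals quoted in the solution (16888, 17484 and 17316); the individual customer data are not reproduced here, and the variable names are illustrative.

```python
import math

# Intermediate totals from the worked solution above
numerator = 16888          # n * sum(xy) - sum(x) * sum(y)
x_term = 17484             # n * sum(x^2) - (sum x)^2
y_term = 17316             # n * sum(y^2) - (sum y)^2

r = numerator / math.sqrt(x_term * y_term)
r_squared = r ** 2
unexplained = 1 - r_squared

print(f"r = {r:.4f}")
print(f"r^2 = {r_squared:.4f} ({r_squared:.1%} of the variation explained)")
print(f"unexplained = {unexplained:.1%}")
```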
10.1.3 Spearman Rank Correlation coefficient
Correlation coefficient is calculated from the actual values of the variables X and Y in the
sample data. However, in some cases, the relative orders of magnitude of these pairs of values are
more instructive than the values themselves. It is then more useful to assess the relationship
between the ranks of the two variables. In this case we use rank correlation.
The Rank correlation is also known as “Spearman Rank Correlation Coefficient”, applicable
when variables are ranked. It is given by;
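$$R = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} \quad \text{and} \quad d = u - v$$
where d is the difference between the two ranks (u and v) of each pair and n is the number of pairs.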
Stats 1 80 60 65 50 35 30 90
Stats 2 80 50 60 55 45 30 95
$$R = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$
$$\sum d^2 = 2; \quad n = 7; \quad R = 1 - \frac{6 \times 2}{7(49 - 1)} = 0.9643$$
There is a high degree of relationship between performances in the two subjects. The marks
obtained by the students in the two tests agree to a large extent or have a high positive
relationship.
Example 2
In a Drama festival involving five Primary schools, two adjudicators Jane and John awarded the
following marks.
Jane 84 80 72 70 78
John 88 76 78 74 82
$$R = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}, \quad n = 5$$
$$R = 1 - \frac{6 \times 6}{5 \times (5^2 - 1)} = 1 - \frac{36}{120} = 0.7$$
The high positive value of R implies that the two adjudicators agreed to a large extent.
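Both rank correlation examples can be checked with the following Python sketch. It assumes that the highest mark receives rank 1 and that there are no tied marks (true for both examples); the helper names `ranks` and `spearman_r` are illustrative.

```python
def ranks(scores):
    """Rank scores from highest (rank 1) to lowest; assumes no ties."""
    ordered = sorted(scores, reverse=True)
    return [ordered.index(s) + 1 for s in scores]

def spearman_r(u_scores, v_scores):
    """Spearman rank correlation: R = 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(u_scores)
    sum_d2 = sum((u - v) ** 2 for u, v in zip(ranks(u_scores), ranks(v_scores)))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Example 1: marks in the two statistics tests
stats_1 = [80, 60, 65, 50, 35, 30, 90]
stats_2 = [80, 50, 60, 55, 45, 30, 95]

# Example 2: marks awarded by the two adjudicators
jane = [84, 80, 72, 70, 78]
john = [88, 76, 78, 74, 82]

r_stats = spearman_r(stats_1, stats_2)
r_judges = spearman_r(jane, john)
print(f"Example 1: R = {r_stats:.4f}")
print(f"Example 2: R = {r_judges:.1f}")
```

Tied marks would need average ranks, which this sketch does not handle.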
10.2 Regression Analysis
Regression analysis attempts to discover the nature of the relationship between the variables, and
does this in the form of an equation. The equation can be used to predict one variable given that
sufficient information about the other variable is available.
• The variable whose values are to be predicted is called the response variable or
dependent variable and in the scatter diagram it is conventional to plot this variable on
the Y-axis.
• The variable on which the predictions are based is called the explanatory variable or
independent variable and in the scatter diagram it is conventional to plot this variable on
the X-axis.
The purpose of the regression line is to enable the researcher to see the trend and make
predictions on the basis of the data.
Regression analysis is a statistical procedure that can be used to develop a mathematical equation
showing how variables are related.
Y = a + bX
Where
a is the Y-intercept, i.e. the value of Y when X = 0
b is the slope of the regression line and indicates the amount of change in the dependent
variable for a unit change in the independent variable
The equation of the regression line is written as Y = a + bX . There are several methods for
finding the regression line but we consider one method.
10.2.2 Formulas for the Regression line
$$Y = a + bX$$
$$b = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2}$$
$$a = \bar{Y} - b\bar{X} \quad \text{or} \quad a = \frac{\sum Y}{n} - b\,\frac{\sum X}{n}$$
Example
i). Find the equation of the regression line for the data below which is obtained in the study of
age and blood pressure
$$b = \frac{(6)(47{,}634) - (345)(819)}{6(20{,}399) - (345)^2} = 0.96438$$
$$a = \frac{819}{6} - 0.96438 \left( \frac{345}{6} \right) = 81.048$$
Hence, the equation of the regression line is: Y = 81.048 + 0.964X
The regression equation can be used to estimate the pressure given the age
For example;
ii). Find the blood pressure for a person who is aged 50 years.
Y = 81.048 + 0.964(50) = 129.25
A person who is 50 years of age is predicted to have a blood pressure of around 129.
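The slope, intercept and prediction can be reproduced in Python from the totals used in the solution (n = 6, ∑x = 345, ∑y = 819, ∑xy = 47,634, ∑x² = 20,399). The sketch below works from these totals only; the variable names are illustrative.

```python
# Totals from the age (x) and blood pressure (y) example
n = 6
sum_x, sum_y = 345, 819
sum_xy, sum_x2 = 47_634, 20_399

# Slope and intercept of the regression line Y = a + bX
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * sum_x / n

# Predicted blood pressure for a person aged 50
predicted_50 = a + b * 50

print(f"b = {b:.5f}, a = {a:.3f}")
print(f"Y = {a:.3f} + {b:.3f}X")
print(f"predicted pressure at age 50 = {predicted_50:.1f}")
```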
Another method used to determine the regression equation is the method of least squares,
considered in the next part.
10.2.3 Method of Least square
This method fits the line of best fit. Our estimates of the true values of a and b leave an error
term, or residual, because the line cannot pass exactly through every point (it is only the best fit).
The fitted line should pass through the points of the scatter diagram in such a manner that the
sum of the squares of the vertical deviations of these points from the line is a minimum.
Since some deviations are negative and others positive, we eliminate the signs by squaring each
deviation, then use the two normal equations to work out the values of a and b.
We have the normal equations
$$\sum y = na + b \sum x$$
$$\sum xy = a \sum x + b \sum x^2$$
Example
Apply the method of least squares to fit a straight line relationship (Regression of Y on X) for the
following points
Solution
Table columns: x, y, x², xy, y²
Substituting the column totals from the table into the normal equations
$$\sum y = na + b \sum x$$
$$\sum xy = a \sum x + b \sum x^2$$
and re-arranging, we obtain
5a + 2.2b = 13.4
2.2a + 20.34b = 61.31
Solving these two simultaneous equations, we have
b = 2.861 and a = 1.421
The best straight line for the given values is
Y = 1.42 + 2.86X
This is also called the equation of the regression line of Y on X.
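The two normal equations can be solved by elimination or, equivalently, by Cramer's rule. The Python sketch below uses the totals that appear in the simultaneous equations above (n = 5, ∑x = 2.2, ∑y = 13.4, ∑x² = 20.34, ∑xy = 61.31); the variable names are illustrative.

```python
# Totals read from the two normal equations in the example
n = 5
sum_x, sum_y = 2.2, 13.4
sum_x2, sum_xy = 20.34, 61.31

# Normal equations:
#   n * a     + sum_x  * b = sum_y
#   sum_x * a + sum_x2 * b = sum_xy
# Solve the 2 x 2 system with Cramer's rule.
determinant = n * sum_x2 - sum_x ** 2
a = (sum_y * sum_x2 - sum_x * sum_xy) / determinant
b = (n * sum_xy - sum_x * sum_y) / determinant

print(f"a = {a:.3f}, b = {b:.3f}")
print(f"Y = {a:.2f} + {b:.2f}X")
```

The expression for b here is the same as the slope formula given earlier for the regression line, so both approaches give the same fitted line.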
o Regression analysis studies both linear and non-linear relationship between variables while
correlation analysis studies only the linear relationship between variables.
o The cause and effect relation is clearly indicated through regression analysis but in correlation
we cannot say that one variable is the cause and the other the effect.
Review Questions
1. What is meant by the term, the variables have a negative relationship?
2. Why is correlation important?
3. Define the term correlation coefficient
4. Given the data of age and amount of money spent on buying music CD’s in dollars ($).
Age x 18 26 39 48 53 58
Amount of Money ($) y 16 12 9 5 6 2
Find the Correlation coefficient and comment.
5. In a science congress competition two judges, Okelo and Mwangi awarded the following
Marks
School Menengai Crater Tumaini Lenana Moi Umoja Makini Nakuru
Okelo 84 80 72 70 78 82 76 74
Mwangi 88 76 78 74 82 72 80 84
Calculate the Spearman rank correlation coefficient
6. What is the general form of the regression line used in statistics?
7. Given the data of age and amount of money spent on buying music CD’s in dollars ($).
Age x 18 26 39 48 53 58
Amount of Money ($) y 16 12 9 5 6 2
Find the
a) Equation of the regression line
b) Plot of the regression line in (a) above.
c) Amount of money a 30 year old might spend in buying CD’s by using the equation in
(a) above.
d) Regression line using the method of least squares. Do the results tally with those of
(a) above?
Harper W. M (1991). Statistics (6th Ed.) Harlow England: Longman Group UK Ltd.
Saleemi N. A. (1997). Statistics simplified. Acme Press (Kenya) Ltd. Nairobi, Kenya
Appendix 1: Sample Test Papers
QUESTION ONE
a) State the meaning of evaluation and give two functions of evaluation (3mks)
b) i) Citing appropriate examples, explain the following qualities of a good test
• Reliability (2mks)
• Validity (2mks)
ii) Outline Four Factors that threaten the validity of a test (4mks)
c) The frequency distribution for marks obtained by 45 pupils of Sunshine Pre-school is as
shown below;
Marks 60-64 65-69 70-74 75-79 80-84
Frequency 6 15 12 8 4
QUESTION TWO
a) The table below shows the distribution of wages in Kshs. ‘000 of 50 employees of
Uchumi Supermarket
Wages 10-14 15-19 20-24 25-29 30-34 35-39
Kshs. ‘000
No. of 6 16 12 8 6 2
employees
Calculate; i) Mode (4mks)
b) The table below shows production of Coffee and Tea for the period 1972-1977 in
thousands tonnes
Year 1972 1973 1974 1975 1976 1977
Coffee (‘000 tonnes) 120 150 180 220 210 160
Tea (‘000 tonnes) 110 130 160 190 200 140
QUESTION THREE
QUESTION FOUR
a) State the meaning of Educational measurement and give two functions of measurement
(4mks)
b) Identify four guidelines for test construction (4mks)
c) Distinguish between Essay and Objective tests (4mks)
d) Give the advantages and disadvantages of essay tests (8mks)
QUESTION FIVE
i) Variance (4mks)
ii) Standard deviation (4mks)
d) Represent the following diagrammatically showing the relationship between the
Mean, Mode and Median
i) Normal curve (2mks)
ii) Curve skewed to the Right (2mks)
e) Shimo la Tewa High School has five streams in Form 4 (E, W, N, S, C). In a trial
exam the following information was obtained
Stream E W N S C
Mean score 9.211 9.643 8.997 9.000 9.840
No. of candidates 62 55 50 60 42
DEPARTMENT OF EARLY CHILDHOOD STUDIES
End of Semester Examinations
BECC 425: INTRODUCTION TO STATISTICS, MEASUREMENTS TESTS AND
EVALUATION
Time: 2 Hours
Instructions to Candidates: Answer question 1 (Compulsory) and any other TWO questions.
QUESTION ONE
e) i) Using appropriate illustrations, explain the meaning of the term validity of a test (2 marks)
QUESTION TWO
b. Outline three factors that could impinge on the reliability of a test (3 marks )
c. Discuss any three methods of assessing the reliability of a test giving steps involved in
each (9 marks)
d. A mathematics teacher administered a test on a certain day. He then re-tested the students
with the same test after two weeks to determine its reliability. The marks obtained by ten
students in the two tests were as follows :
Student A B C D E F G H I J
Test I 7 8 9 9 10 12 14 12 11 12
Test II 16 17 19 21 20 24 22 23 22 20
Determine the Pearson’s Product Moment Correlation Coefficient and comment on the reliability
of the test (6 marks)
QUESTION THREE
c. The municipal Education Officer in Nakuru investigated the number of days teachers
were absent from school in the year 2010. The staff returns of 100 teachers selected at
random was taken and the results compiled below:
No. of days absent 5- 9 10-14 15-19 20-24 25-29 30-34 35-39 40-44
No. of Teachers 4 10 17 20 22 16 8 3
Calculate;
i) Variance (4mks)
ii) Standard deviation (2mks)
d. Represent the following diagrammatically showing the relationship between mean, mode
and median
i. Normal Curve
ii. Curve skewed to the left
QUESTION FOUR
a) Giving an appropriate example of each, explain the following terms
i) Mode (2mks)
ii) Median (2mks)
b) Outline TWO advantages and TWO disadvantages of the mean as a measure of central
tendency (4mks)
c) The temperature outside a factory was monitored at regular intervals on 80 occasions.
The frequency distribution is as follows:
Temperature 30.0-30.2 30.3-30.5 30.6-30.8 30.9-31.1 31.2-31.4 31.5-31.7 31.8-32.0
Frequency 6 12 15 20 13 9 5
QUESTION FIVE
a) Briefly discuss each of the six levels of cognitive learning as outlined in Bloom's
taxonomy (12 marks)
b) The two judges John and Mike in a county drama competition involving eight primary
schools awarded the following marks to the participating schools.
School Afraha Lanet Flamingo Langalanga Menengai Uhuru Crater
John 58 55 59 46 59 56 50
Mike 55 54 58 53 57 57 57
Calculate the Spearman rank correlation coefficient. Comment on your answer. (5 marks)
c) Utafiti High School has four Streams in Form 1 (E, W, N, and S). In a trial exam the
following information was obtained
Stream E W N S
Mean score 72 78 80 56
No. of candidates 58 55 50 62