
CHAPTER ONE

INTRODUCTION TO STATISTICS
1.1. INTRODUCTION
Statistics is concerned with scientific methods for collecting, organizing, summarizing,
presenting and analyzing data, as well as with drawing valid conclusions and making reasonable
decisions on the basis of such analysis. In a narrower sense, the term statistics is used to denote
the data themselves or numbers derived from the data, such as averages. Thus we speak of
employment statistics, accident statistics, etc.
Decision makers make better decisions when they use all available information in an effective
and meaningful way. The primary role of statistics is to provide decision makers with methods
for obtaining and analyzing information to help make these decisions. Statistics is used to answer
long-range planning questions, such as when and where to locate facilities to handle future sales.
1.1.1. DEFINITIONS OF STATISTICS
Statistics has been defined by various authors differently. Some of the definitions are extremely
narrow. This is understandable, since statistics has developed over the past several decades and in
the earlier days the role of statistics was confined to a limited sphere. Some of these
definitions are given below.
W.I. King defined statistics as the method of judging collective, natural or social phenomena
from the results obtained by analysis, enumeration or collection of estimates.
Croxton and Cowden: Statistics, or statistical methods, may be defined as the collection,
presentation, analysis and interpretation of numerical data.
Lovett: Statistics is a science that deals with the collection, classification and tabulation of numerical
facts as the basis for the explanation, description and comparison of phenomena.
A definition, which seems to be more comprehensive, is given by Secrist. He defined statistics
as: “aggregate of facts, affected to a marked extent by multiplicity of causes, numerically
expressed, enumerated or estimated according to reasonable standards of accuracy, collected in a
systematic manner for a pre-determined purpose, and placed in relation to each other”
It may be emphasized that this definition highlights a few major characteristics of statistics.
These are given below.
1. Statistics are aggregates of facts. This means a single figure is not statistics. For example,
national income of a country for a single year is not statistics but the same for two or
more years is.
2. Statistics are affected by a number of factors. For example, the sale of a product depends on a
number of factors such as its price, quality, competition, the income of consumers, and so
on.
3. Statistics must be reasonably accurate; wrong figures, if analyzed, will lead to erroneous
conclusions. Hence, it is necessary that conclusions be based on accurate figures.
4. Statistics must be collected in a systematic manner. If data are collected in a haphazard
manner, they will not be reliable and will lead to misleading conclusions.

5. Finally, statistics should be placed in relation to each other. If one collects data unrelated
to each other, such data will be confusing and will not lead to any logical
conclusions. Data should be comparable over time and space.
1.1.2. IMPORTANCE OF STATISTICS IN BUSINESS
There is an increasing realization of the importance of statistics in various quarters. This is
reflected in the increasing use of statistics in government, industry, business, agriculture,
mining, transport, education, medicine and so on. Since we are concerned here with the use of statistics
in business and industry, the description given below is confined to these areas only. There are
three major functions in any business enterprise in which statistical methods are useful.
1. The planning function: This may relate either to special projects or to the recurring
activities of the firm over a specified period.
2. The setting up of standards: This may relate to the size of employment, volume of sales,
fixation of quality norms for manufactured products, norms for daily output, and so
forth.
3. The function of control: This involves comparison of the actual production achieved against
the norm or target set earlier. In case production has fallen short of the target, remedial
measures are suggested so that such a deficiency does not occur again.

1.1.3. TYPES OF STATISTICS


Statisticians commonly classify the subject into two broad categories: descriptive
statistics and inferential statistics.
Descriptive statistics: As the name suggests, descriptive statistics includes any treatment
designed to describe or summarize the given data, bringing out their important features. Descriptive
statistics does not go beyond this; that is, no attempt is made to infer anything that
pertains to more than the data themselves.
Example: If someone compiles the necessary data and reports that during the fiscal year 2000-
2001, there were 1500 private limited companies in Ethiopia of which 1215 earned profits and
the remaining 285 companies sustained losses, his study belongs to the domain of descriptive
statistics.
Inferential statistics: It is a method used to generalize from a sample to a population.
Example: The average income of all families (the population) in the US can be estimated from
figures obtained from a few hundred (the sample) families.
Statistical population is the collection of all possible observations of specified characteristics of
interest.
An example is all of the students in Wachemo University. Note that a sample is a subset of the
population.

1.2. Descriptive Statistics
1.2.1. Statistical Data

Statistical data are the basic input needed to make an effective decision in a particular situation.
The main reasons for collecting data are:

i) To provide necessary inputs to a given phenomenon or situation under study.


ii) To measure performance in an ongoing process such as production, service, and so
on.
iii) To enhance the quality of decision-making by enumerating alternative courses of
action in a decision-making process, and selecting an appropriate one.
iv) To satisfy the desire to understand an unknown phenomenon.
v) To assist in guessing the causes and probable effects of certain characteristics in
given situations.
Types of Data

Statistical data are the outcome of a continuous process of measuring, counting, and/or
observing. These may pertain to several aspects of a phenomenon (or a problem) which are
measurable, quantifiable, countable or classifiable. The researcher collects and analyzes data
about the characteristics of the given population. The characteristics which one intends to
investigate and analyze are termed variables. Variables are of two kinds: those which are
expressed in numerical terms and those which are not. While sex, religion, and language are
a few examples of non-numerical variables, age, weight, height, and distance are examples of
numerical variables. The numerical variables are classified into two categories:

i) Discrete variables – these can take only certain fixed, integer numerical values. For
example, the number of cards or the number of employees in an organization are
examples of discrete variables.
ii) Continuous variables – these can take any numerical value. Measurements of height,
weight, and length in centimeters/inches or grams/kilograms are a few examples of
continuous variables.
Remark: Discrete data are numerical measurements that arise from a process of counting, while
continuous data are numerical measurements that arise from a process of measuring.

Sources of Data

The choice of a data collection method from a particular source depends on the facilities
available, the extent of accuracy required in analysis, the expertise of the investigator, the time
span of the study, and the amount of money and other resources required for data collection.

Data sources are classified as primary sources and secondary sources.

i) Primary sources
Individuals, focus groups, and/or panels of respondents specifically decided upon and set up by
the investigator for data collection are examples of primary data sources. Any one, or a
combination, of the following methods can be chosen to collect primary data:

Observation: In observation studies, the investigator does not ask questions to seek
clarifications on certain issues. Instead, he/she records the behavior of the event of interest as it
occurs. Sometimes mechanical devices such as cameras and tape recorders are also used
to record the desired data.

Interviewing: Interviews can be conducted either face-to-face or over the telephone. Such
interviews provide an opportunity to establish a rapport with the interviewee and help extract
valuable information. Direct interviews are expensive and time-consuming if a big sample of
respondents is to be personally interviewed. Interviewers' biases also come in the way. Such
interviews should be conducted at the exploratory stages of research to handle concepts and
situational factors.

Questionnaire: It is a formalized set of questions for extracting information from the target
respondents. The form of the questions should correspond to the form of the required
information. The three general forms of questions are: dichotomous (yes/no response type);
multiple choice, and open-ended. A questionnaire can be administered personally or mailed to
the respondents. It is an efficient method of collecting primary data when the investigator knows
what exactly is required and how to measure such variables of interest.

ii) Secondary Data Sources


Secondary data refer to those data which have been collected earlier for some purposes other
than the analysis currently being undertaken. There are external and internal sources of
secondary data.

External secondary data sources: the external secondary data sources include government
publications, non-government publications, various syndicate services such as Operations
Research Group (ORG) and international organizations.

Internal secondary data sources: Data generated within an organization in the process of
routine business activities are referred to as internal secondary data. Financial accounts,
production and quality control records, and sales records are examples of such data.

1.2.2. Organization and summarization of data


The best way to examine a large set of numerical data is first to organize and present it in an
appropriate tabular and graphical format. As the number of observations gets large, it becomes
more and more difficult to focus on the specific features in a set of data. Thus we need to
organize the observations so that we can better understand the information that the data are
revealing.

The raw data can be organized in a data array and frequency distribution. Such an arrangement
enables us to see quickly some of the characteristics of the data we have collected. When a raw
data set is arranged in rank order, from the smallest to the largest observation or vice-versa, the
ordered sequence obtained is called an ordered array.

Frequency distribution

A frequency distribution divides the observations in the data set into conveniently established,
numerically ordered classes (groups or categories). The number of observations in each class is
referred to as its frequency, denoted by f.

Advantages: The following are a few advantages of grouping and summarizing raw data in this
compact form:

i) The data are expressed in a more compact form. One can get a deeper insight into
the salient characteristics of the data at the very first glance.
ii) One can quickly note the pattern of distribution of observations falling in various
classes.
iii) It permits the use of more complex statistical techniques which help reveal certain
other obscure and hidden characteristics of the data.
Disadvantages: A frequency distribution suffers from some disadvantages as stated below:

i) In the process of grouping, individual observations lose their identity. It becomes
difficult to notice how the observations contained in each class are distributed. This
applies more to a frequency distribution which uses the tally method (as you can see
in the coming frequency tables) in its construction.
ii) A serious limitation inherent in this kind of grouping is that there will be too much
clustering of observations in various classes in case the number of classes is too
small. This will cause some of the essential information to remain unexposed.
As the number of observations gets large, it becomes quite difficult and time-consuming
to condense the data. Thus, the following steps should be taken:

A) Deciding the number of classes: The decision on the number of class groupings depends
largely on the judgment of the individual investigator and/or the range that will be used to
group the data. As a general rule, a frequency distribution should have at least five class
intervals (groups), but not more than fifteen. The following two rules are often used to decide the
approximate number of classes in a frequency distribution.
i) If K represents the number of classes and N the total number of observations, then
the value of K will be the smallest integer such that 2^K ≥ N.

Let N = 30 observations. If we apply this rule, we find that 2^4 = 16 < 30 while 2^5 = 32 ≥ 30.

Thus we may choose K = 5 as the number of classes.

ii) According to Sturges' rule, the number of classes can be determined by the formula

K = 1 + 3.322 log10 N

where K is the number of classes and log10 N is the common (base-10) logarithm of the total number of observations.
Applying this rule to the above,

K = 1 + 3.322 log10 30 = 1 + 3.322 (1.4771) ≈ 5.9, i.e. about 6 classes.
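To illustrate these two rules concretely, the following short Python sketch (not part of the original text; the function names are illustrative assumptions) computes the number of classes for N = 30 by both the 2^K rule and Sturges' rule:

import math

def classes_by_power_of_two(n):
    # Smallest K such that 2**K >= n (the "2 to the K" rule).
    k = 1
    while 2 ** k < n:
        k += 1
    return k

def classes_by_sturges(n):
    # Sturges' rule: K = 1 + 3.322 * log10(n), rounded to the nearest integer.
    return round(1 + 3.322 * math.log10(n))

n = 30
print(classes_by_power_of_two(n))  # 5, since 2**5 = 32 >= 30
print(classes_by_sturges(n))       # 6, since 1 + 3.322*log10(30) is about 5.9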

B) Obtaining the width of class intervals: When constructing the frequency distribution, it is
desirable that the width of each class interval be equal in size. The size (or width) of
each class interval can be determined by first taking the difference between the largest and
smallest numerical values in the data set and then dividing it by the number of class intervals
desired.

For example, if the largest numerical value is 95 and the smallest numerical value of the
observations is 84, then using the above rule with 5 classes desired, the width of the class intervals
is approximated as:

Width of class interval = (95 − 84) / 5 = 2.2

For convenience, the selected width (or interval) of each class is rounded up to 3.

C) Establishing class limits (Boundaries): the limits of each class interval should be clearly
defined so that each observation (element) of the data set belongs to one and only one class.
Each class has two limits – a lower limit and an upper limit. The usual practice is to let the lower
limit of the first class be a convenient number slightly below or equal to the lowest value in the
data set.

Let us now take an illustration to clarify the concepts discussed above.

Table 1.1: Raw data pertaining to total overtime hours worked by machinists

94 89 88 89 90 94 92 88 87 85
88 93 94 93 94 93 92 88 94 90
93 84 93 84 91 93 85 91 89 95

Table 1.2 reorganizes the data given in Table 1.1 above in ascending order.
Table 1.2: Ordered array of total overtime hours worked by machinists
84 84 85 85 87 88 88 88
88 89 89 89 90 90 91 91
92 92 93 93 93 93 93 93
94 94 94 94 94 95

The frequency distribution of the number of hours of overtime given in Table 1.1 is shown in
Table 1.3.
Table 1.3: Array and Tallies
Number of overtime Hours Tally Number of weeks (Frequency)
84 ll 2
85 ll 2
86 - 0
87 l 1
88 llll 4
89 lll 3
90 ll 2
91 ll 2
92 ll 2
93 llll l 6
94 llll 5
95 l 1

In Table 1.2, we may take the lower limit of the first class as 82 and the upper class limit as 85.
Thus the class would be written as 82 – 85. This class interval includes all overtime hours
ranging from 82 up to but not including 85 hours. The various other classes can be written as:

Overtime Hours (class intervals)    Tallies       Frequency
82 but less than 85                 ll            2
85 but less than 88                 lll           3
88 but less than 91                 llll llll     9
91 but less than 94                 llll llll     10
94 but less than 97                 llll l        6
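As a hedged illustration (not part of the original module), the following Python sketch reproduces the grouped frequency distribution above from the raw data of Table 1.1; the variable names are assumptions made for clarity:

data = [94, 89, 88, 89, 90, 94, 92, 88, 87, 85,
        88, 93, 94, 93, 94, 93, 92, 88, 94, 90,
        93, 84, 93, 84, 91, 93, 85, 91, 89, 95]

# Class intervals of width 3; each class includes its lower limit
# but excludes its upper limit, as described in the text.
lower_limits = [82, 85, 88, 91, 94]
width = 3

for lower in lower_limits:
    upper = lower + width
    frequency = sum(1 for x in data if lower <= x < upper)
    print(lower, "but less than", upper, ":", frequency)

# Expected frequencies: 2, 3, 9, 10, 6 (total 30 observations)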

Graphical Presentation of Data
It has already been discussed that one of the important functions of statistics is to present
complex and unorganized (raw) data in such a manner that they would easily be understandable.

Advantages and limitations of diagrams (Graphs)


Advantages: A few of the advantages and uses of diagrams are as follows:
 Diagrams give an attractive and elegant presentation: diagrams have greater attraction
and create a more effective impression.
 Diagrams leave good visual impact: the impression created by a diagram is likely to last
longer in the mind of people than the effect created by figures.
 Diagrams facilitate comparison: with the help of diagrams, comparisons of groups and
series of figures can be made easily.
 Diagrams save time: Diagrams present the set of data in such a way that their significance
is known without loss of much time.
 Diagrams simplify complexity and depict the characteristics of the data: diagrams,
besides being attractive and interesting, also highlight the characteristics of the data.
Disadvantages of diagrams (Graphs)

A few limitations of diagrams as a tool for statistical analysis are as follows:

o They provide only an approximate picture of the data


o They cannot be used as alternative to tabulation of data.
o They can be used only for comparative study.
o They are capable of representing only homogeneous and comparable data.
Types of Diagrams

There are a variety of diagrams used to represent statistical data. Different types of diagrams,
used to describe sets of data, are divided into the following categories:

1) Dimensional Diagrams

i) One-dimensional diagrams such as histograms, frequency polygons, and pie charts.
ii) Two-dimensional diagrams such as rectangles, squares, or circles.
iii) Three dimensional diagrams such as cylinders and cubes.
2) Pictograms or Ideographs

An ideogram or ideograph is a graphic symbol that represents an idea or concept.

A pictogram conveys its meaning through its pictorial resemblance to a physical object.

3) Cartographies or Statistical Maps

Statistical maps are used to show the difference in values (frequency of an event, probability of
an event etc.) between different geographical regions in geo-spatial analysis.

One-Dimensional Diagrams

These diagrams are most useful, simple, and popular in the diagrammatic presentation of
frequency distributions. These diagrams provide a useful and quick understanding of the shape of
the distribution and its characteristics.

These diagrams are called one-dimensional because only the length (height) of the bar (not
the width) is taken into consideration.

The one – dimensional diagrams (charts) used for graphical presentation of data sets are as
follows:

 Histogram
 Frequency polygon
 Frequency curve
 Cumulative frequency distribution (Ogive )
 Pie diagram
Histograms (Bar Diagrams)

Bar graphs are probably the most commonly used graphs, and one you're already familiar with. I
won't mention much more here, except to state a couple of key points:
1. Heights can represent either frequency or relative frequency
2. Bars must not touch
 Using the favorite-color example below, we can make both frequency and relative frequency bar
graphs.
Favorite color    Frequency    Relative frequency
Blue 10 10/26 ≈ 0.38
Red 3 3/26 ≈ 0.12
Orange 1 1/26 ≈ 0.04
Yellow 3 3/26 ≈ 0.12
Green 5 5/26 ≈ 0.19
Pink 3 3/26 ≈ 0.12
Purple 1 1/26 ≈ 0.04
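As a small sketch (not from the original text), the frequency and relative-frequency bar graphs for this color data could be drawn with Python and the matplotlib library; the figure layout chosen here is only one reasonable option:

import matplotlib.pyplot as plt

colors = ["Blue", "Red", "Orange", "Yellow", "Green", "Pink", "Purple"]
frequencies = [10, 3, 1, 3, 5, 3, 1]
total = sum(frequencies)                     # 26 responses in all
relative = [f / total for f in frequencies]  # relative frequencies

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.bar(colors, frequencies)   # bar heights are frequencies
ax1.set_title("Frequency bar graph")
ax2.bar(colors, relative)      # bar heights are relative frequencies
ax2.set_title("Relative frequency bar graph")
plt.tight_layout()
plt.show()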

Pie Charts
Like bar graphs, pie charts are very common.
Pie charts:
1. should always include the relative frequency
2. should also include labels, either directly or as a legend
Using the data from our previous color example, we obtain the corresponding pie chart.

Frequency Polygon

A frequency polygon is drawn by plotting a point above each class midpoint and connecting the
points with straight lines. (Class midpoints are found by averaging successive lower class limits.)

To illustrate the idea, let's look at the following example of average commute times.
Average commute midpoint frequency
16-17.9 17 1
18-19.9 19 2
20-21.9 21 1
22-23.9 23 6
24-25.9 25 2
26-27.9 27 1
28-29.9 29 1
30-31.9 31 1
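A hedged sketch (not in the original text) of how this frequency polygon could be plotted in Python, using the midpoints and frequencies from the table above:

import matplotlib.pyplot as plt

midpoints = [17, 19, 21, 23, 25, 27, 29, 31]
frequencies = [1, 2, 1, 6, 2, 1, 1, 1]

# A frequency polygon: one point above each class midpoint, joined by lines.
plt.plot(midpoints, frequencies, marker="o")
plt.xlabel("Average commute (class midpoint)")
plt.ylabel("Frequency")
plt.title("Frequency polygon")
plt.show()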

Cumulative Frequency Distribution (Ogive)

A cumulative frequency curve, popularly known as an ogive, is another form of graphical
presentation of a cumulative frequency distribution.

The two types of ogive are:

 The less than ogive

 The greater than (or more than) ogive

When both curves are drawn on one graph, the rising curve represents the less than ogive, and the
falling curve represents the greater than ogive.
1.3. MEASURE OF CENTRAL TENDENCY
The term central tendency was coined because observations (numerical values) in most data sets
show a distinct tendency to group or cluster around a value of an observation located somewhere
in the middle.
The measures of central tendency that will be discussed here are the arithmetic mean, the median,
and the mode. Before discussing the arithmetic mean or any other mean, a question arises: why should
we use such a mean?
The answer is that there are two main objectives of using a mean. First, to get a single value
that describes the characteristics of the entire data set; for instance, when we talk of the per capita income
of a country, it gives a broad idea of the standard of living of the people in that country. The
second reason for using a mean is to facilitate comparisons of data.
1.3.1. The Arithmetic Mean
The arithmetic mean is obtained by adding all the observations and dividing the sum by the
number of observations. Suppose we have the following observations: 10, 15, 30, 7, 42, 79 and
83.
These are seven observations. Symbolically, the arithmetic mean, also called simply the mean, is

X̄ = ∑X / n, where ∑X is the sum of the sample observations and n is the number of observations.

X̄ = (10 + 15 + 30 + 7 + 42 + 79 + 83) / 7 = 266 / 7 = 38
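A minimal Python check of this calculation (an illustration, not part of the original text):

observations = [10, 15, 30, 7, 42, 79, 83]
mean = sum(observations) / len(observations)
print(mean)  # 38.0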

It may be noted that the Greek letter µ is used to denote the mean of a population and N to
denote the total number of observations in the population. Thus, the population mean is µ = ∑X / N.
Ungrouped Data: Weighted Case
In case of ungrouped data where weights are involved, our approach for calculating arithmetic
mean will be different from the one used earlier.
Example 1
Suppose a student has secured the following marks in three tests
Mid –term test 30
Laboratory 25
Final 20
The simple arithmetic mean will be (30 + 25 + 20) / 3 = 25.
However, this will be wrong if the three tests carry different weights on the basis of their
relative importance. Assume that the weights assigned to the three tests are:
Mid – term test 2 Points
Laboratory 3 Points
Final 5 Points
Solution
On the basis of this information, we can now calculate a weighted mean as shown below:
Calculation of weighted mean
Type of test    Relative weight (w)    Marks (x)    wx
Mid-term        2                      30           60
Laboratory      3                      25           75
Final           5                      20           100
Total           ∑w = 10                             ∑wx = 235

X̄w = ∑wx / ∑w = (60 + 75 + 100) / (2 + 3 + 5) = 235 / 10 = 23.5 marks
Example 2
An investor is fond of investing in equity shares. During a period of falling prices in the stock
exchange, a stock is sold at birr 120 per share on one day, birr 105 on the next and birr 90 on the
third day. The investor has purchased 50 shares on the first day, 80 shares on the second day and
100 shares on the third day. What average price per share did the investor pay?

Day    Price per share in birr (x)    Number of shares purchased (w)    Amount paid (wx)
1      120                            50                                6,000
2      105                            80                                8,400
3      90                             100                               9,000
Total  -                              230                               23,400

Weighted average = (w1x1 + w2x2 + w3x3) / (w1 + w2 + w3)
                 = (6,000 + 8,400 + 9,000) / (50 + 80 + 100) = 23,400 / 230 = birr 101.7
Thus, the investor paid an average price of birr 101.7 per share.
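A short, illustrative Python check of this weighted-average calculation (the variable names are assumptions, not from the text):

prices = [120, 105, 90]   # price per share in birr on each day (x)
shares = [50, 80, 100]    # number of shares purchased each day (w)

weighted_mean = sum(w * x for w, x in zip(shares, prices)) / sum(shares)
print(round(weighted_mean, 1))  # 101.7 birr per share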
1.3.2. The Median
The median of a set of numbers arranged in order of magnitude (i.e., in an array) is either the
middle value or the arithmetic mean of the two middle values.
The median can be calculated for both ungrouped and grouped data sets.
Ungrouped data: In this case the data is arranged in either ascending or descending order of
magnitude
(i) If the number of observations (n) is odd, then the median (Med) is the numerical value of the
observation at position (n + 1) / 2 in the ordered array:

Med = value of the [(n + 1) / 2]th ordered observation.

(ii) If the number of observations (n) is even, then the median is defined as the arithmetic mean of
the numerical values of the (n / 2)th and [(n / 2) + 1]th observations in the data array, that is:

Med = [value of the (n / 2)th observation + value of the ((n / 2) + 1)th observation] / 2
Example: Calculate the median of the following data, which relate to the service time (in minutes)
per customer for seven customers at a railway reservation counter.
Observations in the data array: 1 2 3 4 5 6 7
Service time (in minutes): 3 3.5 3.8 4 4.5 5 5.5
Median = value of the [(n + 1) / 2]th observation in the data array
       = value of the [(7 + 1) / 2]th = 4th observation in the data array, which is 4.
Thus the median service time is 4 minutes per customer.
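As a hedged sketch (not part of the original text), a small Python function covering both the odd-n and even-n cases of the median, applied to the service-time data:

def median(values):
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                                # odd n: the middle value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: mean of the two middle values

service_times = [3, 3.5, 3.8, 4, 4.5, 5, 5.5]
print(median(service_times))  # 4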
1.3.3. Mode
The mode is that value of an observation which occurs most frequently in the data set, that is, the
point (or class mark) with the highest frequency.
In the case of grouped data, the following formula is used for calculating mode.

Mode = L + [(fmo − fmo-1) × h] / (2 fmo − fmo-1 − fmo+1)
Where L = lower limit of the modal class interval,
fmo = frequency of the modal class interval,
fmo-1 = frequency of the class preceding the modal class interval,
fmo+1 = frequency of the class following the modal class interval,
h = width of the modal class interval.
Example: The data below show the sales of an item per day over a 20-day period.
Sales volume (class interval)    53-56    57-60    61-64    65-68    69-72    73 and above
Number of days (frequency)       2        4        5        4        4        1

Solution: Since the largest frequency corresponds to the class interval 61-64, it is the
modal class. Then we have
L = 61, fmo = 5, fmo-1 = 4, fmo+1 = 4 and h = 3. Thus
Mode = L + [(fmo − fmo-1) × h] / (2 fmo − fmo-1 − fmo+1)
     = 61 + [(5 − 4) × 3] / (10 − 4 − 4) = 61 + 1.5 = 62.5
Hence, the modal sales volume is 62.5 units.
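A minimal Python sketch (illustrative only, not from the text) of the grouped-data mode formula applied to this example:

def grouped_mode(lower_limit, f_mo, f_before, f_after, width):
    # Mode = L + (fmo - fmo-1) * h / (2*fmo - fmo-1 - fmo+1)
    return lower_limit + (f_mo - f_before) * width / (2 * f_mo - f_before - f_after)

# Modal class 61-64: L = 61, fmo = 5, preceding frequency 4, following frequency 4, h = 3.
print(grouped_mode(61, 5, 4, 4, 3))  # 62.5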

1.4. MEASURES OF DISPERSION


Just as central tendency can be measured by a number in the form of an average, the amount of
variation (dispersion, spread, or scatter) among the values in the data set can also be measured.
1.4.1. Significance of Measuring variation
1. To test the reliability of an average: measures of variation are used to test to what extent an
average represents the characteristics of a data set.
2. To control variability: they help to identify the nature and causes of variation; such
information is useful in controlling the variation.
3. To compare two or more sets of data with respect to their variability.
4. To facilitate the use of other statistical techniques such as correlation and regression
analysis, hypothesis testing, forecasting, quality control, and so on.
1.4.2. The Range
The simplest measure of dispersion is the range, which is the difference between the maximum
value and the minimum value of the data.
Range(R) = value of largest observation - value of smallest observation.
For example, if the smallest value of an observed value in the data set is 160 and largest value is
250, then the range is 250 – 160 = 90.
For grouped frequency distribution of values in the data set, the range is the difference between
the upper limit of the highest class and the lower limit of the lowest class. Note that the range is
not influenced by the frequencies.
Example: Find the range for the following frequency distribution.
Size of item Frequency
20-40 7
40-60 11
60-80 30
80-100 17
100-120 5
Total 70
Solution: Here the upper limit of the highest class is 120 and the lower limit of the lowest class
is 20. Hence, the range is 120 – 20 =100.
The relative measure of range, called the coefficient of range, is obtained by applying the
following formula:

Coefficient of range = (Highest − Lowest) / (Highest + Lowest)

From the previous example,

Coefficient of range = (120 − 20) / (120 + 20) = 100 / 140 = 0.71
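As a brief illustration (not part of the original text), the range and coefficient of range for this grouped example can be computed as follows:

highest, lowest = 120, 20   # upper limit of highest class, lower limit of lowest class
value_range = highest - lowest
coefficient_of_range = (highest - lowest) / (highest + lowest)
print(value_range)                     # 100
print(round(coefficient_of_range, 2))  # 0.71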
1.4.3. The standard Deviation
Before defining the concept of the standard deviation, let us introduce another concept, namely the
variance. Consider the individual items given in the following table.
Example
X      X − μ           (X − μ)²
20     20 − 18 = 2     4
15     15 − 18 = −3    9
19     19 − 18 = 1     1
24     24 − 18 = 6     36
16     16 − 18 = −2    4
14     14 − 18 = −4    16
108    Total           70
Mean = 108/6 = 18
The second column shows the deviations from the mean. The third (last) column shows the
squared deviations, the sum of which is 70. The arithmetic mean of the squared deviations is:

A.M. of squared deviations = ∑(X − μ)² / N = 70 / 6 = 11.67

This mean of the squared deviations is known as the variance. It may be noted that this variance
is described by different terms that are used interchangeably: the variance of the distribution X,
the variance of X, or simply the variance. Symbolically,

Var(X) = ∑(X − μ)² / N

It is also written as σ² = ∑(Xi − μ)² / N

where σ² (read "sigma squared") is used to denote the variance.
Although the variance is a measure of dispersion, the unit of its measurement is a squared unit,
here (points)². If a distribution relates to the income of families, then the unit of variance is (Rs)².
Similarly, if another distribution pertains to the marks of students, then the unit of variance is
(marks)². To overcome this inadequacy, the square root of the variance is taken, which yields a
better measure of dispersion known as the standard deviation.
Taking our earlier example of individual observations, the standard deviation is the square root of
the variance:

SD = √(70 / 6) = √11.67 = 3.42 points

Symbolically,

σ = √[ ∑(Xi − μ)² / N ]
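A small Python sketch (illustrative, not from the text) that reproduces the variance and standard deviation calculated above for the six observations:

import math

data = [20, 15, 19, 24, 16, 14]
n = len(data)
mu = sum(data) / n                               # 18.0
variance = sum((x - mu) ** 2 for x in data) / n  # 70 / 6, about 11.67
std_dev = math.sqrt(variance)                    # about 3.42

print(round(mu, 2), round(variance, 2), round(std_dev, 2))  # 18.0 11.67 3.42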

1.4.4. The coefficient of variation


One measure of relative dispersion is the coefficient of variation, which relates the standard
deviation and the mean such that the standard deviation is expressed as a percentage of the mean.
Symbolically,

CV (coefficient of variation) = (σ / μ) × 100%

From the previous (above) example,

CV = (σ / μ) × 100 = (3.42 / 18) × 100 ≈ 19%
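Continuing the same illustrative sketch (not from the text), the coefficient of variation follows directly from the mean and standard deviation obtained above:

mean = 18.0
std_dev = 3.42
cv = std_dev / mean * 100
print(round(cv, 1))  # about 19.0 per cent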

1.4.5. The skewness

The skewness of a distribution is defined as a lack of symmetry. In a symmetrical distribution,
the mean, median and mode are equal to each other, and the ordinate at the mean divides the
distribution into two equal parts such that one part is the mirror image of the other (Fig. 6.1). If some
observations of very high (low) magnitude are added to such a distribution, its right (left) tail
gets elongated.

These observations are also known as extreme observations. The presence of extreme
observations on the right hand side of a distribution makes it positively skewed and the three
averages, viz., mean, median and mode, will no longer be equal. We shall in fact have Mean >
Median > Mode when a distribution is positively skewed. On the other hand, the presence of
extreme observations on the left hand side of a distribution makes it negatively skewed and the

relationship between mean, median and mode is: Mean < Median < Mode. In Fig. 6.2 we depict
the shapes of positively skewed and negatively skewed distributions.
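To make the relation concrete, here is a small illustrative sketch (with hypothetical data, not from the text) showing that a long right tail pulls the mean above the median and the mode:

import statistics

# Hypothetical positively skewed data: a few very large observations form a long right tail.
data = [2, 3, 3, 3, 4, 4, 5, 6, 20, 25]

mean = statistics.mean(data)      # 7.5
median = statistics.median(data)  # 4.0
mode = statistics.mode(data)      # 3 (the most frequent value)
print(mean > median > mode)       # True: Mean > Median > Mode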

