Probability and Statistics

Probability and Statistics for Engineers
Yabebal Ayalew
Statistics Department, Addis Ababa University
College of Natural & Computational Science

Chapter One
Introduction
Outline
1 Introduction
— Definition and Classification of Statistics
— Stages in Statistical Investigation
— Definition of Some Basic Terms
— Application, uses and limitations of statistics
— Types of variables and measurement scales
2 Method of Data Collection and Presentation
— Method of Data Collection
— Source and Type of Data
— Methods of Data Presentation
• Frequency Distribution
• Diagrammatic and/or Graphical Presentation of Data
Statistics Department, Probability and Statistics, 22.2.2022

Introduction
Definition of Statistics
“Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write.”
~Samuel S. Wilks (1906-1964)
• In the modern world of computers and information technology, the importance of statistics is well recognized across all disciplines
• Statistics originated as a science of statehood and gradually found applications in agriculture, economics, commerce, biology, medicine, industry, planning, education and so on
• Today there is hardly any walk of human life where statistics cannot be applied. Hence, we are constantly bombarded with statistics and statistical information
• The words statistics and statistical are both derived from the Latin word status, which means a political state∗
— In the olden days, the application of statistics was limited to state affairs
— In the 19th century, statistics as a field came to include data analysis as its major component
• Over time, the application of statistics has grown and its definition has changed accordingly
• The American Heritage Dictionary defines statistics as:
The mathematics of the collection, organization, and interpretation of numerical data, especially the analysis of population characteristics by inference from sampling.
∗ From the New Latin statisticum collegium, meaning council of state, and the Italian word statista, meaning statesman or politician.
• The Merriam-Webster’s Collegiate Dictionary defines statistics as:
A branch of mathematics dealing with the collection, analysis, interpretation, and presentation of masses of numerical data.
• The former American Statistical Association president Jon Kettenring defines statistics as:
...the science of learning from data ... It presents exciting opportunities for those who work as professional statisticians. Statistics is essential for the proper running of government, central to decision making in industry and a core component of modern educational curricula at all levels.
• Despite these definitions, the word statistics has two different senses depending on whether it is used as a plural or a singular noun.

Definition of Some Basic Terms
• Population is a collection of objects possessing a common characteristic that can be studied
— Population is defined with respect to time and space
— Example: Economics students of Addis Ababa University; more precisely, 2nd year Economics students of Addis Ababa University registered for the 2016/17 academic year
— By this definition, a population need not consist of human beings. It can be chairs, oceans, bacteria etc
• Sample is a small portion of the population
— Small in terms of size
— Should be highly representative
— Saves time and money and can give greater accuracy
• A parameter is a number that summarizes some aspect of the population as a whole. A statistic is
a number computed from the sample data.

Classification of Statistics
• Based on how statistical data are used, statistics is broadly classified into two mutually exclusive groups: descriptive statistics and inferential statistics
• Descriptive statistics is used to describe the basic features of the data in a study
— Provides simple summaries about the sample and the measures
• Various techniques that are commonly used in descriptive statistics are
— Graphical description (pie chart, line graph, bar graph, histogram etc)
— Tabular description (frequency distribution)
— Summary statistics (mean, variance, median, mode etc)
• Example: Of 350 randomly selected people in the city of Addis Ababa, 280 people had the last name Abebe. An example of descriptive statistics is the following statement: 80% of these people have the last name Abebe
• Example: On the last 3 Sundays, Hiwot, a car saleswoman, sold 2, 1, and 0 new cars, respectively. An example of descriptive statistics is the following statement: Hiwot averaged 1 new car sold for the last 3 Sundays.
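The two descriptive statements above are simple arithmetic on the sample itself. A minimal Python sketch of that arithmetic follows; the variable names are illustrative and not part of the lecture.

```python
from statistics import mean

# First example: 280 of the 350 sampled people have the last name Abebe.
sample_size = 350
with_surname = 280
sample_proportion = with_surname / sample_size  # describes the sample only

# Second example: new cars sold on the last three Sundays.
cars_sold = [2, 1, 0]
average_sold = mean(cars_sold)

print(f"Proportion in the sample: {sample_proportion:.0%}")
print(f"Average cars sold per Sunday: {average_sold}")
```

Both numbers summarize only the observed data; nothing is claimed about people or Sundays outside the sample.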

• Inferential statistics comprises the use of sample data to draw inferences or conclusions about unknown aspects of a population, such as its parameters or the relationships among its variables
— Inferential statistics aims to make inferences from the data in order to reach conclusions that go beyond the data
— It is also called statistical induction since it uses inductive reasoning
• Hypothesis testing is a typical example of inferential statistics
• Example: Of 350 randomly selected people in the city of Addis Ababa, 280 people had the last name Abebe. An example of inferential statistics is: 80% of all people living in Addis have the last name Abebe.
— We have no information about all people living in Addis Ababa, only about 350 of them
— We have taken that information and generalized it to talk about all people living in Addis Ababa
• Example: On the last 3 Sundays, Hiwot, a car saleswoman, sold 2, 1, and 0 new cars, respectively. An example of inferential statistics is the following statement: Hiwot never sells more than 2 cars on a Sunday.
Applications, Uses and Limitations of Statistics
• Statistics has applications in every scientific field. Some of the uses of statistics are:
1 Statistics presents facts in the form of numerical data
2 It condenses and summarizes a mass of data into a few presentable and precise figures
3 It facilitates comparison of data
4 It helps in formulating and testing hypotheses
5 It helps in predicting future trends
6 It helps to formulate policies
Limitations of Statistics
• Statistics is not suitable for the study of qualitative phenomena
— Unless we have an indirect method to quantify those phenomena, statistics is of little use
• Statistics does not study individuals
— Statistics does not give any specific importance to individual items; it deals with an aggregate of objects

• Statistical laws are not exact
— It is well known that mathematical and physical sciences are exact
— But statistical laws are only approximations
— Statistical conclusions are not universally true. They are true only on average.
• Statistical tables may be misused
— Statistics must be used only by experts; otherwise, statistical methods are the most dangerous tools in the hands of the inexpert
— As King aptly says, “Statistics are like clay of which one can make a God or Devil as one pleases.”
• Statistics is only one of the methods of studying a problem
— Problems should be studied by taking the background of a country’s culture, philosophy or religion into consideration
— Statistical study should be supplemented by other evidence

• Statistics doesn’t study cause-and-effect relationships
— In statistics, different types of relationships can be studied, but none of them by itself indicates causation
Characteristics of Statistical Data
• The data must be an aggregate of facts
• The data must be affected to a marked extent by a multiplicity of causes
• The data must be estimated according to reasonable standards of accuracy
• The data must be collected in a systematic manner for predefined purpose
• The data should be placed in relation to each other

Stages in Statistical Investigation
1 Formulating the problem

— Get a clear understanding of the physical background to the situation under study†
— Clarify the objectives;
— Formulate the objective in statistical terms
2 Proper collection of data
— In order to draw valid conclusions, it is important to have ‘good’ data.
— Data are gathered with the aim of meeting predetermined objectives
3 Organization and classification of data
— The data must be placed in relation to each other
— The classification or sorting out of data is, by itself, a kind of organization of data
† An approximate answer to the right question is worth a great deal more than a precise answer to the wrong question (the first golden rule of applied mathematics).
4 Presentation of data: The purpose of putting the organized data in graphs, charts and tables is
two-fold
— First, it is a visual way to look at the data and see what happened and make interpretations
— Second, it is usually the best way to show the data to others
5 Analysis of data
— It is the process of looking at and summarizing data with the intent to extract useful
information and develop conclusions
— In this stage different types of inferential statistical methods will be applied
6 Interpretation of results‡
— Interpretation means drawing valid conclusions from data which form the basis of decision
making
— Correct interpretation requires a high degree of skill and experience
‡ Analysis and interpretation of data are two sides of the same coin
Scale of Measurement
• A variable is an attribute of a physical or abstract system whose value varies while under consideration. It can be classified as quantitative or qualitative
• Quantitative variables are those whose values are naturally expressed by numbers, e.g. weight, salary etc
• A quantitative variable can be either discrete or continuous
— A discrete variable is one whose values have predefined gaps between them. We don’t need a measuring device to know the next possible value, e.g. number of students in a class
— A continuous variable is one whose values don’t have predefined gaps. We need a measuring device to know the next possible value of the variable
• A qualitative variable§ is one whose values are not naturally expressed by numbers, e.g. gender, religion, political affiliation, military rank
— The four basic arithmetic operations should not be applied
— They can be expressed by pseudo-numbers, e.g. bus number
§ It is also called a categorical variable
• According to Wikipedia¶ Level of measurement or scale of measure is a classification that describes
the nature of information within the values assigned to variables
1 Nominal
2 Ordinal
3 Interval
4 Ratio
• The first two are reserved for categorical variables and the last two reserved for quantitative variables
• Nominal Scale of Measurement
— The variable values are not ordered
— The permissible mathematical operation is counting
— Example: Gender, religious affiliation, and race
• Ordinal Scale of Measurement
— Rank order is possible
— The real difference between categories is not known
— Count, >, <, ≥, and ≤ are permissible operations
— Example: Thesis grade, military rank, academic rank etc
¶ https://en.wikipedia.org/wiki/Level_of_measurement
• Interval Scale of Measurement
— There is no true zero, i.e., zero doesn’t imply the absence of the quantity being measured
— + and − are possible
— Example: Temperature in degrees centigrade, IQ
• Ratio Scale of Measurement
— There is a true zero, i.e., zero implies the absence of the quantity being measured
— All four basic arithmetic operations are permissible
— Example: Weight, height, temperature on the Kelvin scale
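The practical difference between the interval and ratio scales can be seen with the temperature examples above. The sketch below uses two hypothetical readings (20 and 10 degrees centigrade, chosen only for illustration) and the standard conversion to kelvin.

```python
# Two hypothetical temperature readings in degrees centigrade (interval scale).
warm_c, cool_c = 20.0, 10.0

# Differences are meaningful on an interval scale.
difference_c = warm_c - cool_c

# A ratio needs a true zero, so convert to kelvin (ratio scale) first.
warm_k, cool_k = warm_c + 273.15, cool_c + 273.15
ratio_k = warm_k / cool_k

# Dividing the centigrade values gives 2.0, but "twice as hot" is not a
# valid reading because 0 degrees centigrade is not the absence of temperature.
naive_ratio_c = warm_c / cool_c

print(f"Difference: {difference_c} degrees")
print(f"Ratio in kelvin: {ratio_k:.3f}")
print(f"Naive ratio in centigrade: {naive_ratio_c}")
```

The difference is the same on both scales, while the ratio only has a physical interpretation on the Kelvin scale.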

Methods of Data Collection and Presentation
Methods of Data Collection—Census Vs Sample Survey
• The word census was derived from a Latin verb censere which means—contrary to what’s
expected—not to count but rather to assess, or in a term closer to the world of statistics, to estimate ‖
Census
It is a complete process of extracting information from each element of a population. Population
Census is a complete process of collection, receipt, assessment, analysis, publication and distribution of
demographic, economic and social data, which relate, at a given moment in time, to all the residents of
a country or of a well-defined partial geographic area
• One typical example of a census in our country is the population and housing census∗∗
— It is conducted every 10 years (Eth.Constitution A.103(4))
— Ethiopia has conducted three censuses (1984, 1994, 2007)
— The 4th census was scheduled for 2017 but has not been conducted yet
‖ Americana Corporation of Canada. 1951. The Encyclopedia Americana. Montreal: Americana Corp. of Canada.
∗∗ Countries like Japan, Canada and Australia conduct a population census every 5 years
Interesting Facts about Ethiopia

• Life expectancy is 66.24 years (Male = 64.4 Years;
Female = 68.2 Years)
• Fertility rate: 4.25 births per woman
• The estimated population of Ethiopia was 116,956,221 as
of Wednesday, March 17, 2021, based on Worldometer
elaboration of the latest United Nations data
• The current population as of Friday February 11, 2022 is
119,790,055
• The median age in Ethiopia is 19.5 years.
Source: https://www.worldometers.info/world-population/ethiopia-population/


• Census activities can be divided into three main stages: planning, data collection and producing the results††
• Planning: the end justifies the means
— The purpose and methodology of the census are determined
— Main strategic decisions are made
— Intermediate goals are defined
— Methods and means designed to achieve the goal of the census are developed
• Data collection: the most intensive stage
— Collecting data by direct contact with residents
— Requires complex logistic preparation
— Requires a public campaign to enlist the cooperation of the public and a high level of skill in the field operation
†† http://www.cbs.gov.il/census/census/pnimi_sub_page_e.html?id_topic=1&id_subtopic=1
• Producing the results
— Including receipt, processing, estimation, analysis, publication and distribution of the census data
Advantages of Census
• Benchmark data may be obtained for future studies
• Detailed information about small sub-groups within the population is more likely to be available
• Provides a true measure of the population (no sampling error)
Disadvantages of Census
• It is costly in terms of money and time
• Not possible when the population is infinite
• Its reliability is compromised in areas with low literacy
• Reading Assignment: An important aspect of census enumerations is determining which individuals
can be counted. Broadly, three definitions can be used: de facto residence; de jure residence; and
permanent residence.

• A sample survey is a study that obtains data from a subset of a population in order to estimate population attributes.
• This definition makes it clear that a sample survey is a study of sample elements with the intention of estimating population parameters
• A sample survey is preferable when
— The population is infinite
— The budget is too small to consider all elements in the population
— You want to update census results
— The population is homogeneous
• In order to conduct a sample survey, we have to select elements from the population as sample elements
• The process of selecting sample elements from a population is called sampling. There are two types of sampling techniques: probability and non-probability sampling techniques

• In probability sampling techniques, each element in the population has a known (non-zero) chance of being selected as a sample element
— Simple, cluster, stratified, systematic random sampling
• With non-probability sampling methods, the chance of each element in the population being selected as a sample element is not known‡‡
— Quota, purposive, convenience
• Simple Random Sampling: The selection of elements in this sampling technique is purely random, i.e., there is no personal bias in the selection
— Suppose the population has N elements
— The sample size is n
— Select n sample elements from a population of size N using a random number table or the lottery method
‡‡ http://stattrek.com/survey-research/sampling-methods.aspx?Tutorial=AP
• Simple random sampling can be useful when the population is relatively homogeneous
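The lottery method described above can be imitated with a pseudo-random number generator. This is a minimal sketch, assuming a hypothetical population of 100 numbered elements; the function name `simple_random_sample` and the seed are illustrative choices, not part of the lecture.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements so that every element is equally likely."""
    rng = random.Random(seed)          # a seed makes the draw reproducible
    return rng.sample(population, n)   # sampling without replacement

# Hypothetical population: elements labelled 1..N with N = 100.
population = list(range(1, 101))
sample = simple_random_sample(population, n=10, seed=42)
print(sample)
```

`random.sample` draws without replacement, which matches pulling n tickets out of a lottery box of N tickets.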
• Stratified Random Sampling: The population is divided into k mutually exclusive groups called strata
— The population is divided into strata based on a certain characteristic
— It is used to keep homogeneity within each stratum and heterogeneity across strata
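One way to picture stratified random sampling is to group the elements by stratum and then draw a simple random sample inside every stratum. The sketch below assumes proportional allocation with a 10% sampling fraction and a hypothetical list of students labelled by department; neither assumption comes from the lecture.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Draw a simple random sample of the given fraction within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)   # group units by stratum
    sample = []
    for members in strata.values():
        size = max(1, round(fraction * len(members)))  # at least one per stratum
        sample.extend(rng.sample(members, size))
    return sample

# Hypothetical population: 60 Economics and 40 Statistics students.
students = [("Economics", i) for i in range(60)] + [("Statistics", i) for i in range(40)]
chosen = stratified_sample(students, stratum_of=lambda s: s[0], fraction=0.1, seed=1)
print(chosen)
```

Every stratum contributes to the sample, which is the key contrast with cluster sampling.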
• Cluster Random Sampling: uses clusters to divide the population
— Mostly clusters are city blocks, woredas or kebeles
— There is similarity across clusters but high heterogeneity within each cluster
— The simplest form of cluster sampling is selecting n clusters and considering all elements in the selected clusters as sample elements
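The simplest form of cluster sampling described above (pick some clusters at random, then take everyone in them) can be sketched as follows. The kebele names and household labels are hypothetical, and `cluster_sample` is an illustrative helper name.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Select n_clusters at random and keep every element of the chosen clusters."""
    rng = random.Random(seed)
    chosen_names = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen_names for unit in clusters[name]]

# Hypothetical clusters: four kebeles with three households each.
clusters = {
    "K01": ["K01-1", "K01-2", "K01-3"],
    "K02": ["K02-1", "K02-2", "K02-3"],
    "K03": ["K03-1", "K03-2", "K03-3"],
    "K04": ["K04-1", "K04-2", "K04-3"],
}
sample = cluster_sample(clusters, n_clusters=2, seed=7)
print(sample)
```

Only the selected clusters appear in the sample, and within them no further sampling takes place.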
• Exercise: What is the difference between cluster and stratified sampling?
— In stratified random sampling, we select elements from every stratum. But in cluster sampling, only some clusters are selected
— In stratified sampling, we reduce sampling error, but this is not the case in cluster sampling since the clusters are similar to one another
• Exercise: DKT Ethiopia has decided to conduct a survey about attitudes towards contraception. The target population for this study is dwellers in Addis Ababa aged over 15. What sampling technique is appropriate? Why?
• Systematic random sampling: It is a means of selecting every kth element in the population as a sample element
— First, you need to have a complete list of elements in the population
— Then you have to decide k such that k > 1, using $k = \frac{N}{n}$ rounded to the nearest integer
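The selection of every kth element can be sketched in a few lines. The random starting position within the first k elements is a common convention that the slides do not spell out, so treat it as an assumption; the frame of 100 numbered elements is also hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every k-th element of the frame, with k = N / n rounded."""
    N = len(frame)
    k = max(1, round(N / n))                  # sampling interval
    start = random.Random(seed).randrange(k)  # assumed: random start in the first k positions
    return frame[start::k][:n]

# Hypothetical sampling frame: a complete list of N = 100 elements.
frame = list(range(1, 101))
sample = systematic_sample(frame, n=10, seed=3)
print(sample)
```

With N = 100 and n = 10 the interval is k = 10, so consecutive sample elements are always 10 positions apart.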

• Bias in probability sampling: Sampling error is mostly caused by sampling bias
1 Non-response bias: Occurs when respondents fail to answer the questions in the survey
2 Response bias: Occurs when respondents provide inaccurate answers
3 Selection bias: Occurs when some elements in the population have a higher chance of selection
4 Self-selection bias: A type of bias in which individuals voluntarily select themselves into a group, thereby potentially biasing the response of that group
5 Coverage bias: Occurs when some population elements don’t appear in the sampling frame
• Purposive Sampling: The elements are selected by the judgment of the researcher
— It is also known as judgment, selective or subjective sampling
• Quota Sampling: The population is segmented into groups just as in stratified sampling. Then
judgment is used to select elements in each segment
— Quota sampling is the non-probability version of stratified sampling

Methods of Data Collection—Data Type
• The disadvantage of non-probability sampling is that, since the sample may not be representative, it is hard to generalize the results
• Once we have decided to conduct either a census or a sample survey, we have to think about the possible ways of collecting data
— Having good data is essential for reaching sound conclusions
• We have two methods of data collection
— Primary Data Collection Methods: questionnaire, interview, observation, focus group discussion, etc
— Secondary Data Collection Methods
• Primary data are data collected for the first time by the researcher, i.e., first-hand information
• Secondary data are data that have already been collected and analyzed by somebody else and are made available to a third party for further analysis

• All methods of primary data collection depend on a set of questions, i.e., questions need to be formulated first
• A questionnaire is a set of printed or written questions with a choice of answers, devised for the purposes of a survey or statistical study
• A questionnaire can contain both closed-ended and open-ended questions
— An open-ended question is one in which you do not provide any standard answers to choose from
— A closed-ended question is one in which you provide the response categories, and the respondent just chooses one
• Having a good questionnaire has an impact on the quality of the data collected
• Designing one is a planned, thoughtful process based on systematic principles

Methods of Data Collection—Questionnaire Design
• Excellent questionnaire involves a simultaneous integration of four layers

— Questions
— Objectives
— Words
— Layout or format
• The respondents have a pronounced effect on
— Type of questions you can ask
— Type of words you can use
— Concepts you can explore
— Methodology you can use

• Step 1: Decide what information is required

— Review the proposal of the research and make a listing of all the objectives and what
information is required
• Step 2: Make a rough listing of the questions
— A list is now made of all the questions that could go into the questionnaire
— The aim at this stage is to be as comprehensive as possible in the listing and not to worry about
the phrasing of the questions
• Step 3: Refine the question phrasing
— The questions must now be developed close to the point where they make sense and will
generate the right answers

• Target the vocabulary and grammar to the population being surveyed

— For studies within a specific organization, use the jargon used in that organization
— Be careful to avoid language that is familiar to you, but might not be to your respondents
— Avoid unnecessary abbreviations
• Avoid ambiguity, confusion, and vagueness
— Make sure it is absolutely clear what you are asking and how you want it answered. For example, “What is your income?” Daily, monthly or what?
— Avoid indefinite words or response categories. For instance, “Do you jog regularly?” What does “regularly” mean?

• Avoid emotional language, prestige bias and leading questions

— Watch out for loaded words that have a history of being attached to extreme situations. For example, avoid questions like “What should be done about murderous terrorists who threaten the freedom of good citizens and the safety of our children?”
— Watch for prestige markers that cue the respondent to give the right answer. For example, the question “Most doctors say that cigarette smoke causes lung disease for those near a smoker. Do you agree?” tends to provoke yes answers because people trust doctors
— Avoid leading questions like “You don’t smoke, do you?”
• Avoid double-barreled questions
— Make each question about one and only one topic. For example, don’t ask Does your company
have pension and health insurance benefits? because if their company has only one of those
benefits, it is unclear whether the respondent will say yes or no

• Don’t assume that the respondents are experts on themselves (unless you have no choice)
— Suppose you want to test the idea that students give better evaluations to teachers who tell a lot of jokes in class. The wrong way to investigate this is to ask “Do you rate a teacher higher if the teacher tells many jokes?” because this assumes that students are completely conscious of everything they do and why
• Avoid asking questions beyond a respondent’s capabilities
— People have cognitive limitations, especially when it comes to memory of past events.
— It is pointless to ask people about things that are not natural ways for them to think
• Avoid false premises
— Asking “What should be done to enforce PM. Meles’s vision?” assumes that PM. Meles has a vision, which the respondent may not agree with. This puts the respondent in a tough spot
• Avoid asking about future intentions
— Hypothetical questions like If a new grocery store were to open down the street, would you shop
there? are notoriously unrelated to actual future behavior

• Avoid negatives and especially double negatives

— Negatives like Students should not be required to take a comprehensive exam to graduate are
often difficult for many respondents to process, especially if they agree with the predicate,
because then they are disagreeing with not doing something, which is confusing!
— Double negatives like It is not a good idea not to turn in homework on time yield very unreliable
data because people are unsure about whether to put a yes or no even if it is clear in their
minds whether turning homework in on time is a good idea.
• Step 4: Put the questions into an appropriate sequence
— The ordering of the questions is important as it brings logic and flow
— Question Placement: It’s a good idea to put difficult, embarrassing or threatening questions towards the end. This has two benefits. First, it makes respondents more likely to answer, and, second, if they get mad and leave, at least you’ve gotten most of your questions answered!
— Put related questions together to avoid giving the impression of lack of meticulousness

• Step 5: Finalise the layout of the questionnaire
— The questionnaire now needs to be fully formatted with clear instructions to the respondent
— There needs to be enough space to write in answers and the response codes need to be well
separated from each other so there is no danger of circling the wrong one
• Step 6: Pretest and revise
• Secondary data sources include governmental organizations such as CSA, NGOs, books, and social media like Facebook
Advantage
• Faster, less expensive, and requires fewer activities (such as field trips)
Disadvantage
• May not be easily available or adequate
• May not meet the needs of the researcher
• Information may be outdated, inaccurate or biased
Methods of Data Presentation—Frequency Distribution
• Frequency distribution is the organization of raw data in table form, using classes and frequencies
• Objectives of Frequency Distribution
— To organize the data in a meaningful, intelligible way
— To enable the reader to determine the nature or shape of the distribution
— To facilitate computational procedures for measures of average and spread
— To enable the researcher to draw charts and graphs for the presentation of data
— To enable the reader to make comparisons between different data sets
• Based on the type and nature of the variable, frequency distributions can be categorized as categorical, ungrouped and grouped frequency distributions
• Categorical Frequency Distribution is used to tabulate categorical variables
— The major components are class, tally, frequency and percentage

• The steps to construct a categorical frequency distribution are
1 Make sure the variable is in nominal or ordinal scale of measurement
2 Make a table as shown below
Class    Tally    Frequency    Percent
3 Put the distinct values of the data set in the first column
4 Tally the data and place the result in the second column
5 Count the tallies and place the results in the third column
6 Find the percentage of values in each class by using the formula
$\% = \frac{f}{n} \times 100\%$
where f is the frequency and n is the total number of values

• Example: Twenty five army inductees were given a blood test to determine their blood type. The
data set is given as follows:
A B B AB O
O O B AB B
B B O A O
A O O O AB
AB A O B A
Construct a frequency distribution for the above data.

• Solution:
Blood Type   Tally        Frequency   Percentage
A            |||||        5           20%
B            ||||| ||     7           28%
AB           ||||         4           16%
O            ||||| ||||   9           36%
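The tallying and percentage steps for the blood type example can be checked with a short Python sketch. It uses `collections.Counter` from the standard library; the variable names are illustrative.

```python
from collections import Counter

# Blood types of the 25 army inductees, row by row as listed in the example.
blood_types = (
    "A B B AB O "
    "O O B AB B "
    "B B O A O "
    "A O O O AB "
    "AB A O B A"
).split()

counts = Counter(blood_types)   # frequency f of each class
n = len(blood_types)            # total number of values

# Percentage of each class: f / n * 100
table = {cls: (f, 100 * f / n) for cls, f in counts.items()}

print(f"{'Class':<6}{'Freq':>5}{'Percent':>9}")
for cls in ["A", "B", "AB", "O"]:
    f, pct = table[cls]
    print(f"{cls:<6}{f:>5}{pct:>8.0f}%")
```

The frequencies add up to n = 25 and the percentages to 100%, which is a quick consistency check for any categorical frequency distribution.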

• Exercise: A survey was taken on how much trust people place in the information they read on the Internet. Construct a categorical frequency distribution for the data. A = trust in everything they read, M = trust in most of what they read, H = trust in about one-half of what they read, S = trust in a small portion of what they read
M M M A H M S M H M
S M M M M A M M A M
M M H M M M H M H M
A M M M H M M M M M
• Ungrouped Frequency Distribution is the quantitative counterpart of the categorical frequency distribution
— It is used to present a quantitative variable with a small range and few distinct values

• Steps to construct an ungrouped frequency distribution
1 Identify distinct values and put them in the class column
2 Count the number of times each observation has occurred and put it under the frequency column
3 Calculate the relative frequency
$rf_i = \frac{f_i}{\sum_{j=1}^{k} f_j}, \quad i = 1, 2, \cdots, k$
Class    Frequency    Relative frequency    Cumulative frequency
• Cumulative frequency tells us how many observations are accumulated up to and including a particular distinct value
— Two varieties of cumulative frequencies—less than type and more than type

• For quantitative variable with small range and few distinct values (conventionally not more than 20
distinct values), ungrouped frequency distribution is appropriate
• Example: The data shown here represent the number of miles per gallon (mpg) that 30 selected
four-wheel-drive sports utility vehicles obtained in city driving. Construct a frequency distribution
12 17 12 14 16 18 16 18 12 16 17 15 15 16 12
15 12 15 15 19 13 16 18 16 14 15 16 16 12 14
• Solution: There are eight distinct values
mpg Frequency Relative frequency Cumulative frequency
12 6 0.20 6
13 1 0.03 7
14 3 0.10 10
15 6 0.20 16
16 8 0.27 24
17 2 0.07 26
18 3 0.10 29
19 1 0.03 30
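The table above can be reproduced with a few lines of Python. This sketch rounds relative frequencies to two decimals, as the table does, and builds the less than type cumulative frequency with `itertools.accumulate`; variable names are illustrative.

```python
from collections import Counter
from itertools import accumulate

# Miles per gallon for the 30 vehicles in the example.
mpg = [12, 17, 12, 14, 16, 18, 16, 18, 12, 16, 17, 15, 15, 16, 12,
       15, 12, 15, 15, 19, 13, 16, 18, 16, 14, 15, 16, 16, 12, 14]

counts = Counter(mpg)
values = sorted(counts)                    # distinct values (the classes)
freqs = [counts[v] for v in values]        # frequency of each class
n = sum(freqs)
rel = [round(f / n, 2) for f in freqs]     # relative frequency f_i / n
cum = list(accumulate(freqs))              # less than type cumulative frequency

print(f"{'mpg':<5}{'f':>4}{'rf':>7}{'cf':>5}")
for v, f, r, c in zip(values, freqs, rel, cum):
    print(f"{v:<5}{f:>4}{r:>7.2f}{c:>5}")
```

The last cumulative frequency equals the number of observations, here 30.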
• Exercise: The following data represent the number of hours of TV viewing per week (X) for 75
people
0 1 5 4 2 3 1 0 1 5 2 3 1 4 5
2 1 2 2 3 4 6 5 7 1 2 0 1 0 1
5 4 2 1 4 5 6 3 2 1 5 8 6 3 1
6 3 0 1 0 2 5 4 3 4 1 4 0 2 5
0 2 4 8 3 5 4 7 5 0 1 2 3 1 4
Answer the following questions:

1 What is the variable of interest in this question?
2 Construct ungrouped frequency distribution
3 How many people watch TV for at least 3 hours per week?
4 How many people watch TV for at most 4 hours per week?

• Grouped Frequency Distribution is used to tabulate a quantitative variable with a wide range
— Each class is meant to cover more than one distinct value, i.e., it is defined by class limits
• The major components are:
— Class limits
— Class boundaries
— Class mark
— Frequency
— Relative frequency
— Cumulative frequency
• The general steps can be illustrated by using the following dataset
57 61 57 57 58 57 61 54 68 56 61
51 49 64 50 48 65 52 56 46 52 69
54 49 51 47 55 55 54 42 51 64 46
56 55 51 54 51 60 62 43 55 54 47

1 Compute the range: $\text{Range} = \text{Max} - \text{Min} = 69 - 42 = 27$
2 Determine the number of classes (K)
— Convention: K must be between 5 and 20 classes
— Use Sturges' rule, $K = 1 + 3.32 \log_{10}(n)$
— For this data set, $K = 1 + 3.32 \log_{10}(44) = 6.5 \approx 7$
— Round K up every time you get a fractional number
— It is preferable but not absolutely necessary that the class width be an odd number
— Classes must be mutually exclusive
— Classes must be continuous
— Classes must be exhaustive
— Classes must be equal in width
3 Compute the class width,
$W = \frac{\text{Range}}{K} = \frac{27}{7} = 3.9 \approx 4$
Note: Round up W
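Steps 1 to 3 can be verified numerically. The sketch below applies the range, Sturges' rule and the class width formula to the 44 observations of the example, using `math.ceil` for the "round up" instruction; variable names are illustrative.

```python
import math

# The 44 observations used in the worked example.
data = [57, 61, 57, 57, 58, 57, 61, 54, 68, 56, 61,
        51, 49, 64, 50, 48, 65, 52, 56, 46, 52, 69,
        54, 49, 51, 47, 55, 55, 54, 42, 51, 64, 46,
        56, 55, 51, 54, 51, 60, 62, 43, 55, 54, 47]

n = len(data)
data_range = max(data) - min(data)             # Range = Max - Min
k = math.ceil(1 + 3.32 * math.log10(n))        # Sturges' rule, rounded up
width = math.ceil(data_range / k)              # W = Range / K, rounded up

print(f"n = {n}, Range = {data_range}, K = {k}, W = {width}")
```

Rounding both K and W up guarantees that K classes of width W cover the whole range of the data.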
4 Determine the lower class limit of the first class. Usually the minimum value is taken as the lower class limit, i.e., $LCL_1 = \text{Min}$
LCL1   LCL2   LCL3   LCL4   LCL5   LCL6   LCL7
42     46     50     54     58     62     66
$LCL_i = LCL_{i-1} + W, \quad i = 2, 3, \cdots, K$
5 Compute the upper class limit
$UCL_i = LCL_{i+1} - U$
where U is called the unit of measurement

• Unit of measurement is the absolute difference between one observation in the data set and the value that would come next
— Suppose the data set consists of integer values, then the unit of measurement is one
— Suppose the data set is 14.2, 12, 26. Then the unit of measurement is 14.3 − 14.2 = 0.1
UCL1   UCL2   UCL3   UCL4   UCL5   UCL6   UCL7
45     49     53     57     61     65     69
$UCL_{i+1} = UCL_i + W, \quad i = 1, 2, \cdots, K - 1$
• Note that
$W = |LCL_i - LCL_{i+1}| = |UCL_i - UCL_{i+1}| = UCL_i - LCL_i + U$
One is the maximum possible value of the unit of measurement

• Then the class limit becomes
Class limit
42 − 45
46 − 49
50 − 53
54 − 57
58 − 61
62 − 65
66 − 69
6 Compute the upper and lower class boundaries
$UCB_i = UCL_i + U/2, \quad LCB_i = LCL_i - U/2, \quad i = 1, 2, \cdots, K$
For instance,
$UCB_1 = UCL_1 + U/2 = 45 + 0.5 = 45.5$
• Then the class boundaries are
Class boundaries
41.5 − 45.5
45.5 − 49.5
49.5 − 53.5
53.5 − 57.5
57.5 − 61.5
61.5 − 65.5
65.5 − 69.5
• The boundaries are not part of the original data set

• The boundaries are not mutually exclusive, i.e., UCBi = LCBi+1
W = UCBi − LCBi, i = 1, 2, · · · , K

7 Compute the class mark or class midpoint

$$CM_i = \frac{LCL_i + UCL_i}{2} = \frac{UCB_i + LCB_i}{2}, \quad i = 1, 2, \dots, K$$
$$CM_{i+1} = CM_i + W$$
For instance,
$$CM_1 = \frac{42 + 45}{2} = \frac{41.5 + 45.5}{2} = 43.5$$
$$CM_2 = CM_1 + W = 43.5 + 4 = 47.5$$
Note that the class mark is the representative value of each class

8 Count the number of observations that belong to each class.

— How many observations in the data set belong to the 42 − 45 class? 2
9 Compute the relative frequency of each class, i.e., the class frequency divided by n
Class Limit Class Boundary Class Mark Frequency Relative Frequency

42 − 45 41.5 − 45.5 43.5 2 2/44 = 0.045
46 − 49 45.5 − 49.5 47.5 7 7/44 = 0.159
50 − 53 49.5 − 53.5 51.5 8 0.182

54 − 57 53.5 − 57.5 55.5 16 0.364
58 − 61 57.5 − 61.5 59.5 5 0.114
62 − 65 61.5 − 65.5 63.5 4 0.091
66 − 69 65.5 − 69.5 67.5 2 0.045

10 Compute the less than and more than type cumulative frequencies
— The lower class limit is used as the reference point for more than type cumulative frequencies. For instance, how many observations are greater than or equal to 42? 44
— The upper class limit is used as the reference point for less than type cumulative frequencies. For instance, how many observations are less than or equal to 45? 2
Class Limit Frequency Less than Cumulative Frequency More than Cumulative Frequency
42 − 45 2 2 44
46 − 49 7 9 42
50 − 53 8 17 35
54 − 57 16 33 27
58 − 61 5 38 11
62 − 65 4 42 6
66 − 69 2 44 2

Grouped Frequency Distribution
Class Limit Class Boundary Class Mark Frequency Relative Frequency lcf mcf
42 − 45 41.5 − 45.5 43.5 2 0.045 2 44
46 − 49 45.5 − 49.5 47.5 7 0.159 9 42
50 − 53 49.5 − 53.5 51.5 8 0.182 17 35
54 − 57 53.5 − 57.5 55.5 16 0.364 33 27
58 − 61 57.5 − 61.5 59.5 5 0.114 38 11
62 − 65 61.5 − 65.5 63.5 4 0.091 42 6
66 − 69 65.5 − 69.5 67.5 2 0.045 44 2
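The ten steps above can be sketched in Python. This is a minimal illustration, not a general-purpose routine: it assumes integer-valued data (unit of measurement U = 1), and the function name `grouped_frequency` and the dictionary keys are chosen here for illustration only.

```python
import math

# The 44 observations from the worked example.
data = [
    57, 61, 57, 57, 58, 57, 61, 54, 68, 56, 61,
    51, 49, 64, 50, 48, 65, 52, 56, 46, 52, 69,
    54, 49, 51, 47, 55, 55, 54, 42, 51, 64, 46,
    56, 55, 51, 54, 51, 60, 62, 43, 55, 54, 47,
]

def grouped_frequency(values, unit=1):
    """Build a grouped frequency distribution (assumes integer data when unit=1)."""
    n = len(values)
    data_range = max(values) - min(values)           # step 1: range
    k = math.ceil(1 + 3.32 * math.log10(n))          # step 2: Sturges' rule, rounded up
    width = math.ceil(data_range / k)                # step 3: class width, rounded up
    rows = []
    cumulative = 0
    for i in range(k):
        lcl = min(values) + i * width                # step 4: lower class limit
        ucl = lcl + width - unit                     # step 5: upper class limit
        freq = sum(lcl <= v <= ucl for v in values)  # step 8: class frequency
        cumulative += freq
        rows.append({
            "limits": (lcl, ucl),
            "boundaries": (lcl - unit / 2, ucl + unit / 2),  # step 6
            "mark": (lcl + ucl) / 2,                         # step 7
            "f": freq,
            "rf": round(freq / n, 3),                        # step 9
            "lcf": cumulative,                               # step 10: less than type
            "mcf": n - cumulative + freq,                    # step 10: more than type
        })
    return rows

table = grouped_frequency(data)
for row in table:
    print(row)
```

Each printed row corresponds to one line of the grouped frequency distribution table, so the output can be compared column by column with the table.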
Exercise: This data represent the record high temperatures in degrees Fahrenheit for each of
the 50 states. Construct a grouped frequency distribution for the data using 7 classes.
112 100 127 120 134 118 105 110 109 112 110 118 117 116 118 122
107 112 114 115 118 117 118 122 106 110 114 114 105 109 116 108
110 121 113 120 119 111 104 111 120 113 120 117 105 110 118 112
114 114
Methods of Data Presentation—Diagrammatic Presentation
“One picture is worth a thousand words”
~Fred R. Barnard (1846 – 1896)
• Although tables of numbers can be very informative, they can lack visual impact. A diagram, by contrast,
— delivers its message instantly
— summarizes the key features of the data and represents them as a picture

• Pie chart is a type of graph in which a circle is divided into sectors that each represent a proportion
of the whole
— It is best used to present the proportions of a sample
— It is most useful where one or two results dominate the findings
— It can represent data summary as actual numbers or percentages
— Do not use when there are a large number of categories
• Example: Consider the following data on the employment category of 474 employees of a company.
Category Frequency Percent
Clerical 363 76.6
Custodial 27 5.7
Manager 84 17.7
Total 474 100
[Figure: Pie chart with slices for Clerical, Custodial and Manager]

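• The angle (in degrees) of each slice is computed as:

$$\text{Degree}_i = \frac{f_i}{\sum_{j=1}^{n} f_j} \times 360^\circ$$
For instance,
$$\text{Clerical} = \frac{363}{474} \times 360^\circ \approx 276^\circ$$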
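The slice angles for the whole table can be computed in one pass. The sketch below uses the employment frequencies from the example; the variable names are illustrative.

```python
# Frequencies from the employment category example.
counts = {"Clerical": 363, "Custodial": 27, "Manager": 84}

total = sum(counts.values())

# Each slice gets the share f_i / total of the full 360-degree circle.
angles = {category: 360 * f / total for category, f in counts.items()}

for category, angle in angles.items():
    print(f"{category}: {angle:.1f} degrees")
```

The angles always add up to 360 degrees, which is a quick check that no category was left out.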
• Pictogram: The pictorial representation of an event, concept, object, or place using symbols and/or
illustrations
• Pictograms are set out in the same way as bar charts, but instead of bars they use columns of
pictures to show the numbers involved.

• Example: Tourism data of Ethiopia
Year 2015 2016 2017 2018 2019
Tourist number 864000 871000 933000 849000 812000
https://data.worldbank.org/indicator/ST.INT.ARVL?end=2019&locations=ET&start=2015
• Keep in mind that one tourist icon represents 50,000 tourists

• Bar Graph: A graph with rectangular bars. The length or height of each bar is proportional to the value it represents
— Simple bar graph: This type of graph is appropriate to represent one variable
— Cluster bar graph: Used to present two categorical variables. The bars are placed adjacent to
each other in each category of the variable on X axis
— Stacked bar graph: It is used to present two categorical variables. The graph looks like simple
bar but partitioned into components of the second variable
• Example: Consider the following data presented in the table below. Draw simple, cluster, and
stacked bar graphs
Employment Category
Gender Clerical Custodial Manager Total
Male 157 27 74 258
Female 206 0 10 216
Total 363 27 84 474

[Figure: Simple bar graph examples. One panel by employment category (Clerical, Custodial, Manager) and one by gender (Male, Female)]

[Figure: Cluster bar graph examples. Bars for Male and Female within each employment category, and bars for each employment category within each gender]

[Figure: Stacked bar graph examples. One panel by employment category partitioned by gender, and one by gender partitioned by employment category]

• Line Graph: This graph uses line segments to connect adjacent points. When the X axis represents time (such as seconds, minutes, weeks, quarters or years) and the other axis shows numerical values, the line graph is called a time series plot
• Example: Consider the following data and draw the line graph
Education in years Average salary
8 13064.15
12 13241.87
14 15625.00
15 15610.60
16 22338.47
17 26904.55
18 32240.00
19 34764.07
20 36240.00
21 37500.00
[Figure: Line graph of average salary (Y axis) against education in years (X axis)]
• Frequency Polygon: It is a line graph where the X axis is the class mark and the Y axis is either
frequency or relative frequency
— The graph touches the X axis at the beginning and at the end
• Example: Consider the following grouped frequency distribution and draw frequency polygon.
Class Limit Class Boundary Class Mark Frequency Relative Frequency

42 − 45 41.5 − 45.5 43.5 2 0.045
46 − 49 45.5 − 49.5 47.5 7 0.159
50 − 53 49.5 − 53.5 51.5 8 0.182
54 − 57 53.5 − 57.5 55.5 16 0.364
58 − 61 57.5 − 61.5 59.5 5 0.114
62 − 65 61.5 − 65.5 63.5 4 0.091
66 − 69 65.5 − 69.5 67.5 2 0.045

[Figure: Frequency polygons of frequency and of relative frequency (Y axis) against class mark (X axis), from 39.5 to 71.5]
• Ogive (oh-jive), sometimes called a cumulative frequency polygon, is a line graph in which the X axis represents the class boundaries and the Y axis represents either the less than type or the more than type cumulative frequency
• Example: Consider the previous grouped frequency distribution and draw Ogive
Class Limit Class Boundary Class Mark Frequency Relative Frequency lcf mcf
42 − 45 41.5 − 45.5 43.5 2 0.045 2 44
46 − 49 45.5 − 49.5 47.5 7 0.159 9 42
50 − 53 49.5 − 53.5 51.5 8 0.182 17 35
54 − 57 53.5 − 57.5 55.5 16 0.364 33 27
58 − 61 57.5 − 61.5 59.5 5 0.114 38 11
62 − 65 61.5 − 65.5 63.5 4 0.091 42 6
66 − 69 65.5 − 69.5 67.5 2 0.045 44 2
Table: Grouped frequency distribution

[Figure: Less than type and more than type ogives. Cumulative frequency (Y axis) against class boundary (X axis), from 41.5 to 69.5]
• Histogram: It is a special type of bar graph in which there are no gaps between adjacent bars
— The X axis is the class boundary and the Y axis is either frequency or relative frequency
• Example: Consider the grouped frequency distribution that was used for the ogive example and draw a histogram.
[Figure: Histogram of frequency (Y axis) against class boundary (X axis), from 41.5 to 69.5]

• Stem and Leaf Plot: It is a graph that uses the digits of each number to show the shape of the data; each data value is broken down into a stem (the leading digit or digits) and a leaf (the last digit)
• Example: Use a stem and leaf plot to display the following test scores from a probability and statistics course
77 34 45 49 67 87 45 44 98 89
55 65 67 87 99 66 77 45 84 40
50 69 80 68 55 65 43 68 66 87
Stem Leaf
3 4
4 0345559
5 055
6 556677889
7 77
8 047779
9 89
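A stem and leaf plot for two-digit scores can be built by splitting each value into its tens digit (stem) and units digit (leaf). The following is a minimal sketch under that assumption; the function name `stem_and_leaf` is illustrative.

```python
from collections import defaultdict

# Test scores from the example.
scores = [
    77, 34, 45, 49, 67, 87, 45, 44, 98, 89,
    55, 65, 67, 87, 99, 66, 77, 45, 84, 40,
    50, 69, 80, 68, 55, 65, 43, 68, 66, 87,
]

def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted leaves (units digits) as a string."""
    stems = defaultdict(list)
    for v in sorted(values):           # sorting first keeps the leaves in order
        stems[v // 10].append(v % 10)
    return {stem: "".join(str(leaf) for leaf in leaves)
            for stem, leaves in sorted(stems.items())}

plot = stem_and_leaf(scores)
print("Stem Leaf")
for stem, leaves in plot.items():
    print(stem, leaves)
```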

Chapter Two
Summarizing Data
Outline
1 Measures of central tendency: objectives of measuring central tendency

2 Types of measures of central tendency
— Mean (arithmetic, weighted, geometric and harmonic)
— Mode
— Median
3 Measures of Location: quantiles (quartiles, deciles and percentiles)
4 Measures of dispersion
— Range
— Variance
— Standard deviation
— Coefficient of variation
5 Standard scores

Measures of Central Tendency—Introduction
• Complex data analysis in statistics begins with knowing the data, i.e., describing the data
• When exploring data, we focus on the following five issues
— Center: Finding a single value that represents the center of the data, e.g., mean, median, mode, harmonic mean, geometric mean
— Variation: Measuring how scattered the data are, e.g., variance, coefficient of variation, quartile deviation
— Distribution: Determining the shape of the data, e.g., skewness and kurtosis
— Outliers: Detecting the presence of wild observations in the data, e.g., with the 1.5 IQR rule
— Time: Population characteristics change over time
Objectives of Measures of Central Tendency

• To summarize a set of data by a single value
• To facilitate comparison among different data sets
• To use for further statistical analysis or manipulation. For instance, the simple arithmetic mean is needed to calculate the variance
Parameter and Statistic
• A statistic is a characteristic or measure obtained by using the data values from a sample. Statistics are denoted by capital letters such as X̄ and S²
• A parameter is a characteristic or measure obtained by using the data values from the population. Parameters are denoted by Greek letters such as µ and σ²
• Variables are denoted by capital letters. Suppose X is a variable having n observations, i.e., x1, x2, · · · , xn
• Let xi represent the ith observation of variable X; i is called the index or subscript. The sum of all observations of variable X is denoted by the Greek capital letter sigma (Σ)
$$x_1 + x_2 + \cdots + x_n = \sum_{i=1}^{n} x_i$$
• Note: When no confusion can result, we often denote the sum of all observations of X simply by
$$\sum x = \sum x_i = \sum_{i} x_i$$
Basic properties of summation

• Suppose X and Y are variables with n observations each
$$\begin{aligned}
\sum_{i=1}^{n} (x_i + y_i) &= (x_1 + y_1) + (x_2 + y_2) + \cdots + (x_n + y_n) \\
&= x_1 + x_2 + \cdots + x_n + y_1 + y_2 + \cdots + y_n \\
&= \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i
\end{aligned}$$
Likewise, $\sum_{i=1}^{n} (x_i - y_i) = \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} y_i$
• $\left(\sum_{i=1}^{n} x_i\right)^2 = (x_1 + x_2 + \cdots + x_n)^2$
• $\sum_{i=1}^{n} x_i y_i = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$
• Suppose X has n observations and let α be a non-zero arbitrary constant number, then
$$\begin{aligned}
\sum_{i=1}^{n} \alpha x_i &= \alpha x_1 + \alpha x_2 + \cdots + \alpha x_n \\
&= \alpha (x_1 + x_2 + \cdots + x_n) \\
&= \alpha \sum_{i=1}^{n} x_i
\end{aligned}$$
• If α is a non-zero constant, then
$$\sum_{i=1}^{n} \alpha = \alpha + \alpha + \cdots + \alpha = n\alpha$$
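• Suppose X is a variable with n observations and α and β are constant numbers
$$\begin{aligned}
\sum_{i=1}^{n} (\alpha x_i + \beta) &= (\alpha x_1 + \beta) + (\alpha x_2 + \beta) + \cdots + (\alpha x_n + \beta) \\
&= \alpha x_1 + \alpha x_2 + \cdots + \alpha x_n + \beta + \beta + \cdots + \beta \\
&= \alpha (x_1 + x_2 + \cdots + x_n) + \beta + \beta + \cdots + \beta \\
&= \alpha \sum_{i=1}^{n} x_i + n\beta
\end{aligned}$$
Likewise, $\sum_{i=1}^{n} (\alpha x_i - \beta) = \alpha \sum_{i=1}^{n} x_i - n\beta$
• $\sum_{i=1}^{n} x_i^2 = x_1^2 + x_2^2 + \cdots + x_n^2$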
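The summation properties can be checked numerically on a small data set. The observations and constants below are hypothetical values chosen only for illustration; they do not come from the course material.

```python
# Hypothetical observations and constants, chosen only for illustration.
x = [2, 5, 7]
y = [1, 4, 6]
alpha, beta = 3, 10
n = len(x)

# The sum of a sum (or difference) splits into separate sums.
assert sum(xi + yi for xi, yi in zip(x, y)) == sum(x) + sum(y)
assert sum(xi - yi for xi, yi in zip(x, y)) == sum(x) - sum(y)

# A constant factor can be pulled out; adding a constant n times gives n * alpha.
assert sum(alpha * xi for xi in x) == alpha * sum(x)
assert sum(alpha for _ in range(n)) == n * alpha
assert sum(alpha * xi + beta for xi in x) == alpha * sum(x) + n * beta

# The square of a sum is generally not the same as the sum of squares.
square_of_sum = sum(x) ** 2                 # (2 + 5 + 7)^2
sum_of_squares = sum(xi ** 2 for xi in x)   # 2^2 + 5^2 + 7^2
print(square_of_sum, sum_of_squares)
```

A numerical check on one data set is not a proof, but it helps catch a misremembered rule, such as confusing the square of a sum with the sum of squares.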

Characteristics of a typical average and measure of dispersion

• It should be easy to calculate and understand
• It should be rigidly defined. In other words, it should have one and only one interpretation, so that the personal bias of the investigator does not affect its value or usefulness
• It should be representative of the data under consideration
• It should have sampling stability. i.e., it should not be affected by sampling fluctuations.
• It should not be affected by extreme values
• It should be amenable for further algebraic manipulation
• Note: If a particular measure of central tendency or dispersion has failed to show some of these
characteristics, the failure will be considered as a disadvantage

Measures of Central Tendency—Types of MCT
• The simple arithmetic mean is defined as the sum of all observations divided by the total number of observations.
• The simple arithmetic mean is the most familiar measure of central tendency (MCT). It is the first MCT that comes to mind when we think of an average
• The simple arithmetic mean computed from a sample is denoted by a bar over the variable
— If the variable is denoted by X, then the mean is denoted by X̄ (pronounced X-bar)
$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{1}{n} (x_1 + x_2 + \cdots + x_n)$$
where n is the total number of observations in the sample for X
• The simple arithmetic mean computed from the population is denoted by the Greek letter mu (µ)
$$\mu = \frac{1}{N} \sum_{i=1}^{N} x_i = \frac{1}{N} (x_1 + x_2 + \cdots + x_N)$$
where N is the total number of observations in the population. Note that X̄ is a sample statistic and µ is a population parameter
• Example: The numbers of tourists who visited Ethiopia from 2009 to 2014 were 427000, 468000, 523000, 597000, 681000 and 770000, respectively. Compute the average number of tourists per year.
• Solution: Let X be the number of tourists, then the average number of tourists is
$$\begin{aligned}
\bar{X} &= \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{1}{6} (x_1 + x_2 + x_3 + x_4 + x_5 + x_6) \\
&= \frac{1}{6} (427000 + 468000 + 523000 + 597000 + 681000 + 770000) \\
&= \frac{1}{6} (3466000) \\
&\approx 577667
\end{aligned}$$
• Interpretation: On average, 577,667 tourists visited Ethiopia per year from 2009 to 2014.
• Example: The grades of a student on six examinations were 84, 91, 72, 68, 87 and 78. Find the arithmetic mean of the grades.
• Solution: Suppose Y represents the student's grade
$$\begin{aligned}
\bar{Y} &= \frac{1}{n} \sum_{i=1}^{n} y_i = \frac{1}{6} (y_1 + y_2 + \cdots + y_6) \\
&= \frac{1}{6} (84 + 91 + 72 + 68 + 87 + 78) \\
&= \frac{1}{6} (480) \\
&= 80
\end{aligned}$$
Interpretation: The student's average grade over the six examinations is 80.
• Exercise: The data represent the number of days off per year for a sample of individuals selected from nine different countries. Find the mean.
20, 26, 40, 36, 23, 42, 35, 24, 30
• Suppose the data has k distinct values, where x1 occurs f1 times, x2 occurs f2 times, ..., xk occurs fk times, as in the table below
Distinct values (X) Frequency (f)
x1 f1
x2 f2
⋮ ⋮
xi fi
⋮ ⋮
xk fk
• Then, the simple arithmetic mean is computed as follows:
$$\bar{X} = \frac{1}{f_1 + f_2 + \cdots + f_k} (f_1 x_1 + f_2 x_2 + \cdots + f_k x_k) = \frac{1}{\sum_{i=1}^{k} f_i} \left( \sum_{i=1}^{k} f_i x_i \right)$$
• Example: The number of cylinders of 32 automobiles has been recorded and the following table has been constructed
# of cylinders # of automobiles
4 11
6 7
8 14
Compute the average number of cylinders for the 32 automobiles
• Solution: Let X represent the number of cylinders in a car, so k = 3
$$\begin{aligned}
\bar{X} &= \frac{1}{\sum_{i=1}^{k} f_i} \left( \sum_{i=1}^{k} f_i x_i \right) = \frac{f_1 x_1 + f_2 x_2 + f_3 x_3}{f_1 + f_2 + f_3} \\
&= \frac{11(4) + 7(6) + 14(8)}{11 + 7 + 14} \\
&= 6.2 \approx 6
\end{aligned}$$
Interpretation: The average number of cylinders per car is about 6.

• Mean for Grouped Data: The class mark (class midpoint) is the representative of its class. So, the class mark is taken as the data value (xi)
• Example: Consider the following grouped frequency distribution and compute the arithmetic mean
Class limit Class mark (xi) Frequency (fi)
5 − 9 7 3
10 − 14 12 4
15 − 19 17 1
20 − 24 22 3
• Solution: The simple arithmetic mean is
$$\begin{aligned}
\bar{X} &= \frac{1}{\sum_{i=1}^{k} f_i} \left( \sum_{i=1}^{k} f_i x_i \right) = \frac{f_1 x_1 + f_2 x_2 + f_3 x_3 + f_4 x_4}{f_1 + f_2 + f_3 + f_4} \\
&= \frac{3(7) + 4(12) + 1(17) + 3(22)}{3 + 4 + 1 + 3} \\
&= 13.8
\end{aligned}$$
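The grouped-data mean can be sketched in Python by first deriving the class marks from the class limits and then applying the frequency-weighted formula. The function name `grouped_mean` is an illustrative choice.

```python
def grouped_mean(class_limits, freqs):
    """Mean of grouped data: class marks weighted by class frequencies."""
    # The class mark is the midpoint of the lower and upper class limits.
    marks = [(lower + upper) / 2 for lower, upper in class_limits]
    return sum(f * x for x, f in zip(marks, freqs)) / sum(freqs)

# Class limits and frequencies from the example.
limits = [(5, 9), (10, 14), (15, 19), (20, 24)]
freqs = [3, 4, 1, 3]

mean_value = grouped_mean(limits, freqs)
print(round(mean_value, 1))
```

Because every observation in a class is replaced by the class mark, the result is an approximation of the mean of the raw data.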
• Exercise: The hourly compensation costs (in U.S. dollars) for production workers in selected
countries are represented below. Compute the simple arithmetic mean
Class frequency
02.48 − 07.48 7
07.49 − 12.49 3
12.50 − 17.50 1
17.51 − 22.51 7
22.52 − 27.52 5
27.53 − 32.53 5
• Exercise: Consider the grouped frequency distribution (Table 1) in Chapter 1

1 Compute the mean of raw data and
2 Compute the mean of the grouped data
3 Compare the values in 1 and 2. Is there a difference? Why?

Properties of Simple Arithmetic Mean
• The algebraic sum of the deviations of a set of numbers from their arithmetic mean is zero, i.e.,
$$\sum_{i=1}^{n} (x_i - \bar{X}) = 0$$
• The sum of the squared deviations of a set of numbers xi from their mean is always the least, i.e.,
$$\sum_{i=1}^{n} (x_i - \bar{X})^2 < \sum_{i=1}^{n} (x_i - A)^2, \quad A \neq \bar{X}$$
• Suppose the data is partitioned into k groups, where the first group has n1 observations with mean X̄1, the second group has n2 observations with mean X̄2, ..., and the kth group has nk observations with mean X̄k. Then the combined (pooled) mean is computed as
$$\bar{X}_c = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2 + \cdots + n_k \bar{X}_k}{n_1 + n_2 + \cdots + n_k} = \frac{1}{\sum_{i=1}^{k} n_i} \left( \sum_{i=1}^{k} n_i \bar{X}_i \right)$$
• Example: The average age of 45 female students in a class is 25 years and the average age of 35 male students is 30 years. What is the average age of the class?
• Solution:
$$\bar{X}_c = \frac{n_f \bar{X}_f + n_m \bar{X}_m}{n_f + n_m} = \frac{45(25) + 35(30)}{45 + 35} = 27.2 \text{ years}$$
• Example: A company pays different hourly wages based on the experience of its employees. If 60 employees are paid 15.23 Birr per hour and 23 employees are paid 25 Birr per hour, what is the average hourly payment in this company?
• Solution: Let wage be denoted by X, with n1 = 60, X̄1 = 15.23, n2 = 23, X̄2 = 25
$$\bar{X}_c = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2}{n_1 + n_2} = \frac{60(15.23) + 23(25)}{60 + 23} = 17.94 \text{ Birr}$$
• Exercise: Four groups of students, consisting of 15, 20, 10, and 18 individuals, reported mean
weights of 162, 148, 153, and 140 pounds (lb), respectively. Find the mean weight of all the students.

• Example: Suppose the mean of variable X is 23. A new variable Y has been created as yi = xi + 3, i = 1, 2, · · · , n. Show that the mean of Y is 26
• Solution: By definition, the mean of Y is
$$\begin{aligned}
\bar{Y} &= \frac{1}{n} \sum_{i=1}^{n} y_i = \frac{1}{n} \sum_{i=1}^{n} (x_i + 3) \\
&= \frac{1}{n} \left[ \sum_{i=1}^{n} x_i + 3n \right] = \frac{1}{n} \sum_{i=1}^{n} x_i + \frac{3n}{n} \\
&= \bar{X} + 3 = 23 + 3 \\
&= 26
\end{aligned}$$
• Exercise: Suppose the mean of X is X̄ . Show that the mean of Z = aX + b is aX̄ + b where a and
b are non-zero constant numbers

• When the observations of variable X do not all have equal importance, we assign a weight to each observation
• Suppose variable X has n observations, where x1 has a weight of w1, x2 has a weight of w2, ..., and xn has a weight of wn. Then the weighted mean of X is
$$\bar{X}_w = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n} = \frac{1}{\sum_{i=1}^{n} w_i} \left( \sum_{i=1}^{n} w_i x_i \right)$$
• When all observations have equal weights, the simple arithmetic mean and the weighted mean are the same, i.e., if w1 = w2 = · · · = wn, then X̄w = X̄
— The simple arithmetic mean is a special case of the weighted arithmetic mean
• Example: A teacher attaches a weight of 2 to homework (HM), 3 to the midterm exam (MT) and 5 to the final exam (FE). If a student scores 90, 50 and 60 for HM, MT and FE, respectively, what is his/her average academic performance?
• Solution:
$$\begin{aligned}
\bar{X}_w &= \frac{1}{\sum_{i=1}^{n} w_i} \left( \sum_{i=1}^{n} w_i x_i \right) = \frac{w_1 x_1 + w_2 x_2 + w_3 x_3}{w_1 + w_2 + w_3} \\
&= \frac{2(90) + 3(50) + 5(60)}{2 + 3 + 5} \\
&= 63
\end{aligned}$$
Interpretation: The weighted average performance of the student is 63.
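The weighted mean formula translates directly into code. The following is a small sketch; the function name `weighted_mean` is illustrative.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

scores = [90, 50, 60]   # homework, midterm exam, final exam
weights = [2, 3, 5]

performance = weighted_mean(scores, weights)
print(performance)

# With equal weights the weighted mean reduces to the simple arithmetic mean.
equal_weight_mean = weighted_mean(scores, [1, 1, 1])
print(equal_weight_mean)
```

The same function also gives a combined (pooled) mean when the group means are passed as values and the group sizes as weights, for example `weighted_mean([25, 30], [45, 35])` for the class age example.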
• Exercise: Mr. Tazu has the following grade in the first semester. Find his semester grade
Course Code CrHr Grade
Stat3022 3 A
Econ4032 4 B
Mgmt3033 3 C
• The geometric mean of a set of n observations of variable X is
$$G = \sqrt[n]{x_1 x_2 \cdots x_n} = \left( \prod_{i=1}^{n} x_i \right)^{1/n}$$
• The geometric mean is often used in business and economics to find average rates of change, average rates of growth, or average ratios
• Example: Suppose a variable X has the following three observations. Find the geometric mean
5, 10, 6
Solution:
$$G = \sqrt[3]{x_1 \cdot x_2 \cdot x_3} = (5 \times 10 \times 6)^{1/3} = 6.69$$
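• Example: The price of a commodity increased by 5%, 8% and 77% in three consecutive years. What was the average yearly price increase?
• Solution: Let y0 be the price in the original year.
$$y_1 = y_0 + 0.05 y_0 = 1.05 y_0 \quad \Rightarrow \quad \frac{y_1}{y_0} = 1.05$$
$$y_2 = y_1 + 0.08 y_1 = 1.08 y_1 \quad \Rightarrow \quad \frac{y_2}{y_1} = 1.08$$
$$y_3 = y_2 + 0.77 y_2 = 1.77 y_2 \quad \Rightarrow \quad \frac{y_3}{y_2} = 1.77$$
The geometric mean of the three growth factors y1/y0, y2/y1 and y3/y2 is
$$G = \sqrt[3]{\frac{y_1}{y_0} \times \frac{y_2}{y_1} \times \frac{y_3}{y_2}} = \sqrt[3]{1.05 \times 1.08 \times 1.77} = 1.26$$
That is, the price increased by about 26% per year on average.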
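The average growth rate can be computed as the geometric mean of the yearly growth factors minus one. The sketch below assumes the rates are given as decimal fractions; the function name `average_growth_rate` is illustrative.

```python
import math

def average_growth_rate(rates):
    """Geometric-mean growth rate for a list of period rates such as 0.05 for 5%."""
    factors = [1 + r for r in rates]                 # growth factor of each period
    g = math.prod(factors) ** (1 / len(factors))     # geometric mean of the factors
    return g - 1                                     # back to a rate

rate = average_growth_rate([0.05, 0.08, 0.77])
print(f"{rate:.2%}")
```

Averaging the three percentages arithmetically would give 30%, which overstates the growth: compounding 30% for three years yields a larger total increase than the one actually observed.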

• Exercise: Suppose you have an investment which earns 10% the first year, 50% the second year, and
30% the third year. What is its average rate of return?
• The harmonic mean of variable X with n observations is the reciprocal of the arithmetic mean of the reciprocals of the observations
$$H = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$
• Example: Find the simple arithmetic mean, geometric mean and harmonic mean for the following data: 2, 4, 8, 6
$$H = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3} + \frac{1}{x_4}} = \frac{4}{\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{6}} = \frac{4}{25/24} = 3.84$$
X̄ = 5 and G = 4.43
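The three means for this data set can be computed with Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later). This is a quick check rather than part of the original solution.

```python
from statistics import fmean, geometric_mean, harmonic_mean

data = [2, 4, 8, 6]

h = harmonic_mean(data)    # n divided by the sum of reciprocals
g = geometric_mean(data)   # nth root of the product
m = fmean(data)            # simple arithmetic mean

print(f"H = {h:.2f}, G = {g:.2f}, mean = {m:.2f}")

# For positive data the harmonic mean never exceeds the geometric mean,
# which never exceeds the arithmetic mean.
assert h <= g <= m
```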

• Exercise: Four students drive from Jimma to Addis Ababa at a speed of 40 km/hr. Since they need
to reach statistics class on time, they return at a speed of 60 km/hr. What is their average speed for
the round trip?
• The geometric mean of x1 , x2 , · · · , xn is less than or equal to their arithmetic mean but is greater
than or equal to their harmonic mean. In symbols,
H ≤ G ≤ X̄
The equality signs hold true when all the numbers x1 , x2 , · · · , xn are identical. i.e.,
x1 = x2 = · · · = xn
• Exercise: Show that H = G = X̄ if x1 = x2 = · · · = xn
• Mode is the most frequently observed value of a variable. i.e., mode is the value of X with the
highest frequency
— Mode may not exist; even if it does exist, it may not be unique
— Mode is denoted by hat over the head of a variable like X̂

• Example: The set 2, 2, 5, 7, 8, 9, 9, 9, 10, 10, and 11 has mode 9. The set 3, 5, 8, 10, 12, 15, and
16 has no mode. The set 2, 3, 4, 4, 4, 5, 5, 7, 7, 7, and 9 has two modes, 4 and 7
• A distribution having only one mode is called unimodal and two modes is called bimodal
• Exercise: Consider the following ungrouped frequency distributions. What is/are their mode(s)?
Age f Weight f Salary f

15 2 25 6 1500 78
20 10 40 25 2500 45
25 5 56 5 3500 78
30 2 74 22 10000 32
• Mode for Grouped Data: The mode can be computed as:
$$\hat{X} = LCB_m + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) W$$
where LCBm is the lower class boundary of the modal class, W is the class width, fmodal is the frequency of the modal class, ∆1 = fmodal − fmodal−1 and ∆2 = fmodal − fmodal+1
• The modal class is the class with the highest frequency. The modal class contains the mode
• Example: Consider the following grouped frequency distribution and compute the mode
Class limit Class boundary f
6 − 11 5.5 − 11.5 2
12 − 17 11.5 − 17.5 2
18 − 23 17.5 − 23.5 7
24 − 29 23.5 − 29.5 4
30 − 35 29.5 − 35.5 3
36 − 41 35.5 − 41.5 2
Solution: The modal class is 18 − 23, so
∆1 = 7 − 2 = 5
∆2 = 7 − 4 = 3
W = 23.5 − 17.5 = 6
LCBm = 17.5
$$\hat{X} = LCB_m + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) W = 17.5 + \left( \frac{5}{5 + 3} \right) 6 = 21.25$$
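The grouped mode formula can be sketched as a short function. Two assumptions are made here and are not stated in the slides: the class frequencies are taken as 2, 2, 7, 4, 3, 2 (consistent with ∆1 = 5 and ∆2 = 3 in the solution), and a missing neighbouring class is treated as having frequency zero. The function name `grouped_mode` is illustrative.

```python
def grouped_mode(lower_boundaries, freqs, width):
    """Mode of grouped data: LCB_m + d1 / (d1 + d2) * W for the modal class."""
    # Index of the modal class (first class with the highest frequency).
    m = max(range(len(freqs)), key=freqs.__getitem__)
    # Assumption: a neighbouring class that does not exist has frequency 0.
    before = freqs[m - 1] if m > 0 else 0
    after = freqs[m + 1] if m < len(freqs) - 1 else 0
    d1 = freqs[m] - before
    d2 = freqs[m] - after
    return lower_boundaries[m] + d1 / (d1 + d2) * width

lower_boundaries = [5.5, 11.5, 17.5, 23.5, 29.5, 35.5]
freqs = [2, 2, 7, 4, 3, 2]   # assumed frequencies, see the note above

mode_value = grouped_mode(lower_boundaries, freqs, width=6)
print(mode_value)
```

The sketch handles a single modal class only; when two classes share the highest frequency it reports the first one.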
• Exercise: Compute the mode of the following grouped frequency distributions
class limit f class limit f class limit f

10 − 14 5 10 − 14 12 10 − 14 12
15 − 19 12 15 − 19 10 15 − 19 9
20 − 24 6 20 − 24 6 20 − 24 6
25 − 29 10 25 − 29 10 25 − 29 12
Median
The median of a data set is the measure of center that is the middle value when the original data
values are arranged in order of increasing (or decreasing) magnitude.
• The median is denoted by tilde over the head of the variable (like X̃ )

• Median for Raw Data: The steps are

1 Sort the data in either ascending or descending order
2 Determine whether n is even or odd
— If n is odd, then the median is
$$\tilde{X} = x_{\left(\frac{n+1}{2}\right)}, \text{ i.e., the } \left(\frac{n+1}{2}\right)^{\text{th}} \text{ observation}$$
— If n is even, then the median is
$$\tilde{X} = \frac{x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)}}{2}$$
i.e., the average of the middle two observations
• Example: The median of 2, 5, 3 , 6, 9, 7, 8 is 6. The median of 2, 5, 0, 5, 6, 9 is 5
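The two cases (odd and even n) can be written as a short function. This is a minimal sketch; Python's built-in `statistics.median` follows the same rule.

```python
def median(values):
    """Median of raw data: middle value, or average of the two middle values."""
    ordered = sorted(values)          # step 1: sort the data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd n: the ((n + 1) / 2)th observation
        return ordered[mid]
    # even n: average of the (n / 2)th and (n / 2 + 1)th observations
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([2, 5, 3, 6, 9, 7, 8]))   # odd number of observations
print(median([2, 5, 0, 5, 6, 9]))      # even number of observations
```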

• Exercise: What is the median of the following data sets

salary f lcf Hrs f lcf Grade f lcf
2500 10 10 2 2 2 2.53 3 3
2800 5 15 3 6 8 2.75 25 28
3200 6 21 4 9 17 3.25 30 58
3600 3 24 5 4 21 3.52 45 103
• Median for Grouped Data: The median can be obtained by using the following formula
$$\tilde{X} = LCB_{med} + \left( \frac{\frac{n}{2} - C}{f_{med}} \right) W$$
where LCBmed is the lower class boundary of the median class, C is the less than cumulative frequency (lcf) of the class preceding the median class, fmed is the frequency of the median class and W is the class width
• Example: Find the median for the following grouped data

Class limit Class boundary f lcf
06 − 11 5.5 − 11.5 2 2
12 − 17 11.5 − 17.5 2 4
18 − 23 17.5 − 23.5 7 11
24 − 29 23.5 − 29.5 4 15
30 − 35 29.5 − 35.5 3 18
36 − 41 35.5 − 41.5 2 20
n = 20, n/2 = 10, so the median class is 18 − 23, with C = 4, fmed = 7, W = 6 and LCBmed = 17.5
$$\tilde{X} = LCB_{med} + \left( \frac{\frac{n}{2} - C}{f_{med}} \right) W = 17.5 + \left( \frac{10 - 4}{7} \right) 6 = 22.6$$
• Exercise: Compute the median for each of the following grouped data sets
Grade f lcf Weight (Kg) f lcf
10 − 19 5 5 40 − 49 6 6
20 − 29 2 7 50 − 59 5 11
30 − 39 9 16 60 − 69 9 20
40 − 49 0 16 70 − 79 25 45
50 − 59 10 26 80 − 89 9 54
60 − 69 8 34 90 − 99 3 57
• Empirical Relationships between Mean, Mode and Median

— For unimodal frequency curves that are moderately skewed (asymmetrical), we have the
following empirical relation
X̄ − X̂ = 3(X̄ − X̃)
— https://flexbooks.ck12.org/cbook/ck-12-cbse-math-class-10/section/14.4/primary/lesson/mode-of-grouped-data
• For negatively skewed distribution, the median and the mode would be to the right of the mean. i.e.,
X̄ < X̃ < X̂
• In a positively skewed frequency distribution, the median and mode would be to the left of the mean.
i.e.,
X̄ > X̃ > X̂
• For symmetric distribution, mean, median and mode are equal
X̄ = X̃ = X̂

Measures of Location
Percentile
Percentiles are measures of location that divide a data set into 100 equal parts. They are denoted by Pi, i = 1, 2, · · · , 99
• Percentile for Raw Data: The steps are

1 Sort the data in ascending order, i.e., from minimum to maximum. Consider the following example to illustrate the steps.
2, 5, 6, 8, 9, 10, 12, 13, 15, 18
2 Choose i and compute the position k of Pi. For instance, for P50 (the 50th percentile), i = 50
$$k = \frac{i(n + 1)}{100} = \frac{50(10 + 1)}{100} = 5.5$$
3 Split k into its integer part (I) and fractional part (f),
k = 5 + 0.5, so I = 5, f = 0.5 and I + 1 = 6
4 Find the Ith and (I + 1)th observations in the sorted data set and interpolate between them
Pi = Ith Obs + [(I + 1)th Obs − Ith Obs] f
P50 = 9 + (10 − 9)0.5 = 9.5
• Example: Find P25, P30, P46 and P85 for the following data
5, 6, 7, 9, 3, 4, 8, 8, 5, 6, 10, 23, 45, 14, 22, 36
• Solution: The sorted data set is
3, 4, 5, 5, 6, 6, 7, 8, 8, 9, 10, 14, 22, 23, 36, 45
— For P25, i = 25, n = 16
$$k = \frac{25(16 + 1)}{100} = 4.25, \quad I = 4, \quad f = 0.25$$
The 4th and 5th observations are 5 and 6, respectively. Then
P25 = 5 + (6 − 5)0.25 = 5.25
— For P30, i = 30
$$k = \frac{30(16 + 1)}{100} = 5.1, \quad I = 5, \quad f = 0.1$$
The 5th and 6th observations are 6 and 6, respectively. Then,
P30 = 6 + (6 − 6)0.1 = 6
— For P46, i = 46
$$k = \frac{46(16 + 1)}{100} = 7.82, \quad I = 7, \quad f = 0.82$$
The 7th and 8th observations are 7 and 8, respectively. Then,
P46 = 7 + (8 − 7)0.82 = 7.82
— For P85, i = 85
$$k = \frac{85(16 + 1)}{100} = 14.45, \quad I = 14, \quad f = 0.45$$
The 14th and 15th observations are 23 and 36, respectively. Then,
P85 = 23 + (36 − 23)0.45 = 28.85
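The four steps for a raw-data percentile can be sketched as follows. The function uses the position rule k = i(n + 1)/100 from the steps above; clamping to the smallest or largest observation when k falls outside 1..n is an assumption added here, since the slides do not cover that case. The function name `percentile` is illustrative.

```python
def percentile(values, i):
    """ith percentile of raw data using the position k = i * (n + 1) / 100."""
    ordered = sorted(values)              # step 1: sort in ascending order
    n = len(ordered)
    k = i * (n + 1) / 100                 # step 2: position of the percentile
    whole = int(k)                        # step 3: integer part I
    frac = k - whole                      #         fractional part f
    # Assumption: clamp to the extremes when the position falls outside the data.
    if whole < 1:
        return ordered[0]
    if whole >= n:
        return ordered[-1]
    lower = ordered[whole - 1]            # Ith observation (lists are 0-indexed)
    upper = ordered[whole]                # (I + 1)th observation
    return lower + (upper - lower) * frac # step 4: interpolate

# Sorted data set from the solution.
data = [3, 4, 5, 5, 6, 6, 7, 8, 8, 9, 10, 14, 22, 23, 36, 45]
for i in (25, 30, 46, 85):
    print(f"P{i} = {percentile(data, i):.2f}")
```

Other software may use a different position rule, so library functions such as `numpy.percentile` can return slightly different values for the same data.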

• Percentile for Grouped Data: The ith percentile can be computed as
$$P_i = l_i + \left( \frac{i \frac{n}{100} - C}{f_{P_i}} \right) W$$
where li is the lower class boundary of the ith percentile class, fPi is the frequency of the ith percentile class, C is the lcf of the class preceding the ith percentile class and W is the class width
• Example: Find the 20th, 75th and 60th percentile for the following data set
Weight (Kg) f lcf

40 − 49 6 6
50 − 59 5 11
60 − 69 9 20
70 − 79 25 45

• Solution:
— The 20th percentile
$$i \frac{n}{100} = \frac{20(45)}{100} = 9$$
The 9th observation lies in the 50 − 59 class. Thus, C = 6, W = 10, fP20 = 5 and l20 = 49.5
$$P_{20} = l_{20} + \left( \frac{20 \frac{n}{100} - C}{f_{P_{20}}} \right) W = 49.5 + \left( \frac{9 - 6}{5} \right) 10 = 55.5$$
— The 75th percentile
$$i \frac{n}{100} = \frac{75(45)}{100} = 33.75$$
The (33.75)th observation lies in the 70 − 79 class. Thus, C = 20, W = 10, fP75 = 25 and l75 = 69.5
$$P_{75} = l_{75} + \left( \frac{75 \frac{n}{100} - C}{f_{P_{75}}} \right) W = 69.5 + \left( \frac{33.75 - 20}{25} \right) 10 = 75$$
— The 60th percentile
$$i \frac{n}{100} = \frac{60(45)}{100} = 27$$
The 27th observation lies in the 70 − 79 class. Thus, C = 20, W = 10, fP60 = 25 and l60 = 69.5
$$P_{60} = l_{60} + \left( \frac{60 \frac{n}{100} - C}{f_{P_{60}}} \right) W = 69.5 + \left( \frac{27 - 20}{25} \right) 10 = 72.3$$
Note: The 50th percentile is equal to the median, i.e., P50 = X̃
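The grouped-data percentile formula can be sketched as a function that walks through the classes until the cumulative frequency reaches the position i·n/100. The function name `grouped_percentile` and its argument layout are illustrative choices.

```python
def grouped_percentile(lower_boundaries, freqs, width, i):
    """ith percentile of grouped data: l_i + (i * n / 100 - C) / f * W."""
    n = sum(freqs)
    target = i * n / 100          # position of the percentile
    cumulative = 0                # C: lcf of the classes before the current one
    for lcb, f in zip(lower_boundaries, freqs):
        # The percentile class is the first class whose lcf reaches the target.
        if f > 0 and cumulative + f >= target:
            return lcb + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("i must be between 0 and 100")

# Weight example: classes 40-49, 50-59, 60-69, 70-79.
lower_boundaries = [39.5, 49.5, 59.5, 69.5]
freqs = [6, 5, 9, 25]

for i in (20, 75, 60):
    print(f"P{i} = {grouped_percentile(lower_boundaries, freqs, 10, i):.1f}")
```

Calling the function with i = 50 applies the grouped median formula, and i = 25 and i = 75 give the first and third quartiles.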

Decile
Decile divides a data set into ten equal parts. It is denoted by Di , i = 1, 2, · · · , 9
Quartile
Quartile divides a data set into four equal parts. It is denoted by Qi , i = 1, 2, 3
• The first decile is equal to the 10th percentile. Moreover,
D2 = P20, D3 = P30, D4 = P40, and in general Di = P10i
• The first quartile is equal to the 25th percentile. Moreover,
Q2 = P50, Q3 = P75
• Note: As you have noticed, deciles and quartiles are special cases of percentiles.

• Exercise: The accompanying frequency distribution summarizes a sample of human body temperatures (in °F).
Temperature Frequency
96.5 − 96.8 1
96.9 − 97.2 8
97.3 − 97.6 14
97.7 − 98.0 22
98.1 − 98.4 19
98.5 − 98.8 32
98.9 − 99.2 6
99.3 − 99.6 4
Compute
1 D1 and D3
2 35th percentile
3 Q1 and Q3
4 50th and 69th percentiles
Measure of Dispersion
• Dispersion is the scatter or spread of the individual items in a given series. The term dispersion is generally used in two senses
— Dispersion refers to the variation of the items among themselves
— Dispersion refers to the variation of the items around an average
• If the differences between the values of the items and the average are large, the dispersion is high; if the differences are small, the dispersion is low
• Objectives of Measures of Dispersion
1 To determine the reliability of an average: If the variation is small, the average closely represents the individual values and is highly representative. On the other hand, if the variation is large, the average will be quite unreliable
2 To compare the variability of two or more data sets: A high degree of variation means less consistency or less uniformity compared with data having less variation
3 To facilitate the use of other statistical measures: Measures of dispersion serve as the basis of many other statistical measures such as correlation, regression and hypothesis testing
4 To serve as a basis of statistical quality control: The extent of the dispersion indicates whether the variation in the quality of a product is due to random factors or to some defect in the manufacturing process
• Measures of variation are classified as absolute and relative
• Absolute Measures of Dispersion:
— They are expressed in the same unit as the original data, such as kilograms or tonnes
— These measures are not suitable for comparing the variability of two distributions whose variables are expressed in different units.
• Relative Measures of Dispersion:
— They are the ratio of a measure of absolute dispersion to an appropriate average or to selected items of the data
— They are unit-less and can be used to compare the degree of dispersion of different data sets
Absolute measures of dispersion
— Based on selected items: range and inter-quartile range
— Based on all items: mean deviation and standard deviation
Relative measures of dispersion
— Based on selected items: coefficient of range and coefficient of quartile deviation
— Based on all items: coefficient of mean deviation and coefficient of variation
Range
It is the simplest measure of dispersion. It is defined as the difference between the largest and smallest values in the data set
R = Max − Min
Relative Range
The relative measure of range, also called the coefficient of range, is defined as
$$RR = \frac{\text{Max} - \text{Min}}{\text{Max} + \text{Min}}$$
Note: For a grouped frequency distribution, Max is the upper class limit of the last class and Min is the lower class limit of the first class
• Example: Five students obtained the following marks in statistics: 20, 35, 25, 30, 15. Find the range and coefficient of range
• Solution: Max = 35, Min = 15
R = Max − Min = 35 − 15 = 20
$$RR = \frac{\text{Max} - \text{Min}}{\text{Max} + \text{Min}} = \frac{R}{\text{Max} + \text{Min}} = \frac{20}{35 + 15} = 0.4$$
• Example: Compute the range for the following grouped data set
Size 5 − 10 11 − 15 16 − 20 21 − 25 26 − 30
Frequency 4 9 15 30 40
Max = 30, Min = 5, R = Max − Min = 30 − 5 = 25
Inter-Quartile Range
IQR is the range of the middle 50% of the observations, i.e., the difference between the upper quartile and the lower quartile
IQR = Q3 − Q1
Quartile deviation
Quartile deviation, also called the semi-inter-quartile range, is half of the difference between the upper and lower quartiles
$$QD = \frac{Q_3 - Q_1}{2} = \frac{IQR}{2}$$
Coefficient of Quartile Deviation

The relative measure of quartile deviation, also called the coefficient of quartile deviation, is
$$\text{Coeff.}QD = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$
• Example: Find the inter-quartile range, quartile deviation and coefficient of quartile deviation from the following data.
15, 18, 20, 24, 27, 28, 30
• Solution: $Q_3 = 28$, $Q_1 = 18$
$IQR = Q_3 - Q_1 = 28 - 18 = 10$
$QD = \dfrac{IQR}{2} = 5$
$\text{Coeff.}\,QD = \dfrac{Q_3 - Q_1}{Q_3 + Q_1} = \dfrac{10}{46} = 0.22$
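The quartile-based measures in this example can be checked with a short Python sketch. It assumes the quartiles are located at position k(n + 1)/4, which is what the "exclusive" method of the standard-library function statistics.quantiles does; other quartile conventions can give slightly different values.

```python
import statistics

data = [15, 18, 20, 24, 27, 28, 30]

# "exclusive" places the k-th quartile at position k(n + 1)/4 in the sorted data
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")

iqr = q3 - q1                      # inter-quartile range
qd = iqr / 2                       # quartile deviation
coef_qd = (q3 - q1) / (q3 + q1)    # coefficient of quartile deviation

print(q1, q3, iqr, qd, round(coef_qd, 2))
```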
• Exercise: Compute quartile deviation, inter-quartile range and coefficient of quartile deviation for the
following grouped frequency distributions.

Summarizing Data
Weight(Kg) No. of managers Score No. of Students
45.5 − 48.4 6 10 − 19 24
48.5 − 51.4 4 20 − 29 16
51.5 − 54.4 26 30 − 39 13
54.5 − 57.4 1 40 − 49 15
57.5 − 60.4 25 50 − 59 43
60.5 − 63.4 3 60 − 69 18
Mean Deviation
Consider a set of observations of a variable X: $x_1, x_2, \dots, x_n$. The mean deviation or average deviation (MD), also called the mean absolute deviation, is defined as
$MD = \dfrac{1}{n}\sum_{i=1}^{n} |x_i - A|$
where A is a constant that represents the mean, median or mode.
Summarizing Data
Measure of Dispersion—Mean Deviation
• The mean deviation about the mean is defined as:
$MD_{\bar{X}} = \dfrac{1}{n}\sum_{i=1}^{n} |x_i - \bar{X}|$
• The mean deviation about the median is defined as:
$MD_{\tilde{X}} = \dfrac{1}{n}\sum_{i=1}^{n} |x_i - \tilde{X}|$
• The mean deviation about the mode is defined as:
$MD_{\hat{X}} = \dfrac{1}{n}\sum_{i=1}^{n} |x_i - \hat{X}|$

Summarizing Data
• Mean Deviation for Grouped Frequency Distribution
$MD = \dfrac{1}{\sum_{i=1}^{k} f_i}\sum_{i=1}^{k} f_i\,|x_i - A|$
where $f_i$ and $x_i$ are the frequency and class mark of the ith class, respectively
Coefficient of Mean Deviation
The relative measure of mean deviation, also called the coefficient of mean deviation, is obtained by dividing the mean deviation by the particular average used in computing it, i.e.,
$\text{Coeff.}\,MD_{\bar{X}} = \dfrac{MD_{\bar{X}}}{\bar{X}}$
$\text{Coeff.}\,MD_{\tilde{X}} = \dfrac{MD_{\tilde{X}}}{\tilde{X}}$

Summarizing Data
• Exercise: What is the formula for coefficient of mean deviation about the mode?
• Example: Compute mean deviation and coefficient of mean deviation about the mean, median and
mode for the following data:
28, 23, 56, 89, 55, 63, 47, 56, 41, 46, 22
• Solution: The mean, median and mode are:
$\bar{X} = \dfrac{28 + 23 + \cdots + 22}{11} = 47.82,\quad \tilde{X} = 47,\quad \hat{X} = 56$
Mean deviation about the mean
$MD_{\bar{X}} = \dfrac{1}{n}\sum_{i=1}^{n} |x_i - \bar{X}| = \dfrac{1}{11}\sum_{i=1}^{11} |x_i - 47.82|$
$= \dfrac{1}{11}\left(|x_1 - 47.82| + |x_2 - 47.82| + \cdots + |x_{11} - 47.82|\right)$
$= 14.53$

Summarizing Data
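Mean deviation about the median
$MD_{\tilde{X}} = \dfrac{1}{n}\sum_{i=1}^{n} |x_i - \tilde{X}| = \dfrac{1}{11}\sum_{i=1}^{11} |x_i - 47|$
$= \dfrac{1}{11}\left(|x_1 - 47| + |x_2 - 47| + \cdots + |x_{11} - 47|\right)$
$= 14.45$
Mean deviation about the mode
$MD_{\hat{X}} = \dfrac{1}{n}\sum_{i=1}^{n} |x_i - \hat{X}| = \dfrac{1}{11}\sum_{i=1}^{11} |x_i - 56|$
$= \dfrac{1}{11}\left(|x_1 - 56| + |x_2 - 56| + \cdots + |x_{11} - 56|\right)$
$= 15.45$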

Summarizing Data
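The coefficient of mean deviation about the mean is
$\text{Coeff.}\,MD_{\bar{X}} = \dfrac{MD_{\bar{X}}}{\bar{X}} = \dfrac{14.53}{47.82} = 0.304$
The coefficients of mean deviation about the median and mode are
$\text{Coeff.}\,MD_{\tilde{X}} = \dfrac{MD_{\tilde{X}}}{\tilde{X}} = \dfrac{14.45}{47} = 0.307,\quad \text{Coeff.}\,MD_{\hat{X}} = \dfrac{15.45}{56} = 0.276$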
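As a check on the worked example, the following Python sketch computes the mean deviation about the mean, median and mode. The helper name mean_deviation is only illustrative. Because the code works with unrounded intermediate values, the third decimal of a coefficient may differ slightly from a hand calculation that rounds along the way.

```python
import statistics

def mean_deviation(values, center):
    """Average absolute deviation of the values from a chosen center A."""
    return sum(abs(x - center) for x in values) / len(values)

data = [28, 23, 56, 89, 55, 63, 47, 56, 41, 46, 22]

centers = {
    "mean": statistics.mean(data),
    "median": statistics.median(data),
    "mode": statistics.mode(data),
}

# Mean deviation and its coefficient (MD divided by the center used)
md = {name: mean_deviation(data, c) for name, c in centers.items()}
coef = {name: md[name] / centers[name] for name in centers}

for name in centers:
    print(name, round(md[name], 2))
```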
Exercise: Compute coefficient of mean deviation about the median for the following grouped
frequency distributions and identify which frequency distribution is more consistent
Weight(Kg) No. of managers Score No. of Students

45.5 − 48.4 6 10 − 19 24
48.5 − 51.4 4 20 − 29 16
51.5 − 54.4 26 30 − 39 13
54.5 − 57.4 1 40 − 49 15
57.5 − 60.4 25 50 − 59 43
60.5 − 63.4 3 60 − 69 18
Summarizing Data
Measure of Dispersion—Variance and Standard Deviation
Variance
Suppose a variable X has n observations $x_1, x_2, \dots, x_n$. The variance computed from the sample is defined as
$S^2 = \dfrac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{X})^2$
where $n - 1$ is the number of independent pieces of information, called the degrees of freedom
• The variance computed from a population of size N is denoted by the Greek letter sigma squared ($\sigma^2$). Its divisor is N, not N − 1
$\sigma^2 = \dfrac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$
• Reading Assignment: In some statistics books, the sample variance is defined as the mean of the squared deviations of each observation from the center (mean), i.e.,
$S^2 = \dfrac{1}{n}\sum_{i=1}^{n} (x_i - \bar{X})^2$
This formula is not advised as an estimator of the population variance. Why?
Summarizing Data
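• The standard deviation of a data set is defined as the positive square root of the variance, i.e.,
$S = \sqrt{\dfrac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{X})^2}$
• Alternative formula for variance
$S^2 = \dfrac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{X})^2 = \dfrac{1}{n-1}\sum_{i=1}^{n} \left(x_i^2 - 2x_i\bar{X} + \bar{X}^2\right)$
$= \dfrac{1}{n-1}\left(\sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} 2x_i\bar{X} + \sum_{i=1}^{n} \bar{X}^2\right)$
$= \dfrac{1}{n-1}\left(\sum_{i=1}^{n} x_i^2 - 2\bar{X}\sum_{i=1}^{n} x_i + n\bar{X}^2\right)$
$= \dfrac{1}{n-1}\left(\sum_{i=1}^{n} x_i^2 - n\bar{X}^2\right)$
The last step uses $\sum_{i=1}^{n} x_i = n\bar{X}$, so that $-2\bar{X}(n\bar{X}) + n\bar{X}^2 = -n\bar{X}^2$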

Summarizing Data
• Example: Compute the variance and standard deviation of the following data
2, 4, 6, 8
• Solution: The arithmetic mean is
$\bar{X} = \dfrac{1}{n}\sum_{i=1}^{n} x_i = \dfrac{1}{4}(2 + 4 + 6 + 8) = 5$
The variance is
$S^2 = \dfrac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{X})^2 = \dfrac{(2-5)^2 + (4-5)^2 + (6-5)^2 + (8-5)^2}{4-1} = 6.67$
The standard deviation is
$S = \sqrt{6.67} = 2.58$
Summarizing Data
• Example: Compute the variance and standard deviation of the following data
48, 36, 63, 45, 12, 32, 45, 65
• Solution: The mean is $\bar{X} = 43.25$. The variance is
$S^2 = \dfrac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{X})^2 = \dfrac{(48 - 43.25)^2 + (36 - 43.25)^2 + \cdots + (65 - 43.25)^2}{8-1} = 292.5$
The standard deviation is
$S = \sqrt{292.5} = 17.103$
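The same result can be reproduced with Python's standard statistics module, whose variance and stdev functions use the n − 1 divisor. The sketch below also evaluates the alternative (shortcut) formula to show that both routes agree; the variable names are illustrative.

```python
import statistics

data = [48, 36, 63, 45, 12, 32, 45, 65]
n = len(data)
mean = statistics.mean(data)

# Definition: sum of squared deviations divided by n - 1
s2 = statistics.variance(data)

# Alternative formula: (sum of x_i^2 - n * mean^2) / (n - 1)
s2_shortcut = (sum(x * x for x in data) - n * mean ** 2) / (n - 1)

s = statistics.stdev(data)

print(mean, s2, s2_shortcut, round(s, 3))
```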

Summarizing Data
• Exercise: The number of highway miles per gallon of the 10 worst vehicles is shown.
12, 15, 13, 14, 15, 16, 17, 16, 17, 18
Compute variance and standard deviation

• Variance for Grouped Data
$S^2 = \dfrac{1}{\sum_{i=1}^{k} f_i - 1}\sum_{i=1}^{k} f_i\,(x_i - \bar{X})^2$
where $x_i$ and $f_i$ are the class mark and frequency of the ith class, and k is the number of classes
• Exercise: Compute the variance and standard deviation for the following grouped frequency
distributions and identify which frequency distribution is more dispersed.

Summarizing Data
Weight(Kg) # of Students Score # of Students

45.5 − 48.4 6 10 − 19 24
48.5 − 51.4 4 20 − 29 16
51.5 − 54.4 26 30 − 39 13
54.5 − 57.4 1 40 − 49 15
57.5 − 60.4 25 50 − 59 43
60.5 − 63.4 3 60 − 69 18
63.5 − 66.4 8 70 − 79 2
66.5 − 69.4 12 80 − 89 4
Empirical Rule
For moderately symmetrical data
$QD \approx \dfrac{2}{3}S,\quad MD \approx \dfrac{4}{5}S$

Summarizing Data
• The range can be used to approximate the standard deviation. The approximation is called the range rule of thumb
$S \approx \dfrac{\text{Max} - \text{Min}}{4} = \dfrac{R}{4}$
• Note: The range rule of thumb is only an approximation and should be used when the distribution of data values is unimodal and roughly symmetric
• If the variance of X is $S^2$, then the variance of Z = aX, where a is a constant, is $a^2S^2$
$S_z^2 = \dfrac{1}{n-1}\sum_{i=1}^{n} (z_i - \bar{Z})^2$
$= \dfrac{1}{n-1}\sum_{i=1}^{n} (ax_i - a\bar{X})^2$
$= \dfrac{1}{n-1}\sum_{i=1}^{n} a^2(x_i - \bar{X})^2 = a^2\left(\dfrac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{X})^2\right)$
$= a^2S^2$
Summarizing Data
Measure of Dispersion—Properties of Variance
• Exercise: Suppose the variance of X is $S^2$. Show that the variance of Z = X + a, where a is a constant, is also $S^2$
• In a symmetrical, bell-shaped (approximately normal) distribution
1 About 68.27% of the observations lie within one standard deviation of the mean
2 $\bar{X} \pm 2S$ includes about 95.45% of the observations
3 About 99.73% of the observations lie within three standard deviations of the mean
Chebyshev’s Theorem
The proportion of values from a data set that will fall within k standard deviations of the mean will be at least $1 - 1/k^2$, where k is a number greater than 1 (k is not necessarily an integer).
• For example, with k = 2 the theorem states that at least $1 - 1/4$, i.e. three-fourths or 75%, of the data values will fall within 2 standard deviations of the mean of the data set

Summarizing Data
Measure of Dispersion—Properties of Variance
• Example: The mean house rent in a certain neighborhood is 4,000 Br, and the standard deviation is 200 Br. Find the price range within which at least 75% of the house rents will fall.
• Solution: Chebyshev’s theorem states that at least three-fourths, or 75%, of the data values will fall within 2 standard deviations of the mean. Thus,
4000 + 2(200) = 4400 Br
4000 − 2(200) = 3600 Br
Hence, at least 75% of all homes rented in the area will have a rent between 3600 Br and 4400 Br.
• Exercise: A survey of local companies found that the mean amount of travel allowance for executives was $0.25 per mile. The standard deviation was $0.02. Using Chebyshev’s theorem, find the minimum percentage of the data values that will fall between $0.20 and $0.30
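A minimal Python sketch of Chebyshev's bound, applied to the house-rent example, is given here. The function names chebyshev_bound and chebyshev_interval are hypothetical helpers introduced only for illustration.

```python
def chebyshev_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2

def chebyshev_interval(mean, sd, k):
    """Interval mean - k*sd to mean + k*sd."""
    return mean - k * sd, mean + k * sd

# House-rent example: mean 4000 Br, standard deviation 200 Br, k = 2
low, high = chebyshev_interval(4000, 200, 2)
print(chebyshev_bound(2), low, high)
```

For an interval given in the original units, k can be recovered as (upper limit − mean) / standard deviation before calling chebyshev_bound.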

Summarizing Data
Measure of Dispersion—Pooled Variance
• Pooled Variance: Suppose a data set is partitioned into two mutually exclusive groups. Let $S_1^2$ be the variance of the $n_1$ observations in the first group and $S_2^2$ the variance of the $n_2$ observations in the second group. The pooled variance, a weighted average of the two group variances, is
$S_p^2 = \dfrac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$
Coefficient of Variation
The coefficient of variation, denoted by CV, is the standard deviation divided by the mean. The result is expressed as a percentage
$CV = \dfrac{S}{\bar{X}} \times 100\%$
• Example: Compare the variation in heights of men to the variation in weights of men, using these sample results obtained from a data set. For men, the heights yield $\bar{X} = 68.34$ in. and $S_x = 3.02$ in.; the weights yield $\bar{Y} = 172.55$ lb and $S_y = 26.33$ lb.
Mario F. Triola (2010). Elementary Statistics Using Excel. 4th ed. Pearson Education, Inc. USA
• Solution: Height
$CV = \dfrac{S_x}{\bar{X}} \times 100\% = \dfrac{3.02\ \text{in}}{68.34\ \text{in}} \times 100\% = 4.42\%$
Weight
$CV = \dfrac{S_y}{\bar{Y}} \times 100\% = \dfrac{26.33\ \text{lb}}{172.55\ \text{lb}} \times 100\% = 15.26\%$
We can see that heights have considerably less variation than weights
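Because the coefficient of variation is unit-less, the two values can be compared directly. A small Python sketch of the calculation, with an illustrative helper name, is:

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation as a percentage of the mean."""
    return sd / mean * 100

cv_height = coefficient_of_variation(3.02, 68.34)    # heights in inches
cv_weight = coefficient_of_variation(26.33, 172.55)  # weights in pounds

print(round(cv_height, 2), round(cv_weight, 2))
```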
• Exercise: The weekly income (in Birr) of 10 men and 15 women workers are listed below. Whose
weekly income is more dispersed?
Women
Men 254 250 123 352 142 22
14 19 20 30 100
458 100 200 235 224 162
125 236 300 142 63
364 122 12 32

Summarizing Data
Measure of Dispersion—Z-score (Standardized value)
• A Z-score (standardized value) is the number of standard deviations that a given value x is above or below the mean.
$z = \dfrac{x_i - \bar{X}}{S}$
• Example: What is the Z-score for the value of 14 in the following sample data set?
3, 8, 6, 14, 4, 12, 7, 10
• Solution: Let the variable be X.
$\sum_{i=1}^{8} x_i = 64,\quad \sum_{i=1}^{8} x_i^2 = 614,\quad \bar{X} = 8,\quad S = 3.82$
Then
$z = \dfrac{x_i - \bar{X}}{S} = \dfrac{14 - 8}{3.82} = 1.57$
Interpretation: The value 14 lies 1.57 standard deviations above the mean
Summarizing Data
Measure of Dispersion—Z-score (Standardized value)
• Example: A student scored 65 on a calculus test that had a mean of 50 and a standard deviation of 10; she scored 30 on a history test with a mean of 25 and a standard deviation of 5. Compare her relative positions on the two tests.
• Solution:
$Z_{\text{calculus}} = \dfrac{\text{Calculus result} - \text{Mean}}{S} = \dfrac{65 - 50}{10} = 1.5$
$Z_{\text{history}} = \dfrac{\text{History result} - \text{Mean}}{S} = \dfrac{30 - 25}{5} = 1.0$
Since the z score for calculus is larger, her relative position in the calculus class is higher than her relative position in the history class
Allan Bluman (2012). Elementary Statistics: A step by step
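Both z-score examples can be reproduced with a few lines of Python. This is only a sketch: the function z_score is an illustrative helper, and the sample standard deviation (n − 1 divisor) is taken from the standard statistics module.

```python
import statistics

def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Z-score of the value 14 in the sample data set
data = [3, 8, 6, 14, 4, 12, 7, 10]
z_14 = z_score(14, statistics.mean(data), statistics.stdev(data))

# Relative position on the two tests
z_calculus = z_score(65, 50, 10)
z_history = z_score(30, 25, 5)

print(round(z_14, 2), z_calculus, z_history)
```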

Summarizing Data
Measure of Dispersion—Outliers
Outliers
An outlier is an extremely high or an extremely low data value when compared with the rest of the data
values.
• Note: An outlier can strongly affect the mean and standard deviation of a variable
• A number of rules of thumb have been proposed to identify unusual observations in a data set
— If we know the mean and standard deviation of the collected data, we can roughly estimate the minimum and maximum usual sample values as follows:
Minimum usual value = $\bar{X} - 2S$
Maximum usual value = $\bar{X} + 2S$
— Ordinary values: −2 ≤ z ≤ 2

Summarizing Data
• The 1.5 IQR rule says that any observation outside the interval
$[Q_1 - 1.5\,IQR,\ Q_3 + 1.5\,IQR]$
is a potential outlier. That is, any observation which is less than $Q_1 - 1.5\,IQR$ or greater than $Q_3 + 1.5\,IQR$ is a potential outlier
• Example: Check the following data set for outliers
5, 6, 12, 13, 15, 18, 22, 50
• Solution:
$Q_1 = 7.5,\quad Q_3 = 21,\quad IQR = Q_3 - Q_1 = 21 - 7.5 = 13.5$
$Q_1 - 1.5\,IQR = 7.5 - 1.5(13.5) = -12.75$
$Q_3 + 1.5\,IQR = 21 + 1.5(13.5) = 41.25$
Check the data set for any data values that fall outside the interval from −12.75 to 41.25. The value 50 is outside this interval; hence, it can be considered an outlier
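The 1.5 IQR rule is straightforward to automate. The sketch below assumes quartiles at position k(n + 1)/4 (the "exclusive" method of statistics.quantiles), which reproduces the quartiles used in this solution; the function name iqr_outliers is illustrative.

```python
import statistics

def iqr_outliers(values):
    """Return the lower fence, upper fence and values outside the fences."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="exclusive")
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    flagged = [x for x in values if x < lower or x > upper]
    return lower, upper, flagged

lower, upper, flagged = iqr_outliers([5, 6, 12, 13, 15, 18, 22, 50])
print(lower, upper, flagged)
```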
Summarizing Data
• The graphical technique that works with the 1.5 IQR rule to detect the presence of outliers is called the box and whisker plot, or box plot for short
• A box plot displays the 5-number summary, i.e., minimum, Q1, Q2 = median, Q3 and maximum
• Procedure to Construct a Box Plot:
1 Find the 5-number summary
2 Construct a scale with values that include the minimum and maximum
3 Construct a box (rectangle) extending from Q1 to Q3 and draw a line in the box at the median value
4 Draw lines extending outward from the box to the minimum and maximum data values
• In the modified box plot, the minimum and maximum values are replaced by Q1 − 1.5IQR and Q3 + 1.5IQR, respectively, and observations beyond these limits are plotted individually as potential outliers

Summarizing Data
• Example: For the following 24 amounts of nicotine (in mg per cigarette), find the 5-number summary and draw a box plot
Brand Nicotine Brand Nicotine

American Filter 1.2 Benson Hedges 1.2
Camel 1.0 Capri 0.8
Carlton 0.1 Cartier Vendome 0.8
Chelsea 0.8 GPC Approved 1.0
Hi-Lite 1.0 Kent 1.0
Lucky Strike 1.1 Malibu 1.2
Marlboro 1.2 Merit 0.7
Newport Stripe 0.9 Now 0.2
Old Gold 1.4 Pall Mall 1.2
Players 1.1 Raleigh 1.3
Richland 1.3 Rite 0.8
Silva Thins 1.0 Tareyton 1.0

Summarizing Data
• Solution: Min = 0.1, Max = 1.4, Q1 = 0.8, Q2 = 1.0, Q3 = 1.2
[Figure: box plot of the nicotine data]
The two dots at the left side of the box plot are outliers

Summarizing Data
Shape of Distribution—Skewness
• Skewness is a measure of symmetry or, more precisely, of the lack of symmetry (departure from symmetry), i.e., it measures the horizontal shift of the distribution
• Skewness can be measured in absolute terms by taking the difference between the arithmetic mean and the mode
— The absolute measure of skewness is
Sk = Mean − Mode
• If the arithmetic mean is greater than the mode, the skewness is positive, and if the mode is greater than the mean, the skewness is negative
— Sk > 0: the distribution is positively skewed
— Sk = 0: the distribution is symmetric
— Sk < 0: the distribution is negatively skewed


Summarizing Data
[Figure: curves of a positively skewed, a symmetric and a negatively skewed distribution]
• Empirical Rule: For a moderately skewed distribution
Mean − Mode ≈ 3(Mean − Median)

Summarizing Data
• Pearsonian Coefficient of Skewness: Karl Pearson’s formula for skewness indicates the direction as well as the extent of skewness.
$Sk = \dfrac{\text{Mean} - \text{Mode}}{\text{Standard Deviation}}$
• Bowley’s Coefficient of Skewness: Bowley’s coefficient of skewness is based on quartiles. A measure of skewness based on the distances of the quartiles from the median is defined as follows:
$Sk = \dfrac{(Q_3 - Q_2) - (Q_2 - Q_1)}{(Q_3 - Q_2) + (Q_2 - Q_1)} = \dfrac{Q_3 - 2Q_2 + Q_1}{Q_3 - Q_1}$
• Example: The sum of fifteen observations, whose mode is 8, was found to be 150 with coefficient of
variation of 20%
1 Calculate the Pearsonian coefficient of skewness and give appropriate conclusion
2 Are smaller values more or less frequent than bigger values for this distribution?
3 If a constant 4 was added on each observation, what will be the new Pearsonian coefficient of
skewness? Show your steps. What do you conclude from this?

Summarizing Data
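• Solution: $\hat{X} = 8$, $\sum x_i = 150$ and $n = 15$, so $\bar{X} = 150/15 = 10$
$CV = \dfrac{S}{\bar{X}} \times 100\% = 20\%$
1 From the coefficient of variation, $S = 0.20 \times 10 = 2$. Therefore
$Sk = \dfrac{\bar{X} - \hat{X}}{S} = \dfrac{10 - 8}{2} = 1$
The distribution is positively skewed.
2 Since Sk > 0, smaller values are more frequent than bigger values
3 If a constant is added to every observation, the mean and mode shift by that constant, but the standard deviation does not change. Therefore, $\bar{X} = 14$, $\hat{X} = 12$ and $S = 2$
$Sk = \dfrac{\bar{X} - \hat{X}}{S} = \dfrac{14 - 12}{2} = 1$
The Pearsonian coefficient of skewness is therefore unchanged when a constant is added to every observation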

Summarizing Data
• Example: In a moderately asymmetrical distribution, the mode minus the mean is 2.4, the coefficient
of variation and skewness are 20% and −0.5, respectively. Find mean, median, mode and standard
deviation
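• Solution: Mode − Mean = 2.4, CV = 20% and Sk = −0.5
$Sk = \dfrac{\text{Mean} - \text{Mode}}{S} = \dfrac{-2.4}{S} = -0.5 \;\Rightarrow\; S = 4.8$
Let’s find the mean based on the CV equation
$CV = \dfrac{S}{\bar{X}} \times 100\% = \dfrac{4.8}{\bar{X}} \times 100\% = 20\% \;\Rightarrow\; \bar{X} = 24$
We know that Mode − Mean = 2.4. Therefore,
Mode − 24 = 2.4, so Mode = 26.4
For the median, the empirical relation Mean − Mode = 3(Mean − Median) for moderately skewed distributions gives
−2.4 = 3(24 − Median), so Median = 24.8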
Summarizing Data
• Skewness based on central moments (moments about the mean)
$Sk = \dfrac{M_3'}{(M_2')^{3/2}}$
where
$M_3' = \dfrac{1}{n}\sum_{i=1}^{n} (x_i - \bar{X})^3,\quad M_2' = \dfrac{1}{n}\sum_{i=1}^{n} (x_i - \bar{X})^2$
• Exercise: The median and the mode of a mesokurtic distribution are 32 and 34, respectively. The 4th moment about the mean is 243. Compute the Pearsonian coefficient of skewness and identify the type of skewness. Assume n ≈ n − 1. The rth moment about the mean is
$M_r' = \dfrac{1}{n}\sum_{i=1}^{n} (x_i - \bar{X})^r,\quad r = 0, 1, 2, \dots$
• Exercise: For a moderately skewed frequency distribution, the mean is 10 and the median 8.5. If the
coefficient of variation is 20%, find the Pearsonian coefficient of skewness and the probable mode of
the distribution
Summarizing Data
Shape of Distribution—Kurtosis
• Kurtosis, from a Greek word meaning bulginess, measures the peakedness or flatness of a curve relative to the normal curve. Three terms are used for indicating it:
— Mesokurtic stands for a normal curve (β = 0),
— Leptokurtic for a curve more peaked than normal (β > 0) and
— Platykurtic for a curve less peaked than normal (β < 0)

Summarizing Data
• Kurtosis is denoted by β and is computed as follows
$\beta = \dfrac{M_4'}{(M_2')^2} - 3$
where
$M_4' = \dfrac{1}{n}\sum_{i=1}^{n} (x_i - \bar{X})^4,\quad M_2' = \dfrac{1}{n}\sum_{i=1}^{n} (x_i - \bar{X})^2$
• Example: If the standard deviation of a symmetric distribution is 10, what should be the value of the fourth moment in order for the distribution to be
1 Leptokurtic
2 Platykurtic
3 Mesokurtic
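The moment-based measures of skewness and kurtosis defined above can be sketched in Python as follows. The function names are illustrative, and the moments use the 1/n divisor exactly as in the definitions.

```python
def central_moment(values, r):
    """r-th moment about the mean, with divisor n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** r for x in values) / n

def moment_skewness(values):
    """Third central moment divided by the second central moment to the power 3/2."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def kurtosis_beta(values):
    """Fourth central moment over the squared second central moment, minus 3."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

data = [2, 4, 6, 8]   # small symmetric data set
print(moment_skewness(data), kurtosis_beta(data))
```

For this symmetric data set the skewness is zero and β is negative, so by the classification above the data are platykurtic.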

Chapter Three
Elementary Probability
Outline
1 Deterministic and non-deterministic models

2 Review of set theory: sets, union, intersection, complementation, De Morgan’s rules
3 Random experiments, sample space and events
4 Finite sample spaces and equally likely outcomes
5 Counting techniques
6 Definitions of probability
7 Derived theorems of probability

Deterministic and Non-deterministic Models
• A model is a simplification of reality. Deterministic models generate exactly the same outcomes under a given set of initial conditions, while in stochastic (non-deterministic) models the outcomes differ due to inherent randomness.
• Probability as a general concept can be defined as the chance of an event occurring. i.e., it is a
quantitative measure of uncertainty
Probability is the basis of inferential statistics

Review of Set Theory
• Probability Experiment: It is a process that leads to well-defined results called outcomes
— We know all possible outcomes of the experiment before performing the experiment
— What makes the experiment random is that we don’t know exactly which possible outcome
comes first or second or ...
• Example: Tossing a fair coin is a probability experiment with well known outcomes

• Set is a collection of distinct objects. This means that {1, 2, 3} is a set but {1, 1, 3} is not because 1
appears twice in the second collection.
• Sample Space: It is the set of all possible outcomes of a probability experiment and denoted by S or
Ω
• Example: The sample space of the experiment of tossing a fair coin once has two possible outcomes—head (H) or tail (T)
Ω = {H, T }
• Example: The experiment of throwing a fair die has six possible outcomes, i.e.,
Ω = {1, 2, 3, 4, 5, 6}
• Note: The sample space is the universal set—the biggest set

• Example: Tossing a fair coin two times has four possible outcomes
Ω = {HH, HT, T H, T T }

• Example: Throwing a pair of fair dice has 36 possible outcomes
1 2 3 4 5 6
1 (1, 1) (1, 2) (1, 3) (1, 4) (1, 5) (1, 6)
2 (2, 1) (2, 2) (2, 3) (2, 4) (2, 5) (2, 6)
3 (3, 1) (3, 2) (3, 3) (3, 4) (3, 5) (3, 6)
4 (4, 1) (4, 2) (4, 3) (4, 4) (4, 5) (4, 6)
5 (5, 1) (5, 2) (5, 3) (5, 4) (5, 5) (5, 6)
6 (6, 1) (6, 2) (6, 3) (6, 4) (6, 5) (6, 6)
• Example: List all possible outcomes of tossing a fair coin three times
Ω = {HHH, HHT, HT H, HT T, T HH, T HT, T T H, T T T }

[Figure: tree diagram of the outcomes of the 1st, 2nd and 3rd toss]
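Sample spaces like the ones listed above can be enumerated mechanically. The following Python sketch uses itertools.product from the standard library; the variable names are illustrative.

```python
from itertools import product

# Tossing a coin three times: all sequences of H and T of length 3
coin3 = ["".join(outcome) for outcome in product("HT", repeat=3)]

# Throwing a pair of dice: all ordered pairs (first die, second die)
dice2 = list(product(range(1, 7), repeat=2))

print(coin3)
print(len(dice2))
```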

• Suppose we have two sets A and B. A is a subset of B, denoted A ⊆ B, if every element of A is also an element of B, i.e.,
A ⊆ B ⟺ (for all x: if x ∈ A, then x ∈ B)
• Event: It is a subset of the sample space (it contains one or more outcomes which are in the sample space) and is defined for a particular purpose
— A simple event is an event having only a single outcome
— A compound event is an event consisting of two or more outcomes (simple events)
— Events are denoted by capital letters other than S, such as A, B, F, etc.
• Example: Let event B be defined as getting at least one head in the experiment of tossing a coin two times.
B = {HH, HT, T H}

• The union of two sets A and B is the collection of all objects that are in either set. It is written A ∪ B. Using curly brace notation
A ∪ B = {x : (x ∈ A) or (x ∈ B)}
• The intersection of two sets A and B is the collection of all objects that are in both sets. It is written A ∩ B. Using curly brace notation
A ∩ B = {x : (x ∈ A) and (x ∈ B)}
[Figure: Venn diagrams in Ω showing A ∪ B and A ∩ B]
• Mutually exclusive events are events which do not have the same element in common
A ∩ B = {}

• The complement of a set A is the collection of objects in the universal set that are not in A. The complement is written as $A^c$. In curly brace notation
$A^c$ = {x : (x ∈ Ω) and (x ∉ A)}
• Example: Suppose the sample space of throwing a fair die is
Ω = {1, 2, 3, 4, 5, 6}
If the event A is {2, 4, 6}, then $A^c$ = {1, 3, 5}

• Equally-Likely Events are those events which have equal chance of occurrence
• Null event is the event which doesn’t have element (outcome) in it. i.e., it is an empty set denoted
by either ∅ or {}
• Exercise: Write the shaded region in the following Venn diagrams using complement, union and intersection
[Figure: two Venn diagrams in Ω, one with sets A and B and one with sets A, B and C]

• Let A be an event (set), then
— $\Omega^c = \emptyset$, $\emptyset^c = \Omega$
— $(A^c)^c = A$
— Ω ∪ A = Ω
— ∅ ∪ A = A
— $A \cup A^c = \Omega$
— A ∪ A = A
— ∅ ∩ A = ∅
— $A \cap A^c = \emptyset$
— Ω ∩ A = A
• Let A, B and C be events (sets), then
— Associative law
A ∪ (B ∪ C) = (A ∪ B) ∪ C
A ∩ (B ∩ C) = (A ∩ B) ∩ C
— Commutative law
A∪B = B∪A
A∩B = B∩A
— Distributive law
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
• Example: Let Ω = {s1, s2, s3, s4, s5, s6, s7, s8}, A = {s1, s2, s3}, B = {s2, s3, s4, s5}, and C = {s3, s4, s5, s8}. Find the following sets
1 $A^c$, $B^c$, $(A^c)^c$, and $C^c$
2 A ∪ B, A ∪ C, and (A ∩ B) ∪ C
3 A ∩ B, A ∩ C and A ∩ B ∩ C
4 $A \cap B^c$, $B \cap A^c$ and $(A \cap B^c) \cup (A \cap B)$
• Solution:
1 $A^c$ = {s4, s5, s6, s7, s8}, $B^c$ = {s1, s6, s7, s8}, $(A^c)^c$ = A = {s1, s2, s3} and $C^c$ = {s1, s2, s6, s7}
2 A ∪ B = {s1, s2, s3, s4, s5}, A ∪ C = {s1, s2, s3, s4, s5, s8}, and (A ∩ B) ∪ C = {s2, s3, s4, s5, s8}
3 A ∩ B = {s2, s3}, A ∩ C = {s3} and A ∩ B ∩ C = {s3}
4 Exercise

• De Morgan’s Law:
$\left(\bigcup_j A_j\right)^c = \bigcap_j A_j^c,\quad \left(\bigcap_j A_j\right)^c = \bigcup_j A_j^c$
Suppose we have two sets, A and B. De Morgan’s law says that
$(A \cup B)^c = A^c \cap B^c$
$(A \cap B)^c = A^c \cup B^c$
• Example: Consider the above example and show that $(A \cup B)^c = A^c \cap B^c$ and $(A \cap B)^c = A^c \cup B^c$
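One way to check De Morgan's laws on the sets of the earlier example is with Python's built-in set type. This is a numerical check on one example, not a proof; the helper complement is introduced only for illustration.

```python
omega = {"s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8"}
A = {"s1", "s2", "s3"}
B = {"s2", "s3", "s4", "s5"}

def complement(event):
    """Elements of the sample space omega that are not in the event."""
    return omega - event

# (A u B)^c compared with A^c n B^c
lhs_union = complement(A | B)
rhs_union = complement(A) & complement(B)

# (A n B)^c compared with A^c u B^c
lhs_inter = complement(A & B)
rhs_inter = complement(A) | complement(B)

print(sorted(lhs_union), sorted(rhs_union))
print(sorted(lhs_inter), sorted(rhs_inter))
```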

Counting Techniques
• The probability of an event A is denoted by P(A); in general, the probability of any event · is denoted by P(·)
• When all outcomes are equally likely, computing the probability of an event requires counting the outcomes in the event and in the sample space
$P(A) = \dfrac{\#A}{\#\Omega}$
• In probability, there are four basic principles of counting
— Addition
— Multiplication
— Permutation
— Combination
Addition
Suppose exactly one of two experiments, say E1 or E2, is to be performed, and the two have no outcomes in common. If E1 has n1 possible outcomes and E2 has n2 possible outcomes, then we will have a total of n1 + n2 possible outcomes

Counting Techniques
• Example: Suppose there are 6 roads and 3 railways from Addis Ababa to Ambo. In how many possible ways can you go to Ambo?
Number of ways = 6 roads + 3 railways = 9
Multiplication
Suppose two experiments, say E1 and E2 , are performed simultaneously. The first experiment has n1
possible outcomes and the second experiment has n2 outcomes. For each possible outcome of the first
experiment, we will have n2 possible outcomes in the second experiment. We will have a total of
n1 × n2 possible outcomes

Counting Techniques
• Example: Say the only clean clothes you’ve got are 2 t-shirts and 4 pairs of jeans. How many different combinations can you choose?
Number of combinations = 2 × 4 = 8

Counting Techniques
• Example: A small community consists of 10 women, each of whom has three children. If one woman and one of her children are to be chosen as mother and child of the year, how many different choices are possible? Ans. 10 × 3 = 30
• Example: How many different 7-place license plates are possible if the first 3 places are to be occupied by letters and the final 4 by numbers? Ans. 175,760,000
• Exercise: In the above example, how many license plates would be possible if repetition among letters or numbers were prohibited? Ans. 78,624,000
• Exercise: You want to buy a car: you have 2 choices of body style, 5 colors and 3 models (standard model, sports model with bigger engine and luxury model with leather seats). How many possible choices do you have to buy one car? Ans. 30
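The stated answers follow directly from the multiplication principle and can be verified with a short Python sketch. It assumes a 26-letter alphabet and the 10 digits 0 to 9; math.perm requires Python 3.8 or later.

```python
import math

# License plates: 3 letters followed by 4 digits, repetition allowed
plates_with_repetition = 26 ** 3 * 10 ** 4

# No repeated letter and no repeated digit: ordered selections without replacement
plates_without_repetition = math.perm(26, 3) * math.perm(10, 4)

# Car purchase: 2 body styles x 5 colors x 3 models
car_choices = 2 * 5 * 3

print(plates_with_repetition, plates_without_repetition, car_choices)
```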

Counting Techniques
Permutation
A permutation is an arrangement of objects without repetition where order is important
• If we have three objects, say A, B and C, we can arrange them in 6 different ways:
ABC, ACB, BAC, BCA, CAB, CBA

Counting Techniques
• In general, n distinct objects can be arranged in n! different ways. If n is a positive integer, then
n! = n × (n − 1) × (n − 2) × · · · × 1
• Note that
n! = n(n − 1)!
0! = 1
• Example: In how many ways can we arrange 4 individuals in 4 seats?
4! = 4(3)(2)(1) = 24 ways

Counting Techniques
• We shall now determine the number of permutations of a set of n objects when certain of the objects are indistinguishable from each other. If, among the n objects, $n_1$ are alike, $n_2$ are alike, ..., $n_r$ are alike, with $\sum_{i=1}^{r} n_i = n$, then the number of distinct permutations is
$\dfrac{n!}{n_1! \times n_2! \times \cdots \times n_r!}$
• Example: In how many ways can we arrange the letters in pepper?
$\dfrac{6!}{3!\,2!\,1!} = \dfrac{6(5)(4)\,3!}{3!\,(2)(1)} = 60$ ways

Counting Techniques
• If we need to arrange r objects (r ≤ n) chosen from n distinct objects, we will have a total of ${}^nP_r$ different ways
${}^nP_r = \dfrac{n!}{(n - r)!}$
• Example: Suppose a businessman has a choice of five locations to establish his business. He wishes to rank only the top three locations. In how many different ways can he arrange them?
${}^5P_3 = \dfrac{5!}{(5 - 3)!} = \dfrac{5(4)(3)\,2!}{2!} = 60$ ways
• Example: A license plate begins with three letters. If the possible letters are A, B, C, D and E, how many different permutations of these letters can be made if no letter is used more than once?
${}^5P_3 = 60$ ways

Counting Techniques
Combination
Combination is a way of selecting items from a collection, such that (unlike permutations) the order of
selection does not matter
• Selecting r objects (r ≤ n) from n objects can be done in
${}^nC_r = \dbinom{n}{r} = \dfrac{n!}{r!\,(n - r)!}$ ways
• Example: From a class of 20 students we need to select 3 for a committee. How many possibilities do we have to form a committee?
$\dbinom{20}{3} = \dfrac{20!}{3!\,(20 - 3)!} = \dfrac{20(19)(18)\,17!}{3!\,17!} = 1140$ possibilities
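The permutation and combination counts in the preceding examples can be confirmed with the standard math module (math.perm and math.comb are available from Python 3.8). The variable names below are illustrative.

```python
import math

seatings = math.factorial(4)            # 4 individuals in 4 seats

# Arrangements of "pepper": 6 letters with p repeated 3 times, e twice, r once
pepper = math.factorial(6) // (
    math.factorial(3) * math.factorial(2) * math.factorial(1)
)

top_three_locations = math.perm(5, 3)   # ordered choice of 3 out of 5
committees = math.comb(20, 3)           # unordered choice of 3 out of 20

print(seatings, pepper, top_three_locations, committees)
```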

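• Example: From a group of 5 women and 7 men, how many different committees consisting of 2 women and 3 men can be formed? What if 2 of the men are feuding and refuse to serve on the committee together?
Without any restriction:
$\dbinom{5}{2}\dbinom{7}{3} = 10 \times 35 = 350$
If the two feuding men cannot serve together (neither serves, or exactly one of them serves):
$\left[\dbinom{2}{0}\dbinom{5}{3} + \dbinom{2}{1}\dbinom{5}{2}\right]\dbinom{5}{2} = (10 + 20)(10) = 300$

Definitions of Probability
• There are three approaches to calculate the probability of an event. These are: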
1 The classical approach
2 The frequentist approach
3 The subjective approach

• Classical Approach: If a procedure has n different simple events, each with an equal chance of
occurring, and event A can occur in s of these ways, then
s #(A)
P (A) = =
n #(Ω)
• The assumptions of classical approach

— The outcomes must be equally-likely
— The experiment should never be repeated more than once
— The sample space should be finite
• Example: Toss a fair coin once and find the probability of the occurrence of head
• Example: If we wanted to determine the probability of getting an even number when rolling a die, 3
would be the number of favorable outcomes because there are 3 even numbers on a die (and
obviously 3 odd numbers). The number of possible outcomes would be 6 because there are 6 numbers
on a die. Therefore, the probability of getting an even number when rolling a die is 3/6, or 1/2

• Frequentist Approach: This approach is also called the empirical approach, i.e., the probability calculation depends on data
• If, after n repetitions of an experiment, where n is very large, an event is observed to occur in h of these, then the probability of the event is
$\dfrac{h}{n}$
• Conduct an experiment a large number of times and count the number of times event A actually occurs; then an estimate of P(A) is
$P(A) \approx \dfrac{\text{number of times A occurred}}{\text{number of times the trial was repeated}}$
• Example: Suppose a coin was tossed 1000 times and the result was 587 tails. The relative frequency of tails is 587/1000. Another 1000 tosses lead to 511 tails. Then the relative frequency of tails over all 2000 tosses is (587 + 511)/(1000 + 1000) = 1098/2000. Proceeding in this manner, we obtain a sequence of numbers which gets closer and closer to the number defined as the probability of a tail in a single toss.
$P(A) = \lim_{n \to \infty} \dfrac{\#A}{n}$
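The idea that the relative frequency settles down as the number of tosses grows can be illustrated with a small simulation. This is only a sketch: it assumes a fair coin simulated with Python's pseudo-random generator, and the function name and seed are arbitrary choices made for reproducibility.

```python
import random

def relative_frequency_of_tails(n_tosses, seed=0):
    """Simulate n_tosses of a fair coin and return the proportion of tails."""
    rng = random.Random(seed)   # fixed seed so that runs are repeatable
    tails = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return tails / n_tosses

for n in (100, 1_000, 10_000, 100_000):
    print(n, relative_frequency_of_tails(n))
```

The printed proportions vary for small n and tend to lie closer to 0.5 as n increases.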
• Axioms of Probability: The probability of an event, say A must satisfy the following axioms
1 Axiom 1: The probability of any event A must be non-negative, that is, P (A) ≥ 0
2 Axiom 2: The probability of the sample space is 1, that is, P (Ω) = 1
3 Axiom 3: Given mutually exclusive events A1, A2, A3, · · ·, that is, events where Ai ∩ Aj = {} for i ≠ j
— The probability of a finite union of the events is the sum of the probabilities of the individual
events, that is:
P (A1 ∪ A2 ∪ · · · ∪ Ak ) = P (A1 ) + P (A2 ) + · · · + P (Ak )
— The probability of a countably infinite union of the events is the sum of the probabilities of the
individual events, that is:
P (A1 ∪ A2 ∪ · · · ) = P (A1 ) + P (A2 ) + · · ·

• Example: Suppose you throw two dice. We are interested in the sum of the upper face of the dice.
Let E be the event that the sum of the dice is odd. Find P (E)
• Example: In the experiment of tossing a coin three times, what is the probability of getting at most
one head?
• Example: If two dice are rolled, what is the probability that the sum of the upturned faces will equal
7?
• Example: If 3 balls are “randomly drawn” from a bowl containing 6 white and 5 black balls, what is
the probability that one of the balls is white and the other two black?
• Example: A committee of 5 is to be selected from a group of 6 men and 9 women. If the selection is
made randomly, what is the probability that the committee consists of 3 men and 2 women?
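Under the classical approach each of these examples reduces to counting favorable and possible outcomes. The sketch below works them out by enumeration or with binomial coefficients, using exact fractions; it assumes fair coins and dice and that every selection of balls or committee members is equally likely.

```python
from fractions import Fraction
from itertools import product
from math import comb

dice = list(product(range(1, 7), repeat=2))   # 36 equally likely outcomes
coins = list(product("HT", repeat=3))         # 8 equally likely outcomes

# Sum of two dice is odd; sum of two dice equals 7
p_odd_sum = Fraction(sum(1 for a, b in dice if (a + b) % 2 == 1), len(dice))
p_sum_7 = Fraction(sum(1 for a, b in dice if a + b == 7), len(dice))

# At most one head in three tosses
p_at_most_one_head = Fraction(sum(1 for c in coins if c.count("H") <= 1), len(coins))

# 1 white and 2 black balls when drawing 3 from 6 white and 5 black
p_one_white_two_black = Fraction(comb(6, 1) * comb(5, 2), comb(11, 3))

# Committee of 3 men and 2 women chosen from 6 men and 9 women
p_committee = Fraction(comb(6, 3) * comb(9, 2), comb(15, 5))

print(p_odd_sum, p_sum_7, p_at_most_one_head, p_one_white_two_black, p_committee)
```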

• Exercise: An urn contains n balls, one of which is special. If k of these balls are withdrawn one at a
time, with each selection being equally likely to be any of the balls that remain at the time, what is
the probability that the special ball is chosen?
• Exercise: Suppose that A and B are mutually exclusive events for which P (A) = 0.3 and
P (B) = 0.5. What is the probability that
1 either A or B occurs?
2 A occurs but B does not?
3 Both A and B occur?
• Exercise: Sixty percent of the students at a certain school wear neither a ring nor a necklace.
Twenty percent wear a ring and 30 percent wear a necklace. If one of the students is chosen
randomly, what is the probability that this student is wearing a ring or a necklace?

Derived Theorems of Probability
• Rule 1: Suppose A1 and A2 are two events such that A1 ⊆ A2, then P(A1) ≤ P(A2)
[Figure: Venn diagram with A1 inside A2]
• Rule 2: For every event A, 0 ≤ P(A) ≤ 1, i.e., the probability is a number between 0 and 1
• Rule 3: For ∅, the empty set, P(∅) = 0, i.e., an impossible event has zero probability
• Rule 4: If $A^c$ is the complement of A, then
$P(A^c) = 1 - P(A)$
• Rule 5: If A and B are two events, then
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
• More generally, if A , B and C are three events, then
P (A ∪ B ∪ C) = P (A) + P (B) + P (C) − P (A ∩ B) − P (A ∩ C) − P (B ∩ C) + P (A ∩ B ∩ C)
• Example: From a group of 3 freshmen, 4 sophomores, 4 juniors, and 3 seniors a committee of size 4
is randomly selected. Find the probability that the committee will consist of
— 1 from each class;
— 2 sophomores and 2 juniors
— only sophomores or juniors
• Example: A customer visiting the suit department of a certain store will purchase a suit with
probability 0.22, a shirt with probability 0.30, and a tie with probability 0.28. The customer will
purchase both a suit and a shirt with probability 0.11, both a suit and a tie with probability 0.14, and
both a shirt and a tie with probability 0.10. A customer will purchase all 3 items with probability
0.06. What is the probability that a customer purchases
— none of these items?
— exactly 1 of these items?
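One possible way to work the suit, shirt and tie example is to apply the three-event version of Rule 5 directly. The sketch below is an illustration only; the variable names are assumptions introduced here.

```python
# Probabilities given in the example (suit, shirt, tie).
p_suit, p_shirt, p_tie = 0.22, 0.30, 0.28
p_suit_shirt, p_suit_tie, p_shirt_tie = 0.11, 0.14, 0.10
p_all = 0.06

# Inclusion-exclusion for the union of three events.
p_any = (p_suit + p_shirt + p_tie
         - p_suit_shirt - p_suit_tie - p_shirt_tie
         + p_all)
p_none = 1 - p_any

# At least two items: add the pairwise intersections and subtract the
# triple intersection twice, because it was counted three times.
p_at_least_two = p_suit_shirt + p_suit_tie + p_shirt_tie - 2 * p_all
p_exactly_one = p_any - p_at_least_two

print(round(p_none, 2), round(p_exactly_one, 2))  # 0.49 0.28
```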

Chapter Four
Conditional Probability and Independence
Outline
1 Conditional probability
2 Multiplication theorem
3 Bayes’ theorem
4 Total probability theorem
5 Independent events

Conditional Probability
• Sometimes the probability of an event depends on whether some other event has occurred
• Suppose event A depends on event B. The probability of A given that B has occurred is written P(A|B) and is defined as
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \quad P(B) > 0$$
• The probability that both A and B have occurred together is
P (A ∩ B) = P (A)P (B|A)
= P (B)P (A|B)
• If A and B are statistically independent, then
P (A ∩ B) = P (A)P (B)
i.e., P (A|B) = P (A) and P (B|A) = P (B)

• P (A|B), for fixed B, satisfies the various postulates of probability
1 0 ≤ P (A|B) ≤ 1
2 P(Ω|B) = 1. By the definition of conditional probability, we know that
$$P(\Omega \mid B) = \frac{P(\Omega \cap B)}{P(B)} = \frac{P(B)}{P(B)} = 1$$
3 If A ∩ C = ∅, then P(A ∪ C|B) = P(A|B) + P(C|B)
$$P(A \cup C \mid B) = \frac{P((A \cup C) \cap B)}{P(B)} = \frac{P((A \cap B) \cup (C \cap B))}{P(B)} = \frac{P(A \cap B) + P(C \cap B)}{P(B)} = P(A \mid B) + P(C \mid B)$$
4 If B = Ω, then P(A|Ω) = P(A)
$$P(A \mid \Omega) = \frac{P(A \cap \Omega)}{P(\Omega)} = \frac{P(A)}{1} = P(A)$$
• Example: Suppose that an office has 100 calculating machines. Some of these machines are electric
(E) while others are manual (M ). And some of the machines are new (N ) while others are used (U ).
        E     M     Total
N       40    30    70
U       20    10    30
Total   60    40    100
A person enters the office, picks a machine at random, and discovers that it is manual. What is the
probability that it is new?
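The calculating-machine example can be worked straight from the definition of conditional probability. The following is a minimal sketch, assuming each of the 100 machines is equally likely to be picked; the dictionary layout and names are illustrative only.

```python
from fractions import Fraction

# Counts from the table: rows are new (N) / used (U),
# columns are electric (E) / manual (M).
counts = {("N", "E"): 40, ("N", "M"): 30,
          ("U", "E"): 20, ("U", "M"): 10}
total = sum(counts.values())

# P(M): probability that a randomly picked machine is manual.
p_manual = Fraction(counts[("N", "M")] + counts[("U", "M")], total)
# P(N and M): probability that it is both new and manual.
p_new_and_manual = Fraction(counts[("N", "M")], total)

# Definition of conditional probability: P(N | M) = P(N and M) / P(M).
p_new_given_manual = p_new_and_manual / p_manual
print(p_new_given_manual)  # 3/4
```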
• Example: If P(A) = 0.5, P(B) = 0.6, and P(A ∩ B^c) = 0.4, compute
1 P(A ∩ B) and P(A|B)
2 P(A ∪ B^c)
3 P(B|A ∪ B^c)

• Example: A jar contains black and white marbles. Two marbles are chosen without replacement.
The probability of selecting a black marble and then a white marble is 0.34, and the probability of
selecting a black marble on the first draw is 0.47. What is the probability of selecting white marble
on the second draw, given that the first marble drawn was black?
• Example: The probability that it is Friday and that a student is absent is 0.03. Since there are 5
schooldays in a week, the probability that it is Friday is 0.2. What is the probability that a student is
absent given that today is Friday?
• Exercise: Suppose that we roll a pair of fair dice, so that each of the 36 possible outcomes is equally likely.
Let A denote the event that the first die lands on 3, and let C be the event that the sum of the dice is 7.
Are A and C independent?

Multiplicative, Bayes’ and Total Probability Theorems
• Multiplicative Theorem: Suppose we have n events, A1 , A2 , · · · , An , with $P\left(\bigcap_{i=1}^{n-1} A_i\right) > 0$. Then
$$P\left(\bigcap_{i=1}^{n} A_i\right) = P(A_n \mid A_1 \cap A_2 \cap \cdots \cap A_{n-1})\, P(A_{n-1} \mid A_1 \cap A_2 \cap \cdots \cap A_{n-2}) \cdots P(A_2 \mid A_1)\, P(A_1)$$
Suppose we have three events, say A, B and C, then
P (A ∩ B ∩ C) = P (A)P (B|A)P (C|A ∩ B)
• Example: An urn contains 10 identical balls, of which 5 are black, 3 are red and 2 are white. Four
balls are drawn one at a time without replacement. Find the probability that the first ball is black,
the second is red, the third is white and the fourth is black
$$P(B_1 \cap R_2 \cap W_3 \cap B_4) = P(B_1)\,P(R_2 \mid B_1)\,P(W_3 \mid B_1 \cap R_2)\,P(B_4 \mid B_1 \cap R_2 \cap W_3) = \frac{5}{10} \times \frac{3}{9} \times \frac{2}{8} \times \frac{4}{7} \approx 0.024$$
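The chain of conditional probabilities in the urn example can be reproduced step by step. This is a sketch under the example's assumptions (draws without replacement, each remaining ball equally likely); the names `urn` and `sequence` are introduced here for illustration.

```python
from fractions import Fraction

# Urn composition from the example: 5 black, 3 red, 2 white.
urn = {"black": 5, "red": 3, "white": 2}
sequence = ["black", "red", "white", "black"]

prob = Fraction(1)
remaining = dict(urn)
for colour in sequence:
    total_left = sum(remaining.values())
    # Conditional probability of this colour given the earlier draws.
    prob *= Fraction(remaining[colour], total_left)
    remaining[colour] -= 1  # the drawn ball is not replaced

print(prob)  # 1/42, which is about 0.024
```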

• Exercise: Consider a lot consisting of 20 defective and 80 non-defective items. If we choose two
items at random without replacement, what is the probability that both items are defective?
• The events B1 , B2 , · · · , Bk represent a partition of the sample space Ω if
1 Bi ∩ Bj = ∅, ∀ i ≠ j
2 ∪i Bi = Ω
3 P (Bi ) > 0, ∀i
• Let A be an event with respect to Ω and let B1 , B2 , · · · , Bk be a partition of Ω. Then
A = (A ∩ B1 ) ∪ (A ∩ B2) ∪ · · · ∪ (A ∩ Bk )
This implies that

P (A) = P (A ∩ B1 ) + P (A ∩ B2) + · · · + P (A ∩ Bk )

(Figure: the sample space Ω partitioned into regions B1 , B2 , · · · , B9 .)
• Total Probability Theorem: Let B1 , B2 , · · · , Bk be a partition of Ω. Then for any event A
$$P(A) = \sum_{i=1}^{k} P(A \mid B_i)\,P(B_i)$$

• Example: A certain item is manufactured by three factories, say 1, 2, and 3. It is known that 1 turns
out twice as many items as 2, and that 2 and 3 turn out the same number of items. It is also known
that 2% of the items produced by 1 and 2 are defective, while 4% of those manufactured by 3 are
defective. All the items produced are put into one stockpile and then one item is chosen at random.
What is the probability that this item is defective?
• Solution: Let us introduce the following events: A = {the item is defective},
B1 = {The item came from 1}, B2 = {The item came from 2}, and B3 = {The item came from 3}.
The required probability is
P (A) = P (A|B1 )P (B1 ) + P (A|B2 )P (B2 ) + P (A|B3 )P (B3 )
where P (B1 ) = 1/2, P (B2 ) = P (B3 ) = 1/4, P (A|B3 ) = 0.04 and P (A|B1 ) = P (A|B2 ) = 0.02.
Hence P(A) = (0.02)(0.5) + (0.02)(0.25) + (0.04)(0.25) = 0.025
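The three-factory calculation is a direct application of the total probability theorem. A minimal sketch follows; the dictionaries `priors` and `defect_rates` are names assumed here, keyed by factory number.

```python
# Priors P(B_i): factory 1 produces twice as much as factory 2,
# and factories 2 and 3 produce the same amount.
priors = {1: 0.5, 2: 0.25, 3: 0.25}
# Conditional defect rates P(A | B_i) from the example.
defect_rates = {1: 0.02, 2: 0.02, 3: 0.04}

# Total probability theorem: P(A) = sum over i of P(A | B_i) P(B_i).
p_defective = sum(defect_rates[i] * priors[i] for i in priors)
print(round(p_defective, 3))  # 0.025
```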

• Exercise: The proportions of motorists at a given gas station using regular unleaded, extra
unleaded, and premium unleaded gasoline over a specified period of time are 40%, 35% and 25%, respectively.
The respective proportions of these motorists who fill their tanks are 30%, 50%, and 60%. What is the
probability that a motorist selected at random from among the patrons of the gas station during
the specified period will fill his/her tank? Ans: 0.445
• Bayes’ Theorem: Let B1 , B2 , · · · , Bk be a partition of Ω and let A be an event with P(A) > 0. Then for any j = 1, 2, · · · , k
$$P(B_j \mid A) = \frac{P(A \mid B_j)\,P(B_j)}{\sum_{i=1}^{k} P(A \mid B_i)\,P(B_i)}$$
• Example: Consider the three-factory example above. Suppose that one item is chosen from the stockpile
and is found to be defective. What is the probability that it was produced in factory 1?
$$P(B_1 \mid A) = \frac{P(A \mid B_1)P(B_1)}{P(A \mid B_1)P(B_1) + P(A \mid B_2)P(B_2) + P(A \mid B_3)P(B_3)} = \frac{(0.02)(0.5)}{(0.02)(0.5) + (0.02)(0.25) + (0.04)(0.25)} = 0.40$$
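Bayes' theorem for the factory example can be sketched as a small function that returns the posterior probability of every factory at once. The function name `posterior` and its arguments are illustrative assumptions, not part of the course material.

```python
def posterior(priors, likelihoods):
    """Return P(B_j | A) for every j, given P(B_j) and P(A | B_j)."""
    # Denominator of Bayes' theorem: P(A) by the total probability theorem.
    evidence = sum(likelihoods[j] * priors[j] for j in priors)
    return {j: likelihoods[j] * priors[j] / evidence for j in priors}

priors = {1: 0.5, 2: 0.25, 3: 0.25}        # P(B_j)
likelihoods = {1: 0.02, 2: 0.02, 3: 0.04}  # P(A | B_j)

post = posterior(priors, likelihoods)
print({j: round(p, 2) for j, p in post.items()})  # {1: 0.4, 2: 0.2, 3: 0.4}
```

The posteriors sum to 1, which is a quick sanity check on any Bayes calculation.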
• Exercise: Suppose that the probability that both of a pair of twins are boys is 0.30 and that the
probability that they are both girls is 0.26. The probability of the first child being a boy is 0.52. What
is the probability that:
1 The second twin is a boy, given that the first is a boy?
2 The second twin is a girl, given that the first is a girl?
3 The second twin is a boy?
4 The first is a boy and the second is a girl?
• Exercise: Let Bi , i = 1, · · · , 5 be a partition of the sample space Ω and suppose that
$$P(B_i) = \frac{i}{15} \quad \text{and} \quad P(A \mid B_i) = \frac{5 - i}{15}, \quad i = 1, 2, \cdots, 5$$
Compute the probabilities P(Bi|A), i = 1, 2, · · · , 5

Chapter Five
One-Dimensional Random Variables
Outline
1 Random variable: definition and distribution function

2 Discrete random variables
3 Continuous random variables
4 Cumulative distribution function and its properties

Random Variable
Random Variable—Definition and distribution function
Random Variable
Given a random experiment with an outcome space Ω, a function X that assigns one and only one real
number X(ω) = x to each outcome ω in Ω is called a random variable
(Diagram: X maps each outcome ω to the real number X(ω).)
• The definition makes it clear that a random variable X is a function that maps each outcome
of a random experiment to a real number
— Random variable is a variable whose values are associated with chance

• Example: Consider the experiment of tossing a fair coin two times. The sample space is
Ω = {HH, HT, T H, T T }
Define the random variable X as follows: X is the number of heads. Hence, X(HH) = 2,
X(HT ) = 1 = X(T H) and X(T T ) = 0
• Solution: The possible values of the random variable X are 0, 1, 2.
$$P(X = 0) = P(\{TT\}) = \frac{1}{4}, \quad P(X = 1) = P(\{HT, TH\}) = \frac{2}{4}, \quad P(X = 2) = P(\{HH\}) = \frac{1}{4}$$
Based on the definition of X, P(X = 2) means the probability of getting two heads.
• The collection of pairs (xi ; P(xi)), i = 1, 2, · · · , is sometimes called the probability distribution of X.
The probability distribution for the example above is
X = xi 0 1 2
P (X = xi ) 0.25 0.50 0.25
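The table above can be reproduced by listing the sample space and applying X to each outcome. The sketch below assumes a fair coin, as in the example; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space of two tosses of a fair coin; all outcomes are equally likely.
outcomes = ["".join(t) for t in product("HT", repeat=2)]

# X counts the heads in each outcome.
counts = Counter(outcome.count("H") for outcome in outcomes)
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

for x, p in pmf.items():
    print(x, p)  # prints 0 1/4, then 1 1/2, then 2 1/4
```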

Discrete Random Variable
• Exercise: Roll a four-sided die twice, and let X equal the larger of the two outcomes if they are
different and the common value if they are the same. What are the possible values of X and the
corresponding probabilities?
• Random variables can be classified as discrete or continuous
Let X be a random variable. If the number of possible values of X is finite or countably infinite, we call
X a discrete random variable
• Let X be a discrete random variable. The probability mass function P(X = xi) must satisfy the
following conditions
1 $P(X = x_i) \ge 0$
2 $\sum_{i=1}^{\infty} P(X = x_i) = 1$
• Example: Consider the experiment of tossing a die two times. Let Y be the random variable which
denotes the absolute difference of the upturned faces. What is the probability mass function of Y ?
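One way to obtain the pmf of Y is to enumerate all outcomes and count. The sketch below assumes a fair six-sided die (the example does not state this explicitly) and uses illustrative variable names.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of two rolls of a fair six-sided die.
rolls = list(product(range(1, 7), repeat=2))

# Y is the absolute difference of the two upturned faces.
counts = Counter(abs(a - b) for a, b in rolls)
pmf_y = {y: Fraction(n, len(rolls)) for y, n in sorted(counts.items())}

for y, p in pmf_y.items():
    print(y, p)
```

The resulting probabilities are 6/36, 10/36, 8/36, 6/36, 4/36 and 2/36 for y = 0, 1, ..., 5 (printed in lowest terms), and they sum to 1 as a pmf must.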

• Example: Suppose the following is the probability distribution of Y
y      -1    0     1      2
P(y)   c     2c    0.5c   3c
What is the value of c?

• Exercise: Let the pmf of X be defined by
$$P(X = x) = \frac{1 + |x - 3|}{11}, \quad x = 1, 2, 3, 4, 5$$
Find
1 P (X > 2)
2 P (X < 1)
3 P (2 ≤ X < 4)
4 P (X ≥ 4|X ≥ 2)

• Exercise: For each of the following, determine the constant c so that f(x) satisfies the conditions
of being a pmf for a random variable X
1 $f(x) = x/c$, x = 1, 2, 3, 4
2 $f(x) = cx$, x = 1, 2, · · · , 10
3 $f(x) = c(0.25)^x$, x = 1, 2, 3, · · ·
4 $f(x) = \frac{c}{(x+1)(x+2)}$, x = 0, 1, 2, 3, · · ·
• Exercise: Let X be the number of accidents per week in a factory. Let the pmf of X be
$$f(x) = \frac{c}{(x + 1)(x + 2)}, \quad x = 0, 1, 2, 3, \cdots$$
Find the conditional probability of X ≥ 4, given that X ≥ 1

Continuous Random Variable
The random variable X is said to be a continuous random variable if there exists a function f, called the
probability density function (pdf) of X, satisfying the following conditions:
1 $f(x) \ge 0$, ∀x
2 $\int_{-\infty}^{\infty} f(x)\,dx = 1$
3 For any a and b with −∞ < a < b < ∞, we have
$$P(a \le X \le b) = P(a < X < b) = \int_a^b f(x)\,dx$$

• A random variable X is said to be continuous if it assumes any value in a given interval (a, b)
• For any single value a
$$P(X = a) = \int_a^a f(x)\,dx = 0$$
• As a consequence, including or excluding the endpoints of an interval has no effect on its probability
P (a < X < b) = P (a ≤ X < b) = P (a < X ≤ b) = P (a ≤ X ≤ b)
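• Example: Consider the following probability density function
$$f(x) = \begin{cases} c & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$
What is the value of c?
$$\int_{-\infty}^{\infty} f(x)\,dx = 1 \quad \Longrightarrow \quad \int_0^1 c\,dx = cx\,\Big|_0^1 = c(1 - 0) = 1$$
The value of c is 1.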
• Example: Suppose the pdf of X is
$$f(x) = \begin{cases} \frac{1}{5} e^{-x/5} & 0 \le x < \infty \\ 0 & \text{otherwise} \end{cases}$$
Show that f(x) is a pdf
$$\int_0^{\infty} \frac{1}{5} e^{-x/5}\,dx = -e^{-x/5}\,\Big|_0^{\infty} = 1$$
For any x ∈ [0, ∞), f(x) ≥ 0. The two conditions are satisfied. Therefore, f(x) is a pdf
• Exercise: The percentage of alcohol in a certain compound may be considered as a random variable,
where X, 0 < X < 1, has the following pdf :
$f(x) = 20x^3(1 - x)$, 0 < x < 1
Verify that this is a pdf

• Example: Let X be a continuous random variable with pdf given by
$$f(x) = \begin{cases} ax & 0 \le x < 1, \\ a & 1 \le x \le 2, \\ -ax + 3a & 2 \le x \le 3, \\ 0 & \text{otherwise} \end{cases}$$
Determine the constant a.

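• Solution:
$$\int_{-\infty}^{\infty} f(x)\,dx = \int_0^1 ax\,dx + \int_1^2 a\,dx + \int_2^3 (-ax + 3a)\,dx = 1$$
$$\frac{ax^2}{2}\,\Big|_0^1 + ax\,\Big|_1^2 + \left(-\frac{ax^2}{2} + 3ax\right)\Big|_2^3 = \frac{a}{2} + a + \frac{a}{2} = 2a = 1$$
The evaluation reveals that a = 1/2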
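The value of a can be cross-checked numerically. The sketch below is an illustration only: it integrates the piecewise density with a = 1 by a simple midpoint rule (a hand-rolled helper, not a library routine) and uses the fact that the density is proportional to a.

```python
def f(x, a):
    """Piecewise density from the example, with parameter a."""
    if 0 <= x < 1:
        return a * x
    if 1 <= x <= 2:
        return a
    if 2 < x <= 3:
        return -a * x + 3 * a
    return 0.0

def integrate(func, lo, hi, steps=30_000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(func(lo + (k + 0.5) * width) for k in range(steps)) * width

# Area under the curve when a = 1; the density scales linearly in a,
# so the normalising value is a = 1 / area.
area_for_unit_a = integrate(lambda x: f(x, 1.0), 0.0, 3.0)
a = 1.0 / area_for_unit_a
print(round(area_for_unit_a, 6), round(a, 6))
```

The computed area is close to 2, so a is close to 1/2, in agreement with the solution.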

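• Example: The diameter of an electric cable, say X, is assumed to be a continuous random variable
with pdf
f(x) = 6x(1 − x), 0 ≤ x ≤ 1
Compute the following probabilities
1 P(0 < X < 0.5)
2 P(X ≤ 0.5 | 1/3 < X < 2/3)
• Solution:
$$P(0 < X < 0.5) = \int_0^{0.5} 6x(1 - x)\,dx = x^2(3 - 2x)\,\Big|_0^{1/2} = \frac{1}{2}$$
$$P\left(X \le 0.5 \,\Big|\, \tfrac{1}{3} < X < \tfrac{2}{3}\right) = \frac{P\left(X \le 0.5 \,\cap\, \tfrac{1}{3} < X < \tfrac{2}{3}\right)}{P\left(\tfrac{1}{3} < X < \tfrac{2}{3}\right)} = \frac{P\left(\tfrac{1}{3} < X < \tfrac{1}{2}\right)}{P\left(\tfrac{1}{3} < X < \tfrac{2}{3}\right)}$$
$$P\left(\tfrac{1}{3} < X < \tfrac{1}{2}\right) = \int_{1/3}^{1/2} 6x(1 - x)\,dx = x^2(3 - 2x)\,\Big|_{1/3}^{1/2} = 0.2407$$
$$P\left(\tfrac{1}{3} < X < \tfrac{2}{3}\right) = \int_{1/3}^{2/3} 6x(1 - x)\,dx = x^2(3 - 2x)\,\Big|_{1/3}^{2/3} = 0.4815$$
Therefore,
$$P\left(X \le 0.5 \,\Big|\, \tfrac{1}{3} < X < \tfrac{2}{3}\right) = \frac{0.2407}{0.4815} \approx 0.5$$
The exact value is 1/2, since the unrounded probabilities are 13/54 and 13/27.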
• Exercise: Consider the pdf of the above example. Determine a number b such that
P (X < b) = 2P (X > b)
• Exercise: Suppose the random variable X has the following pdf
$$f(x) = \frac{1}{2a}, \quad -a \le x \le a$$
where a > 0. Whenever possible, determine a so that the following are satisfied
1 P(X > 1) = 1/3
2 P(X < 1/2) = 0.3
3 P(X > 1) = 0.5
4 P(|X| < 1) = P(|X| > 1)
• Exercise: The continuous random variable X has pdf
$f(x) = 3x^2$, −1 ≤ x ≤ 0
If b is a number satisfying −1 < b < 0, compute P(X > b | X < b/2)

• Exercise: Let X be the life length of an electronic device (measured in hours). Suppose that X is a
continuous random variable with pdf
$$f(x) = \frac{k}{x^n}, \quad 2{,}000 \le x \le 10{,}000$$
1 For n = 2, determine k
2 For n = 3, determine k
3 For general n, determine k
• Exercise: The claims submitted to an insurance company over a specified period of time t is a RV
with pdf
$$f(x) = \frac{c}{(1 + x)^4}, \quad x > 0, \; c > 0$$
1 Determine the constant c

2 Compute P (1 ≤ X ≤ 4)

Cumulative Distribution Function
Let X be a random variable, discrete or continuous. The cumulative distribution function (cdf) of X is
the function F defined by F(x) = P(X ≤ x)
• If X is a discrete random variable,
$$F(x) = \sum_{j} P(x_j)$$
where the sum is taken over all indices j satisfying xj ≤ x
• If X is a continuous random variable with pdf f,
$$F(x) = \int_{-\infty}^{x} f(s)\,ds$$
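• Example: Let X have the pmf
$$f(x) = \frac{x}{10}, \quad x = 1, 2, 3, 4$$
Solution: The cdf is
$$F(x) = \begin{cases} 0 & \text{if } x < 1, \\ \frac{1}{10} & \text{if } 1 \le x < 2, \\ \frac{3}{10} & \text{if } 2 \le x < 3, \\ \frac{6}{10} & \text{if } 3 \le x < 4, \\ 1 & \text{if } x \ge 4 \end{cases}$$
(Figure: step plot of F(x) against x, with x from 1.0 to 4.0 and F(x) from 0.0 to 1.0.)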
Note that
P (X ≤ 3) = P (X = 1) + P (X = 2) + P (X = 3)
The graph of the cdf of a discrete random variable is always a step function
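The step-function cdf in this example can be built by accumulating the pmf. The following is a minimal sketch; the names `support`, `pmf` and `cdf` are illustrative choices, not notation from the slides.

```python
from fractions import Fraction
from itertools import accumulate

# pmf from the example: f(x) = x / 10 for x = 1, 2, 3, 4.
support = [1, 2, 3, 4]
pmf = [Fraction(x, 10) for x in support]

# Running sums of the pmf give the cdf at each support point.
cdf_at_support = dict(zip(support, accumulate(pmf)))

def cdf(x):
    """F(x) = P(X <= x): sum the pmf over all support points not exceeding x."""
    return sum((p for s, p in zip(support, pmf) if s <= x), Fraction(0))

print(cdf_at_support)
print(cdf(0.5), cdf(3), cdf(3.7), cdf(10))  # 0 3/5 3/5 1
```

The value stays constant between support points (for instance F(3) = F(3.7) = 6/10), which is what produces the steps in the graph.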
• Exercise: Suppose that the random variable X assumes the three values 0, 1 and 2 with probabilities
1/3, 1/6 and 1/2, respectively. Derive the cdf
• Example: Suppose that X is a continuous random variable with pdf
$f(x) = 3x^2$, 0 ≤ x ≤ 1
• Solution: The cdf is
$$F(x) = \begin{cases} 0 & \text{if } x \le 0 \\ \int_0^x 3s^2\,ds = x^3 & \text{if } 0 < x \le 1 \\ 1 & \text{if } x > 1 \end{cases}$$
• Example: If the pdf of a RV X is
$f(x) = \lambda e^{-\lambda x}$, x > 0, λ > 0
Construct F(x)

• Exercise: Show that the following are probability density functions
1 $f_1(x) = e^{-x}$, x > 0
2 $f_2(x) = 2e^{-2x}$, x > 0
3 $f(x) = (\theta + 1) f_1(x) - \theta f_2(x)$, 0 < θ < 1
• Exercise: Prove or disprove: If f1(x) and f2(x) are pdfs and if θ1 + θ2 = 1, then θ1 f1(x) + θ2 f2(x)
is a pdf
• Exercise: Let
$f(x) = K e^{-\alpha x}(1 - e^{-\alpha x})$, x > 0
1 Find K such that f(·) is a density function
2 Find the corresponding cdf
3 Find P(X > 1)
• Exercise: Let
f(x) = θx + 0.5, −1 ≤ x ≤ 1
For what range of values of θ is f(x) a pdf?
• The function F is nondecreasing. i.e., if x1 ≤ x2 , we have F (x1 ) ≤ F (x2 )
(Figure: graph of the cdf below, with x from 0.0 to 1.5 and F(x) rising from 0.0 to 1.0.)
$$F(x) = \begin{cases} 0 & \text{if } x \le 0 \\ x^2 & \text{if } 0 < x \le 1 \\ 1 & \text{if } x > 1 \end{cases}$$


Cumulative Distribution Function—Properties
• $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to \infty} F(x) = 1$. These are usually written as F(−∞) = 0 and F(∞) = 1
• For the continuous case
$$F(-\infty) = \lim_{x \to -\infty} \int_{-\infty}^{x} f(s)\,ds = 0$$
$$F(\infty) = \lim_{x \to \infty} \int_{-\infty}^{x} f(s)\,ds = 1$$
• Example: Let $F(x) = x^2$, 0 < x ≤ 1
$$\lim_{x \to 0} x^2 = 0, \quad \lim_{x \to 1} x^2 = 1$$
• Exercise: Let
$F(x) = e^{3x}$, −∞ < x ≤ 0
Show that $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to 0} F(x) = 1$

• Let F be the cdf of a continuous random variable with pdf f. Then
$$f(x) = \frac{dF(x)}{dx}$$
for all x at which F is differentiable
• Example: Suppose that a continuous random variable has cdf F given by
$$F(x) = \begin{cases} 0 & \text{if } x \le 0 \\ 1 - e^{-x} & \text{if } x > 0 \end{cases}$$
• Solution: The pdf is f(x) = F′(x)
$f(x) = e^{-x}$, x ≥ 0
• Example: Determine the pdf f for each of the following cdfs. Verify also that f is a pdf
1 $F(x) = x/5$, 0 ≤ x ≤ 5
2 $F(x) = e^{3x}$, −∞ < x ≤ 0
• Let X be a discrete random variable with possible values x1 , x2 , · · · and suppose that it is possible to
label these values so that x1 < x2 < · · · . Let F be the cdf of X. Then
$$P(X = x_i) = F(x_i) - F(x_{i-1})$$
where, for the smallest value x1, the term F(x0) is taken to be 0
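• Example: Let X be a discrete random variable with cdf
$$F(x) = \begin{cases} 0 & \text{if } x < 0 \\ \frac{1}{3} & \text{if } 0 \le x < 1 \\ \frac{1}{2} & \text{if } 1 \le x < 2 \\ 1 & \text{if } x \ge 2 \end{cases}$$
We know that
$$P(X = 0) = \frac{1}{3} - 0 = \frac{1}{3}$$
$$P(X = 1) = F(1) - F(0) = \frac{1}{2} - \frac{1}{3} = \frac{1}{6}$$
$$P(X = 2) = F(2) - F(1) = 1 - \frac{1}{2} = \frac{1}{2}$$
The pmf is
x      0     1     2
P(x)   1/3   1/6   1/2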
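Recovering the pmf from a discrete cdf amounts to taking the size of each jump. The sketch below does this for the example above; the list names are illustrative assumptions.

```python
from fractions import Fraction

# Values of the cdf at its jump points, taken from the example.
jump_points = [0, 1, 2]
cdf_values = [Fraction(1, 3), Fraction(1, 2), Fraction(1)]

# P(X = x_i) = F(x_i) - F(x_{i-1}); the cdf is 0 below the first jump point.
previous = [Fraction(0)] + cdf_values[:-1]
pmf = {x: f_now - f_prev
       for x, f_now, f_prev in zip(jump_points, cdf_values, previous)}

for x, p in pmf.items():
    print(x, p)  # prints 0 1/3, then 1 1/6, then 2 1/2
```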
Chapter Six
Functions of Random Variables
Outline
1 Equivalent events
2 Functions of discrete random variables and their distributions
3 Functions of continuous random variables and their distributions

Equivalent Events
Let X be a random variable with range space RX and let Y = H(X) have range space RY . Suppose A is an
event associated with RY . Let B ⊂ RX be defined as follows:
B = {x ∈ RX : H(x) ∈ A}
If A and B are related in this way, then they are equivalent events
• Equivalent events have the same probability, i.e., P(A) = P(B)
• Example: Suppose H(x) = x^2 and X takes only nonnegative values. Then the events B : {X > √2} and C : {Y > 2} are equivalent.
P(Y > 2) = P(X^2 > 2) = P(X > √2)
• Example: Let X be a continuous random variable with pdf
$f(x) = 5e^{-5x}$, x > 0
Suppose H(x) = 2x + 3. Show that the events {Y > 5} and {X > 1} are equivalent, so that P(Y > 5) = P(X > 1).

Functions of Discrete Random Variables
Functions of Discrete Random Variables and their Distributions
• If X is a discrete random variable and Y = H(X), then it follows immediately that Y is also a
discrete random variable
• Example: Consider the probability distribution of X (# of heads) defined based on tossing a fair coin
two times
X 0 1 2
P (X) 0.25 0.50 0.25
Let Y = 2X + 1. The probability distribution of Y is
Y 1 3 5
P (Y ) 0.25 0.50 0.25
For instance, P(Y = 1) = P(2X + 1 = 1) = P(X = 0)
• Example: Suppose that the random variable X assumes the three values −1, 0 and 1 with
probabilities 1/3, 1/2 and 1/6, respectively. Let Y = X^2. Construct the probability distribution of Y.
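The distribution of Y = X² can be built by adding up the probabilities of all x values that map to the same y, which is the equivalent-event idea applied to a discrete variable. The sketch below is illustrative; `H` and `pmf_y` are names chosen here.

```python
from collections import defaultdict
from fractions import Fraction

# pmf of X from the example.
pmf_x = {-1: Fraction(1, 3), 0: Fraction(1, 2), 1: Fraction(1, 6)}

def H(x):
    """The transformation Y = H(X) = X^2."""
    return x ** 2

# P(Y = y) is the total probability of all x with H(x) = y.
pmf_y = defaultdict(Fraction)
for x, p in pmf_x.items():
    pmf_y[H(x)] += p

for y, p in sorted(pmf_y.items()):
    print(y, p)  # prints 0 1/2, then 1 1/2
```

Here x = −1 and x = 1 both map to y = 1, so P(Y = 1) = 1/3 + 1/6 = 1/2.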

• Exercise: Let X have possible values 1, 2, · · · , n, · · · and suppose that $P(X = n) = \frac{1}{2^n}$. Let
$$Y = \begin{cases} 1 & \text{if } X \text{ is even} \\ -1 & \text{if } X \text{ is odd} \end{cases}$$
Find P(Y = 1) and P(Y = −1). Hint: Use a geometric series

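• Note: X can be a continuous random variable while Y is discrete
• Example: Suppose the pdf of a random variable X is
$$f(x) = \frac{3}{4}\left(x^2 - \frac{1}{3}x^3\right), \quad 0 < x \le 2$$
Let Y = 1 if X ≤ 1 and Y = −1 if X > 1
• Solution:
$$P(Y = 1) = \int_0^1 \frac{3}{4}\left(x^2 - \frac{1}{3}x^3\right)dx = \left(\frac{1}{4}x^3 - \frac{1}{16}x^4\right)\Big|_0^1 = \frac{3}{16}$$
and therefore P(Y = −1) = 1 − 3/16 = 13/16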

Functions of Continuous Random Variables
Functions of Continuous Random Variables and their Distributions
• General Procedure to find pdf

1 Obtain G, the cdf of Y , where G(y) = P (Y ≤ y)
2 Differentiate G(y) with respect to y in order to obtain g(y)
3 Determine those values of y in the range space of Y for which g(y) > 0
• Example: Suppose that a continuous random variable X has pdf $f(x) = e^{-x}$, x > 0. Find the pdf
of Y = 3X + 1
• Solution:
$$G(y) = P(Y \le y) = P(3X + 1 \le y) = P\left(X \le \frac{y - 1}{3}\right) = \int_0^{(y-1)/3} e^{-x}\,dx = -e^{-x}\,\Big|_0^{(y-1)/3} = 1 - e^{-(y-1)/3}$$
$$g(y) = \frac{d}{dy}\left(1 - e^{-(y-1)/3}\right) = \frac{1}{3}e^{-(y-1)/3}$$
• The pdf of Y is
$$g(y) = \frac{1}{3}e^{-(y-1)/3}, \quad y > 1$$
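The cdf method can be sanity-checked numerically: the slope of G at any y > 1 should match g(y). The sketch below uses a central finite difference; the step size `h` and the test points are arbitrary choices made here for illustration.

```python
import math

def G(y):
    """cdf of Y = 3X + 1 from step 1: 1 - exp(-(y - 1) / 3) for y > 1, else 0."""
    return 1.0 - math.exp(-(y - 1.0) / 3.0) if y > 1.0 else 0.0

def g(y):
    """pdf of Y from step 2, obtained by differentiating G."""
    return math.exp(-(y - 1.0) / 3.0) / 3.0 if y > 1.0 else 0.0

h = 1e-6
for y in (1.5, 2.0, 4.0, 10.0):
    numeric_slope = (G(y + h) - G(y - h)) / (2 * h)  # central difference
    print(y, round(numeric_slope, 5), round(g(y), 5))
```

The two printed columns should agree closely at every test point, which is consistent with g being the derivative of G.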
• Example: Suppose that the pdf of a random variable X is
$f(x) = \frac{1}{2}$, 1 ≤ x ≤ 3
Derive the pdf of $Y = e^X$
• Example: Suppose that the pdf of a random variable X is
$f(x) = \frac{1}{2}$, −1 ≤ x ≤ 1
Derive the pdf of $Y = X^2$

• Exercise: Suppose that the pdf of a random variable X is
f(x) = 1, x ∈ (0, 1)
Find the pdf of the following random variables:
1 $Y = X^2 + 1$
2 $Y = \frac{1}{X + 1}$
• Theorem: Let X be a continuous random variable with pdf f, where f(x) > 0 for a < x < b.
Suppose that y = H(x) is a strictly monotone (increasing or decreasing) function of x. Assume that
this function is differentiable for all x. Then the random variable Y = H(X) has a pdf g given by
$$g(y) = f(x)\left|\frac{dx}{dy}\right|$$
where x is expressed in terms of y

• Example: Suppose that the random variable X has the following pdf
f(x) = 2x, 0 < x < 1
and y = 3x + 1. Then x = (y − 1)/3 and dx/dy = 1/3, so
$$g(y) = f(x)\left|\frac{dx}{dy}\right| = 2\left(\frac{y - 1}{3}\right)\frac{1}{3} = \frac{2}{9}(y - 1), \quad 1 < y < 4$$
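The change-of-variable formula can be checked on this example by confirming that the resulting g integrates to 1 over (1, 4) and matches the closed form (2/9)(y − 1). The sketch below uses a hand-written midpoint rule as an assumed helper rather than any library routine.

```python
def f(x):
    """pdf of X from the example: f(x) = 2x on (0, 1)."""
    return 2.0 * x if 0.0 < x < 1.0 else 0.0

def g(y):
    """pdf of Y = 3X + 1 via g(y) = f(x) |dx/dy| with x = (y - 1) / 3."""
    x = (y - 1.0) / 3.0
    return f(x) * abs(1.0 / 3.0)

def integrate(func, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(func(lo + (k + 0.5) * width) for k in range(steps)) * width

total_probability = integrate(g, 1.0, 4.0)  # should be close to 1
closed_form_at_2_5 = 2.0 / 9.0 * (2.5 - 1.0)
print(round(total_probability, 6), round(g(2.5), 6), round(closed_form_at_2_5, 6))
```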
• Example: Suppose that the pdf of X is
$f(x) = \frac{1}{2}$, −1 < x < 1
Find the pdf of the following random variables:
1 Y = sin((π/2)X)
2 Y = cos((π/2)X)

Chapter Seven
Two Dimensional Random Variables
Yabebal Ayalew
Statistics Department, Addis Ababa University(AAU)
Office: Freshman building, Room 115 email: yabebala@gmail.com
Course Title: Probability and Statistics for Engineers Course Code: Stat2171
Semester: I Credit Hour: 5 ECTS
