
Probability and Statistics
Stage 2. Statistical Measures

Week 1
Lesson 1. Statistical Measures

Statistical measures are intended to "summarize" the information of the "sample" in order to have a better knowledge of the population.

They are classified into:

⮚ Measures of Central Tendency
⮚ Measures of Dispersion

Measures of Central Tendency. These measures correspond to values that are generally located in the central part of a data set, so they allow the data to be analyzed around a central value. They include the arithmetic mean, the mode and the median.

Measures of Dispersion. They measure the degree of dispersion of the values of the variable; that is, they indicate the extent to which the data differ from each other.
Lesson 2. Statistical measures in ungrouped
data. Measures of central tendency.
Mode (Mo): The value of the variable with the highest frequency. It is denoted Mo.

If the data set has one mode, it is called unimodal.
If it has two modes, it is called bimodal.
If it has more than two modes, it is called multimodal.
If it has no mode, it is called amodal.

Example 2.1
The owner of a veterinary clinic set out to find which breeds of dogs are seen in his clinic over the course of a year. Identify the trend.

Breed              fi
Chihuahua          33
German Shepherd    39
Bulldog            16
Pug                 9
Poodle             12
Chow               29
Doberman           25
Dalmatian           7
Total             170

Solution:
The most frequently seen breed was the German Shepherd, so the mode is Mo = German Shepherd.
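The mode of the breed table in Example 2.1 can be checked with a short script. This is a minimal sketch: the names `breed_counts`, `modes` and `classify` are illustrative, and treating "every value has the same frequency" as the amodal case is an assumption, since the lesson does not spell out that rule.

```python
from collections import Counter

# Frequencies from Example 2.1 (breed -> dogs seen during the year).
breed_counts = Counter({
    "Chihuahua": 33,
    "German Shepherd": 39,
    "Bulldog": 16,
    "Pug": 9,
    "Poodle": 12,
    "Chow": 29,
    "Doberman": 25,
    "Dalmatian": 7,
})


def modes(freq):
    """Return every value whose frequency equals the highest frequency."""
    top = max(freq.values())
    return [value for value, count in freq.items() if count == top]


def classify(freq):
    """Label the data set by its number of modes.

    Assumption: if all values share the same frequency, the set is amodal.
    """
    found = modes(freq)
    if len(freq) > 1 and len(found) == len(freq):
        return "amodal"
    return {1: "unimodal", 2: "bimodal"}.get(len(found), "multimodal")


print(modes(breed_counts), classify(breed_counts))
```

Because the function returns a list, the same code also covers bimodal and multimodal data sets.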
The arithmetic mean: Also called the average. It is represented by the symbol $\bar{x}$ for a sample.

If the data come from a frequency table, the mean is obtained with the formula:

$$\bar{x} = \frac{\sum f x}{n}$$

where $\sum f x$ indicates the sum of the products of each value with its respective frequency, and $n$ is the total number of data.

Example 2.4
Obtain the arithmetic mean of the grades obtained in mathematics by a group of 43 high school students.

Grades (x)   Students (f)
 5            2
 6            3
 7            6
 8           15
 9           12
10            5
Total        43

Solution:
Add a column of products $f x$.

Grades (x)   Students (f)   f x
 5            2              10
 6            3              18
 7            6              42
 8           15             120
 9           12             108
10            5              50
Total        43             348

The arithmetic mean is: $\bar{x} = \frac{348}{43} \approx 8.09$
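The frequency-table mean of Example 2.4 can be verified numerically. The sketch below assumes the formula "sum of products divided by total frequency"; the function name `frequency_mean` is illustrative.

```python
# Grades (x) and number of students (f) from Example 2.4.
grades = [5, 6, 7, 8, 9, 10]
students = [2, 3, 6, 15, 12, 5]


def frequency_mean(values, freqs):
    """Mean of a frequency table: sum of f*x divided by the total frequency."""
    total_products = sum(x * f for x, f in zip(values, freqs))
    n = sum(freqs)
    return total_products / n


mean = frequency_mean(grades, students)
print(round(mean, 2))  # 348 / 43, rounded to two decimals
```

The products add up to 348 and the frequencies to 43, so the quotient is approximately 8.09.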


The Median: It is represented by the symbol $\tilde{x}$ for a sample.

For qualitative data, the data are ordered according to their ordinal level and the median is the central value.

For quantitative data, the data are sorted in ascending order and the median is the central value; if the number of data is even, the median is the average of the two central values.

Example 2.5
In an evaluation, the health status of patients admitted to a health clinic was rated as healthy (H), mild (Md), moderate (M) and severe (S). In an inspection of 20 patients, the following health statuses were determined: 4 healthy (H), 6 mild (Md), 6 moderate (M) and 4 severe (S).

Solution:
The results are on an ordinal scale where healthy (H) is the lowest level and severe (S) is the highest. We order the data:

H H H H Md Md Md Md Md Md M M M M M M S S S S

The values in the middle of the sample are located at positions 10 and 11 and are Md and M. Because ordinal categories cannot be averaged, the median health status of the 20 patients lies between mild (Md) and moderate (M).
Example 2.6
Two students surveyed their classmates about the hours they spend studying during the day.

2 1 4 3 1 2 4 1 2 3 4 3 2 3 2 1 3 4 4 3

What is the median number of study hours of the surveyed students?

Solution:
The results are sorted:

1 1 1 1 2 2 2 2 2 3 3 3 3 3 3 4 4 4 4 4

The values in the middle of the sample are located at positions 10 and 11 and are 3 and 3.
Therefore, the median number of study hours is: $\tilde{x} = \frac{3 + 3}{2} = 3$ hours.
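The median of Example 2.6 can be reproduced by sorting the data and averaging the two central values. This is a minimal sketch with an illustrative function name; the standard library's `statistics.median` gives the same result and is used here only as a cross-check.

```python
import statistics

# Study hours reported in Example 2.6.
hours = [2, 1, 4, 3, 1, 2, 4, 1, 2, 3, 4, 3, 2, 3, 2, 1, 3, 4, 4, 3]


def median(data):
    """Central value of the sorted data; average of the two central values if n is even."""
    ordered = sorted(data)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(sorted(hours))
print(median(hours), statistics.median(hours))
```

With 20 values, the central positions are 10 and 11 (indices 9 and 10 in Python), and both hold the value 3.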
Week 2
Lesson 3. Statistical measures in grouped
data. Measures of central tendency.
Mode (Mo): It is denoted Mo.

In grouped data, we choose the class with the highest frequency and determine the mode with the formula:

$$Mo = L_i + \frac{\Delta_1}{\Delta_1 + \Delta_2} \cdot A$$

Where:
$L_i$ - Lower limit of the modal class
$\Delta_1$ - Modal frequency minus the previous frequency
$\Delta_2$ - Modal frequency minus the following frequency
$A$ - Class width

Example 2.8
The following distribution shows the heights of 35 randomly selected students.

Height (m)     Students
1.25 – 1.30    2
1.30 – 1.35    3
1.35 – 1.40    3
1.40 – 1.45    5
1.45 – 1.50    4
1.50 – 1.55    6
1.55 – 1.60    7
1.60 – 1.65    5
Total          35

Solution: The highest frequency is 7, where:
$L_i = 1.55$, $\Delta_1 = 7 - 6 = 1$, $\Delta_2 = 7 - 5 = 2$, $A = 0.05$

Substitute:

$$Mo = 1.55 + \frac{1}{1 + 2}(0.05) \approx 1.567 \text{ m}$$
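The grouped-data mode of Example 2.8 can be sketched in code. The formula used here (lower limit plus the ratio of the frequency differences times the class width) is the standard one implied by the four terms listed above; it is an assumption insofar as the formula itself is not legible in the source. The function name is illustrative.

```python
# Class intervals (lower, upper) in metres and frequencies from Example 2.8.
classes = [(1.25, 1.30), (1.30, 1.35), (1.35, 1.40), (1.40, 1.45),
           (1.45, 1.50), (1.50, 1.55), (1.55, 1.60), (1.60, 1.65)]
freqs = [2, 3, 3, 5, 4, 6, 7, 5]


def grouped_mode(classes, freqs):
    """Mode of grouped data: L + d1 / (d1 + d2) * width (assumed standard formula)."""
    i = freqs.index(max(freqs))              # first class with the highest frequency
    lower, upper = classes[i]
    prev_f = freqs[i - 1] if i > 0 else 0    # frequency of the previous class
    next_f = freqs[i + 1] if i < len(freqs) - 1 else 0  # frequency of the following class
    d1 = freqs[i] - prev_f
    d2 = freqs[i] - next_f
    return lower + d1 / (d1 + d2) * (upper - lower)


print(round(grouped_mode(classes, freqs), 3))
```

Note that this sketch does not handle ties for the highest frequency, nor the case where both differences are zero.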
The arithmetic mean: Also called the average. It is represented by the symbol $\bar{x}$ for a sample.

For grouped data, the absolute frequencies and the class mark are considered. The arithmetic mean is obtained with the formula:

$$\bar{x} = \frac{\sum f x}{n}$$

where $\sum f x$ indicates the sum of the products of each class mark $x$ with its respective frequency $f$, and $n$ is the total number of data.

Example 2.9
Obtain the arithmetic mean of the following distribution showing the heights of 35 randomly chosen students.

Height (m)     Students
1.25 – 1.30    2
1.30 – 1.35    3
1.35 – 1.40    3
1.40 – 1.45    5
1.45 – 1.50    4
1.50 – 1.55    6
1.55 – 1.60    7
1.60 – 1.65    5
Total          35

Solution:
Add a column with the class marks $x$ and another one with the products $f x$.

Height (m)     Students (f)   Class mark (x)   f x
1.25 – 1.30    2              1.275             2.55
1.30 – 1.35    3              1.325             3.975
1.35 – 1.40    3              1.375             4.125
1.40 – 1.45    5              1.425             7.125
1.45 – 1.50    4              1.475             5.9
1.50 – 1.55    6              1.525             9.15
1.55 – 1.60    7              1.575            11.025
1.60 – 1.65    5              1.625             8.125
Total          35                              51.975

The arithmetic mean is: $\bar{x} = \frac{51.975}{35} = 1.485$ m
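The grouped mean of Example 2.9 can be checked by computing each class mark as the midpoint of its interval and weighting it by the class frequency. A minimal sketch, with illustrative names:

```python
# Class intervals (lower, upper) in metres and frequencies from Example 2.9.
classes = [(1.25, 1.30), (1.30, 1.35), (1.35, 1.40), (1.40, 1.45),
           (1.45, 1.50), (1.50, 1.55), (1.55, 1.60), (1.60, 1.65)]
freqs = [2, 3, 3, 5, 4, 6, 7, 5]


def grouped_mean(classes, freqs):
    """Mean of grouped data: sum of (class mark * frequency) / total frequency."""
    marks = [(lower + upper) / 2 for lower, upper in classes]  # class marks
    total_products = sum(x * f for x, f in zip(marks, freqs))
    return total_products / sum(freqs)


print(round(grouped_mean(classes, freqs), 3))
```

The products add up to 51.975, and dividing by the 35 students gives a mean height of 1.485 m.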


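The Median: Represented by the symbol $\tilde{x}$ for a sample.

For grouped data, the median is obtained with the formula:

$$\tilde{x} = L_i + \frac{\frac{n}{2} - F_{i-1}}{f_i} \cdot A$$

Where:
$L_i$ = Lower limit of the interval containing the median.
$f_i$ = Frequency of the interval containing the median.
$F_{i-1}$ = The sum of the frequencies before the interval containing the median.
$A$ = Class width.
$n$ = Total number of data.

Example 2.10
Continuing with the previous example, calculate the median:

Height (m)     Students   Cumulative frequency
1.25 – 1.30    2           2
1.30 – 1.35    3           5
1.35 – 1.40    3           8
1.40 – 1.45    5          13
1.45 – 1.50    4          17
1.50 – 1.55    6          23
1.55 – 1.60    7          30
1.60 – 1.65    5          35
Total          35

Solution:
Since $\frac{n}{2} = 17.5$, the median lies in the interval 1.50 – 1.55, where $L_i = 1.50$, $f_i = 6$, $F_{i-1} = 17$, $A = 0.05$ and $n = 35$. Substituting:

$$\tilde{x} = 1.50 + \frac{17.5 - 17}{6}(0.05) \approx 1.504 \text{ m}$$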
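The grouped median of Example 2.10 can be sketched by walking through the cumulative frequencies until half of the data is reached. The interpolation formula below is the standard one matching the five terms listed above; it is an assumption, since the formula itself is not legible in the source. Names are illustrative.

```python
# Class intervals (lower, upper) in metres and frequencies from Example 2.10.
classes = [(1.25, 1.30), (1.30, 1.35), (1.35, 1.40), (1.40, 1.45),
           (1.45, 1.50), (1.50, 1.55), (1.55, 1.60), (1.60, 1.65)]
freqs = [2, 3, 3, 5, 4, 6, 7, 5]


def grouped_median(classes, freqs):
    """Median of grouped data: L + (n/2 - F_before) / f * width (assumed standard formula)."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("no data")
    half = n / 2
    cumulative = 0  # sum of the frequencies before the current class
    for (lower, upper), f in zip(classes, freqs):
        if cumulative + f >= half:
            return lower + (half - cumulative) / f * (upper - lower)
        cumulative += f


print(round(grouped_median(classes, freqs), 3))
```

For these data the loop stops at the class 1.50 – 1.55 with 17 observations before it, which matches the values 1.50, 6, 17, 0.05 and 35 given in the example.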
Week 3

Lesson 4. Statistical measures in ungrouped
data. Measures of variation.

The dispersion measures to be calculated are:

a) Range
b) Mean deviation
c) Variance
d) Standard deviation
e) Coefficient of variation
Range
The range (R), or spread, shows the amplitude of the data.

Formula:
R = largest value - smallest value

Example 2.13
The data show the values of 10 personal loans, in pesos, at one finance company:
2100, 2100, 2300, 2400, 2400, 2500, 2600, 2900, 2900, 3000.
What is the range of these data?

Solution:
With the data ordered, calculate the range:
R = largest value - smallest value = 3000 - 2100 = 900
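The range of Example 2.13 takes one line of code. A minimal sketch with an illustrative function name:

```python
# Personal loan amounts in pesos from Example 2.13.
loans = [2100, 2100, 2300, 2400, 2400, 2500, 2600, 2900, 2900, 3000]


def data_range(data):
    """Range: largest value minus smallest value (no sorting needed)."""
    return max(data) - min(data)


print(data_range(loans))
```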

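Mean deviation
The Mean Deviation (MD) is defined as the average of the absolute differences between each data value and the mean.

Formula:

$$MD = \frac{\sum |x_i - \bar{x}|}{n}$$

Example 2.14
For the data 15, 18, 19, 20, determine the mean deviation.

Solution:
Obtain the mean of the data:

$$\bar{x} = \frac{15 + 18 + 19 + 20}{4} = 18$$

Obtain the mean deviation of the data set:

$$MD = \frac{|15 - 18| + |18 - 18| + |19 - 18| + |20 - 18|}{4} = \frac{6}{4} = 1.5$$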
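The mean deviation of Example 2.14 follows directly from its definition as the average absolute difference from the mean. A minimal sketch with an illustrative function name:

```python
# Data from Example 2.14.
data = [15, 18, 19, 20]


def mean_deviation(values):
    """Average of the absolute differences between each value and the mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


print(mean_deviation(data))
```

The mean is 18, the absolute differences are 3, 0, 1 and 2, and their average is 1.5.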
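Variance

For a population:

$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$

Standard deviation: $\sigma = \sqrt{\sigma^2}$

Where:
$x_i$ is each data value
$\mu$ is the mean of the population
$N$ is the total number of data in the population

For a sample:

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$

Standard deviation: $s = \sqrt{s^2}$

Where:
$x_i$ is each data value
$\bar{x}$ is the mean of the sample
$n$ is the total number of data in the sample

Example
Calculate the variance and standard deviation of the following samples:

a) Sample A: 15, 16, 17.
Solution: Obtain the sample mean:
$$\bar{x} = \frac{15 + 16 + 17}{3} = 16$$
Calculate the variance of the sample:
$$s^2 = \frac{(15 - 16)^2 + (16 - 16)^2 + (17 - 16)^2}{3 - 1} = \frac{2}{2} = 1$$
Calculate the standard deviation of the sample:
$$s = \sqrt{1} = 1$$

b) Sample B: 13, 17, 18
Solution: Obtain the sample mean:
$$\bar{x} = \frac{13 + 17 + 18}{3} = 16$$
Calculate the variance of the sample:
$$s^2 = \frac{(13 - 16)^2 + (17 - 16)^2 + (18 - 16)^2}{3 - 1} = \frac{14}{2} = 7$$
Calculate the standard deviation of the sample:
$$s = \sqrt{7} \approx 2.65$$

Conclusion: Sample A, with three data, has a mean of 16; its standard deviation of 1 means that its values lie, on average, about 1 unit away from the mean. Sample B also has a mean of 16; its standard deviation of 2.65 means that its values lie, on average, about 2.65 units away from the mean. Sample A is better because it has less variability.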
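The variance and standard deviation of samples A and B can be checked with a short script. This sketch assumes the sample formulas divide by n - 1 (which is what makes sample A's deviation equal to 1 and sample B's about 2.65); the function names are illustrative, and `statistics.variance` from the standard library is printed as a cross-check.

```python
import math
import statistics

sample_a = [15, 16, 17]
sample_b = [13, 17, 18]


def sample_variance(data):
    """Sum of squared differences from the mean, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def population_variance(data):
    """Same sum of squares, divided by N (all data of the population)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / n


for name, sample in (("A", sample_a), ("B", sample_b)):
    var = sample_variance(sample)
    print(name, var, round(math.sqrt(var), 2), statistics.variance(sample))
```

Both samples share the mean 16, so the difference in spread shows up only in the variance (1 versus 7).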
Coefficient of variation
It is a measure of the relative variability of a sample, obtained as the quotient of the standard deviation and the mean. It is represented by CV.

$$CV = \frac{s}{\bar{x}}$$

Where:
$s$ is the standard deviation
$\bar{x}$ is the mean

Example
Calculate the coefficient of variation of the samples in the above example:

a) Sample A: 15, 16, 17.
Solution: Summarize: Mean: $\bar{x} = 16$. Standard deviation: $s = 1$.
Then the coefficient of variation CV is:
$$CV = \frac{1}{16} = 0.0625$$

b) Sample B: 13, 17, 18
Solution: Summarize: Mean: $\bar{x} = 16$. Standard deviation: $s = \sqrt{7} \approx 2.65$.
The coefficient of variation CV is:
$$CV = \frac{\sqrt{7}}{16} \approx 0.165$$

Conclusion: The data of sample A are more homogeneous, because its coefficient of variation is smaller.
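The comparison of samples A and B by coefficient of variation can be sketched with the standard library. The function name is illustrative, and the sketch assumes the sample standard deviation (`statistics.stdev`, which divides by n - 1) is the one intended.

```python
import statistics

sample_a = [15, 16, 17]
sample_b = [13, 17, 18]


def coefficient_of_variation(data):
    """Sample standard deviation divided by the mean (assumes a nonzero mean)."""
    return statistics.stdev(data) / statistics.mean(data)


cv_a = coefficient_of_variation(sample_a)
cv_b = coefficient_of_variation(sample_b)
print(round(cv_a, 4), round(cv_b, 4))
print("A is more homogeneous" if cv_a < cv_b else "B is more homogeneous")
```

Because the CV is a ratio, it has no units, which is what allows samples measured on different scales to be compared.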
Lesson 5. Statistical measures in grouped
data. Measures of variation.
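Variance and standard deviation

For a population:

$$\sigma^2 = \frac{\sum f_i (x_i - \mu)^2}{N}$$

Standard deviation: $\sigma = \sqrt{\sigma^2}$

Where:
$x_i$ is the class mark
$\mu$ is the population mean
$f_i$ is the frequency of each class
$N$ is the total number of data in the population

For a sample:

$$s^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n - 1}$$

Standard deviation: $s = \sqrt{s^2}$

Where:
$x_i$ is the class mark
$\bar{x}$ is the sample mean
$f_i$ is the frequency of each class
$n$ is the total number of data in the sample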
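Example 2.15
Determine the variance, standard deviation and coefficient of variation of the grouped data of a sample:

Class interval   f
38 - 42           3
43 - 47           7
48 - 52          17
53 - 57          13
58 - 62           6
63 - 67           4
Total            50

Solution
1. Calculate the class mark $x$ for each interval.
2. Calculate the sample mean $\bar{x}$.
3. Calculate the distance of each class mark from the mean: $x - \bar{x}$.
4. Square these distances: $(x - \bar{x})^2$.
5. Finally, multiply the squared distances by the frequencies: $f (x - \bar{x})^2$.

Interval   f    x    f x    x - x̄    (x - x̄)²   f (x - x̄)²
38 - 42     3   40    120   -12.4    153.76      461.28
43 - 47     7   45    315    -7.4     54.76      383.32
48 - 52    17   50    850    -2.4      5.76       97.92
53 - 57    13   55    715     2.6      6.76       87.88
58 - 62     6   60    360     7.6     57.76      346.56
63 - 67     4   65    260    12.6    158.76      635.04
Total      50        2620                       2012.00

Sample mean: $\bar{x} = \frac{2620}{50} = 52.4$
Variance: $s^2 = \frac{2012}{50 - 1} \approx 41.06$
Standard deviation: $s = \sqrt{41.06} \approx 6.41$
Coefficient of variation: $CV = \frac{6.41}{52.4} \approx 0.122$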
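The five steps of Example 2.15 can be sketched as a short script. It assumes the class mark is the midpoint of the stated limits (for example, 40 for 38 - 42) and that the sample formula divides by n - 1; the function and variable names are illustrative.

```python
import math

# Class limits and frequencies from Example 2.15.
intervals = [(38, 42), (43, 47), (48, 52), (53, 57), (58, 62), (63, 67)]
freqs = [3, 7, 17, 13, 6, 4]


def grouped_sample_stats(intervals, freqs):
    """Return (mean, variance, standard deviation, CV) for grouped sample data."""
    n = sum(freqs)
    marks = [(lower + upper) / 2 for lower, upper in intervals]            # step 1
    mean = sum(f * x for f, x in zip(freqs, marks)) / n                    # step 2
    weighted_squares = sum(f * (x - mean) ** 2                             # steps 3 to 5
                           for f, x in zip(freqs, marks))
    variance = weighted_squares / (n - 1)
    std = math.sqrt(variance)
    return mean, variance, std, std / mean


mean, variance, std, cv = grouped_sample_stats(intervals, freqs)
print(round(mean, 2), round(variance, 2), round(std, 2), round(cv, 3))
```

For a population, the only change would be dividing the weighted sum of squares by N instead of n - 1.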
 
 
Bibliography:
García, O., & Gutiérrez, J. (2019). Etapa 2: Medidas estadísticas. In Probabilidad y Estadística (pp. 70-122). Monterrey, México: Patria Educación.
