
Chapter I

STATISTICS
Prepared by:
Larry Jay B. Valero, LPT
Statistics
• The word "statistics" is derived from the Latin word "status", meaning state.
• Statistics is a collection of quantitative data, such as statistics of crimes, statistics of enrolment, and statistics of unemployment.
• Statistics is also the study of how to collect, organize, analyze, and interpret numerical information from data.
Two Kinds of Statistics
1. Descriptive Statistics
2. Inferential Statistics
Descriptive Statistics
• Methods concerned with the collection, description, and analysis of a set of data without drawing conclusions or inferences about a larger set.
Example:
1. A bowler wants to find his bowling average for the past 12 games.
2. A housewife wants to determine the average weekly amount she spent on groceries in the past 3 months.
3. A politician wants to know the exact number of votes he received in the last election.
4. The Surgeon General studies the relationship between cigarette smoking and heart disease.
Inferential Statistics
• Methods concerned with making predictions or inferences about a larger set of data using only the information gathered from a subset of this larger set.
Example:
1. A bowler wants to estimate his chance of winning a game based on his current season averages and the averages of his opponents.
2. A housewife wants to predict, based on last year's grocery bills, the average weekly amount she will spend on groceries this year.
3. A politician would like to estimate, based on an opinion poll, his chance of winning in the upcoming election.
4. As a result of recent cutbacks by the oil-producing nations, we can expect the price of gasoline to double in the next year.
POPULATION VS. SAMPLE
Population
• The collection of all elements under consideration in a statistical study.
• A population data set contains all members of a specified group (the entire list of possible values).
• The totality of all the observations.
• Example:
All people living in the Philippines.
All students in CvSU.
Sample
• A sample data set contains a part, or a subset, of a population.
• The size of a sample is at most the size of the population from which it is taken, and in practice it is usually much smaller.
• Example:
Some people living in the Philippines.
Some students in CvSU.
DATA COLLECTION METHODS
Direct or Interview Method
• A person-to-person encounter between the interviewer and the interviewee.
• Interviewer: the one who gathers the information.
• Interviewee: the source of the information.
• An interview can be done in person, by phone, or over the internet.
Indirect or Questionnaire Method
• A technique in which a questionnaire is used to elicit the information or data needed.
• The questionnaire consists of questions printed or typed in a definite order on a form or a set of forms.
Registration Method
• Obtains data from the records of the government agency that is authorized by law to keep such data and to make them available to researchers.
Example:
• Birth and death rates: National Statistics Office (NSO)
• Number of registered cars: Land Transportation Office (LTO)
• List of registered voters: Commission on Elections (COMELEC)
Observation Method
• A technique used when the data, particularly those pertaining to the behaviour of an individual or a group of individuals in a given situation, are best described through observation.
Example:
• Observing children's behaviour
• Observing customers' movements
• Observing the traffic count
Experimental Method
• A system used to gather data from the results of a series of experiments performed on some controlled and experimental variables. This is commonly used in scientific inquiries.
LEVELS OF MEASUREMENT
1. Nominal
• Classificatory (naming) scale.
• Characterized by data that consist of names, labels, or categories only.
• Example:
name
civil status
gender
religion
address
degree program
2. Ordinal
• Ranking scale.
• The ordinal level of measurement has the properties of the nominal level and, in addition, the numbers assigned to the categories of a variable may be ranked or ordered from low to high.
• Example:
Military rank
Job position
Year level
Teaching ratings
T-shirt size
3. Interval
• It is like the ordinal level, with the additional property that meaningful differences between data values can be determined.
• An interval scale must have a common and constant unit of measurement. Furthermore, the unit of measurement is arbitrary and there is no "true zero" point.
• Example:
Temperature (in degrees Celsius or Fahrenheit)
IQ score
SAT score
4. Ratio
• The ratio level of measurement has all the properties of the interval level and, in addition, it has a "true zero" point.
• Example:
Distance
Weight
Height
Weekly allowance
DIFFERENT WAYS OR FORMS TO PRESENT DATA
Textual Form
• Makes use of words, sentences, and paragraphs in the presentation.
• It is commonly used when there are only a few numerical data to be enumerated or compared with other data.
Tabular Form
• A systematic presentation of data in rows and columns.
• It is used when related numerical facts need to be classified in arrays.
Graphical Form
• It shows numerical values or relationships in pictorial form.
Parts of a Statistical Table
Table 1. Relationship Between Academic Performance and the Identified Variables
Variable   Correlation Coefficient   Significance   Remarks
GPA        0.7461                    0.000          HS
MI         0.4015                    0.000          S
IQ         0.9891                    0.000          S
Gender     0.1452                    0.084          NS
NS - Not Significant
S - Significant
HS - Highly Significant

• HEADING: shows the table number, title, and head note.
• TITLE: a brief statement of the nature, classification, and time reference of the information presented and the area to which the statistics refer.
• HEAD NOTE: enclosed in brackets between the title and the top rule of the table.
• BOX HEAD: the portion that contains the column heads, which describe the data in each column.
• STUB: the first column on the left of the table, which describes the data in each row.
• FOOTNOTE: a statement inserted at the bottom of the table.
• SOURCE NOTE: an exact citation of the source of the data, acknowledging its origin.
Different Types of Graphs
• Line Graph
is used when:
- data cover a long period of time
- several series are compared
- movements are to be emphasized
- trends are to be established.
• Bar Graph
is used when numerical values of an item over a period of time are compared.
It consists of rectangular bars where the height of each bar represents the quantity or frequency for each category.
• Pie Graph
is used to show percentages or the composition by parts of a whole.
• Pictograph or Pictogram
is used to immediately suggest the nature of the data.
Thank you for listening
DATA ANALYSIS AND INTERPRETATION
Prepared by: Larry Jay B. Valero, LPT
Descriptive Statistics
• Three methods of describing a set of values
a. measures of central tendency
b. measures of dispersion
c. measures of skewness and kurtosis
Inferential Statistics
• Two methods of inferential statistics
a. Hypothesis testing
b. Estimation of parameter(s)
Measures of Central Tendency
• A measure of central tendency is a single number that represents the typical score of the data.
• These measures indicate the center of a set of data arranged in order of magnitude.
• Three measures of central tendency
a. Mean
b. Median
c. Mode
A. Mean / Arithmetic Mean / Average
• The most popular and well-known measure of central tendency.
• The average value of all the data in the set.
Mean for ungrouped data
• Defined as the sum of all the scores or data divided by the number of scores in the data.
• Denoted by the symbol "$\mu$" for the population mean and "$\bar{x}$" for the sample mean.
Population mean:
\[ \mu = \frac{\sum_{i=1}^{N} x_i}{N} \]
where $N$ is the number of values in the population.
Sample mean:
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \]
where $n$ is the number of values in the sample.
Example:
• The items listed below represent the scores of seven BS Mathematics students during the final examination. Compute the mean score.
89, 75, 90, 85, 78, 87, 80
Mean = 584 / 7 ≈ 83.43
• Suppose BS Applied Mathematics has 10 students and their heights (in cm) are as follows: 170, 165, 155, 160, 150, 149, 152, 161, 163, 175. Find the mean height of the students.
Mean = 1600 / 10 = 160
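The two mean computations above can be checked with a few lines of Python. This is a minimal sketch; the function name `mean` and the variable names are illustrative and not part of the original slides.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Data taken from the two examples in the slides.
scores = [89, 75, 90, 85, 78, 87, 80]
heights = [170, 165, 155, 160, 150, 149, 152, 161, 163, 175]

print(round(mean(scores), 2))  # 83.43
print(mean(heights))           # 160.0
```

The same function serves for a population or a sample, because the two formulas differ only in notation.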
B. Median
• The middle score of a set of data arranged in order of magnitude.
• Best used when the data have several extreme entries.
Median for ungrouped data
• Defined as the middle value when a set of observed values has been arranged in either ascending or descending order.
• Denoted by Md.
If n is ODD:
\[ Md = x_{\frac{n+1}{2}} \]
If n is EVEN:
\[ Md = \frac{x_{\frac{n}{2}} + x_{\frac{n}{2}+1}}{2} \]
where $x_k$ is the $k$th value in the ordered array.
Example:
• The items listed below represent the scores of seven BS Mathematics students during the final examination. Compute the median score.
89, 75, 90, 85, 78, 87, 80
Arranged in ascending order:
75 78 80 85 87 89 90
n = 7 (odd), so the median is the 4th value.
Md = 85
Example:
• Suppose BS Applied Mathematics has 10 students and their heights (in cm) are as follows: 170, 165, 155, 160, 150, 149, 152, 161, 163, 175. Find the median height of the students.
Arranged in ascending order:
149 150 152 155 160 161 163 165 170 175
n = 10 (even), so the median is the average of the 5th and 6th values.
Md = (160 + 161) / 2 = 160.5
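The odd and even cases of the median can be sketched in Python as follows. The function name `median` is illustrative; the data are the two examples from the slides.

```python
def median(values):
    """Middle value of the ordered data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # n is odd: the single middle value (0-based index n // 2)
        return ordered[mid]
    # n is even: the average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


scores = [89, 75, 90, 85, 78, 87, 80]
heights = [170, 165, 155, 160, 150, 149, 152, 161, 163, 175]

print(median(scores))   # 85
print(median(heights))  # 160.5
```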
C. Mode
• The most frequent score in the data set.
• The most popular option.
Mode for ungrouped data
• The mode is the value that occurs most often, that is, the most frequently occurring observation.
• Denoted by Mo.
Example:
• Consider the data set 1 2 2 2 8 1 4 10
Mo = 2
Since there is only one mode, the distribution is unimodal.
• Consider the data set 1 2 2 8 1 4 10
Mo = 1, 2
Since there are two modes, the distribution is bimodal.
• Consider the data set 1 2 3 8 6 4 10
This data set has no mode.
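The three cases above (one mode, two modes, no mode) can be reproduced with a short Python sketch. The function name `modes` is illustrative. The rule "no mode when every value occurs only once" is taken from the third example; a data set in which every value repeats equally often is not covered by the slides.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s); an empty list means 'no mode'."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        # every value occurs exactly once, so no value is "most frequent"
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([1, 2, 2, 2, 8, 1, 4, 10]))  # [2]      unimodal
print(modes([1, 2, 2, 8, 1, 4, 10]))     # [1, 2]   bimodal
print(modes([1, 2, 3, 8, 6, 4, 10]))     # []       no mode
```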
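GROUPED DATA
Mean
\[ \bar{X}_G = \frac{\sum f_i x_i}{n} \]
where:
$f_i$ = frequency of the $i$th class
$x_i$ = class mark of the $i$th class
$n$ = total number of observations
Median
\[ Md_G = L_{md} + c\left[\frac{\frac{n}{2} - {<}CF_b}{f_{md}}\right] \]
where:
$L_{md}$ = lower class boundary of the median class
$c$ = class size
${<}CF_b$ = less-than cumulative frequency of the class before the median class
$f_{md}$ = frequency of the median class
Mode
\[ Mo_G = L_{mo} + c\left[\frac{f_{mo} - f_b}{2f_{mo} - f_a - f_b}\right] \]
where:
$L_{mo}$ = lower class boundary of the modal class
$f_{mo}$ = frequency of the modal class
$f_b$ = frequency of the class before the modal class
$f_a$ = frequency of the class after the modal class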
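The three grouped-data formulas can be sketched in Python using the class intervals and frequencies of the worked example (18-26 through 72-80, n = 40). This is a sketch under stated assumptions: the function names are illustrative, the class limits are whole numbers (so each boundary lies 0.5 below the lower limit), and all classes have the same size.

```python
# (lower limit, upper limit, frequency) for each class of the worked example
classes = [(18, 26, 2), (27, 35, 1), (36, 44, 15), (45, 53, 5),
           (54, 62, 8), (63, 71, 6), (72, 80, 3)]


def class_size(classes):
    lo, hi, _ = classes[0]
    return hi - lo + 1  # assumes whole-number limits and equal class sizes


def grouped_mean(classes):
    n = sum(f for _, _, f in classes)
    # the class mark x_i is the midpoint of the class limits
    return sum(f * (lo + hi) / 2 for lo, hi, f in classes) / n


def grouped_median(classes):
    n = sum(f for _, _, f in classes)
    c = class_size(classes)
    cf_before = 0  # less-than cumulative frequency of the preceding classes
    for lo, hi, f in classes:
        if cf_before + f >= n / 2:  # first class whose <cf reaches n/2
            return (lo - 0.5) + c * (n / 2 - cf_before) / f
        cf_before += f


def grouped_mode(classes):
    c = class_size(classes)
    i = max(range(len(classes)), key=lambda k: classes[k][2])  # modal class
    lo, _, f_mo = classes[i]
    f_b = classes[i - 1][2] if i > 0 else 0                 # class before
    f_a = classes[i + 1][2] if i < len(classes) - 1 else 0  # class after
    return (lo - 0.5) + c * (f_mo - f_b) / (2 * f_mo - f_a - f_b)


print(round(grouped_mean(classes), 2))    # 50.35
print(round(grouped_median(classes), 2))  # 48.1
print(round(grouped_mode(classes), 2))    # 40.75
```

If two classes tie for the highest frequency, this sketch simply takes the first one; the slides do not say how ties should be handled.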
Example:
Class Intervals   f_i   x_i   f_i x_i   Class Boundary   <cf
18-26              2    22      44      17.5 - 26.5        2
27-35              1    31      31      26.5 - 35.5        3
36-44             15    40     600      35.5 - 44.5       18   Modal Class
45-53              5    49     245      44.5 - 53.5       23   Median Class
54-62              8    58     464      53.5 - 62.5       31
63-71              6    67     402      62.5 - 71.5       37
72-80              3    76     228      71.5 - 80.5       40
Total             40          2014

Mean
\[ \bar{X}_G = \frac{\sum f_i x_i}{n} = \frac{2014}{40} = 50.35 \]

Median
Median class: $\frac{n}{2} = \frac{40}{2} = 20$, so the median class is 45-53, the first class whose <cf reaches 20.
$L_{md} = 44.5$, $c = 9$, ${<}CF_b = 18$, $f_{md} = 5$
\[ Md_G = L_{md} + c\left[\frac{\frac{n}{2} - {<}CF_b}{f_{md}}\right] = 44.5 + 9\left[\frac{20 - 18}{5}\right] = 48.1 \]

Mode
Modal class: 36-44, the class with the highest frequency.
$L_{mo} = 35.5$, $c = 9$, $f_{mo} = 15$, $f_b = 1$, $f_a = 5$
\[ Mo_G = L_{mo} + c\left[\frac{f_{mo} - f_b}{2f_{mo} - f_a - f_b}\right] = 35.5 + 9\left[\frac{15 - 1}{2(15) - 5 - 1}\right] = 40.75 \]
Thank you for listening
Frequency Distribution Table
Prepared by: Elaine C. Ricohermoso, LPT
TERMS
• Array
An arrangement of the numerical data according to order of magnitude, either ascending or descending.
• Frequency Distribution Table
A condensed version of an array. It categorizes the numerical data into intervals or classes.
• Classes
Mutually exclusive categories, each defined by a lower limit and an upper limit, with equal intervals.
• Class frequency
The number of observations in each class.
• Class mark
The class midpoint.
• Cumulative frequency
The running total of the class frequencies. The less-than cumulative frequency (<CF) adds the frequencies from the first class up to the class of interest; the greater-than cumulative frequency (>CF) adds them from the class of interest down to the last class.
• Relative frequency
The percentage of observations in a particular class of interest.
Steps in Constructing a Frequency Distribution Table
1. Make an array.
2. Determine the range R of the numerical data.
R = | Highest value - Lowest value |
3. Determine the number of classes K into which the data are to be grouped using Sturges' approximation (round up):
K = 1 + 3.322 log N, where N = total number of values to be grouped.
4. Determine the class size C (round off):
C = R / K
5. Determine the lower limit of the first class.
6. Construct the class intervals and determine the class frequencies.
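The six steps can be sketched in Python as below, using the 50 raw scores from the example. Several choices here are assumptions rather than something the slides state outright: the function name `frequency_table`, taking the lowest value as the first lower limit (as the example does with 112), and class limits that are whole numbers one unit apart.

```python
import math


def frequency_table(data):
    """Build (lower limit, upper limit, frequency) rows following the six steps."""
    array = sorted(data)                                # 1. make an array
    r = abs(array[-1] - array[0])                       # 2. range R
    k = math.ceil(1 + 3.322 * math.log10(len(array)))   # 3. Sturges, rounded up
    c = round(r / k)                                    # 4. class size, rounded off
    lower = array[0]                                    # 5. first lower limit (assumed: lowest value)
    rows = []
    while lower <= array[-1]:                           # 6. intervals and frequencies
        upper = lower + c - 1
        freq = sum(lower <= x <= upper for x in array)
        rows.append((lower, upper, freq))
        lower = upper + 1
    return r, k, c, rows


raw_scores = [
    144, 112, 156, 122, 168, 172, 141, 159, 127, 154,
    156, 145, 134, 137, 123, 149, 144, 160, 136, 139,
    142, 138, 159, 151, 147, 150, 126, 152, 147, 136,
    135, 132, 146, 133, 150, 122, 139, 149, 152, 129,
    131, 155, 116, 140, 145, 135, 160, 125, 172, 163,
]

r, k, c, rows = frequency_table(raw_scores)
print(r, k, c)  # 60 7 9
for lower, upper, freq in rows:
    print(f"{lower}-{upper}: {freq}")
```

The loop keeps adding classes until the highest value is covered, so on other data the number of rows may differ slightly from K. Python's `round` also rounds exact halves to the nearest even integer, which can differ from the usual classroom rule.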
Example
• Raw scores of 50 students in a 200-item test.
144 112 156 122 168 172 141 159 127 154
156 145 134 137 123 149 144 160 136 139
142 138 159 151 147 150 126 152 147 136
135 132 146 133 150 122 139 149 152 129
131 155 116 140 145 135 160 125 172 163

1. Make an array (read down each column).
112 125 132 136 139 144 147 151 156 160
116 126 133 136 140 145 149 152 156 163
122 127 134 137 141 145 149 152 159 168
122 129 135 138 142 146 150 154 159 172
123 131 135 139 144 147 150 155 160 172
2. Determine the range R.
R = |172 - 112| = 60
3. Determine the number of classes K using Sturges' approximation (round up).
K = 1 + 3.322 log 50 = 6.64 ≈ 7
4. Determine the class size C (round off).
C = R / K = 60 / 7 = 8.57 ≈ 9
5. Determine the lower limit of the first class.
The first lower limit is 112.
6. Construct the class intervals and determine the class frequencies.
Class Intervals   Frequency   Class Mark   Class Boundary   Relative Frequency (%)   <CF   >CF
112-120            2          116          111.5 - 120.5     4                        2    50
121-129            7          125          120.5 - 129.5    14                        9    48
130-138           10          134          129.5 - 138.5    20                       19    41
139-147           12          143          138.5 - 147.5    24                       31    31
148-156           11          152          147.5 - 156.5    22                       42    19
157-165            5          161          156.5 - 165.5    10                       47     8
166-174            3          170          165.5 - 174.5     6                       50     3
Total             50                                        100

\[ \text{Class Mark} = \frac{\text{Lower limit of the } i\text{th class} + \text{Upper limit of the } i\text{th class}}{2} \]
\[ \text{Class Boundary} = (\text{Lower limit of the } i\text{th class} - 0.5) \text{ to } (\text{Upper limit of the } i\text{th class} + 0.5) \]
\[ \text{Relative Frequency} = \frac{\text{Class frequency}}{N} \times 100 \]
MEASURES OF DISPERSION
Prepared by:
ELAINE C. RICOHERMOSO, LPT

• Measures of dispersion identify how a set of values spreads or fluctuates.
• The measures of dispersion are the
a. Range
b. Variance
c. Standard deviation
d. Coefficient of variation
A. RANGE
• The simplest measure of dispersion.
• It is the difference between the highest score and the lowest score.
Range for ungrouped data:
The range of a set of data is the absolute difference between the highest and the lowest value in the set.
The range is denoted by R.
R = |HV - LV|
where:
R - Range
HV - Highest value
LV - Lowest value
EXAMPLE:
• The items listed below represent the scores of seven BSIT students during the final examination. Compute the range.
89, 75, 90, 85, 78, 87, 80
R = |90 - 75| = 15
• Suppose BSIT has 10 students and their heights (in cm) are as follows: 170, 165, 155, 160, 150, 149, 152, 161, 163, 175. Find the range of the heights of the students.
R = |175 - 149| = 26
B. VARIANCE
• Mean squared deviation.
• Considers the position of each observation relative to the mean.
• The variance of a given data set is the average of the squared deviations of the observations from the mean.
• The variance is denoted by $\sigma^2$ for the population and $s^2$ for the sample.
VARIANCE FOR UNGROUPED DATA:
Population variance:
\[ \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} \]
Sample variance:
\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \]
Example:
• The data below represent the scores of 4 students from BSIT-3OLD.
3, 2, 2, 1
Compute the variance using the formula.
• Suppose BSIT has 6 students and their heights (in cm) are as follows: 170, 166, 171, 160, 150, 161. Compute the variance using the formula.
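The slides do not say whether the two data sets should be treated as populations or as samples, so the sketch below computes both variances. The function names are illustrative.

```python
def population_variance(values):
    """Average squared deviation from the mean (divide by N)."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def sample_variance(values):
    """Sum of squared deviations from the mean divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("the sample variance needs at least two values")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


scores = [3, 2, 2, 1]
heights = [170, 166, 171, 160, 150, 161]

print(population_variance(scores))            # 0.5
print(round(sample_variance(scores), 4))      # 0.6667
print(round(population_variance(heights), 4)) # 50.6667
print(round(sample_variance(heights), 4))     # 60.8
```

For the scores, the mean is 2 and the squared deviations sum to 2. For the heights, the mean is 163 and the squared deviations sum to 304.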


C. STANDARD DEVIATION
• Based on the deviations of all the scores in a series.
• It is always computed from the mean.
• Defined as the positive square root of the variance.
• Denoted by "$\sigma$" for the population standard deviation and "s" for the sample standard deviation.
• Population standard deviation:
\[ \sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}} \]
• Sample standard deviation:
\[ s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} \]
Example:
• The data below represent the scores of 4 students from BSIT-3OLD.
3, 2, 2, 1
Compute the variance and the standard deviation.
• Suppose BSIT has 6 students and their heights (in cm) are as follows: 170, 166, 171, 160, 150, 161. Compute the variance and the standard deviation.
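Python's standard `statistics` module provides both versions of the standard deviation: `pstdev` divides by N (population) and `stdev` divides by n - 1 (sample). A minimal sketch on the two example data sets follows. Which version the slides intend is not stated, so both are shown.

```python
import statistics

scores = [3, 2, 2, 1]
heights = [170, 166, 171, 160, 150, 161]

for name, data in [("scores", scores), ("heights", heights)]:
    sigma = statistics.pstdev(data)  # population standard deviation
    s = statistics.stdev(data)       # sample standard deviation
    print(f"{name}: sigma = {sigma:.4f}, s = {s:.4f}")
```

Each result is the positive square root of the matching variance, for example s = sqrt(2/3) for the scores and s = sqrt(60.8) for the heights.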
D. COEFFICIENT OF VARIATION
• The ratio of the standard deviation to the mean, usually expressed in percent.
• Population:
\[ CV = \frac{\sigma}{\mu} \times 100\% \]
• Sample:
\[ CV = \frac{s}{\bar{x}} \times 100\% \]
Example:
• The data below represent the scores of 4 students from BSIT-3OLD.
3, 2, 2, 1
Compute the variance, the standard deviation, and the coefficient of variation.
• Suppose BSIT has 6 students and their heights (in cm) are as follows: 170, 166, 171, 160, 150, 161. Compute the coefficient of variation.
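The coefficient of variation can be sketched as a small helper on top of the standard `statistics` module. The function name and the `sample` flag are illustrative; the slides do not say which version applies to each example, so the flag lets the reader choose.

```python
import statistics


def coefficient_of_variation(values, sample=True):
    """Standard deviation divided by the mean, expressed in percent."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("the coefficient of variation is undefined for a zero mean")
    sd = statistics.stdev(values) if sample else statistics.pstdev(values)
    return sd / mean * 100


scores = [3, 2, 2, 1]
heights = [170, 166, 171, 160, 150, 161]

print(round(coefficient_of_variation(scores), 2))                # sample version
print(round(coefficient_of_variation(scores, sample=False), 2))  # population version
print(round(coefficient_of_variation(heights), 2))               # sample version
```

Because the result is unit-free, it allows the spread of the scores and the spread of the heights to be compared directly.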
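GROUPED DATA
• Range
\[ R = ULHC - LLLC \]
ULHC - Upper Limit of the Highest Class
LLLC - Lower Limit of the Lowest Class
• Variance
\[ s_G^2 = \frac{n\sum f_i x_i^2 - \left(\sum f_i x_i\right)^2}{n(n-1)} \]
• Standard deviation
\[ s_G = \sqrt{s_G^2} \]
• Coefficient of Variation
\[ CV = \frac{s_G}{\bar{x}_G} \times 100 \]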
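The grouped-data measures of dispersion can be sketched in Python with the class table used in the worked example (18-26 through 72-80, n = 40). The function name and the tuple layout are illustrative, and the class mark is taken as the midpoint of the class limits.

```python
import math

# (lower limit, upper limit, frequency) for each class of the worked example
classes = [(18, 26, 2), (27, 35, 1), (36, 44, 15), (45, 53, 5),
           (54, 62, 8), (63, 71, 6), (72, 80, 3)]


def grouped_dispersion(classes):
    n = sum(f for _, _, f in classes)
    marks = [(lo + hi) / 2 for lo, hi, _ in classes]  # class marks x_i
    freqs = [f for _, _, f in classes]
    sum_fx = sum(f * x for f, x in zip(freqs, marks))
    sum_fx2 = sum(f * x * x for f, x in zip(freqs, marks))
    rng = classes[-1][1] - classes[0][0]              # ULHC - LLLC
    variance = (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))
    sd = math.sqrt(variance)
    cv = sd / (sum_fx / n) * 100                      # percent of the grouped mean
    return rng, variance, sd, cv


rng, variance, sd, cv = grouped_dispersion(classes)
print(rng)                 # 62
print(round(variance, 2))  # 197.52
print(round(sd, 2))        # 14.05
print(round(cv, 2))        # 27.91
```

The slides report CV = 27.90% because they divide the already rounded value 14.05 by 50.35; carrying full precision gives 27.91%.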
Example:
Class Intervals   f_i   x_i   f_i x_i   x_i^2   f_i x_i^2
18-26              2    22      44       484       968
27-35              1    31      31       961       961
36-44             15    40     600      1600     24000
45-53              5    49     245      2401     12005
54-62              8    58     464      3364     26912
63-71              6    67     402      4489     26934
72-80              3    76     228      5776     17328
Total             40          2014              109,108

$\sum f_i x_i = 2014$
$\sum f_i x_i^2 = 109{,}108$

Variance
\[ s_G^2 = \frac{n\sum f_i x_i^2 - \left(\sum f_i x_i\right)^2}{n(n-1)} = \frac{40(109{,}108) - (2014)^2}{40(40 - 1)} = 197.52 \]

Standard deviation
\[ s_G = \sqrt{s_G^2} = \sqrt{197.52} = 14.05 \]

Coefficient of variation
\[ \bar{x}_G = \frac{\sum f_i x_i}{n} = \frac{2014}{40} = 50.35 \]
\[ CV = \frac{s_G}{\bar{x}_G} \times 100 = \frac{14.05}{50.35} \times 100 = 27.90\% \]
THANK YOU FOR LISTENING!
Measures of Relative Position
• Measures of position identify the rank or position occupied by a value in an array of collected data.
• Three measures of relative position
a. Percentiles
b. Deciles
c. Quartiles
A. Percentiles
• Values that divide a set of observations into 100 equal parts.
• These values are denoted by $P_1, P_2, \ldots, P_{99}$.
Example:
• Given a random sample of size n = 12:
4 7 8 2 7 5 8 9 10 14 3 4
Find the values of $P_{50}$, $P_{20}$, and $P_{82}$. Interpret the values.
Arranged in ascending order:
2 3 4 4 5 7 7 8 8 9 10 14
$P_{50} = 7$
This means that 50% of the values fall below 7.
$P_{20} = 4$
This means that 20% of the values fall below 4.
$P_{82} = 9$
This means that 82% of the values fall below 9.
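The slides do not show the rule used to locate a percentile. One rule that reproduces the values in these examples is the nearest-rank method: take the value at position ceil(k·n/100) in the ordered array. The sketch below rests on that assumption, and the function name `percentile` is illustrative.

```python
import math


def percentile(values, k):
    """k-th percentile by the nearest-rank rule (an assumed rule, see the lead-in)."""
    if not 0 < k <= 100:
        raise ValueError("k must be greater than 0 and at most 100")
    ordered = sorted(values)
    rank = math.ceil(k * len(ordered) / 100)  # 1-based position in the array
    return ordered[rank - 1]


sample = [4, 7, 8, 2, 7, 5, 8, 9, 10, 14, 3, 4]

print(percentile(sample, 50))  # 7
print(percentile(sample, 20))  # 4
print(percentile(sample, 82))  # 9
```

Deciles and quartiles are special cases of the same computation: the k-th decile is the (10k)-th percentile and the k-th quartile is the (25k)-th percentile, so `percentile(sample, 90)` gives the ninth decile and `percentile(sample, 75)` gives the third quartile. Other textbooks interpolate between neighbouring values and can give slightly different answers.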
B. Deciles
• Values that divide a set of observations into 10 equal parts.
• These values are denoted by $D_1, D_2, \ldots, D_9$.
Example:
• Given a random sample of size n = 12:
4 7 8 2 7 5 8 9 10 14 3 4
Find the values of $D_5$, $D_9$, and $D_1$. Interpret the values.
Arranged in ascending order:
2 3 4 4 5 7 7 8 8 9 10 14
$D_5 = 7$
This means that 50% of the values fall below 7.
$D_9 = 10$
This means that 90% of the values fall below 10.
$D_1 = 3$
This means that 10% of the values fall below 3.
C. Quartiles
• Values that divide a set of observations into 4 equal parts.
• These values are denoted by $Q_1$, $Q_2$, and $Q_3$.
Example:
• Given a random sample of size n = 12:
4 7 8 2 7 5 8 9 10 14 3 4
Find the values of $Q_1$, $Q_2$, and $Q_3$. Interpret the values.
Arranged in ascending order:
2 3 4 4 5 7 7 8 8 9 10 14
$Q_1 = 4$
This means that 25% of the values fall below 4.
$Q_2 = 7$
This means that 50% of the values fall below 7.
$Q_3 = 8$
This means that 75% of the values fall below 8.
Note that $P_{50} = D_5 = Q_2$.
Thank you for listening!
Skewness and Kurtosis
Prepared by:
Larry Jay B. Valero, LPT
Skewness
• A measure of how asymmetric the distribution of the data is about the mean.
• Two ways of determining skewness:
1. Using the Measures of Central Tendency (MCT)
2. Coefficient of Pearsonian Skewness
1. Using the Measures of Central Tendency
Mean = Median = Mode    The skewness is zero       Symmetric
Mean > Median > Mode    The skewness is positive   Positively skewed (skewed to the right)
Mean < Median < Mode    The skewness is negative   Negatively skewed (skewed to the left)
Example:
1. Given a random sample of size n = 4: 3, 2, 2, 1. Using MCT, tell whether the given data are symmetric, skewed to the left, or skewed to the right.
2. Suppose BSE BIO has 10 students and their scores in a short quiz are as follows: 4, 7, 8, 2, 8, 8, 9, 2, 5, 7. Using MCT, tell whether the given data are symmetric, skewed to the left, or skewed to the right.
2. Coefficient of Pearsonian Skewness
• Denoted by SK.
• If SK = 0, the distribution is symmetric.
• If SK > 0, the distribution is positively skewed.
• If SK < 0, the distribution is negatively skewed.
Example:
1. Given a random sample of size n = 4: 3, 2, 2, 1. Using Pearsonian skewness, tell whether the given data are symmetric, skewed to the left, or skewed to the right.
2. Suppose BSE BIO has 10 students and their scores in a short quiz are as follows: 4, 7, 8, 2, 8, 8, 9, 2, 5, 7. Using Pearsonian skewness, tell whether the given data are symmetric, skewed to the left, or skewed to the right.
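The formula for SK did not survive in the slides. A common textbook form is Pearson's second coefficient, SK = 3(mean − median) / standard deviation, and the sketch below assumes that form with the sample standard deviation. The function names are illustrative, and the original slides may have used a different variant, such as the mode-based one.

```python
import statistics


def pearson_skewness(values):
    """Pearson's second skewness coefficient: 3 * (mean - median) / s (assumed form)."""
    s = statistics.stdev(values)  # sample standard deviation
    if s == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return 3 * (statistics.mean(values) - statistics.median(values)) / s


def describe(sk):
    if sk > 0:
        return "positively skewed (skewed to the right)"
    if sk < 0:
        return "negatively skewed (skewed to the left)"
    return "symmetric"


sample = [3, 2, 2, 1]
quiz = [4, 7, 8, 2, 8, 8, 9, 2, 5, 7]

print(describe(pearson_skewness(sample)))  # symmetric
print(describe(pearson_skewness(quiz)))    # negatively skewed (skewed to the left)
```

For the quiz scores the mean is 6, the median is 7, and the mode is 8, so the MCT rule (Mean < Median < Mode) gives the same verdict.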
Coefficient of Kurtosis
• Kurtosis measures the flatness or peakedness of the distribution of a given data set.
• A distribution that is more peaked than the normal distribution is called a leptokurtic distribution.
• A distribution that is flatter than the normal distribution is called a platykurtic distribution.
• A distribution between these two types, closer to the normal distribution in shape, is referred to as a mesokurtic distribution.
• If K = 3, the distribution is mesokurtic.
• If K < 3, the distribution is platykurtic.
• If K > 3, the distribution is leptokurtic.
Example:
1. Given a random sample of size n = 4: 3, 2, 2, 1. Compute the coefficient of kurtosis.
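The kurtosis formulas are missing from the slides. Since K = 3 is given as the mesokurtic benchmark, the sketch below assumes the usual moment coefficient, K = [Σ(x − mean)⁴ / N] / (variance)², with the population variance. The sample version used by the original slides may carry a different correction factor, and the function names are illustrative.

```python
def kurtosis(values):
    """Moment coefficient of kurtosis (population form, an assumed formula)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # population variance
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    return m4 / m2 ** 2


def classify(k):
    if k > 3:
        return "leptokurtic"
    if k < 3:
        return "platykurtic"
    return "mesokurtic"


sample = [3, 2, 2, 1]
k = kurtosis(sample)
print(k, classify(k))  # 2.0 platykurtic
```

Here the mean is 2, the deviations are 1, 0, 0, and -1, so the second and fourth central moments are both 0.5 and K = 0.5 / 0.25 = 2.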
