
Nature of Statistics

Introduction
Learning Outcomes

At the end of this module, you should be able to:


• Define statistics and discuss the basic statistical terms
• Explain the steps of statistical inquiry
• Describe the qualities of statistical data
• Demonstrate statistical data collection
• Construct textual, tabular, and graphical presentations of data
• Report facts and observations accurately, honestly, and in a scholarly manner
Statistics
• Definition: the science of collecting, presenting, analyzing, and reasonably interpreting data.
• Presents a rigorous scientific method for gaining insight into data.
• Can give an instant overall picture of the data through graphical presentation or numerical summarization, irrespective of the number of data points.
• Another important task of statistics is to make inferences and predict relationships among variables.
• For example, suppose we measure the weight of 100 patients in a study. With so many measurements, simply looking at the data fails to provide an informative account.
Basic Concepts and Definitions
Statistical Methods
• Descriptive Statistics
• Inferential Statistics
Descriptive Statistics
1. Involves
• Collecting Data
• Presenting Data
• Characterizing Data
2. Purpose
• Describe Data
[Figure: bar chart of dollar amounts ($0 to $50) for quarters Q1 to Q4, with summary values X̄ = 30.5 and S² = 113]
Inferential Statistics
1. Involves
• Estimation
• Hypothesis Testing
2. Purpose
• Make decisions about population characteristics
[Figure: a population with an unknown characteristic (?)]

© 2011 Pearson Education, Inc


Variable
• The differentiating property of subjects or respondents that varies from one situation to another.
• A variable is a characteristic or condition that can change or take on different values.
• Examples: gender, religion, salary, socio-economic status.
Types of variables
• Qualitative
  - Dichotomic. Example: sex (male and female)
  - Polynomic. Example: educational attainment (Elem, HS, College, …)
• Quantitative
  - Discrete. Examples: children in a family, strokes on a golf hole
  - Continuous. Example: weight and height of a student
Qualitative data (qualitative variables)
• Express characteristics that cannot be measured numerically.
• Are generally described by words or letters.
• Are not as widely used as quantitative data because many numerical techniques do not apply to qualitative data.
• Qualitative data can be separated into two subgroups:
  - dichotomic (if it takes the form of a word with two options)
  - polynomic (if it takes the form of a word with more than two options)
Quantitative data
• Amounts or values that can be counted or measured.
• Can be analyzed using the four fundamental operations.
• The result of counting or measuring attributes of a population.
• Quantitative data can be separated into two subgroups:
  - discrete: the result of counting, taking whole-number values (the number of students of a given ethnic group in a class, the number of books on a shelf, ...)
  - continuous: the result of measuring on a scale (distance traveled, weight of luggage, …)
Observation
Any characteristic, value, or information about the variable
Example: test scores, differences in responses

Data
Refers to the set of observations gathered from subjects or
respondents
Example: opinion, number of students vaccinated
Indicator
Data that directly measure the variables being studied
Example: academic performance of students in a subject –
scores in exam
Population vs. Sample
• Population: the group of all individuals, subjects, or objects considered in the study
• Sample: a representative portion taken from the population, from which data are actually collected
Parameter
A numerical attribute or property that describes the entire population
Example: The average age of the 5,000 college students who took the NCII is 18 years old

Statistic
A numerical attribute or property that describes a sample
Example: Of the 5,000 college students who took the NCII, 3,000 are male. The average age of the male students is 17 years old
Describing Data with Graphs
Graph or chart
• Statistical device that presents data
• Simplifies and summarizes significant details about important aspects and implications of the data it represents
• Facilitates better interpretation and analysis
Guidelines for using graphs or charts
1. Accurate construction from valid and reliable data for correct
interpretation
2. Clear and unambiguous for easy reading and understanding
3. Simple and uncomplicated without unnecessary data for efficient and
effective visual communication
4. Attractive, neat, appealing and professional looking with harmony and
consistency of style and elements
Graph for Quantitative Data
• Bar graph: constructed to compare data sizes and frequencies
• Pie chart: shows the part or division of the categories relative to the whole
• Line graph: effective in showing a trend over a period of time; a useful tool for showing the relationship between two or more sets of data
• Pictograph: uses symbols or pictures that each represent a standard value
• Stem and leaf plot: presents the data using the actual numerical values of each data point
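A stem and leaf plot can be sketched in a few lines of Python. The scores below are hypothetical values invented for illustration, and the function name `stem_and_leaf` is an assumption of this sketch; it only handles non-negative integers, using the tens part as the stem and the units digit as the leaf.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group integers by stem (tens part) and collect the units digits as leaves."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)


# Hypothetical test scores, invented for illustration only.
scores = [12, 15, 21, 23, 23, 28, 34, 35, 41]
plot = stem_and_leaf(scores)

for stem in sorted(plot):
    leaves = " ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem} | {leaves}")
```

Each printed row keeps every original value recoverable: stem 2 with leaves 1 3 3 8 stands for 21, 23, 23, and 28.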
Graph for Qualitative Data

• Most qualitative data can be counted


• Pie charts or bar graphs are the most functional
Data Description – Ungrouped Data
This section shows the step-by-step computation of descriptive measures for ungrouped data.
Numerical Data Description

Data can also be described numerically through different numerical measures such as:
• Measures of central tendency
• Measures of Variability
• Measures of Position
Measures of Central Tendency – Ungrouped Data
• Mean
• Median
• Mode
Example: 8, 3, 4, 6, 9, 1, 3, 2, 5
Arranged in ascending order:
Position   1st  2nd  3rd  4th  5th  6th  7th  8th  9th
Value      1    2    3    3    4    5    6    8    9
• Mean: $\bar{x} = \frac{\sum x}{n} = \frac{41}{9} \approx 4.56$
• Median: the middle value of the ordered data; with $n = 9$ this is the 5th value, which is 4
• Mode: the value that appears the most in the data; here it is 3, so the data are unimodal
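The three measures for the example data can be checked with Python's standard `statistics` module. This is a minimal sketch, not part of the original slides; the variable names are chosen for illustration.

```python
import statistics

data = [8, 3, 4, 6, 9, 1, 3, 2, 5]

ordered = sorted(data)               # 1, 2, 3, 3, 4, 5, 6, 8, 9
mean = statistics.mean(data)         # sum of the values divided by n
median = statistics.median(data)     # middle (5th) value of the ordered data
modes = statistics.multimode(data)   # every value tied for the highest count

print("ordered:", ordered)
print("mean:", round(mean, 2))
print("median:", median)
print("mode(s):", modes)
```

`multimode` returns a list, so a single entry confirms that the data are unimodal.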
Measures of Central Tendency

Example:

1. 5, 8, 10, 9, 2, 14
Frequency Distribution Table
Example – Grouped data
Class Frequency
1–3 3
4–6 5
7–9 2
10 – 12 2
13 – 15 3
Frequency Distribution Table
Class     f   x    LB     UB     cf   Rf (%)
1 – 3     3   2    0.5    3.5    3    20.00
4 – 6     5   5    3.5    6.5    8    33.33
7 – 9     2   8    6.5    9.5    10   13.33
10 – 12   2   11   9.5    12.5   12   13.33
13 – 15   3   14   12.5   15.5   15   20.00
where f = frequency, x = midpoint, LB = lower boundary, UB = upper boundary, cf = cumulative frequency, and Rf = relative frequency.
$$\text{Midpoint} = \frac{\text{Lower limit} + \text{Upper limit}}{2}$$
$$\text{Class interval width} = (\text{Upper limit} - \text{Lower limit}) + 1$$
$$\text{Lower boundary} = \text{Lower limit} - 0.5$$
$$\text{Upper boundary} = \text{Upper limit} + 0.5$$
$$\text{Relative frequency} = \frac{\text{frequency}}{n} \cdot 100$$
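The table columns follow mechanically from the class limits and frequencies. The sketch below rebuilds them with the formulas above; the dictionary keys and variable names are assumptions made for this illustration.

```python
# (lower limit, upper limit, frequency) for each class in the example.
classes = [(1, 3, 3), (4, 6, 5), (7, 9, 2), (10, 12, 2), (13, 15, 3)]
n = sum(f for _, _, f in classes)

rows = []
cumulative = 0
for lower, upper, f in classes:
    cumulative += f
    rows.append({
        "class": f"{lower} - {upper}",
        "f": f,
        "x": (lower + upper) / 2,        # midpoint
        "LB": lower - 0.5,               # lower boundary
        "UB": upper + 0.5,               # upper boundary
        "cf": cumulative,                # cumulative frequency
        "Rf": round(f / n * 100, 2),     # relative frequency in percent
        "width": (upper - lower) + 1,    # class interval width
    })

for row in rows:
    print(row)
```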
Measures of Central Tendency – Grouped Data
• Mean
• Median
• Mode
Grouped Data - Mean
Class     f   x    fx
1 – 3     3   2    6
4 – 6     5   5    25
7 – 9     2   8    16
10 – 12   2   11   22
13 – 15   3   14   42
n = 15, Σfx = 111
$$\bar{x} = \frac{\sum f \cdot x_m}{n} = \frac{111}{15} = 7.40$$
where
• $f$ = frequency of each class
• $x_m$ = midpoint of each class (column x)
• $\sum f \cdot x_m$ = sum of all the frequencies multiplied by the midpoints
• $n$ = total of frequencies
$$\text{Midpoint} = \frac{\text{Lower limit} + \text{Upper limit}}{2}$$
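The grouped mean is a weighted average of the class midpoints. A minimal sketch with the example values (the variable names are illustrative assumptions):

```python
midpoints = [2, 5, 8, 11, 14]   # x column of the example table
freqs = [3, 5, 2, 2, 3]         # f column of the example table

n = sum(freqs)                                          # total of frequencies
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))   # sum of f times midpoint
grouped_mean = sum_fx / n

print(n, sum_fx, round(grouped_mean, 2))
```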
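Measures of Central Tendency
Grouped Data - Median
Class     f   LB     cf
1 – 3     3   0.5    3
4 – 6     5   3.5    8     (median class)
7 – 9     2   6.5    10
10 – 12   2   9.5    12
13 – 15   3   12.5   15
n = 15
$$Mdn = L_{Me} + c\left(\frac{\frac{n}{2} - cf_{Me}}{f_{Me}}\right)$$
where
• $L_{Me}$ = lower boundary of the median class
• $c$ = class interval width
• $n$ = total of frequencies
• $cf_{Me}$ = cumulative frequency of the class preceding the median class
• $f_{Me}$ = frequency of the median class
Median class: the class that contains the $\frac{n}{2} = \frac{15}{2} = 7.5$th value, which is 4 – 6.
$$Mdn = 3.5 + 3\left(\frac{\frac{15}{2} - 3}{5}\right) = 6.20$$
$$\text{Class interval width} = (\text{Upper limit} - \text{Lower limit}) + 1$$
$$\text{Lower boundary} = \text{Lower limit} - 0.5$$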
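The median formula can be sketched as a small function. The name `grouped_median` and its argument layout are assumptions for illustration; the median class is taken to be the first class whose cumulative frequency reaches n/2.

```python
def grouped_median(classes, width):
    """Interpolated median for (lower_boundary, frequency) pairs in ascending order."""
    n = sum(f for _, f in classes)
    half = n / 2
    cf_before = 0  # cumulative frequency of the classes before the current one
    for lower_boundary, f in classes:
        if cf_before + f >= half:
            # Current class is the median class: interpolate inside it.
            return lower_boundary + width * (half - cf_before) / f
        cf_before += f
    raise ValueError("classes must contain at least one observation")


example = [(0.5, 3), (3.5, 5), (6.5, 2), (9.5, 2), (12.5, 3)]
median = grouped_median(example, width=3)
print(round(median, 2))
```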
Measures of Central Tendency
Grouped Data - Mode
Class     f   LB
1 – 3     3   0.5
4 – 6     5   3.5    (modal class)
7 – 9     2   6.5
10 – 12   2   9.5
13 – 15   3   12.5
The modal class is the class with the highest frequency.
$$Mode = L_{Mo} + c\left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right)$$
where
• $L_{Mo}$ = lower boundary of the modal class
• $c$ = class interval width
• $\Delta_1$ = difference between the frequency of the modal class and the frequency of the preceding class
• $\Delta_2$ = difference between the frequency of the modal class and the frequency of the succeeding class
$$\Delta_1 = 5 - 3 = 2 \qquad \Delta_2 = 5 - 2 = 3$$
$$Mode = 3.5 + 3\left(\frac{2}{2 + 3}\right) = 4.7$$
$$\text{Class interval width} = (\text{Upper limit} - \text{Lower limit}) + 1$$
$$\text{Lower boundary} = \text{Lower limit} - 0.5$$
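A sketch of the mode formula follows. The function name `grouped_mode` is hypothetical, and two choices are assumptions not stated in the slides: ties are broken by taking the first class with the highest frequency, and a missing neighbouring class is treated as having frequency 0.

```python
def grouped_mode(lower_boundaries, freqs, width):
    """Interpolated mode of grouped data using the delta formula."""
    # Index of the modal class (first class with the highest frequency).
    i = max(range(len(freqs)), key=lambda j: freqs[j])
    # Assumption: a neighbour that does not exist counts as frequency 0.
    before = freqs[i - 1] if i > 0 else 0
    after = freqs[i + 1] if i < len(freqs) - 1 else 0
    delta1 = freqs[i] - before   # modal class minus preceding class
    delta2 = freqs[i] - after    # modal class minus succeeding class
    return lower_boundaries[i] + width * delta1 / (delta1 + delta2)


mode = grouped_mode([0.5, 3.5, 6.5, 9.5, 12.5], [3, 5, 2, 2, 3], width=3)
print(round(mode, 2))
```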
Measures of Central Tendency
Weights (in kg)   f    x    fx     LB   cf
54.5 – 59.5       8    57   456    54   8
59.5 – 64.5       11   62   682    59   19
64.5 – 69.5       12   67   804    64   31
69.5 – 74.5       17   72   1224   69   48
74.5 – 79.5       6    77   462    74   54
79.5 – 84.5       4    82   328    79   58
84.5 – 89.5       2    87   174    84   60
n = 60, Σfx = 4130
$$Mean = \frac{4130}{60} = 68.83$$
$$Median = 64 + 6\left(\frac{\frac{60}{2} - 19}{12}\right) = 69.5$$
$$Mode = 69 + 6\left(\frac{5}{5 + 11}\right) = 70.88$$
Measures of Variability – Ungrouped Data
• Range
• Variance
• Standard Deviation
• Coefficient of Variation
• Mean Absolute Deviation
Example: 8, 3, 4, 6, 9, 1, 3, 2, 5
$x$    $x - \bar{x}$    $(x - \bar{x})^2$
1      -3.56            12.67
2      -2.56            6.55
3      -1.56            2.43
3      -1.56            2.43
4      -0.56            0.31
5      0.44             0.19
6      1.44             2.07
8      3.44             11.83
9      4.44             19.71
$\sum (x - \bar{x})^2 = 58.19$
Range:
$$Range = \text{Highest value} - \text{Lowest value} = 9 - 1 = 8$$
Variance:
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{58.19}{9 - 1} = 7.27$$
Measures of Variability
Example: 8, 3, 4, 6, 9, 1, 3, 2, 5
$x$    $x - \bar{x}$    $(x - \bar{x})^2$    $|x - \bar{x}|$
1      -3.56            12.67                3.56
2      -2.56            6.55                 2.56
3      -1.56            2.43                 1.56
3      -1.56            2.43                 1.56
4      -0.56            0.31                 0.56
5      0.44             0.19                 0.44
6      1.44             2.07                 1.44
8      3.44             11.83                3.44
9      4.44             19.71                4.44
$\sum (x - \bar{x})^2 = 58.19$, $\sum |x - \bar{x}| = 19.56$
Standard Deviation:
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{58.19}{9 - 1}} = \sqrt{7.27} = 2.70$$
Coefficient of Variation:
$$CVar = \frac{s}{\bar{x}} \cdot 100 = \frac{2.70}{4.56} \cdot 100 = 0.5921 \cdot 100 = 59.21$$
Mean Absolute Deviation:
$$MAD = \frac{\sum |x - \bar{x}|}{n} = \frac{19.56}{9} = 2.17$$
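The five measures can be recomputed in Python as a check. This is a sketch using the standard `statistics` module; because it keeps full precision instead of rounding each deviation to two decimals, the variance and coefficient of variation may differ from the hand-computed table in the second decimal place.

```python
import statistics

data = [8, 3, 4, 6, 9, 1, 3, 2, 5]
n = len(data)
mean = statistics.mean(data)

data_range = max(data) - min(data)            # highest value minus lowest value
variance = statistics.variance(data)          # sample variance, divides by n - 1
std_dev = statistics.stdev(data)              # square root of the sample variance
cv = std_dev / mean * 100                     # coefficient of variation in percent
mad = sum(abs(x - mean) for x in data) / n    # mean absolute deviation

print("range:", data_range)
print("variance:", round(variance, 2))
print("standard deviation:", round(std_dev, 2))
print("coefficient of variation:", round(cv, 2))
print("mean absolute deviation:", round(mad, 2))
```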
Measures of Variability – Grouped Data
• Variance
• Standard Deviation
• Coefficient of Variation
Grouped Data - Variance
$\bar{x} = 7.40$
Class     f   x    $x - \bar{x}$   $(x - \bar{x})^2$   $f(x - \bar{x})^2$
1 – 3     3   2    -5.40           29.16               87.48
4 – 6     5   5    -2.40           5.76                28.80
7 – 9     2   8    0.60            0.36                0.72
10 – 12   2   11   3.60            12.96               25.92
13 – 15   3   14   6.60            43.56               130.68
n = 15, $\sum f(x - \bar{x})^2 = 273.60$
$$s^2 = \frac{\sum f(x - \bar{x})^2}{n - 1} = \frac{273.60}{15 - 1} = \frac{273.60}{14} = 19.54$$
where
• $x - \bar{x}$ = midpoint of each class minus the mean
• $(x - \bar{x})^2$ = midpoint of each class minus the mean, squared
• $f(x - \bar{x})^2$ = frequency of each class multiplied by the squared deviation of its midpoint from the mean
• $\sum f(x - \bar{x})^2$ = sum of these products over all classes
• $n$ = total of frequencies
$$\text{Midpoint} = \frac{\text{Lower limit} + \text{Upper limit}}{2}$$
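The grouped variance can be sketched directly from the midpoints and frequencies. Variable names are illustrative assumptions; the arithmetic follows the formula above.

```python
midpoints = [2, 5, 8, 11, 14]
freqs = [3, 5, 2, 2, 3]

n = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, midpoints)) / n

# Frequency times squared deviation of each midpoint from the mean.
weighted_squares = [f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)]
sum_of_squares = sum(weighted_squares)
variance = sum_of_squares / (n - 1)

print([round(w, 2) for w in weighted_squares])
print(round(sum_of_squares, 2), round(variance, 2))
```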
Measures of Variability
Grouped Data – Standard Deviation
Using the same class table, with $\bar{x} = 7.40$, $n = 15$, $\sum f(x - \bar{x})^2 = 273.60$, and $s^2 = 19.54$:
$$s = \sqrt{\frac{\sum f(x - \bar{x})^2}{n - 1}} = \sqrt{\frac{273.60}{15 - 1}} = \sqrt{\frac{273.60}{14}} = \sqrt{19.54} = 4.42$$
Measures of Variability
Grouped Data – Coefficient of Variation
With mean $\bar{x} = 7.40$ and standard deviation $s = 4.42$:
$$CVar = \frac{s}{\bar{x}} \cdot 100 = \frac{4.42}{7.40} \cdot 100 = 0.5973 \cdot 100 = 59.73$$
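The grouped standard deviation and coefficient of variation can be wrapped in one helper. The function name `grouped_std_and_cv` is a hypothetical choice for this sketch. It works at full precision, so the coefficient of variation comes out near 59.74 instead of the 59.73 obtained by hand from the rounded value s = 4.42.

```python
import math


def grouped_std_and_cv(midpoints, freqs):
    """Return (standard deviation, coefficient of variation in percent) for grouped data."""
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, midpoints)) / n
    variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / (n - 1)
    std_dev = math.sqrt(variance)
    return std_dev, std_dev / mean * 100


std_dev, cv = grouped_std_and_cv([2, 5, 8, 11, 14], [3, 5, 2, 2, 3])
print(round(std_dev, 2), round(cv, 2))
```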
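Measures of Position – Ungrouped Data
• Quartiles
• Deciles
• Percentiles
Example: 8, 3, 4, 6, 9, 1, 3, 2, 5
Arranged in ascending order:
Position   1st  2nd  3rd  4th  5th  6th  7th  8th  9th
Value      1    2    3    3    4    5    6    8    9
• Quartile: located at the 7.5th value, between the 7th value (6) and the 8th value (8)
• Decile: located at the 5th value (4)
• Percentile: located at the 8.8th value, between the 8th value (8) and the 9th value (9)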
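When a position is fractional, one common way to read off a value is linear interpolation between the two neighbouring ordered values. The slides do not state which rule they use, so the interpolation rule and the function name `value_at_position` below are assumptions of this sketch.

```python
def value_at_position(ordered, position):
    """Value at a 1-based, possibly fractional, position in ordered data.

    Assumption: fractional positions are resolved by linear interpolation.
    """
    whole = int(position)            # 1-based index of the lower neighbour
    fraction = position - whole
    lower_value = ordered[whole - 1]
    if fraction == 0:
        return lower_value
    upper_value = ordered[whole]
    return lower_value + fraction * (upper_value - lower_value)


ordered = sorted([8, 3, 4, 6, 9, 1, 3, 2, 5])
at_7_5 = value_at_position(ordered, 7.5)   # halfway between 6 and 8
at_5 = value_at_position(ordered, 5)       # exactly the 5th value
at_8_8 = value_at_position(ordered, 8.8)   # eight tenths of the way from 8 to 9
print(at_7_5, at_5, round(at_8_8, 2))
```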
Measures of Position – Grouped Data
• Quartile
• Decile
• Percentile
Grouped Data – Quartile
Class     f   LB     cf
1 – 3     3   0.5    3
4 – 6     5   3.5    8
7 – 9     2   6.5    10
10 – 12   2   9.5    12    (3rd quartile class)
13 – 15   3   12.5   15
n = 15
$$Q_k = L_{Q_k} + c\left(\frac{\frac{kn}{4} - cf_{Q_k}}{f_{Q_k}}\right)$$
where
• $L_{Q_k}$ = lower boundary of the kth quartile class
• $c$ = class interval width
• $n$ = total of frequencies
• $cf_{Q_k}$ = cumulative frequency of the class preceding the kth quartile class
• $f_{Q_k}$ = frequency of the kth quartile class
Find the 3rd quartile of the data.
Place of the 3rd quartile class: the 12th value, which falls in class 10 – 12.
Actual value of the 3rd quartile:
$$Q_3 = L_{Q_3} + c\left(\frac{\frac{kn}{4} - cf_{Q_3}}{f_{Q_3}}\right) = 9.5 + 3\left(\frac{\frac{3 \cdot 15}{4} - 10}{2}\right) = 11.38$$
$$\text{Class interval width} = (\text{Upper limit} - \text{Lower limit}) + 1$$
$$\text{Lower boundary} = \text{Lower limit} - 0.5$$
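Measures of Position
Grouped Data – Decile
Class     f   LB     cf
1 – 3     3   0.5    3
4 – 6     5   3.5    8     (5th decile class)
7 – 9     2   6.5    10
10 – 12   2   9.5    12
13 – 15   3   12.5   15
n = 15
$$D_k = L_{D_k} + c\left(\frac{\frac{kn}{10} - cf_{D_k}}{f_{D_k}}\right)$$
where
• $L_{D_k}$ = lower boundary of the kth decile class
• $c$ = class interval width
• $n$ = total of frequencies
• $cf_{D_k}$ = cumulative frequency of the class preceding the kth decile class
• $f_{D_k}$ = frequency of the kth decile class
Find the 5th decile of the data.
Place of the 5th decile class: the 8th value, which falls in class 4 – 6.
Actual value of the 5th decile:
$$D_5 = L_{D_5} + c\left(\frac{\frac{kn}{10} - cf_{D_5}}{f_{D_5}}\right) = 3.5 + 3\left(\frac{\frac{5 \cdot 15}{10} - 3}{5}\right) = 6.20$$
$$\text{Class interval width} = (\text{Upper limit} - \text{Lower limit}) + 1$$
$$\text{Lower boundary} = \text{Lower limit} - 0.5$$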
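Measures of Position
Grouped Data – Percentile
Class     f   LB     cf
1 – 3     3   0.5    3
4 – 6     5   3.5    8     (40th percentile class)
7 – 9     2   6.5    10
10 – 12   2   9.5    12
13 – 15   3   12.5   15
n = 15
$$P_k = L_{P_k} + c\left(\frac{\frac{kn}{100} - cf_{P_k}}{f_{P_k}}\right)$$
where
• $L_{P_k}$ = lower boundary of the kth percentile class
• $c$ = class interval width
• $n$ = total of frequencies
• $cf_{P_k}$ = cumulative frequency of the class preceding the kth percentile class
• $f_{P_k}$ = frequency of the kth percentile class
Find the 40th percentile of the data.
Place of the 40th percentile class: the 6.4th value, which falls in class 4 – 6.
Actual value of the 40th percentile:
$$P_{40} = L_{P_{40}} + c\left(\frac{\frac{kn}{100} - cf_{P_{40}}}{f_{P_{40}}}\right) = 3.5 + 3\left(\frac{\frac{40 \cdot 15}{100} - 3}{5}\right) = 5.30$$
$$\text{Class interval width} = (\text{Upper limit} - \text{Lower limit}) + 1$$
$$\text{Lower boundary} = \text{Lower limit} - 0.5$$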
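Quartiles, deciles, and percentiles of grouped data share one formula and differ only in the divisor (4, 10, or 100). The sketch below makes that explicit. The function name `grouped_fractile` and its parameters are hypothetical, and it picks the class as the first one whose cumulative frequency reaches kn/parts, which is a simplifying assumption that selects the same classes as the three worked examples.

```python
def grouped_fractile(classes, width, k, parts):
    """kth fractile of grouped data; parts is 4, 10, or 100.

    classes: (lower_boundary, frequency) pairs in ascending order.
    """
    n = sum(f for _, f in classes)
    target = k * n / parts
    cf_before = 0  # cumulative frequency of the classes before the current one
    for lower_boundary, f in classes:
        # Assumption: the fractile class is the first class whose
        # cumulative frequency reaches the target.
        if cf_before + f >= target:
            return lower_boundary + width * (target - cf_before) / f
        cf_before += f
    raise ValueError("k must not exceed parts")


example = [(0.5, 3), (3.5, 5), (6.5, 2), (9.5, 2), (12.5, 3)]
q3 = grouped_fractile(example, width=3, k=3, parts=4)
d5 = grouped_fractile(example, width=3, k=5, parts=10)
p40 = grouped_fractile(example, width=3, k=40, parts=100)
print(round(q3, 2), round(d5, 2), round(p40, 2))
```

Calling the function with `k=1, parts=2` reproduces the grouped median, since the median is the 5th decile and the 50th percentile.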
