Introduction
Descriptive statistics are used to describe the basic features of the data in a
study. They provide simple summaries about the sample and the measures. Together
with simple graphical analysis, they form the basis of virtually every quantitative
analysis of data.
Descriptive statistics present quantitative descriptions in a manageable form. In a
research study we may have many measures, or we may measure a large number of
people on any one measure. Descriptive statistics help us simplify large amounts of
data in a sensible way. Each descriptive statistic reduces a large amount of data to a
simpler summary.
The most common descriptive measures describe the central tendency of a dataset;
others describe its variability.
Objectives:
1. Identify the different types of descriptive statistics.
2. Describe the importance of descriptive statistics and their application in education.
3. Utilize various data management tools to process and manage quantitative
data.
4. Interpret data based on the results of computation.
Discussion:
A. Frequency Measurement
The easiest method of organizing data is a frequency distribution, which converts
raw data into a meaningful pattern for statistical analysis.
The following are the steps of constructing a frequency distribution:
1. Specify the number of class intervals. A class is a group (category) of interest.
No universally accepted rule tells us how many intervals to use; between 5
and 15 class intervals are generally recommended. Note that the classes must
be both mutually exclusive and all-inclusive. Mutually exclusive means the
classes must be selected so that an item cannot fall into two classes, and
all-inclusive means that the classes together contain all the data.
2. When all classes are to have the same width, the following rule may be used to
find the required class interval width:
W = (L - S) / N
Where:
W = Class width
L = the largest data,
S = the smallest data,
N = number of classes
Example:
Suppose the ages of a sample of 10 students are:
20.9, 18.1, 18.5, 21.3, 19.4, 25.3, 22.0, 23.1, 23.9 and 22.5
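We select N = 4, so W = (25.3 - 18.1) / 4 = 1.8, which is rounded up to 2. Starting the
first class at 18, the frequency table is as follows:
Age (years)        Frequency    Relative Frequency
18 to under 20     3            0.30
20 to under 22     2            0.20
22 to under 24     4            0.40
24 to under 26     1            0.10
Total              10           1.00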
Note: The sum of all the relative frequencies must always equal 1.00, or 100%. In
the above example, we see that 40% of all students are at least 22 years old but
younger than 24. Relative frequency may be determined for both quantitative and
qualitative data, and it is a convenient basis for comparing similar groups of
different sizes.
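The class-width rule and the relative frequencies for the student-age example can be checked with a short script. This is a minimal Python sketch, not part of the original module. Starting the first class at 18 (the smallest age rounded down) and treating each class as including its lower edge but not its upper edge are assumptions.

```python
import math

# Ages of the 10 sampled students, taken from the example above.
ages = [20.9, 18.1, 18.5, 21.3, 19.4, 25.3, 22.0, 23.1, 23.9, 22.5]
n_classes = 4  # N, the chosen number of classes

# W = (L - S) / N, rounded up to a convenient whole number.
raw_width = (max(ages) - min(ages)) / n_classes
width = math.ceil(raw_width)

# Assumption: the first class starts at the smallest age rounded down.
start = math.floor(min(ages))

# Count how many ages fall in each class [lower, lower + width).
counts = [0] * n_classes
for age in ages:
    counts[int((age - start) // width)] += 1

# Relative frequency = class frequency / total number of observations.
relative = [count / len(ages) for count in counts]

for i, (count, rel) in enumerate(zip(counts, relative)):
    lower = start + i * width
    print(f"{lower} to under {lower + width}: f = {count}, relative f = {rel:.2f}")
```

The relative frequencies sum to 1, which is the check described in the note.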
What Does a Frequency Distribution Tell Us?
1. It shows how the observations cluster around a central value; and
2. It shows the degree of difference between observations.
For example, in the above problem we know that no student is younger than 18 and
that ages below 24 are most typical. The most common age is between 22 and 24, which
from general information we know to be higher than usual for students who enter
college right after high school and graduate at about age 22. The students in the sample
are generally older. It is possible that the population is made up of night students who
work on their degrees part-time while holding full-time jobs.
This descriptive analysis provides us with an image of the student sample, which is
not available from raw data.
Illustration:
Consider the following set of data, which are the scores recorded for 30 participants.
We wish to summarize this data by creating a frequency distribution of the scores.
Data Set – Scores Recorded for 30 Participants
50 45 49 50 43
49 50 49 45 49
47 47 44 51 51
44 47 46 50 44
51 49 43 43 49
45 46 45 51 46
If we use this data and follow the suggestions for creation of a grouped frequency
distribution, we would create the following grouped frequency distribution.
Grouped Frequency Distribution for Scores in the Major Examination
(Statistics)
Class Interval    Tally          Interval Midpoint    Frequency
57 - 59           //////         58                   6
54 - 56           ///////        55                   7
51 - 53           ///////////    52                   11
48 - 50           /////////      49                   9
45 - 47           ///////        46                   7
42 - 44           //////         43                   6
39 - 41           ////           40                   4
                                                      N = 50
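As a minimal sketch of how such a tally can be produced, the following Python snippet counts the 30 participant scores listed in the data set above. It tallies those 30 scores only, not the 50 examination scores in the grouped table, and the variable names are illustrative.

```python
from collections import Counter

# The 30 participant scores from the data set above, row by row.
scores = [
    50, 45, 49, 50, 43,
    49, 50, 49, 45, 49,
    47, 47, 44, 51, 51,
    44, 47, 46, 50, 44,
    51, 49, 43, 43, 49,
    45, 46, 45, 51, 46,
]

# Counter maps each distinct score to the number of times it occurs.
freq = Counter(scores)

# Print score, tally marks and frequency, highest score first.
for score in sorted(freq, reverse=True):
    print(f"{score:>5}  {'/' * freq[score]:<8} {freq[score]}")

print("N =", sum(freq.values()))
```

Grouping into class intervals follows the same idea, with each score assigned to its interval before counting.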
B. Measures of Central Tendency
• Measures of central tendency help you find the middle, or the average, of a
data set. The three most common measures are the mode, median, and mean.
• In statistics, a measure of central tendency is also called a center or location
of the distribution. Colloquially, measures of central tendency are often called averages.
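To make the three measures concrete, here is a minimal Python sketch that applies the standard library's `statistics` module to the 30 participant scores listed in section A. The choice of data set is an assumption made for illustration; the module itself does not work this example.

```python
import statistics

# The 30 participant scores from the illustration in section A.
scores = [
    50, 45, 49, 50, 43, 49, 50, 49, 45, 49,
    47, 47, 44, 51, 51, 44, 47, 46, 50, 44,
    51, 49, 43, 43, 49, 45, 46, 45, 51, 46,
]

mean = statistics.mean(scores)      # sum of the scores divided by their count
median = statistics.median(scores)  # middle value of the sorted scores
mode = statistics.mode(scores)      # most frequently occurring score

print(f"mean = {mean:.2f}, median = {median}, mode = {mode}")
```

For these scores the mean (about 47.27) and the median (47) are close, while the mode (49) sits slightly higher.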
C. Measurement of Variability
The Range
The range is the difference between the largest and smallest values in a
set of values.
EXAMPLE:
For grouped data:
Range = upper boundary of the highest interval - lower boundary of the lowest interval
SCORES    FREQUENCY (f)
22-24     5
19-21     6
16-18     7
13-15     8
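The grouped-data range rule can be sketched in a few lines of Python. The sketch assumes that class boundaries lie half a unit beyond the stated whole-number limits (so 22-24 has an upper boundary of 24.5), and it uses only the four intervals shown in the example.

```python
# Class intervals from the example, as (lower limit, upper limit) pairs.
intervals = [(13, 15), (16, 18), (19, 21), (22, 24)]

# Assumption: boundaries lie half a unit beyond the whole-number limits.
lower_boundary = min(low for low, _ in intervals) - 0.5
upper_boundary = max(high for _, high in intervals) + 0.5

# Range = upper boundary of highest interval - lower boundary of lowest interval.
grouped_range = upper_boundary - lower_boundary
print(grouped_range)
```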
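Interquartile Range
The interquartile range (IQR) is the difference between the third and first
quartiles: IQR = Q3 - Q1.
Example 1:
Data: 2 3 1 4 6 8 9 10 12 3 4
Number of values: n = 11
Find Q2.
Sorted data: 1 2 3 3 4 4 6 8 9 10 12
Position of Q2 = (n + 1) x 50% = (11 + 1) x .50 = 6
Q2 is the 6th value of the sorted data, so Q2 = 4.
Example 2:
Data (sorted): 1 2 3 3 4 5 6 6 7 8 8 9
Number of values: n = 12
Find Q1, Q2, Q3 and the IQR.
Position of Q1 = (n + 1) x 25% = (12 + 1) x .25 = 13 x .25 = 3.25
The position falls between the 3rd and 4th values, so Q1 = (3 + 3)/2 = 3.
Position of Q2 = (n + 1) x 50% = (12 + 1) x .50 = 13 x .50 = 6.5
The position falls between the 6th and 7th values, so Q2 = (5 + 6)/2 = 5.5.
Position of Q3 = (n + 1) x 75% = (12 + 1) x .75 = 13 x .75 = 9.75
The position falls between the 9th and 10th values, so Q3 = (7 + 8)/2 = 7.5.
IQR = Q3 - Q1 = 7.5 - 3 = 4.5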
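The two examples can be reproduced with a small helper. This Python sketch mirrors the convention used in the worked examples, where a fractional position is resolved by averaging the two neighboring values; other conventions, such as linear interpolation, can give slightly different quartiles. The function name is illustrative.

```python
def value_at_position(sorted_values, position):
    """Return the value at a 1-based position in sorted data.

    A whole-number position returns that value; a fractional position
    returns the average of the two neighboring values, as in the examples.
    """
    lower = int(position)
    if position == lower:
        return sorted_values[lower - 1]
    return (sorted_values[lower - 1] + sorted_values[lower]) / 2


# Example 1: 11 values, Q2 at position (11 + 1) x .50 = 6.
data1 = sorted([2, 3, 1, 4, 6, 8, 9, 10, 12, 3, 4])
q2_example1 = value_at_position(data1, (len(data1) + 1) * 0.50)

# Example 2: 12 values, positions 3.25, 6.5 and 9.75.
data2 = sorted([1, 2, 3, 3, 4, 5, 6, 6, 7, 8, 8, 9])
n = len(data2)
q1 = value_at_position(data2, (n + 1) * 0.25)
q2 = value_at_position(data2, (n + 1) * 0.50)
q3 = value_at_position(data2, (n + 1) * 0.75)
iqr = q3 - q1

print(q2_example1, q1, q2, q3, iqr)
```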
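For grouped data, the quartiles are found from the cumulative frequencies:
Score     Frequency (f)    Cumulative Frequency (F)
10-12     4                4
13-15     8                12
16-18     7                19
19-21     6                25
22-24     5                30
Formula for the first quartile:
$$Q_1 = L_{Q_1} + \left(\frac{\frac{\sum f}{4} - F_{Q_1-1}}{f_{Q_1}}\right) C_{Q_1}$$
Where:
$L_{Q_1}$ = lower bound of the first quartile class
$F_{Q_1-1}$ = cumulative frequency of the class before the first quartile class
$f_{Q_1}$ = actual frequency of the first quartile class
$C_{Q_1}$ = width of the first quartile class
Step 1: Calculate ∑f/4 = 30/4 = 7.5.
Step 2: Look for the first cumulative frequency that exceeds ∑f/4 = 7.5. That value
is 12, so the first quartile class is 13-15.
Step 3: Substitute $L_{Q_1}$ = 13, $F_{Q_1-1}$ = 4, $f_{Q_1}$ = 8 and $C_{Q_1}$ = 3:
Q1 = 13 + ((7.5 - 4)/8) x 3
Q1 = 13 + (.4375) x 3
Q1 = 13 + 1.3125
Q1 = 14.3125
Formula for the third quartile:
$$Q_3 = L_{Q_3} + \left(\frac{\frac{3\sum f}{4} - F_{Q_3-1}}{f_{Q_3}}\right) C_{Q_3}$$
First, calculate 3∑f/4 = 90/4 = 22.5. The first cumulative frequency that exceeds 22.5
is 25, so the third quartile class is 19-21, with $L_{Q_3}$ = 19, $F_{Q_3-1}$ = 19,
$f_{Q_3}$ = 6 and $C_{Q_3}$ = 3:
Q3 = 19 + ((22.5 - 19)/6) x 3
Q3 = 19 + (.5833) x 3
Q3 = 19 + 1.75
Q3 = 20.75
IQR = Q3 - Q1
IQR = 20.75 - 14.3125 = 6.4375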
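The grouped-data quartile steps can be sketched in Python as follows. The sketch follows the worked example in using the stated lower class limits (13 and 19) as the lower bounds; the function name and the tuple layout are illustrative assumptions.

```python
def quartile_from_table(k, table):
    """k-th quartile of grouped data.

    `table` lists the classes in ascending order as
    (lower bound, class width, frequency) tuples.
    """
    total = sum(freq for _, _, freq in table)
    target = k * total / 4            # e.g. sum(f)/4 for Q1, 3*sum(f)/4 for Q3
    below = 0                         # cumulative frequency before the class
    for lower, width, freq in table:
        if below + freq >= target:    # first class whose cumulative f reaches target
            return lower + (target - below) / freq * width
        below += freq
    raise ValueError("k must be between 1 and 4")


# Score distribution from the example: 10-12, 13-15, 16-18, 19-21, 22-24.
table = [(10, 3, 4), (13, 3, 8), (16, 3, 7), (19, 3, 6), (22, 3, 5)]

q1 = quartile_from_table(1, table)
q3 = quartile_from_table(3, table)
iqr = q3 - q1
print(q1, round(q3, 4), round(iqr, 4))
```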
VARIANCE
- The variance is the mean of the squared deviations from the mean of a frequency
distribution.
- For large data sets, the variance is computed using the frequency and midpoint value
of each interval, the deviation of each midpoint from the mean and its square, and the
product of the frequency and the squared deviation.
Variance:
$$\sigma^2 = \frac{\sum f (X - \bar{x})^2}{\sum f - 1}$$
Where:
f = class frequency
X = class mark (midpoint)
$\bar{x}$ = mean
∑f = total number of frequencies
EXAMPLE:
For the score distribution above (∑f = 30, mean = 17), ∑f(X - mean)² = 450.
Variance = 450/29 = 15.52
SD = √variance = √15.52 = 3.94
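The figure of 450 in the example can be reproduced from the score distribution used for the grouped quartiles (10-12 through 22-24, with frequencies 4, 8, 7, 6, 5). The following Python sketch assumes that table is the one the example refers to; it computes the class marks, the mean, and then the variance with the ∑f - 1 divisor.

```python
import math

# (lower limit, upper limit, frequency) for each class interval.
classes = [(10, 12, 4), (13, 15, 8), (16, 18, 7), (19, 21, 6), (22, 24, 5)]

# Class mark X = midpoint of each interval.
marks = [(low + high) / 2 for low, high, _ in classes]
freqs = [freq for _, _, freq in classes]

total_f = sum(freqs)                                       # sum of f
mean = sum(f * x for f, x in zip(freqs, marks)) / total_f  # grouped mean

# Sum of f * (X - mean)^2, then divide by (sum of f - 1).
sum_sq = sum(f * (x - mean) ** 2 for f, x in zip(freqs, marks))
variance = sum_sq / (total_f - 1)
sd = math.sqrt(variance)

print(mean, sum_sq, round(variance, 2), round(sd, 2))
```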
D. Measures of Position (Quartile, Decile and Percentile)
Quartiles (Qk) are the score points that divide the distribution into four equal parts.
The cut points are denoted by Q1, Q2 and Q3; Q4, where it is used, marks the upper end
of the distribution.
Deciles (Dk) are the score points that divide the distribution into 10 equal parts.
The cut points are denoted by D1, D2, ... D9; D10, where it is used, marks the upper end
of the distribution.
Percentiles (Pk) are the score points that divide the distribution into 100 equal
parts. Each distribution has 99 percentiles, denoted by P1, P2, ... P99.
• P10 = D1
• P20 = D2
• P25 = Q1
• P50 = D5 = Q2
• P75 = Q3
• P90 = D9
Decile    Percentile    Quartile
D1        P10
D2        P20
          P25           Q1
D3        P30
D4        P40
D5        P50           Q2
D6        P60
D7        P70
          P75           Q3
D8        P80
D9        P90
D10       P100          Q4
FORMULA:
For quartile:
$$Q_k (n + 1) = a.b$$
$$V = X_a + .b\,(X_{a+1} - X_a)$$
For decile:
$$D_k (n + 1) = a.b$$
$$V = X_a + .b\,(X_{a+1} - X_a)$$
For percentile:
$$P_k (n + 1) = a.b$$
$$V = X_a + .b\,(X_{a+1} - X_a)$$
Where:
Qk = the proportion for the kth quartile (k/4, for example .25 for Q1)
Dk = the proportion for the kth decile (k/10, for example .70 for D7)
Pk = the proportion for the kth percentile (k/100, for example .78 for P78)
n = total number of terms
a = the integer part of the result from the first equation
.b = the decimal part of that result
V = the value at the Qk, Dk, or Pk position
Xa = the value in the ath position of the ordered data
Example:
For Q1 (n = 20):
Q1 (n + 1)
= .25(20 + 1)
= 5.25th
Thus, a = 5 and .b = .25
Solving for V:
$V = X_a + .b\,(X_{a+1} - X_a)$
= X5 + .25(X6 - X5)
= 15 + .25(16 - 15)
= 15 + .25
= 15.25
Hence, 25% of the distribution falls below 15.25 (the 5th term, X5 = 15, is the
reference point).
For D7:
D7 (n + 1)
= .70(20 + 1)
= 14.7th
Thus, a = 14 and .b = .7
Solving for V:
$V = X_a + .b\,(X_{a+1} - X_a)$
= X14 + .7(X15 - X14)
For P78:
P78 (n + 1)
= .78(20 + 1)
= 16.38th
Thus, a = 16 and .b = .38
Solving for V:
$V = X_a + .b\,(X_{a+1} - X_a)$
= X16 + .38(X17 - X16)
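The position-then-interpolate procedure can be sketched as one Python function. The 20-term data set below is hypothetical: it is not the data behind the worked example, and was chosen only so that X5 = 15 and X6 = 16 as in the Q1 computation. The function name is also illustrative.

```python
def value_at(fraction, ordered):
    """Locate fraction * (n + 1) in ordered data and interpolate.

    `fraction` is k/4 for a quartile, k/10 for a decile,
    or k/100 for a percentile.
    """
    n = len(ordered)
    position = fraction * (n + 1)  # the "a.b" position, counted from 1
    a = int(position)              # integer part
    b = position - a               # decimal part (.b)
    if a < 1:
        return ordered[0]          # position before the first term
    if a >= n:
        return ordered[-1]         # position at or beyond the last term
    # V = X_a + .b (X_{a+1} - X_a); Python lists are indexed from 0.
    return ordered[a - 1] + b * (ordered[a] - ordered[a - 1])


# Hypothetical ordered data set of 20 terms: 11, 12, ..., 30.
data = list(range(11, 31))

q1 = value_at(0.25, data)   # position 5.25
d7 = value_at(0.70, data)   # position 14.7
p78 = value_at(0.78, data)  # position 16.38
print(q1, round(d7, 2), round(p78, 2))
```

With this hypothetical data, Q1 reproduces the 15.25 obtained in the worked example; the D7 and P78 values apply only to the made-up data.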
Example:
The test scores of 27 Grade 10 students in Mathematics are shown below. Find Q2, D8
and P35.
Scores    Frequency (f)    Cumulative Frequency (cf)    Lower Boundary (L)
30-34     3
25-29     9
20-24     8
15-19     5
10-14     2
          n = 27
Solving for Q2
1. Determine the cumulative frequencies (copy the frequency of the lowest class
interval, then add the frequency of each succeeding class interval).
Scores    Frequency (f)    Cumulative Frequency (cf)    Lower Boundary (L)
30-34     3                27
25-29     9                24
20-24     8                15
15-19     5                7
10-14     2                2
          n = 27
2. Determine the lower boundaries (subtract 0.5 from the smallest number in each class
interval).
Scores    Frequency (f)    Cumulative Frequency (cf)    Lower Boundary (L)
30-34     3                27                           29.5
25-29     9                24                           24.5
20-24     8                15                           19.5
15-19     5                7                            14.5
10-14     2                2                            9.5
          n = 27
3. Determine the location of the quartile class by dividing the total frequency (n)
by 2 for the 2nd quartile: n/2 = 27/2 = 13.5.
4. Locate 13.5 in the Cumulative Frequency (cf) column. If 13.5 cannot be found
there, look for the next higher value, which is 15. The Q2 class is therefore 20-24.
5. Look for the value of the lower boundary: L = 19.5.
6. Look for the value of the cumulative frequency from below (cfb): cfb = 7.
7. Look for the value of the class interval (i): i = 5.
8. Look for the value of the frequency of the Q2 class: f = 8.
9. Then solve for Q2 using the formula:
$$Q_2 = L + \left(\frac{\frac{n}{2} - cfb}{f}\right) i$$
Q2 = 19.5 + ((13.5 - 7)/8) x 5 = 19.5 + 4.0625 = 23.56
Therefore: 50% of the class got a score lower than or equal to 23.56 and 50% got a score
higher than 23.56. (Note that Q2 = P50 = D5 = Median.)
Solving for D8
1. Determine the location of the decile class by multiplying the total frequency (n)
by 8 and dividing by 10: (8 x 27)/10 = 21.6.
2. Locate 21.6 in the Cumulative Frequency (cf) column. If 21.6 cannot be found
there, look for the next higher value, which is 24. The D8 class is therefore 25-29.
3. Look for the value of the lower boundary: L = 24.5.
4. Look for the value of the cumulative frequency from below (cfb): cfb = 15.
5. Look for the value of the class interval (i): i = 5.
6. Look for the value of the frequency of the D8 class: f = 9.
7. Then solve for D8 using the formula:
$$D_8 = L + \left(\frac{\frac{8n}{10} - cfb}{f}\right) i$$
D8 = 24.5 + ((21.6 - 15)/9) x 5 = 24.5 + 3.67 = 28.17
Therefore: 80% of the class got a score lower than or equal to 28.17 and 20% got a score
higher than 28.17.
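The steps for Q2 and D8 follow one pattern, which can be sketched as a single Python function. The function name and tuple layout are illustrative assumptions. The module asks for P35 but does not work it out, so the P35 value below is only what the same procedure yields, not a figure from the module.

```python
def grouped_position(k, parts, classes):
    """k-th of `parts` division points for grouped data.

    `parts` is 4 for quartiles, 10 for deciles, 100 for percentiles.
    `classes` lists (lower boundary, class interval i, frequency f)
    in ascending order of score.
    """
    n = sum(freq for _, _, freq in classes)
    target = k * n / parts       # location in the cumulative frequencies
    cfb = 0                      # cumulative frequency from below
    for lower_boundary, width, freq in classes:
        if cfb + freq >= target:  # first class whose cf reaches the target
            return lower_boundary + (target - cfb) / freq * width
        cfb += freq
    raise ValueError("k must not exceed parts")


# Grade 10 Mathematics scores: classes 10-14, 15-19, 20-24, 25-29, 30-34.
classes = [(9.5, 5, 2), (14.5, 5, 5), (19.5, 5, 8), (24.5, 5, 9), (29.5, 5, 3)]

q2 = grouped_position(2, 4, classes)      # target 13.5, class 20-24
d8 = grouped_position(8, 10, classes)     # target 21.6, class 25-29
p35 = grouped_position(35, 100, classes)  # target 9.45, class 20-24
print(round(q2, 2), round(d8, 2), round(p35, 2))
```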
Summary
Descriptive statistics are very important because they facilitate data visualization.
If we simply presented our raw data, it would be hard to see what the data were
showing, especially if there were a lot of it. Descriptive statistics allow data to be
presented in a meaningful and understandable way, which in turn allows for a simplified
interpretation of the data set in question. Descriptive statistics are used to describe
the basic features of the data in a study. They provide simple summaries about the
sample and the measures. Together with simple graphical analysis, they form the basis
of virtually every quantitative analysis of data.
There are four major types of descriptive statistics: frequency distribution,
measures of central tendency, measures of dispersion or variation, and measures of
position. A frequency distribution is normally presented in a table or graph.
Assessment:
1. Which of the following provides a measure of central location for data?
a. Standard deviation
b. Mean
c. Variance
d. Range
5. The difference between the largest and the smallest data values is the ____.
a. Variance
b. Interquartile range
c. Range
d. Coefficient of variation
Prepared by:
Group 2
Abinales, Susan
Abobo, Ma. Joy R.
Agda, Benjie
Dacalos, Rochelle
Gesite, Annabelle C.
Projimo, Charitess
Turla, Arlene N.
Uy, Hector B.