Descriptive Statistics

  Descriptive statistics are used to organize and summarize scores from samples and to describe a particular group of observations.
 2 Categories of Descriptive Statistics
 Measures of Central Tendency
 Measures of Dispersion
Measures of Central Tendency

 Central tendency is the middle point of a distribution.
 Measures of central tendency are also known as Measures of central location.
  Measures of central tendency yield information about the center, or middle part, of a
group of numbers.
 MEAN, MEDIAN, MODE
Mean
  Often called the arithmetic mean, and more commonly known as the “average”, the mean is equal to the
sum of all values in the data set divided by the number of values in the data set.
 Uses of Mean
1. For interval and ratio measurement
2. When greatest sampling stability is desired
3. When you want to know the “center of gravity” of a sample
For Ungrouped Data

Example 1
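  Find the mean of the measurements
18, 26, 27, 29, 30
Solution:
  Substitute the measurements into the formula, where the mean is the sum of all values divided by the number of values:
$$\bar{x} = \frac{\sum x}{n} = \frac{18 + 26 + 27 + 29 + 30}{5} = \frac{130}{5} = 26$$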
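The same computation can be checked in a few lines of Python. This is only a minimal sketch: the variable names are illustrative, and `statistics.mean` is the standard-library function, not something these notes prescribe.

```python
from statistics import mean

# Measurements from Example 1
measurements = [18, 26, 27, 29, 30]

# Mean = sum of all values / number of values
manual_mean = sum(measurements) / len(measurements)
print(manual_mean)  # 26.0

# The standard library gives the same value
library_mean = mean(measurements)
print(library_mean)
```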
For Grouped Data

$$\text{Mean } (\bar{x}) = \frac{\sum (\text{frequency of each class} \times \text{class midpoint})}{\text{total number of observations}}$$
$$\bar{x} = \frac{\sum fx}{n}$$
Where:
f = frequency of each class
x = class midpoint (the value that represents each class)
n = total number of observations in the data set
Example 1:
$$\bar{x} = \frac{\sum fx}{n} = \frac{2905}{50} = 58.1 \text{ inches}$$
Example 2:
$$\bar{x} = \frac{\sum fx}{n} = \frac{2965}{50} = 59.3$$
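The grouped-data formula can be sketched in Python as follows. The frequency table here is hypothetical and chosen only for illustration; it is not the table behind Example 1 or Example 2, which is not reproduced in these notes.

```python
# Hypothetical frequency table (illustrative only, not the data behind the examples above)
midpoints = [52, 55, 58, 61, 64]   # x: class midpoints
frequencies = [5, 10, 20, 10, 5]   # f: frequency of each class

# n is the total number of observations, i.e. the sum of the frequencies
n = sum(frequencies)

# Sum of (frequency x midpoint) over all classes
sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))

grouped_mean = sum_fx / n
print(n, sum_fx, grouped_mean)  # 50 2900 58.0
```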
Median
  When the data are ranked in order of size, the value that
occupies the middle position is the median.
 Uses of Median
1. For ordinal or rank measurement
2. When there is no sufficient time to compute the mean
3. When you want to know whether cases fall in the upper or lower half of the distribution, and not
how far they are from the central point.
For Ungrouped Data
1. Arrange the items (scores, responses, observations) from lowest to highest.
2. Count to the middle value.
 For an odd number of values arranged from lowest to highest, the median
corresponds to the ((n + 1)/2)th value.
 If the array contains an even number of observations, the median is the average
of the two middle values.
Example 1:
  Consider this set with an odd number of numerical values:
7, 8, 8, 9, 10, 12, 23
 By inspection, the median is 9 because half of the values (7, 8, 8) are below 9 and half
(10, 12, 23) are above 9. Since n = 7 is odd, the median has rank = 4th item and is equal
to 9.
 Answer: The Median is 9.
Example 2:
  Consider this set with an even number of numerical values:
12, 15, 18, 22, 30, 32
  The two middle values are 18 and 22. Taking the average of the two middle numbers,
(18 + 22) / 2 = 40 / 2, the median is 20.
 Answer: The Median is 20.
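The two-step procedure for ungrouped data (sort, then take the middle value or the average of the two middle values) can be sketched in Python. The function name `median_ungrouped` is an illustrative choice, not part of the original notes.

```python
def median_ungrouped(values):
    """Median of ungrouped data: sort, then pick or average the middle value(s)."""
    ordered = sorted(values)          # Step 1: arrange from lowest to highest
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    middle = n // 2                   # Step 2: count to the middle
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)th value, which is index n // 2 when counting from 0
        return ordered[middle]
    # Even n: average of the two middle values
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median_ungrouped([7, 8, 8, 9, 10, 12, 23]))   # 9
print(median_ungrouped([12, 15, 18, 22, 30, 32]))   # 20.0
```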
For Grouped Data

 If the data are grouped into classes, the median will fall into one of the classes as the
(n/2)th value.
 The process involves several steps and has for its general formula the following:
$$\text{Median} = L + i\left(\frac{\frac{n}{2} - F}{f}\right)$$
where:
L = exact lower limit of the class containing the median (median class)
i = interval size
n = total number of items or observations
F = cumulative frequency in the class preceding the median class
f = frequency of the median class
Example:
  Since there are 100 values in the data set, the median will represent the (n/2)th or the
(100/2)th item, that is, the 50th value in the ordered data.
  Determine in which class the 50th value falls. The first two classes have a cumulative
frequency of 34.
  We need another 16 values to reach 50. Thus, the 50th value falls in the next class, which
contains 22 values. The median class then is 31-40.
Thus,
L = 30.5
n = 100
F = 34
f = 22
i = 10
$$\text{Median} = 30.5 + 10\left(\frac{50 - 34}{22}\right) = 37.77$$
 This means that 50% or 50 of the 100 ages will fall below 37.77 and 50% or 50 will fall
above it.
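As a quick check of the arithmetic, the grouped-median formula can be written as a small Python function. The function name `grouped_median` is an illustrative assumption; the parameter names follow the symbols defined above.

```python
def grouped_median(L, i, n, F, f):
    """Median = L + i * ((n / 2 - F) / f), using the symbols defined in the notes."""
    return L + i * ((n / 2 - F) / f)


# Values from the worked example: median class 31-40
median_age = grouped_median(L=30.5, i=10, n=100, F=34, f=22)
print(round(median_age, 2))  # 37.77
```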
Mode
  It is the most common value in a distribution.
  E.g., the mode of 3, 4, 4, 5, 5, 5, 8 is 5, because 5 occurs most often.
 Bimodal -- Data sets that have two modes
 Multimodal -- Data sets that contain more than two modes
 Uses of Mode
1. For nominal or categorical data
2. When the quickest estimate of central tendency is desired
3. When you wish to know the most typical, or most frequent case in the
distribution
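Finding the mode can be sketched with Python's standard library. `statistics.multimode` returns every value tied for the highest count, so it also covers the bimodal case. The second data set below is hypothetical and added only to illustrate two modes.

```python
from statistics import multimode

# Data from the example above: 5 occurs three times
values = [3, 4, 4, 5, 5, 5, 8]
modes = multimode(values)
print(modes)  # [5]

# Hypothetical bimodal data set: 1 and 2 each occur twice
bimodal_values = [1, 1, 2, 2, 3]
bimodal_modes = multimode(bimodal_values)
print(bimodal_modes)  # [1, 2]
```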
Measures of Dispersion
 The extent of spread, or the dispersion of the data is described by a group of measures
called measures of dispersion, also called measures of variability.
1. Range
2. Mean Deviation
3. Standard Deviation
4. Percentage, Percentiles and Quartiles
Range
 Range is the difference between the largest and the smallest values in a set of data.
 Consider the following scores obtained by ten (10) students participating in a
research contest:
 6, 10, 12, 15, 18 18, 20, 23, 25, 28
 Thus, the R=22. The scores range from 6 to 28.
Mean Deviation
  This measure of spread is defined as the sum of the absolute differences (deviations) between the
values in a set of data and the mean, divided by the total number of values in the set of
data.
  In mathematics, the term “absolute”, represented by the sign “| |”, simply means taking
the value of a number without regard to its positive or negative sign.
Example:
  Consider a set of values which consists of 20, 25, 35, 40, 45. Solving for the mean,
(20 + 25 + 35 + 40 + 45) / 5 = 165 / 5, the mean is 33. Find the mean deviation (also called the average deviation, AD).
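$$\begin{aligned}
AD &= \frac{|20 - 33| + |25 - 33| + |35 - 33| + |40 - 33| + |45 - 33|}{5} \\
&= \frac{|-13| + |-8| + |2| + |7| + |12|}{5} \\
&= \frac{13 + 8 + 2 + 7 + 12}{5} \\
&= \frac{42}{5} \\
&= 8.4
\end{aligned}$$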
 Thus, on the average, each value is 8.4 units from the mean.
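The mean deviation from this example can be reproduced in Python. This is a minimal sketch with illustrative variable names.

```python
# Data from the example above
values = [20, 25, 35, 40, 45]

# Mean of the data set
mean_value = sum(values) / len(values)

# Absolute deviation of each value from the mean
abs_deviations = [abs(x - mean_value) for x in values]

# Mean deviation = sum of absolute deviations / number of values
mean_deviation = sum(abs_deviations) / len(values)

print(mean_value)      # 33.0
print(abs_deviations)  # [13.0, 8.0, 2.0, 7.0, 12.0]
print(mean_deviation)  # 8.4
```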
Standard Deviation
  The standard deviation (SD) is a measure of the spread or variation of data about the mean. Roughly,
it describes how far a typical value lies from the mean.
Example:
Let us consider the same data used to illustrate the range.
The values are 6, 10, 12, 15, 18, 18, 20, 23, 25, 28.
Solution:
1. Compute the mean, $\bar{x}$.
2. Subtract the mean from each score: $(x - \bar{x})$.
3. Square each difference from Step 2: $(x - \bar{x})^2$.
4. Sum all the squares from Step 3: $\sum (x - \bar{x})^2$.
5. Divide the sum from Step 4 by n − 1, where n is the number of items or scores.
The quantity n − 1 is called the degrees of freedom. Dividing by n − 1 instead of n gives a
better estimate of the population variance when the data are a sample.
6. Compute the standard deviation by taking the square root of the result from Step 5:
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
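The six steps can be followed line by line in Python. This is a sketch with illustrative variable names; `statistics.stdev` is used only as a cross-check, since it also divides by n − 1.

```python
from math import sqrt
from statistics import stdev

scores = [6, 10, 12, 15, 18, 18, 20, 23, 25, 28]
n = len(scores)

mean_score = sum(scores) / n                      # Step 1: compute the mean
deviations = [x - mean_score for x in scores]     # Step 2: subtract the mean from each score
squared = [d ** 2 for d in deviations]            # Step 3: square each difference
sum_of_squares = sum(squared)                     # Step 4: sum the squares
variance = sum_of_squares / (n - 1)               # Step 5: divide by n - 1
sd = sqrt(variance)                               # Step 6: take the square root

print(mean_score)      # 17.5
print(sum_of_squares)  # 428.5
print(round(sd, 2))    # 6.9

# Cross-check against the standard library's sample standard deviation
library_sd = stdev(scores)
```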
Percentage
  It expresses the value of each category of a variable as a share of the whole, with the whole
taken as equal to 100.
$$\% = \frac{\text{part}}{\text{whole}} \times 100$$
Percentiles
 They are measures that divide a group of data into 100 parts
 At least n% of the data lie below the nth percentile, and at most (100 - n)% of the data
lie above the nth percentile
 Example: 90th percentile indicates that at least 90% of the data lie below it, and at
most 10% of the data lie above it
 The median and the 50th percentile have the same value.
 Applicable for ordinal, interval, and ratio data
 Not applicable for nominal data
For Calculation:
 Organize the data into an ascending ordered array.
  Calculate the percentile location, i, where P is the percentile of interest and n is the number of values:
$$i = \frac{P}{100}(n)$$
FOR EXAMPLE
 Raw Data: 14, 12, 19, 23, 5, 13, 28, 17
 Ordered Array: 5, 12, 13, 14, 17, 19, 23, 28
  Location of 30th percentile:
$$i = \frac{30}{100}(8) = 2.4$$
 The location index, i, is not a whole number, so take the whole-number portion of
i + 1 = 2.4 + 1 = 3.4, which is 3. The 30th percentile is at the 3rd location of the array, so the 30th
percentile is 13.
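The location-index method can be sketched in Python. The function name `percentile_by_location` is illustrative. The notes only show the case where i is not a whole number. The whole-number branch below (averaging the values at locations i and i + 1) is an assumption based on a common textbook convention, not something stated in these notes.

```python
def percentile_by_location(data, p):
    """Percentile via the location index i = (p / 100) * n, for an integer 0 < p < 100."""
    if not data:
        raise ValueError("data must not be empty")
    if not 0 < p < 100:
        raise ValueError("p must be between 0 and 100 (exclusive)")
    ordered = sorted(data)              # ascending ordered array
    n = len(ordered)
    # Work with 100 * i in integers to avoid floating-point rounding
    whole, remainder = divmod(p * n, 100)
    if remainder == 0:
        # ASSUMPTION (not in the notes): i is whole, so average locations i and i + 1
        return (ordered[whole - 1] + ordered[whole]) / 2
    # i is not whole: use the whole-number portion of i + 1 as the location (1-based)
    return ordered[whole]


raw_data = [14, 12, 19, 23, 5, 13, 28, 17]
print(percentile_by_location(raw_data, 30))  # 13
```

With the assumed whole-number rule, the 50th percentile of the same data is the average of the 4th and 5th values, which matches the median.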
Quartiles
 Measures that divide a group of data into four subgroups
 Q1: 25% of the data set is below the first quartile
 Q2: 50% of the data set is below the second quartile
 Q3: 75% of the data set is below the third quartile
 Q1 is equal to the 25th percentile
 Q2 is located at 50th percentile and equals the median
 Q3 is equal to the 75th percentile
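Quartiles can also be computed with Python's standard library, reusing the ordered array from the percentile example. Note that `statistics.quantiles` interpolates between values, so Q1 and Q3 may differ slightly from results obtained with a location-index method. Q2 still equals the median.

```python
from statistics import median, quantiles

# Ordered array from the percentile example
data = [5, 12, 13, 14, 17, 19, 23, 28]

# n=4 splits the data into four groups and returns the three cut points Q1, Q2, Q3
q1, q2, q3 = quantiles(data, n=4)
print(q1, q2, q3)

# Q2 is the 50th percentile and equals the median
print(q2 == median(data))  # True
```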
