in Descriptive Statistics
Data Types
Numerical/Quantitative
Discrete: A variable is discrete if its set of
possible values either is finite or else can be listed
in an infinite sequence (one in which there is a first
number, a second number, and so on).
Example: the number of persons arriving for
service during a particular period. (1,2,4,6,7……)
Based on enumeration/counting
Data Types
Numerical/Quantitative
Continuous: A variable is continuous if its
possible values consist of an entire interval on the
number line.
Example: weight of an individual, reaction time
for a particular process (2.5, 2.55, etc.)
Based on measurement.
Data Types
Qualitative/Categorical
Ordinal: classes with a natural ordering, e.g. juniors,
seniors and graduate students, or excellent, good,
fair, poor, worst
Nominal (arbitrary): classes with no natural ordering,
e.g. black, green, yellow, white
Plotting Data: describing spread of data
Frequency Table and Histogram
Example:
◦ A researcher is investigating short-term memory capacity:
the number of symbols remembered is recorded for 20
participants:
4, 6, 3, 7, 5, 7, 8, 4, 5,10
10, 6, 8, 9, 3, 5, 6, 4, 11, 6
Illustration of Frequency Table
Frequency tables can display more detailed information about a distribution.
X    f    p      %
11   1    0.05   5%
10   2    0.10   10%
9    1    0.05   5%
8    2    0.10   10%
7    2    0.10   10%
6    4    0.20   20%
5    3    0.15   15%
4    3    0.15   15%
3    2    0.10   10%
X = Memory score (number of symbols remembered)
f = Frequency
N = ∑ f = 20
Percentages and proportions
p = fraction of the total group associated with each score (relative frequency)
p = f/N, ∑ p = 1
% = 100p = 100(f/N)
Important Use
Proportion of participants who remembered up to 6
symbols = 0.10 + 0.15 + 0.15 + 0.20 = 0.60 = 60%
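The frequency table and the "up to 6 symbols" proportion can be reproduced with a short sketch. The helper names frequency_table and proportion_up_to are illustrative assumptions, not part of the slides; the data are the 20 memory scores from the example.

```python
from collections import Counter

# Memory scores for the 20 participants, as listed in the example.
scores = [4, 6, 3, 7, 5, 7, 8, 4, 5, 10,
          10, 6, 8, 9, 3, 5, 6, 4, 11, 6]


def frequency_table(data):
    """Return rows (x, f, p, percent), sorted from highest to lowest x."""
    n = len(data)
    counts = Counter(data)
    return [(x, f, f / n, 100 * f / n)
            for x, f in sorted(counts.items(), reverse=True)]


def proportion_up_to(data, limit):
    """Proportion of observations with a value less than or equal to limit."""
    return sum(1 for x in data if x <= limit) / len(data)


table = frequency_table(scores)
print(" X   f     p    %")
for x, f, p, pct in table:
    print(f"{x:>2}  {f:>2}  {p:.2f}  {pct:.0f}%")

# Proportion of participants who remembered up to 6 symbols.
print(proportion_up_to(scores, 6))
```

The same cumulative idea works for any cut-off, for example proportion_up_to(scores, 8).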
Histogram
Histogram (Discrete Data)
First, determine the frequency and relative
frequency of each x value.
Mark possible x values on a horizontal scale.
Above each value, draw a rectangle whose height is
the relative frequency (or alternatively, the
frequency) of that value.
With rectangles of equal width, this ensures that the
area of each rectangle is proportional to the relative
frequency of the value.
If the relative frequencies of x = 1 and x = 5 are 0.35
and 0.07, respectively, then the area of the rectangle
above 1 is five times the area of the rectangle above 5.
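The steps above can be sketched as a plain-text histogram, where the bar length stands in for rectangle height. This is a minimal illustration using the memory scores from the earlier example; the function names and the scale parameter are assumptions made for this sketch.

```python
from collections import Counter

# Memory scores from the short-term memory example.
scores = [4, 6, 3, 7, 5, 7, 8, 4, 5, 10,
          10, 6, 8, 9, 3, 5, 6, 4, 11, 6]


def relative_frequencies(data):
    """Map each distinct value to its relative frequency f / N."""
    n = len(data)
    return {x: f / n for x, f in sorted(Counter(data).items())}


def text_histogram(data, scale=40):
    """Draw one bar per value; bar length is relative frequency times scale."""
    lines = []
    for x, p in relative_frequencies(data).items():
        bar = "#" * round(p * scale)
        lines.append(f"{x:>3} | {bar} {p:.2f}")
    return "\n".join(lines)


print(text_histogram(scores))
```

Because every bar has the same thickness, a value with twice the relative frequency gets a bar with twice the area.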
Histogram (Simple)
Mode = value with the highest frequency
(or relative frequency)
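As a quick check, the mode of the memory scores from the earlier example can be found with Python's standard library. Using multimode is a choice made for this sketch; it returns every value tied for the highest frequency.

```python
from statistics import multimode

# Memory scores from the short-term memory example.
scores = [4, 6, 3, 7, 5, 7, 8, 4, 5, 10,
          10, 6, 8, 9, 3, 5, 6, 4, 11, 6]

# multimode returns a list, so ties (bimodal data) are reported too.
modes = multimode(scores)
print(modes)  # [6]
```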
Histogram
(large and/or continuous data)
Grouped Frequency Distribution Tables
(Class Interval)
◦ Sometimes the spread of data is too wide
◦ Grouped tables present scores as class intervals
About 5-20 intervals
Intervals should preferably be of equal width
X        f
95-99    1
90-94    1
85-89    0
80-84    1
75-79    2
70-74    4
65-69    7
60-64    0
55-59    6
50-54    3
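A grouped table like this one can be built by counting scores in equal-width class intervals. The slides do not list the raw scores behind the table, so the data below are hypothetical, and the function name grouped_frequency is an assumption made for this sketch. Integer scores are assumed.

```python
def grouped_frequency(data, width, start):
    """Count integer scores in class intervals [lo, lo + width - 1].

    Returns rows (lo, hi, f), highest interval first, as in the slide.
    """
    top = max(data)
    rows = []
    lo = start
    while lo <= top:
        hi = lo + width - 1
        f = sum(1 for x in data if lo <= x <= hi)
        rows.append((lo, hi, f))
        lo += width
    return rows[::-1]


# Hypothetical scores, for illustration only (not the slide's data).
example_scores = [52, 57, 58, 66, 68, 69, 71, 73, 78, 96]

grouped = grouped_frequency(example_scores, width=5, start=50)
for lo, hi, f in grouped:
    print(f"{lo}-{hi}  {f}")
```

Empty intervals are kept with f = 0, as in the 85-89 and 60-64 rows of the table.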
Histogram with Class Intervals/Class Widths (Continuous Data)
Histogram with Equal Class Intervals/Class Widths
Histogram with Class Intervals/Class Widths
Class Formation
There are no hard-and-fast rules concerning
either the number of classes or the choice of
classes themselves.
Between 5 and 20 classes will be satisfactory for
most data sets.
Generally, the larger the number of observations
in a data set, the more classes should be used.
A reasonable rule of thumb is
Number of classes ≈ √(number of observations)
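The square-root rule of thumb can be sketched as follows. Rounding up and clamping the result to the 5 to 20 range mentioned above are assumptions made for this illustration; the slides do not say how to round.

```python
import math


def suggested_classes(n_obs, lo=5, hi=20):
    """Square-root rule of thumb, kept within lo..hi classes."""
    k = math.ceil(math.sqrt(n_obs))  # round up (an assumption)
    return max(lo, min(hi, k))


for n in (20, 100, 1000):
    print(n, suggested_classes(n))
```

For the 20 memory scores this suggests 5 classes, since √20 is about 4.47.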
Histogram with Un-equal Class Intervals/Class Widths
Histogram - Qualitative Data
Histogram Shapes-Data Distribution
Unimodal: One peak
Frequency Distribution: the Normal
Distribution
◦ Bell-shaped: symmetrical around the midpoint,
where the greatest frequency of scores occurs
Skewness
Skewness
Measures of symmetry of data (location of
concentration of data)
◦ Positively Skewed
Skewed to the right
Longer right tail towards high values
Mean > Median
Skewness
Measures of symmetry of data
◦ Negatively Skewed
Skewed to the left
Longer left tail towards low values
Mean < Median
Skewness
Measures of symmetry of data
◦ Symmetric : Bell shaped (Normal
Distribution)
Mean ≈ Median
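The mean-versus-median comparisons from the skewness slides can be tried on small data sets. The three data sets below are hypothetical, and describe_skew is a name invented for this sketch. Comparing mean and median is only a rough indicator of skew, not a formal skewness measure.

```python
from statistics import mean, median


def describe_skew(data, tol=1e-9):
    """Rough shape label based on comparing the mean with the median."""
    m, med = mean(data), median(data)
    if abs(m - med) <= tol:
        return "approximately symmetric"
    return "positively skewed" if m > med else "negatively skewed"


# Hypothetical data sets, for illustration only.
long_right_tail = [1, 2, 2, 3, 3, 3, 4, 10]   # mean 3.5, median 3
long_left_tail = [1, 7, 8, 8, 9, 9, 9, 10]    # mean 7.625, median 8.5
balanced = [1, 2, 3, 4, 5]                    # mean 3, median 3

for name, data in [("long right tail", long_right_tail),
                   ("long left tail", long_left_tail),
                   ("balanced", balanced)]:
    print(name, "->", describe_skew(data))
```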
Mirror image on both sides of centre
Distribution shapes
[Figure: three histograms showing positively skewed, symmetric, and negatively skewed distributions]
Shape of Data
Shape of data is measured by
◦ Skewness
◦ Kurtosis
Skewness
Measures of asymmetry of data
◦ Positive or right skewed: Longer right tail
◦ Negative or left skewed: Longer left tail
Kurtosis Formula
Let x_1, x_2, ..., x_n be n observations. Then,
\[
\text{Kurtosis} = \frac{n \sum_{i=1}^{n} (x_i - \bar{x})^4}{\left( \sum_{i=1}^{n} (x_i - \bar{x})^2 \right)^2} - 3
\]
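A minimal sketch of this calculation, reading the formula as n times the sum of fourth-power deviations, divided by the squared sum of squared deviations, minus 3. The function name excess_kurtosis and the sample data are assumptions made for this illustration.

```python
def excess_kurtosis(data):
    """n * sum((x - mean)^4) / (sum((x - mean)^2))^2 - 3."""
    n = len(data)
    mean = sum(data) / n
    sum_sq = sum((x - mean) ** 2 for x in data)
    sum_fourth = sum((x - mean) ** 4 for x in data)
    if sum_sq == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    return n * sum_fourth / sum_sq ** 2 - 3


# Hypothetical data: deviations are -2, -1, 0, 1, 2, so the sums are
# 10 (squares) and 34 (fourth powers), giving 5 * 34 / 100 - 3.
sample = [1, 2, 3, 4, 5]
print(excess_kurtosis(sample))  # approximately -1.3
```

With the minus 3 term, a normal distribution scores about 0, so a negative result like this one indicates a flatter shape with lighter tails than the normal.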