Introduction and Descriptive Statistics (I.e. Easy Stuff)
1 LEARNING OBJECTIVES
After studying this chapter, you should be able to:
Distinguish between qualitative data and quantitative data.
Describe nominal, ordinal, interval, and ratio scales of
measurement.
Describe the difference between population and sample.
Calculate and interpret percentiles and quartiles.
Explain measures of central tendency and how to compute
them.
Create different types of charts that describe data sets.
Use Excel templates to compute various measures and create
charts.
WHAT IS BIOSTATISTICS?
Qualitative                      Quantitative
(Categorical or Nominal)         (Measurable or Countable)
Examples:                        Examples:
  Color                            Temperatures
  Gender                           Salaries
  Nationality                      Number of points scored on a 100-point exam
Scales of Measurement
elements.
Why Sample?
Example 1-2
Other summary
measures:
Skewness
Kurtosis
• Mean Average
Grams   Sorted grams   Rank
33      18             1
26      18             2
24      18             3
21      18             4
19      19             5
20      20             6
18      20             7
18      20             8
52      21             9
56      22             10
27      22             11
22      23             12
18      24             13
49      26             14
22      27             15
20      32             16
23      33             17
32      49             18
20      52             19
18      56             20

Range = Maximum – Minimum = 56 – 18 = 38
First Quartile: position (20+1)×25/100 = 5.25, value 19 + (.25)(1) = 19.25
Median: position (20+1)×50/100 = 10.5, value 22 + (.5)(0) = 22
Third Quartile: position (20+1)×75/100 = 15.75, value 27 + (.75)(5) = 30.75
Interquartile Range = Q3 – Q1 = 30.75 – 19.25 = 11.5
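The position rule used in the table, (n+1)×P/100 with linear interpolation between neighboring sorted values, can be sketched in Python. This is a minimal illustration only: the function name `percentile` and the clamping of out-of-range positions are assumptions, not part of the slides.

```python
def percentile(data, p):
    """Return the p-th percentile using the (n + 1) * p / 100 position rule."""
    values = sorted(data)
    n = len(values)
    position = (n + 1) * p / 100   # 1-based position, possibly fractional
    lower = int(position)          # integer part of the position
    fraction = position - lower    # fractional part used for interpolation
    # Assumption: clamp positions that fall outside the data to the extremes.
    if lower < 1:
        return values[0]
    if lower >= n:
        return values[-1]
    below = values[lower - 1]      # value at the integer position
    above = values[lower]          # next value in sorted order
    return below + fraction * (above - below)


# The 20 weights (in grams) from the table above.
grams = [33, 26, 24, 21, 19, 20, 18, 18, 52, 56,
         27, 22, 18, 49, 22, 20, 23, 32, 20, 18]

q1 = percentile(grams, 25)
median = percentile(grams, 50)
q3 = percentile(grams, 75)
iqr = q3 - q1
data_range = max(grams) - min(grams)
print(q1, median, q3, iqr, data_range)
```

Spreadsheet and library percentile functions often use a different position rule, so their quartiles may not match these values exactly.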
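Population variance:

$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} = \frac{\sum_{i=1}^{N} x_i^2 - \dfrac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}}{N}$$

Sample variance:

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}$$

Standard deviations:

$$\sigma = \sqrt{\sigma^2} \qquad s = \sqrt{s^2}$$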
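As a rough check on the variance formulas, the sketch below computes the population and sample variance from the definitional form and the sample variance again from the shortcut (sum of squares) form. The function names are illustrative assumptions, and the data are the 20 weights in grams used earlier in the chapter.

```python
import math


def population_variance(xs):
    """Definitional form: average squared deviation from the mean, divisor N."""
    n = len(xs)
    mu = sum(xs) / n
    return sum((x - mu) ** 2 for x in xs) / n


def sample_variance(xs):
    """Definitional form: squared deviations from the sample mean, divisor n - 1."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)


def sample_variance_shortcut(xs):
    """Shortcut form: (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)


grams = [33, 26, 24, 21, 19, 20, 18, 18, 52, 56,
         27, 22, 18, 49, 22, 20, 23, 32, 20, 18]

sigma2 = population_variance(grams)
s2 = sample_variance(grams)
s2_short = sample_variance_shortcut(grams)
sigma = math.sqrt(sigma2)   # population standard deviation
s = math.sqrt(s2)           # sample standard deviation
print(sigma2, s2, s2_short, sigma, s)
```

The two sample-variance functions should agree up to floating-point rounding; the shortcut form is convenient by hand but can lose precision when the values are large relative to their spread.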
Frequency Distribution
Columns: x = Spending Class ($); f(x) = Frequency (number of customers); f(x)/n = Relative Frequency
Totals: f(x) = 184; f(x)/n = 1.000
Cumulative columns: x = Spending Class ($); F(x) = Cumulative Frequency; F(x)/n = Cumulative Relative Frequency
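The four quantities f(x), f(x)/n, F(x), and F(x)/n can be built from raw observations as sketched below. The spending amounts, the class width of 100, and the function name `frequency_table` are hypothetical values chosen for illustration; they are not the data behind the slide.

```python
def frequency_table(values, width, n_classes):
    """Return rows of (lower, upper, f, f/n, F, F/n) for equal-width classes.

    Each class covers lower <= x < upper, starting at 0.
    """
    counts = [0] * n_classes
    for v in values:
        idx = int(v // width)          # index of the class containing v
        if idx < 0 or idx >= n_classes:
            raise ValueError(f"value {v} falls outside the classes")
        counts[idx] += 1

    n = len(values)
    rows = []
    cumulative = 0
    for i, f in enumerate(counts):
        cumulative += f                # running total gives F(x)
        rows.append((i * width, (i + 1) * width, f, f / n,
                     cumulative, cumulative / n))
    return rows


# Hypothetical spending amounts in dollars (illustrative only).
spending = [45, 120, 130, 250, 260, 270, 310, 480, 520, 90]

rows = frequency_table(spending, width=100, n_classes=6)
for lower, upper, f, rel, cum, cum_rel in rows:
    print(f"{lower:>3} to <{upper:<3}  {f:>2}  {rel:.3f}  {cum:>2}  {cum_rel:.3f}")
```

The last row's cumulative frequency always equals n and its cumulative relative frequency equals 1, which is a quick sanity check on any hand-built table.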
Histogram
[Figure: Frequency histogram. x-axis: Dollars (0 to 600); y-axis: Frequency. Bar heights, in descending order: 50, 38, 31, 30, 22, 13.]
[Figure: Relative frequency histogram. x-axis: Dollars (0 to 600); y-axis: Percent. Bar heights, in descending order: 27.1739, 20.6522, 16.8478, 16.3043, 11.9565, 7.06522.]
NOTE: The relative frequencies are expressed as percentages.
Skewness
Measure of the degree of asymmetry of a frequency distribution
Skewed to left
Symmetric or unskewed
Skewed to right
Kurtosis
Measure of flatness or peakedness of a frequency distribution
Platykurtic (relatively flat)
Mesokurtic (normal)
Leptokurtic (relatively peaked)
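The slides do not give formulas for these measures. One common choice, assumed here, is the moment-based coefficient: skewness = m3 / m2^(3/2) and excess kurtosis = m4 / m2^2 - 3, where mk is the k-th central moment with divisor n. Statistical packages often apply small-sample corrections, so their output may differ slightly. A minimal sketch:

```python
def central_moment(xs, k):
    """k-th central moment with divisor n."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** k for x in xs) / n


def skewness(xs):
    """Moment coefficient of skewness: negative = skewed left, positive = skewed right."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5


def excess_kurtosis(xs):
    """Excess kurtosis: negative = platykurtic, about 0 = mesokurtic, positive = leptokurtic."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3


symmetric = [1, 2, 3, 4, 5]
grams = [33, 26, 24, 21, 19, 20, 18, 18, 52, 56,
         27, 22, 18, 49, 22, 20, 23, 32, 20, 18]

print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(grams), excess_kurtosis(grams))
```

For the weights data the mean (26.9) exceeds the median (22), which is consistent with a distribution skewed to the right.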
[Figure: Skewness, distribution skewed to left]
[Figure: Skewness, symmetric distribution]
[Figure: Skewness, distribution skewed to right]
[Figure: Histogram with Mean = Median. x-axis: X (100 to 700); y-axis: Frequency]
[Figures: Kurtosis]
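Chebyshev's Theorem
At least $1 - \frac{1}{k^2}$ of the elements of any distribution lie within k standard deviations of the mean.

$$k = 2:\quad 1 - \frac{1}{2^2} = 1 - \frac{1}{4} = \frac{3}{4} = 75\%$$

$$k = 3:\quad 1 - \frac{1}{3^2} = 1 - \frac{1}{9} = \frac{8}{9} \approx 89\%$$

$$k = 4:\quad 1 - \frac{1}{4^2} = 1 - \frac{1}{16} = \frac{15}{16} \approx 94\%$$

That is, at least 75%, 89%, and 94% of the elements lie within 2, 3, and 4 standard deviations of the mean, respectively.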
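The bound can be compared with what a real data set does. The sketch below computes 1 - 1/k² and the observed fraction of the 20 weights (grams) from earlier in the chapter that fall within k standard deviations of the mean. The function names are illustrative, and using the population (divisor N) standard deviation is an assumption made so that the bound applies to the data set treated as a complete distribution.

```python
import math


def chebyshev_bound(k):
    """Minimum fraction of elements within k standard deviations (requires k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's bound is informative only for k > 1")
    return 1 - 1 / k ** 2


def fraction_within(data, k):
    """Observed fraction of data within k population standard deviations of the mean."""
    n = len(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    inside = sum(1 for x in data if abs(x - mean) <= k * sd)
    return inside / n


grams = [33, 26, 24, 21, 19, 20, 18, 18, 52, 56,
         27, 22, 18, 49, 22, 20, 23, 32, 20, 18]

for k in (2, 3, 4):
    print(k, chebyshev_bound(k), fraction_within(grams, k))
```

The observed fraction is never below the bound; for most data sets it is well above it, because the theorem makes no assumption about the shape of the distribution.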
Empirical Rule
Pie Charts
Categories represented as percentages of total
Bar Graphs
Heights of rectangles represent group frequencies
Frequency Polygons
Height of line represents frequency
Ogives
Height of line represents cumulative frequency
Time Plots
Represents values over time
[Figure: Pie chart, "The Portfolio". Legend (Category): Foreign, Bonds, Small Cap/Mid Cap, Large Cap Blend, Large Cap Value. Slice labels visible: Foreign 20, 20.0%; Large Cap Blend 30, 30.0%; Bonds 20, 20.0%.]
[Figure: Bar chart, CO2 level of the atmosphere in Ottawa. x-axis: Year (2000 to 2006); y-axis: CO2 level (ppm), 0 to 125.]
[Figure: Frequency polygon. x-axis: Length of trout fish in cm (0 to 56); y-axis: Relative Frequency (0.00 to 0.25).]
[Figure: Ogive. x-axis: Length of trout fish in cm (0 to 60); y-axis: Cumulative Relative Frequency (0.0 to 1.0).]
The point with height corresponding to the cumulative relative frequency is located at the right endpoint of each interval.
Scatter Plots
Correlation will be
discussed in later
chapters.