Lecture 2
variable containing interval data is distributed
• Range of data divided into non-overlapping and equal width classes (bins) that cover range of values
How many bins? Width of bins?
[Figure: histogram of Inflation Rate, 2011; vertical axis: Fraction]
http://data.worldbank.org/indicator/FP.CPI.TOTL.ZG
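The binning idea above (equal-width, non-overlapping bins that cover the range) can be sketched in Python. The values are made-up numbers, not the World Bank inflation data, and `bin_fractions` is a hypothetical helper name used only for illustration.

```python
def bin_fractions(values, n_bins):
    """Split the range of `values` into n_bins equal-width, non-overlapping
    bins and return (bin_edges, fraction_of_observations_in_each_bin).
    Assumes the values are not all identical (bin width must be positive)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin rather than a new one.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    fractions = [c / len(values) for c in counts]
    return edges, fractions


# Made-up illustrative values (NOT the World Bank data).
rates = [1.0, 2.0, 2.5, 3.0, 3.5, 4.0, 5.0, 9.0]
edges, fractions = bin_fractions(rates, n_bins=4)
print(edges)      # [1.0, 3.0, 5.0, 7.0, 9.0]
print(fractions)  # [0.375, 0.375, 0.125, 0.125]
```

Changing `n_bins` changes the bin width and therefore the picture, which is why the two questions on the slide go together.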
[Figure: two histograms of Inflation Rate, 2011; the second panel's vertical axis is Fraction]
and 4 percent?
Is it definitely above 40%?
[Figure: histogram of Inflation Rate, 2011]
[Figure: two histograms of Inflation Rate, 2011; vertical axis: Density]
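The Fraction and Density scales on these plots differ only by the bin width: density is the fraction in a bin divided by that bin's width, so the bar areas (not the bar heights) sum to 1. A minimal sketch with made-up fractions and an assumed bin width:

```python
# Made-up fractions for four equal-width bins (illustrative only).
fractions = [0.375, 0.375, 0.125, 0.125]
bin_width = 2.0  # assumed bin width

# Density rescales each bar so that the total AREA is 1.
densities = [f / bin_width for f in fractions]
total_area = sum(d * bin_width for d in densities)

print(densities)   # [0.1875, 0.1875, 0.0625, 0.0625]
print(total_area)  # 1.0
```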
One suggestion: # of bins ≈ √n
OECD inflation: √34 = 5.83 and STATA picked 5
[Figure: histogram of Inflation Rate, 2011]
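The square-root suggestion is easy to check numerically. In this sketch, `sqrt_rule_bins` is a hypothetical helper, and truncating to a whole number is an assumption: it matches the 5 bins reported on the slide, but the slide does not state the software's exact rule.

```python
import math


def sqrt_rule_bins(n):
    """Suggested number of bins: the square root of the sample size,
    truncated to a whole number (truncation is an assumption here)."""
    return int(math.sqrt(n))


n_oecd = 34  # number of OECD observations quoted on the slide
print(round(math.sqrt(n_oecd), 2))  # 5.83
print(sqrt_rule_bins(n_oecd))       # 5
```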
[Figures: panels of density histograms]
[Figure: two histograms; horizontal axes: % Below Poverty Line and % Above Poverty Line]
Notes: This shows the distribution of violation scores that restaurants obtain during
the initial inspection. The vertical lines correspond to the score thresholds that
would assign A-B-C letter grades. Scores of 13 or less automatically give an A-grade,
while higher scores imply that a restaurant will be reinspected within a few weeks.
For the purpose of this plot, inspection scores are capped at 50.
Ages of first-time mothers in the U.S. in 2016
[Figure: four density histograms of IQ; lower panels titled "Sample 2; n = 10" and "Sample 3; n = 10"]
Sample 1 is a LIE!!
How many samples would you have in real life?
Why aren't these samples perfectly bell shaped?
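A small simulation illustrates why samples of 10 rarely look bell shaped even when the population is Normal. The population values (mean 100, s.d. 15) and the bin layout are assumptions chosen for illustration; they are not taken from the slides.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

MU, SIGMA = 100, 15  # assumed population values, not from the slides


def draw_sample(n):
    """Draw n independent IQ scores from a Normal(MU, SIGMA) population."""
    return [random.gauss(MU, SIGMA) for _ in range(n)]


def text_histogram(sample, lo=55, hi=145, width=15):
    """Count observations in equal-width bins; return one text row per bin.
    Values outside [lo, hi) are ignored in this simple sketch."""
    n_bins = (hi - lo) // width
    counts = [0] * n_bins
    for x in sample:
        if lo <= x < hi:
            counts[int((x - lo) // width)] += 1
    return [f"{lo + i * width:>3}-{lo + (i + 1) * width:<3} {'*' * c}"
            for i, c in enumerate(counts)]


samples = [draw_sample(10) for _ in range(3)]
for k, s in enumerate(samples, start=1):
    print(f"Sample {k}; n = 10")
    print("\n".join(text_histogram(s)))
```

Each run of ten draws lands in the bins differently, so the three text histograms differ from one another and from the smooth population curve.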
[Figure: four density histograms of IQ; lower panels titled "Sample 2; n = 30" and "Sample 3; n = 30"]
Why are there more bins than last slide?
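One possible reading, assuming the plotting software picks the number of bins from the square-root suggestion (the slides do not confirm this for these particular plots): a larger sample has a larger square root, hence more bins.

```latex
\sqrt{10} \approx 3.16 \qquad \text{versus} \qquad \sqrt{30} \approx 5.48
```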
[Figure: four density histograms of IQ]
[Figure: two histograms; horizontal axes: X and Y]
sampling error?
[Figure: histogram of Inflation Rate, 2011]
[Figure: two density histograms of IQ]
from μ, which is a parameter?
[Figure: two density histograms of Book Rating]
exactly equal μ in both distributions?
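The gap between a sample mean and the parameter μ can be made concrete with a simulation. This is a sketch under assumed population values (μ = 100, σ = 15, Normal); none of these numbers come from the slides.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

MU, SIGMA = 100, 15  # assumed population parameters (illustrative)

errors = []
for n in (10, 30, 1000):
    sample = [random.gauss(MU, SIGMA) for _ in range(n)]
    x_bar = statistics.mean(sample)  # statistic: changes from sample to sample
    sampling_error = x_bar - MU      # gap between the statistic and the parameter
    errors.append(sampling_error)
    print(f"n = {n:4d}  X-bar = {x_bar:7.2f}  sampling error = {sampling_error:+.2f}")
```

The sample mean essentially never equals μ exactly; the error tends to shrink as n grows but does not vanish.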
n = 174 countries
mean = 6.6, median = 5.0
[Figure: histogram of Inflation Rate, 2011]
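A mean above the median, as in the summary statistics here, is typical of a distribution with a long right tail. The sketch below uses made-up values (not the 174-country data) to show how a single large observation pulls the mean up while leaving the median alone.

```python
import statistics

# Made-up right-skewed values (NOT the 174-country inflation data).
rates = [2.0, 3.0, 4.0, 5.0, 5.0, 6.0, 8.0, 40.0]

mean = statistics.mean(rates)      # pulled upward by the 40.0
median = statistics.median(rates)  # middle of the sorted values

print(mean, median)  # 9.125 5.0
```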
• Variance: $s^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n-1}$
• Standard deviation: $s = \sqrt{\text{variance}}$
• Coefficient of variation (textbook)
[Figure: histogram of Inflation Rate, 2011]
For all 174 countries, is the range bigger or smaller than 6.8?
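The measures of spread named above can be computed by hand in a few lines. The data are made up, and the coefficient of variation is written as s divided by the mean, which is the usual definition; the textbook may express it as a percentage instead.

```python
import math

# Made-up data (illustrative only).
xs = [2.0, 4.0, 6.0, 8.0, 10.0]

n = len(xs)
x_bar = sum(xs) / n

# Sample variance: squared deviations from the mean, divided by n - 1.
s2 = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
s = math.sqrt(s2)             # standard deviation
cv = s / x_bar                # coefficient of variation (assumed definition: s / mean)
data_range = max(xs) - min(xs)

print(s2)            # 10.0
print(round(s, 4))   # 3.1623
print(round(cv, 3))  # 0.527
print(data_range)    # 8.0
```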
[Figure: four density histograms of X]
Noticing Normality, can we approximate the s.d. of X in each?
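One common way to read a standard deviation off a roughly Normal histogram is the empirical rule: about 68% of values lie within one s.d. of the mean and about 95% within two, so the stretch holding the middle 95% of the data is roughly four s.d. wide. Whether this is the method the slide intends is an assumption. A sketch on simulated data with assumed parameters (mean 1000, s.d. 50):

```python
import random

random.seed(2)  # fixed seed so the sketch is reproducible

MU, SIGMA = 1000, 50  # assumed parameters, not taken from the slides
xs = [random.gauss(MU, SIGMA) for _ in range(10_000)]


def share_within(values, center, half_width):
    """Fraction of values lying within half_width of center."""
    return sum(abs(v - center) <= half_width for v in values) / len(values)


within_1sd = share_within(xs, MU, SIGMA)      # roughly 0.68
within_2sd = share_within(xs, MU, 2 * SIGMA)  # roughly 0.95

# Going the other way: estimate the s.d. from the width of the
# interval that holds the middle 95% of the data.
xs_sorted = sorted(xs)
lo = xs_sorted[int(0.025 * len(xs_sorted))]
hi = xs_sorted[int(0.975 * len(xs_sorted))]
sd_guess = (hi - lo) / 4

print(within_1sd, within_2sd, sd_guess)
```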
n = 185 countries
mean = 14955.1, sd = 16243.0
How to describe the shape of the distribution of this variable?
[Figure: histogram of GDP per capita (PPP), 2012 est.; vertical axis: Fraction]