
ECON1203 Statistics

Contents
Chapter 1 What is Statistics?
  1 Descriptive statistics vs. inferential statistics
  2 Population vs. sample
  3 Statistical inference

Chapter 2 Graphical Descriptive Techniques I
  4 Variables, values, data
  5 Types of data
  6 Describing univariate nominal data
  7 Comparing multivariate nominal data

Chapter 3 Graphical Descriptive Techniques II
  8 Describing univariate interval data
  9 Describing time-series data
  10 Describing bivariate interval data
  11 Graphical excellence
  12 Graphical deception

Chapter 4 Numerical Descriptive Techniques
  13 Measures of central location
  14 Variability
  15 Measures of relative standing
  16 Measures of linear relationship

Chapter 5 Data Collection and Sampling
  17 Methods of collecting data
  18 Sampling
  19 Sampling plans
  20 Sampling error
  21 Nonsampling error

Chapter 6 Probability
  22 Random experiment
  23 Sample space
  24 Requirements of probabilities
  25 Approaches to assigning probabilities
  26 Events
  27 Joint, marginal and conditional probability
  28 Probability rules

Chapter 7 Discrete Probability Distributions
  29 Random variables
  30 Discrete probability distributions
  31 Bivariate distributions
  32 Binomial distributions

Chapter 8 Continuous Probability Distributions
  33 Requirements of probability density functions
  34 Uniform distributions
  35 Normal distributions
  36 Exponential distribution
  37 Student t distribution
  38 Chi-squared distribution
  39 F distribution

Chapter 9 Sampling Distributions
  40 Central Limit Theorem (CLT)
  41 Sampling distribution of the sample mean
  42 Normal approximation of binomial distributions
  43 Approximating sampling distribution of a sample proportion
  44 Sampling distribution of the difference between two means

Chapter 10 Introduction to Estimation
  45 Point vs. interval estimators
  46 Properties of estimators
  47 Estimating population mean ($\mu$) from standard deviation ($\sigma$)
  48 Estimating population mean ($\mu$) from median
  49 Sample size

Chapter 11 Introduction to Hypothesis Testing

Chapter 1 What is Statistics?


1 Descriptive statistics vs. inferential statistics

Descriptive statistics: Organising, summarising & presenting data
Inferential statistics: Drawing conclusions about populations based on sample data

2 Population vs. sample

Population: All items of interest to a statistics practitioner (e.g. the shoe size of Australians)
Parameter: A descriptive measure of a population (e.g. the mean shoe size of Australians)
Sample: A subset of a population (e.g. the shoe size of UNSW students)
Statistic: A descriptive measure of a sample (e.g. the mean shoe size of UNSW students)

3 Statistical inference

Statistical inference: Drawing conclusions about populations based on sample data
Confidence level: The proportion of times an estimation procedure will be correct
Significance level: The proportion of times a conclusion will be wrong


Chapter 2 Graphical Descriptive Techniques I


4 Variables, values, data

Variable (denoted by uppercase letters): A characteristic of a population or sample (e.g. shoe size)
Values: The possible observations of a variable (e.g. shoe sizes between 1 and 16)
Data (denoted by lowercase letters): The observed values of a variable

5 Types of data

Hierarchy of data

Moving down the hierarchy of data reduces the number of permissible calculations.
Higher-level data can be treated as lower-level data, but not vice versa.

1. Interval/quantitative/numerical data: Real numbers (all calculations are valid)
2. Ordinal data: Data in a ranked order (calculations based on order are valid)
3. Nominal/qualitative/categorical data: Arbitrary numbers (calculations based on frequencies and percentages are valid)

6 Describing univariate nominal data

Frequency
1. Frequency distribution: A table that shows the frequency of each outcome (in Excel, =COUNTIF([Input range], [Criteria]) counts the frequency of a particular value)
2. Bar chart: A chart that shows the frequency of each outcome
Relative frequency
3. Relative frequency distribution: A table that shows the relative frequency of each outcome
4. Pie chart: A chart that shows the relative frequency of each outcome

7 Comparing multivariate nominal data

1. Cross-classification table/cross-tabulation table: A table that shows the frequency of combinations of two variables
2. Relative cross-classification table/cross-tabulation table: A table that shows the relative frequency of combinations of two variables
3. Separate bar charts


Chapter 3 Graphical Descriptive Techniques II


8 Describing univariate interval data

1. Histogram: A chart with rectangles whose bases are the intervals and whose heights are the frequencies
   o Number of class intervals $= 1 + 3.3\log_{10}(n)$
   o Class width $= \dfrac{\text{Largest observation} - \text{Smallest observation}}{\text{Number of classes}}$
   o Symmetric: Mirrored on either side of the middle
   o Positively skewed: With a tail to the right
   o Negatively skewed: With a tail to the left
   o Unimodal: With one peak
   o Bimodal: With two peaks
   o Bell-shaped: Symmetric & unimodal
2. Stem-and-leaf display: A table that separates place values
3. Relative frequency distribution: A table that shows the relative frequency of values
4. Cumulative relative frequency distribution: A table that cumulatively adds relative frequencies
5. Ogive: A chart that shows cumulative relative frequency
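
The two histogram formulas can be sketched in a few lines of Python. The marks below are hypothetical, and rounding the number of classes up to a whole number is an assumption of this sketch rather than something the notes specify.

```python
import math

def sturges_classes(n: int) -> int:
    """Number of class intervals from 1 + 3.3 * log10(n).

    Rounding up to a whole number is an assumption of this sketch.
    """
    return math.ceil(1 + 3.3 * math.log10(n))

def class_width(observations, classes: int) -> float:
    """(Largest observation - smallest observation) / number of classes."""
    return (max(observations) - min(observations)) / classes

# Hypothetical marks, used only for illustration.
marks = [52, 58, 61, 64, 67, 70, 71, 73, 75, 78, 80, 82, 85, 88, 91, 95]
k = sturges_classes(len(marks))   # 1 + 3.3 * log10(16) is about 4.97
width = class_width(marks, k)     # (95 - 52) / k
print(k, width)
```

In practice the width is usually rounded to a convenient number so that the class limits are easy to read.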

9 Describing time-series data

Line chart: A chart that plots a variable over time

10 Describing bivariate interval data

Scatter diagram: A chart that plots the observed combinations of two variables
Linearity: linear/nonlinear/no relationship
Direction: positive/negative
Strength: strong/medium-strength/weak

11 Graphical excellence
1. Concise data
2. Clear ideas
3. Multivariate
4. Substance over form
5. No distortion

12 Graphical deception
1. Graphs without scale
2. Graphs with different captions
3. Stretching and shrinking graphs
4. Bar charts with changing widths


Chapter 4 Numerical Descriptive Techniques


13 Measures of central location
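1. Population mean: $\mu = \dfrac{\sum_{i=1}^{N} x_i}{N}$
2. Sample mean: $\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n}$
3. Median = Middle observation $= x_{(n+1)/2}$ (in the ordered data)
4. Mode = Most frequent observation
5. Geometric mean (rate of return) $= \sqrt[n]{\prod_{i=1}^{n}(1+r_i)} - 1$, where $r_i$ is the rate of return in period $i$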
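
As a rough illustration of the four measures, the sketch below uses Python's standard `statistics` module. The marks and the returns are hypothetical values chosen so the arithmetic is easy to follow by hand, and the geometric mean is computed as a mean rate of return, which is the reading assumed here.

```python
import statistics

# Hypothetical marks for five students.
marks = [65, 70, 70, 80, 90]
n = len(marks)

mean = statistics.mean(marks)        # sum of observations / n
ordered = sorted(marks)
median = ordered[(n + 1) // 2 - 1]   # observation at position (n + 1) / 2 (odd n)
mode = statistics.mode(marks)        # most frequent observation

# Hypothetical rates of return over three periods.
returns = [0.10, -0.05, 0.20]
growth = 1.0
for r in returns:
    growth *= 1 + r                  # compound the (1 + r_i) factors
geometric_mean = growth ** (1 / len(returns)) - 1

print(mean, median, mode, round(geometric_mean, 4))
```

The geometric mean is the constant per-period return that produces the same compound growth as the actual sequence of returns.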

14 Variability
1. Range = Largest observation - Smallest observation
2. Variance
   a. Population variance: $\sigma^2 = \dfrac{\sum_{i=1}^{N}(x_i-\mu)^2}{N}$
   b. Sample variance: $s^2 = \dfrac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}$
   c. Shortcut sample variance: $s^2 = \dfrac{1}{n-1}\left[\sum_{i=1}^{n}x_i^2 - \dfrac{\left(\sum_{i=1}^{n}x_i\right)^2}{n}\right]$
3. Standard deviation
   a. Population standard deviation: $\sigma = \sqrt{\sigma^2}$
   b. Sample standard deviation: $s = \sqrt{s^2}$
4. Mean absolute deviation (MAD) $= \dfrac{\sum_{i=1}^{n}|x_i-\bar{x}|}{n}$
5. Empirical rule (for bell-shaped distributions)
   a. Within one standard deviation of the mean: $P(\mu-\sigma < x < \mu+\sigma) \approx 68\%$
   b. Within two standard deviations of the mean: $P(\mu-2\sigma < x < \mu+2\sigma) \approx 95\%$
   c. Within three standard deviations of the mean: $P(\mu-3\sigma < x < \mu+3\sigma) \approx 99.7\%$
6. Chebysheff's Theorem: $P(\mu-k\sigma < x < \mu+k\sigma) \ge 1-\dfrac{1}{k^2}$ [for $k>1$]
7. Population coefficient of variation: $CV = \dfrac{\sigma}{\mu}$
8. Sample coefficient of variation: $cv = \dfrac{s}{\bar{x}}$
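
The following sketch checks that the definitional and shortcut forms of the sample variance agree, and evaluates the other measures of variability on the same data. The data set is hypothetical, and `chebysheff_bound` is simply a name chosen here for the lower bound $1 - 1/k^2$.

```python
import math
import statistics

# Hypothetical sample.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
xbar = sum(data) / n

# Sample variance from the definition and from the shortcut formula.
s2_definition = sum((x - xbar) ** 2 for x in data) / (n - 1)
s2_shortcut = (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)

s = math.sqrt(s2_definition)                   # sample standard deviation
mad = sum(abs(x - xbar) for x in data) / n     # mean absolute deviation
cv = s / xbar                                  # sample coefficient of variation

def chebysheff_bound(k: float) -> float:
    """Minimum proportion of observations within k standard deviations (k > 1)."""
    return 1 - 1 / k ** 2

print(s2_definition, s2_shortcut, mad, round(cv, 4), chebysheff_bound(2))
```

Chebysheff's bound holds for any distribution, which is why it is weaker than the empirical rule (at least 75% within two standard deviations, compared with about 95% for bell-shaped data).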

15 Measures of relative standing


1. Location of a percentile: $L_P = (n+1)\dfrac{P}{100}$
2. Interquartile range $= Q_3 - Q_1$
3. Box plots: A graph with a box and whiskers that shows the maximum, minimum, range, median, interquartile range and outliers.
4. Outliers: Unusually large or small observations
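
A minimal sketch of the percentile location formula follows. The notes give only the location $L_P$; interpolating linearly between the two neighbouring observations when $L_P$ is not a whole number is an assumption made here, and the data are hypothetical.

```python
def percentile(ordered, p):
    """Value at location L_P = (n + 1) * P / 100 in already-sorted data.

    Linear interpolation between neighbouring observations is an
    assumption of this sketch.
    """
    n = len(ordered)
    location = (n + 1) * p / 100
    lower = int(location)            # whole-number part of the location
    fraction = location - lower
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)

# Hypothetical sample of ten observations.
data = sorted([0, 0, 5, 7, 8, 9, 12, 14, 22, 33])
q1 = percentile(data, 25)    # location 2.75
q3 = percentile(data, 75)    # location 8.25
iqr = q3 - q1
print(q1, q3, iqr)
```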

16 Measures of linear relationship


1. Covariance
   a. Population covariance: $\sigma_{xy} = \dfrac{\sum_{i=1}^{N}(x_i-\mu_x)(y_i-\mu_y)}{N}$
   b. Sample covariance: $s_{xy} = \dfrac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{n-1}$
   c. Shortcut sample covariance: $s_{xy} = \dfrac{1}{n-1}\left[\sum_{i=1}^{n}x_i y_i - \dfrac{\sum_{i=1}^{n}x_i \sum_{i=1}^{n}y_i}{n}\right]$
2. Coefficient of correlation
   a. Population coefficient of correlation: $\rho = \dfrac{\sigma_{xy}}{\sigma_x \sigma_y}$
   b. Sample coefficient of correlation: $r = \dfrac{s_{xy}}{s_x s_y}$
3. Least squares method
   a. Equation of the line: $\hat{y} = b_0 + b_1 x$
   b. Slope: $b_1 = \dfrac{s_{xy}}{s_x^2}$
   c. y-intercept: $b_0 = \bar{y} - b_1\bar{x}$
4. Coefficient of determination: how much of Y's variation is explained by X's variation
   a. Population coefficient of determination $= \rho^2$
   b. Sample coefficient of determination $= r^2$
5. Correlation is not causation!
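
To see how the covariance, correlation and least squares formulas fit together, the sketch below computes each one directly from its definition. The paired observations are hypothetical.

```python
import math

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

# Sample covariance and sample variances.
s_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (n - 1)
s_x2 = sum((xi - xbar) ** 2 for xi in x) / (n - 1)
s_y2 = sum((yi - ybar) ** 2 for yi in y) / (n - 1)

# Sample coefficient of correlation and coefficient of determination.
r = s_xy / (math.sqrt(s_x2) * math.sqrt(s_y2))
r_squared = r ** 2

# Least squares line: y-hat = b0 + b1 * x.
b1 = s_xy / s_x2            # slope
b0 = ybar - b1 * xbar       # y-intercept

print(s_xy, round(r, 4), round(r_squared, 4), b1, round(b0, 4))
```

Here the fitted line is $\hat{y} = 2.2 + 0.6x$, and $r^2 = 0.6$ says that 60% of the variation in $y$ is explained by the variation in $x$.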


Chapter 5 Data Collection and Sampling


17 Methods of collecting data
1. Primary data: Collected by the statistics practitioner for the current problem
2. Secondary data: Collected by someone else for another problem
3. Observation: Measuring actual behaviour
4. Experiments: Imposing treatments and measuring resultant behaviour
5. Surveys: Asking questions

18 Sampling

Target population: The population about which we want to draw inferences
Sampled population: The actual population from which the sample has been taken
Self-selected samples: When participants choose to participate and thus are more keenly interested in the issue than other members of the population

19 Sampling plans
1. Simple random sample: Every possible sample with the same number of observations is equally likely to be chosen
2. Stratified random sample: Dividing the population into mutually exclusive strata and then drawing simple random samples from each stratum
3. Cluster sample: Dividing the population into mutually exclusive clusters and then drawing simple random samples only from selected clusters

20 Sampling error

Sampling error: Differences between the sample and the population because of the observations that happened to be selected for the sample; it can be reduced by increasing the sample size

21 Nonsampling error

Nonsampling error: Differences between the sample and the population because of mistakes in data acquisition or improper selection of sample observations; it cannot be reduced by increasing the sample size
1. Errors in data acquisition (e.g. faulty equipment, inaccurate responses to sensitive questions)
2. Nonresponse error: When responses are not obtained from some members of the sample
3. Selection bias: When members of the target population cannot possibly be selected for inclusion in the sample


Chapter 6 Probability
22 Random experiment

Random experiment: An action or process that leads to one of several possible outcomes (e.g. Experiment: Flipping a coin. Outcomes: Heads or tails.)

23 Sample space

Sample space: All possible outcomes of an experiment. They must be mutually exclusive.

24 Requirements of probabilities
1. The probability of any outcome must lie between 0 and 1: $0 \le P(O_i) \le 1$ [for each $i$]
2. The sum of the probabilities of all outcomes is 1: $\sum_{i=1}^{k} P(O_i) = 1$

25 Approaches to assigning probabilities


1. Classical approach: Probabilities in games of chance (e.g. flipping a coin, rolling dice)
2. Relative frequency approach: Probabilities are long-run relative frequencies (e.g. if the relative frequency of getting a distinction is 200/1000 students, $P = 20\%$)
3. Subjective approach: Probabilities are the degree of belief in the occurrence of an event (e.g. the probability that the price of a share will increase)

26 Events

Simple event: An individual outcome of a sample space (e.g. getting a mark of 80)
Event: A collection or set of one or more simple events in a sample space (e.g. the event of getting a distinction requires a mark of at least 80, $\text{Distinction} = \{80, 81, 82, \ldots, 99, 100\}$)
Probability of an event: The sum of the probabilities of the simple events that make up an event

27 Joint, marginal and conditional probability


1. Joint probability (intersection): The probability that both $A$ and $B$ occur: $P(A \cap B)$
2. Marginal probability: Probabilities computed by adding across rows or down columns
3. Conditional probability: The probability of $A$ given $B$: $P(A|B) = \dfrac{P(A \cap B)}{P(B)}$
4. Independent events: $P(A|B) = P(A)$
5. Union: The probability that either $A$ or $B$ or both occur: $P(A \cup B)$

28 Probability rules
1. Complement rule: The probability that $A$ does not occur: $P(A^C) = 1 - P(A)$
2. Multiplication rule: The joint probability of $A$ and $B$: $P(A \cap B) = P(A)P(B|A)$
3. Multiplication rule for independent events: $P(A \cap B) = P(A)P(B)$
4. Addition rule: The union of $A$ and $B$: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
5. Addition rule for mutually exclusive events: $P(A \cup B) = P(A) + P(B)$
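
The joint, marginal and conditional probabilities and the rules above can be worked through on a small joint probability table. The table below is hypothetical; its entries were chosen so that $A$ and $B$ turn out to be independent.

```python
import math

# Hypothetical joint probability table for two events A and B.
joint = {
    ("A", "B"): 0.12,
    ("A", "not B"): 0.28,
    ("not A", "B"): 0.18,
    ("not A", "not B"): 0.42,
}

# Marginal probabilities: add across a row or down a column.
p_a = joint[("A", "B")] + joint[("A", "not B")]
p_b = joint[("A", "B")] + joint[("not A", "B")]

p_a_and_b = joint[("A", "B")]             # joint probability P(A and B)
p_a_given_b = p_a_and_b / p_b             # conditional probability P(A|B)
independent = math.isclose(p_a_given_b, p_a)

p_not_a = 1 - p_a                         # complement rule
p_a_or_b = p_a + p_b - p_a_and_b          # addition rule

print(p_a, p_b, p_a_given_b, independent, p_not_a, p_a_or_b)
```

Because $P(A|B) = P(A)$ here, the multiplication rule for independent events also holds: $P(A \cap B) = P(A)P(B)$.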


Chapter 7 Discrete Probability Distributions


29 Random variables

Random variable: A function or rule that assigns a number to each outcome of an experiment (e.g. when flipping a coin, the number of heads $\{0, 1, 2, \ldots\}$)
Discrete random variable: Can only assume certain values (whether finite or infinite)
Continuous random variable: Can assume any values within a specified range (e.g. time)
Probability distribution: A table, formula or graph that shows the probabilities of values of a random variable

30 Discrete probability distributions


1. Requirements of discrete probability distributions
   a. $0 \le P(x) \le 1$
   b. $\sum_{\text{all } x} P(x) = 1$
2. Population mean: $E(X) = \mu = \sum_{\text{all } x} xP(x)$
3. Population variance: $V(X) = \sigma^2 = \sum_{\text{all } x} (x-\mu)^2 P(x)$
4. Shortcut population variance: $V(X) = \sigma^2 = \sum_{\text{all } x} x^2 P(x) - \mu^2$
5. Population standard deviation: $\sigma = \sqrt{\sigma^2}$
6. Laws of expected value
   a. $E(c) = c$
   b. $E(X+c) = E(X) + c$
   c. $E(cX) = cE(X)$
7. Laws of variance
   a. $V(c) = 0$
   b. $V(X+c) = V(X)$
   c. $V(cX) = c^2V(X)$
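
As an illustrative check of these formulas, the sketch below stores a discrete distribution as a dictionary mapping each value $x$ to $P(x)$. The distribution (number of heads in two fair coin flips) and the helper names are choices made for this example only.

```python
import math

def expected_value(dist):
    """E(X) = sum of x * P(x) over all x."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """V(X) = sum of (x - mu)^2 * P(x) over all x."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def variance_shortcut(dist):
    """V(X) = sum of x^2 * P(x) over all x, minus mu^2."""
    mu = expected_value(dist)
    return sum(x * x * p for x, p in dist.items()) - mu ** 2

# Hypothetical example: number of heads in two fair coin flips.
heads = {0: 0.25, 1: 0.50, 2: 0.25}
assert math.isclose(sum(heads.values()), 1.0)   # requirement (b)

c = 3
shifted = {x + c: p for x, p in heads.items()}  # distribution of X + c
scaled = {c * x: p for x, p in heads.items()}   # distribution of cX

print(expected_value(heads), variance(heads), variance_shortcut(heads))
print(expected_value(shifted), variance(shifted))   # E(X) + c, V(X)
print(expected_value(scaled), variance(scaled))     # c E(X), c^2 V(X)
```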


31 Bivariate distributions
1. Requirements for discrete bivariate distributions
   a. $0 \le P(x,y) \le 1$ [for all pairs of values $(x,y)$]
   b. $\sum_{\text{all } x}\sum_{\text{all } y} P(x,y) = 1$
2. Covariance: $COV(X,Y) = \sigma_{xy} = \sum_{\text{all } x}\sum_{\text{all } y} (x-\mu_X)(y-\mu_Y)P(x,y)$
3. Shortcut covariance: $COV(X,Y) = \sigma_{xy} = \sum_{\text{all } x}\sum_{\text{all } y} xyP(x,y) - \mu_X\mu_Y$
4. Coefficient of correlation: $\rho = \dfrac{\sigma_{xy}}{\sigma_x\sigma_y}$
5. Laws of expected value of the sum of two variables
   a. $E(X+Y) = E(X) + E(Y)$
6. Laws of variance of the sum of two variables
   a. $V(X+Y) = V(X) + V(Y) + 2COV(X,Y)$
   b. If $X$ and $Y$ are independent, $COV(X,Y) = 0$ and $V(X+Y) = V(X) + V(Y)$
7. Mean of a portfolio of two stocks: $E(R_p) = w_1E(R_1) + w_2E(R_2)$
8. Variance of a portfolio of two stocks: $V(R_p) = w_1^2V(R_1) + w_2^2V(R_2) + 2w_1w_2\rho\sigma_1\sigma_2$
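
A short numerical sketch of the two-stock portfolio formulas follows. All weights, expected returns, standard deviations and the correlation are hypothetical inputs.

```python
import math

# Hypothetical inputs for a two-stock portfolio.
w1, w2 = 0.6, 0.4            # portfolio weights (sum to 1)
mean1, mean2 = 0.08, 0.12    # expected returns E(R1), E(R2)
sd1, sd2 = 0.10, 0.20        # standard deviations of the two returns
rho = 0.3                    # coefficient of correlation between the returns

# E(Rp) = w1 E(R1) + w2 E(R2)
portfolio_mean = w1 * mean1 + w2 * mean2

# V(Rp) = w1^2 V(R1) + w2^2 V(R2) + 2 w1 w2 rho sd1 sd2
portfolio_variance = (
    w1 ** 2 * sd1 ** 2
    + w2 ** 2 * sd2 ** 2
    + 2 * w1 * w2 * rho * sd1 * sd2
)
portfolio_sd = math.sqrt(portfolio_variance)

print(round(portfolio_mean, 4), round(portfolio_variance, 5), round(portfolio_sd, 4))
```

The last term is $2w_1w_2COV(R_1,R_2)$ written with $COV(R_1,R_2) = \rho\sigma_1\sigma_2$, so a lower correlation gives a lower portfolio variance for the same weights.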

32 Binomial distributions
Requirements of binomial experiments:
1. Fixed number of trials ($n$)
2. Two outcomes: $P(\text{success}) = p$ and $P(\text{failure}) = 1-p$
3. Independent trials: the outcome of one trial does not affect the outcomes of other trials

Binomial probability distribution:

$X \sim Bin(n,p)$: $P(x) = \dfrac{n!}{x!(n-x)!}p^x(1-p)^{n-x} = C^n_x\,p^x(1-p)^{n-x}$

Cumulative probability $= P(X \le x)$
Probability that $X$ is at least $x$: $P(X \ge x) = 1 - P(X \le [x-1])$
Probability that $X$ equals $x$: $P(x) = P(X \le x) - P(X \le [x-1])$

Mean, variance and standard deviation:
1. Mean: $\mu = np$
2. Variance: $\sigma^2 = np(1-p)$
3. Standard deviation: $\sigma = \sqrt{np(1-p)}$
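
The binomial formulas can be sketched with `math.comb` from the Python standard library. The helper names `binom_pmf` and `binom_cdf` and the values $n = 10$, $p = 0.5$ are illustrative choices.

```python
from math import comb, isclose, sqrt

def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binom_cdf(x: int, n: int, p: float) -> float:
    """Cumulative probability P(X <= x)."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

# Hypothetical experiment: 10 flips of a fair coin, X = number of heads.
n, p = 10, 0.5

p_equals_3 = binom_pmf(3, n, p)
p_at_most_3 = binom_cdf(3, n, p)
p_at_least_3 = 1 - binom_cdf(2, n, p)                       # 1 - P(X <= x - 1)
p_equals_3_from_cdf = binom_cdf(3, n, p) - binom_cdf(2, n, p)

mean = n * p
variance = n * p * (1 - p)
sd = sqrt(variance)

print(p_equals_3, p_at_most_3, p_at_least_3, mean, variance, round(sd, 4))
```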


Chapter 8 Continuous Probability Distributions


33 Requirements of probability density functions
1. The function is non-negative: $f(x) \ge 0$ [for $a < x < b$]
2. The area under the function is 1: $\int_a^b f(x)\,dx = 1$

34 Uniform distributions
$f(x) = \dfrac{1}{b-a}$ [where $a \le x \le b$]

$P(x_1 < X < x_2) = \text{Base} \times \text{Height} = (x_2 - x_1) \times \dfrac{1}{b-a}$

35 Normal distributions
$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$ [for $-\infty < x < \infty$]

It is symmetric about the mean ($\mu$).
Increasing the standard deviation ($\sigma$) widens the curve.
Standard normal random variable: $Z = \dfrac{X-\mu}{\sigma}$
Standardised normal distributions are symmetric about 0: $P(Z > Z_A) = P(Z < -Z_A) = A$

36 Exponential distribution

$f(x) = \lambda e^{-\lambda x}$ [where $x \ge 0$]

Increasing the parameter of the distribution ($\lambda$) steepens the curve.
Mean ($\mu$) = Standard deviation ($\sigma$) $= \dfrac{1}{\lambda}$
$P(X > x) = e^{-\lambda x}$
$P(X < x) = 1 - e^{-\lambda x}$
$P(x_1 < X < x_2) = P(X < x_2) - P(X < x_1) = e^{-\lambda x_1} - e^{-\lambda x_2}$
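
The uniform, normal and exponential probability calculations can be checked numerically as follows. This is a sketch using `statistics.NormalDist` from the Python standard library; every parameter value is hypothetical.

```python
import math
from statistics import NormalDist

# Uniform on [a, b]: probability = base * height.
def uniform_prob(x1, x2, a, b):
    return (x2 - x1) * (1 / (b - a))

p_uniform = uniform_prob(2, 5, a=0, b=10)

# Normal: standardise with Z = (X - mu) / sigma, then use the standard normal.
mu, sigma = 70, 10
z = (85 - mu) / sigma                          # z = 1.5
p_above = 1 - NormalDist().cdf(z)              # P(Z > 1.5)
p_below_mirror = NormalDist().cdf(-z)          # P(Z < -1.5), equal by symmetry

# Exponential with parameter lam.
lam = 0.5

def expo_greater(x):
    return math.exp(-lam * x)                  # P(X > x)

def expo_between(x1, x2):
    return math.exp(-lam * x1) - math.exp(-lam * x2)   # P(x1 < X < x2)

expo_mean = 1 / lam                            # also the standard deviation

print(round(p_uniform, 4), round(p_above, 4), round(p_below_mirror, 4))
print(round(expo_greater(1), 4), round(expo_between(1, 3), 4), expo_mean)
```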

37 Student t distribution

$f(t) = \dfrac{\Gamma[(\nu+1)/2]}{\sqrt{\nu\pi}\,\Gamma(\nu/2)}\left[1 + \dfrac{t^2}{\nu}\right]^{-(\nu+1)/2}$

It is symmetrical about 0: $P(t > t_{A,\nu}) = P(t < -t_{A,\nu}) = A$
It is flatter than the standard normal distribution.
Increasing the degrees of freedom ($\nu$) narrows the curve.
Mean: $E(t) = 0$
Variance: $V(t) = \dfrac{\nu}{\nu-2}$ [for $\nu > 2$]

38 Chi-squared distribution
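$f(\chi^2) = \dfrac{1}{\Gamma(\nu/2)}\,\dfrac{1}{2^{\nu/2}}\,(\chi^2)^{(\nu/2)-1}e^{-\chi^2/2}$ [where $\chi^2 > 0$]

Increasing the degrees of freedom ($\nu$) flattens the curve.
Probabilities: $P(\chi^2 > \chi^2_A) = P(\chi^2 < \chi^2_{1-A}) = A$

39 F distribution

$f(F) = \dfrac{\Gamma\left(\frac{\nu_1+\nu_2}{2}\right)}{\Gamma\left(\frac{\nu_1}{2}\right)\Gamma\left(\frac{\nu_2}{2}\right)}\left(\dfrac{\nu_1}{\nu_2}\right)^{\nu_1/2}\dfrac{F^{(\nu_1-2)/2}}{\left(1+\frac{\nu_1F}{\nu_2}\right)^{(\nu_1+\nu_2)/2}}$ [where $F > 0$]

Mean: $E(F) = \dfrac{\nu_2}{\nu_2-2}$ [for $\nu_2 > 2$]
Variance: $V(F) = \dfrac{2\nu_2^2(\nu_1+\nu_2-2)}{\nu_1(\nu_2-2)^2(\nu_2-4)}$ [for $\nu_2 > 4$]
Area to the right is $A$: $P(F > F_{A,\nu_1,\nu_2}) = A$
Area to the left is $A$: $P(F < F_{1-A,\nu_1,\nu_2}) = A$
$F_{1-A,\nu_1,\nu_2} = \dfrac{1}{F_{A,\nu_2,\nu_1}}$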


Chapter 9 Sampling Distributions


40 Central Limit Theorem (CLT)
The sampling distribution of the mean of a random sample drawn from any
population is approximately normal for a sufficiently large sample size.

41 Sampling distribution of the sample mean


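Normally distributed sampling distribution: $\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$
1. Mean: $\mu_{\bar{x}} = \mu$
2. Variance: $\sigma^2_{\bar{x}} = \dfrac{\sigma^2}{n}$
3. Standard deviation: $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$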
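
A small simulation can make the Central Limit Theorem and the formulas for $\mu_{\bar{x}}$ and $\sigma_{\bar{x}}$ concrete. The sketch below draws repeated samples from a uniform population on $[0, 1]$, which has $\mu = 0.5$ and $\sigma^2 = 1/12$; the population, the sample size, the number of repetitions and the seed are all arbitrary choices made for illustration.

```python
import math
import random
import statistics

random.seed(1203)        # fixed seed so the run is repeatable

n = 30                   # observations per sample
repetitions = 2000       # number of samples drawn

# Uniform(0, 1) population: mu = 0.5, sigma^2 = 1/12 (clearly not normal).
mu, sigma2 = 0.5, 1 / 12

sample_means = [
    statistics.fmean([random.random() for _ in range(n)])
    for _ in range(repetitions)
]

mean_of_means = statistics.fmean(sample_means)     # should be close to mu
sd_of_means = statistics.pstdev(sample_means)      # should be close to sigma / sqrt(n)
standard_error = math.sqrt(sigma2 / n)

print(round(mean_of_means, 4), round(sd_of_means, 4), round(standard_error, 4))
```

A histogram of `sample_means` would look approximately bell-shaped even though the population itself is flat.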

42 Normal approximation of binomial distributions

Normally distributed binomial distribution: $Y \sim N(\mu, \sigma^2)$
1. Binomial distributions are approximately normally distributed if:
   a. $np \ge 5$; and
   b. $n(1-p) \ge 5$
2. Mean: $\mu = np$
3. Variance: $\sigma^2 = np(1-p)$
4. Standard deviation: $\sigma = \sqrt{np(1-p)}$
5. The continuity correction factor ($0.5$) is needed because binomial distributions are discrete random variables whereas normal distributions are continuous random variables:

   Binomial distribution      Normal distribution
   $P(X = x)$                 $P(x-0.5 < Y < x+0.5)$
   $P(X \le x)$               $P(Y \le x+0.5)$
   $P(X \ge x)$               $P(Y \ge x-0.5)$
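
To see how well the continuity correction works, the sketch below compares exact binomial probabilities with their normal approximations. The values $n = 20$ and $p = 0.5$ are hypothetical and satisfy both conditions ($np = 10$ and $n(1-p) = 10$).

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 20, 0.5

def binom_pmf(x):
    """Exact binomial probability P(X = x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Approximating normal distribution Y ~ N(np, np(1 - p)).
Y = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

# P(X = 10) is approximated by P(9.5 < Y < 10.5).
exact_equal = binom_pmf(10)
approx_equal = Y.cdf(10.5) - Y.cdf(9.5)

# P(X <= 8) is approximated by P(Y <= 8.5).
exact_at_most = sum(binom_pmf(k) for k in range(9))
approx_at_most = Y.cdf(8.5)

# P(X >= 12) is approximated by P(Y >= 11.5).
exact_at_least = sum(binom_pmf(k) for k in range(12, n + 1))
approx_at_least = 1 - Y.cdf(11.5)

print(round(exact_equal, 4), round(approx_equal, 4))
print(round(exact_at_most, 4), round(approx_at_most, 4))
print(round(exact_at_least, 4), round(approx_at_least, 4))
```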

43 Approximating sampling distribution of a sample proportion

1. $\hat{P}$ is approximately normally distributed if:
   a. $np \ge 5$
   b. $n(1-p) \ge 5$
2. Expected value: $E(\hat{P}) = p$
3. Variance: $V(\hat{P}) = \sigma^2_{\hat{p}} = \dfrac{p(1-p)}{n}$
4. Standard deviation: $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}}$

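44 Sampling distribution of the difference between two means

1. Mean: $\mu_{\bar{X}_1-\bar{X}_2} = \mu_1 - \mu_2$
2. Variance: $\sigma^2_{\bar{X}_1-\bar{X}_2} = \dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}$
3. Standard deviation: $\sigma_{\bar{X}_1-\bar{X}_2} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$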

Chapter 10 Introduction to Estimation


45 Point vs. interval estimators
1. Point estimators: Estimate a parameter using a single value or point
2. Interval estimators: Estimate a parameter using an interval

46 Properties of estimators
1. Unbiased: The expected value of the estimator equals the parameter: $E(\hat{\theta}) = \theta$
2. Consistent: As the sample size grows, the difference between the estimator and the parameter falls: $\lim_{n\to\infty} E(\hat{\theta}) = \theta$ and $\lim_{n\to\infty} Var(\hat{\theta}) = 0$
3. Relatively efficient: An estimator is relatively more efficient if its variance is lower: $\hat{\theta}_1$ is relatively more efficient than $\hat{\theta}_2$ if $Var(\hat{\theta}_1) < Var(\hat{\theta}_2)$

47 Estimating population mean ($\mu$) from standard deviation ($\sigma$)

1. Confidence interval estimator of $\mu$: $\bar{x} \pm z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$
2. Lower confidence limit (LCL) $= \bar{x} - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$
3. Upper confidence limit (UCL) $= \bar{x} + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$
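
A minimal sketch of the confidence interval estimator follows, assuming the population standard deviation $\sigma$ is known. The function name `confidence_interval` and the input values are hypothetical; $z_{\alpha/2}$ is obtained from `statistics.NormalDist().inv_cdf` rather than from a table.

```python
from math import sqrt
from statistics import NormalDist

def confidence_interval(xbar, sigma, n, confidence=0.95):
    """Return (LCL, UCL) = xbar -/+ z_{alpha/2} * sigma / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)    # z_{alpha/2}, about 1.96 for 95%
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical sample: mean 50 from n = 100 observations, sigma = 10.
lcl, ucl = confidence_interval(xbar=50, sigma=10, n=100)
print(round(lcl, 2), round(ucl, 2))
```

Raising the confidence level widens the interval, while a larger sample size narrows it.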

48 Estimating population mean ($\mu$) from median

Confidence interval estimator of $\mu$: $m \pm z_{\alpha/2}\dfrac{1.2533\,\sigma}{\sqrt{n}}$

49 Sample size

1. Bound on the error of estimation: $B = z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$
2. Sample size to estimate a mean: $n = \left(\dfrac{z_{\alpha/2}\,\sigma}{B}\right)^2$
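
The sample size formula can be sketched as below. Rounding up to the next whole observation is an assumption of this sketch (it keeps the bound from being exceeded), and the inputs are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size(sigma, bound, confidence=0.95):
    """Smallest whole n with z_{alpha/2} * sigma / sqrt(n) <= bound."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / bound) ** 2)

# Hypothetical planning values: sigma = 10, estimate mu to within B = 2.
n_required = sample_size(sigma=10, bound=2)

# Check the bound actually achieved with that sample size.
z_95 = NormalDist().inv_cdf(0.975)
achieved_bound = z_95 * 10 / sqrt(n_required)

print(n_required, round(achieved_bound, 4))
```

Halving the bound $B$ roughly quadruples the required sample size, because $n$ depends on $1/B^2$.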

Chapter 11 Introduction to hypothesis testing

Chapter 12 Inference about a population

Chapter 13 Inference about comparing two populations

Chapter 14 Analysis of variance

Chapter 15 Chi-squared tests

Chapter 16 Simple linear regression and correlation

Chapter 17 Multiple regression