Introduction to Econometrics
Semester 2, 2019
Coordinator: Dr Kadir Atalay
Office: Room 502, Social Science Building
Email: kadir.atalay@sydney.edu.au
Topics Today
Week 2
• The Sample Mean
• Statistical Inference Based on the Sample Mean
• Statistical Inference Extensions
Reference: Chapters 5, 6 & 7
• Note that we are skipping Chapters 3 & 4 – these are introductory
chapters on economic data
Road map
Last week, we discussed:
• summary statistics
• EG: our sample of 171 thirty‐year‐old female full‐time workers
had a mean income of $41,413
• why we use summary statistics
• to learn about a population parameter
• EG ‐ the mean income of all thirty‐year‐old female full‐time
workers in the country – based on our sample it is probably
around $41,413
• Question: how certain are we?
So, today we connect the dots:
• Statistical inference on means
Statistical inference on means
• First, we will do a little refresher on probability
theory and sampling distributions
• Then, we will look at
• t tests
• confidence intervals
• and why they work.
• Means today, model parameters tomorrow (well, in
a few weeks)
The sample mean
• Sample mean, $\bar{x}$, is a number that is specific to a
particular sample
• Different samples will yield different sample means
Q: How can we extrapolate from the sample to the
population?
Why do we use statistical inference?
• Statistical inference: using sample statistics to
make inferences about the population
• With univariate data:
• Statistical inference = using the sample average to make
inferences about the population mean
• EG ‐ in our EARNINGS data sample:
• sample mean = $41,413, with a sample size of 171.
• Now we want to learn about the likely range of
values of mean earnings for all 30-year-old female
full-time workers in the country
• Other common examples of when we do this:
• polls to infer public opinion
• water samples to assess water quality
Some probability
essentials
Some Essentials of Probability:
• A random variable is a variable whose value is
determined by the outcome of an experiment
• where an experiment is any operation whose
outcome cannot be predicted with certainty.
• Example:
• Suppose we flip a coin 10 times and count the number
of times the coin turns up heads = an experiment
• The number of heads appearing in 10 flips of a coin is an
example of a random variable.
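As a minimal illustration (a sketch in Python, not from the original slides), one run of this experiment could look like:

```python
import random

# One run of the experiment: flip a fair coin 10 times and
# count the heads. Re-running gives a different outcome, which
# is exactly what makes X a random variable.
x = sum(random.randint(0, 1) for _ in range(10))
print(x)  # a realisation of X, some value in {0, 1, ..., 10}
```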
Some Essentials of Probability:
Notation:
• Random variables are denoted in upper case, say X
(or Y or Z)
• the realized value of the random variable is denoted
in lower case, say x (or y or z)
Coin flip example:
• X is not associated with any particular value – but we
know X will take on a value in the set {0,1,2…,10}.
• A particular outcome or realization is x = 3.
Some Essentials of Probability:
Discrete Random Variables
• A discrete random variable is a random variable that can
a. take only a finite number of values, or
b. take a countably infinite number of values, such as 0, 1, 2, 3, ...
Examples
a. X may measure whether or not a person is currently
employed, so X may take values 1 (employed) or 0 (not
employed)
b. X may be the number of doctor visits over the past year;
then X may take the values 0, 1, 2….
Some Essentials of Probability:
• The probability distribution of a random variable X
describes the random behavior of X.
• $p(x) = \Pr(X = x)$ is the probability that the random variable X
takes a certain value x.
• This is called the probability mass function of X.
• the probability mass function of X gives the probabilities
for each value taken by the random variable:
$p(x_1), p(x_2), p(x_3), \dots$
• And, $\sum_i p(x_i) = 1$
Some Essentials of Probability:
• We will also use the cumulative distribution
function, defined as $F(x) = \Pr(X \le x)$
• the cumulative distribution function gives the
probability that the random variable X is less than
or equal to a particular value x:
$F(x) = \sum_{x_i \le x} p(x_i)$
Some Essentials of Probability:
• The probability that X lies in a given range can be
calculated using either
• the probability mass function, or
• the cumulative distribution function
• e.g. $\Pr(a < X \le b) = F(b) - F(a)$
Some Essentials of Probability:
Coin flip example extended:
• Suppose we flip two coins = Experiment
• Count the number of times the coins turn up
heads, X = random variable
• Possibilities (TT, TH, HT, HH)
First Coin   Second Coin   Number of Heads (Random Variable)
T            T             0
T            H             1
H            T             1
H            H             2
Some Essentials of Probability:
Coin flip example extended:
• Random Variable:
• X represents the number of heads obtained in two
tosses of a coin.
x    pmf p(x)    cdf F(x)
0    0.25        0.25
1    0.50        0.75
2    0.25        1.00
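As a quick check (a hypothetical Python sketch, not part of the slides), we can reproduce this table by enumerating the four equally likely outcomes:

```python
from itertools import product

# Tabulate the pmf and cdf of X = number of heads in two flips
# by enumerating the equally likely outcomes TT, TH, HT, HH.
outcomes = list(product("TH", repeat=2))
pmf = {x: 0.0 for x in range(3)}
for o in outcomes:
    pmf[o.count("H")] += 1 / len(outcomes)

cdf, running = {}, 0.0
for x in sorted(pmf):
    running += pmf[x]
    cdf[x] = running

print(pmf)  # {0: 0.25, 1: 0.5, 2: 0.25}
print(cdf)  # {0: 0.25, 1: 0.75, 2: 1.0}
```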
Some Essentials of Probability:
• Notice that the random variable in our 2 coin flip
example is discrete
• it can only take a countable number of different values
• In this case, three
• However, we will often be interested in continuous
random variables, for which the number of
different possible values is uncountable
Some Essentials of Probability:
Continuous Random Variables
• A continuous random variable is a random variable
that can
• take an uncountably infinite number of values
Examples
• X may measure the annual income of an individual
• Or the length of time the individual has been employed at
their job
Some Essentials of Probability:
Continuous Random Variables
• The cumulative distribution function (cdf)
generalizes nicely from discrete RVs to
continuous RVs
• Example:
• If Z is standard normal, we have $F(z) = \Pr(Z \le z)$;
for example, $\Pr(Z \le 1.96) \approx 0.975$
Some Essentials of Probability:
Continuous Random Variables
• The probability mass function cannot be generalized
so easily. Why?
We always find $\Pr(X = x) = 0$ for any individual value x
• So instead, we work with the probability density function
– we look at the probability that X lies in a range of values
• Note: the probability density function is the derivative of
the cumulative distribution function.
• For a normal distribution, this is the famous bell curve
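To see the derivative relationship numerically (an illustrative sketch assuming SciPy is available; not from the slides), a finite difference of the standard normal cdf matches the pdf:

```python
from scipy.stats import norm

# The pdf is the derivative of the cdf: a centred finite
# difference of the standard normal cdf approximates the pdf.
x, h = 1.0, 1e-5
numeric_derivative = (norm.cdf(x + h) - norm.cdf(x - h)) / (2 * h)
print(numeric_derivative, norm.pdf(x))  # both roughly 0.2420
```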
Some Essentials of Probability:
[Figure: the pdf of a normal distribution – the famous bell curve]
Some Essentials of Probability:
Expected value
For any random variable, we define its expected value to
be the weighted average of all the values that it can take,
with weights given by their probabilities.
NB – I’m going to only work with discrete random variables for a few slides
here. If you want rigorous definitions for expected values for continuous
random variables, replace all sums by integrals and all mass functions by
densities; I won't quiz you on that
Some Essentials of Probability:
Expected value
What about our two coin flip example?
$E(X) = 0 \times 0.25 + 1 \times 0.5 + 2 \times 0.25 = 1$
Some Essentials of Probability:
Expected value, population mean
$\mu_X = E(X) = \sum_i x_i \Pr(X = x_i)$
• Notice that this looks a lot like our definition of the
sample mean ($\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$)
• Except now we are dealing with population
quantities
• that is – what might happen, not what has happened/has
been realised
• Thus, we often call $E(X)$ the population mean of X
• Notation: $\mu_X$, or simply $\mu$
Some Essentials of Probability:
Population Variance
• Just as we have the sample mean and an analogous
population mean,
• The sample variance has a population analog: the
population variance, $\sigma_X^2 = E[(X - \mu_X)^2]$
Some Essentials of Probability:
Population Variance
• In a sample, $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$
• i.e. we average the squared deviations $(x_i - \bar{x})^2$
• In the population, we take the expected value of $(X - \mu_X)^2$ to
get the population variance
Lastly, note that our population standard deviation is $\sigma_X = \sqrt{\sigma_X^2}$
Some Essentials of Probability:
Population Variance
What about our two coin flip example?
$\sigma_X^2 = (0-1)^2 \times 0.25 + (1-1)^2 \times 0.5 + (2-1)^2 \times 0.25 = 0.5$
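These two calculations can be done directly from the pmf table; here is a small Python sketch (illustrative, not from the slides):

```python
# Population mean and variance of X (number of heads in two
# flips), computed straight from the pmf.
pmf = {0: 0.25, 1: 0.5, 2: 0.25}
mu = sum(x * p for x, p in pmf.items())
var = sum((x - mu) ** 2 * p for x, p in pmf.items())
print(mu, var)  # 1.0 0.5
```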
Some Essentials of Probability:
Linear transformations of a random variable
1. Add a fixed amount (a) to a random variable
mean is changed by the amount a: $E(X + a) = E(X) + a$
variance is unchanged: $\mathrm{Var}(X + a) = \mathrm{Var}(X)$
2. Multiply the random variable by a fixed amount (b)
mean is multiplied by the amount b: $E(bX) = b\,E(X)$
variance is multiplied by the amount $b^2$: $\mathrm{Var}(bX) = b^2\,\mathrm{Var}(X)$
Some Essentials of Probability:
Linear transformations of a random variable ‐
Example
Assume X is a random variable with mean $\mu$ and
variance $\sigma^2$, then what are the mean and variance of
1) $Y = X - \mu$
2) $Z = (X - \mu)/\sigma$
Some Essentials of Probability:
Linear transformations of a random variable ‐
Example
1) $Y = X - \mu$
Mean: $E(Y) = E(X) - \mu = \mu - \mu = 0$
Variance: $\mathrm{Var}(Y) = \mathrm{Var}(X) = \sigma^2$
Some Essentials of Probability:
Linear transformations of a random variable ‐ Example
2) $Z = (X - \mu)/\sigma$
Mean: $E(Z) = \frac{1}{\sigma}\,E(X - \mu) = 0$
Variance: $\mathrm{Var}(Z) = \frac{1}{\sigma}\cdot\frac{1}{\sigma}\cdot\mathrm{Var}(X) = \frac{\sigma^2}{\sigma^2} = 1$
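A simulation makes these rules concrete (a sketch assuming NumPy, with illustrative values $\mu = 5$, $\sigma = 2$; not from the slides):

```python
import numpy as np

# Standardising draws of X should give sample mean ~0 and
# sample variance ~1, as the transformation rules predict.
rng = np.random.default_rng(1)
x = rng.normal(loc=5.0, scale=2.0, size=100_000)  # mu=5, sigma=2
z = (x - 5.0) / 2.0  # subtract the mean, divide by the sd
print(z.mean(), z.var())  # close to 0 and 1
```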
Some Essentials of Probability:
Linear combinations of random variables
If X and Y are random variables, and we are interested
in studying $X + Y$, then
• Means: $E(X + Y) = E(X) + E(Y)$
• Variances (assuming X and Y are independent):
$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$
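A quick simulation check of these rules (an illustrative NumPy sketch with made-up means and variances; not from the slides):

```python
import numpy as np

# For independent X and Y: E(X+Y) = E(X) + E(Y) and
# Var(X+Y) = Var(X) + Var(Y).
rng = np.random.default_rng(2)
x = rng.normal(3.0, 1.0, size=1_000_000)   # mean 3, variance 1
y = rng.normal(-1.0, 2.0, size=1_000_000)  # mean -1, variance 4
s = x + y
print(s.mean(), s.var())  # close to 2 and 5
```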
Back to using our data and thinking
about statistical inference using the
sample mean…
Statistical inference
• Just now, we described the two most common population
parameters
$\mu$ and $\sigma^2$
• Last week, we described many sample statistics, including
$\bar{x}$ and $s^2$
• Presumably/we hope that:
• $\bar{x}$ will provide us with some information about $\mu$
• $s^2$ will tell us something about $\sigma^2$
• More precisely, we can view these sample statistics as
realizations of random variables.
• NB: for many students, getting familiar with this concept is the hardest part of
learning statistics
Statistical inference
• Statistical inference seeks to infer properties of
the population from the sample at hand.
Statistical inference – our Key Question
• We want to use a sample to infer whatever we can
about the distribution of random variable X at the
population level
• What would we like to know about the population?
• The population mean, $\mu$
• The population variance, $\sigma^2$
• The shape of the distribution of X, pdf (probability density
function)
• What information do we actually get to observe?
• The sample mean, $\bar{x}$
• The sample variance, $s^2$
• The same statistics for any additional samples we take
The sample mean as a random variable
• Sample: a subset selected from the population
• A sample of size n consists of n draws from the population
• Each draw is a realisation of a random variable
• Imagine we are planning to obtain that random sample
of female workers
• When we have our sample, we will have observations on the
salaries of 171 people, say $x_1, x_2, \dots, x_{171}$
• But, before we take our sample, we don't know whom we will
ask or what their salaries are
• So, these salaries are random variables $X_1, X_2, \dots, X_{171}$
The sample mean as a random variable
• Sample mean, $\bar{x}$: an average of the n sample realisations
$x_1, x_2, \dots, x_n$
$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$
• Before we collect our sample, in the planning stage, we can
instead talk about
$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$
• But if $X_1, X_2, \dots, X_n$ are random variables, then so is $\bar{X}$
• And, the sample mean $\bar{x}$ is a realisation of the random variable $\bar{X}$
The sample mean as a random variable
• As the sample mean is a realisation of a random
variable,
• the statistical properties of the sample mean are
determined by the statistical properties of the random
variables that produced the sample
• So, now we are going to show that, if we have a
random sample of observations that are independent
draws of random variables that are each $N(\mu, \sigma^2)$
distributed, then
• we can use $\bar{X}$ to conduct inference on $\mu$
Reminder: Key Question #1
• We have a random variable X, and we want to estimate
its unknown population mean $\mu$
• Why? So we can conduct inference on the
population mean
• We need to establish the distribution of $\bar{X}$
The distribution of $\bar{X}$
• We need some assumptions about $X_1, X_2, \dots, X_n$:
1. $X_1, X_2, \dots, X_n$ are independent random variables
($X_i$ is statistically independent of $X_j$, for $i \ne j$)
2. $X_1, X_2, \dots, X_n$ are drawn from a common distribution
3. $X_i$ have a common mean:
$E(X_i) = \mu$ for all i
4. $X_i$ have a common variance:
$\mathrm{Var}(X_i) = \sigma^2$ for all i
5. We will also assume this distribution is Normal:
$X_i \sim N(\mu, \sigma^2)$
The distribution of $\bar{X}$
• Common mean and variance: $E(X_i) = \mu$ and $\mathrm{Var}(X_i) = \sigma^2$ for all i
The distribution of $\bar{X}$:
Mean of the sample mean
• So we have
$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} = \frac{1}{n}X_1 + \frac{1}{n}X_2 + \cdots + \frac{1}{n}X_n$
and we know that
$E(X_1) = E(X_2) = \cdots = E(X_n) = \mu$
Then applying our rules above about linear combinations of
random variables, we have:
$E(\bar{X}) = \frac{1}{n}E(X_1) + \frac{1}{n}E(X_2) + \cdots + \frac{1}{n}E(X_n)$
$= \frac{1}{n}\mu + \frac{1}{n}\mu + \cdots + \frac{1}{n}\mu$
$= n \cdot \frac{1}{n}\mu = \mu$
The distribution of $\bar{X}$:
Mean of the sample mean
This is good news!
$E(\bar{X}) = \mu$
• That is, the expected value of the sample mean is equal to
the population mean
• The average across many samples is expected to equal $\mu$
The sample mean is an unbiased estimator of the
population mean
Note: an estimator is a random variable (because it depends
on the random variables $X_1, X_2, \dots, X_n$), while the estimate
we obtain is the realisation $\bar{x}$
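Unbiasedness can be checked by simulation (a sketch assuming NumPy; the population standard deviation of 10,000 is purely illustrative, since the slides do not report one):

```python
import numpy as np

# Draw many samples of size n = 171 from a population with
# mu = 41,413, and average the resulting sample means.
rng = np.random.default_rng(3)
mu, sigma, n = 41_413.0, 10_000.0, 171  # sigma is a made-up value
sample_means = rng.normal(mu, sigma, size=(10_000, n)).mean(axis=1)
print(sample_means.mean())  # very close to mu: E(X-bar) = mu
```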
The distribution of $\bar{X}$:
Standard deviation of the sample mean
$\mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{1}{n}X_1 + \frac{1}{n}X_2 + \cdots + \frac{1}{n}X_n\right)$
$= \frac{1}{n^2}\mathrm{Var}(X_1) + \frac{1}{n^2}\mathrm{Var}(X_2) + \cdots + \frac{1}{n^2}\mathrm{Var}(X_n)$
$= n \cdot \frac{1}{n^2}\sigma^2 = \frac{\sigma^2}{n}$
so the standard deviation of the sample mean is
$\mathrm{sd}(\bar{X}) = \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
The sample mean exhibits less variability than the underlying data
The variability of the sample mean as an estimate of the
population mean decreases as the sample size increases
Larger samples lead to greater precision
This is also good news!
The sample mean is an unbiased estimator of the population
mean AND its variance gets smaller as the sample size grows
“We are getting it right on average, and we’re getting closer and
closer as we obtain more data”
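The $\sigma/\sqrt{n}$ shrinkage is easy to see by simulation (an illustrative NumPy sketch, not from the slides):

```python
import numpy as np

# The standard deviation of the sample mean shrinks like
# sigma / sqrt(n) as the sample size n grows.
rng = np.random.default_rng(4)
sigma = 2.0
for n in (10, 100, 1000):
    means = rng.normal(0.0, sigma, size=(5_000, n)).mean(axis=1)
    print(n, means.std(), sigma / np.sqrt(n))  # the two columns agree
```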
The distribution of $\bar{X}$
• From our assumptions,
$X_i \sim N(\mu, \sigma^2)$
we have found the mean and variance of the sample
mean, and we will state without proof that the
distribution of the sample mean is normal:
$\Rightarrow \bar{X} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)$
The distribution of $\bar{X}$
[Figures: the distribution of $\bar{X}$ for n = 10 and for n = 100 – the larger sample gives a tighter distribution around $\mu$]
Statistical properties of $\bar{X}$
• Recall, we want to use our estimator $\bar{X}$ for $\mu$ to
conduct inference on $\mu$
• What makes it a good estimator?
1. It is an unbiased estimator: $E(\bar{X}) = \mu$
2. It is a consistent estimator – that is, its
distribution gets arbitrarily concentrated around $\mu$
as the sample size grows:
$\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n} \to 0$ as $n \to \infty$
3. One can prove that there is no other unbiased
estimator of $\mu$ that has a smaller variance than
$\bar{X}$. It is the minimum variance unbiased estimator
The distribution of $\bar{X}$ ‐ an example
To get a better sense of the distribution of the sample mean,
we'll go through a very simple example:
• Let's think about coin flips; we'll call heads '1' and tails
'0'
• The set of all possible values is just {0, 1}, each with a
probability of 1/2
• The population mean, or expected value of a coin flip,
should just be $E(X) = 0 \times \tfrac{1}{2} + 1 \times \tfrac{1}{2} = 0.5$
• If we take a sample by flipping a coin a few times, what
are we likely to see as the sample mean?
The distribution of $\bar{X}$ ‐ an example
Sample mean = $(17 \times 1 + 13 \times 0)/30 \approx 0.567$
The distribution of $\bar{X}$ ‐ an example
1. Draw 400 samples, each of 30 coin flips
2. For each sample, calculate the sample mean
3. Plot the sample mean for each sample
[Figure: histogram of the 400 sample means]
The mean of the sample means is equal to 0.501
(what is the standard deviation of the sample means?)
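This simulation is easy to replicate (a Python/NumPy sketch of the same exercise; the random draws will not reproduce the slide's exact 0.501):

```python
import numpy as np

# 400 samples of 30 fair coin flips; look at the 400 sample means.
rng = np.random.default_rng(5)
flips = rng.integers(0, 2, size=(400, 30))  # 1 = heads, 0 = tails
means = flips.mean(axis=1)
print(means.mean())  # close to the population mean 0.5
print(means.std())   # close to sigma/sqrt(n) = 0.5/sqrt(30) ~ 0.091
```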
The distribution of $\bar{X}$ ‐ an example
• So the average value of the sample mean should
tell us the population mean, suggesting that we can
use $\bar{x}$ to get an estimate of $\mu$
• For a single sample, it is unlikely that the observed
$\bar{x}$ is exactly equal to $\mu$
• The standard deviation of the sample mean, often
called the standard error of the sample mean,
helps us understand how likely it is that a sample
mean will be close to the population mean.
• The smaller the standard error, the narrower the
distribution of the sample mean and the better our
sample mean is as an estimator of the population
mean
Statistical inference:
We still need 2 more ingredients
• Recall, we want to use our estimator for $\mu$, namely
$\bar{X}$, to conduct inference on $\mu$
• We need to look at two more concepts before we
get there:
1. Z‐statistics
2. Standard errors and the t‐distribution
The z statistic
• We have shown $\bar{X} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)$
• We know how to standardize a variable (using the
linear transformations we discussed earlier)
• subtracting the mean and dividing by the standard deviation
leads to a random variable with mean 0 and variance 1
• Then the random variable Z is defined as
$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$
The z statistic
• Why is this useful?
• Because it allows us to say things like:
$\Pr(-1.96 \le Z \le 1.96) = 0.95$, or
$\Pr\!\left(-1.96 \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le 1.96\right) = 0.95$
$\Rightarrow \Pr\!\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$
Or that 95% of our samples will have a mean $\bar{x}$ such that
$\bar{x} - 1.96\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}$
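Plugging in the coin example as an illustration (a Python sketch, not from the slides): with $\mu = 0.5$, $\sigma = 0.5$ and $n = 30$,

```python
import math

# 95% of sample means from 30 fair coin flips should fall in
# mu +/- 1.96 * sigma / sqrt(n).
mu, sigma, n = 0.5, 0.5, 30
half_width = 1.96 * sigma / math.sqrt(n)
print(mu - half_width, mu + half_width)  # roughly (0.321, 0.679)
```

The earlier single-sample mean of 0.567 falls comfortably inside this range.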
Interpretation of the z statistic
• We can show that this holds even if the underlying
distribution of the $X_i$ is not exactly normal, if we have a
large enough sample
• From the central limit theorem
• The upshot:
• We have what we need:
• A probability statement about the population mean $\mu$, based
on $\bar{x}$
• Well, almost – there is one last wrinkle!
• Our statement involves $\sigma$, a population parameter!
• We don't know $\sigma$! So we have to use its sample
analog, s
Standard errors and the t-distribution
• $\sigma$ is unknown
• So we replace $\sigma$ by its estimate s
Note that:
=> in the same way that $\bar{x}$ is a realisation of the random
variable $\bar{X}$, the sample standard deviation s is a realisation
of a random variable S
• Estimated standard deviation of the sample mean, $\frac{s}{\sqrt{n}}$
= the Standard Error of the Sample Mean
Standard errors and the t-distribution
• So we will approximate the Z statistic with a t statistic
$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$
• And for any given sample, we can use the standard error of
the mean to write
$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
• Note: the sample t‐statistic t is a single realization of a
random variable T
Sampling Distribution of the Sample Mean Estimator
RECALL:
$\bar{X} \sim N(\mu, \sigma_{\bar{X}}^2)$, where $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
Can we standardize this? YES:
$\bar{X} \sim N(\mu, \sigma_{\bar{X}}^2) \;\Rightarrow\; Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$
But to do this transformation I need to know $\sigma$ –
Replace it with its sample counterpart s:
$\bar{X} \sim N(\mu, \sigma_{\bar{X}}^2) \;\Rightarrow\; T = \frac{\bar{X} - \mu}{s/\sqrt{n}}$, which is
approximately $N(0, 1)$ for large n
Standardising the sample mean estimator – on our way to hypothesis testing!
Standard errors and the t-distribution
• This approximation of Z with T does come at a cost.
• T does not have a normal distribution – it has a t‐distribution
with (n‐1) degrees of freedom, denoted $T \sim t(n-1)$
• For large n, this is close to normal.
• This difference means that we need to replace our 1.96 by
another (slightly larger) number, but apart from that, we can
continue as before.
• For example, if n = 26, we obtain that 95% of our samples will
have a mean $\bar{x}$ such that:
$\bar{x} - 2.06 \cdot \frac{s}{\sqrt{n}} \le \mu \le \bar{x} + 2.06 \cdot \frac{s}{\sqrt{n}}$
• So we do now have what we need – a usable probability
statement about the population mean $\mu$, based on $\bar{x}$
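As an illustration (a sketch assuming SciPy; the sample standard deviation s = 12,000 is a made-up stand-in, since the slides do not report it for the EARNINGS data):

```python
import math
from scipy import stats

# A t-based 95% range for mu from one sample, using the
# EARNINGS sample mean and size with a hypothetical s.
xbar, s, n = 41_413.0, 12_000.0, 171
t_crit = stats.t.ppf(0.975, df=n - 1)  # slightly above 1.96
half_width = t_crit * s / math.sqrt(n)
print(xbar - half_width, xbar + half_width)
```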
Summary for the Sample Mean
1. Individual $X_i$ are assumed to have common mean $\mu$
and variance $\sigma^2$
2. The average $\bar{X}$ of the n draws of X has mean $\mu$ and
variance $\sigma^2/n$
3. The standardized statistic $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ has mean 0
and variance 1
4. As sample size n → ∞, Z is standard normally distributed
by the central limit theorem [even if we don't assume
X is normally distributed!]
5. Replacing the unknown $\sigma$ by the sample standard
deviation s leads to a t‐distribution.
6. The sample t‐statistic $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$ is a single
realization of a t(n‐1)-distributed random variable T.
Road map
Now that we have the necessary tools and
understanding of the distribution of the sample
mean, we are ready to conduct inference.
• Week 3: Inference
• Hypothesis testing
• p‐values
• Confidence intervals
• Testing for the differences in two means across
independent samples