
Quantitative Methods 2

ECON 20003

WEEK 2
DESIRABLE PROPERTIES OF POINT ESTIMATORS
PARAMETRIC AND NONPARAMETRIC TECHNIQUES
THE ASSUMPTION OF NORMALITY
Reference:
SSK: § 10.1
WWL: 3.5-3.9

Dr László Kónya
January 2020
DESIRABLE PROPERTIES OF POINT ESTIMATORS

• We consider four properties that can make point estimators easier to work with and are possessed by ‘good’ point estimators.

Suppose that we are interested in a parameter θ (it might be, e.g., a population mean, a population proportion or a slope parameter of a population regression model) and we estimate it with the estimator θ-hat.

a) θ-hat is said to be a linear estimator of θ if it is a linear function of the sample observations.

For example, the sample mean X-bar is a linear estimator of the population mean μ.
However, the sample variance s² is a quadratic function of the sample observations Xi, so it is a non-linear estimator of the population variance σ².
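As a minimal R sketch of this distinction (the data vector x below is simulated purely for illustration), the sample mean can be written as a weighted sum of the observations with fixed weights 1/n, whereas the sample variance involves their squared deviations:

set.seed(1)
x <- rnorm(10, mean = 5, sd = 2)   # hypothetical sample, for illustration only
n <- length(x)

# X-bar is a linear combination of the observations (weights 1/n)
all.equal(mean(x), sum((1/n) * x))                  # TRUE

# s^2 is a quadratic function of the observations
all.equal(var(x), sum((x - mean(x))^2) / (n - 1))   # TRUE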
b) θ-hat is said to be an unbiased estimator of θ if E(θ-hat) = θ, i.e. if the expected value of θ-hat is equal to θ and thus the sampling distribution of θ-hat is centered around θ.
Otherwise, θ-hat is referred to as a biased estimator and

Bias(θ-hat) = E(θ-hat) − θ ≠ 0

For example, the sample mean is an unbiased estimator of the population mean because E(X-bar) = μ.

Similarly, the sample variance is an unbiased estimator of the population variance because E(s²) = σ².



However, the alternative estimator of σ² that divides the sum of squared deviations by n instead of n − 1, i.e. (1/n) Σ (Xi − X-bar)², is biased since its expected value is ((n − 1)/n) σ² ≠ σ².
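A small simulation can make this bias visible. The sketch below is illustrative only (the population, sample size and number of replications are arbitrary choices, not taken from the slides):

set.seed(123)
sigma2 <- 4          # true population variance
n      <- 10         # sample size
R      <- 20000      # number of simulated samples

s2  <- numeric(R)    # unbiased estimator: divides by n - 1
s2n <- numeric(R)    # alternative estimator: divides by n

for (r in 1:R) {
  x      <- rnorm(n, mean = 0, sd = sqrt(sigma2))
  s2[r]  <- var(x)                        # sum((x - mean(x))^2) / (n - 1)
  s2n[r] <- sum((x - mean(x))^2) / n
}

mean(s2)    # close to 4              -> unbiased
mean(s2n)   # close to 4 * 9/10 = 3.6 -> biased downwards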

Suppose that θ1-hat and θ2-hat denote two different (normally distributed) estimators of θ.

The sampling distribution of θ1-hat is centered around θ, while the sampling distribution of θ2-hat is not.
θ1-hat is an unbiased estimator, whereas θ2-hat is a biased estimator.
θ1-hat is expected to estimate θ more accurately than θ2-hat.



c) θ-hat is an efficient estimator of θ within some well-defined class of estimators (e.g. in the class of linear unbiased estimators) if its variance is smaller, or at least not greater, than that of any other estimator of θ in the same class of estimators.

θ3-hat and θ4-hat are both unbiased estimators of θ, but the sampling distribution of θ3-hat has a smaller variance than the sampling distribution of θ4-hat.

θ3-hat is the more efficient estimator; it is likely to produce a more accurate estimate of θ than θ4-hat.
Note: In the case of random sampling the sample mean is the best linear unbiased estimator (BLUE) of the population mean. “Best” means that X-bar has the smallest variance in the class of linear unbiased estimators of μ, hence it is an efficient estimator.
d) θ-hat is called a consistent estimator of θ if its sampling distribution collapses into a vertical straight line at the point θ when the sample size n goes to infinity.

Let f1(θ-hat), f2(θ-hat) and f3(θ-hat) denote the sampling distributions of the same θ-hat estimator generated by three different sample sizes, n1 < n2 < n3.
These sampling distributions are centered around θ, and as the sample size increases they become narrower.
Granted that this is true for larger sample sizes as well, θ-hat is a consistent estimator of θ.

If θ-hat is an unbiased estimator, then consistency requires the variance of its sampling distribution to go to zero for increasing n.
However, if θ-hat is a biased estimator, then consistency requires both its variance and the bias to go to zero for increasing n.
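The narrowing of the sampling distribution can also be illustrated with a short simulation. This is only a sketch under assumed values (a normal population with mean 5 and standard deviation 2, and three arbitrary sample sizes):

set.seed(42)
R <- 10000                                # simulated samples per sample size

for (n in c(10, 100, 1000)) {
  xbar <- replicate(R, mean(rnorm(n, mean = 5, sd = 2)))
  cat("n =", n,
      " mean of X-bar =", round(mean(xbar), 3),
      " var of X-bar =", round(var(xbar), 5), "\n")
}
# The simulated variance of X-bar (about 4/n) shrinks towards zero as n grows,
# in line with X-bar being a consistent estimator of the population mean.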
PARAMETRIC AND NONPARAMETRIC TECHNIQUES

• Many statistical procedures for interval estimation and hypothesis testing

a) are concerned with population parameters, and


b) are based on certain assumptions about the sampled population
or about the sampling distribution of some point estimator.
These procedures are usually referred to as parametric procedures.

For example, the confidence interval estimation for a population mean


and the corresponding hypothesis testing of a population mean based
on the t distribution are parametric procedures as they are concerned
with the population mean and assume that
i. The variable of interest is quantitative and hence the data is
measured on an interval or a ratio scale.
Otherwise the population mean would not exist and the
central location could be measured only with the mode and
the median (if the measurement scale is at least ordinal).
ii. The sample has been randomly selected.
Otherwise it might not represent the population accurately.

iii. The population standard deviation is unknown, but the population


is normally distributed, at least approximately.

Procedures that are either not concerned with some population


parameter or are based on relatively weaker assumptions than their
parametric counterparts, and hence require less information about the
sampled population, are called nonparametric procedures.

Note: Nonparametric techniques are sometimes known as distribution-free


procedures. This is a bit deceptive as they also rely on some, though
fewer and less stringent, assumptions about the sampled population.

Parametric procedures can be misleading when some of their assumptions are violated, so it is crucial to be familiar with these assumptions and to learn how to check them in practice.
Never run any inferential statistical procedure without performing a thorough exploratory data analysis first.
THE ASSUMPTION OF NORMALITY

• A crucial assumption behind most parametric procedures is normality,


namely that the underlying sampling distribution is normally distributed.
For example, in the case of testing a population mean with a parametric test, either σ should be known and the sample mean should be normally distributed (Z-test), or, if σ is unknown, the sampled population itself should be normally distributed (t-test).

• How can we find out with reasonable certainty whether a population is


normally distributed?
In practice the populations are hardly ever observed entirely. Hence,
we look at the sample data to see if they are more or less normally
distributed, and if they are, then we have grounds to believe that the
sampled population is also normal (at least, not extremely non-normal).
Normality can be verified in a number of ways relying on some
(i) graphs, (ii) sample statistics and (iii) formal hypothesis tests.



i. Checking normality visually

We can use two types of graphs to study whether a data set is


characterised by a normal distribution: histogram and QQ-plot.

The QQ (quantile-quantile) plot is a scatter plot that depicts the quantiles of the sample data against the corresponding quantiles of some known probability distribution.
When it is used for checking normality, the reference distribution is the (standard) normal distribution, and if the sample data are normally distributed, the points on the scatter plot lie approximately on a straight line.

Ex 1: (Week 1, Ex 2)
Last week we performed a t-test to find out whether there was sufficient
evidence at the 5% level of significance to establish that the average Australian
is more than 10kg overweight.

The sample size was large enough (n = 100) to rely on the CLT, so the sampling distribution of the sample mean could be assumed approximately normal.
However, σ was unknown, so we had to assume that the sampled population was not extremely non-normal in order to be able to rely on the t-test.
a) Develop a histogram and a QQ-plot of diff with R to see whether the sampled population might be normally distributed.
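A minimal sketch of how such graphs can be produced in R (assuming the Week 1 data are already available in a numeric vector called diff; the exact commands in the tutorial files may differ):

# Histogram of diff with a normal curve superimposed
hist(diff, freq = FALSE, main = "Histogram of diff", xlab = "diff")
curve(dnorm(x, mean = mean(diff), sd = sd(diff)),
      col = "red", lwd = 2, add = TRUE)

# Normal QQ-plot of diff with a reference line
qqnorm(diff)
qqline(diff, col = "red")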

[Figures: histogram of diff with a normal curve that has the same mean and standard deviation as the sample of diff; QQ-plot of diff]

The histogram is skewed to the right and on the QQ-plot the points do not lie close to the straight line. Hence, both graphs suggest that diff is unlikely to be normally distributed.
ii. Quantifying normality with numerical descriptive measures

There are four simple numerical descriptive measures that can help us
decide whether a data set is characterised by a normal distribution:
mean, median, skewness and kurtosis.

For (continuous and unimodal) symmetric distributions, such as the


normal, mean = median (for normal distributions the mode is also equal
to them), …
… while for right (positively) skewed distributions median < mean and
for left (negatively) skewed distributions mean < median.



• Skewness (SK) is a descriptor of the shape of a distribution and it is
concerned with the asymmetry of a distribution around its mean.

The population parameter for SK is the third standardized moment, defined as

SK = E[(X − μ)³] / σ³

For symmetric distributions SK = 0, for distributions that are skewed to


the right SK > 0 and for distributions that are skewed to the left SK < 0.

The pastecs package of R estimates SK with the sample skewness statistic SK-hat (reported as skewness), the sample counterpart of the third standardized moment.

The approximate estimated standard error of this statistic is SE(SK-hat) ≈ √(6/n).

The data are likely asymmetric, and thus non-normal (at α = 0.05), when

|skew.2SE| = |SK-hat| / (2 × SE(SK-hat)) > 1
• Kurtosis (K) is another descriptor of the shape of a distribution.
It is related to the thickness of the tails of the distribution.

These curves illustrate three different


allocations of the unit probability of
the certain event over the range of
possible values.

A distribution that, relative to the normal distribution, has a sharper peak around its mean but heavier (fatter) tails is called leptokurtic (leptos is Greek for thin, fine, referring to its slender peak).
A distribution that, relative to the normal distribution, has a flatter peak around its mean and thinner tails is called platykurtic (platus is Greek for broad, flat).



The population parameter for K is the fourth standardized moment, defined as

K = E[(X − μ)⁴] / σ⁴

K = 3 for normal distributions, K > 3 for leptokurtic distributions and K < 3 for platykurtic distributions. K − 3 is called excess kurtosis.

The pastecs package of R estimates K − 3 with the sample excess kurtosis statistic K-hat − 3 (reported as kurtosis).

The approximate estimated standard error of this statistic is SE(K-hat − 3) ≈ √(24/n).

The data are likely non-normal (at α = 0.05) when

|kurt.2SE| = |K-hat − 3| / (2 × SE(K-hat − 3)) > 1
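The two 2SE rules can also be computed by hand. The sketch below assumes the data are in a numeric vector (here the Ex 1 vector diff) and uses the simple moment formulas together with the approximate standard errors √(6/n) and √(24/n) quoted above; pastecs may apply slightly different small-sample corrections:

x <- diff                                 # any numeric data vector
n <- length(x)
m <- function(k) mean((x - mean(x))^k)    # k-th central sample moment

SK_hat  <- m(3) / m(2)^(3/2)              # sample skewness
exK_hat <- m(4) / m(2)^2 - 3              # sample excess kurtosis

skew.2SE <- SK_hat  / (2 * sqrt(6 / n))
kurt.2SE <- exK_hat / (2 * sqrt(24 / n))

abs(skew.2SE) > 1   # TRUE suggests asymmetry, hence non-normality, at the 5% level
abs(kurt.2SE) > 1   # TRUE suggests non-normal tail behaviour at the 5% level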



iii. Testing for normality

There are several statistical tests for normality, i.e. for


H0 : the data comes from a normally distributed population;
HA : the data comes from a non-normally distributed population.

We use only the Shapiro-Wilk (SW) test because it is easy to implement


in R and compares favorably to other tests for normality at the limited
sample sizes we usually have to work with in economics, business and
marketing.
We do not discuss the details of this test as we shall always perform it
with R. The program reports the test statistic and the p-value and H0 is
rejected if the p-value is smaller than the selected significance level.
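In R the test is available as the built-in shapiro.test function; for example, on the Ex 1 data (assuming they are stored in a vector called diff):

sw <- shapiro.test(diff)   # Shapiro-Wilk test of H0: the data come from a normal population
sw$statistic               # the W test statistic
sw$p.value                 # reject H0 if this is below the chosen significance level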

Note:
a) The t test (and several other statistical tools that assume normality) is fairly
robust to departures from normality and hence reliable in practice, unless
the sample size is very small (say, less than 30) or the population is rather
skewed (in this case the t test is acceptable only at much larger sample
sizes).
b) The SW test, similarly to other tests for normality, has two shortcomings.
(i) At small sample sizes (n < 20), when the normality assumption can be
crucial, it has little power to reject H0 even if the population is indeed
not normally distributed.
(ii) At large sample sizes (n > 100), when the violation of normality is far
less critical in practice, it tends to be sensitive to small deviations from
normality and rejects H0 far too often.
For these reasons, it is not recommended to rely entirely on the SW test.
It is always better to assess normality with a combination of graphs, sample
statistics and formal hypothesis tests, though at small sample sizes all these
checks can be unreliable.

(Ex 1)
b) Obtain descriptive statistics and the SW statistic for diff with R and discuss
their implications about normality.
The stat.desc function of the pastecs package generates a printout with the statistics discussed below.
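The call itself might look as follows (a sketch: stat.desc is the pastecs function named above, and norm = TRUE requests the normality-related statistics):

library(pastecs)
round(stat.desc(diff, basic = FALSE, norm = TRUE), 3)
# returns, among others: median, mean, skewness, skew.2SE,
# kurtosis, kurt.2SE, normtest.W and normtest.p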



i. The sample mean (12.175) is bigger than the sample median (10.500),
so the sample of diff is skewed to the right (non-normal).

ii. SK-hat = 0.556 is positive, so the sample of diff is skewed to the right.
SK-hat divided by twice its standard error is skew.2SE = 1.151 > 1, so the distribution of diff is unlikely to be normal.

iii. The estimate of excess kurtosis is K-hat − 3 = −0.548. It is negative, so the sample of diff is platykurtic. However, the absolute value of K-hat − 3 divided by twice its standard error is |kurt.2SE| = 0.573 < 1, so the distribution of diff might be normal.

iv. The reported p-value of the SW test is normtest.p = 0.001 < 0.05, thus
normality is rejected at the 5% level.

Since 3 out of 4 checks cast doubt on normality, the t-test in Ex 2, Week 1


might be misleading.



NONPARAMETRIC TESTS FOR A POPULATION
CENTRAL LOCATION

• For quantitative data the two most useful and popular measures of
central location are the arithmetic mean and the median (in this order).

The mean has two advantages over the median:


 The mean is a comprehensive measure because it is computed
from all available data points, while the median is based on at most
two data points.
 The mean is used far more extensively in inferential statistics than
the median.
However, occasionally the median also has some advantages:
 Since the median depends only on the middle value(s), it is not
affected by outliers (uncharacteristically small or large values),
while the mean can be unduly influenced by them.
 The median exists even if the measurement scale is just ordinal,
but the mean does not.
• A hypothesis about the central location of a quantitative population is usually best tested with a Z or t test for the population mean (μ).
However, when the mean does not exist, or is not an ideal measure of central location due to the presence of outliers, or when the Z / t tests are inappropriate because the normality assumption is violated, instead of these parametric procedures one should rely on some nonparametric alternative for testing the central location of a population.

We consider two such alternatives, the sign test for the median and the
Wilcoxon signed rank test for the median, both based on one sample.

(One sample) Sign test for the median (η)


This procedure assumes that
i. the data is a random sample,
ii. the variable of interest is continuous and the measurement scale is
at least ordinal,
but it makes no assumption about the shape of the distribution.

The hypotheses are H0 : η = η0 vs. HA : η ≠ η0 (or HA : η > η0, or HA : η < η0).


This test is based on the signs of the observed non-zero deviations from η0, i.e. on the signs of xi − η0 ≠ 0, i = 1, 2, …, n. Since the true median is right in the middle of an ordered data set, the numbers of negative and positive deviations (S− and S+) are expected to be about the same if the null hypothesis is correct, i.e. if the population median is indeed η0.
Let S denote the test statistic. In essence, it could be either S- or S+,
but we arbitrarily choose S = S+.
If H0 is true and the selection of the sample items is random, S follows a binomial distribution (see Review 3) with parameters n and p = 0.5, i.e. S ~ B(n, 0.5).

For sufficiently large n (np = nq = 0.5n ≥ 5, so n ≥ 10), this binomial distribution (B) can be approximated with a normal distribution (N) with mean μS = np = 0.5n and standard deviation σS = √(npq) = 0.5√n.
Reject H0 if (i) right-tail test: pR = P(S ≥ S+) is small,
(ii) left-tail test: pL = P(S ≤ S+) is small,
(iii) two-tail test: 2 min(pR, pL) is small.
(One sample) Wilcoxon signed ranks test for the median (η)
(also known as Wilcoxon signed rank sum test)

The sign test is based entirely on the signs of the deviations from η0.
The Wilcoxon signed ranks test is a more sensitive and potentially more powerful alternative because it also takes the magnitudes of these deviations into consideration.

The Wilcoxon signed ranks test assumes that


i. the data is a random sample,
ii. the variable of interest is continuous and the measurement scale is
interval or ratio, and
iii. the distribution of the sampled population is symmetric (μ = η).

The Wilcoxon signed ranks test has the same null and alternative hypotheses as the sign test, but it is based on the signs and on the absolute values of the deviations, i.e. |di| = |xi − η0|, i = 1, 2, …, n.
Rank all non-zero |di| from smallest to largest and calculate the sum of the ranks assigned to negative deviations (T−) and the sum of the ranks assigned to positive deviations (T+).
The test statistic is T = T+.

The possible values of T range from 0 to n(n+1)/2. If H0 is true, T is expected to be right in the middle of this interval, i.e. around n(n+1)/4.

The sampling distribution of T is non-standard, but lower and upper


critical values (TL and TU) for 6 ≤ n ≤ 30 are provided in Table 9,
Appendix B of the Selvanathan book.

Using these critical values, reject H0 if
(i) right-tail test: T ≥ TU,α,
(ii) left-tail test: T ≤ TL,α,
(iii) two-tail test: T ≥ TU,α/2 or T ≤ TL,α/2.

When H0 is true and there are at least 30 non-zero deviations, the sampling distribution of T can be approximated with a normal distribution. Namely, T is approximately N(μT, σT), with

μT = E(T) = n(n+1)/4 and σT = √[ n(n+1)(2n+1) / 24 ]
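A sketch of the rank computation (assuming the data are in a vector x and the hypothesized median is eta0); base R's wilcox.test reports the same T+ statistic, labelled V:

wilcoxon_Tplus <- function(x, eta0) {
  d <- x - eta0
  d <- d[d != 0]                  # drop zero deviations
  r <- rank(abs(d))               # rank |d| from smallest to largest (ties share average ranks)
  sum(r[d > 0])                   # T+ = sum of the ranks of the positive deviations
}

# Built-in counterpart (exact or normal approximation, depending on n and ties):
# wilcox.test(diff, mu = 10, alternative = "greater")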



(Ex 1)
c) Perform the sign test and the Wilcoxon signed ranks test at the 5% level of
significance with R.
The original null and alternative hypotheses are H0 : μ = 10 and HA : μ > 10, but since these nonparametric tests focus on the median rather than on the mean, we rewrite them as H0 : η = 10 and HA : η > 10.

The SignTest function of the DescTools package generates the printout summarised below.
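The call might look like this (a sketch; SignTest is the DescTools function named above, and the data vector diff and hypothesized value 10 come from Ex 1):

library(DescTools)
SignTest(diff, mu = 10, alternative = "greater")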

R reports the test statistic (S),


the number of non-zero
differences, the p-value, the
alternative hypothesis, the
95% ‘one-sided’ confidence
interval (do not worry about it)
and the sample median.

Check whether R performed the required test (i.e. a right-tail sign test this time) and whether the p-value < α = 0.05. Since p-value = 0.2692 > 0.05, we maintain H0 at the 5% significance level.
The wilcox.exact function of the exactRankTests package generates the printout summarised below.
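Again as a sketch (wilcox.exact is the exactRankTests function named above):

library(exactRankTests)
wilcox.exact(diff, mu = 10, alternative = "greater")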

R reports the test statistic (V),


the p-value, the alternative
hypothesis, the 95% ‘one-sided’
confidence interval (do not worry
about it) and a variant of the
sample median.

Since p-value ≈ 0.019 < 0.05, we reject H0 at the 5% significance level.

In summary, at the 5% significance level the sign test maintains H0, while the
Wilcoxon signed ranks test rejects it.
Given these conflicting outcomes, recall that the Wilcoxon test assumes that the population is symmetric. This assumption, however, is not supported by the sample (see the mean, median and skewness discussed in part b above), so we had better rely on the sign test this time.
Hence, based on the sign test we conclude at the 5% level of significance that
the sample does not support the claim that the average Australian is more than
10kg overweight.
WHAT SHOULD YOU KNOW?

• Desirable statistical properties of point estimators:


linearity, unbiasedness, efficiency and consistency.
• To verify whether a sample might have been drawn from a normally
distributed population using graphs, numerical descriptive measures
and the Shapiro-Wilk test.
• Difference between parametric and nonparametric tests.
• To perform the (one sample) Sign test and Wilcoxon signed ranks test
for the population median manually and with R/RStudio.

