
Quantitative Methods 2

ECON 20003

WEEK 2
DESIRABLE PROPERTIES OF POINT ESTIMATORS
PARAMETRIC AND NONPARAMETRIC TECHNIQUES
THE ASSUMPTION OF NORMALITY
Reference:
SSK: § 10.1
WWL: 3.5-3.9

Dr László Kónya
January 2020
DESIRABLE PROPERTIES OF POINT ESTIMATORS

• We consider four properties that can make point estimators easier to work with and are possessed by ‘good’ point estimators.

Suppose that we are interested in a parameter θ (it might be, e.g., a population mean, a population proportion or a slope parameter of a population regression model) and we estimate it with the estimator θ-hat.

a) θ-hat is said to be a linear estimator of θ if it is a linear function of the sample observations.

For example, the sample mean X-bar is a linear estimator of the population mean μ.
However, the sample variance s² is a quadratic function of the sample observations Xi, so it is a non-linear estimator of the population variance σ².
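As a minimal R sketch of this distinction (the data vector x below is simulated purely for illustration), the sample mean can be written as a weighted sum of the observations with fixed weights 1/n, whereas the sample variance involves their squared deviations:

set.seed(1)
x <- rnorm(10, mean = 5, sd = 2)   # hypothetical sample, for illustration only
n <- length(x)

# X-bar is a linear combination of the observations (weights 1/n)
all.equal(mean(x), sum((1/n) * x))                  # TRUE

# s^2 is a quadratic function of the observations
all.equal(var(x), sum((x - mean(x))^2) / (n - 1))   # TRUE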
b) θ-hat is said to be an unbiased estimator of θ if E(θ-hat) = θ, i.e. if the expected value of θ-hat is equal to θ and thus the sampling distribution of θ-hat is centered around θ.
Otherwise, θ-hat is referred to as a biased estimator and

Bias(θ-hat) = E(θ-hat) − θ ≠ 0

For example, the sample mean is an unbiased estimator of the population mean because E(X-bar) = μ.

Similarly, the sample variance is an unbiased estimator of the population variance because E(s²) = σ².



However, the alternative estimator of σ² that divides the sum of squared deviations by n instead of n − 1, i.e. (1/n) Σ (Xi − X-bar)², is biased since its expected value is ((n − 1)/n) σ² ≠ σ².
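A small simulation can make this bias visible. The sketch below is illustrative only (the population, sample size and number of replications are arbitrary choices, not taken from the slides):

set.seed(123)
sigma2 <- 4          # true population variance
n      <- 10         # sample size
R      <- 20000      # number of simulated samples

s2  <- numeric(R)    # unbiased estimator: divides by n - 1
s2n <- numeric(R)    # alternative estimator: divides by n

for (r in 1:R) {
  x      <- rnorm(n, mean = 0, sd = sqrt(sigma2))
  s2[r]  <- var(x)                        # sum((x - mean(x))^2) / (n - 1)
  s2n[r] <- sum((x - mean(x))^2) / n
}

mean(s2)    # close to 4              -> unbiased
mean(s2n)   # close to 4 * 9/10 = 3.6 -> biased downwards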

Suppose that θ1-hat and θ2-hat denote two different (normally distributed) estimators of θ.

The sampling distribution of θ1-hat is centered around θ, while the sampling distribution of θ2-hat is not.
θ1-hat is an unbiased estimator, whereas θ2-hat is a biased estimator.
θ1-hat is expected to estimate θ more accurately than θ2-hat.



c) θ-hat is an efficient estimator of θ within some well-defined class of estimators (e.g. in the class of linear unbiased estimators) if its variance is smaller, or at least not greater, than that of any other estimator of θ in the same class of estimators.

θ3-hat and θ4-hat are both unbiased estimators of θ, but the sampling distribution of θ3-hat has a smaller variance than the sampling distribution of θ4-hat.

θ3-hat is the more efficient estimator; it is likely to produce a more accurate estimate of θ than θ4-hat.
Note: In the case of random sampling the sample mean is the best linear unbiased estimator (BLUE) of the population mean. “Best” means that X-bar has the smallest variance in the class of linear unbiased estimators of μ, hence it is an efficient estimator.
d) θ-hat is called a consistent estimator of θ if its sampling distribution collapses into a vertical straight line at the point θ when the sample size n goes to infinity.

Let f1(θ-hat), f2(θ-hat) and f3(θ-hat) denote the sampling distributions of the same θ-hat estimator generated by three different sample sizes, n1 < n2 < n3.
These sampling distributions are centered around θ, and as the sample size increases they become narrower.
Granted that this is true for larger sample sizes as well, θ-hat is a consistent estimator of θ.

If θ-hat is an unbiased estimator, then consistency requires the variance of its sampling distribution to go to zero for increasing n.
However, if θ-hat is a biased estimator, then consistency requires both its variance and the bias to go to zero for increasing n.
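The narrowing of the sampling distribution can also be illustrated with a short simulation. This is only a sketch under assumed values (a normal population with mean 5 and standard deviation 2, and three arbitrary sample sizes):

set.seed(42)
R <- 10000                                # simulated samples per sample size

for (n in c(10, 100, 1000)) {
  xbar <- replicate(R, mean(rnorm(n, mean = 5, sd = 2)))
  cat("n =", n,
      " mean of X-bar =", round(mean(xbar), 3),
      " var of X-bar =", round(var(xbar), 5), "\n")
}
# The simulated variance of X-bar (about 4/n) shrinks towards zero as n grows,
# in line with X-bar being a consistent estimator of the population mean.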
PARAMETRIC AND NONPARAMETRIC TECHNIQUES

• Many statistical procedures for interval estimation and hypothesis testing

a) are concerned with population parameters, and


b) are based on certain assumptions about the sampled population
or about the sampling distribution of some point estimator.
These procedures are usually referred to as parametric procedures.

For example, the confidence interval estimation for a population mean


and the corresponding hypothesis testing of a population mean based
on the t distribution are parametric procedures as they are concerned
with the population mean and assume that
i. The variable of interest is quantitative and hence the data is
measured on an interval or a ratio scale.
Otherwise the population mean would not exist and the
central location could be measured only with the mode and
the median (if the measurement scale is at least ordinal).
ii. The sample has been randomly selected.
Otherwise it might not represent the population accurately.

iii. The population standard deviation is unknown, but the population


is normally distributed, at least approximately.

Procedures that are either not concerned with some population


parameter or are based on relatively weaker assumptions than their
parametric counterparts, and hence require less information about the
sampled population, are called nonparametric procedures.

Note: Nonparametric techniques are sometimes known as distribution-free


procedures. This is a bit deceptive as they also rely on some, though
fewer and less stringent, assumptions about the sampled population.

Parametric procedures can be misleading when some of their assumptions are violated, so it is crucial to be familiar with these assumptions and to learn how to check them in practice.
Never run any inferential statistical procedure without performing a thorough exploratory data analysis first.
THE ASSUMPTION OF NORMALITY

• A crucial assumption behind most parametric procedures is normality,


namely that the underlying sampling distribution is normally distributed.
For example, in the case of testing a population mean with a parametric test, either σ should be known and the sample mean should be normally distributed (Z-test), or, if σ is unknown, the sampled population itself should be normally distributed (t-test).

• How can we find out with reasonable certainty whether a population is


normally distributed?
In practice the populations are hardly ever observed entirely. Hence,
we look at the sample data to see if they are more or less normally
distributed, and if they are, then we have grounds to believe that the
sampled population is also normal (at least, not extremely non-normal).
Normality can be verified in a number of ways relying on some
(i) graphs, (ii) sample statistics and (iii) formal hypothesis tests.



i. Checking normality visually

We can use two types of graphs to study whether a data set is


characterised by a normal distribution: histogram and QQ-plot.

The QQ (quantile-quantile) plot is a scatter plot that depicts the quantiles of the sample data against the corresponding quantiles of some known probability distribution.
When it is used for checking normality, the reference distribution is the (standard) normal distribution, and if the sample data are normally distributed, the points on the scatter plot lie approximately on a straight line.

Ex 1: (Week 1, Ex 2)
Last week we performed a t-test to find out whether there was sufficient
evidence at the 5% level of significance to establish that the average Australian
is more than 10kg overweight.

The sample size was large enough (n = 100) to rely on the CLT, so the sampling distribution of the sample mean could be assumed approximately normal.
However, σ was unknown, so we had to assume that the sampled population was not extremely non-normal in order to be able to rely on the t-test.
a) Develop a histogram and a QQ-plot of diff with R to see whether the sampled population might be normally distributed.
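A minimal sketch of how such graphs can be produced in R (assuming the Week 1 data are already available in a numeric vector called diff; the exact commands in the tutorial files may differ):

# Histogram of diff with a normal curve superimposed
hist(diff, freq = FALSE, main = "Histogram of diff", xlab = "diff")
curve(dnorm(x, mean = mean(diff), sd = sd(diff)),
      col = "red", lwd = 2, add = TRUE)

# Normal QQ-plot of diff with a reference line
qqnorm(diff)
qqline(diff, col = "red")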

[Figures: histogram of diff with a normal curve that has the same mean and standard deviation as the sample of diff; QQ-plot of diff]

The histogram is skewed to the right and on the QQ-plot the points do not lie close to the straight line. Hence, both graphs suggest that diff is unlikely to be normally distributed.
ii. Quantifying normality with numerical descriptive measures

There are four simple numerical descriptive measures that can help us
decide whether a data set is characterised by a normal distribution:
mean, median, skewness and kurtosis.

For (continuous and unimodal) symmetric distributions, such as the


normal, mean = median (for normal distributions the mode is also equal
to them), …
… while for right (positively) skewed distributions median < mean and
for left (negatively) skewed distributions mean < median.



• Skewness (SK) is a descriptor of the shape of a distribution and it is
concerned with the asymmetry of a distribution around its mean.

The population parameter for SK is the third standardized moment, defined as

SK = E[(X − μ)³] / σ³

For symmetric distributions SK = 0, for distributions that are skewed to


the right SK > 0 and for distributions that are skewed to the left SK < 0.

The pastecs package of R estimates SK with the sample skewness statistic SK-hat (reported as skewness), the sample counterpart of the third standardized moment.

The approximate estimated standard error of this statistic is SE(SK-hat) ≈ √(6/n).

The data are likely asymmetric, and thus non-normal (at α = 0.05), when

|skew.2SE| = |SK-hat| / (2 × SE(SK-hat)) > 1
• Kurtosis (K) is another descriptor of the shape of a distribution.
It is related to the thickness of the tails of the distribution.

These curves illustrate three different


allocations of the unit probability of
the certain event over the range of
possible values.

A distribution that, relative to the normal distribution, has a sharper peak around its mean but heavier (fatter) tails is called leptokurtic (leptos is Greek for thin, fine, referring to its slender peak).
A distribution that, relative to the normal distribution, has a flatter peak around its mean and thinner tails is called platykurtic (platus is Greek for broad, flat).



The population parameter for K is the fourth standardized moment, defined as

K = E[(X − μ)⁴] / σ⁴

K = 3 for normal distributions, K > 3 for leptokurtic distributions and K < 3 for platykurtic distributions. K − 3 is called excess kurtosis.

The pastecs package of R estimates K − 3 with the sample excess kurtosis statistic K-hat − 3 (reported as kurtosis).

The approximate estimated standard error of this statistic is SE(K-hat − 3) ≈ √(24/n).

The data are likely non-normal (at α = 0.05) when

|kurt.2SE| = |K-hat − 3| / (2 × SE(K-hat − 3)) > 1
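The two 2SE rules can also be computed by hand. The sketch below assumes the data are in a numeric vector (here the Ex 1 vector diff) and uses the simple moment formulas together with the approximate standard errors √(6/n) and √(24/n) quoted above; pastecs may apply slightly different small-sample corrections:

x <- diff                                 # any numeric data vector
n <- length(x)
m <- function(k) mean((x - mean(x))^k)    # k-th central sample moment

SK_hat  <- m(3) / m(2)^(3/2)              # sample skewness
exK_hat <- m(4) / m(2)^2 - 3              # sample excess kurtosis

skew.2SE <- SK_hat  / (2 * sqrt(6 / n))
kurt.2SE <- exK_hat / (2 * sqrt(24 / n))

abs(skew.2SE) > 1   # TRUE suggests asymmetry, hence non-normality, at the 5% level
abs(kurt.2SE) > 1   # TRUE suggests non-normal tail behaviour at the 5% level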



iii. Testing for normality

There are several statistical tests for normality, i.e. for


H0 : the data comes from a normally distributed population;
HA : the data comes from a non-normally distributed population.

We use only the Shapiro-Wilk (SW) test because it is easy to implement


in R and compares favorably to other tests for normality at the limited
sample sizes we usually have to work with in economics, business and
marketing.
We do not discuss the details of this test as we shall always perform it
with R. The program reports the test statistic and the p-value and H0 is
rejected if the p-value is smaller than the selected significance level.
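In R the test is available as the built-in shapiro.test function; for example, on the Ex 1 data (assuming they are stored in a vector called diff):

sw <- shapiro.test(diff)   # Shapiro-Wilk test of H0: the data come from a normal population
sw$statistic               # the W test statistic
sw$p.value                 # reject H0 if this is below the chosen significance level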

Note:
a) The t test (and several other statistical tools that assume normality) is fairly
robust to departures from normality and hence reliable in practice, unless
the sample size is very small (say, less than 30) or the population is rather
skewed (in this case the t test is acceptable only at much larger sample
sizes).
b) The SW test, similarly to other tests for normality, has two shortcomings.
(i) At small sample sizes (n < 20), when the normality assumption can be
crucial, it has little power to reject H0 even if the population is indeed
not normally distributed.
(ii) At large sample sizes (n > 100), when the violation of normality is far
less critical in practice, it tends to be sensitive to small deviations from
normality and rejects H0 far too often.
For these reasons, it is not recommended to rely entirely on the SW test.
It is always better to assess normality with a combination of graphs, sample
statistics and formal hypothesis tests, though at small sample sizes all these
checks can be unreliable.

(Ex 1)
b) Obtain descriptive statistics and the SW statistic for diff with R and discuss
their implications about normality.
The stat.desc function of the pastecs package generates a printout with the statistics discussed below.
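The call itself might look as follows (a sketch: stat.desc is the pastecs function named above, and norm = TRUE requests the normality-related statistics):

library(pastecs)
round(stat.desc(diff, basic = FALSE, norm = TRUE), 3)
# returns, among others: median, mean, skewness, skew.2SE,
# kurtosis, kurt.2SE, normtest.W and normtest.p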



i. The sample mean (12.175) is bigger than the sample median (10.500),
so the sample of diff is skewed to the right (non-normal).

ii. SK-hat = 0.556 is positive, so the sample of diff is skewed to the right.
SK-hat divided by twice its standard error is skew.2SE = 1.151 > 1, so the distribution of diff is unlikely to be normal.

iii. The estimate of excess kurtosis is K-hat − 3 = −0.548. It is negative, so the sample of diff is platykurtic. However, the absolute value of K-hat − 3 divided by twice its standard error is |kurt.2SE| = 0.573 < 1, so the distribution of diff might be normal.

iv. The reported p-value of the SW test is normtest.p = 0.001 < 0.05, thus
normality is rejected at the 5% level.

Since 3 out of 4 checks cast doubt on normality, the t-test in Ex 2, Week 1


might be misleading.



NONPARAMETRIC TESTS FOR A POPULATION
CENTRAL LOCATION

• For quantitative data the two most useful and popular measures of
central location are the arithmetic mean and the median (in this order).

The mean has two advantages over the median:


 The mean is a comprehensive measure because it is computed
from all available data points, while the median is based on at most
two data points.
 The mean is used far more extensively in inferential statistics than
the median.
However, occasionally the median also has some advantages:
 Since the median depends only on the middle value(s), it is not
affected by outliers (uncharacteristically small or large values),
while the mean can be unduly influenced by them.
 The median exists even if the measurement scale is just ordinal,
but the mean does not.
• A hypothesis about the central location of a quantitative population is usually best tested with a Z or t test for the population mean (μ).
However, when the mean does not exist, or is not an ideal measure of central location due to the presence of outliers, or when the Z / t tests are inappropriate because the normality assumption is violated, instead of these parametric procedures one should rely on some nonparametric alternative for testing the central location of a population.

We consider two such alternatives, the sign test for the median and the
Wilcoxon signed rank test for the median, both based on one sample.

(One sample) Sign test for the median (η)


This procedure assumes that
i. the data is a random sample,
ii. the variable of interest is continuous and the measurement scale is
at least ordinal,
but it makes no assumption about the shape of the distribution.

The hypotheses are H0 : η = η0 vs. HA : η ≠ η0 (or HA : η > η0, or HA : η < η0).


This test is based on the signs of the observed non-zero deviations from η0, i.e. on the signs of xi − η0 ≠ 0, i = 1, 2, …, n. Since the true median is right in the middle of an ordered data set, the numbers of negative and positive deviations (S− and S+) are expected to be about the same if the null hypothesis is correct, i.e. if the population median is indeed η0.
Let S denote the test statistic. In essence, it could be either S- or S+,
but we arbitrarily choose S = S+.
If H0 is true and the selection of the sample items is random, S follows a binomial distribution (see Review 3) with parameters n and p = 0.5, i.e. S ~ B(n, 0.5).

For sufficiently large n (np = nq = 0.5n ≥ 5, so n ≥ 10), this binomial distribution (B) can be approximated with a normal distribution (N) with mean μS = np = 0.5n and standard deviation σS = √(npq) = 0.5√n.
Reject H0 if (i) right-tail test: pR = P(S ≥ S+) is small,
(ii) left-tail test: pL = P(S ≤ S+) is small,
(iii) two-tail test: 2 min(pR, pL) is small.
(One sample) Wilcoxon signed ranks test for the median (η)
(also known as Wilcoxon signed rank sum test)

The sign test is based entirely on the signs of the deviations from η0.
The Wilcoxon signed ranks test is a more sensitive and potentially more powerful alternative because it also takes the magnitudes of these deviations into consideration.

The Wilcoxon signed ranks test assumes that


i. the data is a random sample,
ii. the variable of interest is continuous and the measurement scale is
interval or ratio, and
iii. the distribution of the sampled population is symmetric (μ = η).

The Wilcoxon signed ranks test has the same null and alternative hypotheses as the sign test, but it is based on the signs and on the absolute values of the deviations, i.e. |di| = |xi − η0|, i = 1, 2, …, n.
Rank all non-zero |di| from smallest to largest and calculate the sum of the ranks assigned to negative deviations (T−) and the sum of the ranks assigned to positive deviations (T+).
The test statistic is T = T+.

The possible values of T range from 0 to n(n+1)/2. If H0 is true, T is expected to be right in the middle of this interval, i.e. around n(n+1)/4.

The sampling distribution of T is non-standard, but lower and upper


critical values (TL and TU) for 6 ≤ n ≤ 30 are provided in Table 9,
Appendix B of the Selvanathan book.

Using these critical values, reject H0 if
(i) right-tail test: T ≥ TU,α,
(ii) left-tail test: T ≤ TL,α,
(iii) two-tail test: T ≥ TU,α/2 or T ≤ TL,α/2.

When H0 is true and there are at least 30 non-zero deviations, the sampling distribution of T can be approximated with a normal distribution. Namely, T is approximately N(μT, σT), with

μT = E(T) = n(n+1)/4 and σT = √[ n(n+1)(2n+1) / 24 ]
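A sketch of the rank computation (assuming the data are in a vector x and the hypothesized median is eta0); base R's wilcox.test reports the same T+ statistic, labelled V:

wilcoxon_Tplus <- function(x, eta0) {
  d <- x - eta0
  d <- d[d != 0]                  # drop zero deviations
  r <- rank(abs(d))               # rank |d| from smallest to largest (ties share average ranks)
  sum(r[d > 0])                   # T+ = sum of the ranks of the positive deviations
}

# Built-in counterpart (exact or normal approximation, depending on n and ties):
# wilcox.test(diff, mu = 10, alternative = "greater")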



(Ex 1)
c) Perform the sign test and the Wilcoxon signed ranks test at the 5% level of
significance with R.
The original null and alternative hypotheses are H0 : μ = 10 and HA : μ > 10, but since these nonparametric tests focus on the median rather than on the mean, we rewrite them as H0 : η = 10 and HA : η > 10.

The SignTest function of the DescTools package generates the printout summarised below.
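The call might look like this (a sketch; SignTest is the DescTools function named above, and the data vector diff and hypothesized value 10 come from Ex 1):

library(DescTools)
SignTest(diff, mu = 10, alternative = "greater")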

R reports the test statistic (S),


the number of non-zero
differences, the p-value, the
alternative hypothesis, the
95% ‘one-sided’ confidence
interval (do not worry about it)
and the sample median.

Check whether R performed the required test (i.e. a right-tail sign test this time) and whether the p-value < α = 0.05. Since p-value = 0.2692 > 0.05, we maintain H0 at the 5% significance level.
The wilcox.exact function of the exactRankTests package generates the printout summarised below.
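Again as a sketch (wilcox.exact is the exactRankTests function named above):

library(exactRankTests)
wilcox.exact(diff, mu = 10, alternative = "greater")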

R reports the test statistic (V),


the p-value, the alternative
hypothesis, the 95% ‘one-sided’
confidence interval (do not worry
about it) and a variant of the
sample median.

Since p-value ≈ 0.019 < 0.05, we reject H0 at the 5% significance level.

In summary, at the 5% significance level the sign test maintains H0, while the
Wilcoxon signed ranks test rejects it.
Given these conflicting outcomes, recall that the Wilcoxon test assumes that the population is symmetric. This assumption, however, is not supported by the sample (see the mean, median and skewness discussed in part b above), so we had better rely on the sign test this time.
Hence, based on the sign test we conclude at the 5% level of significance that
the sample does not support the claim that the average Australian is more than
10kg overweight.
WHAT SHOULD YOU KNOW?

• Desirable statistical properties of point estimators:


linearity, unbiasedness, efficiency and consistency.
• To verify whether a sample might have been drawn from a normally
distributed population using graphs, numerical descriptive measures
and the Shapiro-Wilk test.
• Difference between parametric and nonparametric tests.
• To perform the (one sample) Sign test and Wilcoxon signed ranks test
for the population median manually and with R/RStudio.

