
ECON1203 Past Papers Suggested Solutions

By Blair Wang BIT12


If you find any questions where you don't agree with my solution, please let me know! That means that one
of us can learn from the other :)
NOTE: To access these papers, go to the UNSW Library website and search for "ECON1203". You will be able
to view them online as PDF files. You need your Zpass.

TABLE OF CONTENTS:
2010s2

2010s2
(1.a.i)
Assumptions:
1. There is a fixed number of trials, n
2. Each trial can be categorised as either 'failure' or 'success'
3. The probability of a particular trial being a 'success' is a constant, p, between 0 and 1
4. The trials are independent - one trial's outcome is not affected by that of another
(1.a.ii)
Mean = E(X) = np = 1*p = p (here n=1, a single trial)
Variance = Var(X) = np(1-p) = p(1-p) = p - p^2
(1.a.iii)
The drawing of the 5 items cannot be modelled as a binomial experiment since the items are drawn without
replacement and therefore these 'trials' would not be independent. The alternative is to treat each instance
of 5 items being drawn as a single 'trial', but because only one such 'trial' is conducted, the only possible
values of the sample proportion would be multiples of 1/5, i.e., 0, 0.2, 0.4, 0.6, 0.8, and 1.0.
(1.b.i)
Since the 10 items are being drawn from a 'large' shipment of items, it can be assumed that each act of
drawing without replacement has an insignificant effect on every other draw. The draws are therefore
practically independent and can be treated as 'trials' in a binomial experiment.
(1.b.ii)
Let X be the number of defective items in the sample. As per (1.b.i), this can be modelled as a binomial
experiment. n=10, p=0.1 (given). For an exact probability, compute from the binomial formula rather than
using tables.
P(1<=X<=3) = P(X=1) + P(X=2) + P(X=3)
= (10C1)(0.1^1)(0.9^9) + (10C2)(0.1^2)(0.9^8) + (10C3)(0.1^3)(0.9^7)
= 0.6385263615...
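As a quick check (not something you could do in the exam), the exact sum can be computed with a few lines
of Python using the scipy library:

    from scipy.stats import binom

    n, p = 10, 0.1
    # P(1 <= X <= 3) = F(3) - F(0), where F is the binomial CDF
    prob = binom.cdf(3, n, p) - binom.cdf(0, n, p)
    print(prob)  # 0.63852636...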
(1.b.iii)
Normal approximation to the binomial: X~N(np, np(1-p)) i.e. X~N(1, 0.9), so sigma = sqrt(0.9) ~= 0.9487
now P(1<=X<=3)
~= P(0.5 < X < 3.5), applying continuity correction factor
= P((0.5-1)/sqrt(0.9) < Z < (3.5-1)/sqrt(0.9))
~= P(-0.53 < Z < 2.64)
= P(0<Z<0.53) + P(0<Z<2.64)
= 0.2019 + 0.4959
= 0.6978
N.B. We standardise with the standard deviation sqrt(0.9), not the variance 0.9.
This approximation is not very accurate (0.6978 vs the exact 0.6385) because n=10 is a rather small sample
size; np = 1 is well below the usual np >= 5 rule of thumb. To improve accuracy, a larger sample size is needed.
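The same approximation can be checked in Python; this sketch standardises with the standard deviation
sqrt(0.9) and applies the continuity correction:

    from math import sqrt
    from scipy.stats import norm

    n, p = 10, 0.1
    mu, sd = n * p, sqrt(n * p * (1 - p))  # mean 1, standard deviation sqrt(0.9)
    # continuity correction: P(1 <= X <= 3) ~ P(0.5 < Y < 3.5) for Y ~ N(mu, sd^2)
    approx = norm.cdf(3.5, mu, sd) - norm.cdf(0.5, mu, sd)
    print(approx)  # ~0.697, versus the exact 0.6385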
(1.c)
P(gadget works) = P(1st works AND 2nd works) = P(1st works)*P(2nd works) = (0.9)^2 = 0.81, since the
components are independent and both must work.
(1.d)
P(gadget works) = P(1st works OR 2nd works)
= P(both work) + P(exactly one works)
= 0.81 + 0.9*0.1*2
= 0.99
This enhancement is very useful as it increases the success rate from 81% to 99%.
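Both designs can be verified with one-line calculations (plain Python, figures from the question):

    p = 0.9  # probability a single component works
    series = p * p               # (1.c) both components must work: 0.81
    parallel = 1 - (1 - p) ** 2  # (1.d) at least one component works: 0.99
    print(series, parallel)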
(2.a)
The distribution must have been bell-shaped (symmetrical, unimodal) with mode=mean=median=15.
(2.b)
Let X be the number of hours worked. X~N(15, sigma^2).
P(X>25) = 0.1 (given)
.'. P(Z > (25-15)/sigma) = 0.1
10/sigma = 1.2816 (table 4; this is the z-value leaving 10% in the upper tail)
sigma = 10/1.2816 = 7.8028...
Thus the standard deviation is approx. 7.80 (2dp).
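The inverse-normal lookup can be reproduced in Python (scipy's norm.ppf plays the role of table 4):

    from scipy.stats import norm

    # P(X > 25) = 0.1 and the mean is 15, so (25 - 15)/sigma = z_0.10
    z = norm.ppf(0.9)  # upper 10% point of Z, ~1.2816
    sigma = 10 / z
    print(sigma)       # ~7.80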
(2.c)
P(5<X<25)
= P((5-15)/sigma < Z < (25-15)/sigma)
= P(-10/sigma < Z < 10/sigma)
= 2*P(0 < Z < 10/sigma)
~= 2*P(0 < Z < 1.28)
= 2*0.4 = 0.8
Equivalently, by symmetry, P(X<5) = P(X>25) = 0.1, so P(5<X<25) = 1 - 2*0.1 = 0.8.
(2.d)
X~N(15, sigma^2) .'. X-bar~N(15, [sigma^2]/n).
N.B. Although n=25 is too small to invoke Central Limit Theorem, we know that X-bar is normally distributed
since the underlying distribution is normal.
Now P(5 < X-bar < 25)
= P(-10/[sigma/sqrt(n)] < Z < 10/[sigma/sqrt(n)])
= 2*P(0 < Z < 10/[sigma/sqrt(25)])
= 2*P(0 < Z < 6.41)
~= 1
Thus it is practically certain that the average is between 5 and 25 hours.
(2.e)
Let the female population mean be mu_F and the observed sample mean be f-bar. Assume the female standard
deviation is the same as that of males, sigma = 10/1.2816 ~= 7.80, so sigma_f-bar = sigma/sqrt(n). Now n=36.

(1) Hypotheses:
H0: mu_F = 15 (females work the same number of hours)
H1: mu_F < 15 (females work fewer hours)
(2) Standardise Observation to Test Statistic:
We observe f-bar = 13.5, so Z = (13.5-15)/(7.8028/sqrt(36)) = -1.5/1.3005 = -1.1534
P(Z<-1.1534) = 0.5-P(0<Z<1.1534)
~= 0.5-P(0<Z<1.15)
= 0.5-0.3749 = 0.1251
(3) Decision rule:
From (2), we reject H0 for all alpha > 0.1251 i.e. 12.51%
This is well above any conventional significance level for a one-tailed hypothesis test (1%, 5%, or even
10%). Therefore, at standard significance levels we do not reject H0: the sample does not provide
significant evidence that females work fewer hours than males on average.
A Type I error is when the null hypothesis is correct but has been erroneously rejected. In this case, a Type
I error would be to conclude that females work fewer hours than males when, in fact, they do not. The
probability of committing a Type I error is the significance level; here, rejecting H0 would require a
significance level above 12.51%.
A Type II error is when the null hypothesis is incorrect but we have failed to reject it. In this case,
a Type II error would be to let "females work the same number of hours" pass the hypothesis test when, in
fact, females work fewer hours.
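The whole test in part (e) can be reproduced in Python (a sketch, using the sigma derived in (2.b)):

    from scipy.stats import norm

    sigma = 10 / norm.ppf(0.9)             # ~7.80, assumed equal for females
    n, fbar, mu0 = 36, 13.5, 15
    z = (fbar - mu0) / (sigma / n ** 0.5)  # standardised test statistic
    p_value = norm.cdf(z)                  # lower-tail (one-sided) p-value
    print(z, p_value)                      # ~-1.15, p ~0.12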
(2.f)
It is not necessary to assume that the distribution of hours worked by females is normal in part (e) since the
sample size (n=36) is large enough to invoke the Central Limit Theorem, which states that for large (n>30)
sample sizes, the sampling distribution of the mean is approximately normal regardless of the underlying
population distribution.
(3.a)
A point estimator provides a single value as the estimate of the parameter. An interval estimator provides a
range of values, together with a level of confidence that the range contains the parameter.
(3.b)
n=114, which is large enough (n>30) to invoke Central Limit Theorem and thus assume that X-bar is
normally distributed. Hence X-bar~N(10.88, (3.20^2)/n), i.e. X-bar~N(10.88, 10.24/114)
For 99% CI, alpha = 0.01, alpha/2 = 0.005
CI half-width = Z_alpha/2 * sqrt(variance) = Z_0.005*sqrt(10.24/114)
= 2.576*sqrt(10.24/114)
= 0.7720464162...
Hence CI = [10.11, 11.65].
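The interval can be checked in Python:

    from scipy.stats import norm

    n, xbar, s = 114, 10.88, 3.20
    z = norm.ppf(0.995)              # 2.576 for a 99% CI
    half = z * s / n ** 0.5
    print(xbar - half, xbar + half)  # ~[10.11, 11.65]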
The sampling method used by HRD is a questionnaire that recipients could choose not to respond to. If
managers' choice whether or not to complete the questionnaire is correlated with how many years the
manager has been in the firm, then this sampling method is biased.
(3.c)
The confidence interval could be changed by altering the confidence level (a greater confidence level
increases the CI width) or by altering the sample size (a larger sample size decreases the CI width).
Now for 95% CI, alpha = 0.05, alpha/2 = 0.025
CI half-width = Z_alpha/2 * sqrt(10.24/n) = 1.96*sqrt(10.24/n), and this must equal half the previous
half-width, i.e. 0.5*0.7720464162...
.'. sqrt(10.24/n) = (0.5/1.96)*0.7720464162... = 0.19695...
.'. 10.24/n = 0.03878...
.'. n = 263.988...
.'. At least 264 general managers should be interviewed.
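The sample-size arithmetic can be checked in Python (rounding up, since n must be an integer):

    from math import ceil
    from scipy.stats import norm

    s = 3.20
    half_width = 0.5 * 0.7720464162      # half of the 99% CI half-width from (3.b)
    z = norm.ppf(0.975)                  # 1.96 for a 95% CI
    n = ceil((z * s / half_width) ** 2)  # solve z*s/sqrt(n) = half_width for n
    print(n)                             # 264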
(3.d)
Whether the assumption that employment duration is normally distributed is reasonable depends on prior
experience and empirical observation, as there is no universal rule. If the median were significantly different
from the mean, then a normal distribution could not be used; however, it is unclear whether 10.95 is
significantly different from 10.88.
(3.e)
For 95% CI, alpha=0.05, alpha/2 = 0.025. n=114, s^2 = 3.20^2 = 10.24, so (n-1)*s^2 = 113*10.24 = 1157.12
LCL = (n-1)*s^2 / ChiSq(alpha/2, n-1) = 1157.12/ChiSq(0.025, 113) = 1157.12/129.5612 = 8.93... (closest table df=100)
UCL = (n-1)*s^2 / ChiSq(1-alpha/2, n-1) = 1157.12/ChiSq(0.975, 113) = 1157.12/74.2219 = 15.59...
.'. 95% CI for the variance = [8.93, 15.59]; taking square roots, the 95% CI for the standard deviation is [2.99, 3.95].
This is not symmetrical about the point estimate, s = 3.20, because (n-1)s^2/sigma^2 follows the
positively-skewed Chi-square distribution.
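In Python, chi2.ppf gives the exact df=113 percentile points rather than the closest table entry (df=100),
so the numbers come out slightly narrower:

    from scipy.stats import chi2

    n, s2 = 114, 3.20 ** 2
    lo = (n - 1) * s2 / chi2.ppf(0.975, n - 1)  # lower limit for the variance
    hi = (n - 1) * s2 / chi2.ppf(0.025, n - 1)  # upper limit for the variance
    print(lo, hi)                               # CI for sigma^2
    print(lo ** 0.5, hi ** 0.5)                 # CI for sigma, roughly [2.8, 3.7]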
(4.a)
Covariance measures the extent to which X and Y move together around their means. It is calculated as
s_xy = SUM[(Xi-Xbar)(Yi-Ybar)]/w, where w is the population size for the population covariance, or the
sample size minus 1 for the sample covariance. The estimated slope coefficient is b1 = s_xy/(s_x^2). Since
the variance s_x^2 is always positive, b1 must have the same sign as s_xy: if X and Y move in the same
direction (covariance is positive), then X and Y have a positive relationship and the slope coefficient is
positive; if they move in opposite directions (covariance is negative), the slope coefficient is negative.
Therefore covariance and the slope coefficient have the same sign, QED. (A numerical check is sketched below.)
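A numerical check with made-up data (any dataset will do, since b1 = s_xy/s_x^2 exactly):

    import numpy as np

    # made-up data purely to illustrate sign(covariance) = sign(slope)
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])
    s_xy = np.cov(x, y, ddof=1)[0, 1]  # sample covariance
    b1 = s_xy / np.var(x, ddof=1)      # OLS slope = s_xy / s_x^2
    print(s_xy, b1)                    # same sign (both positive here)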
(4.b)
Assumptions about the error terms:
Zero conditional mean, i.e. E(epsilon_i | X_i) = 0
Homoskedasticity, i.e. Var(epsilon_i) = sigma^2 for all i
No correlation, i.e. corr(epsilon_i, epsilon_j) = 0 for all i != j
Together with normality, these assumptions give epsilon_i~N(0, sigma^2), and thus actual values of Y are
normally distributed around the regression line of forecasts for Y.
(4.c.i)
A = t Stat = (estimate - hypothesised value)/se = (0.600-0)/0.112 = 0.6/0.112 = 5.357142857...
B = p-value = 2*P(t>5.357142857...) ~= 0.0000 (Excel reports a two-tailed p-value)
(4.c.ii)
Intercept = 0.600, which means that a country with 0 inhabitants per hectare still has a deforestation rate of
0.6% per year. This case is never realised, which means that the model only applies for countries that are
actually inhabited.
Pop. density = 0.842, which means that each additional inhabitant per hectare increases the deforestation
rate by 0.842 percentage points per year, holding other factors constant. This is a positive coefficient, as
expected, since more inhabitants generally result in more deforestation for human activity.
(4.c.iii)

Beta0 = Intercept, point estimator = 0.600.
For 95% CI, alpha = 0.05, alpha/2 = 0.025. df = n-2 = 70-2 = 68 (2 estimated coefficients)
Now half-width = t(alpha/2, df)*se = t(0.025,68)*0.112 = 2.000*0.112 = 0.224 (closest table df=60)
.'. CI = [0.376, 0.824]
(4.c.iv)
Beta1 = Pop. Density, point estimator = 0.842
(1) Hypotheses:
H0: Beta1 = 0.5
H1: Beta1 > 0.5 (one-tailed!)
(2) Standardise Observation to Test Statistic:
t Stat = (actual-expected)/se = (0.842-0.5)/0.117 = 2.923076923...
p-value = P(t>2.923076923...) for df=68 i.e. n=70
as n is large (n>30), we may approximate t distribution using normal distribution
.'. p-value ~= P(Z>2.923076923...) = 0.5-P(0<Z<2.92) = 0.5-0.4982 = 0.0018
(3) Decision rule
From (2), we reject H0 for all alpha > 0.0018. This is a very small significance level (0.18%) and
therefore in practically all cases we would reject H0 in favour of H1, i.e., we would conclude that
Beta1 is significantly greater than 0.5.
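The p-value can also be computed from the t distribution itself, without the normal approximation (figures
from the Excel output):

    from scipy.stats import t

    b1, se, df = 0.842, 0.117, 68
    t_stat = (b1 - 0.5) / se         # one-tailed test of H0: beta1 = 0.5
    p_value = 1 - t.cdf(t_stat, df)  # upper-tail probability
    print(t_stat, p_value)           # ~2.92, p ~0.002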
(4.c.v)
Sample correlation: for a simple (one-regressor) regression, R^2 = (r_xy)^2, so r_xy = +/-sqrt(R^2). Since
the slope coefficient (0.842) is positive, r_xy = +sqrt(0.436) = 0.660 (3dp). (There is no need for s_xy or
s_x separately.)
(4.c.vi)
For this country, Y-hat = 0.600 + 0.842(0.640) = 1.13888...
But the observed value is Y = 2.6.
Residual = e_i = Y_i minus Y-hat_i = 2.6 - 1.13888 = 1.46112
The residual is positive: this country's actual deforestation rate is about 1.46 percentage points per year
higher than the rate the model predicts from its population density alone.
(5.a)
R^2 measures the proportion [don't say percentage, because that would be 100*R^2] of variation in Y that
the regression model explains using variation in X.
(5.b.a)
Intercept = 573.5, which means that a new house with no area whatsoever is still worth $573,500.
This is clearly absurd, which means the model only applies to houses that actually exist in physical space
(i.e. have an area).
Square metres = 1.772, which means that every additional square metre that the house occupies,
holding other factors constant, increases its price by $1,772. This is a positive number as expected since
larger houses are, ceteris paribus, worth more.
Age (years) = -6.663, which means that every year it ages, a house's price decreases by $6,663,
holding other factors constant. This is a negative number as expected since older houses are, ceteris
paribus, worth less than newer houses.
(5.b.b)
For 99% CI, alpha=0.01, alpha/2=0.005. n=23, so df = n-3 = 20 (3 estimated coefficients).
Point estimator = 1.772, se=0.315.
CI half-width = t(alpha/2, df)*se = t(0.005, 20)*0.315 = 2.845*0.315 = 0.896175
.'. 99% CI = [1.772-0.896175 , 1.772+0.896175] = [0.875825, 2.668175]
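This interval can be checked in Python, which also avoids rounding the t value:

    from scipy.stats import t

    b, se, df = 1.772, 0.315, 20  # n=23 minus 3 estimated coefficients
    half = t.ppf(0.995, df) * se  # t(0.005, 20) ~ 2.845
    print(b - half, b + half)     # ~[0.876, 2.668]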
(5.b.c)
Beta2 = Age (years) coefficient
(1) Hypotheses:
H0: Beta2 = 0 (no effect)
H1: Beta2 != 0 (some effect, either positive or negative)
(2) Test statistic:
The Excel output is based on this exact hypothesis test and provides t Stat = -2.923, which yields
P-value = 0.008418.
(3) Decision rule:
From (2), we reject H0 for all alpha>0.008418 (i.e. the minimum significance level that would lead
to a rejection of the null hypothesis is 0.8418%). Since this is a tiny significance level, we do indeed
reject H0 in favour of H1 and conclude that for all practical intents and purposes, Age (years) does
in fact have an effect on selling price.
(5.b.d)
The absolute value of the coefficient is not proportional to its importance, since some variables vary more
greatly (i.e. have greater variance) than others. For example, each square metre only affects price by
$1,772 as opposed to age (years) which affects price by $6,663. However, simply the addition of an extra
room may count as an increase of around 10 square metres, or a price change of $17,720.
(5.b.e)
Y-hat = 573.5 + 1.772(220) - 6.663(10) = 896.71
hence the predicted selling price is $896,710
(5.b.f)
This regression model is already quite good, as R Square and Adjusted R Square suggest that it explains
around 72-74% of variation in house prices. However, it may be improved by including other factors such
as houses' distance to schools and public transport, the quality of the neighbourhoods that they are situated
in, the type of building materials, the presence of a swimming pool, etc.
