
Office Use Only

Semester One 2018


Examination Period

Faculty of Business and Economics

EXAM CODES: ETC1000 / ETW1000 / MCD2080

TITLE OF PAPER: BUSINESS AND ECONOMIC STATISTICS – PAPER 1 OF 1

EXAM DURATION: 2 hours writing time

READING TIME: 10 minutes

THIS PAPER IS FOR STUDENTS STUDYING AT: (tick where applicable)


[ ] Caulfield  [ ] Clayton  Malaysia  [ ] Off Campus Learning  [ ] Open Learning
[ ] Parkville  [ ] Gippsland  [ ] Peninsula  [ ] Monash Extension  [ ] Sth Africa
[ ] Other (specify)

During an exam, you must not have in your possession any item/material that has not been authorised for
your exam. This includes books, notes, paper, electronic device/s, mobile phone, smart watch/device,
calculator, pencil case, or writing on any part of your body. Any authorised items are listed
below. Items/materials on your desk, chair, in your clothing or otherwise on your person will be deemed to
be in your possession.

No examination materials are to be removed from the room. This includes retaining, copying, memorising
or noting down content of exam material for personal use or to share with any other person by any means
following your exam.
Failure to comply with the above instructions, or attempting to cheat or cheating in an exam is a discipline
offence under Part 7 of the Monash University (Council) Regulations.

AUTHORISED MATERIALS

OPEN BOOK NO

CALCULATORS NO

SPECIFICALLY PERMITTED ITEMS NO


if yes, items permitted are:

Candidates must complete this section if required to write answers within this paper

STUDENT ID: __ __ __ __ __ __ __ __ DESK NUMBER: __ __ __ __ __

Page 1 of 10
INSTRUCTIONS TO CANDIDATES:

Answer ALL questions in this examination paper.


Paper is out of 100 marks

Where you are asked to perform calculations, you should write out the solution as an
equation containing the appropriate numerical values from within the question. You do not
need to calculate exact values in order to receive full marks for that part of the question.

Background
This exam will focus on how we measure poverty in a society / country. The task of developing
acceptable measures is important in order to effectively monitor progress in efforts to improve
wellbeing.

The measure you will study is the Multidimensional Poverty Index (MPI). This Index is based on the
idea that a household is “poor” when they are deprived in a number of areas, covering aspects of
education, health and living standards.

Here are the dimensions of the MPI and the weights used to calculate the overall index.

The dataset that will be used in this exam is based on a sample taken in 2010, covering over 11,000
households in Timor-Leste.

The variables specify whether the household is deprived in each of these 10 indicators. For all the
variables, a Zero (0) indicates the statement is not true (not deprived), and a One (1) indicates it is
true (deprived).
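The weights table referred to above is not reproduced in this copy. As an illustration only, the sketch below shows how a weighted deprivation score could be computed from the ten 0/1 indicators. It assumes the standard global MPI weighting (1/6 for each of the two education and two health indicators, 1/18 for each of the six living-standards indicators); the exam's own table may differ, and the indicator names are hypothetical labels.

```python
from fractions import Fraction

# ASSUMPTION: standard global MPI weights, not taken from the exam's own table.
# The indicator keys are hypothetical labels for the 10 indicators.
WEIGHTS = {
    "no_5yrs_education": Fraction(1, 6),
    "child_not_enrolled": Fraction(1, 6),
    "malnourished": Fraction(1, 6),
    "child_died": Fraction(1, 6),
    "no_electricity": Fraction(1, 18),
    "poor_sanitation": Fraction(1, 18),
    "no_clean_water": Fraction(1, 18),
    "dirt_floor": Fraction(1, 18),
    "dirty_cooking_fuel": Fraction(1, 18),
    "asset_poor": Fraction(1, 18),
}


def deprivation_score(household):
    """Weighted share of deprivations for one household (0 = none, 1 = all)."""
    return sum(weight * household.get(name, 0) for name, weight in WEIGHTS.items())


# Hypothetical household: deprived in education and electricity only.
example = {"no_5yrs_education": 1, "no_electricity": 1}
print(deprivation_score(example))  # 1/6 + 1/18 = 2/9
```

Exact fractions are used so that the weights sum to exactly 1 without floating-point rounding.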

Question 1: Living Standards (10 marks)

In this question we will look at some of the individual components / indicators for the MPI related to
living standards, and associations between these.

a. Consider the following pivot tables, and then answer the questions below.

Count of sanitation is poor     sanitation is poor
no access to clean water        0          1          Grand Total
0                               24.3%      30.9%      55.3%
1                               7.2%       37.6%      44.7%
Grand Total                     31.5%      68.5%      100.0%

Count of sanitation is poor     sanitation is poor
no access to clean water        0          1          Grand Total
0                               77.2%      45.2%      55.3%
1                               22.8%      54.8%      44.7%
Grand Total                     100.0%     100.0%     100.0%

Count of sanitation is poor     sanitation is poor
no access to clean water        0          1          Grand Total
0                               44.1%      55.9%      100.0%
1                               16.0%      84.0%      100.0%
Grand Total                     31.5%      68.5%      100.0%

i. What percentage of households are deprived of both clean water and good sanitation?
37.6% (2 marks)

ii. Which is the more common: access to clean water, or access to good sanitation facilities? Explain.
Access to clean water: 55.3% of households have access to clean water, whereas only 31.5% have good sanitation.
(2 marks)

iii. Explain how the value 45.2% in the second table is calculated.
30.9% / 68.5% x 100 (2 marks)

iv. Consider the following incomplete % of row table.

Count of sanitation is poor     sanitation is poor
no access to clean water        0          1          Grand Total
0                               a          b          100.0%
1                               c          d          100.0%
Grand Total                     31.5%      68.5%      100.0%

Suppose access to sanitation is independent of whether households have clean water or not. What
values would be in the cells of the table labelled a, b, c and d? Explain intuitively why they would
take these values.
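a = c = 31.5% and b = d = 68.5% (4 marks)
Under independence, knowing whether a household has clean water tells us nothing about its sanitation, so each row has the same split as the Grand Total row.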
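The three pivot tables above are the same data expressed as % of grand total, % of column and % of row. The sketch below recomputes column and row percentages from the first table and the joint percentage expected under independence. It works from the rounded percentages shown, so results can differ from the printed tables by about 0.1 percentage points; the function names are illustrative.

```python
# Joint percentages from the first pivot table (% of grand total).
# Keys are (no_clean_water, poor_sanitation).
joint = {(0, 0): 24.3, (0, 1): 30.9, (1, 0): 7.2, (1, 1): 37.6}


def row_total(water):
    """Percent of all households with this clean-water status."""
    return sum(v for (w, _), v in joint.items() if w == water)


def column_total(sanitation):
    """Percent of all households with this sanitation status."""
    return sum(v for (_, s), v in joint.items() if s == sanitation)


def column_percent(water, sanitation):
    """Cell as a percent of its column (second table)."""
    return 100 * joint[(water, sanitation)] / column_total(sanitation)


def row_percent(water, sanitation):
    """Cell as a percent of its row (third table)."""
    return 100 * joint[(water, sanitation)] / row_total(water)


def expected_joint_if_independent(water, sanitation):
    """Joint percent implied by the margins if the two variables were independent."""
    return row_total(water) * column_total(sanitation) / 100


print(round(column_percent(0, 1), 1))  # close to the 45.2% in the second table
print(round(row_percent(1, 1), 1))     # close to the 84.0% in the third table
print(round(expected_joint_if_independent(1, 1), 1))  # compare with the observed 37.6%
```

The observed 37.6% is well above the value expected under independence, which is consistent with the two deprivations being positively associated.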

Question 2: Education (22 marks)

a. The following regression can be used to analyse the Indicator around the level of education in the
household.

Dependent Variable: the variable indicating that no person in the household has 5+ years of
education.

Explanatory Variable (Mean): takes the value 1 for all households.

Dependent Variable: No person has 5+ years of Education

SUMMARY OUTPUT

Regression Statistics
Multiple R 0.4734051
R Square 0.2241124
Adjusted R Square 0.2240251
Standard Error 0.4170146
Observations 11463

ANOVA
df SS MS F Significance F
Regression 1 575.7446567 575.7447 3310.758 0
Residual 11462 1993.255343 0.173901
Total 11463 2569

Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 99.0% Upper 99.0%
Intercept 0 #N/A #N/A #N/A #N/A #N/A #N/A #N/A
Mean 0.2241124 0.003894952 57.53918 0 0.216478 0.231747 0.214078 0.234147

i. Give a 95% Confidence Interval for proportion of households in Timor-Leste that have no person
with 5+ years of education. Interpret that confidence interval in everyday language.
The 95% confidence interval is 0.216478 to 0.231747. We are 95% confident that the true proportion of households in Timor-Leste in which no person has 5+ years of education lies between about 21.6% and 23.2%. (3 marks)

ii. The confidence interval is quite narrow in this case. What do you think is the main reason for
that?
The interval is narrow mainly because the sample is very large (11,463 households), which makes the standard error of the estimate very small (0.0039).
(2 marks)

iii. Use the Confidence Interval to explain carefully the idea of repeated sampling and what the strict
interpretation of a confidence interval is, using the repeated sampling approach.
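Repeated sampling means imagining that we draw many separate samples of the same size n from the population and calculate a 95% confidence interval from each one. The sample proportion, and so the interval, would differ from sample to sample. The strict interpretation is that about 95% of the intervals constructed this way would contain the true population proportion; our interval, 0.216478 to 0.231747, is one such interval. (6 marks)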
iv. Compare the 95% Confidence Interval and the 99% Confidence Interval in the output. Why would
the 99% Interval be wider?
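The 99% confidence interval (0.214078 to 0.234147) is wider than the 95% interval because the critical value for 99% is larger than the critical value for 95%: to be more confident of capturing the true proportion, the interval has to cover a wider range. (3 marks)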
v. Estimate the proportion of households in Timor-Leste that have at least one person with 5+ years
of education.
The estimated proportion of households with at least one person with 5+ years of education is 1 - 0.2241 = 0.7759. (2 marks)
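As a check on the intervals in the output above, the sketch below rebuilds them from the coefficient and its standard error. It uses normal critical values as an approximation to the t distribution, which is an assumption that is reasonable here because the sample is very large; the function name is illustrative.

```python
from statistics import NormalDist

# Values copied from the "Mean" row of the regression output.
estimate = 0.2241124
std_error = 0.003894952


def confidence_interval(estimate, std_error, level):
    """Two-sided interval: estimate +/- critical value * standard error."""
    critical = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - critical * std_error, estimate + critical * std_error


ci95 = confidence_interval(estimate, std_error, 0.95)
ci99 = confidence_interval(estimate, std_error, 0.99)

print(ci95)  # close to (0.216478, 0.231747) in the output
print(ci99)  # close to (0.214078, 0.234147) in the output
print(1 - estimate)  # proportion with at least one person with 5+ years of education
```

Only the critical value changes between the two intervals, which is why the 99% interval is wider.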

b. In the 2005 Census covering the whole population, it was found that 65% of households had all
school-aged children enrolled in school. Consider the following Excel output – note carefully how
the dependent variable is defined – and then answer the questions below.

Dependent Variable: Any school-aged child not in school, minus 0.35


SUMMARY OUTPUT

Regression Statistics
Multiple R 0.116696
R Square 0.013618
Adjusted R Square 0.013531
Standard Error 0.456665
Observations 11463

ANOVA
df SS MS F Significance F
Regression 1 33.00065 33.00065 158.2441 4.74E-36
Residual 11462 2390.317 0.208543
Total 11463 2423.317

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 0 #N/A #N/A #N/A #N/A #N/A
Mean -0.05366 0.004265 -12.5795 4.73E-36 -0.06202 -0.04529

i. What is the proportion of households with all school-aged children in school for this 2010 sample
of households?
Estimated proportion of households with any school-aged child not in school = -0.05366 + 0.35 = 0.29634. (2 marks)
Thus the proportion of households with all school-aged children in school is 1 - 0.29634 = 0.70366.
ii. Perform a hypothesis test for whether there has been an improvement in school enrolments using
this indicator between 2005 and 2010. Explain carefully all the steps of your hypothesis test.
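Hypotheses (4 marks)
- H0 : B1 = 0 (the proportion of households with a school-aged child not in school is still 0.35, as in 2005)
- H1 : B1 < 0 (the proportion has fallen below 0.35, that is, enrolments have improved)
Here B1 is the coefficient on Mean.

Significance level
α = 0.05

Test statistic and p-value
t = -12.5795; the two-sided p-value is 4.73E-36, so the one-sided p-value is 4.73E-36 / 2

Decision
Reject H0 if the one-sided p-value is less than α. Here 4.73E-36 / 2 is far below 0.05, so we reject H0. There is sufficient evidence that the proportion of households with a school-aged child not in school fell between 2005 and 2010, that is, that school enrolment improved.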
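The arithmetic behind this test can be sketched as follows. The code recomputes the t statistic from the rounded coefficient and standard error in the output and approximates the one-sided p-value with the normal distribution (an assumption; Excel uses the t distribution, so the exact p-value differs slightly).

```python
import math

# Values copied from the "Mean" row of the regression output.
coefficient = -0.05366   # sample proportion not in school, minus 0.35
std_error = 0.004265
census_2005 = 0.35       # 2005 proportion with a child not in school
alpha = 0.05

# Sample proportions implied by the shifted dependent variable.
not_in_school = coefficient + census_2005
all_in_school = 1 - not_in_school

# Lower-tailed test of H0: B1 = 0 against H1: B1 < 0.
t_stat = coefficient / std_error
# erfc keeps precision for extreme tail probabilities.
p_one_sided = 0.5 * math.erfc(-t_stat / math.sqrt(2))

reject_h0 = p_one_sided < alpha
print(round(all_in_school, 5), round(t_stat, 2), reject_h0)
```

`math.erfc` is used rather than a cumulative distribution function because a tail probability this small can underflow to zero when computed as one minus a value close to 1.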

Question 3: Multidimensional Poverty Index (16 marks)

a. Below is a Frequency distribution for the number of areas deprived, along with a chart
representation.

Number of Deprivations per Household

Deprivations Households
Number Percentage
0 160 1.4%
1 622 5.4%
2 913 8.0%
3 1077 9.4%
4 1306 11.4%
5 1758 15.3%
6 2317 20.2%
7 2109 18.4%
8 943 8.2%
9 242 2.1%
10 16 0.1%

i. Comment on the main features of this distribution.


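The mode is 6 deprivations (2,317 households, 20.2% of the sample). The mean is about 5.1 deprivations per household and the median is 5. The distribution is negatively skewed, with a longer tail towards few deprivations: very few households have 0 or 10 deprivations, and about 65% have between 4 and 7. (6 marks)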
ii. If we said a household is multidimensionally poor if they are deprived in more than three areas,
what percent of households would classify as poor in this sample?
(2 marks)
11.4% + 15.3% + 20.2% + 18.4% + 8.2% + 2.1% + 0.1% = 75.7%
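The summary measures for this frequency distribution can be recomputed directly from the household counts in the table. A minimal sketch follows; because it uses counts rather than rounded percentages, the share deprived in more than three areas comes out at 75.8% rather than 75.7%.

```python
# Number of households by number of deprivations, from the frequency table.
counts = {0: 160, 1: 622, 2: 913, 3: 1077, 4: 1306, 5: 1758,
          6: 2317, 7: 2109, 8: 943, 9: 242, 10: 16}

total = sum(counts.values())

# Mean number of deprivations per household (weighted by household counts).
mean = sum(k * n for k, n in counts.items()) / total

# Mode: the number of deprivations with the most households.
mode = max(counts, key=counts.get)


def median_category(counts):
    """Smallest category whose cumulative count reaches half the sample."""
    half = sum(counts.values()) / 2
    running = 0
    for k in sorted(counts):
        running += counts[k]
        if running >= half:
            return k


median = median_category(counts)

# Share of households deprived in more than three areas.
share_poor = sum(n for k, n in counts.items() if k > 3) / total

print(total, round(mean, 2), median, mode, round(100 * share_poor, 1))
```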

b. The most commonly used MPI Index uses weights to calculate the index, on the basis that some of
the 10 indicators might be more important than others. The weights are given in the Background
section at the start of this paper. Briefly comment on these weights, and suggest a couple of
alternative approaches to determining weights. Explain the rationale behind these alternatives.
Each of the three dimensions is given a weight of 1/3, but it is arbitrary to give equal weight to every indicator within a dimension. (6 marks)
c. It has been pointed out that instead of estimating the percent of households that are poor, we
should be estimating the percent of people / individuals who are poor. Generally, poor households
have more people. If this is true on average, does your estimate above understate or overstate the
individual poverty rate? Explain.
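(2 marks)
The household-based estimate understates the individual poverty rate: if poor households contain more people on average, the poor make up a larger share of individuals than of households.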
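A small numerical illustration of this point, using entirely hypothetical household sizes (they are not taken from the survey): when poor households are larger, the share of individuals who are poor exceeds the share of households that are poor.

```python
# HYPOTHETICAL data for illustration only: (is_poor, household_size).
households = [(True, 6), (True, 7), (True, 5), (False, 3)]

# Share of households that are poor.
household_rate = sum(1 for poor, _ in households if poor) / len(households)

# Share of individuals living in poor households.
people_total = sum(size for _, size in households)
people_poor = sum(size for poor, size in households if poor)
individual_rate = people_poor / people_total

print(round(household_rate, 3), round(individual_rate, 3))
```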

Question 4: Subjective Wellbeing (34 marks)

In the survey, a number of questions were asked, and used to construct a measure of subjective
wellbeing for each household. For this measure, people are asked to rate their overall household
wellbeing, on a scale of 1 to 10 (1 = Excellent, 10 = Very bad).

a. Here is a table of descriptive statistics for Subjective Wellbeing.

Subjective Wellbeing

Mean 5.580301841
Standard Error 0.024318159
Median 6
Mode 7
Standard Deviation 2.603633425
Sample Variance 6.778907011
Kurtosis -0.949879552
Skewness -0.266154342
Range 18
Minimum 1
Maximum 19
Sum 63976
Count 11463

Comment on what you learn from these summary statistics about the data, and about the
distribution of Subjective wellbeing. Make sure you cover central tendency, spread, shape, and any
other notable features.
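Central tendency: the mean rating is 5.58, the median is 6 and the mode is 7, so half of households rate their wellbeing at 6 or worse and the most common rating is 7. Spread: the standard deviation is 2.60 and the range is 18 (minimum 1, maximum 19). Shape: the distribution is negatively skewed (skewness -0.27, mean < median < mode), and the negative kurtosis (-0.95) indicates a flatter distribution than the normal. Notable feature: the maximum of 19 lies outside the 1 to 10 scale, which points to a data entry error. (8 marks)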
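Several entries in the descriptive statistics table are linked by simple formulas, which gives a quick consistency check. The sketch below recomputes the standard error, variance and range from the other reported values, and flags values outside the stated 1 to 10 scale; the variable names are illustrative.

```python
import math

# Values copied from the descriptive statistics table.
count = 11463
std_dev = 2.603633425
minimum, maximum = 1, 19

# Rating scale stated in the question text.
SCALE_MIN, SCALE_MAX = 1, 10

std_error = std_dev / math.sqrt(count)   # standard error of the mean
variance = std_dev ** 2                  # sample variance
value_range = maximum - minimum

# True if any reported extreme falls outside the valid rating scale.
out_of_scale = minimum < SCALE_MIN or maximum > SCALE_MAX

print(round(std_error, 6), round(variance, 4), value_range, out_of_scale)
```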
b. Subjective wellbeing is likely to be affected by most or all of the indicators used in constructing
the MPI. To assess this connection, a multiple regression model is estimated as follows.

Dependent variable: Subjective Wellbeing


Explanatory variables: Each of the 10 Indicators.

The model estimates are given below:

Dependent Variable: Subjective Wellbeing
SUMMARY OUTPUT

Regression Statistics
Multiple R 0.925580558
R Square 0.856699369
Adjusted R Square 0.856574238
Standard Error 0.986037477
Observations 11463

ANOVA
df SS MS F Significance F
Regression 10 66565.3972 6656.54 6846.391 0
Residual 11452 11134.435 0.97227
Total 11462 77699.8322

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%


Intercept 0.783579528 0.05300892 14.78203 5.41E-49 0.679672977 0.887486
No person has 5+ years education 0.963730844 0.02344716 41.10225 0 0.917770402 1.009691
Any school age child not enrolled at school 0.535656303 0.02066613 25.91952 5.4E-144 0.495147146 0.576165
Mum or child malnourished 0.365388611 0.01887977 19.35344 3.88E-82 0.328381026 0.402396
A child has died 0.360644892 0.04038762 8.92959 4.93E-19 0.28147824 0.439812
no electricity 1.05426944 0.02438209 43.2395 0 1.006476373 1.102063
sanitation is poor 1.235811687 0.02360058 52.36363 0 1.18955052 1.282073
no access to clean water 1.55288984 0.01993688 77.89031 0 1.513810143 1.59197
Dirt floor 1.154180983 0.02220871 51.96975 0 1.110648115 1.197714
dirty cooking fuel 0.244743216 0.05557211 4.404065 1.07E-05 0.135812363 0.353674
Is asset poor (no bike, motorbike, car, radio, fridge, TV, phone) 1.55564818 0.02481697 62.68487 0 1.50700268 1.604294

i. First, look at the column of numbers titled P-value. What do these values tell you about which of
the indicators are relevant to subjective wellbeing? Explain.
All the p-values are far below 0.05, so every indicator has a statistically significant association with subjective wellbeing. (4 marks)

ii. The coefficient of the intercept is 0.784. How do you interpret this value? Does it make sense?
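For a household that is deprived in none of the 10 indicators, the estimated mean subjective wellbeing is 0.78. This is below the minimum of the 1 to 10 scale, so it cannot be taken literally. (3 marks)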

iii. The coefficient of “no electricity” is 1.054. How do you interpret this value? Do you think the
value is realistic? Why or Why not?
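Holding the other nine indicators constant, households with no electricity have an estimated subjective wellbeing score that is 1.054 points higher on average (that is, worse, since 10 = Very bad) than households with electricity. (4 marks)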

iv. A researcher has reported: “Based on the estimated coefficients, the most important factors that
can improve subjective wellbeing are: Access to clean water, and ownership of certain assets”. How
do you think they came to this conclusion? Do you think this is a valid conclusion in this context?
(3 marks)

v. Interpret the R Square value in the above model (0.857), and the Standard Error (0.986).
(4 marks)
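Both quantities in part v can be traced back to the ANOVA table: R Square is the regression sum of squares as a share of the total, and the Standard Error is the square root of the residual mean square. A minimal sketch using the values printed in the output:

```python
import math

# Values copied from the ANOVA table of the multiple regression.
ss_regression = 66565.3972
ss_residual = 11134.435
ss_total = 77699.8322
df_residual = 11452

# Share of the variation in subjective wellbeing explained by the model.
r_square = ss_regression / ss_total

# Typical size of a residual, in wellbeing scale points.
standard_error = math.sqrt(ss_residual / df_residual)

print(round(r_square, 4), round(standard_error, 4))
```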

vi. Based on this estimated model, if you were giving advice to the government about their spending
priorities for improving people’s subjective wellbeing, what are three things you would say? Make
sure you link your advice to the model results.
(8 marks)
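To see what the estimated coefficients imply for particular households, the fitted model can be evaluated for any pattern of deprivations. The sketch below is illustrative only; the dictionary keys are shortened labels for the indicators in the output, and the helper name is hypothetical.

```python
# Coefficients copied from the regression output (shortened labels).
INTERCEPT = 0.783579528
COEFFICIENTS = {
    "no_5yrs_education": 0.963730844,
    "child_not_enrolled": 0.535656303,
    "malnourished": 0.365388611,
    "child_died": 0.360644892,
    "no_electricity": 1.05426944,
    "poor_sanitation": 1.235811687,
    "no_clean_water": 1.55288984,
    "dirt_floor": 1.154180983,
    "dirty_cooking_fuel": 0.244743216,
    "asset_poor": 1.55564818,
}


def predict_wellbeing(deprivations):
    """Predicted score (1 = Excellent, 10 = Very bad) for a set of deprivations."""
    return INTERCEPT + sum(COEFFICIENTS[name] for name in deprivations)


none = predict_wellbeing([])
water_only = predict_wellbeing(["no_clean_water"])
everything = predict_wellbeing(list(COEFFICIENTS))

print(round(none, 2), round(water_only, 2), round(everything, 2))
```

A household deprived in all ten indicators has a predicted score of about 9.8, close to the worst rating on the scale, while a household with no deprivations is predicted at 0.78, just below the best rating.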

Question 5: Subjective Wellbeing over time (18 marks)

In a separate study, a local organisation has been collecting data on subjective wellbeing of people in
Timor-Leste over several years. They have surveyed 50 people twice per year for the past 10
years, calculating subjective wellbeing each time. Each year one survey is conducted in the wet
season, and one in the dry season. Generally the wet season of the year is a more difficult time for
daily living: more rain means greater humidity, increased risk of water-borne diseases like malaria,
many roads become difficult to pass, etc, etc.

The output below shows the results of the following regression:

Dependent variable: Subjective wellbeing of each individual in each survey.


This is measured the same way as in Question 4 (Scale 1 to 10, 1 = Excellent, 10 = very poor)

Explanatory Variables:
Year = Year of survey; Year = 1 in 2006, through to Year = 10 in 2015
WetSeason = 1 if the survey took place in the wet season, and 0 if in the dry season

Dependent Variable: Subjective Wellbeing


SUMMARY OUTPUT

Regression Statistics
Multiple R 0.814195
R Square 0.662914
Adjusted R Square 0.662237
Standard Error 0.472771
Observations 1000

ANOVA
df SS MS F Significance F
Regression 2 438.2403 219.1202 980.3491 3.8E-236
Residual 997 222.8419 0.223512
Total 999 661.0822

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 5.025232 0.035589 141.2022 0 4.955394 5.09507
Year -0.15357 0.005205 -29.5044 4.9E-138 -0.16379 -0.14336
WetSeason 0.98726 0.029901 33.01799 3.9E-162 0.928584 1.045935

a. Interpret the intercept in this estimated model.


(2 marks)

b. Does the sign of the WetSeason coefficient seem correct to you? What does it say about how
much wellbeing varies with the seasons?
(3 marks)

c. It is claimed that subjective wellbeing has not been changing at all in Timor-Leste over the years,
as the country develops. Use this estimated model to test this claim using an appropriate hypothesis
test.
(4 marks)

d. Use the estimated model to predict average subjective wellbeing in the dry season of 2018.
(3 marks)
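A prediction from this model only requires converting the calendar year to the Year index defined above (Year = 1 in 2006) and substituting into the estimated equation. The sketch below does this for the dry season of 2018; the function names are illustrative.

```python
# Coefficients copied from the regression output.
INTERCEPT = 5.025232
YEAR_COEF = -0.15357
WET_SEASON_COEF = 0.98726


def year_index(calendar_year):
    """Year variable as defined in the question: 2006 -> 1, ..., 2015 -> 10."""
    return calendar_year - 2005


def predict_wellbeing(calendar_year, wet_season):
    """Predicted average score (1 = Excellent, 10 = very poor)."""
    return (INTERCEPT
            + YEAR_COEF * year_index(calendar_year)
            + WET_SEASON_COEF * int(wet_season))


dry_2018 = predict_wellbeing(2018, wet_season=False)
print(year_index(2018), round(dry_2018, 3))  # Year = 13
```

The sample covers Year = 1 to 10 only, so this already extrapolates three years beyond the data. Pushing the same linear trend much further eventually produces predictions below 1, the best possible rating.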

e. Policy makers have asked you to use the model to predict subjective wellbeing in the dry season
of 2030. Why would you think that is not a good idea?
(2 marks)

f. This model has highlighted the fact that subjective wellbeing varies between the seasons. This
might have some implications for the analysis you undertook in Question 4. Why is it potentially
relevant? What effect do you think it might have?
(4 marks)

END OF EXAM

