Monash University
Semester Two Examination Period 2008
Faculty of Business and Economics

EXAM CODES: ETC1000/ETC9000/ETW1000
TITLE OF PAPER: Business and Economic Statistics
EXAM DURATION: 2 hours
READING TIME: 10 minutes

THIS PAPER IS FOR STUDENTS STUDYING AT: (tick where applicable)
[ ] Berwick  [x] Clayton  [x] Malaysia  [ ] Off Campus Learning  [ ] Open Learning
[ ] Caulfield  [ ] Gippsland  [ ] Peninsula  [ ] Enhancement Studies  [ ] Sth Africa
[ ] Pharmacy  [ ] Other (specify)

During an exam, you must not have in your possession a book, notes, paper, electronic device/s, calculator, pencil case, mobile phone or other material/item which has not been authorised for the exam or specifically permitted as noted below. Any material or item on your desk, chair or person will be deemed to be in your possession. You are reminded that possession of unauthorised materials in an exam is a discipline offence under Monash Statute 4.1.

No examination papers are to be removed from the room.

AUTHORISED MATERIALS
CALCULATORS: [x] YES  [ ] NO
OPEN BOOK: [ ] YES  [x] NO
SPECIFICALLY PERMITTED ITEMS: [ ] YES  [x] NO
If yes, items permitted are:

Candidates must complete this section if required to write answers within this paper
STUDENT ID:            DESK NUMBER:

Instruction to Candidates
Candidates should attempt all FOUR questions.

Page 1 of 10

Question 1 (15 marks)

a. In an ETC1000 class, students were asked to sample and weigh 10 chocolates of various sizes from a bag containing 100 chocolates, and use 10 times the total weight of their sample as an estimate of the total weight of the bag of chocolates. In every case over the last three years that the experiment has been done, all the students have over-estimated the weight. Explain why you think this is the case, and what source of bias this represents.
(4 marks)

b. What sort of charts and/or tables are appropriate for the following variables:
i. The result of house auctions on a particular Saturday.
Houses can be sold at auction, sold before, or passed in (i.e. not sold).
ii. For those houses that were sold at auction, the winning bid.
(2+2=4 marks)

c. In a study examining factors related to heart disease, researchers classified 356 volunteer subjects according to their smoking habits and socio-economic status (SES). The results are given in the two-way table below:

                   SES
Smoking    High   Medium   Low   Total
Current      51       22    43     116
Former       92       21    28     141
Never        68        9    22      99
Total       211       52    93     356

Use % of column statistics to summarise the major patterns in the data. Comment on the patterns you see.
(2+2=4 marks)

d. Draw a frequency diagram for a variable with negative skewness.
(2 marks)

e. For a distribution which is approximately symmetric, what can you say about the mean and the median?
(1 mark)

(4+4+4+2+1=15 marks)

Question 2 (35 marks)

Suppose we are interested in the relationship between Australian individuals' beer expenditures and their personal characteristics. We have surveyed 41 persons to obtain data for the following variables for each individual:

Beer: annual expenditure on beer by the individual, in dollars
Age: age of the individual, measured in years
Income: annual income of the individual, measured in thousands of dollars

Education dummies for the highest education level achieved:
E1 = 1 if below Year 12, 0 otherwise
E2 = 1 if Year 12, 0 otherwise
E3 = 1 if trade or diploma, 0 otherwise
E4 = 1 if tertiary degree, 0 otherwise

Gender dummies:
Male = 1 if male, 0 if female
Female = 1 if female, 0 if male

An OLS regression model is estimated that has the beer expenditure as dependent variable and personal characteristics as explanatory variables.
The Excel output for the regression is given below:

SUMMARY OUTPUT: Beer Expenditure

Regression Statistics
Multiple R          0.88027162
R Square            0.77487812
Adjusted R Square   0.73515073
Standard Error      79.6906414
Observations        41

ANOVA
              df   SS            MS           F             Significance F
Regression     6   743206.096    123867.68    19.50488384   1.0389E-09
Residual      34   215920.343    6350.5983
Total         40   959126.439

            Coefficients   Standard Error   t Stat       P-value       Lower 95%     Upper 95%
Intercept   259.42956      53.14190775      4.8816262    2.4436E-05    151.432281    367.4268302
Male        183.142917     24.99020929      7.325867     1.72037E-08   132.356734    233.9290988
Income      2.8312183      0.422400363      6.7026891    1.06917E-07   1.97279805    3.689638557
Age         -8.89824766    1.457714593      -6.1042454   6.30902E-07   -11.860678    -5.935817131
E2          73.1265311     36.67696889      1.993797     0.054245023   -1.4099886    147.6630508
E3          -6.30519912    39.64470038      -0.1590427   0.874576244   -86.872871    74.26247259
E4          -42.033194     52.53710792      -0.8000667   0.429227051   -148.80137    64.73498794

a. Do you expect a correlation between annual beer expenditure and income? Do you expect zero correlation, negative correlation or positive correlation? Explain your thinking.
(3 marks)

b. Write out clearly the estimated regression equation based on the above OLS output.
(3 marks)

c. Interpret the estimated R² and the Standard Error of the regression.
(4 marks)

d. Interpret the estimated coefficient for Male. Test whether males on average spend more on beer consumption than females, using a 5% significance level.
(5 marks)

e. Would adding the additional variable Female improve the model?
(1 mark)

f. Interpret the estimated coefficient for Income and the associated 95% confidence interval.
(4 marks)

g. Interpret the meanings of the estimated coefficients for the educational dummy variables E2, E3 and E4. Comment on the significance (at 5% significance level) of these educational dummies.
(6 marks)

h. Use the estimated model to predict the beer expenditure of a 25-year-old male Australian who has a diploma qualification and who earns $45,000 per year.
(3 marks)

i. How could the above beer consumption model be improved?
(2 marks)

j. Why is the analysis of residual plots important in regression modelling? What would you have looked for in a residual plot?
(4 marks)

(3+3+4+5+1+4+6+3+2+4=35 marks)

Question 3 (15 marks)

a. List the four components in a time series model. Which appear to be present in Graph A below?
(4 marks)

[Graph A: time series plot with a yearly axis from 1990 to 2002 and a fitted trend line y = 1.56x + 258.12]

Consumption data for a small country is available for the last (completed) 10 years. Quarterly data, 1998.1 to 2007.4, n = 40, t = 1 in 1998.1. A linear trend model was fitted giving the output below:

SUMMARY OUTPUT

Regression Statistics
Multiple R       0.9921
R Square         0.9842
Adj. R Sq        0.9824
Stand. Error     629.1498
Observations     40

ANOVA
              df   SS          MS          F        Significance F
Regression     4   862099302   215524825   544.49   5.556E-31
Residual      35   13854030    395829
Total         39   875953332

            Coefficients   Standard Error   t Stat   P-value     Lower 95%   Upper 95%
Intercept   58256.87       269.52           216.15   2.665E-56   57711.70    58806.03
t           380.68         8.66             43.97    3.207E-32   363.10      398.26
Q2          1078.82        281.50           3.83     0.005055    507.36      1650.29
Q3          1995.36        281.90           7.08     3.028E-08   1423.07     2567.84
Q4          4637.08        281.50           16.47    4.612E-18   4065.61     5208.55

b. Interpret carefully all of the estimated coefficients in this model, including the intercept.
(4 marks)

c. The coefficient of Q4 in this model is β4. Use an appropriate hypothesis test to determine (at the 5% level of significance) whether β4 = 0 or β4 > 0. What does that tell us about the first quarter of each year?
(4 marks)

d. Forecast consumption for each of the four quarters of 2010 using the above model.
(3 marks)

(4+4+4+3=15 marks)

Question 4 (35 marks)

a. Are each of the following statements true or false?
i. When calculating confidence intervals for the mean, we use the t-distribution rather than the normal distribution because we know the population standard deviation.
ii. The Central Limit Theorem only applies to the Normal Distribution.
iii. If a two-sided test is used when a one-sided test is appropriate, then the probability of a Type 2 error increases.
iv. Regression can be used to test a hypothesis involving the difference between two means.
v. The degrees of freedom is always n-1 where n is the number of data points.
(1+1+1+1+1=5 marks)

b. The following table gives the frequency of the various grades of results for a particular university subject over the past five years.

Result Grade   Males   Females
HD               363       185
D               1225       782
C                 79       695
P                658   [illegible]
F                222       108

Give the conditional probability distribution of grades for Males and for Females. Based on these conditional distributions, determine whether university grades for this subject are independent of gender.
(1+1+1=3 marks)

c. Grade point averages (GPA) are calculated by giving a score of 4 for HDs, a score of 3 for Ds, a score of 2 for Cs, a score of 1 for Ps, and a score of 0 for Fs. Calculate the expected value of the GPA for Males using the conditional distribution of grades for Males calculated in (b).
(4 marks)

d. Every day, a sample of 12 tins of paint is taken from a production line and the volume of paint in each tin is determined. If the volume in the individual tins has a mean of 4.05 litres and the standard deviation is 0.05 litres, what is the effective range of the daily averages (using the rule that a standard normal distribution lies between -3 and 3 with probability 0.997)?
(4 marks)

e. The output below relates to the width-to-length ratios of beaded rectangles used by the Shoshoni Indians of America to decorate their leather goods.
Show how the upper bound of the 95% confidence interval has been calculated using the regression coefficient, the standard error (be careful, there are two in the output!), and using TINV(.05,19)=2.093024. Is the data consistent with the golden rectangle (for which the width-to-length ratio is 0.618) just as it was for the Greeks and Egyptians?

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.99081002
R Square            0.981704496
Adjusted R Square   0.929072917
Standard Error      0.092510881
Observations        20

ANOVA
              df   SS         MS         F          Significance F
Regression     1   8.725205   8.725205   1019.507   2.66212E-17
Residual      19   0.162607   0.008558
Total         20   8.887812

                     Coefficients   Standard Error   t Stat     P-value    Lower 95%     Upper 95%
Intercept            0              #N/A             #N/A       #N/A       #N/A          #N/A
Width-length ratio   0.6605         0.020686062      31.92971   5.66E-18   0.617203575   0.703796425

(3+2=5 marks)

f. A production plant cost-control engineer is responsible for cost reduction. One of the costly items in his plant is the amount of water used by the production facilities each month. He decided to investigate water usage by collecting seventeen observations on his plant's water usage (the dependent variable) and relevant explanatory variables.

Variable      Description
Temperature   Average monthly temperature (°F)
Production    Amount of production (million pounds)
Persons       Number of persons on the monthly plant payroll
Water         Monthly water usage (gallons)

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.794924582
R Square            0.631905091
Adjusted R Square   0.546960112
Standard Error      300.6688291
Observations        17

ANOVA
              df   SS            MS            F             Significance F
Regression     3   2017440.117   672480.0389   7.438992841   0.003783139
Residual      13   1175191.413   90399.33946
Total         16   3192631.529

              Coefficients   Standard Error   t Stat         P-value       Lower 95%      Upper 95%
Intercept     3865.943787    1102.644892      3.506064205    0.003868799   1483.824328    6248.063247
Temperature   8.036413485    5.630357078      1.427336379    0.177060319   -4.127233448   20.20006042
Production    0.193340616    0.054354642      3.557021217    0.003508914   0.075914551    0.310766681
Persons       -19.67626463   8.742446806      -2.250658776   0.042355879   -38.56317265   -0.789356607

i. The direction of the Temperature effect on water usage is unclear. Set up appropriate null and alternative hypotheses and use the critical value approach to determine whether to support or reject your hypotheses. Test at the 10% level. (TINV(.05,13)=2.160, TINV(.10,13)=1.771, TINV(.20,13)=1.350)
(4 marks)

ii. It was expected that higher production would lead to more water usage. Set up appropriate null and alternative hypotheses and use the p-value approach to determine whether to support or reject your hypotheses. Test at the 1% level.
(4 marks)

iii. It was thought that by employing more people, greater water efficiencies would be obtained. Set up appropriate null and alternative hypotheses and use both the critical value approach and the p-value approach to determine whether to support or reject your hypotheses. Test at the 5% level. (TINV(.025,13)=2.533, TINV(.05,13)=2.160, TINV(.10,13)=1.771)
(6 marks)

(5+3+4+4+5+4+4+6=35 marks)

End of Examination
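
Note on the printed output for Question 4(e): the interval bounds shown there can be reproduced from the other quantities quoted in the question. The sketch below is only an illustration, assuming the usual form coefficient ± t critical value × standard error of the coefficient; the function and variable names are hypothetical and do not come from the paper, and all numbers are copied from the Question 4(e) output.

```python
# Illustrative check of the 95% interval in the Question 4(e) output.
# Assumption: bounds = coefficient +/- t_crit * standard error of the coefficient.
# Names below are hypothetical; the numbers are those printed in the question.

def confidence_bounds(coef, std_err, t_crit):
    """Return (lower, upper) bounds for a coefficient."""
    margin = t_crit * std_err
    return coef - margin, coef + margin

coef = 0.6605            # Width-length ratio coefficient
std_err = 0.020686062    # standard error of the coefficient (not of the regression)
t_crit = 2.093024        # TINV(.05,19) as quoted in the question

lower, upper = confidence_bounds(coef, std_err, t_crit)
t_stat = coef / std_err  # should agree with the printed t Stat

print(f"lower = {lower:.6f}, upper = {upper:.6f}, t = {t_stat:.4f}")
```

The same three inputs (coefficient, its standard error, and the matching TINV value) would apply to any row of the other coefficient tables in the paper.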
