CONFIDENTIAL                                        CS/DEC 2016/STA408

UNIVERSITI TEKNOLOGI MARA
FINAL EXAMINATION

COURSE        : STATISTICS FOR SCIENCE AND ENGINEERING
COURSE CODE   : STA408
EXAMINATION   : DECEMBER 2016
TIME          : 3 HOURS

INSTRUCTIONS TO CANDIDATES

1. This question paper consists of five (5) questions.
2. Answer ALL questions in the Answer Booklet. Start each answer on a new page.
3. Do not bring any material into the examination room unless permission is given by the invigilator.
4. Please check to make sure that this examination pack consists of:
   i) the Question Paper
   ii) a two-page Appendix 1
   iii) an Answer Booklet - provided by the Faculty
   iv) Statistical Tables - provided by the Faculty
5. Answer ALL questions in English.

DO NOT TURN THIS PAGE UNTIL YOU ARE TOLD TO DO SO
This examination paper consists of 9 printed pages.

© Hak Cipta Universiti Teknologi MARA                     CONFIDENTIAL

QUESTION 1

a) 5% of all old sawmill sites contain soil residuals of dioxin (an additive used for anti-sap-stain treatment in wood) higher than the recommended level. 50 old sawmill sites are randomly selected for inspection.

   i) Find the probability that none of the sites exceed the recommended level of dioxin.
      (2 marks)
   ii) Calculate the probability that more than 47 sites do not exceed the recommended level of dioxin.
      (3 marks)
   iii) By using Poisson approximation, find the probability that at most 5 sites exceed the recommended level of dioxin.
      (3 marks)
   iv) How many sites do you expect would contain soil residuals of dioxin higher than the recommended level?
      (2 marks)

b) The lengths of steel beams produced in a factory follow a normal distribution with mean μ = 8.25 m and standard deviation σ = 0.07 m.

   i) What is the probability that the length of a steel beam is between 8.2 m and 8.4 m?
      (3 marks)
   ii) From a random sample of 10 factories, find the probability that the mean length of a steel beam is greater than 8.3 m.
      (4 marks)
   iii) It is in the factory's interests to be able to supply 98% of its output to a customer that requires beams that are not longer than 8.35 m. The factory decides to improve the precision of the production process by fine-tuning its machinery, which will alter the standard deviation, σ. What value of σ should the factory aim for so that it can supply 98% of its output to this customer?
      (3 marks)

QUESTION 2

a) Fill in the blanks.

   i) The sample ________ coefficient quantifies the direction and strength of the linear association between the two variables.
      (1 mark)
   ii) In simple linear regression, the number of independent variable(s) is ________.
      (1 mark)
   iii) The correlation between two variables can be shown graphically by a ________ diagram.
      (1 mark)
   iv) The simple linear regression model assumes there is a ________ relationship between the dependent variable and the independent variable.
      (1 mark)
   v) If r = -1, then we can conclude that there is a ________ negative relationship between X and Y.
      (1 mark)

b) Suppose we want to assess the association between HDL cholesterol (in milligrams per deciliter, mg/dL) and body mass index (BMI, measured as the ratio of weight in kilograms to height in meters squared). The BMI and HDL cholesterol of 10 randomly selected samples are given in the following table.

   BMI   20.3   21.6   21.9   24   25   27.2   30   32   26.8   31.8
   HDL   [values illegible in the source]

   The data was analyzed and the output is shown below.

   Regression Analysis: HDL versus BMI

   Predictor   Coef     SE Coef   T       P
   Constant    93.399   7.556     12.36   0.000
   BMI         1.6032   0.2843    5.64    0.000

   S = 3.68369   R-Sq = 79.9%   R-Sq(adj) = 77.4%

   Analysis of Variance

   Source           DF   SS       MS       F       P
   Regression       1    431.54   431.54   31.80   0.000
   Residual Error   8    108.56   13.57
   Total            9    540.10

   i) State the independent and dependent variables.
      (2 marks)
   ii) Write the equation of the line that best describes the association between the independent and dependent variables. Interpret the meaning of the regression coefficients, a and b.
      (4 marks)
   iii) Determine the correlation coefficient, r.
      (2 marks)
   iv) At the 5% level of significance, test whether the linear regression model is significant.
      (4 marks)
   v) Determine the coefficient of determination and interpret its meaning.
      (2 marks)
   vi) Estimate Ahmad's HDL cholesterol level if his BMI is 26.
      (1 mark)

QUESTION 3

a) An environmentalist wanted to determine if there are differences in the mean acidity of rain in three locations, A, B and C. He randomly selected six pH readings of the rain at each of the three locations and obtained the data in the following table.

   A      B      C
   5.41   4.87   5.46
   5.39   5.18   6.29
   4.90   4.40   5.57
   5.14   5.12   5.15
   4.80   4.89   5.45
   5.24   5.06   5.30

   The MINITAB output for the above analysis is given below.

   One-way ANOVA: A, B, C

   Source   DF   SS      MS      F      P
   Factor   2    1.168   0.584   5.81   0.014
   Error    15   1.507   0.100
   Total    17   2.674

   S = 0.3170   R-Sq = 43.66%   R-Sq(adj) = 36.14%

   i) Show that the total sum of squares is 2.674.
      (5 marks)
   ii) Test whether the mean acidity of rain is different for the three locations. Use α = 0.05. Give your conclusion.
      (5 marks)

b) A businessman is interested in examining the effects of types of popper and brands of popcorn on the yield of popcorn. Two types of popper and three brands of popcorn are considered. The measurements, in cups of yield, are taken for each combination and the results are shown in the following table.

                     Brand
   Popper   X           Y             Z
   Oil      3, 4, 3.5   6, 5.5, 5.5   4, 4.5, 4.5
   Air      4.5, 5, 4   7, 7, 6.5     5, 5.5, 5

   The data is analysed. The interaction test conducted showed that there is no interaction between type of popper and brand of popcorn. The output is shown below.

   Two-way ANOVA: yield versus popper, brand

   Source        DF   SS        MS        F
   popper        1    4.5000    4.50000   e
   brand         2    c         7.87500   56.70
   Interaction   a    0.0833    d         0.30
   Error         b    1.6667    0.13889
   Total         17   22.0000

   S = 0.3727   R-Sq = 92.42%   R-Sq(adj) = 89.27%

   Based on the computer output, answer the following questions.

   i) Find the missing values (a, b, c, d, e) in the ANOVA table.
      (5 marks)
   ii) Is there sufficient evidence at the 5% level of significance to conclude that brands of popcorn have an effect on the yield of popcorn?
      (5 marks)

QUESTION 4

a) The two methods of learning a computer program are visual and textual manuals. A study was conducted to investigate whether there is a difference in the comprehension scores among students between the two methods. A random sample of 36 students is selected and equally divided into two groups. Students from Group 1 use the visual manual, while Group 2 students use the textual manual.

   Answer parts (i), (ii) and (iii) based on the computer output.

   i) Write down a 90% confidence interval for the mean of the comprehension scores for students that use the visual manual. Is there sufficient evidence to conclude that the students learning a computer program using the visual manual will get at most 70 marks in their comprehension score? Give a reason for your answer.

      One-Sample T

      N    Mean      StDev     SE Mean   90% CI
      18   62.2000   12.0000   2.8284    (57.2796, 67.1204)

      (3 marks)
   ii) Using the p-value approach, is there sufficient evidence to indicate a significant difference in the average comprehension score between students using the visual manual and students using the textual manual? Test at the 5% significance level.

      Two-Sample T-Test and CI

      Sample   N    Mean    StDev   SE Mean
      1        18   62.2    12.0    2.8
      2        18   54.40   9.00    2.1

      Difference = mu (1) - mu (2)
      Estimate for difference: 7.80000
      95% CI for difference: (0.61493, 14.98507)
      T-Test of difference = 0 (vs not =): T-Value = 2.21  P-Value = 0.034  DF = 34
      Both use Pooled StDev = 10.6066

      (5 marks)
   iii) Based on the output below, investigate if there is a difference in the variability of the comprehension score between students using the visual manual and students using the textual manual. Test at the 5% significance level.

      Test and CI for Two Variances

      Method
      Null hypothesis          σ(First) / σ(Second) = 1
      Alternative hypothesis   σ(First) / σ(Second) ≠ 1
      Significance level       α = 0.05

      Statistics
      Sample   N    StDev    Variance   95% CI for StDevs
      First    18   12.000   144.000    (9.005, 17.990)
      Second   18   9.000    81.000     (6.753, 13.492)

      Ratio of standard deviations = 1.333
      Ratio of variances = 1.778

      Tests
      Method   DF1   DF2   Test Statistic   P-Value
      F        17    17    1.78             0.246

      (5 marks)

b) The shell thickness (in millimeters) of birds' eggs recorded for 10 randomly selected eggs is shown as follows:

   0.15   0.29   0.32   0.39   0.25   0.18   0.26   0.37   0.35   0.20

   i) Calculate the unbiased estimate of the population variance, σ².
      (2 marks)
   ii) At the 5% significance level, test the hypothesis that the true standard deviation of shell thickness of the eggs is less than 1 millimeter. Assume that the shell thickness of the eggs is normally distributed.
      (5 marks)

QUESTION 5

a) Determine whether each of the following statements is TRUE or FALSE.

   i) A researcher will use the F distribution to make inference on a population variance.
      (1 mark)
   ii) If a small sample is selected randomly from a normal population with known variance, the t distribution will be used to construct a 99% confidence interval for μ.
      (1 mark)
   iii) The significance level, α, is the probability of rejecting H0 when it is in fact true.
      (1 mark)
   iv) Do not reject H0 if the p-value is less than or equal to the significance level.
      (1 mark)
   v) The t distribution always has n degrees of freedom.
      (1 mark)

b) In an effort to improve the performance of students in a Statistics course, a lecturer provides a weekly one-hour tutorial for 10 selected students. Each student was given a test twice, before and after completing a 14-week session. The test score results are given below.

   Student   1    2    3    4    5    6    7    8    9    10
   Before    45   55   40   70   63   35   61   30   50   44
   After     [values illegible in the source]

   The data collected was analysed and the output is shown below.

   Paired T-Test and CI: before, after

   Paired T for before - after

                N    Mean       StDev     SE Mean
   before       10   49.3000    12.8932   4.0772
   after        10   64.4000    7.6478    2.4184
   Difference   10   -15.1000   12.1879   3.8541

   95% upper bound for mean difference: -8.0349
   T-Test of mean difference = 0 (vs < 0): T-Value = -3.92  P-Value = 0.002

   Based on the output given, answer the following questions.

   i) Show that the standard deviation of the paired differences is 12.1879.
      (4 marks)
   ii) Test at the 5% significance level whether attending the tutorial helped to improve the students' performance.
      ( marks)

c) A study was conducted to identify the number of years taken by students to pay up their study loans after they graduate. A random sample of 30 graduates was selected and a mean of 13 years and a standard deviation of 4.71 years were obtained.

   i) Construct a 90% confidence interval for the mean years taken to pay up the study loans.
      ( marks)
   ii) Based on the confidence interval obtained in (i), can we conclude that a graduate will take at most 17 years to pay up his/her study loans?
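      (1 mark)

END OF QUESTION PAPER

APPENDIX 1(1)

FORMULAE

Binomial probability formula

$$P(X = x) = \binom{n}{x} p^{x} (1 - p)^{n - x}, \quad x = 0, 1, \ldots, n$$

Poisson probability formula

$$P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots$$

CONFIDENCE INTERVALS

Each entry gives the parameter and description, followed by a $(1 - \alpha)100\%$ confidence interval.

Mean $\mu$, variance $\sigma^2$ unknown, small sample

$$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}; \quad df = n - 1$$

Difference in means $\mu_1 - \mu_2$, variances $\sigma_1^2 = \sigma_2^2$ and unknown

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}; \quad df = n_1 + n_2 - 2, \quad s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$

Difference in means $\mu_1 - \mu_2$, variances $\sigma_1^2 \neq \sigma_2^2$ and unknown

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}; \quad df = \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$

Mean difference for paired samples, $\mu_d$

$$\bar{d} \pm t_{\alpha/2} \frac{s_d}{\sqrt{n}}; \quad df = n - 1 \text{ where } n \text{ is no. of pairs}$$

Variance, $\sigma^2$

$$\frac{(n - 1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_{1 - \alpha/2}}; \quad df = n - 1$$

Ratio of the variances $\sigma_1^2 / \sigma_2^2$

$$\frac{s_1^2}{s_2^2} \cdot \frac{1}{F_{\alpha/2;\, v_1, v_2}} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2}{s_2^2} \cdot F_{\alpha/2;\, v_2, v_1}; \quad v_1 = n_1 - 1, \quad v_2 = n_2 - 1$$

APPENDIX 1(2)

HYPOTHESIS TESTING

Each entry gives the null hypothesis, followed by the test statistic.

$H_0: \mu = \mu_0$, $\sigma^2$ unknown

$$t_{calc} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}; \quad df = n - 1$$

$H_0: \mu_1 - \mu_2 = D$, variances $\sigma_1^2 = \sigma_2^2$ and unknown

$$t_{calc} = \frac{(\bar{x}_1 - \bar{x}_2) - D}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}; \quad df = n_1 + n_2 - 2$$

$H_0: \mu_1 - \mu_2 = D$, variances $\sigma_1^2 \neq \sigma_2^2$ and unknown

$$t_{calc} = \frac{(\bar{x}_1 - \bar{x}_2) - D}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}; \quad df = \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$

$H_0: \mu_d = D$

$$t_{calc} = \frac{\bar{d} - D}{s_d / \sqrt{n}}; \quad df = n - 1 \text{ where } n \text{ is no. of pairs}$$

$H_0: \sigma^2 = \sigma_0^2$

$$\chi^2_{calc} = \frac{(n - 1)s^2}{\sigma_0^2}; \quad df = n - 1$$

$H_0: \sigma_1^2 = \sigma_2^2$

$$F_{calc} = \frac{s_1^2}{s_2^2}; \quad v_1 = n_1 - 1, \quad v_2 = n_2 - 1$$

CORRELATION AND REGRESSION

Product Moment Correlation

$$r = \frac{SS_{xy}}{\sqrt{SS_{xx}\, SS_{yy}}}$$

where

$$SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n}, \quad SS_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}, \quad SS_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}$$

Least-squares regression line of Y against X

$$\hat{y} = a + bx, \quad \text{where } b = \frac{SS_{xy}}{SS_{xx}} \text{ and } a = \bar{y} - b\bar{x}$$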
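
As an illustration only, some of the Appendix 1 formulae can be sketched in a few lines of Python. The sketch below evaluates the binomial and Poisson probability formulae for n = 50 and p = 0.05 (the setting of Question 1a) and applies the correlation and least-squares formulae to a small hypothetical data set that is not taken from this paper. The function names, and the computational forms used for SSxy, SSxx and SSyy, are assumptions made for this example.

```python
from math import comb, exp, factorial, sqrt


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**x / factorial(x)


def least_squares(xs, ys):
    """Return (a, b, r) from the SSxy, SSxx and SSyy sums of squares."""
    n = len(xs)
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    ss_xx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    ss_yy = sum(y * y for y in ys) - sum(ys) ** 2 / n
    b = ss_xy / ss_xx                    # slope
    a = sum(ys) / n - b * sum(xs) / n    # intercept: y-bar minus b times x-bar
    r = ss_xy / sqrt(ss_xx * ss_yy)      # product moment correlation
    return a, b, r


# Binomial with n = 50, p = 0.05 and its Poisson approximation (lambda = n * p).
n, p = 50, 0.05
lam = n * p
binom_cdf_5 = sum(binomial_pmf(x, n, p) for x in range(6))
poisson_cdf_5 = sum(poisson_pmf(x, lam) for x in range(6))
print(f"Binomial P(X <= 5) = {binom_cdf_5:.4f}")
print(f"Poisson  P(X <= 5) = {poisson_cdf_5:.4f}")

# Hypothetical toy data (not from the paper), chosen so that y = 2x exactly.
a, b, r = least_squares([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(f"a = {a:.4f}, b = {b:.4f}, r = {r:.4f}")
```

Comparing the two cumulative probabilities gives a feel for how close the Poisson approximation is when n is large and p is small. The toy regression data lie exactly on a line, so the fitted slope, intercept and correlation are easy to check by hand.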
