
Stat 112

1. When, in 1982, average Scholastic Achievement Test (SAT) scores were first published on a state-by-state basis in the United States, the huge variation in the scores was a source of great pride for some states and of consternation for others. Average scores ranged from a low of 790 (out of a possible 1,600) in South Carolina to a high of 1,088 in Iowa. Two researchers set out to figure out how certain variables are associated with state SAT differences [1]. The variable SAT is the average total SAT (verbal + quantitative) score in the state, and the two explanatory variables considered are the following:

- Takers: percentage of the total eligible students (high school seniors) in the state who took the exam
- Expend: total state expenditure on secondary schools, expressed in hundreds of dollars per student

Response SAT: Whole Model

[Figure: Actual by Predicted plot, SAT Actual versus SAT Predicted]

Summary of Fit

| Statistic | Value |
|---|---|
| RSquare | 0.808786 |
| RSquare Adj | 0.800472 |
| Root Mean Square Error | 31.93721 |
| Mean of Response | 948.449 |
| Observations (or Sum Wgts) | 49 |

Analysis of Variance

| Source | DF | Sum of Squares | Mean Square | F Ratio | Prob > F |
|---|---|---|---|---|---|
| Model | 2 | 198456.79 | 99228.4 | 97.2841 | <.0001 |
| Error | 46 | 46919.33 | 1020.0 | | |
| C. Total | 48 | 245376.12 | | | |

Parameter Estimates

| Term | Estimate | Std Error | t Ratio | Prob>\|t\| |
|---|---|---|---|---|
| Intercept | 932.41448 | 22.16843 | 42.06 | <.0001 |
| EXPEND | 4.2985226 | 1.025343 | 4.19 | 0.0001 |
| TAKERS | -3.07411 | 0.2206 | -13.94 | |

Effect Tests

| Source | Nparm | DF | Sum of Squares |
|---|---|---|---|
| EXPEND | 1 | 1 | 17926.44 |
| TAKERS | 1 | 1 | 198071.21 |

[Figure: Residual by Predicted plot, SAT Residual versus SAT Predicted]

[1] B. Powell and L.C. Steelman, "Variations in State SAT Performance: Meaningful or Misleading?", Harvard Educational Review 54(4), 1984: 389-412.

For questions (a)-(e), assume the ideal multiple linear regression model holds.

(a) For Pennsylvania, SAT=885, TAKERS=50 and EXPEND=27.98. What would you predict Pennsylvania's average SAT score to be based on knowing its TAKERS and EXPEND, but not knowing its SAT? What is the residual for Pennsylvania?

(b) Is there strong evidence that the multiple regression model provides better predictions of SAT than just using the sample mean of SAT to predict SAT? Use a test at the .05 level to justify your answer.

(c) Find an approximate 95% confidence interval for the coefficient on TAKERS.

(d) Is there strong evidence that total state expenditure (EXPEND) helps to predict a state's average SAT score once TAKERS has been taken into account? Use a test at the .05 level to justify your answer.

(e) The two states with the largest Cook's distances are Alaska and South Carolina, with Cook's distances of 2.06 and 0.18 respectively and leverages of 0.44 and 0.09 respectively. For each state (Alaska, South Carolina), answer whether it would be justified to delete the state from the analysis and report that we omitted the state and that our conclusions only hold for a reduced range of explanatory variables, not including the explanatory variables of the state.

(f) Suppose we want to use either TAKERS or Log(TAKERS) in the multiple regression. Based on the information below, which of these two forms would you choose? Explain.

Bivariate Fit of SAT By TAKERS

[Figure: scatterplot of SAT versus TAKERS]

Linear Fit

SAT = 1020.3062 - 2.7599621 TAKERS

Summary of Fit

| Statistic | Value |
|---|---|
| RSquare | 0.735838 |
| RSquare Adj | 0.730335 |
| Root Mean Square Error | 36.79525 |
| Mean of Response | 947.94 |
| Observations (or Sum Wgts) | 50 |

Analysis of Variance

| Source | DF | Sum of Squares | Mean Square | F Ratio | Prob > F |
|---|---|---|---|---|---|
| Model | 1 | 181024.09 | 181024 | 133.7066 | <.0001 |
| Error | 48 | 64986.73 | 1354 | | |
| C. Total | 49 | 246010.82 | | | |

Parameter Estimates

| Term | Estimate | Std Error | t Ratio | Prob>\|t\| |
|---|---|---|---|---|
| Intercept | 1020.3062 | 8.139082 | 125.36 | <.0001 |
| TAKERS | -2.759962 | 0.238686 | -11.56 | <.0001 |

[Figure: residuals from the linear fit versus TAKERS]

Fit using Log(TAKERS)

SAT = 1112.2477 - 59.018822 Log(TAKERS)

Summary of Fit

| Statistic | Value |
|---|---|
| RSquare | 0.810762 |
| RSquare Adj | 0.80682 |
| Root Mean Square Error | 31.14298 |
| Mean of Response | 947.94 |
| Observations (or Sum Wgts) | 50 |

Analysis of Variance

| Source | DF | Sum of Squares | Mean Square | F Ratio | Prob > F |
|---|---|---|---|---|---|
| Model | 1 | 199456.33 | 199456 | 205.6494 | <.0001 |
| Error | 48 | 46554.49 | 970 | | |
| C. Total | 49 | 246010.82 | | | |

Parameter Estimates

| Term | Estimate | Std Error | t Ratio | Prob>\|t\| |
|---|---|---|---|---|
| Intercept | 1112.2477 | 12.27496 | 90.61 | <.0001 |
| Log(TAKERS) | -59.01882 | 4.11554 | -14.34 | <.0001 |

[Figure: residuals from the Log(TAKERS) fit versus TAKERS]
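The arithmetic behind parts (a) and (c) of Problem 1 can be sketched in Python from the Parameter Estimates table of the two-variable model (SAT on EXPEND and TAKERS). This is a minimal sketch rather than an official solution: the function names are illustrative, and the interval uses the rule-of-thumb multiplier 2 in place of the exact t quantile with 46 degrees of freedom.

```python
# Coefficients and standard error copied from the Parameter Estimates table
# for the regression of SAT on EXPEND and TAKERS.
INTERCEPT = 932.41448
B_EXPEND = 4.2985226
B_TAKERS = -3.07411
SE_TAKERS = 0.2206


def predict_sat(takers, expend):
    """Predicted average SAT from the fitted two-variable model."""
    return INTERCEPT + B_EXPEND * expend + B_TAKERS * takers


def approx_ci(estimate, std_error, multiplier=2.0):
    """Approximate 95% interval: estimate +/- multiplier * standard error.

    The multiplier 2 is an assumption (a common rule of thumb), not the
    exact t quantile.
    """
    half_width = multiplier * std_error
    return estimate - half_width, estimate + half_width


# Pennsylvania: SAT = 885, TAKERS = 50, EXPEND = 27.98 (values from part (a)).
predicted = predict_sat(takers=50, expend=27.98)
residual = 885 - predicted  # residual = actual - predicted
ci_low, ci_high = approx_ci(B_TAKERS, SE_TAKERS)

print(f"predicted SAT: {predicted:.2f}")
print(f"residual: {residual:.2f}")
print(f"approx. 95% CI for TAKERS: ({ci_low:.3f}, {ci_high:.3f})")
```

Under these assumptions the prediction for Pennsylvania is about 899, so the residual is about -14.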

2. The number of car accidents on a particular stretch of highway seems to be related to the number of vehicles that travel over it and the speed at which they are traveling. A city alderman has decided to ask the county sheriff to provide him with statistics covering the last few years, with the intention of examining these data statistically so that he can (if possible) introduce new speed laws that will reduce traffic accidents. Using the number of accidents as the response variable, he obtains estimates of the number of cars passing along a stretch of road (subtracted from the mean number of cars passing along a stretch of the road) and their average speeds (in miles per hour, subtracted from the mean average speed) for 60 randomly selected days.

(a) JMP output from simple linear regressions of (i) Accidents on Speed and (ii) Cars on Speed is shown below. Would you expect the estimated coefficient on Speed to increase, decrease or stay the same in a multiple linear regression of Accidents on Speed and Cars, as compared to the estimated coefficient on Speed in the simple linear regression of Accidents on Speed? Justify your answer using the omitted variable bias formula.

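(i) Accidents on Speed

| Statistic | Value |
|---|---|
| RSquare | 0.021001 |
| RSquare Adj | 0.004122 |
| Root Mean Square Error | 2.430355 |
| Mean of Response | 7.033333 |
| Observations (or Sum Wgts) | 60 |

Parameter Estimates

| Term | Estimate | Std Error | t Ratio | Prob>\|t\| |
|---|---|---|---|---|
| Intercept | -8.018052 | 13.49733 | -0.59 | 0.5548 |
| Speed | 0.2508495 | 0.224888 | 1.12 | 0.2693 |

(ii) Cars on Speed

| Statistic | Value |
|---|---|
| RSquare | 0.003515 |
| RSquare Adj | -0.01367 |
| Root Mean Square Error | 1.222004 |
| Mean of Response | 9.935 |
| Observations (or Sum Wgts) | 60 |

Parameter Estimates

| Term | Estimate | Std Error | t Ratio | Prob>\|t\| |
|---|---|---|---|---|
| Intercept | 13.003931 | 6.786575 | 1.92 | 0.0603 |
| Speed | -0.051147 | 0.113076 | -0.45 | 0.6527 |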
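The omitted variable bias formula referred to in part (a) says that the Speed coefficient in the simple regression equals the Speed coefficient in the multiple regression plus the Cars coefficient in the multiple regression times the slope from regressing Cars on Speed. The sketch below checks this identity on made-up data; the data, the seed, and the function names are hypothetical and are not the exam's data set.

```python
import random


def cross_dev(x, y):
    """Sum of cross-products of deviations from the means."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y))


def simple_slope(x, y):
    """Least-squares slope of y on x (model includes an intercept)."""
    return cross_dev(x, y) / cross_dev(x, x)


def two_predictor_slopes(x1, x2, y):
    """Least-squares slopes of y on x1 and x2 (model includes an intercept)."""
    s11, s22, s12 = cross_dev(x1, x1), cross_dev(x2, x2), cross_dev(x1, x2)
    s1y, s2y = cross_dev(x1, y), cross_dev(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


# Hypothetical data, generated only to illustrate the identity.
rng = random.Random(0)
speed = [rng.gauss(60, 2) for _ in range(60)]
cars = [10 - 0.05 * s + rng.gauss(0, 1) for s in speed]
accidents = [2 + 0.1 * s + 0.5 * c + rng.gauss(0, 1)
             for s, c in zip(speed, cars)]

b_simple = simple_slope(speed, accidents)   # Accidents on Speed
delta = simple_slope(speed, cars)           # Cars on Speed
b_speed, b_cars = two_predictor_slopes(speed, cars, accidents)

# Omitted variable bias identity: b_simple = b_speed + b_cars * delta
print(b_simple, b_speed + b_cars * delta)
```

The two printed numbers agree up to floating-point error. In the exam's output the Cars-on-Speed slope is negative (-0.051147), so the direction in which the Speed coefficient moves depends on the sign of the Cars coefficient in the multiple regression.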

(b) JMP output from a multiple linear regression of Accidents on Cars, Speed and Cars*Speed is shown below. Is there strong evidence of an interaction between Cars and Speed? Justify your answer using a test at the .05 level.

Response Accidents

Summary of Fit

| Statistic | Value |
|---|---|
| RSquare | 0.743622 |
| RSquare Adj | 0.729887 |
| Root Mean Square Error | 1.265725 |
| Mean of Response | 7.033333 |
| Observations (or Sum Wgts) | 60 |

Analysis of Variance

| Source | DF | Sum of Squares | Mean Square | F Ratio | Prob > F |
|---|---|---|---|---|---|
| Model | 3 | 260.21801 | 86.7393 | 54.1424 | <.0001 |
| Error | 56 | 89.71533 | 1.6021 | | |
| C. Total | 59 | 349.93333 | | | |

Parameter Estimates

| Term | Estimate | Std Error | t Ratio | Prob>\|t\| |
|---|---|---|---|---|
| Intercept | 7.1405117 | 0.163638 | 43.64 | <.0001 |
| Cars | 0.4158119 | 0.136049 | 3.06 | 0.0034 |
| Speed | 0.0644162 | 0.118519 | 0.54 | 0.5889 |
| Cars*Speed | 1.0763228 | 0.087791 | 12.26 | <.0001 |
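In a model with a Cars*Speed term, the change in predicted Accidents per 1 MPH of Speed is not a single number; it depends on Cars. The sketch below uses the estimates from the interaction model's Parameter Estimates table. It assumes that Cars and Speed are the mean-centered variables described in the problem and that the interaction term is their plain product; the Cars values passed in are hypothetical.

```python
# Estimates copied from the Parameter Estimates table for the interaction model.
B_SPEED = 0.0644162
B_INTERACTION = 1.0763228
SE_INTERACTION = 0.087791


def speed_slope(cars):
    """Change in predicted Accidents per 1 MPH increase in Speed,
    at a given (mean-centered) value of Cars."""
    return B_SPEED + B_INTERACTION * cars


def predicted_change(cars, speed_change):
    """Change in predicted Accidents when Speed changes by speed_change MPH
    while Cars is held fixed."""
    return speed_slope(cars) * speed_change


# t ratio for the interaction term: estimate divided by its standard error.
t_interaction = B_INTERACTION / SE_INTERACTION

# Hypothetical Cars values: 1 above and 1 below the mean number of cars.
for cars in (1.0, -1.0):
    print(cars, predicted_change(cars, speed_change=-5))
print(f"t ratio for Cars*Speed: {t_interaction:.2f}")
```

Because the interaction estimate is positive, the fitted effect of a given speed change grows with the number of cars, which is the comparison the weekday versus weekend question turns on.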

(c) The alderman proposes decreasing the speed limit by 5 MPH. The number of cars on the road is higher on average on weekdays than on weekends. Assuming that the average number of cars will not be changed by decreasing the speed limit and that there are no confounding variables, would you expect the decrease in the speed limit to have a larger impact on the number of accidents during the weekends or the weekdays?

3. Car designers have been experimenting with ways to improve gas mileage for many years. An important element in this research is the way in which a car's speed affects how quickly fuel is burned. Competitions whose objective is to drive the farthest on the smallest amount of gas have determined that low speeds and high speeds are inefficient. Designers would like to know which speed burns gas most efficiently. As an experiment, 50 identical cars are driven at different speeds and the gas mileage measured.

(a) JMP output from a simple linear regression model of Mileage on Speed is shown below. Comment on the regression diagnostics: the residual plot, the histogram of the residuals and the boxplot of the Cook's distances. If you see any problems, suggest what you would do next in the analysis to try to address those problems.

Bivariate Fit of Mileage By Speed

[Figure: scatterplot of Mileage versus Speed with linear fit]

Linear Fit

Mileage = 23.266776 - 0.0012701 Speed

Summary of Fit

| Statistic | Value |
|---|---|
| RSquare | 0.000028 |
| RSquare Adj | -0.02081 |
| Root Mean Square Error | 7.102586 |
| Mean of Response | 23.202 |
| Observations (or Sum Wgts) | 50 |

Analysis of Variance

| Source | DF | Sum of Squares | Mean Square | F Ratio | Prob > F |
|---|---|---|---|---|---|
| Model | 1 | 0.0672 | 0.0672 | 0.0013 | 0.9710 |
| Error | 48 | 2421.4426 | 50.4467 | | |
| C. Total | 49 | 2421.5098 | | | |

Parameter Estimates

| Term | Estimate | Std Error | t Ratio | Prob>\|t\| |
|---|---|---|---|---|
| Intercept | 23.266776 | 2.039431 | 11.41 | <.0001 |
| Speed | -0.00127 | 0.034802 | -0.04 | 0.9710 |

[Figure: histogram of the residuals]

[Figure: boxplot of the Cook's distances]

(b) JMP output for a quadratic regression of mileage on speed and speed squared is shown below. Is there strong evidence that the quadratic regression provides better predictions of mileage based on speed than the simple linear regression? Justify your answer using a test at the .05 level.

Response Mileage

[Figure: Actual by Predicted plot, Mileage Actual versus Mileage Predicted; P<.0001, RSq=0.71, RMSE=3.8637]

Summary of Fit

| Statistic | Value |
|---|---|
| RSquare | 0.710249 |
| RSquare Adj | 0.697919 |
| Root Mean Square Error | 3.863732 |
| Mean of Response | 23.202 |
| Observations (or Sum Wgts) | 50 |

Analysis of Variance

| Source | DF | Sum of Squares | Mean Square | F Ratio | Prob > F |
|---|---|---|---|---|---|
| Model | 2 | 1719.8740 | 859.937 | 57.6040 | <.0001 |
| Error | 47 | 701.6358 | 14.928 | | |
| C. Total | 49 | 2421.5098 | | | |

Parameter Estimates

| Term | Estimate | Std Error | t Ratio | Prob>\|t\| |
|---|---|---|---|---|
| Intercept | 9.3413673 | 1.70707 | 5.47 | <.0001 |
| Speed | 0.8021188 | 0.077207 | 10.39 | <.0001 |
| Speed Squared | -0.007876 | 0.000734 | -10.73 | <.0001 |

[Figure: Residual by Predicted plot, Mileage Residual versus Mileage Predicted]

[Figure: leverage plot for Speed, P<.0001]

[Figure: leverage plot for Speed Squared, P<.0001]
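One way to compare the quadratic and linear fits in part (b) is an extra-sum-of-squares F statistic built from the error sums of squares in the two Analysis of Variance tables. With a single added term it should match the square of the t ratio for Speed Squared, up to rounding in the printed output. A minimal sketch with illustrative names; the p-value lookup is left out.

```python
# Error sums of squares copied from the two Analysis of Variance tables.
SSE_LINEAR = 2421.4426      # Mileage on Speed, error DF = 48
SSE_QUADRATIC = 701.6358    # Mileage on Speed and Speed Squared, error DF = 47
DF_ERROR_QUADRATIC = 47
EXTRA_TERMS = 1             # only Speed Squared was added


def partial_f(sse_reduced, sse_full, extra_terms, df_error_full):
    """Extra-sum-of-squares F statistic for nested least-squares models."""
    numerator = (sse_reduced - sse_full) / extra_terms
    denominator = sse_full / df_error_full
    return numerator / denominator


f_stat = partial_f(SSE_LINEAR, SSE_QUADRATIC, EXTRA_TERMS, DF_ERROR_QUADRATIC)
t_ratio = -0.007876 / 0.000734  # estimate / std error for Speed Squared

print(f"partial F: {f_stat:.1f}")
print(f"t ratio squared: {t_ratio ** 2:.1f}")
```

Both numbers come out near 115, far above the usual .05-level critical values for an F statistic with 1 and 47 degrees of freedom.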

(c) Suppose you are low on gas. Which speed does the quadratic regression model suggest is best to drive at: 20 MPH, 50 MPH or 70 MPH? Justify your answer.
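For part (c), the fitted quadratic can be evaluated directly at the three candidate speeds. The sketch below uses the estimates from the quadratic model's Parameter Estimates table; the function and variable names are illustrative, and the comparison ignores the uncertainty in the estimates.

```python
# Estimates copied from the Parameter Estimates table for the quadratic model.
B0 = 9.3413673
B_SPEED = 0.8021188
B_SPEED_SQ = -0.007876


def predicted_mileage(speed):
    """Predicted mileage from the fitted quadratic in Speed."""
    return B0 + B_SPEED * speed + B_SPEED_SQ * speed ** 2


candidates = [20, 50, 70]
predictions = {s: predicted_mileage(s) for s in candidates}
best = max(predictions, key=predictions.get)

# Speed at the top of the fitted parabola (a maximum because B_SPEED_SQ < 0).
peak_speed = -B_SPEED / (2 * B_SPEED_SQ)

for s in candidates:
    print(f"{s} MPH: predicted mileage {predictions[s]:.2f}")
print(f"best of the three: {best} MPH")
print(f"fitted peak at about {peak_speed:.1f} MPH")
```

The fitted curve peaks near 51 MPH, so among the three options 50 MPH has the highest predicted mileage.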
