
Homework 5 Due Tuesday Nov 29th 2022

You will need to refer to the document titled “Hypothesis testing further details” in
order to solve question 3. You are expected to know the concepts and formulae for the
final exam (you will be allowed a cheatsheet).

1) We have already discussed the VW emissions scandal and the corresponding data in
class. In addition to testing the two VW models in their on-road tests, the West Virginia
University researchers also tested a BMW model. (The interesting fact about these tests is
that the researchers were not particularly interested in testing these brands; they had
just settled on these by chance. Had VW not been picked randomly by them, it would
have avoided detection and the ensuing 15 billion dollar scandal.)

Just as with the VW cars, the researchers tested two BMW cars on the road (urban in LA)
and found that the sample mean NOx emitted (in gm/km) for the two cars was 0.07 with a
sample standard deviation of 0.041.

Carry out the test of hypothesis for this data to test whether there is evidence that the
BMW emissions are in excess of the U.S. acceptable standard of 0.04 gm/km. You must
state the appropriate null and alternative hypotheses and support your conclusion with a
p-value.
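
The arithmetic for this question can be sketched from the summary statistics alone. The snippet below is a minimal sketch, not an official solution: it assumes an upper-tailed one-sample t-test with n = 2 cars, and the function name `one_sample_t_upper` is made up for illustration. With n = 2 there is only 1 degree of freedom, and a t distribution with 1 degree of freedom is the standard Cauchy distribution, so the tail area has a closed form and no statistics library is needed.

```python
import math

def one_sample_t_upper(mean, sd, n, mu0):
    """Upper-tailed one-sample t-test from summary statistics.

    Only handles n = 2 (1 degree of freedom), where the t distribution
    is the standard Cauchy distribution and the tail area is exact.
    """
    if n != 2:
        raise ValueError("closed-form p-value below is valid only for n = 2")
    t_stat = (mean - mu0) / (sd / math.sqrt(n))
    # Cauchy upper tail: P(T > t) = 1/2 - arctan(t) / pi
    p_value = 0.5 - math.atan(t_stat) / math.pi
    return t_stat, p_value

# Values quoted in the question: H0: mu = 0.04 versus Ha: mu > 0.04
t_stat, p_value = one_sample_t_upper(mean=0.07, sd=0.041, n=2, mu0=0.04)
print(f"t = {t_stat:.3f}, one-sided p-value = {p_value:.3f}")
```

Keep in mind that a sample of two cars gives the test very little power, so the p-value should be read with that limitation in view.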

2) Several opticians have introduced on-site labs that can prepare eyeglass lenses in one
hour, and conventional opticians are sceptical of the accuracy of these labs. As a test, lens
prescriptions of various reasonably representative types were randomly sent to on-site
labs and the error in the lens curvature was measured carefully. (The error can be negative
or positive, depending on whether the curvature is insufficient or more than
sufficient.) The data on errors for the sample is in the file LENSERROR.

Is there evidence that the average error is different from zero? Support your answer with
a p-value.
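
A two-sided one-sample t-test of this kind can be sketched as follows. This is an illustration under stated assumptions rather than the expected solution: the layout of the LENSERROR file is not described here, so the snippet works on a plain list of numbers, the values in `errors` are placeholders and not the real data, and the helper names are invented. The p-value is obtained by integrating the t density numerically so that only the standard library is required.

```python
import math

def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t_stat, df, steps=2000):
    """P(|T| > |t_stat|), integrating the density with Simpson's rule."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_half = total * h / 3  # area between 0 and |t_stat|
    return max(0.0, 1.0 - 2.0 * central_half)

def one_sample_t(values, mu0=0.0):
    """Return (t statistic, degrees of freedom, two-sided p-value)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    t_stat = (mean - mu0) / math.sqrt(var / n)
    return t_stat, n - 1, two_sided_p(t_stat, n - 1)

# Placeholder numbers, NOT the LENSERROR data
errors = [0.1, -0.2, 0.3, 0.0, 0.2]
t_stat, df, p = one_sample_t(errors)
print(f"t = {t_stat:.3f}, df = {df}, two-sided p-value = {p:.3f}")
```

Statistical software reports the same three quantities directly; the sketch only makes the steps visible.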

3) The files ZAGAT2014Village and ZAGAT2014OuterBoroughs give results from a sample
of restaurant reviews in the 2014 Zagat restaurant guide, of restaurants sampled from
the Village and from outside Manhattan, respectively. The data is on the cost of dinner
(including one drink and tip) and ratings for food, décor and service (the maximum possible
rating is 30 in any category).

An urban planner is interested in testing whether the average cost of dinner (including
one drink and tip) in Zagat-rated restaurants in Manhattan is higher than the
corresponding average cost of dinner in Zagat-rated restaurants outside Manhattan.

a) State the null and alternative hypotheses of interest.
b) Carry out the relevant test of hypothesis and provide your conclusion. In doing
this, you must show the formula for any relevant test statistic that you compute
as well as the number obtained for it.
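
For reference, one common way to write the hypotheses and test statistic for a comparison of two means is shown below. This is a sketch under an assumption: it uses the unpooled (unequal-variance) two-sample form, whereas the document “Hypothesis testing further details” may prescribe a different version, in which case that one takes precedence. The subscripts V (Village sample, standing in for Manhattan) and O (outer boroughs sample) are notation introduced here.

```latex
\[
H_0:\ \mu_V - \mu_O = 0
\qquad \text{versus} \qquad
H_a:\ \mu_V - \mu_O > 0
\]
\[
t \;=\; \frac{\bar{x}_V - \bar{x}_O}
             {\sqrt{\dfrac{s_V^2}{n_V} + \dfrac{s_O^2}{n_O}}}
\]
```

Here \bar{x}, s and n denote the sample mean, sample standard deviation and sample size of the dinner cost in each file.
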
4) The file ToyotaPrices36orMore has data on roughly 1200 second-hand Toyota cars that
were 36 or more months old and were sold in the Netherlands. It contains two columns,
one with the price (in Euros) for which the second-hand car was sold and the other with
the age of the car in months (how old the car model was). Below is a scatterplot of the
price versus the age of the cars.

[Figure: Scatterplot of Price vs AgeInMonths. Vertical axis: Price, 5000 to 20000; horizontal axis: AgeInMonths, 30 to 80.]

a) Fit a simple regression model to predict the car prices based on their age. What is the
equation of the fitted line?

b) Based on the output that you get from the regression, is there evidence of a linear
relationship between price and age of the car? Specify any number(s) that you base your
conclusion on.

c) According to your model, what is the average predicted price in Euros for a
second-hand Toyota which is 40 months old?

d) A second-hand Toyota dealer has a Toyota that is 40 months old and she quotes a
price of 16,100 Euros for it. Based on all the analysis done so far in the earlier questions,
do you think that this price is reasonable or do you think that it is too high for the age
of the car? Justify your answer in a couple of sentences.

e) Make a plot of the standardized residuals versus the fitted values and provide it. Note
that it does not show any pattern (a funnel shape or a U/inverted U shape). However,
after inspecting it, are there any cars in the dataset that seem to be priced unusually
relative to their age? Explain your answer briefly in a sentence or two.
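
The computations behind parts a), c) and e) can be sketched as below. This is a minimal illustration, not the expected solution: the `age` and `price` lists are placeholder numbers and not the ToyotaPrices36orMore data, the function names are invented, and the cut-off of 2 for flagging standardized residuals is only a common rule of thumb. Any statistics package will produce the same quantities from the real file.

```python
import math

def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def standardized_residuals(x, y, b0, b1):
    """Residuals divided by their estimated standard deviations."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s = math.sqrt(sum(e * e for e in resid) / (n - 2))  # residual std. error
    out = []
    for xi, e in zip(x, resid):
        leverage = 1 / n + (xi - x_bar) ** 2 / sxx
        out.append(e / (s * math.sqrt(1 - leverage)))
    return out

# Placeholder numbers, NOT the ToyotaPrices36orMore data
age = [36, 42, 48, 54, 60, 66, 72, 78]
price = [15500, 14200, 13900, 12000, 11800, 10100, 9500, 8600]

b0, b1 = fit_line(age, price)
print(f"fitted line: Price = {b0:.1f} + ({b1:.2f}) * AgeInMonths")
print(f"predicted price at 40 months: {b0 + b1 * 40:.1f}")
flagged = [i for i, r in enumerate(standardized_residuals(age, price, b0, b1))
           if abs(r) > 2]
print("rows with |standardized residual| > 2:", flagged)
```

Plotting the standardized residuals against the fitted values `b0 + b1 * age` gives the diagnostic plot requested in part e).
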
5) In November 2001, the New York Times published an article titled “A small dose of
common sense would help Congress break deadlock over airport security”. The
article considered the different factors that could impact the quality of security
screening at airports. One of the factors that it considered was the turnover rate (a
measure of how quickly employees leave the job) of airport security personnel and
its potential impact on how good the security screening was.

The article mentioned a study that considered the turnover rate at 19 airports
across the country and also the violations detected (per million passengers) at each
of those airports. The article reported that the study found that a lower turnover
rate (i.e. employees stay in their job for a longer period) was associated with a
greater likelihood of detecting violations (i.e. a larger number of violations detected
per million passengers) and thus advocated for measures that would reduce the
turnover rate in order to increase the quality of the security screening.

The original article in the newspaper also had the data for these two variables across
the 19 airports, and you can find that data in the file AirportViol.

Below is a scatterplot of violations detected per million passengers (Y) versus the
turnover rate (X).

[Figure: Scatterplot of ViolDet vs TurnRate. Vertical axis: ViolDet, 5 to 35; horizontal axis: TurnRate, 0 to 400.]

Fit a simple regression model to the above data. Based on the analysis (you need not
worry about checking the assumptions of regression; they are not violated here), is there
evidence that the violations detected per million passengers are related to the turnover
rate at airports?

What does this suggest about the policy prescription of taking measures to reduce the
turnover rate in order to improve security?
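
For reference, the usual way to judge whether Y is linearly related to X in a simple regression is a t-test on the slope. The formulas below are the standard textbook ones written in notation introduced here (b_0 and b_1 for the fitted intercept and slope, s for the residual standard error); they are a sketch of the calculation that regression software performs, not output from the AirportViol data.

```latex
\[
H_0:\ \beta_1 = 0 \qquad \text{versus} \qquad H_a:\ \beta_1 \neq 0
\]
\[
t = \frac{b_1}{\mathrm{SE}(b_1)}, \qquad
\mathrm{SE}(b_1) = \frac{s}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}}, \qquad
s^2 = \frac{\sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2}{n - 2}
\]
```

With n = 19 airports the statistic is compared with a t distribution on n - 2 = 17 degrees of freedom, and the two-sided p-value is the number reported next to the slope in most regression output.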
