
Assignment-3

Q1.)
a.) The data cover the number of cigarettes smoked and the number of deaths from cancer in
the US.
Q2.)
a.)
Plot-1: scatter plot of LUNG (vertical axis) against CIG (horizontal axis).

b.)
Plot-2: LUNG (vertical axis) against CIG (horizontal axis) with fitted line.
f(x) = 0.54207608695448 x + 6.27407356137824
R² = 0.516318125148469
c.) Excel file

d.) Equation: Lung = 6.274 + 0.5421 Cig + U. B0 is 6.274 and B1 is 0.5421, meaning that when the
amount of cigarettes smoked increases by 100 (one unit of Cig), lung cancer deaths are
predicted to increase by 0.5421.
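As a quick check on the interpretation of B1, the fitted line can be evaluated directly. This is only a minimal sketch using the rounded coefficients from the equation above; the helper name predict_lung and the example values Cig = 25 and Cig = 26 are illustrative assumptions, not part of the assignment.

```python
# Rounded coefficients taken from the fitted equation in part d.
B0 = 6.274   # intercept
B1 = 0.5421  # slope on Cig

def predict_lung(cig):
    """Return the fitted Lung value for a given Cig value (hypothetical helper)."""
    return B0 + B1 * cig

# Raising Cig by one unit changes the fitted Lung value by exactly B1.
effect_of_one_unit = predict_lung(26) - predict_lung(25)

print(round(predict_lung(25), 4))
print(round(effect_of_one_unit, 4))
```

The difference between any two fitted values one unit of Cig apart equals the slope, which is what the interpretation of B1 states.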

e.)

Q1 = 21.21, Q3 = 28.09, IQR = Q3 - Q1 = 6.88
Upper bound = Q3 + 1.5 × IQR = 38.41, Lower bound = Q1 - 1.5 × IQR = 10.89
Outliers: Cig = 40.46 and Cig = 42.4, both above the upper bound.
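The bounds above follow the usual 1.5 × IQR rule, which can be sketched as below. The function name iqr_fences and the variable names are assumptions made for illustration; the quartiles and the two Cig values are the ones reported in this answer.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier bounds using the k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles reported in part e.
lower, upper = iqr_fences(21.21, 28.09)

# The two Cig values flagged in part e.
candidates = [40.46, 42.4]
outliers = [x for x in candidates if x < lower or x > upper]

print(round(lower, 2), round(upper, 2))
print(outliers)
```

Both flagged values lie above the upper bound, so both are kept in the outliers list.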

Q3.)
Unrestricted model equation:
Lung = B0 + B1 Cig + B2 (Predicted Lung)^2 + B3 (Predicted Lung)^3 + U
The F statistic is 4.286, while the critical F value at the 0.05 level is 1.298. Because the
F statistic exceeds the critical value, the null hypothesis (B2 = B3 = 0) is rejected, so the
restricted model is not appropriate for this data.
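The comparison between the restricted and unrestricted models rests on the standard F statistic for joint restrictions, which might be sketched as follows. The function names are hypothetical, and the sums of squared residuals in the example call are made-up numbers used only to show the arithmetic; they are not the values from the Excel file. Only 4.286 and 1.298 come from this answer.

```python
def f_statistic(ssr_restricted, ssr_unrestricted, num_restrictions, df_unrestricted):
    """F = ((SSR_r - SSR_u) / q) / (SSR_u / df_u) for q joint restrictions."""
    numerator = (ssr_restricted - ssr_unrestricted) / num_restrictions
    denominator = ssr_unrestricted / df_unrestricted
    return numerator / denominator

def reject_null(f_stat, f_critical):
    """Reject the null hypothesis when the F statistic exceeds the critical value."""
    return f_stat > f_critical

# Illustrative (made-up) inputs: two restrictions, B2 = B3 = 0.
example_f = f_statistic(120.0, 100.0, 2, 40)

# Decision using the values reported in this answer.
decision = reject_null(4.286, 1.298)

print(example_f)
print(decision)
```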
Q4.)
a.) Excel file
b.) Old Regression equation: Lung = 6.274 + 0.5421Cig + U
New Regression equation: Lung = 2.271 + 0.7159 Cig + U
c.) Yes, the coefficients changed: B0 decreased from 6.274 to 2.271, while B1 (the
coefficient on Cig) increased from 0.5421 to 0.7159.
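The size of the change can be worked out from the two reported equations. The sketch below uses the rounded coefficients from part b; the crossing point of the two lines is an extra illustration added here, not something the assignment asked for, and all variable names are assumptions.

```python
# (intercept, slope) pairs from the old and new regression equations.
old = (6.274, 0.5421)
new = (2.271, 0.7159)

intercept_change = new[0] - old[0]
slope_change = new[1] - old[1]

# Cig value at which both lines give the same fitted Lung value:
# old_b0 + old_b1 * x = new_b0 + new_b1 * x
crossing_cig = (old[0] - new[0]) / (new[1] - old[1])

print(round(intercept_change, 4), round(slope_change, 4))
print(round(crossing_cig, 2))
```

Because the new slope is larger, the new line gives higher fitted Lung values than the old one for Cig above the crossing point and lower values below it.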
d.)
Plot-3: LUNG (vertical axis) against CIG (horizontal axis) with fitted line, outliers removed.
f(x) = 0.71591568658058 x + 2.27107202370234
R² = 0.559350926734987
The outlier observations are influential: after removing them the fitted line changed (the
slope rose from 0.5421 to 0.7159 and R² from 0.5163 to 0.5594), and the old plot looked more
stretched, as if the line were being pulled by the outlier observations.
