1. Born to be old? Is there a relationship between the gestational period (time from conception to
birth) of an animal and its average life span? The figure shows a scatterplot of the gestational period
and average life span for 43 species of animals as well as the least-squares regression line.
(b) Point A is the hippopotamus. Would you consider this point an outlier, high leverage point, and/or
influential point? Explain your reasoning for each.
Point A has a large residual, so it is an outlier. It is not a high leverage point
because its gestation value is similar to those of the other data points. Point A is
an influential point because its large residual likely influences the correlation
substantially; if it were removed, the correlation would move closer to 1.
(c) Point B is the Asian elephant. Would you consider this point an outlier, high leverage point, and/or
influential point? Explain your reasoning for each.
Point B is not an outlier because its residual is small. However, it is
a high leverage point because its gestation value is much larger than those of
the rest of the data. Point B is not an influential point: even
though its x-value is far from the rest of the data, its
residual is small, so the slope of the regression line would not change
much if Point B were removed.
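The distinction between high leverage and influence can be checked numerically. The sketch below uses small hypothetical data (not the 43 species in the figure) and a helper named `ls_slope`, both introduced here only for illustration. It compares the least-squares slope with and without one point whose x-value is far from the rest, once when that point lies close to the existing trend and once when it lies far below it.

```python
def ls_slope(xs, ys):
    """Least-squares slope, computed as Sxy / Sxx."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data for illustration only (roughly y = 2x + 1).
xs = [1, 2, 3, 4, 5, 6]
ys = [3.2, 4.8, 7.1, 9.0, 10.9, 13.1]

slope_base = ls_slope(xs, ys)

# High leverage point (x = 20) that follows the existing trend: small residual.
slope_on_line = ls_slope(xs + [20], ys + [41.0])

# Same x = 20, but far below the trend: large residual at an extreme x-value.
slope_off_line = ls_slope(xs + [20], ys + [20.0])

print("slope without extra point:", round(slope_base, 3))
print("slope with high leverage point on trend:", round(slope_on_line, 3))
print("slope with high leverage point off trend:", round(slope_off_line, 3))
```

In this made-up example the slope barely moves when the extreme point follows the trend, which mirrors the reasoning given for Point B, and it changes a great deal when the extreme point is far from the trend.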
2. Penguins diving. A study of king penguins looked for a relationship between how deep the penguins
dive to seek food and how long they stay under water. For all but the shallowest dives, there is a linear
relationship that is different for different penguins. The study gives a scatterplot for one penguin titled
“The Relation of Dive Duration (y) to Depth (x).” Duration y is measured in minutes and depth x is in
meters. The report then says, “The regression equation for this bird is: ŷ = 2.69 + 0.0138x.”
(a) What is the slope of the regression line? Interpret this value.
(b) Does the y intercept of the regression make any sense? If so, interpret it. If not, explain why not.
The y intercept predicts that a dive to a depth of 0 meters would last 2.69
minutes. This does not make sense because a dive cannot have a depth of 0 meters.
(c) According to the regression line, how long does a typical dive to a depth of 200 meters last?
ŷ = 2.69 + 0.0138(200) = 5.45 minutes
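The prediction in part (c) comes from substituting x = 200 into the reported equation. A minimal sketch of that arithmetic follows; the function name `predicted_duration` is an assumption made for illustration, while the intercept and slope are the values quoted from the report.

```python
def predicted_duration(depth_m: float) -> float:
    """Predicted dive duration (minutes) from y-hat = 2.69 + 0.0138x."""
    intercept = 2.69  # minutes, from the reported regression equation
    slope = 0.0138    # minutes per meter of depth, from the same equation
    return intercept + slope * depth_m

# Typical duration for a dive to 200 meters, rounded to two decimals.
print(round(predicted_duration(200), 2))  # 5.45
```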
(d) One of these penguins dives down to 100 meters and it takes a duration of 3 minutes and 15
seconds. To the nearest tenth of a minute, what is this penguin’s residual? Explain in context what this
value means.
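One way to work part (d) is sketched below, assuming the usual definition residual = observed − predicted and converting 3 minutes 15 seconds to 3.25 minutes. The variable names are illustrative only; the equation is the one quoted in the report.

```python
# Reported regression line: y-hat = 2.69 + 0.0138x (minutes vs. meters).
intercept = 2.69
slope = 0.0138

depth_m = 100
observed_min = 3 + 15 / 60                   # 3 min 15 s = 3.25 minutes
predicted_min = intercept + slope * depth_m  # 2.69 + 1.38 = 4.07 minutes

# Residual = observed - predicted.
residual = observed_min - predicted_min

print(round(residual, 1))  # -0.8
```

Under these assumptions the residual is about −0.8 minutes, meaning this dive lasted roughly 0.8 minutes less than the regression line predicts for a 100 meter dive.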
(a) Give the equation of the least-squares regression line for these data. Identify any variables you use.
(c) What’s the correlation between car age and mileage? Interpret this value in context.
r = √0.837 ≈ 0.915
There is a strong, positive linear relationship between car age and mileage.
(d) Is a linear model appropriate for these data? Explain how you know.
A linear model is appropriate for these data because the residual plot shows
random scatter with no pattern.
4.
LSRL: ŷ = 33.12 − 4.686x, r = −0.85
Slope - The number of days in April until the first bloom is predicted to decrease by 4.686
days for each 1 degree Celsius increase in the average March temperature.
Y intercept - The predicted number of days until first bloom would be 33.12 if the average
temperature in March were 0 degrees Celsius.
Correlation - There is a moderately strong negative linear relationship between average March
temperature in Celsius and number of days until the first bloom in April.
(b) Suppose that the average March temperature this year was 8.2°C. Would you be willing to use the
equation in part (a) to predict the date of first bloom? Explain.
No, I would not be willing to use the equation in part (a) to predict the date of the first bloom
because 8.2 degrees Celsius is outside the range of the temperatures in the data.
The pattern may change beyond that range in a way that the regression line does not
represent.
(c) Calculate and interpret the residual for the year when the average March temperature was 4.5°C.
Show your work.
First bloom occurred 2.033 days earlier than the regression line predicted
for the year when the average temperature in March was 4.5 degrees Celsius.
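The work behind this residual can be sketched as follows, using the slope (−4.686) and y intercept (33.12) stated in the interpretations above. The observed value of 10 days is an assumption: it is not given in the text and is inferred here from the stated residual (observed = predicted + residual).

```python
# Regression line from the stated slope and y intercept:
# predicted days = 33.12 - 4.686 * (average March temperature in Celsius)
intercept = 33.12
slope = -4.686

temp_c = 4.5
predicted_days = intercept + slope * temp_c  # 33.12 - 21.087 = 12.033

# ASSUMPTION: observed days to first bloom, inferred from the stated
# residual of -2.033; the original data table is not reproduced here.
observed_days = 10

# Residual = observed - predicted.
residual = observed_days - predicted_days

print(round(predicted_days, 3), round(residual, 3))  # 12.033 -2.033
```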
(b) A different researcher wants to use this data to analyze the sleep recommendations for individuals
up to and including 12 years of age only. Describe the association shown in the scatterplot if you ignore
any of the data values of people over the age of 12.
(c) Explain how your answers for (a) and (b) highlight the problem with extrapolating from a previous
set of data values.
The pattern relating age and recommended hours of sleep changes after age 12,
as the recommended hours of sleep flatten significantly. This shows the difficulty of
extrapolating: had the data past 12 years not been included, the
change in pattern might have remained unknown.