Laboratory 08 Linear Regression Models Solution
Question 1:
An investigation of the relationship between traffic flow, in thousands of cars per day (X), and the lead content of
tree bark near the highway, in µg/g dry weight (Y), produced the data listed below.
a) Calculate the covariance and indicate whether there is a relationship between the variables and, if so, of what type.
X      Y
8.3    227
8.3    312
12.1   362
12.1   521
17     640
17     539
17     728
24.3   945
24.3   738
24.3   759
33.6   1263

Covariance = 2050.0512396694
The positive covariance indicates a direct relationship between the variables, i.e., the greater the traffic flow,
the higher the lead content in the tree bark.
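As a cross-check on part a), the covariance can be recomputed by hand. The sketch below is only illustrative: the names `flow`, `lead` and `population_covariance` are assumptions, not part of the lab. Dividing by n (population covariance) reproduces the value reported above; a sample covariance would divide by n - 1 and give a larger number.

```python
# Traffic flow (thousands of cars per day) and lead content (µg/g) from the table.
flow = [8.3, 8.3, 12.1, 12.1, 17, 17, 17, 24.3, 24.3, 24.3, 33.6]
lead = [227, 312, 362, 521, 640, 539, 728, 945, 738, 759, 1263]


def population_covariance(xs, ys):
    """Average product of deviations from the means (divides by n)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


cov_xy = population_covariance(flow, lead)
print(round(cov_xy, 4))  # 2050.0512, positive: direct relationship
```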
b) Construct a scatter plot, including the linear regression model and its coefficient of determination.
Linear model: y = 36.184x - 12.842
Slope interpretation:
When traffic flow increases by 1,000 cars per day, the average lead content
in tree bark is estimated to increase by 36.184 µg/g.
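The fitted line can be reproduced with the ordinary least-squares formulas b1 = Sxy / Sxx and b0 = mean(y) - b1 * mean(x). This is a minimal sketch assuming the chart trendline is an ordinary least-squares fit; the helper name `fit_line` is hypothetical.

```python
# Traffic flow (thousands of cars per day) and lead content (µg/g) from Question 1.
flow = [8.3, 8.3, 12.1, 12.1, 17, 17, 17, 24.3, 24.3, 24.3, 33.6]
lead = [227, 312, 362, 521, 640, 539, 728, 945, 738, 759, 1263]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(flow, lead)
print(round(slope, 3), round(intercept, 3))  # 36.184 -12.842
```

The slope is the estimated change in lead content for each additional thousand cars per day, which is the quantity interpreted above.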
c) Construct the ANOVA table and the linear regression model using the Excel Regression tool. How is the resulting
F value interpreted?
Page 1 of 18 Form 1
Mathematics Program
Statistics II Exam General Training Directorate
The value Excel reports as the critical value of F (the p-value of the F test, 4.24E-06) is less than 0.05, so the
hypothesis that the Flow variable is not significant is rejected. At a significance level of 5%, it can be affirmed
that traffic flow influences the lead content of tree bark.
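The F value in the ANOVA table is the ratio of the regression mean square to the residual mean square. The sketch below recomputes the sums of squares and F for the Question 1 data in plain Python; variable names are illustrative. It does not compute the p-value, which requires the F distribution (1 and n - 2 degrees of freedom) from a statistics library or from Excel.

```python
# Traffic flow (thousands of cars per day) and lead content (µg/g) from Question 1.
flow = [8.3, 8.3, 12.1, 12.1, 17, 17, 17, 24.3, 24.3, 24.3, 33.6]
lead = [227, 312, 362, 521, 640, 539, 728, 945, 738, 759, 1263]

n = len(flow)
mean_x = sum(flow) / n
mean_y = sum(lead) / n

# Least-squares line y = intercept + slope * x.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(flow, lead))
sxx = sum((x - mean_x) ** 2 for x in flow)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Decompose the total variability: SS total = SS regression + SS residual.
predicted = [intercept + slope * x for x in flow]
ss_total = sum((y - mean_y) ** 2 for y in lead)
ss_residual = sum((y - p) ** 2 for y, p in zip(lead, predicted))
ss_regression = ss_total - ss_residual

df_regression = 1        # one explanatory variable
df_residual = n - 2      # observations minus two estimated coefficients
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

print(round(f_stat, 3))  # 96.005
```

A large F means the regression explains far more variability per degree of freedom than is left in the residuals, which is why the associated p-value is so small.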
Question 2:
The director of a company thinks that the demand for a product he markets depends solely on the retail price. The
following sample data are available:
Price (US$)   Quantity
19            1000
18            1200
16            1300
15            1400
15            1500
14            1700
14            2000
13            2100
12            2200
13            2000
a) Calculate the covariance and indicate if there is a relationship between the variables, and what type of
relationship it is.
The covariance (-796) is negative, which indicates an inverse relationship between the variables, i.e., the higher the
price of the product, the lower the quantity of items sold.
b) Construct a scatter plot, including the linear regression model and its coefficient of determination.
Linear model: y = -177.28x + 4281.5
Slope interpretation:
When the price of the item increases by $1, the quantity of items sold
is estimated to decrease by approximately 177 items.
R² = 0.8909
Coefficient of determination: According to the linear model, 89.09% of the variability in product
demand is explained by the variability in price.
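For a simple linear regression, the coefficient of determination equals the square of the correlation coefficient between X and Y. The sketch below checks this for the price and quantity data; the variable names are illustrative, and the calculation assumes a single explanatory variable.

```python
import math

# Price (US$) and quantity sold from Question 2.
price = [19, 18, 16, 15, 15, 14, 14, 13, 12, 13]
quantity = [1000, 1200, 1300, 1400, 1500, 1700, 2000, 2100, 2200, 2000]

n = len(price)
mean_x = sum(price) / n
mean_y = sum(quantity) / n

sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(price, quantity))
sxx = sum((x - mean_x) ** 2 for x in price)
syy = sum((y - mean_y) ** 2 for y in quantity)

# Pearson correlation and its square (R^2 in simple linear regression).
r = sxy / math.sqrt(sxx * syy)
r_squared = r ** 2

print(round(r, 4), round(r_squared, 4))  # -0.9439 0.8909
```

The negative sign of r matches the inverse relationship found from the covariance, while R² discards the sign and reports only the share of variability explained.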
c) Construct the ANOVA table and the linear regression model using the Excel Regression tool. How is the resulting
F value interpreted?
The value Excel reports as the critical value of F (the p-value of the F test, 4.06E-05) is less than 0.05, so the
hypothesis that the price variable is not significant is rejected. At a significance level of 5%, it can be stated
that price influences the demand for the item.
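With a single explanatory variable, the F test and the t test on the slope are equivalent: F equals the square of the slope's t statistic, so both give the same p-value. The following sketch verifies that relationship for the Question 2 data; names are illustrative and the standard-error formula assumes the usual simple linear regression model.

```python
import math

# Price (US$) and quantity sold from Question 2.
price = [19, 18, 16, 15, 15, 14, 14, 13, 12, 13]
quantity = [1000, 1200, 1300, 1400, 1500, 1700, 2000, 2100, 2200, 2000]

n = len(price)
mean_x = sum(price) / n
mean_y = sum(quantity) / n

sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(price, quantity))
sxx = sum((x - mean_x) ** 2 for x in price)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual mean square (variance of the errors around the fitted line).
ss_residual = sum((y - (intercept + slope * x)) ** 2
                  for x, y in zip(price, quantity))
ms_residual = ss_residual / (n - 2)

# Standard error and t statistic of the slope.
se_slope = math.sqrt(ms_residual / sxx)
t_slope = slope / se_slope

# F statistic from the ANOVA decomposition.
ss_total = sum((y - mean_y) ** 2 for y in quantity)
f_stat = (ss_total - ss_residual) / ms_residual

print(round(t_slope, 3), round(f_stat, 2))  # -8.082 65.32
```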
Question 1 https://www.youtube.com/watch?v=BVc9WIftQj0&t=1s
Question 1: Lead content and traffic flow
Covariance: 2050.0512396694
Figure: scatter plot "Relationship between vehicle flow and lead content" with the fitted line (b0, b1).

Summary
Regression statistics
Multiple correlation coefficient    0.9561851758
Coefficient of determination R^2    0.9142900904
R^2 adjusted                        0.9047667671
Standard error                      92.1909593246
Observations                        11

ANALYSIS OF VARIANCE
             Degrees of freedom   Sum of squares      Mean squares        F               Critical value of F (Significance F)
Regression   1                    815966.170441948    815966.170441948    96.005361021    4.2387567145E-06
Residual     9                    76492.5568307796    8499.1729811977
Total        10                   892458.727272727

             Coefficients      Standard error    t statistic      Probability     Lower 95%         Upper 95%
Intercept    -12.8415535691    72.1428734085     -0.1780016925    0.862663702     -176.040071395    150.3569643
X            36.1838481556     3.6928954265      9.7982325458     4.238757E-06    27.8299383152     44.537758

Question 2: Price and quantity sold
Covariance: -796
Figure: scatter plot of Quantity against Price (US$) with the fitted line (b0, b1).

Summary
Regression statistics
Multiple correlation coefficient    0.9438702737
Coefficient of determination R^2    0.8908910936
R^2 adjusted                        0.8772524803
Standard error                      146.9815072531
Observations                        10

ANALYSIS OF VARIANCE
             Degrees of freedom   Sum of squares
Regression   1                    1411171.4922049
Residual     8                    172828.5077951
Total        9                    1584000

             t statistic       Probability     Lower 95%           Upper 95%
Intercept    12.9695165442     1.183465E-06    3520.2532494761     5042.7757038
Price (X)    -8.0821554745     4.056887E-05    -227.8652712677     -126.70043029