How to Interpret the Minitab Output of a Regression Analysis
Step I:
Model: The problem describes time series data in which the weight of the soap
depends on the number of days it has been used. The dependent variable (y) is therefore
the weight of the soap, and the independent variable (x) is the number of days.
We wish to fit a linear model Y = alpha + beta * x
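As a rough illustration of what fitting this linear model involves, the sketch below computes the least-squares estimates of alpha and beta from paired observations. The function name `fit_line` and the four data points are hypothetical (the points lie exactly on a line and are not the soap data); Minitab performs the equivalent computation internally.

```python
def fit_line(x, y):
    """Return (alpha_hat, beta_hat) for the least-squares line y = alpha + beta * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares about the means
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = s_xy / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


# Hypothetical data lying exactly on a line (NOT the soap measurements)
days = [0, 1, 2, 3]
weights = [10.0, 8.0, 6.0, 4.0]

alpha_hat, beta_hat = fit_line(days, weights)
print(alpha_hat, beta_hat)
```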
Step II:
The following scatter diagram shows that
1. there is an inverse relationship between x and y: as the number of days increases, the weight of the
soap decreases.
2. the data points follow a distinct linear trend, supporting our model in Step I.
[Figure: Scatterplot of Weight vs Day. Horizontal axis: Day (0 to 25). Vertical axis: Weight (0 to 140).]
Interpretation: The fitted line intersects the y-axis at about 123 and has a slope of -5.57. That is,
on day 0 the weight is about 123 grams, and for each additional day the weight of the soap
decreases by 5.57 grams on average.
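The interpretation of the intercept and slope can be checked with a small sketch that evaluates the fitted line at a few days. The coefficients are the ones reported in the Minitab output; the function name `fitted_weight` is only illustrative, and days 12 and 22 are chosen because the output lists their fitted values (56.244 and 0.496), which the sketch reproduces up to rounding of the coefficients.

```python
# Coefficients as reported in the Minitab output
ALPHA_HAT = 123.141   # intercept: estimated weight (grams) on day 0
BETA_HAT = -5.5748    # slope: estimated change in weight (grams) per day


def fitted_weight(day):
    """Fitted weight in grams on a given day under the linear model."""
    return ALPHA_HAT + BETA_HAT * day


for day in (0, 1, 12, 22):
    print(day, round(fitted_weight(day), 3))
```

Moving from one day to the next always changes the fitted weight by the slope, which is what "decreases by 5.57 grams per day on average" means.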
Predictor      Coef   SE Coef        T      P
Constant    123.141     1.382    89.09  0.000
Day         -5.5748    0.1068   -52.19  0.000
Interpretation: The sample estimates of alpha and beta are 123.141 and -5.5748,
respectively. The corresponding t-statistics are 89.09 and -52.19. Values this large in
magnitude lie in the extreme tails of the t-distribution, which is why both p-values are
reported as 0.000. Thus we reject the null hypotheses alpha = 0 and beta = 0, and conclude
that both alpha and beta play a significant role in the regression model.
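Each t-statistic in the output is the coefficient divided by its standard error. The sketch below recomputes them from the reported values. Small differences from the printed 89.09 and -52.19 are expected because the printed coefficients are rounded. The critical value 2.160 is an assumption: it is the two-sided 5% value for 13 degrees of freedom from a standard t table, and the 5% level is not stated in the original.

```python
# (coefficient, standard error) pairs as reported in the Minitab output
coefficients = {
    "Constant": (123.141, 1.382),
    "Day": (-5.5748, 0.1068),
}

# Assumption: two-sided 5% critical value for 13 degrees of freedom (t table)
T_CRIT_5PCT_13DF = 2.160

# t = coefficient / standard error, testing H0: coefficient = 0
t_stats = {name: coef / se for name, (coef, se) in coefficients.items()}

for name, t in t_stats.items():
    reject = abs(t) > T_CRIT_5PCT_13DF
    print(f"{name}: t = {t:.2f}, reject H0 = {reject}")
```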
S = 2.94921
R-Sq = 99.5%
R-Sq(adj) = 99.5%
Interpretation: The estimated standard deviation of the error terms is S = 2.95. An adjusted
R-Sq of 99.5% indicates that 99.5% of the observed variation in y is explained by
the model (that is, by changes in x), and only 0.5% is due to error or some unexplained factor.
That is, the linear model fits these data well.
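S, R-Sq and R-Sq(adj) can all be recovered from the sums of squares that Minitab reports in its Analysis of Variance table for this fit (regression 23694, residual error 113, total 23807, with n = 15 observations). The sketch below is a minimal check under those reported, rounded values, so S agrees with 2.94921 only to about two decimal places.

```python
import math

# Sums of squares from the Analysis of Variance table in the output (rounded)
ss_regression = 23694.0
ss_error = 113.0
ss_total = 23807.0
n = 15          # number of observations (total DF 14 + 1)
p = 1           # number of predictors (Day)

df_error = n - p - 1                       # 13
r_sq = ss_regression / ss_total            # share of variation explained
r_sq_adj = 1 - (ss_error / df_error) / (ss_total / (n - 1))
s = math.sqrt(ss_error / df_error)         # estimated error standard deviation

print(f"R-Sq = {100 * r_sq:.1f}%")
print(f"R-Sq(adj) = {100 * r_sq_adj:.1f}%")
print(f"S = {s:.3f}")
```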
Analysis of Variance
Source           DF     SS     MS        F      P
Regression        1  23694  23694  2724.11  0.000
Residual Error   13    113      9
Total            14  23807
Interpretation: In this case, ANOVA tests the hypothesis that beta = 0. In fact, with a single
predictor, F is simply the square of the t-statistic for Day: (-52.19)^2 is approximately 2724.
The low p-value indicates that beta plays a significant role in the model, which confirms the
result of the t-test.
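The relationship between F and t can be verified numerically. The sketch below computes F two ways: as the ratio of mean squares from the ANOVA table, and as the square of the reported t-statistic for Day. Because the printed sums of squares and the t-value are rounded, both results only approximate the reported F = 2724.11.

```python
# Values as reported in the Minitab output (rounded)
ss_regression, df_regression = 23694.0, 1
ss_error, df_error = 113.0, 13
t_day = -52.19

ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error

f_from_anova = ms_regression / ms_error   # F = MS(Regression) / MS(Error)
f_from_t = t_day ** 2                     # F = t^2 when there is one predictor

print(round(f_from_anova, 1), round(f_from_t, 1))
```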
Unusual Observations
Obs   Day  Weight     Fit  SE Fit  Residual  St Resid
 10  12.0  50.000  56.244   0.772    -6.244    -2.19R
 15  22.0   6.000   0.496   1.418     5.504     2.13R
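The columns of the Unusual Observations table are linked: Residual = Weight - Fit, and the standardized residual divides the residual by its estimated standard deviation. The sketch below assumes the usual definition, St Resid = Residual / sqrt(S^2 - SE Fit^2), and assumes that the R flag marks standardized residuals larger than 2 in absolute value. Neither rule is stated in the original. With the reported values it reproduces -2.19 and 2.13.

```python
import math

S = 2.94921                              # error standard deviation from the output
ALPHA_HAT, BETA_HAT = 123.141, -5.5748   # reported coefficients

# (observation number, Day, Weight, SE Fit) from the Unusual Observations table
unusual = [
    (10, 12.0, 50.0, 0.772),
    (15, 22.0, 6.0, 1.418),
]


def standardized_residual(day, weight, se_fit):
    """Residual divided by its estimated standard deviation (assumed definition)."""
    fit = ALPHA_HAT + BETA_HAT * day
    residual = weight - fit
    return residual / math.sqrt(S ** 2 - se_fit ** 2)


for obs, day, weight, se_fit in unusual:
    st_resid = standardized_residual(day, weight, se_fit)
    flag = "R" if abs(st_resid) > 2 else ""   # assumed flagging rule
    print(obs, f"{st_resid:.2f}{flag}")
```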
Step V:
Checking the validity of the assumptions:
We assumed that the error terms are independent and identically
normally distributed with mean 0 and common variance sigma squared.
[Figure: Residual plots, four panels. Top left: normal probability plot (Percent vs Residual). Top right: Residual vs Fitted Value. Bottom left: histogram (Frequency vs Residual). Bottom right: Residual vs Observation Order.]
Interpretation:
1. The top left graph checks the assumption of normality of the error terms. Most of
the points cluster around the blue line, indicating that the error terms are
approximately normal. Thus our assumption of normality is reasonable.
2. The top right graph plots the residuals against the fitted values. Approximately
half of them lie above the zero line and half below, indicating that our assumption
that the error terms have mean zero is reasonable.
3. On the same graph we see a clear cyclic pattern among the residuals, indicating
that they violate the assumption of independence: the error terms are not
independent. There may be another factor at work in this example that we need
to identify.
4. The bottom left graph supports the normality assumption again, although our
sample size is only 15.
5. The bottom right graph is also important in this case because the data form a time
series and the order of the observations matters. A clear cyclic pattern indicates
that the error terms depend on the time variable.
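The visual check for dependence among the residuals can be complemented with a numerical one. The sketch below computes the Durbin-Watson statistic, a technique not used in the original analysis: values near 2 suggest no first-order autocorrelation, while values well below 2 suggest positive autocorrelation, such as a slow cyclic pattern. The residuals shown are hypothetical, made up only to mimic a cyclic pattern, since the individual residuals of the soap data are not listed in the output.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    divided by the sum of squared residuals."""
    numerator = sum(
        (curr - prev) ** 2 for prev, curr in zip(residuals, residuals[1:])
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals in time order with a cyclic pattern (NOT the soap data)
cyclic_residuals = [1, 2, 1, -1, -2, -1, 1, 2, 1, -1, -2, -1]

dw = durbin_watson(cyclic_residuals)
print(round(dw, 3))
```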
Step VI:
Although beta is significant and the adjusted R-Sq is very high, indicating that the model
fits the data very well, the violation of the independence assumption indicates that
some other factor is at work behind the scenes, and we may have to study it
further.
Step VII:
Let us estimate the value of y and interpret it.
Say for x = 14 we find an interval for the average value of y:
y-hat = 123 - 5.57 * 14 = 45.02
That is, we expect the average weight on the 14th day to be
approximately 45 grams.
98% confidence interval (t = 2.650 with 13 degrees of freedom):
45.02 ± t * 0.8441 = 45.02 ± 2.650 * 0.8441 = (42.7831, 47.2569)
We are 98% confident that on the 14th day the average weight of the soap
lies between approximately 43 grams and 47 grams.
98% prediction interval:
45.02 ± 2.650 * 3.1163 = (36.7618, 53.2782)
We are 98% confident that on the 14th day the predicted value of the weight of the
soap lies between approximately 37 grams and 53 grams.
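The interval arithmetic for day 14 can be reproduced with a short sketch. It uses the rounded coefficients and the two standard errors quoted in the text (0.8441 for the mean response, 3.1163 for an individual prediction). The critical value is an assumption taken from a standard t table: 2.650 for a two-sided 98% interval with 13 degrees of freedom. The t value is used rather than a normal quantile because the sample has only 15 observations.

```python
# Rounded coefficients and standard errors as quoted in the text
ALPHA_HAT, BETA_HAT = 123.0, -5.57
SE_MEAN = 0.8441      # standard error of the estimated mean weight at day 14
SE_PRED = 3.1163      # standard error for predicting an individual weight
T_CRIT = 2.650        # assumption: t table value, 98% two-sided, 13 df


def interval(center, std_error, critical_value):
    """Symmetric interval: center +/- critical_value * std_error."""
    margin = critical_value * std_error
    return center - margin, center + margin


y_hat = ALPHA_HAT + BETA_HAT * 14
confidence_interval = interval(y_hat, SE_MEAN, T_CRIT)
prediction_interval = interval(y_hat, SE_PRED, T_CRIT)

print(round(y_hat, 2))
print(tuple(round(v, 4) for v in confidence_interval))
print(tuple(round(v, 4) for v in prediction_interval))
```

The prediction interval is much wider than the confidence interval because it accounts for the scatter of an individual observation around the line, not only the uncertainty in the line itself.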
[Figure: Fitted line plot of Weight vs Day showing the regression line with 95% CI and 95% PI bands. S = 2.94921, R-Sq = 99.5%, R-Sq(adj) = 99.5%.]
[Figure: Second fitted line plot of Weight vs Day showing the regression line with 95% CI and 95% PI bands. S = 1.95599, R-Sq = 99.8%, R-Sq(adj) = 99.8%.]
[Figure: Residual plots for the second fit, four panels: normal probability plot (Percent vs Residual), Residual vs Fitted Value, histogram (Frequency vs Residual), Residual vs Observation Order.]