
STATISTICAL DATA ANALYSIS - 2

Step by Step Guide to SPSS & MINITAB

Copyright © 2020 Lakmini U. Mallawarachchi

First Edition, June 2020

ISBN: 979-8653861543

All rights reserved. No part of this book may be reproduced, stored in a


retrieval system or transmitted by any means, electronic, mechanical,
photocopying, recording or otherwise, without written permission of
the publisher, except in the case of brief quotations embodied in critical
reviews permitted by copyright law.

STATISTICAL DATA ANALYSIS - 2
Step by Step Guide to SPSS & MINITAB

Lakmini U. Mallawarachchi

MSc in Business Statistics (University of Moratuwa, Sri Lanka)

Master of Financial Economics (University of Colombo, Sri Lanka)

BSc in Business Management (Special - Project Management) (NSBM Green University, Sri Lanka)

Preface

Statistical Data Analysis - 2: Step by Step Guide to SPSS & MINITAB takes a straightforward, step-by-step approach that familiarizes readers with the SPSS and MINITAB software packages.

This book covers simple linear regression, multiple regression, polynomial regression and non-linear regression analysis techniques using SPSS and MINITAB, in simple language with several examples, so that a beginner can understand the material with little effort. Most importantly, this book is ideal for undergraduates who need to complete the data analysis in their research studies using SPSS and MINITAB.

I hope that this book will be very useful to students, instructors and researchers in the applied and social sciences. It can also be used as self-study material and as a textbook.

Any suggestions to further improve the contents of this edition would be warmly appreciated.

Lakmini U. Mallawarachchi

June 2020

Table of Contents

CHAPTER ONE: SIMPLE LINEAR REGRESSION
1.1 Regression analysis
1.2 Correlation analysis
1.2.1 Correlation Strength
1.2.2 Correlation matrix
1.3 Simple Linear Regression
1.3.1 Test the significance of the model
1.3.2 Test the significance of the parameters
1.3.3 Measure of Model Adequacy
1.3.4 Examining the unusual observations

CHAPTER TWO: MULTIPLE REGRESSION
2.1 Introduction
2.1.1 Test the significance of the model
2.1.2 Variable Selection methods for the model
2.1.3 Multicollinearity

CHAPTER THREE: POLYNOMIAL REGRESSION
3.1 Introduction
3.2 Quadratic Regression
3.2.1 Test the significance of the model
3.2.2 Test the significance of the parameters

CHAPTER FOUR: NON LINEAR REGRESSION
4.1 Introduction
4.1.1 Intrinsically linear models
4.1.2 Intrinsically non linear models
4.2 Exponential model
4.2.1 Test the significance of the model
4.2.2 Test the significance of the parameters
4.2.3 Diagnostic Testing for Errors

Examples

Example 1.1
Example 1.2
Example 1.3
Example 1.4
Example 1.5
Example 2.1
Example 3.1
Example 4.1
CHAPTER ONE: SIMPLE LINEAR REGRESSION

1.1 Regression analysis

Regression analysis is a statistical methodology used to analyze data and determine the relationships among variables.

1.2 Correlation analysis

Correlation indicates the strength of the relationship between two variables. It is measured through the correlation coefficient. Generally, Karl Pearson's correlation coefficient and Spearman's rank correlation are used to find the correlation between variables. The sample correlation coefficient is denoted by 'r' and the population correlation coefficient by 'ρ'. A correlation coefficient can be positive, negative or zero. The correlation coefficient always lies between -1 and +1, and it can be calculated manually using either of the following formulas.
$$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2}}$$

$$r = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{n\sum x_i^2 - \left(\sum x_i\right)^2}\,\sqrt{n\sum y_i^2 - \left(\sum y_i\right)^2}}$$

Where, $\bar{x}$ = mean of the x variable, $\bar{y}$ = mean of the y variable.
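For readers who also want to verify the formula numerically, the following Python sketch (an illustrative addition; Python is not part of the SPSS/MINITAB workflow in this book) computes r directly from the deviation form of the formula above.

import math

def pearson_r(x, y):
    """Pearson correlation coefficient via the deviation form of the formula."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)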
1.2.1 Correlation Strength

Strength of the correlation depends on the value of the correlation


coefficient and it is determined based on the calculated values as
follows.

0.00 – 0.20 Weak or none
0.20 – 0.40 Weak
0.40 – 0.60 Moderate
0.60 – 0.80 Strong
0.80 – 1.00 Very strong

Example 1.1: If X and Y are two random variables, find whether there is
a relationship between X and Y using SPSS and MINITAB.

X Y
15 30
25 45
30 60
35 65
40 75
45 80
50 105
55 120
60 135

In SPSS, Step 1: Analyze → Correlate → Bivariate

Step 2: In the 'Bivariate Correlations' dialogue box, add the required variables into the list of variables to be analyzed. In this example, the selected variables are X and Y. Then press OK to proceed.

Step 3: Generated SPSS output is given below.

Correlations
                          X        Y
X   Pearson Correlation   1        .982**
    Sig. (2-tailed)                .000
    N                     9        9
Y   Pearson Correlation   .982**   1
    Sig. (2-tailed)       .000
    N                     9        9
**. Correlation is significant at the 0.01 level (2-tailed).

According to the above output, there is a very strong positive correlation of 0.982 between the variables X and Y.

In MINITAB,

Step 1: Stat → Basic Statistics → Correlation

Step 2: In the 'Correlation' dialogue box, add the required variables into the list of variables to be analyzed. In this example, the selected variables are X and Y. Then press OK to proceed.

Step 3: Generated MINITAB output is given below.

Correlations: X, Y

Pearson correlation of X and Y = 0.982


P-Value = 0.000

According to the above output, there is a very strong positive correlation of 0.982 between the variables X and Y.
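As a cross-check of the SPSS and MINITAB results above, a short Python sketch using SciPy (assumed to be installed; an illustrative addition, not part of the book's workflow) should reproduce the same correlation for the Example 1.1 data.

from scipy import stats

x = [15, 25, 30, 35, 40, 45, 50, 55, 60]
y = [30, 45, 60, 65, 75, 80, 105, 120, 135]

r, p = stats.pearsonr(x, y)
print(f"r = {r:.3f}, p = {p:.3f}")  # expected: r = 0.982, matching the output above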

1.2.2 Correlation matrix

A correlation matrix is a table showing the correlation coefficients between a set of variables. When there is more than one independent variable (X variables) together with one dependent variable, the correlation coefficients need to be obtained in the form of a matrix.

Example 1.2: Develop a correlation matrix of X1, X2, X3, X4 and Y using SPSS and MINITAB.

Y X1 X2 X3 X4
27 20 50 75 15
23 27 55 60 20
18 22 62 68 16
26 27 55 60 20
23 24 75 72 8
27 30 62 73 18
30 32 79 71 11
23 24 75 72 8
22 22 62 68 16
24 27 55 60 20
16 40 90 78 32
28 32 79 71 11
31 50 84 72 12
22 40 90 78 32
24 20 50 75 15

31 50 84 72 12
29 30 62 73 18
22 27 55 60 20

In SPSS,

Step 1: Analyze → Correlate → Bivariate

Step 2: In the 'Bivariate Correlations' dialogue box, add the required variables into the list of variables to be analyzed (X1, X2, X3, X4 and Y). Then press OK to proceed.

Step 3: Generated SPSS output is given below.

Correlations
                          Y        X1       X2       X3      X4
Y   Pearson Correlation   1        .373     .059     .048    -.522*
    Sig. (2-tailed)                .127     .815     .852    .026
    N                     18       18       18       18      18
X1  Pearson Correlation   .373     1        .758**   .288    .192
    Sig. (2-tailed)       .127              .000     .247    .444
    N                     18       18       18       18      18
X2  Pearson Correlation   .059     .758**   1        .555*   .099
    Sig. (2-tailed)       .815     .000              .017    .697
    N                     18       18       18       18      18
X3  Pearson Correlation   .048     .288     .555*    1       .060
    Sig. (2-tailed)       .852     .247     .017             .813
    N                     18       18       18       18      18
X4  Pearson Correlation   -.522*   .192     .099     .060    1
    Sig. (2-tailed)       .026     .444     .697     .813
    N                     18       18       18       18      18
*. Correlation is significant at the 0.05 level (2-tailed).
**. Correlation is significant at the 0.01 level (2-tailed).

According to the above correlation matrix, the correlation between Y and X1 is 0.373, which indicates a weak positive correlation between Y and X1. The significance value of 0.127 (shown just below the correlation of 0.373) indicates that the relationship is not significant at the 0.05 significance level (sig > 0.05). Similarly, between X2 and Y there is a weak positive correlation of 0.059, which is not statistically significant (sig > 0.05). In the case of X4 and Y, there is a moderate negative relationship, and it is statistically significant as the (p) sig value < 0.05.

1.3 Simple Linear Regression

When there is an association between two variables, we are often interested in determining the relationship between them. If X and Y are two random variables, the linear relationship between a dependent variable (Y) and one independent variable (X) is known as 'simple linear regression'.

The simple linear regression model can be expressed as

$$y = \beta_0 + \beta_1 x + \varepsilon$$

Where, $\beta_0$ = intercept, $\beta_1$ = slope, $\varepsilon$ = error term.

$\beta_0$ and $\beta_1$ are the two unknown parameters, which can be estimated from a given data set. The two unknown parameters are known as the coefficients. They can be estimated manually using the following formulas, obtained using the Least Squares Estimation (LSE) technique:

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$

1.3.1 Test the significance of the model

ANOVA or the ‘Analysis of Variance’ is a statistical method used to test


the overall significance of the model.

Analysis of Variance (ANOVA)

Source              DF    SS    MS=SS/DF         F
Regression          k-1   SSR   MSR=SSR/(k-1)    MSR/MSE
Residuals (Errors)  n-k   SSE   MSE=SSE/(n-k)
Total               n-1   SST

Where, DF= degrees of freedom, k= number of parameters, n= number


of observations, SS= sum of squares, SSR= regression sum of squares,
SSE= error sum of squares, SST = total sum of squares, MSE=mean
square of errors, MSR= mean square of regression, F= F statistic

Components in the ANOVA table can be manually calculated using the following formulas.

$$SST = \sum_{i=1}^{n}(y_i - \bar{y})^2, \qquad SSR = \sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2, \qquad SSE = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2$$

$$SST = SSR + SSE$$

Where, $\bar{y}$ = mean of the y variable, $\hat{y}_i$ = estimated y value.
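The decomposition SST = SSR + SSE can be verified numerically; the sketch below (an illustrative addition, assuming NumPy is available) computes the three sums of squares from observed and fitted values.

import numpy as np

def anova_components(y, y_hat):
    """Return (SST, SSR, SSE); SST should equal SSR + SSE."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    sst = np.sum((y - y.mean()) ** 2)      # total sum of squares
    ssr = np.sum((y_hat - y.mean()) ** 2)  # regression sum of squares
    sse = np.sum((y - y_hat) ** 2)         # error sum of squares
    return sst, ssr, sse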

Example 1.3: Company A wants to find out whether the interest rate (X) has a significant influence on the number of clients (Y) who open fixed deposits. Analyze the data using SPSS and MINITAB.

X Y
12 265
14 228
16 242
18 260
20 286
22 291
24 320
26 352
28 396

In SPSS,

Step 1: Analyze → Regression → Linear

Step 2: In the 'Linear Regression' dialogue box, select 'Y' as the dependent variable and 'X' as the independent variable, and press the OK button.

Step 3: Generated SPSS output is given below.

Regression

Model Summary
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .911a   .829       .805                23.970
a. Predictors: (Constant), X

The above 'Model Summary' table indicates the strength of the relationship between the model and the dependent variable. Generally, R Square shows the percentage of variability explained by the independent variables. In the above model, R Square is 0.829, i.e. 82.9% of the variability is explained by the overall model.
ANOVAa
Model          Sum of Squares   df   Mean Square   F        Sig.
1  Regression  19548.150        1    19548.150     34.023   .001b
   Residual    4021.850         7    574.550
   Total       23570.000        8
a. Dependent Variable: Y
b. Predictors: (Constant), X

According to the above ANOVA table, the F value (34.023) is significant


as the corresponding P value (0.001) is less than 0.05. Therefore, it can
be concluded with 95% confidence that the fitted model is significant.

1.3.2 Test the significance of the parameters

Coefficientsa
Model          Unstandardized Coefficients   Standardized Coefficients   t       Sig.
               B         Std. Error          Beta
1  (Constant)  112.833   31.960                                          3.530   .010
   X           9.025     1.547               .911                        5.833   .001
a. Dependent Variable: Y

As shown in the above ‘coefficient’ table, both P values of the two


parameters are less than 0.05. Therefore, it can be concluded that both
parameters are significantly different from zero. Based on the values in
the ‘B’ column under the ‘Unstandardized Coefficients’ column, fitted
model equation can be written as,

Y = 112.833 + 9.025X

Number of clients = 112.833 + 9.025 × interest rate

The above formula indicates that a one-unit increase in the interest rate would increase the number of clients by 9.025 units.
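The same fit can be reproduced programmatically; the following Python sketch (an illustrative addition using the statsmodels package, assumed to be installed) should recover the coefficients 112.833 and 9.025 reported above.

import statsmodels.api as sm

x = [12, 14, 16, 18, 20, 22, 24, 26, 28]           # interest rate (X)
y = [265, 228, 242, 260, 286, 291, 320, 352, 396]  # number of clients (Y)

X = sm.add_constant(x)   # add the intercept column
fit = sm.OLS(y, X).fit()
print(fit.params)        # expected: approx. [112.833, 9.025]
print(fit.f_pvalue)      # expected: approx. 0.001, as in the ANOVA table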

In MINITAB,

Step 1: Stat → Regression → Regression

Step 2: In the 'Regression' dialogue box, select 'Y' as the dependent variable and 'X' as the independent variable, and press the OK button.
Step 3: Generated MINITAB output is given below.

Regression Analysis: Y versus X

The regression equation is


Y = 113 + 9.03 X

Predictor Coef SE Coef T P


Constant 112.83 31.96 3.53 0.010
X 9.025 1.547 5.83 0.001

S = 23.9698 R-Sq = 82.9% R-Sq(adj) = 80.5%

Analysis of Variance

Source DF SS MS F P
Regression 1 19548 19548 34.02 0.001
Residual Error 7 4022 575
Total 8 23570

Unusual Observations

Obs X Y Fit SE Fit Residual St Resid


1 12.0 265.00 221.13 14.73 43.87 2.32R

R denotes an observation with a large standardized residual.

Similar results were obtained from the MINITAB output as in SPSS. Unlike SPSS, MINITAB presents the model equation directly in the regression output.

1.3.3 Measure of Model Adequacy

Once the model has been identified, it is necessary to test the


assumptions for the residuals or the error term. These assumptions are;

 Errors should be random: In order to test the randomness of


errors, Durbin Watson (DW) statistic is used. If DW statistic ~ 2.0,
the errors are considered to be random.

 Errors should be normally distributed: This is tested using the histogram of the errors; if the errors are normally distributed, the shape of the histogram should be symmetric. The normality assumption can also be checked using the Anderson-Darling (A-D) test.

 Error mean should be zero: Usually this is tested by using the plot
of residuals vs fitted values or by using the one sample t test.

 Errors should have a constant variance (homoscedasticity): This


is tested by using the plot of residuals vs fitted values.

**Note: These assumptions need to be satisfied in order to use the model for predictions and forecasting.
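All four checks listed above can also be run programmatically; the following Python sketch (an illustrative addition, assuming SciPy and statsmodels are installed; not part of the book's SPSS/MINITAB workflow) applies them to the Example 1.3 fit.

from scipy import stats
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

x = [12, 14, 16, 18, 20, 22, 24, 26, 28]
y = [265, 228, 242, 260, 286, 291, 320, 352, 396]

fit = sm.OLS(y, sm.add_constant(x)).fit()
resid = fit.resid

print("Durbin-Watson:", durbin_watson(resid))              # ~2.0 => random errors
print("Anderson-Darling:", stats.anderson(resid, dist="norm").statistic)
print("Zero-mean t test:", stats.ttest_1samp(resid, 0.0))
# Constant variance is judged visually from resid plotted against fit.fittedvalues.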

Example 1.4: Test the assumptions for errors using the data given in
the above example.

 Test for the randomness of errors

In MINITAB,

Step 1: Stat → Regression → Regression
Step 2: In the ‘Regression’ dialogue box, include ‘Y’ as the response
variable and ‘X’ as the predictors.

Step 3: Click ‘options’ button in the ‘Regression’ dialogue box, and select
‘Durbin Watson’ statistic. Then press ok button to proceed.

Step 4: Generated MINITAB output is given below.

Durbin-Watson statistic = 1.06128

In SPSS,

Step 1: Analyze → Regression → Linear

Step 2: In the 'Linear Regression' dialogue box, include 'Y' as the dependent variable and 'X' as the independent variable.

Step 3: Click the 'Statistics' button in the 'Linear Regression' dialogue box. Then select 'Durbin-Watson' and click 'Continue' to proceed.

Step 4: Generated SPSS output is indicated below.

Model Summaryb
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
1       .911a   .829       .805                23.970                       1.061
a. Predictors: (Constant), X
b. Dependent Variable: Y

As the Durbin-Watson statistic is 1.061, well below 2.0, it can be concluded that the errors are non-random (if the errors are random, the DW statistic ≈ 2.0).

 Test for normality of errors

In MINITAB,

Step 1: Stat → Regression → Regression

Step 2: In the ‘regression’ dialogue box, select ‘Y’ as the response and
select ‘X’ as the predictors. Then click ‘storage’ button to proceed.

Step 3: In the ‘storage’ dialogue box, select ‘residuals’ and press ok


button to proceed.

Step 4: The 'residuals' are listed in the data window.

Step 5: Then click Stat → Basic Statistics → Normality Test
Step 6: In the ‘Normality Test’ dialogue box, select ‘RESI1’ as the variable
and press ok to proceed.

Step 7: Generated MINITAB output is given below.

[Figure: Normal probability plot of RESI1. Mean = -6.31594E-14, StDev = 22.42, N = 9, AD = 0.820, P-Value = 0.020]

As indicated in the above graph, the Anderson-Darling (AD) test is used to test the normality-of-errors assumption. The value of the test statistic (AD = 0.820) is significant, as the P value is 0.020 (P < 0.05). Therefore it can be claimed that the errors are not normally distributed. The probability plot also shows that the residuals deviate from the line.

In SPSS,

Step 1: Analyze → Regression → Linear

Step 2: In the ‘Linear regression’ dialogue box, select ‘Y’ as the


dependent variable and select ‘X’ as the independent variable. Then click
‘save’ button to proceed.

Step 3: In the ‘save’ dialogue box, select ‘standardized’ under residuals


and press ok button to proceed.

Step 4: Generated SPSS output is given below.

Step 5: Analyze → Descriptive Statistics → Explore
Step 6: In the 'Explore' dialogue box, move 'Standardized Residuals' into the Dependent List and click the 'Plots' button to proceed.

Step 7: In the 'Plots' dialogue box, select 'Normality plots with tests' and press 'Continue' to proceed.

Step 8: Generated SPSS output is given below.
Tests of Normality
                        Kolmogorov-Smirnova           Shapiro-Wilk
                        Statistic   df   Sig.         Statistic   df   Sig.
Standardized Residual   .295        9    .023         .806        9    .024
a. Lilliefors Significance Correction

The normality assumption can be tested using the 'Kolmogorov-Smirnov' or 'Shapiro-Wilk' tests. As shown in the above SPSS output, both (p) sig. values are less than 0.05, which indicates that the assumption of normality of errors is violated.

 Test for the mean of errors

According to the graph below, the error mean is -6.31594E-14, i.e. the mean of the errors is effectively zero.

[Figure: Normal probability plot of RESI1, repeated from above. Mean = -6.31594E-14]

 Test for the constant variance of errors

In MINITAB,

Step 1: Stat → Regression → Regression

Step 2: In the ‘regression’ dialogue box, select ‘Y’ as the response and
select ‘X’ as the predictors. Then click ‘storage’ button to proceed.

Step 3: In the ‘storage’ dialogue box, select ‘residuals’ and ‘fits’. Then
press ok button to proceed.

Step 4: In the 'Graphs' dialogue box, under 'Residual plots', select 'Residuals versus fits' and press OK to proceed.
Step 5: Generated MINITAB outputs are given below.

[Figure: Residuals versus the fitted values (response is Y)]

According to the above graph, it is clear that the data points are not scattered randomly. This confirms that the errors do not have a constant variance.

Conclusion: According to the above observations, the fitted model (Y = 112.833 + 9.025X) does not fulfil the residual assumptions. Therefore, this model cannot be used for prediction or forecasting purposes.

1.3.4 Examining the unusual observations

In MINITAB,

Step 1: Stat → Regression → Regression

Step 2: In the 'Regression' dialogue box, select 'Y' as the response and 'X' as the predictors. Then click the 'storage' button to proceed.

Step 3: In the 'storage' dialogue box, select 'residuals' and 'fits'. Then press OK to proceed.

Step 4: Generated MINITAB output is given below.

Unusual Observations

Obs X Y Fit SE Fit Residual St Resid


1 12.0 265.00 221.13 14.73 43.87 2.32R

R denotes an observation with a large standardized residual.

The first observation is identified as unusual. If it is removed from the data set and the regression is rerun, the results will be more meaningful and accurate.
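The same flagging can be done in code; the sketch below (an illustrative addition using statsmodels) computes standardized residuals and should flag the first observation, as MINITAB does.

import numpy as np
import statsmodels.api as sm

x = [12, 14, 16, 18, 20, 22, 24, 26, 28]
y = [265, 228, 242, 260, 286, 291, 320, 352, 396]

fit = sm.OLS(y, sm.add_constant(x)).fit()
std_resid = fit.get_influence().resid_studentized_internal
print(np.where(np.abs(std_resid) > 2)[0])  # expected: [0], the first observation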

Example 1.5: A financial analyst is interested in studying companies going public for the first time. He is keen to know the relationship between the size of the offering (X) and the price per share (Y). A random sample of 10 companies that recently went public revealed the following.

Company Size ($min) Price (per share)


1 90.0 11.2
2 179.2 12.1
3 71.9 11.1
4 97.9 11.2
5 93.5 11.0
6 72.0 10.7
7 125.0 11.3
8 98.5 11.4
9 87.0 10.8
10 79.6 10.9

a). Is the correlation between size and price share statistically


significant?

b). Calculate the regression equation for this data.

c). Construct the ANOVA table for the regression analysis and interpret the results.

d). Test the validity of the model using suitable statistics.

e). Carry out the diagnostic test to check the assumption of errors.

f). Do you think the hypothesis of the financial analyst can be rejected?
Justify the reasons statistically.

a).

Correlations: Size ($min), Price (per share)


Pearson correlation of Size ($min) and Price (per share) = 0.905
P-Value = 0.000

The correlation coefficient between the two variables (r = 0.905, p = 0.000) is significantly greater than zero, and there is a very strong positive correlation between the size and the price per share.

b).

Coefficientsa
Model          Unstandardized Coefficients   Standardized Coefficients   t        Sig.
               B        Std. Error           Beta
1  (Constant)  10.059   .193                                             52.105   .000
   Size        .011     .002                 .905                        6.014    .000
a. Dependent Variable: Price

Regression equation, Price = 10.059 + 0.011* Size

c).

Model Summaryb
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
1       .905a   .819       .796                .1781                        2.063
a. Predictors: (Constant), Size
b. Dependent Variable: Price

The above ‘Model Summary’ table indicates the strength of the


relationship between the model and the dependent variable. Generally,
R square shows the percentage of variability explained by the
independent variables. According to the above model, R Square is 0.819
i.e. 81.9% of the variability is explained by the overall model.

ANOVAa
Model          Sum of Squares   df   Mean Square   F        Sig.
1  Regression  1.147            1    1.147         36.164   .000b
   Residual    .254             8    .032
   Total       1.401            9
a. Dependent Variable: Price
b. Predictors: (Constant), Size

According to the above ANOVA table, the F value (36.164) is significant


as the corresponding P value (0.000) is less than 0.05. Therefore, it can
be concluded with 95% confidence that the fitted model is significant.

e) Diagnostic test for errors

 Test for the randomness of errors

Durbin-Watson statistics = 2.063

As the Durbin-Watson statistic is 2.063 (≈ 2.0), it can be concluded that the errors are random.

 Test for the mean of errors

According to the graph below, the error mean is 1.776357E-16, i.e. the mean of the errors is effectively zero.
 Test for the normality of errors

[Figure: Normal probability plot of RESI1. Mean = 1.776357E-16, StDev = 0.1679, N = 10, AD = 0.268, P-Value = 0.599]

As indicated in the above graph, the Anderson-Darling (AD) test is used to test the normality-of-errors assumption. The value of the test statistic (AD = 0.268) is not significant, as the P value is 0.599 (P > 0.05). Therefore it can be claimed that the errors are normally distributed. The probability plot also shows that the residuals do not deviate much from the line.

 Test for the constant variance of errors

[Figure: Residuals versus the fitted values (response is Price (per share))]

According to the above graph, it is clear that the data points are scattered randomly. This confirms that the errors have a constant variance.

Conclusion: According to the above observations, the fitted model (Price = 10.059 + 0.011 × Size) fulfils the residual assumptions. Therefore, this model can be used for prediction and forecasting purposes.

CHAPTER TWO: MULTIPLE REGRESSION

2.1 Introduction

This section discusses fitted models developed for the response variable (Y) when there is more than one independent variable (X). The general linear model in multiple regression can be written in the form

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p + \varepsilon$$

The model for the ith observation can be written as

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_p x_{ip} + \varepsilon_i, \qquad i = 1, 2, \dots, n$$

The coefficients (parameters) $\beta_1, \beta_2, \dots, \beta_p$ are known as 'partial regression coefficients'.

There are a few assumptions that need to be satisfied in multiple regression models, such as:

 Relationship between dependent and the independent variables


should be in the linear form.
 Residuals should be normally distributed.
 Absence of multi-collinearity among the independent variables.
 Error variances of independent variables need to be constant.

2.1.1 Test the significance of the model

ANOVA or the ‘Analysis of Variance’ is a statistical method used to test


the overall significance of the model.

Analysis of Variance (ANOVA)

Source DF SS MS=SS/DF F
Regression p SSR MSR=SSR/p MSR/MSE
Residuals (Errors) n-p-1 SSE MSE=SSE/(n-p-1)
Total n-1 SST

Where, DF= degrees of freedom, p= number of explanatory variables, n=


number of observations, SS= sum of squares, SSR= regression sum of
squares, SSE= error sum of squares, SST = total sum of squares,
MSE=mean square of errors, MSR= mean square of regression, F= F
statistic.

2.1.2 Variable Selection methods for the model

There are a few approaches that can be used to select the variables for a multiple regression model. They are:

 Fit all the variables (Best Subset Regression)

The first approach is to consider all the variables and, by examining the results, decide whether any of them can be dropped from the equation.

In MINITAB,

Step 1: Stat → Regression → Best Subsets

Step 2: In the ‘Best Subset Regression’ dialogue box, select ‘Y’ as the
response, ‘X1-X4’ as the predictors and press ok to continue.

Step 3: Generated MINITAB output is given below.

 Forward Selection (FS) Method

This means adding one variable at a time and testing the significance of the resulting model, using a partial F test. For example,

H0: Adding X2 to a model that already contains X1 is not significant.

Test statistic: F = [SSE(X1) − SSE(X1, X2)] / MSE(X1, X2)

In SPSS,

Step 1: Analyze → Regression → Linear

Step 2: In the 'Linear Regression' dialogue box, select 'Y' as the dependent variable and 'X1-X4' as the independents, choose 'Forward' as the Method, and press OK to continue.

 Backward Elimination Method

This means removing one variable at a time and testing the significance of the resulting model, using the same kind of partial F statistic:

H0: X1 can be removed from the model without a significant loss of fit.

Test statistic: F = [SSE(reduced) − SSE(full)] / MSE(full)

F critical = F(number of extra terms, df of residuals)

In SPSS,

Step 1: Analyze → Regression → Linear

Step 2: In the 'Linear Regression' dialogue box, select 'Y' as the dependent variable and 'X1-X4' as the independents, choose 'Backward' as the Method, and press OK to continue.

 Stepwise regression

In this method, variables are added to the model sequentially, and at each step the significance of the variables already in the model is rechecked to make sure it has not fallen below the specified tolerable level. If any variable is no longer significant, it is removed from the model. Statistical software provides an option to run this procedure automatically (refer to Example 2.1); a programmatic sketch of the idea is shown below.
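The following Python sketch (an illustrative addition; the function name stepwise_select and the pandas/statsmodels dependencies are assumptions, not the book's workflow) shows the idea behind p-value based stepwise selection, using the same entry/removal thresholds applied later in Example 2.1.

import pandas as pd
import statsmodels.api as sm

def stepwise_select(df, target, alpha_in=0.05, alpha_out=0.06):
    """p-value based stepwise variable selection (illustrative sketch)."""
    selected = []
    while True:
        changed = False
        # Forward step: add the most significant remaining predictor.
        remaining = [c for c in df.columns if c != target and c not in selected]
        pvals = {}
        for c in remaining:
            m = sm.OLS(df[target], sm.add_constant(df[selected + [c]])).fit()
            pvals[c] = m.pvalues[c]
        if pvals and min(pvals.values()) < alpha_in:
            selected.append(min(pvals, key=pvals.get))
            changed = True
        # Backward step: drop any selected predictor that is no longer significant.
        if selected:
            m = sm.OLS(df[target], sm.add_constant(df[selected])).fit()
            worst = m.pvalues[selected].idxmax()
            if m.pvalues[worst] > alpha_out:
                selected.remove(worst)
                changed = True
        if not changed:
            return selected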

2.1.3 Multicollinearity

The problem of multicollinearity exists when there is a strong correlation between two or more independent variables. When there are many independent variables in a data set, it is essential to find out whether multicollinearity exists among them, because it will lead to misinterpretation of the generated results.

Detecting Multicollinearity

 The correlation between independent variables is greater than 0.5.
 Any of the pairwise correlations between explanatory (X) variables is greater than the maximum correlation between the (x, y) pairs.
 There is a large F statistic together with small t statistics.
 Based on the Variance Inflation Factor (VIF):

$$VIF_j = \frac{1}{1 - R_j^2}$$

where $R_j^2$ is the R Square obtained by regressing $X_j$ on the remaining explanatory variables.

- If VIF is greater than 5.0, there is a multicollinearity problem.

 Based on the 'Condition Index' (K), determined using the eigenvalues (λ):

- If K < 100, there is no serious multicollinearity problem.
- If 100 < K < 1000, there is moderate to strong multicollinearity.
- If K > 1000, there is severe multicollinearity.
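VIF values can also be computed directly; the sketch below (an illustrative addition, assuming pandas and statsmodels are installed) returns one VIF per explanatory variable, matching the rule of thumb above.

import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.tools.tools import add_constant

def vif_table(X: pd.DataFrame) -> pd.Series:
    """VIF for each explanatory variable; values above 5.0 signal multicollinearity."""
    Xc = add_constant(X)  # an intercept is needed for a sensible VIF
    vifs = [variance_inflation_factor(Xc.values, i) for i in range(1, Xc.shape[1])]
    return pd.Series(vifs, index=X.columns)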

Solutions for Multicollinearity

 Increase the sample size.
 Use stepwise regression and identify the variables that most significantly influence the model.
 Use the forward selection method to add the independent variables to the model.
 Remove the collinear independent variables from the model.

Example 2.1: Analyze the data in the table and develop a suitable
model.

Y X1 X2 X3 X4
26 27 55 60 20
23 24 75 72 8
27 30 62 73 18
30 32 79 71 11
23 24 75 72 8
22 22 62 68 16
24 27 55 60 20
16 40 90 78 32
28 32 79 71 11
31 50 84 72 12
22 40 90 78 32
24 20 50 75 15
31 50 84 72 12
29 30 62 73 18
22 27 55 60 20

1. Get the correlation matrix.

In SPSS,

Step 1: Analyze → Correlate → Bivariate

Step 2: In the 'Bivariate Correlations' dialogue box, move all the variables (Y, X1, X2, X3 & X4) into the 'Variables' list and press the 'Options' button.

Step 3: Generated SPSS output is given below.

Correlations
                          Y        X1       X2       X3      X4
Y   Pearson Correlation   1        .361     .024     -.069   -.576*
    Sig. (2-tailed)                .186     .932     .806    .025
    N                     15       15       15       15      15
X1  Pearson Correlation   .361     1        .731**   .344    .194
    Sig. (2-tailed)       .186              .002     .209    .489
    N                     15       15       15       15      15
X2  Pearson Correlation   .024     .731**   1        .637*   .112
    Sig. (2-tailed)       .932     .002              .011    .692
    N                     15       15       15       15      15
X3  Pearson Correlation   -.069    .344     .637*    1       .131
    Sig. (2-tailed)       .806     .209     .011             .641
    N                     15       15       15       15      15
X4  Pearson Correlation   -.576*   .194     .112     .131    1
    Sig. (2-tailed)       .025     .489     .692     .641
    N                     15       15       15       15      15

Interpretation

According to the above correlation matrix, the correlation between Y and X1 is 0.361, which indicates a weak positive correlation between Y and X1. The significance value of 0.186 (shown just below the correlation of 0.361) indicates that the relationship is not significant at the 0.05 significance level (sig > 0.05). Similarly, between X3 and Y there is a weak negative correlation of -0.069, which is not statistically significant (sig > 0.05). In the case of X4 and Y, there is a moderate negative relationship, and it is statistically significant as the (p) sig value < 0.05.

Draw a scatter plot and get the correlation matrix in MINITAB

Step 1: Graph → Scatterplot → Simple. Select 'Y' for the Y variables and 'X1, X2, X3, X4' for the X variables, and click the 'Multiple Graphs' button.

Step 2: In the ‘Multiple Graphs’ dialogue box, select ‘In separate panels
of the same graph’ and press ok to proceed.

Step 3: Generated MINITAB output is given below.

[Figure: Scatterplot of Y vs X1, X2, X3, X4, drawn in separate panels of the same graph]

Step 4: Stat → Basic Statistics → Correlation

Step 5: In the ‘Correlation’ Dialogue box, select all the variables (Y, X1,
X2, X3 & X4) in to the ‘variables’ column and press ‘ok’ button.

Step 6: Generated MINITAB output is given below.

2. Test the significance of the model.

In SPSS,

Step 1: Analyze → Regression → Linear

Step 2: In the 'Linear Regression' dialogue box, select 'Y' as the dependent variable and 'X1, X2, X3 & X4' as the independent variables, and press the 'Statistics' button.

Step 3: In the 'Linear Regression: Statistics' dialogue box, select 'Collinearity diagnostics' and, under Residuals, select 'Durbin-Watson', then press Continue to proceed.
Step 4: Generated SPSS output is given as follows.

Model Summaryb
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
1       .847a   .717       .604                2.628                        2.747
a. Predictors: (Constant), X4, X2, X3, X1
b. Dependent Variable: Y

The R square of the above table indicates that 71.7 % of the observed
variability has been captured by the fitted model.
ANOVAa
Model          Sum of Squares   df   Mean Square   F       Sig.
1  Regression  175.342          4    43.836        6.348   .008b
   Residual    69.058           10   6.906
   Total       244.400          14
According to the ANOVA table, it can be seen that F value (6.348) is
statistically significant as the corresponding P value (0.008) is less than
0.05. Therefore, it can be concluded with 95% confidence that the fitted
model is statistically significant.

Coefficientsa
Model          Unstandardized Coefficients   Standardized Coefficients   t        Sig.   Collinearity Statistics
               B        Std. Error           Beta                                        Tolerance   VIF
1  (Constant)  26.743   8.795                                            3.041    .012
   X1          .418     .115                 .937                        3.635    .005   .425        2.350
   X2          -.199    .094                 -.658                       -2.122   .060   .294        3.404
   X3          .084     .159                 .120                        .529     .608   .554        1.805
   X4          -.395    .097                 -.700                       -4.051   .002   .946        1.057

According to the above 'Coefficients' table, all the VIF values are less than 5, which indicates the absence of a multicollinearity problem. Further, the Sig. column indicates that the coefficients of X1 and X4 are statistically significant, as their sig. (p) values are less than 0.05, while the coefficients of X2 and X3 are not significant, as their p values are greater than 0.05. The fitted model can be written as

Y = 26.743 + 0.418 X1 – 0.199 X2 + 0.084 X3 – 0.395 X4

As X2 and X3 are not significant, it is better to carry out the stepwise regression method in order to find the best fitted model.

In SPSS, Step 1: Analyze → Regression → Linear

Step 2: In the 'Linear Regression' dialogue box, select 'Y' as the dependent variable and 'X1, X2, X3 & X4' as the independent variables, select 'Stepwise' as the Method, and press the Options button.

Step 3: In the ‘Linear Regression: Options’ Dialogue box, choose Entry as


‘0.05’ and Removal as ‘0.06’ and press continue.

Step 4: Generated SPSS output is given as follows.

Coefficientsa
Model          Unstandardized Coefficients   Standardized Coefficients   t        Sig.
               B        Std. Error           Beta
1  (Constant)  30.684   2.343                                            13.095   .000
   X4          -.325    .128                 -.576                       -2.542   .025
2  (Constant)  24.652   3.092                                            7.973    .000
   X4          -.379    .110                 -.671                       -3.458   .005
   X1          .219     .087                 .491                        2.530    .026
3  (Constant)  30.920   3.756                                            8.233    .000
   X4          -.389    .094                 -.689                       -4.154   .002
   X1          .403     .108                 .903                        3.741    .003
   X2          -.169    .072                 -.559                       -2.344   .039
a. Dependent Variable: Y

According to the output created using ‘Stepwise regression’ method, the


fitted model can be written as follows.

Y = 30.920 + 0.403 X1 – 0.169 X2 – 0.389 X4

Interpretation (holding the other variables constant):

- When X1 increases by 1 unit, Y increases by 0.403 units.
- When X2 increases by 1 unit, Y decreases by 0.169 units.
- When X4 increases by 1 unit, Y decreases by 0.389 units.

In MINITAB,

Step 1: Stat → Regression → Stepwise

Step 2: In the ‘Stepwise Regression’ Dialogue box, select ‘Y’ as the


response and ‘X1, X2, X3 & X4’ as the predictors and press ‘Methods’
button.

Step 3: In the ‘Methods’ dialogue box, change Alpha to enter as ‘0.05’ and
Alpha to remove as ‘0.06’ and press ok button to proceed.

Step 4: Generated MINITAB output is given below.

Similar results were obtained from the 'Stepwise regression' procedure carried out using the MINITAB software. The coefficient values of the best fitted model are given in the third column of the output. Therefore, the best fitted model can be written as:

Y = 30.920 + 0.403 X1 – 0.169 X2 – 0.389 X4

Interpretation (holding the other variables constant):

- When X1 increases by 1 unit, Y increases by 0.403 units.
- When X2 increases by 1 unit, Y decreases by 0.169 units.
- When X4 increases by 1 unit, Y decreases by 0.389 units.
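As a cross-check, the final model can be refitted in Python; the sketch below (an illustrative addition, assuming pandas and statsmodels are installed) uses the Example 2.1 data and should give coefficients close to those reported by SPSS and MINITAB.

import pandas as pd
import statsmodels.api as sm

# Data from Example 2.1 (15 observations)
df = pd.DataFrame({
    "Y":  [26, 23, 27, 30, 23, 22, 24, 16, 28, 31, 22, 24, 31, 29, 22],
    "X1": [27, 24, 30, 32, 24, 22, 27, 40, 32, 50, 40, 20, 50, 30, 27],
    "X2": [55, 75, 62, 79, 75, 62, 55, 90, 79, 84, 90, 50, 84, 62, 55],
    "X4": [20,  8, 18, 11,  8, 16, 20, 32, 11, 12, 32, 15, 12, 18, 20],
})
fit = sm.OLS(df["Y"], sm.add_constant(df[["X1", "X2", "X4"]])).fit()
print(fit.params.round(3))  # expected: approx. 30.920, 0.403, -0.169, -0.389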

Diagnostic test for errors

 Test for randomness of errors

Durbin-Watson statistic = 2.63468

As the Durbin-Watson statistic of 2.63468 is reasonably close to 2.0, it can be concluded that the errors are random (if the errors are random, the DW statistic ≈ 2.0).

 Test for the mean of errors

According to the graph below, the error mean is -2.03689E-14, i.e. the mean of the errors is effectively zero.

 Test for normality of errors

[Figure: Normal probability plot of RESI1. Mean = -2.03689E-14, StDev = 2.252, N = 15, AD = 0.472, P-Value = 0.209]

As indicated in the above graph, the Anderson-Darling (AD) test is used to test the normality-of-errors assumption. The value of the test statistic (AD = 0.472) is not significant, as the P value is 0.209 (P > 0.05). Therefore it can be claimed that the errors are normally distributed. The probability plot also shows that the residuals do not deviate from the line.

 Test for the constant variance of errors


[Figure: Residuals versus the fitted values (response is Y)]

The plot of residuals versus fitted values indicates that the data points are scattered randomly, so it can be concluded that the errors have a constant variance.

Conclusion: According to the above observations, the fitted model (Y = 30.920 + 0.403 X1 – 0.169 X2 – 0.389 X4) fulfils the residual assumptions. Therefore, this model can be used for prediction and forecasting purposes.

CHAPTER THREE: POLYNOMIAL REGRESSION

3.1 Introduction

Polynomial regression is another type of regression analysis, where the relationship between X and Y is modelled as an nth-degree polynomial in X. Polynomial regression is a special case of multiple regression. Typical second-order models are:

$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon \quad \text{(second-order model, one predictor)}$$

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2 + \varepsilon \quad \text{(second-order model, two predictors)}$$

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2 + \beta_5 x_1 x_2 + \varepsilon \quad \text{(second-order model with interaction)}$$

3.2 Quadratic Regression

Example 3.1: Suppose the marketing division of a supermarket chain wants to study the price elasticity of disposable items. The collected data are given below. Develop a model to show the relationship between price and sales.

Price Sales
65 132
65 141
65 153
65 158
65 164
85 172
85 94
85 100
85 110
105 118

105 127
105 76
105 85
105 102

In MINITAB,

Step 1: Get the correlation of price and sales

Correlations: Sales, Price

Pearson correlation of Sales and Price = -0.685


P-Value = 0.007

Step 2: Develop a regression model for the data set.

Durbin-Watson statistic = 1.78037

The R Square value indicates that 46.9% of the observed variability has been captured by the fitted model.

Step 3: Get the fitted line plot for sales and price.

Stat → Regression → Fitted Line Plot

Step 4: Generated output of MINITAB is given below.

[Figure: Fitted line plot. Sales = 225.7 - 1.200 Price; S = 23.3075, R-Sq = 46.9%, R-Sq(adj) = 42.5%]

Step 5: Develop a column for Price².
Price Sales Price^2
65 132 4225
65 141 4225
65 153 4225
65 158 4225
65 164 4225
85 172 7225
85 94 7225
85 100 7225
85 110 7225
105 118 11025
105 127 11025
105 76 11025
105 85 11025
105 102 11025

3.2.1 Test the significance of the model

Step 6: Develop a regression model for sales, price and price².

The R Square value indicates that 47.9% of the observed variability has been captured by the fitted model. The percentage of variability explained by the model has thus increased by 1.0%.

3.2.2 Test the significance of the parameters

According to the above results, the p column indicates that the coefficients of the 'Price' and 'Price²' variables are not statistically significant, as the sig. (p) values are greater than 0.05. The fitted model can be written as

Sales = 340 – 4.01 Price + 0.0165 Price²
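The quadratic fit can be reproduced with NumPy's polynomial least squares routine; the sketch below (an illustrative addition) uses the Example 3.1 data and should return coefficients close to those above.

import numpy as np

price = np.array([65, 65, 65, 65, 65, 85, 85, 85, 85, 105, 105, 105, 105, 105])
sales = np.array([132, 141, 153, 158, 164, 172, 94, 100, 110,
                  118, 127, 76, 85, 102])

# polyfit returns coefficients from the highest power down.
b2, b1, b0 = np.polyfit(price, sales, deg=2)
print(round(b0, 1), round(b1, 3), round(b2, 5))  # expected: ~340.2, -4.005, 0.0165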

Step 7: Get the fitted line plot for sales, price and price².

[Figure: Fitted line plot. Sales = 340.2 - 4.005 Price + 0.01650 Price²; S = 24.1104, R-Sq = 47.9%, R-Sq(adj) = 38.5%]

Once the model has been developed, the error assumptions need to be tested in order to validate the model for prediction or forecasting.

 Test for the randomness of errors

Durbin-Watson statistic = 1.82865

As the Durbin-Watson statistic of 1.82865 is close to 2.0, it can be concluded that the errors are random (if the errors are random, the DW statistic ≈ 2.0).

 Test for the mean of errors

According to the graph below, the error mean is -1.11657E-13, i.e. the mean of the errors is effectively zero.

 Test for normality of errors

[Figure: Normal probability plot of RESI1. Mean = -1.11657E-13, StDev = 22.18, N = 14, AD = 0.368, P-Value = 0.378]

As indicated in the above graph, the Anderson-Darling (AD) test is used to test the normality-of-errors assumption. The value of the test statistic (AD = 0.368) is not significant, as the P value is 0.378 (P > 0.05). Therefore it can be claimed that the errors are normally distributed. The probability plot also shows that the residuals do not deviate from the line.

 Test for the error constant variance

According to the graph below, it is clear that the data points are scattered randomly. This confirms that the errors have a constant variance.

[Figure: Residuals versus the fitted values (response is Sales)]

Conclusion: According to the above observations, the fitted model (Sales = 340 – 4.01 Price + 0.0165 Price²) fulfils the residual assumptions. Therefore, this model can be used for prediction and forecasting purposes.

CHAPTER FOUR: NON LINEAR REGRESSION

4.1 Introduction

Regression models that are not linear in the parameters are called nonlinear. Nonlinear models can be grouped into two types, namely intrinsically linear and intrinsically nonlinear models.

4.1.1 Intrinsically linear models

These include models that can be transformed into a linear form, such as:

(i) Exponential growth model: $y = ab^x$, which can be transformed to log(y) = log(a) + x log(b)

(ii) Exponential decay model: $y = ab^{-x}$, which can be transformed to log(y) = log(a) - x log(b)

(iii) Power model: $y = ax^b$, which can be transformed to log(y) = log(a) + b log(x)

(iv) Reciprocal model: $y = \dfrac{1}{a + bx}$; thus $\dfrac{1}{y} = a + bx$

(v) Exponential model: $y = e^{a + bt}$, which can be written as log(y) = a + bt

4.1.2 Intrinsically non linear models

If a nonlinear model cannot be expressed in a linear form, it is intrinsically nonlinear.

4.2 Exponential model

The exponential model is a form of nonlinear regression model of the form $y = ae^{bX}$, where Y is the dependent variable, X is the independent variable, and a and b are coefficients.

Example 4.1: Develop a model for the data given below. Carry out
diagnostic tests to confirm the validity of the model.

t y
1 355
2 211
3 197
4 166
5 142
6 106
7 104
8 60
9 56
10 38
11 36
12 32
13 21
14 19
15 15

Step 1: Get the correlation of t and y.

Correlations: t, y

Pearson correlation of t and y = -0.907


P-Value = 0.000

The results indicate that the correlation coefficient is close to -1.0, which confirms that there is a strong negative linear relationship between these two variables.

Step 2: Develop a regression model for t and y.

The R Square value indicates that 82.3% of the observed variability has been captured by the fitted model. Further, in order to find the relationship, a regression analysis was carried out.

Step 3: Test the significance of the parameters

The above result confirms that the parameters are significant, as the respective p values are < 0.05. Thus the fitted model can be written as follows:

y = 260 - 19.5 t

The above formula (y = 260 - 19.5 t) indicates that a unit increase in t would decrease y by 19.5 units.

Step 4: Get the fitted line plot for y and t.

[Figure: Fitted line plot. y = 259.6 - 19.46 t; S = 41.8324, R-Sq = 82.3%, R-Sq(adj) = 81.0%]

The above graph illustrates that y decreases as t increases. This apparent linear relationship is described by the model equation (y = 259.6 - 19.46 t) shown with the graph and supported by the high R-Squared value (82.3%).

Diagnostic Testing for Errors

 Randomness assumption

The observed DW statistic = 0.803288

As the Durbin-Watson statistic is not close to 2.0, it can be concluded that the errors are non-random.

 Constant Variance assumption

As shown in the graph below, the residuals have been plotted against the fitted values to test whether the errors have a constant variance.

[Figure: Residuals versus the fitted values (response is y)]

The above plot indicates that the data points are not scattered randomly, so it can be concluded that the errors do not have a constant variance.

 Normality assumption

The following graph is used to test whether the errors are normally distributed.

[Figure: Normal probability plot of RESI1. Mean = -6.63173E-14, StDev = 40.31, N = 15, AD = 0.870, P-Value = 0.019]

As indicated in the above graph, the Anderson-Darling test is used to test the null hypothesis (H0: errors are normally distributed). The value of the test statistic (AD = 0.870) is significant (P = 0.019). Thus H0 is rejected, and it can therefore be claimed that the errors are not normally distributed.

Conclusion: As the error assumptions discussed above are violated, the fitted model cannot be accepted.

Step 1: Get the log of the 'y' variable.

Calc → Calculator → Store result in 'LOG y' → Expression: LOGE(y) → Press OK.

Data set with LOG y

t y LOG y
1 355 5.87212
2 211 5.35186
3 197 5.2832
4 166 5.11199
5 142 4.95583
6 106 4.66344
7 104 4.64439
8 60 4.09434
9 56 4.02535
10 38 3.63759
11 36 3.58352
12 32 3.46574
13 21 3.04452
14 19 2.94444
15 15 2.70805

Step 2: Get the correlation of t and LOG y.

Correlations: t, LOG y

Pearson correlation of LOG y and t = -0.991


P-Value = 0.000

The correlation coefficient between LOG y and t is significant (r = -0.991, p = 0.000), confirming that it is significantly less than zero. In fact, the correlation coefficient confirms that there is a strong negative linear relationship between these two variables. Further, in order to find the relationship, a regression analysis was carried out.

4.2.1 Test the significance of the model

Step 3: Develop a regression model for t and LOG y.

Analysis of Variance of the fitted model


Source DF SS MS F P
Regression 1 8.8124 8.8124 622.34 0.000
Residual Error 11 0.1558 0.0142
Total 12 8.9682

ANOVA Table is used to test the null hypothesis. (i.e. H0 : Model is not
significant). Above result indicates that F value (622.34) is significant as
the corresponding P value (0.000) is less than 0.05. Therefore, it can be
concluded with 95% confidence that the fitted model is significant.

The R² of the fitted model is 98.3%, indicating that 98.3% of the observed variability has been captured by the fitted model. The significance tests of the parameters are shown below.

4.2.2 Test the significance of the parameters

Results of the significance of the parameters in the model

Predictor Coef SE Coef T P


Constant 5.98138 0.07001 85.43 0.000
t -0.220045 0.008821 -24.95 0.000

The above results indicate that the P values for the parameters are less than 0.05. Therefore, we can conclude that the parameters are significantly different from zero. Thus the fitted model is

LOG y = 5.98 - 0.220 t

Back-transforming, y = e^(5.98138) × e^(-0.220045 t), i.e.

y = 395.836 e^(-0.22t)
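The log-linear fit and back-transformation can be sketched in Python as below (an illustrative addition; it fits all 15 observations of Example 4.1, so the estimates are only roughly comparable to the book's values from the final MINITAB run).

import numpy as np

t = np.arange(1, 16)
y = np.array([355, 211, 197, 166, 142, 106, 104, 60, 56, 38, 36, 32, 21, 19, 15])

# Linearize: log(y) = log(a) + b*t, fit by least squares, then back-transform a.
b, log_a = np.polyfit(t, np.log(y), deg=1)
print("a =", np.exp(log_a), " b =", b)  # roughly a ~ 392 and b ~ -0.22,
                                        # close to the book's 395.836 and -0.22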

The line of best fit corresponding to the model equation is indicated
below.

[Figure: Fitted line plot. LOG y = 5.981 - 0.2200 t; S = 0.118996, R-Sq = 98.3%, R-Sq(adj) = 98.1%]

As shown in the above graph, LOG y decreases as t increases. This relationship is described by the model equation (LOG y = 5.981 - 0.220 t) shown with the graph and supported by the high R-Square value (98.3%).

4.2.3 Diagnostic Testing for Errors

 Randomness assumption

The observed DW statistic = 2.59032

As the Durbin-Watson statistic is reasonably close to 2.0, it can be concluded that the errors are random.

 Constant variance assumption

As shown in the graph below, the residuals have been plotted against the fitted values to test whether the errors have a constant variance.

[Figure: Residuals versus the fitted values (response is LOG y)]

The plot of residuals versus fitted values indicates that the data points are scattered randomly, so it can be concluded that the errors have a constant variance.

 Normality assumption

The following graph is used to test whether the errors are normally distributed.

[Figure: Normal probability plot of RESI2. Mean = -1.16146E-15, StDev = 0.1139, N = 13, AD = 0.162, P-Value = 0.928]

As indicated in the above graph, the Anderson-Darling test is used to test the null hypothesis (H0: errors are normally distributed). The value of the test statistic (AD = 0.162) is not significant (P = 0.928). Thus H0 is not rejected, and it can therefore be claimed that the errors are normally distributed.

Thus the fitted model (y = 395.836 e^(-0.22t)) is statistically validated, and the model can be recommended for forecasting.

