
Multiple regression

MULTIPLE REGRESSION
• Multiple regression is not just one technique but a family of techniques that
can be used to explore the relationship between one continuous dependent
variable and a number of independent variables or predictors

• Multiple regression is based on correlation, but allows a more sophisticated
exploration of the interrelationship among a set of variables.
Caution Needs to Be Taken

• You cannot just throw variables into a multiple regression and hope that,
magically, answers will appear. You should have a sound theoretical or
conceptual reason for the analysis and, in particular, the order of variables
entering the equation
Address a variety of research questions
It can tell you how well a set of variables is able to predict a particular outcome

• For example, you may be interested in exploring how well a set of subscales on an intelligence
test is able to predict performance on a specific task

• Multiple regression will provide you with information about the model as a whole (all subscales)
and the relative contribution of each of the variables that make up the model (individual
subscales).

• As an extension of this, multiple regression will allow you to test whether adding a variable (e.g.
motivation) contributes to the predictive ability of the model, over and above those variables
already included in the model.
Multiple Regression: Main Points

• how well a set of variables is able to predict a particular outcome

• which variable in a set of variables is the best predictor of an outcome

• whether a particular predictor variable is still able to predict an outcome when
the effects of another variable are controlled for (e.g. socially desirable
responding).
ASSUMPTIONS OF MULTIPLE REGRESSION
• Normality,
• linearity,
• homoscedasticity,
• independence of residuals
DETAILS OF EXAMPLE

• To illustrate the use of multiple regression, I will be using a
series of examples taken from the survey4ED.sav data file.


• The survey was designed to explore the factors that affect respondents’ psychological
adjustment and wellbeing.

• I will be exploring the impact of respondents’ perceptions of control on their levels of
perceived stress.

• The literature in this area suggests that if people feel that they are in control of their lives, they
are less likely to experience ‘stress’.
In the questionnaire, there were two different measures of control

The first is the Mastery Scale, which measures the degree to which people feel
they have control over the events in their lives.

The second is the Perceived Control of Internal States Scale (PCOISS), which measures the
degree to which people feel they have control over their internal states (their
emotions, thoughts and physical reactions).
• In this example, I am interested in exploring how well the Mastery
Scale and the PCOISS are able to predict scores on a measure of
perceived stress.
Variables:

Total perceived stress (tpstress): total score on the Perceived Stress Scale. High scores indicate
high levels of stress.

Total Perceived Control of Internal States (tpcoiss): total score on the Perceived Control of
Internal States Scale. High scores indicate greater control over internal states.

Total Mastery (tmast): total score on the Mastery Scale. High scores indicate higher levels of
perceived control over events and circumstances.

Total Social Desirability (tmarlow): total scores on the Marlowe-Crowne Social Desirability
Scale, which measures the degree to which people try to present themselves in a positive light.

Age: age in years.
Example of research questions:

1. How well do the two measures of control (mastery, PCOISS) predict perceived
stress? How much variance in perceived stress scores can be explained by
scores on these two scales?

2. Which is the best predictor of perceived stress: control of external events
(Mastery Scale) or control of internal states (PCOISS)?

3. If we control for the possible effect of age and socially desirable responding, is
this set of variables still able to predict a significant amount of the variance in
perceived stress?
What you need
1. One continuous dependent variable (Total perceived stress).

2. Two or more continuous independent variables (mastery, PCOISS).
(You can also use dichotomous independent variables, e.g. males=1,
females=2.)
What it does:

• Multiple regression tells you how much of the variance in your dependent
variable can be explained by your independent variables.

• It also gives you an indication of the relative contribution of each
independent variable.

• Tests allow you to determine the statistical significance of the results, in
terms of both the model itself and the individual independent variables.
STANDARD MULTIPLE REGRESSION

Question 1: How well do the two measures of control (mastery, PCOISS) predict perceived
stress? How much variance in perceived stress scores can be explained by scores on these two
scales?

Question 2: Which is the best predictor of perceived stress: control of external events (Mastery
Scale) or control of internal states (PCOISS)?
Step 1: Checking the assumptions

Check that your independent variables show at
least some relationship with your dependent
variable (above .3 preferably).
In this case, both of the scales (Total Mastery and Total
PCOISS) correlate substantially with Total perceived stress
(–.61 and –.58 respectively).

Also check that the correlation between each pair of independent variables is not too high. You
probably don’t want to include two variables with a bivariate correlation of .7 or more in
the same analysis.

If you find yourself in this situation, you may need to consider omitting one of the variables or
forming a composite variable from the scores of the two highly correlated variables. In the
example presented here the correlation between Total Mastery and Total PCOISS is .52, which is
less than .7; therefore all variables will be retained.
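
The two correlation checks in Step 1 can be sketched in a few lines of Python. This is only an illustration using the correlations quoted above; the dictionary names and the constant names are assumptions made for this sketch and are not part of the SPSS output.

```python
# Correlations quoted in the text (tpstress is the dependent variable).
r_with_stress = {"tmast": -0.61, "tpcoiss": -0.58}
r_between_predictors = {("tmast", "tpcoiss"): 0.52}

MIN_R_WITH_DV = 0.3   # rule of thumb: "above .3 preferably"
MAX_R_BETWEEN = 0.7   # rule of thumb: ".7 or more" is too high

# Predictors that show too little relationship with the dependent variable.
weak = [name for name, r in r_with_stress.items() if abs(r) < MIN_R_WITH_DV]

# Pairs of predictors that are too highly correlated with each other.
redundant = [pair for pair, r in r_between_predictors.items()
             if abs(r) >= MAX_R_BETWEEN]

print("weakly related to the dependent variable:", weak)
print("too highly correlated with each other:", redundant)
```

Both lists come back empty for this example, which matches the decision to retain both scales.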
Collinearity Diagnostics

Tolerance is an indicator of how much of the variability of the specified independent
variable is not explained by the other independent variables in the model and is calculated
using the formula 1 – R squared for each variable. If this value is very small (less than
.10) it indicates that the multiple correlation with other variables is high, suggesting
the possibility of multicollinearity.
The other value given is the VIF (Variance inflation factor), which is just the inverse
of the Tolerance value (1 divided by Tolerance). VIF values above 10 would be a
concern here, indicating multicollinearity.
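
As a rough worked example of these two formulas: with only two independent variables, the R squared for one predictor given the other is simply their squared bivariate correlation. The sketch below assumes the .52 correlation between Total Mastery and Total PCOISS quoted earlier; the Tolerance and VIF printed in the actual SPSS output are not given here and may differ slightly because of rounding.

```python
# Correlation between the two predictors, as quoted in the text.
r_between_predictors = 0.52

# With exactly two predictors, R squared for one predictor regressed on
# the other equals the squared bivariate correlation.
r_squared_other = r_between_predictors ** 2

tolerance = 1 - r_squared_other   # 1 - R squared
vif = 1 / tolerance               # inverse of Tolerance

print(round(tolerance, 2), round(vif, 2))

# Rules of thumb from the text.
multicollinearity_suspected = tolerance < 0.10 or vif > 10
print(multicollinearity_suspected)
```

Tolerance comes out at about .73 and VIF at about 1.37, well clear of both cut-offs.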
Step 2: Evaluating the model

R Square. This tells you how much of the variance in the dependent
variable (perceived stress) is explained by the model (which includes the
variables of Total Mastery and Total PCOISS). In this case, the value is
.468. Expressed as a percentage, the model explains 46.8 per cent of the
variance in perceived stress.
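
For a model with exactly two predictors, R Square can be recovered from the three bivariate correlations alone. The sketch below uses that standard two-predictor formula with the rounded correlations quoted in Step 1 (–.61, –.58 and .52). It is a cross-check under those assumptions, not the way SPSS computes the value, and the variable names are chosen only for this illustration.

```python
# Rounded correlations quoted in the text.
r_stress_mastery = -0.61   # tpstress with tmast
r_stress_pcoiss = -0.58    # tpstress with tpcoiss
r_mastery_pcoiss = 0.52    # tmast with tpcoiss

# Two-predictor formula for the squared multiple correlation.
numerator = (r_stress_mastery ** 2 + r_stress_pcoiss ** 2
             - 2 * r_stress_mastery * r_stress_pcoiss * r_mastery_pcoiss)
r_squared = numerator / (1 - r_mastery_pcoiss ** 2)

print(round(r_squared, 3))
print(f"{r_squared * 100:.1f} per cent of variance explained")
```

The result is about .467, which agrees with the reported .468 to within the rounding of the input correlations.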
Step 3: Evaluating each of the independent variables

This step asks which of the variables included in the model contributed to the prediction of the
dependent variable.

We are interested in comparing the contribution of each independent variable;
therefore we will use the beta values.
Look down the Beta column and find which beta value is the largest (ignoring any negative signs out
the front).

In this case the largest beta coefficient is –.42, which is for Total Mastery.

This means that this variable makes the strongest unique contribution to explaining the dependent
variable, when the variance explained by all other variables in the model is controlled for. The Beta
value for Total PCOISS was slightly lower (–.36), indicating that it made less of a unique contribution.
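
The beta values above can be reproduced, approximately, from the bivariate correlations. The sketch below uses the standard formula for standardised coefficients in a two-predictor model and the rounded correlations from Step 1; the dictionary keys reuse the variable names from the data file, and everything else is naming chosen for this illustration.

```python
# Rounded correlations quoted in the text.
r_stress_mastery = -0.61
r_stress_pcoiss = -0.58
r_mastery_pcoiss = 0.52

# Standardised regression coefficients for a two-predictor model.
denominator = 1 - r_mastery_pcoiss ** 2
betas = {
    "tmast": (r_stress_mastery - r_stress_pcoiss * r_mastery_pcoiss) / denominator,
    "tpcoiss": (r_stress_pcoiss - r_stress_mastery * r_mastery_pcoiss) / denominator,
}

# Compare absolute sizes, ignoring the negative signs.
strongest = max(betas, key=lambda name: abs(betas[name]))

for name, beta in betas.items():
    print(name, round(beta, 2))
print("largest beta:", strongest)
```

Rounded to two decimals this gives –.42 for Total Mastery and –.36 for Total PCOISS, in line with the Beta column described above.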
For each of these variables, check the value in the column marked Sig. This tells
you whether this variable is making a statistically
significant unique contribution to the equation.

If the Sig. value is less than .05 (.01, .0001, etc.), the
variable is making a significant unique contribution to
the prediction of the dependent variable.

In this case, both Total Mastery and Total PCOISS made
a unique, and statistically significant, contribution to the
prediction of perceived stress scores.
In this example, the Mastery Scale has a part correlation coefficient of –.36. If we square this (multiply it
by itself) we get .13, indicating that Mastery uniquely explains 13 per cent of the variance in Total
perceived stress scores. For the PCOISS the value is –.31, which squared gives us .09, indicating a unique
contribution of 9 per cent to the explanation of variance in perceived stress.
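
The part (semipartial) correlations can also be approximated from the three bivariate correlations in the two-predictor case. The sketch below is an illustration under that assumption, using the rounded correlations from Step 1 and names chosen for this example. It also suggests why the PCOISS figure squares to .09 rather than .10: the unrounded part correlation is slightly smaller in size than –.31.

```python
import math

# Rounded correlations quoted in the text.
r_stress_mastery = -0.61
r_stress_pcoiss = -0.58
r_mastery_pcoiss = 0.52

# Part correlation of each predictor with the dependent variable, after
# removing the other predictor from that predictor only.
scale = math.sqrt(1 - r_mastery_pcoiss ** 2)
part_mastery = (r_stress_mastery - r_stress_pcoiss * r_mastery_pcoiss) / scale
part_pcoiss = (r_stress_pcoiss - r_stress_mastery * r_mastery_pcoiss) / scale

# Squaring gives the proportion of variance uniquely explained.
unique_mastery = part_mastery ** 2
unique_pcoiss = part_pcoiss ** 2

print(round(part_mastery, 2), round(unique_mastery, 2))
print(round(part_pcoiss, 2), round(unique_pcoiss, 2))
```

The two unique contributions do not add up to R Square, because the variance the two scales share in predicting stress is not counted in either squared part correlation.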
The next thing we want to know is which of the variables included in the model
contributed to the prediction of the dependent variable. We find this information in
the output box labelled Coefficients. Look in the column labelled Beta under
Standardised Coefficients. To compare the different variables it is important that
you look at the standardised coefficients, not the unstandardised ones.
‘Standardised’ means that these values for each of the different variables have been
converted to the same scale so that you can compare them. If you were interested
in constructing a regression equation, you would use the unstandardised coefficient
values listed as B.
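
To make the distinction concrete, the sketch below shows how a standardised beta relates to an unstandardised B (B equals beta multiplied by the standard deviation of the outcome and divided by the standard deviation of the predictor) and how the B values are used in a regression equation. All numbers in it, including the intercept, the standard deviations and the scores, are hypothetical values invented for illustration; they are not taken from the survey data or the SPSS output, and the function names are likewise assumptions.

```python
def unstandardised_b(beta, sd_predictor, sd_outcome):
    """Convert a standardised beta into an unstandardised B for one predictor."""
    return beta * sd_outcome / sd_predictor


def predict(intercept, b_values, scores):
    """Regression equation: intercept plus the sum of B times each raw score."""
    return intercept + sum(b_values[name] * scores[name] for name in b_values)


# HYPOTHETICAL numbers, for illustration only (not from the survey data).
b_values = {
    "tmast": unstandardised_b(beta=-0.5, sd_predictor=4.0, sd_outcome=6.0),
    "tpcoiss": unstandardised_b(beta=-0.25, sd_predictor=12.0, sd_outcome=6.0),
}
predicted_stress = predict(
    intercept=50.0,
    b_values=b_values,
    scores={"tmast": 20, "tpcoiss": 60},
)

print(b_values)
print(predicted_stress)
```

Because each B is expressed in the raw units of its own scale, the B values suit prediction but not comparison between predictors, which is why the Beta column is used for comparison.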
