
STA404

Regression & Correlation Analysis


Lesson 6

Multiple Linear Regression Analysis

Why move to multiple linear
regression?
Why is one variable NOT enough?
• Yield of crops depends on … ???
– Fertility of land
– Fertilizer applied
– Rainfall
– Quality of seeds
• Sales of a company depend on … ???
– Price
– Demand
– Advertisement

Example: Concept
• Suppose a nursing instructor wishes to see whether there is a relationship
between a student’s grade point average (GPA), age, and score on the
board examination. The two independent variables are GPA (denoted by
x1) and age (denoted by x2).

• The instructor will collect the data for all three variables for a sample of
nursing students.

• Rather than conduct two separate simple regression studies, one using
the GPA and state board scores and another using ages and state board
scores, the instructor can conduct one study using multiple regression
analysis with two independent variables— GPA and ages—and one
dependent variable—state board scores.
Example: Looks like
Simple >>> Multiple Regression

Simple Linear Regression

• In simple linear regression, the regression equation contains one
independent variable x and one dependent variable y and is written as

$\hat{Y} = a + bX$

Multiple Linear Regression

• A multiple regression equation with two independent variables (x1 and
x2) and one dependent variable has the form:

$\hat{Y} = a + b_1 X_1 + b_2 X_2$

General form
• A multiple regression equation with three independent variables (x1, x2
and x3) and one dependent variable has the form:

$\hat{Y} = a + b_1 X_1 + b_2 X_2 + b_3 X_3$

• The general form of the multiple regression equation with k independent
variables is

$\hat{Y} = a + b_1 X_1 + b_2 X_2 + \cdots + b_k X_k$

• Y is called the dependent variable.
• The x’s are the independent variables.
• a is the intercept (the constant term).
• The b’s are called partial regression coefficients.
• k is the number of independent variables.
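
As a rough illustration of the general form above, the sketch below estimates a and the b's by least squares (solving the normal equations) using only the Python standard library. The function names `fit_multiple_regression` and `predict` are hypothetical helpers, not part of the lesson. The data are made up and built exactly from Y = 1 + 2·X1 + 3·X2, so the fit should simply recover those values. Real data would include error terms.

```python
def fit_multiple_regression(xs, y):
    """Least-squares estimates [a, b1, ..., bk] for Y-hat = a + b1*X1 + ... + bk*Xk.

    xs: list of rows, each row holding the k predictor values of one observation.
    y:  list of observed responses, one per row.
    """
    n, k = len(xs), len(xs[0])
    p = k + 1
    # Design matrix: a leading 1 in every row carries the intercept a.
    design = [[1.0] + [float(v) for v in row] for row in xs]
    # Normal equations (X'X) beta = X'y, stored as an augmented matrix.
    aug = [
        [sum(design[r][i] * design[r][j] for r in range(n)) for j in range(p)]
        + [sum(design[r][i] * y[r] for r in range(n))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear; no unique solution")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v_r - factor * v_c for v_r, v_c in zip(aug[r], aug[col])]
    return [aug[i][p] / aug[i][i] for i in range(p)]


def predict(coefs, row):
    """Evaluate Y-hat = a + b1*X1 + ... + bk*Xk for one observation."""
    return coefs[0] + sum(b * x for b, x in zip(coefs[1:], row))


# Hypothetical data generated exactly from Y = 1 + 2*X1 + 3*X2 (no error term).
xs = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in xs]

a, b1, b2 = fit_multiple_regression(xs, y)
print("a =", round(a, 4), " b1 =", round(b1, 4), " b2 =", round(b2, 4))
print("prediction at X1=2, X2=2:", round(predict([a, b1, b2], [2, 2]), 4))
```

In practice a statistics package would normally do this fitting. The hand-written solver is only meant to show that the partial regression coefficients come from one joint fit, not from k separate simple regressions.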
Assumptions
• The assumptions for multiple linear regression are similar to those
for simple linear regression, as already discussed. In summary
form:

• The mean of the responses is a Linear function of the x variables.


• The errors are Independent.
• The errors are Normally distributed.
• The errors have Equal variances (denoted σ²) for all x values.

• For multiple regression, we add one more:

• The independent variables (for example, X1 and X2) are assumed
NOT to be correlated with each other.
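
One simple way to look at the assumption that X1 and X2 are not correlated is to compute the Pearson correlation coefficient between them before fitting. The sketch below does this with the standard library only. The helper name `pearson_r` and the GPA and age values are hypothetical, chosen purely for illustration.

```python
from math import sqrt


def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / sqrt(s_uu * s_vv)


# Hypothetical values for the two independent variables.
gpa = [3.2, 2.7, 2.5, 3.4, 2.2]   # X1
age = [22, 27, 24, 28, 23]        # X2

r = pearson_r(gpa, age)
print("correlation between X1 and X2:", round(r, 3))
```

A value of r close to 0 is in line with the assumption. A value close to +1 or -1 suggests the two variables carry largely the same information, which makes the individual b's hard to interpret.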
Summary
• Multiple: because we are dealing with more
than one independent variable.
• With TWO variables it looks like this:
$\hat{Y} = a + b_1 X_1 + b_2 X_2$
• The coefficients are called partial regression coefficients.
• The assumptions are similar to those of simple linear
regression.
