CHAPTER 5
BIVARIATE ANALYSIS
LEARNING OUTCOMES
At the end of this topic, students are expected to be able to:
12/6/2020
INTRODUCTION
o Regression and correlation analysis are used to describe a relationship
between dependent variable and independent variable(s).
o Example:
a) To describe the relationship between household expenditure and
number of children.
b) To predict the number of complaints based on type of room.
c) To investigate the relationship between house price and house area,
number of bedrooms, number of bathrooms and location of the
house.
SCATTER DIAGRAM
o In a scatter diagram, the independent variable is plotted along the
horizontal X-axis and the dependent variable is plotted along the
vertical Y-axis.
o The less scattered the points in the scatter diagram, the higher is
the degree of relationship between the dependent and
independent variable.
TYPES OF RELATIONSHIP
[Figure: scatter diagrams of Y against X illustrating a positive linear relationship, a negative linear relationship, no relationship, and two nonlinear relationships]
Example 1
A dietician wants to check the association between height (in inches)
and weight (in pounds) of a baseball player. A sample of 9 major
league baseball players is selected at random. The data were
recorded as follows:
Height 73 69 72 70 72 66 72 72 74
Weight 201 170 180 200 190 175 205 185 186
Solution
[Figure: scatter diagram of Weight (lbs) on the vertical axis against Height (in) on the horizontal axis for the 9 players]
CORRELATION ANALYSIS
o In a correlation analysis, the correlation between two variables
is measured by using a linear correlation coefficient, r.
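o The Pearson correlation coefficient is calculated as follows:
$$ r = \frac{SS_{XY}}{\sqrt{SS_{XX}\,SS_{YY}}} $$
Where,
$$ SS_{XY} = \sum XY - \frac{(\sum X)(\sum Y)}{n} $$
$$ SS_{XX} = \sum X^2 - \frac{(\sum X)^2}{n} $$
$$ SS_{YY} = \sum Y^2 - \frac{(\sum Y)^2}{n} $$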
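As a rough sketch of the formula above, the following Python function computes r from the six summary totals. The function name `pearson_r_from_totals` and the three-point data set are made up for illustration and are not part of the chapter.

```python
from math import sqrt

def pearson_r_from_totals(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Return r = SS_XY / sqrt(SS_XX * SS_YY) from summary totals."""
    ss_xy = sum_xy - sum_x * sum_y / n
    ss_xx = sum_x2 - sum_x ** 2 / n
    ss_yy = sum_y2 - sum_y ** 2 / n
    return ss_xy / sqrt(ss_xx * ss_yy)

# Hypothetical data lying exactly on a line: X = 1, 2, 3 and Y = 2, 4, 6
r = pearson_r_from_totals(n=3, sum_x=6, sum_y=12, sum_xy=28, sum_x2=14, sum_y2=56)
print(r)  # 1.0, a perfect positive linear relationship
```

Working from totals mirrors the calculator workflow used in the examples, where only the sums are read off the screen.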
o r can only take a value between -1 and 1 inclusive, i.e. -1 ≤ r ≤ 1
o Positive value of r indicates that there is a positive correlation between
Y and X (Y increases as X increases)
o Negative value of r indicates that there is a negative correlation
between Y and X (Y decreases as X increases)
o How to determine the strength of a correlation based on the value of r?
Example 2
Recall the weight and height of 9 baseball players given in Example 1.
Compute the Pearson coefficient of correlation and interpret the
result.
Height 73 69 72 70 72 66 72 72 74
Weight 201 170 180 200 190 175 205 185 186
SOLUTION
Find the following summary of totals by using calculator.
$\sum Y$ = _____   $\sum Y^2$ = _____   $\sum XY$ = _____
$\sum X$ = _____   $\sum X^2$ = _____   n = _____
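For readers without a regression-mode calculator, the same totals can be obtained with a few lines of Python. This is only a sketch; the variable names are chosen here for illustration.

```python
# Height (X) and weight (Y) of the 9 players from Example 1
height = [73, 69, 72, 70, 72, 66, 72, 72, 74]
weight = [201, 170, 180, 200, 190, 175, 205, 185, 186]

n = len(height)
sum_x = sum(height)
sum_y = sum(weight)
sum_x2 = sum(x * x for x in height)
sum_y2 = sum(y * y for y in weight)
sum_xy = sum(x * y for x, y in zip(height, weight))

print(n, sum_x, sum_x2, sum_y, sum_y2, sum_xy)
```

The printed values should agree with the totals read from the calculator.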
Instructions:
1. Press MODE two times, then select REG (2) and LIN (1).
(A small indicator “REG” should appear at the top of the screen,
indicating that you are currently in regression mode.)
From calculator,
$\sum X^2$ = 45,558   $\sum X$ = 640   n = 9
$\sum Y^2$ = 319,272   $\sum Y$ = 1,692   $\sum XY$ = 120,437
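One possible way to carry the calculator totals through the Pearson formula is sketched below. The intermediate values are rounded to four decimal places.

```latex
\begin{align*}
SS_{XY} &= 120{,}437 - \frac{(640)(1{,}692)}{9} = 117 \\
SS_{XX} &= 45{,}558 - \frac{(640)^2}{9} \approx 46.8889 \\
SS_{YY} &= 319{,}272 - \frac{(1{,}692)^2}{9} = 1{,}176 \\
r &= \frac{117}{\sqrt{(46.8889)(1{,}176)}} \approx 0.498
\end{align*}
```

The positive sign indicates that weight tends to increase with height in this sample.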
Interpretation:
Example 3
Suppose we take a sample of seven households from a low to
moderate income neighborhood and collect information on their
incomes and food expenditures for the past month. Determine
whether there is any correlation between income and expenditures.
If yes, how strong is the correlation?
Solution
$\sum X^2$ = 7,222   $\sum X$ = 212   n = 7
$\sum Y^2$ = 646   $\sum Y$ = 64   $\sum XY$ = 2,150
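A sketch of the arithmetic from these totals, with intermediate values rounded to four decimal places:

```latex
\begin{align*}
SS_{XY} &= 2{,}150 - \frac{(212)(64)}{7} \approx 211.7143 \\
SS_{XX} &= 7{,}222 - \frac{(212)^2}{7} \approx 801.4286 \\
SS_{YY} &= 646 - \frac{(64)^2}{7} \approx 60.8571 \\
r &= \frac{211.7143}{\sqrt{(801.4286)(60.8571)}} \approx 0.959
\end{align*}
```

The value is positive and close to 1, so food expenditure rises with income in this sample.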
Interpretation:
REGRESSION ANALYSIS
o Multiple linear regression – analyzes the relationship
between one dependent variable and more than one
independent variable.
o Regression analysis produces an equation that expresses the
dependent variable (Y) as a function of the independent variable(s) (X).
o In general, a simple linear regression model is written as:
$$ y = A + Bx + \varepsilon $$
Where,
y = Dependent variable
x = Independent variable
A = Y-intercept
B = Slope of the regression line / Regression coefficient
$\varepsilon$ = Random error term
ASSUMPTIONS OF MODEL
ESTIMATED MODEL
o In the linear regression model, the values of A and B are
unknown. Therefore, we have to estimate their values by using
the least squares method.
ŷ = a + bx
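$$ b = \frac{SS_{xy}}{SS_{xx}} = \frac{\sum xy - \frac{(\sum x)(\sum y)}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}} $$
$$ a = \frac{\sum y}{n} - b\,\frac{\sum x}{n} \quad \text{or} \quad a = \bar{y} - b\bar{x} $$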
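The least squares formulas can be sketched in Python as follows. The function name `least_squares_line` and the four data points are hypothetical, chosen so that the answer is easy to check by eye.

```python
def least_squares_line(x, y):
    """Return (a, b) for the fitted line y-hat = a + b*x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    sum_x, sum_y = sum(x), sum(y)
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    ss_xx = sum(xi * xi for xi in x) - sum_x ** 2 / n
    b = ss_xy / ss_xx              # slope
    a = sum_y / n - b * sum_x / n  # intercept: y-bar minus b times x-bar
    return a, b

# Hypothetical points that lie exactly on y = 1 + 2x
a, b = least_squares_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)  # 1.0 2.0
```

Because the points sit exactly on a line, the fitted intercept and slope recover that line; with real data the line is the one that minimises the sum of squared vertical errors.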
Example 4
Refer to Example 1. Find the equation of the regression line and
interpret the values of the regression coefficient.
Solution
From the previous calculation,
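A sketch of how the numbers might be carried through, using the totals for the height and weight data (values rounded as shown):

```latex
\begin{align*}
b &= \frac{SS_{xy}}{SS_{xx}} = \frac{117}{46.8889} \approx 2.4952 \\
a &= \bar{y} - b\bar{x} = \frac{1{,}692}{9} - 2.4952\left(\frac{640}{9}\right) \approx 188 - 177.44 \approx 10.56 \\
\hat{y} &= 10.56 + 2.4952x
\end{align*}
```

Read this way, each additional inch of height is associated with an estimated increase of about 2.5 lbs in weight.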
Example 5
Refer to Example 3. Find the equation of the regression line and
interpret the values of the regression coefficient.
Solution
From the previous calculation,
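A sketch of the calculation from the Example 3 totals. Here SS_xx is derived from those totals, and the units of income and expenditure are left as in the original data table.

```latex
\begin{align*}
b &= \frac{SS_{xy}}{SS_{xx}} = \frac{211.7143}{801.4286} \approx 0.2642 \\
a &= \bar{y} - b\bar{x} = \frac{64}{7} - 0.2642\left(\frac{212}{7}\right) \approx 9.1429 - 8.0015 \approx 1.14 \\
\hat{y} &= 1.14 + 0.2642x
\end{align*}
```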
MAKING PREDICTION
o Given a value of X, we can predict the value of Y by using the
estimated regression equation.
Example 6
Using the equation of the regression line found in Example 4,
predict the weight of a baseball player whose height is 72 inches.
Solution
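A minimal Python sketch of the prediction step, assuming the line is fitted from the height and weight totals found earlier. The helper name `predict_weight` is made up for illustration.

```python
# Summary totals for the Example 1 data (height X, weight Y)
n, sum_x, sum_y = 9, 640, 1692
sum_xy, sum_x2 = 120437, 45558

ss_xy = sum_xy - sum_x * sum_y / n
ss_xx = sum_x2 - sum_x ** 2 / n
b = ss_xy / ss_xx              # slope
a = sum_y / n - b * sum_x / n  # intercept

def predict_weight(height_in):
    """Predicted weight (lbs) from the fitted line y-hat = a + b*x."""
    return a + b * height_in

predicted = predict_weight(72)
print(round(predicted, 1))  # about 190.2
```

Substituting x = 72 into the rounded equation by hand gives the same answer to one decimal place.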
Example 7
Using the equation of the regression line found in Example 5,
predict the food expenditure of a household with income $3500.
Solution
COEFFICIENT OF DETERMINATION
o Coefficient of determination, $r^2$, measures the proportion of the total
variation in Y that is explained by the independent variable, X.
$$ r^2 = b\,\frac{SS_{xy}}{SS_{yy}} = (\text{correlation coefficient})^2 $$
o For instance, if $r^2 = 0.80$, this means that 80% of the total
variation in Y is explained by X.
o If $r^2 > 0.80$, we say that the regression model fits the data well.
Hence, the model is useful in predicting the dependent variable.
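The link between the coefficient of determination and the correlation coefficient follows by substituting the least squares slope, as this short derivation sketches:

```latex
\begin{align*}
r^2 = b\,\frac{SS_{xy}}{SS_{yy}}
    = \frac{SS_{xy}}{SS_{xx}} \cdot \frac{SS_{xy}}{SS_{yy}}
    = \frac{SS_{xy}^{2}}{SS_{xx}\,SS_{yy}}
    = \left(\frac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}}\right)^{2}
\end{align*}
```

The last expression is the square of the Pearson correlation coefficient, so either route gives the same value.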
Example 8
Compute the coefficient of determination for the data given in
Example 1. Interpret the values.
Solution
From the previous calculation,
$SS_{xy}$ = 117   $SS_{yy}$ = 1,176   b = 2.4952
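Substituting these values into the formula gives, as a sketch:

```latex
\begin{align*}
r^2 = b\,\frac{SS_{xy}}{SS_{yy}} = 2.4952 \times \frac{117}{1{,}176} \approx 0.248
\end{align*}
```

About 24.8% of the total variation in weight is explained by height, which falls short of the 0.80 guideline stated earlier.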
Example 9
Compute the coefficient of determination for Example 3.
Interpret the values.
Solution
From the previous calculation,
$SS_{xy}$ = 211.7143   $SS_{yy}$ = 60.8571   b = 0.2642
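Substituting these values into the formula gives, as a sketch:

```latex
\begin{align*}
r^2 = b\,\frac{SS_{xy}}{SS_{yy}} = 0.2642 \times \frac{211.7143}{60.8571} \approx 0.919
\end{align*}
```

About 91.9% of the total variation in food expenditure is explained by income, which exceeds the 0.80 guideline stated earlier.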
Example 10
An observation was carried out to determine the relationship between the age of
a chef and the time (in minutes) needed to prepare a dish. The table below shows
the data recorded by eight randomly selected chefs.
Model Summary
Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .901   .811       .780                2.193
Coefficients
                Unstandardized Coefficients   Standardized Coefficients
Model           B         Std. Error          Beta                        t        Sig.
1  (Constant)   71.133    3.246                                           21.914   .000
   Age          -0.409    0.081               -0.901                      -5.078   .002
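One way to read such output is sketched below in Python: the B column supplies the intercept and slope of the estimated line, and each t value is B divided by its standard error. The helper `predict_time` and the age of 40 are hypothetical, since the exercise questions and data table are not reproduced here.

```python
# Values copied from the Coefficients table above
coefficients = {
    "(Constant)": {"B": 71.133, "std_error": 3.246},
    "Age": {"B": -0.409, "std_error": 0.081},
}

a = coefficients["(Constant)"]["B"]  # intercept
b = coefficients["Age"]["B"]         # slope

def predict_time(age):
    """Estimated preparation time (minutes) for a chef of the given age."""
    return a + b * age

# t is B divided by its standard error (up to rounding in the printed table)
t_values = {name: row["B"] / row["std_error"] for name, row in coefficients.items()}

print(f"y-hat = {a} + ({b}) * age")
print(round(predict_time(40), 3))  # age 40 is a hypothetical input
```

The negative slope suggests that the estimated preparation time falls by about 0.409 minutes for each additional year of age.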
Solution
a) Independent variable:
Dependent variable:
b)
Fiza owns a bakery shop. She has advertised the shop through newspaper and
magazine in order to persuade potential customers to buy her products. In an
attempt to analyze the relationship between advertising cost and sales, she
recorded the monthly advertising cost and sales (RM’00) for a sample of twelve
months. The data are listed in Table 1.
Fiza used the SPSS software to analyze the data. The following table and figure
show the results of regression analysis performed on the data.
Coefficients
                 Unstandardized Coefficients   Standardized Coefficients
Model            B         Std. Error          Beta                        t       Sig.
1  (Constant)    19.707    3.179                                           6.199   .000
   Advertising   .069      .008                .940                        8.726   .000
d) Interpret the slope of the regression line in the context of the problem.
Solution
A random sample of eight drivers insured with a company and having similar
auto insurance policies was selected. The following table lists their driving
experiences (in years) and monthly auto insurance premiums.
f) Draw the regression equation on the graph in a).
Solution
The table below shows the results obtained in investigating the relationship
between the amount (in millions of RM) spent on marketing and revenue (in millions
of RM) for a given year at six different hotels:
Model Summary
Model   R        R Square   Adjusted R Square   Std. Error of the Estimate
1       .992^a   .985       .981                3.586
a. Predictors: (Constant), Marketing
Coefficients^a
                Unstandardized Coefficients   Standardized Coefficients
Model           B         Std. Error          Beta                        t        Sig.
1  (Constant)   6.222     3.019                                           2.061    .108
   Marketing    18.309    1.148               .992                        15.951   .000
a. Dependent Variable: Revenue
e) Forecast the total revenues of a hotel that plans to spend RM5 million
on marketing for the year 2010.
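A sketch of the forecast, assuming the estimated line is read from the B column of the Coefficients table, with marketing spend x and revenue both in millions of RM:

```latex
\begin{align*}
\hat{y} &= 6.222 + 18.309x \\
\hat{y}\,\big|_{x=5} &= 6.222 + 18.309(5) = 97.767
\end{align*}
```

Under these assumptions the forecast revenue is about RM97.8 million.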
Solution
Coefficients^a
                Unstandardized Coefficients   Standardized Coefficients
Model           B         Std. Error          Beta                        t        Sig.
1  (Constant)   60.981    1.909                                           31.948   .000
   Experience   1.196     .125                .959                        9.558    .000
a. Dependent Variable: Scores
Solution
END OF SYLLABUS