
APPLIED REGRESSION ANALYSIS
(Lecture 1)

Rika Fitriani
Dept. of Mathematics
Universitas Gadjah Mada
Course Contents:
Simple linear regression and correlation, model adequacy checking, multiple
linear regression, indicator variables, variable selection and model building.

References:
Kutner, M.H., Neter, J., Nachtsheim, C.J. & Li, W. 2005. Applied Linear
Statistical Models. McGraw-Hill/Irwin, New York.
Montgomery, D.C., Peck, E.A. & Vining, G.G. 2012. Introduction to Linear
Regression Analysis. John Wiley & Sons, Inc., New Jersey.
Weisberg, S. 2014. Applied Linear Regression. John Wiley & Sons, Inc., New
Jersey.
The weight of assessment will be as follows:
• Quiz, assignment, laboratory work 30%
• Mid semester exam 35%
• Final exam 35%

Grade scale:
Introduction

Regression analysis is a statistical technique for investigating and modeling
the relationship between variables.

The relationship is expressed in the form of an equation or a model connecting
the response or dependent variable and one or more explanatory or predictor
variables.

We denote the response variable by $Y$ and the set of predictor variables by
$X_1, X_2, \ldots, X_p$, where $p$ denotes the number of predictor variables.
Introduction (2)

Applications of regression are numerous and occur in almost every field,
including engineering, the physical and chemical sciences, economics,
management, life and biological sciences, and the social sciences.

Examples:
• A real estate appraiser may wish to relate the sale price of a home
to selected characteristics of the building.
• We may wish to examine whether cigarette consumption is related
to various socioeconomic and demographic variables such as age,
education, income, and price of cigarettes.
• Suppose that a Regional Delivery Service wants to analyze the
relationship between the total travel time and both the miles traveled
and the number of deliveries on each trip.
Scatter Plot
An essential first step in regression analysis is to draw an appropriate graph
of the data. For two variables, this graph is called a scatter diagram or
scatter plot.

Example 1
Suppose that a Regional Delivery Service wants to analyze the relationship
between the total travel time and the miles traveled on each trip.

Miles Traveled    Travel Time (hrs)
89                6.9
66                5.4
78                5.4
111               7.4
44                3.5
77                5.6
80                6.1
66                4.9
109               7.3
76                6.4

a. Identify the predictor and the response.
b. Draw the scatter plot of miles traveled versus travel time.
Scatter Plot (2)
a. The predictor variable is miles traveled.
The response variable is travel time.
b. The scatter plot of miles traveled versus travel time:
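The plot itself is not reproduced here, so the following is a rough sketch of part b in Python, assuming only the standard library is available. `text_scatter` is a hypothetical helper introduced for illustration, not part of the lecture; it draws a coarse character grid with miles traveled on the horizontal axis and travel time on the vertical axis.

```python
# Data from Example 1: miles traveled (predictor) and travel time in hours (response).
miles = [89, 66, 78, 111, 44, 77, 80, 66, 109, 76]
hours = [6.9, 5.4, 5.4, 7.4, 3.5, 5.6, 6.1, 4.9, 7.3, 6.4]


def text_scatter(x, y, width=40, height=12):
    """Return a coarse character-grid scatter plot of y against x.

    Hypothetical helper for illustration; assumes x and y each contain
    at least two distinct values.
    """
    x_min, x_max = min(x), max(x)
    y_min, y_max = min(y), max(y)
    grid = [[" "] * width for _ in range(height)]
    for xi, yi in zip(x, y):
        # Scale each point to a column and row index inside the grid.
        col = round((xi - x_min) / (x_max - x_min) * (width - 1))
        row = round((yi - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # grid row 0 is the top of the plot
    # Left border on every row, then a bottom axis line.
    return ["|" + "".join(r) for r in grid] + ["+" + "-" * width]


plot_lines = text_scatter(miles, hours)
print("\n".join(plot_lines))
```

The points rise from the lower left to the upper right, which suggests a positive and roughly linear relationship between miles traveled and travel time.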
Regression vs Correlation
• Closely related to but conceptually very much different from regression
analysis is correlation analysis, where the primary objective is to
measure the strength or degree of linear association between two
variables.
• In regression analysis, as already noted, we are not primarily interested
in such a measure. Instead, we try to estimate or predict the average
value of one variable on the basis of the fixed values of other variables.
• In correlation analysis, on the other hand, we treat any (two) variables
symmetrically; there is no distinction between the dependent and
explanatory variables.
• In regression analysis there is an asymmetry in the way the dependent
and explanatory variables are treated. The dependent variable is
assumed to be random, that is, to have a probability distribution. The
explanatory variables, on the other hand, are assumed to have fixed
values.
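The symmetry of correlation and the asymmetry of regression can be checked numerically. The sketch below uses the Example 1 data from this lecture and assumes two things that are not stated above: the Pearson sample correlation coefficient, and ordinary least squares as the way a regression slope is estimated. The function names are illustrative only.

```python
# Example 1 data from this lecture.
miles = [89, 66, 78, 111, 44, 77, 80, 66, 109, 76]
hours = [6.9, 5.4, 5.4, 7.4, 3.5, 5.6, 6.1, 4.9, 7.3, 6.4]


def cross_dev(u, v):
    """Sum of products of deviations of u and v from their means."""
    u_bar = sum(u) / len(u)
    v_bar = sum(v) / len(v)
    return sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))


def cor(u, v):
    """Pearson sample correlation coefficient (assumed definition)."""
    return cross_dev(u, v) / (cross_dev(u, u) * cross_dev(v, v)) ** 0.5


def ls_slope(response, predictor):
    """Least-squares slope of response on predictor (assumed estimator)."""
    return cross_dev(predictor, response) / cross_dev(predictor, predictor)


# Correlation treats the two variables symmetrically.
r_xy = cor(miles, hours)
r_yx = cor(hours, miles)

# Regression does not: swapping the roles changes the fitted slope.
slope_hours_on_miles = ls_slope(hours, miles)
slope_miles_on_hours = ls_slope(miles, hours)

print(f"Cor(miles, hours) = {r_xy:.4f}, Cor(hours, miles) = {r_yx:.4f}")
print(f"slope of hours on miles = {slope_hours_on_miles:.4f}")
print(f"slope of miles on hours = {slope_miles_on_hours:.4f}")
```

Swapping the arguments leaves the correlation unchanged, while the two slopes differ and carry different units (hours per mile versus miles per hour of travel time).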
Correlation
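Notation for the data used in correlation:

Observation Number    Y      X
1                     y_1    x_1
2                     y_2    x_2
...                   ...    ...
n                     y_n    x_n

The correlation coefficient between X and Y is given by

$$\mathrm{Cor}(X, Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$

where $\bar{x}$ and $\bar{y}$ are the sample means of X and Y.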
Correlation (2)

Cor(X, Y) satisfies $-1 \le \mathrm{Cor}(X, Y) \le 1$.

The magnitude of Cor(X, Y) measures the strength of the linear relationship
between X and Y.

The closer Cor(X, Y) is to 1 or -1, the stronger the linear relationship
between X and Y.

The sign of Cor(X, Y) indicates the direction of the relationship between X
and Y. That is, Cor(X, Y) > 0 implies that X and Y are positively related.
Conversely, Cor(X, Y) < 0 implies that X and Y are negatively related.
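These properties can be checked on the Example 1 data. The sketch below assumes the Pearson sample correlation coefficient; the function name `cor` and the derived series are illustrative only and are not part of the lecture.

```python
import math

# Example 1 data from this lecture.
miles = [89, 66, 78, 111, 44, 77, 80, 66, 109, 76]
hours = [6.9, 5.4, 5.4, 7.4, 3.5, 5.6, 6.1, 4.9, 7.3, 6.4]


def cor(x, y):
    """Pearson sample correlation coefficient (assumed definition)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Observed data: positive sign, magnitude between 0 and 1.
r = cor(miles, hours)

# Reversing the direction of one variable flips the sign only.
r_flipped = cor(miles, [-h for h in hours])

# An exact increasing linear relationship gives the upper bound of 1.
r_perfect = cor(miles, [2 * m + 1 for m in miles])

print(f"r = {r:.4f}, flipped = {r_flipped:.4f}, perfect = {r_perfect:.4f}")
```

The magnitude is the same for the observed and the flipped series, which is the sense in which the sign carries the direction and the magnitude carries the strength.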
Correlation (3)

Example 2
Consider the data in Example 1. Calculate the correlation coefficient
between miles traveled and travel time.
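One way to carry out the calculation is sketched below in Python. It assumes the Pearson sample correlation coefficient and breaks the computation into the three sums of deviations; the names `s_xy`, `s_xx` and `s_yy` are notation introduced here for illustration.

```python
import math

# Example 1 data: x = miles traveled, y = travel time in hours.
x = [89, 66, 78, 111, 44, 77, 80, 66, 109, 76]
y = [6.9, 5.4, 5.4, 7.4, 3.5, 5.6, 6.1, 4.9, 7.3, 6.4]

n = len(x)
x_bar = sum(x) / n  # sample mean of miles traveled
y_bar = sum(y) / n  # sample mean of travel time

# Sums of squared deviations and of cross-products of deviations.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)

# Correlation coefficient: cross-product sum scaled by both spreads.
r = s_xy / math.sqrt(s_xx * s_yy)

print(f"x_bar = {x_bar:.2f}, y_bar = {y_bar:.2f}")
print(f"s_xy = {s_xy:.3f}, s_xx = {s_xx:.3f}, s_yy = {s_yy:.3f}")
print(f"Cor(X, Y) = {r:.4f}")
```

The result is roughly 0.95, indicating a strong positive linear relationship between miles traveled and travel time.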
Exercise
Consider the case of a company that repairs small computers. We wish to study the
relationship between the length of a service call (in minutes) and the number of
electronic components in the computer that must be repaired.

Length of Service Calls (in Minutes)    Number of Units Repaired
23                                      1
29                                      2
49                                      3
64                                      4
74                                      4
87                                      5
96                                      6
97                                      6
109                                     7
119                                     8
149                                     9
145                                     9
154                                     10
166                                     10

a. Identify the predictor and the response.
b. Draw the scatter plot of length of service calls versus number of units
repaired.
c. Calculate the correlation coefficient between length of service calls and
number of units repaired.
THANK YOU
