
17-1

Regression
Regression is a set of statistical methods for finding the line of best fit for one numerical response (dependent) variable based on one or more explanatory (independent) variables.

17-2

Regression: 3 Main Purposes


To describe (or model)
To predict (or estimate)
To control (or administer)

17-3

Regression Analysis

Regression analysis examines associative relationships between a metric dependent variable and one or more independent variables in the following ways:
Determine whether the independent variables explain a significant share of the variation in the dependent variable.
Determine how much of the variation in the dependent variable the independent variables explain (the strength of the relationship).
Predict the values of the dependent variable.

17-4

Example
Plan an outdoor party.
Estimate the number of soft drinks to buy per person, based on how hot the weather is.
Use Temperature/Water data and regression.

17-5

Real Life Applications


Estimating Seasonal Sales for Department Stores (Periodic)

17-6

Real Life Applications


Predicting Student Grades Based on Time Spent Studying

17-7

Practice Problems
Can the number of points scored in a basketball game be predicted by the time a player plays in the game? By the player's height?

17-8

Types of Regression Models


Positive Linear Relationship
Relationship NOT Linear
Negative Linear Relationship
No Relationship

17-9

Least Squares Method


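The equation for the regression line assumed by the least squares method is

$$Y_i = a + bX_i + e_i, \qquad e_i = Y_i - \hat{Y}_i$$

where
Y is the dependent variable
X is the independent variable
a is the Y-intercept
b is the slope of the line
e_i is the residual, the gap between the observed value Y_i and the fitted value \hat{Y}_i = a + bX_i

$$b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - \left(\sum X\right)^2}, \qquad a = \bar{Y} - b\bar{X}$$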
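The sum formulas for a and b can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names `least_squares_fit` and `residuals` and the toy data are assumptions made up for the example.

```python
def least_squares_fit(xs, ys):
    """Return (a, b) for the line Y = a + bX using the sum formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope: b = (n*SUM(XY) - SUM(X)*SUM(Y)) / (n*SUM(X^2) - SUM(X)^2)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: a = mean(Y) - b * mean(X)
    a = sum_y / n - b * sum_x / n
    return a, b


def residuals(xs, ys, a, b):
    """Return e_i = Y_i - (a + b*X_i) for every observation."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Hypothetical toy data, invented for this sketch (not from the slides).
toy_x = [1.0, 2.0, 3.0, 4.0]
toy_y = [3.1, 4.9, 7.2, 8.8]

a, b = least_squares_fit(toy_x, toy_y)
e = residuals(toy_x, toy_y, a, b)
print(f"a = {a:.3f}, b = {b:.3f}")
print("residuals:", [round(r, 3) for r in e])
```

For a line fitted with an intercept, the residuals sum to zero up to floating-point error, which makes a quick sanity check on any hand calculation.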

Calculations for determining constants a and b


Man Hours (X)   Productivity in units (Y)   XY        X²
3.6             9.3                         33.48     12.96
4.8             10.2                        48.96     23.04
2.4             9.7                         23.28     5.76
7.2             11.5                        82.8      51.84
6.9             12                          82.8      47.61
8.4             14.2                        119.28    70.56
10.7            18.6                        199.02    114.49
11.2            28.4                        318.08    125.44
6.1             13.2                        80.52     37.21
7.9             10.8                        85.32     62.41
9.5             22.7                        215.65    90.25

17-10

5.4             12.3                        66.42     29.16
ΣX = 84.1       ΣY = 172.9                  ΣXY = 1355.61   ΣX² = 670.73

17-11

b = 1.768, a = 2.01
Y = 2.01 + 1.768X
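The coefficients above can be checked by recomputing the column totals from the twelve (man hours, productivity) pairs in the table. The sketch below assumes those twelve pairs are the complete data set; the `predict` helper and the 8.0 man-hours input are hypothetical additions for illustration.

```python
# Data copied from the man-hours / productivity table in the slides.
man_hours = [3.6, 4.8, 2.4, 7.2, 6.9, 8.4, 10.7, 11.2, 6.1, 7.9, 9.5, 5.4]
productivity = [9.3, 10.2, 9.7, 11.5, 12.0, 14.2, 18.6, 28.4, 13.2, 10.8, 22.7, 12.3]

n = len(man_hours)
sum_x = sum(man_hours)
sum_y = sum(productivity)
sum_xy = sum(x * y for x, y in zip(man_hours, productivity))
sum_x2 = sum(x * x for x in man_hours)

# Slope and intercept from the least squares sum formulas.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * sum_x / n

print(f"n = {n}, sum X = {sum_x:.2f}, sum Y = {sum_y:.2f}")
print(f"sum XY = {sum_xy:.2f}, sum X^2 = {sum_x2:.2f}")
print(f"b = {b:.4f}, a = {a:.4f}")


def predict(x):
    """Hypothetical helper: fitted productivity for x man-hours."""
    return a + b * x


# 8.0 man-hours is an arbitrary input chosen for this sketch.
print(f"predicted productivity at 8.0 man-hours: {predict(8.0):.2f}")
```

Computed this way, the slope agrees with the slide's 1.768 to within rounding in the third decimal place, and the intercept rounds to 2.01.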

17-12

17-13

Coefficient of Multiple Determination


The coefficient of multiple determination measures the magnitude of the association among the variables involved in a multiple regression. It is denoted by R². In mathematical terms, it measures the proportion of the variation in variable Y that is explained by the independent variables.

$$R^2 = \frac{\text{Explained Variance}}{\text{Total Variance}}$$

17-14

The Strength of Association: R²


$$R^2 = \frac{\text{Explained Variance}}{\text{Total Variance}}$$

$$\text{Total Variance} = \text{Explained Variance} + \text{Unexplained Variance}$$

$$\text{Explained Variance} = \text{Total Variance} - \text{Unexplained Variance}$$

$$R^2 = \frac{\text{Total Variance} - \text{Unexplained Variance}}{\text{Total Variance}} = 1 - \frac{\text{Unexplained Variance}}{\text{Total Variance}}$$
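The decomposition can be illustrated numerically. The sketch below is an assumption-laden example, not a result from the slides: it reuses the man-hours and productivity data from the earlier table, fits a single-predictor line (the simplest case of the multiple-regression definition), and works with sums of squares, which give the same ratio as variances because both share the same divisor.

```python
# Data copied from the man-hours / productivity table in the slides.
xs = [3.6, 4.8, 2.4, 7.2, 6.9, 8.4, 10.7, 11.2, 6.1, 7.9, 9.5, 5.4]
ys = [9.3, 10.2, 9.7, 11.5, 12.0, 14.2, 18.6, 28.4, 13.2, 10.8, 22.7, 12.3]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least squares slope and intercept (deviation form of the sum formulas).
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
a = mean_y - b * mean_x
fitted = [a + b * x for x in xs]

# Total, explained, and unexplained variation as sums of squares.
total = sum((y - mean_y) ** 2 for y in ys)
explained = sum((f - mean_y) ** 2 for f in fitted)
unexplained = sum((y - f) ** 2 for y, f in zip(ys, fitted))

r2 = explained / total
r2_alt = 1 - unexplained / total  # the equivalent "1 minus" form

print(f"total = {total:.3f}")
print(f"explained + unexplained = {explained + unexplained:.3f}")
print(f"R^2 = {r2:.3f} (alternative form: {r2_alt:.3f})")
```

Both forms of R² agree because, for a least squares fit with an intercept, the explained and unexplained sums of squares add up exactly to the total.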