You are on page 1of 58

Regression Analysis Using

Statgraphics Centurion

Neil W. Polhemus, CTO, StatPoint Technologies, Inc.

Copyright 2011 by StatPoint Technologies, Inc.

Web site: www.statgraphics.com


Outline
 Regression Models
 Examples – Single X
 Simple regression
 Nonlinear models
 Calibration
 Comparison of regression lines
 Examples – Multiple X
 Regression model selection (stepwise, all possible)
 Logistic regression
 Poisson regression

2
Regression Model Setup

 Dependent variable:Y

 Independent variable(s): X1, X2, …, Xk

 Error term: e

Model: Y = f (X1, X2, …, Xk) + e

3
Types of Regression Models (#1)
Procedure Dependent variable Independent variables

Simple Regression continuous 1 continuous

Polynomial Regression continuous 1 continuous

Box-Cox Transformations continuous 1 continuous

Calibration Models continuous 1 continuous

Comparison of Regression continuous 1 continuous and 1


Lines categorical

4
Types of Regression Models (#2)
Procedure Dependent variable Independent variables

Multiple Regression continuous 2+ continuous

Regression Model Selection continuous 2+ continuous

Nonlinear Regression continuous 1+ continuous

Ridge Regression continuous 2+ continuous

Partial Least Squares continuous 2+ continuous

General Linear Models 1+ continuous 2+ continuous or categorical


variables

5
Types of Regression Models (#3)
Procedure Dependent variable Independent variables

Logistic Regression proportions 1+ continuous or categorical

Probit Analysis proportions 1+ continuous or categorical

Poisson Regression counts 1+ continuous or categorical

Negative Binomial counts 1+ continuous or categorical


Regression
Life Data - Parametric failure times 1+ continuous or categorical
Models

6
Example 1: Stability study
Y: percent of available chlorine

X: number of weeks since production

Lower acceptable limit for Y: 0.40

7
X-Y Scatterplot with Smooth

8
Simple Regression

9
Analysis Options

10
Tables and Graphs

11
Analysis Window

12
Analysis Summary

13
Lack-of-Fit Test

14
Comparison of Alternative Models

15
Fitted Reciprocal-X Model
Plot of Fitted Model
chlorine = 0.368053 + 1.02553/weeks

0.5

0.48

0.46
chlorine

0.44

0.42

0.4

0.38
0 10 20 30 40 50
weeks

16
Lower 95% Prediction Limit

17
Outlier Removal
Plot of Fitted Model
chlorine = 0.366628 + 1.02548/weeks

0.5

0.48

0.46
chlorine

0.44

0.42

0.4

0.38
0 10 20 30 40 50
weeks

18
Example 2: Nonlinear Regression
Draper and Smith in Applied Regression Analysis suggest fitting
a model of the form

Y = a + (0.49-a)exp[-b(x-8)]

Since the model is nonlinear in the parameters, it requires a


search procedure to find the best solution.

19
Data Input Dialog Box

20
Initial Parameter Estimates

21
Analysis Options

22
Plot of Fitted Model
Plot of Fitted Model
chlorine = 0.390144+(0.49-0.390144)*exp(-0.101644*(weeks-8))
0.5

0.48

0.46
chlorine

0.44

0.42

0.4

0.38
0 10 20 30 40 50
weeks

23
Example 3: Calibration
The general calibration problem is that
of determining the likely value of X
given an observed value of Y.
Typically: X = item characteristic,Y =
measured value
Step 1: Build a regression model using
samples with known values of X
(“golden samples”).
Step 2: For another sample with
unknown X, predict X from Y.

24
Data Input Dialog Box

25
Reverse Prediction

26
Plot of Fitted Model
Plot of Fitted Model
measured = -0.0896667 + 1.01433*known

12

10

8
measured

6 5.85

5.85573 (5.59032,6.1215)
0
0 2 4 6 8 10
known

27
Example 4: Comparison of
Regression Lines
Y: amount of scrap produced

X: production line speed

Levels: line number

28
Data Input Dialog Box

29
Analysis Options

30
Plot of Fitted Model
Plot of Fitted Model

540 Line
1
2
440
Scrap

340

240

140
100 140 180 220 260 300 340
Speed

31
Significance Tests

32
Parallel Slope Model

33
Example 5: Multiple Regression

34
Stepwise Regression

35
Analysis Options

36
Selected Variables

37
Residual Plot

38
All Possible Regressions

39
Analysis Options

40
Best Adjusted R-Squared Models

41
Example 6: Logistic Regression
Response variable may be in the form of proportions or binary (0/1).

42
Logistic Model
Let P(Event) be the probability an event occurs at specified values of
the independent variables X.

1
P( Event ) 
1  exp  (  0  1 X 1   2 X 2  ...   k X k ) (1)

 P( Event) 
log    0  1 X 1   2 X 2  ...   k X k (2)
 1  P( Event) 

43
Data Input - Proportions

44
Analysis Options

45
Plot of Fitted Model
Plot of Fitted Model
with 95.0% confidence limits

0.8
Failures/Specimens

0.6

0.4

0.2

0
0 20 40 60 80 100
Load

46
Statistical Results

47
Data Input - Binary

48
Analysis Options

49
Analysis Summary

50
Example 7: Poisson Regression
Response variable is a count.

51
Poisson Model
Values of the response variable are assumed to follow a Poisson
distribution:


e Y
pY  
Y!
The rate parameter  is related to the predictor variables through a log-
linear link function:

log    0  1 X 1   2 X 2  ...   k X k

52
Data Input

53
Analysis Options

54
Statistical Results

55
Plot of Fitted Model
Plot of Fitted Model
with 95.0% confidence limits

5
Thickness=170.0
Extraction=75.0
4 Height=55.0

3
Injuries

0
0 10 20 30 40
Years

56
References
 Applied Logistic Regression (second edition) – Hosmer and
Lemeshow, Wiley, 2000.
 Applied Regression Analysis (third edition) – Draper and
Smith, Wiley, 1998.
 Applied Linear Statistical Models (fifth edition) – Kutner et
al., McGraw-Hill, 2004.
 Classical and Modern Regression with Applications (second
edition) – Myers, Brookes-Cole, 1990.

57
More Information
Go to www.statgraphics.com

Or send e-mail to info@statgraphics.com

58

You might also like