# Neil W. Polhemus, CTO, StatPoint Technologies, Inc.

Regression Analysis Using Statgraphics Centurion
Copyright 2011 by StatPoint Technologies, Inc.
Web site: www.statgraphics.com
Outline
• Regression Models
• Examples – Single X
  • Simple regression
  • Nonlinear models
  • Calibration
  • Comparison of regression lines
• Examples – Multiple X
  • Regression model selection (stepwise, all possible)
  • Logistic regression
  • Poisson regression
Regression Model Setup
• Dependent variable: $Y$
• Independent variable(s): $X_1, X_2, \ldots, X_k$
• Error term: $\varepsilon$

Model: $Y = f(X_1, X_2, \ldots, X_k) + \varepsilon$
Types of Regression Models (#1)

| Procedure | Dependent variable | Independent variables |
|---|---|---|
| Simple Regression | continuous | 1 continuous |
| Polynomial Regression | continuous | 1 continuous |
| Box-Cox Transformations | continuous | 1 continuous |
| Calibration Models | continuous | 1 continuous |
| Comparison of Regression Lines | continuous | 1 continuous and 1 categorical |
Types of Regression Models (#2)

| Procedure | Dependent variable | Independent variables |
|---|---|---|
| Multiple Regression | continuous | 2+ continuous |
| Regression Model Selection | continuous | 2+ continuous |
| Nonlinear Regression | continuous | 1+ continuous |
| Ridge Regression | continuous | 2+ continuous |
| Partial Least Squares | continuous | 2+ continuous |
| General Linear Models | 1+ continuous | 2+ continuous or categorical variables |
Types of Regression Models (#3)

| Procedure | Dependent variable | Independent variables |
|---|---|---|
| Logistic Regression | proportions | 1+ continuous or categorical |
| Probit Analysis | proportions | 1+ continuous or categorical |
| Poisson Regression | counts | 1+ continuous or categorical |
| Negative Binomial Regression | counts | 1+ continuous or categorical |
| Life Data - Parametric Models | failure times | 1+ continuous or categorical |
Example 1: Stability study
Y: percent of available chlorine
X: number of weeks since production
Lower acceptable limit for Y: 0.40
X-Y Scatterplot with Smooth
Simple Regression
Analysis Options
Tables and Graphs
Analysis Window
Analysis Summary
Lack-of-Fit Test
Comparison of Alternative Models
Fitted Reciprocal-X Model
[Figure: Plot of Fitted Model. chlorine = 0.368053 + 1.02553/weeks. Axes: weeks (x), chlorine (y).]
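As a rough illustration of how the fitted reciprocal-X model relates to the lower acceptable limit of 0.40, the sketch below evaluates the fitted equation and solves it for the week at which the fitted mean reaches that limit. The function names are hypothetical. The calculation uses only the fitted mean, not the lower prediction limit, so it should be read as an optimistic bound rather than a shelf-life estimate.

```python
# Coefficients copied from the fitted reciprocal-X model above.
INTERCEPT = 0.368053
SLOPE = 1.02553
LOWER_LIMIT = 0.40  # lower acceptable limit for percent available chlorine


def predicted_chlorine(weeks: float) -> float:
    """Fitted mean of chlorine at a given number of weeks (weeks > 0)."""
    return INTERCEPT + SLOPE / weeks


def weeks_to_reach(limit: float) -> float:
    """Solve limit = INTERCEPT + SLOPE / weeks for weeks (needs limit > INTERCEPT)."""
    return SLOPE / (limit - INTERCEPT)


weeks_at_limit = weeks_to_reach(LOWER_LIMIT)
print(f"Fitted mean reaches {LOWER_LIMIT} at about {weeks_at_limit:.1f} weeks")
```

With these coefficients the fitted mean crosses 0.40 at roughly 32 weeks. A limit based on the lower prediction bound would be reached earlier.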
Lower 95% Prediction Limit
Outlier Removal
[Figure: Plot of Fitted Model after outlier removal. chlorine = 0.366628 + 1.02548/weeks. Axes: weeks (x), chlorine (y).]
Example 2: Nonlinear Regression
Draper and Smith, in Applied Regression Analysis, suggest fitting a model of the form

$$Y = a + (0.49 - a)\exp[-b(X - 8)]$$

Since the model is nonlinear in the parameters a and b, it requires a search procedure to find the best solution.
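To make the idea of a search procedure concrete, here is a minimal sketch in plain Python. It is not the algorithm Statgraphics uses. It relies on the fact that, for a fixed b, the model is linear in a, so a has a closed-form least-squares value and only b needs to be searched over a grid. The data are synthetic and noise-free, generated from assumed values a = 0.39 and b = 0.10. All function names are hypothetical.

```python
import math

Y0, X0 = 0.49, 8.0  # constants fixed in the model form above


def model(x, a, b):
    """Y = a + (0.49 - a) * exp(-b * (x - 8))."""
    return a + (Y0 - a) * math.exp(-b * (x - X0))


def best_a_for(b, xs, ys):
    """For fixed b the model is linear in a: y - 0.49*e = a * (1 - e)."""
    num = den = 0.0
    for x, y in zip(xs, ys):
        e = math.exp(-b * (x - X0))
        num += (1.0 - e) * (y - Y0 * e)
        den += (1.0 - e) ** 2
    return num / den


def sse(a, b, xs, ys):
    """Residual sum of squares for parameters (a, b)."""
    return sum((y - model(x, a, b)) ** 2 for x, y in zip(xs, ys))


def grid_search(xs, ys, b_values):
    """Return (sse, a, b) with the smallest residual sum of squares."""
    best = None
    for b in b_values:
        a = best_a_for(b, xs, ys)
        candidate = (sse(a, b, xs, ys), a, b)
        if best is None or candidate[0] < best[0]:
            best = candidate
    return best


# Synthetic, noise-free data from assumed parameters (not the book's data).
xs = [float(x) for x in range(8, 44, 2)]
ys = [model(x, 0.39, 0.10) for x in xs]

b_grid = [i / 1000 for i in range(1, 1001)]  # b = 0.001 ... 1.000
best_sse, a_hat, b_hat = grid_search(xs, ys, b_grid)
print(f"a = {a_hat:.4f}, b = {b_hat:.4f}, SSE = {best_sse:.2e}")
```

A grid is crude but cannot diverge. Iterative methods such as Gauss-Newton or Marquardt converge faster but depend on reasonable initial parameter estimates.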
Data Input Dialog Box
Initial Parameter Estimates
Analysis Options
[Figure: Plot of Fitted Model. chlorine = 0.390144+(0.49-0.390144)*exp(-0.101644*(weeks-8)). Axes: weeks (x), chlorine (y).]
Example 3: Calibration
The general calibration problem is that of determining the likely value of X given an observed value of Y.
Typically: X = item characteristic, Y = measured value.
Step 1: Build a regression model using samples with known values of X (“golden samples”).
Step 2: For another sample with unknown X, predict X from Y.
Data Input Dialog Box
Reverse Prediction
[Figure: Plot of Fitted Model. measured = -0.0896667 + 1.01433*known. Axes: known (x), measured (y). Reverse prediction marked on the plot: measured = 5.85 gives known = 5.85573 (5.59032, 6.1215).]
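The point estimate in Step 2 can be sketched by inverting the fitted calibration line. The snippet below uses the coefficients printed with the plot. The function names are hypothetical, and it reproduces only the point estimate, not the interval, which needs the underlying data.

```python
# Coefficients copied from the fitted calibration line above.
INTERCEPT = -0.0896667
SLOPE = 1.01433


def predict_measured(known: float) -> float:
    """Forward direction: expected measured value for a known X."""
    return INTERCEPT + SLOPE * known


def reverse_predict(measured: float) -> float:
    """Reverse direction: solve measured = INTERCEPT + SLOPE * known for known."""
    return (measured - INTERCEPT) / SLOPE


x_hat = reverse_predict(5.85)
print(f"Estimated known value: {x_hat:.4f}")
```

The result agrees with the reported 5.85573 to about four decimal places. The small remaining difference is presumably due to rounding in the printed coefficients.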
Example 4: Comparison of Regression Lines
Y: amount of scrap produced
X: production line speed
Levels: line number
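One common way to write a two-line comparison model adds an indicator variable $I$ (0 for line 1, 1 for line 2) and its interaction with speed. This is offered as an assumption and not necessarily the exact parameterization Statgraphics uses:

$$Y = \beta_0 + \beta_1 X + \beta_2 I + \beta_3 X I + \varepsilon$$

Under this form, $\beta_2 = 0$ corresponds to equal intercepts and $\beta_3 = 0$ to equal slopes. A parallel-slope model is the special case that drops the $X I$ term.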
Data Input Dialog Box
Analysis Options
[Figure: Plot of Fitted Model. Axes: Speed (x), Scrap (y). Legend: Line 1, Line 2.]
Significance Tests
Parallel Slope Model
Example 5: Multiple Regression
Stepwise Regression
Analysis Options
Selected Variables
Residual Plot
All Possible Regressions
Analysis Options
Example 6: Logistic Regression
Response variable may be in the form of proportions or binary (0/1).
Logistic Model
Let P(Event) be the probability an event occurs at specified values of the independent variables X.

$$P(\text{Event}) = \frac{1}{1 + \exp\left[-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k)\right]} \qquad (1)$$

$$\log\left(\frac{P(\text{Event})}{1 - P(\text{Event})}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k \qquad (2)$$
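Equations (1) and (2) are two views of the same relationship: applying the log-odds transform to the probability from (1) recovers the linear predictor in (2). The short sketch below checks this numerically. The coefficients and predictor value are hypothetical, chosen only for illustration.

```python
import math


def linear_predictor(betas, xs):
    """beta_0 + beta_1*x_1 + ... + beta_k*x_k (betas has one more entry than xs)."""
    return betas[0] + sum(b * x for b, x in zip(betas[1:], xs))


def event_probability(betas, xs):
    """Equation (1): P(Event) = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(betas, xs)))


def log_odds(p):
    """Left-hand side of equation (2): log(p / (1 - p))."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficients and predictor value (not from the example data).
betas = [-3.0, 0.05]
xs = [40.0]

p = event_probability(betas, xs)
print(f"P(Event) = {p:.4f}, log-odds = {log_odds(p):.4f}")
```

Because the log-odds are linear in X, each coefficient can be read as the change in log-odds per unit change in its predictor.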
Data Input - Proportions
Analysis Options
[Figure: Plot of Fitted Model with 95.0% confidence limits. y-axis: Failures/Specimens.]
Statistical Results
Data Input - Binary
Analysis Options
Analysis Summary
Example 7: Poisson Regression
Response variable is a count.
Poisson Model
Values of the response variable are assumed to follow a Poisson distribution:

$$p(Y) = \frac{e^{-\lambda} \lambda^{Y}}{Y!}$$

The rate parameter $\lambda$ is related to the predictor variables through a log-linear link function:

$$\log(\lambda) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k$$
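The two pieces of the Poisson model can be sketched in a few lines: the log link turns predictor values into a rate, and the rate determines the probability of each count. The coefficients and predictor value below are hypothetical and are not taken from the example that follows.

```python
import math


def poisson_rate(betas, xs):
    """Log link: log(lambda) = beta_0 + beta_1*x_1 + ... + beta_k*x_k."""
    return math.exp(betas[0] + sum(b * x for b, x in zip(betas[1:], xs)))


def poisson_pmf(y, lam):
    """p(Y) = exp(-lambda) * lambda**Y / Y!"""
    return math.exp(-lam) * lam ** y / math.factorial(y)


# Hypothetical coefficients and predictor value, for illustration only.
betas = [-1.0, 0.04]
xs = [25.0]

lam = poisson_rate(betas, xs)
probs = [poisson_pmf(y, lam) for y in range(6)]
print(f"rate = {lam:.3f}")
for y, prob in enumerate(probs):
    print(f"P(Y = {y}) = {prob:.4f}")
```

Exponentiating the linear predictor keeps the rate positive for any coefficient values, which is the practical reason for using a log link with count data.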
Data Input
Analysis Options
Statistical Results
[Figure: Plot of Fitted Model with 95.0% confidence limits. Axes: Years (x), Injuries (y). Other predictors held at Thickness=170.0, Extraction=75.0, Height=55.0.]
References
• Applied Logistic Regression (second edition) – Hosmer and Lemeshow, Wiley, 2000.
• Applied Regression Analysis (third edition) – Draper and Smith, Wiley, 1998.
• Applied Linear Statistical Models (fifth edition) – Kutner et al., McGraw-Hill, 2004.
• Classical and Modern Regression with Applications (second edition) – Myers, Brooks/Cole, 1990.