You are on page 1of 33

Modelos lineales

Ivan Aliaga
16 de septiembre de 2015
demo("glm.vr")
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

demo(glm.vr)
---- ~~~~~~
> # Copyright (C) 1997-2009 The R Core Team
>
> #### -*- R -*> require(stats)
> Fr <- c(68,42,42,30, 37,52,24,43,
+
66,50,33,23, 47,55,23,47,
+
63,53,29,27, 57,49,19,29)
> Temp <- gl(2, 2, 24, labels = c("Low", "High"))
> Soft <- gl(3, 8, 24, labels = c("Hard","Medium","Soft"))
> M.user <- gl(2, 4, 24, labels = c("N", "Y"))
> Brand <- gl(2, 1, 24, labels = c("X", "M"))
> detg <- data.frame(Fr,Temp, Soft,M.user, Brand)
> detg.m0 <- glm(Fr ~ M.user*Temp*Soft + Brand, family = poisson, data = detg)
> summary(detg.m0)
Call:
glm(formula = Fr ~ M.user * Temp * Soft + Brand, family = poisson,
data = detg)
Deviance Residuals:
Min
1Q
Median
-2.20876 -0.99190 -0.00126

3Q
0.93542

Max
1.97601

Coefficients:
(Intercept)
M.userY
TempHigh
SoftMedium
SoftSoft
BrandM
M.userY:TempHigh

Estimate Std. Error z value Pr(>|z|)


4.01524
0.10034 40.018 < 2e-16 ***
-0.21184
0.14257 -1.486 0.13731
-0.42381
0.15159 -2.796 0.00518 **
0.05311
0.13308
0.399 0.68984
0.05311
0.13308
0.399 0.68984
-0.01587
0.06300 -0.252 0.80106
0.13987
0.22168
0.631 0.52806
1

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

M.userY:SoftMedium
0.08323
M.userY:SoftSoft
0.12169
TempHigh:SoftMedium
-0.30442
TempHigh:SoftSoft
-0.30442
M.userY:TempHigh:SoftMedium 0.21189
M.userY:TempHigh:SoftSoft
-0.20387
--Signif. codes: 0 *** 0.001 ** 0.01

0.19685
0.19591
0.22239
0.22239
0.31577
0.32540

0.423
0.621
-1.369
-1.369
0.671
-0.627

0.67245
0.53449
0.17104
0.17104
0.50220
0.53098

* 0.05 . 0.1 1

(Dispersion parameter for poisson family taken to be 1)


Null deviance: 118.627
Residual deviance: 32.826
AIC: 191.24

on 23
on 11

degrees of freedom
degrees of freedom

Number of Fisher Scoring iterations: 4

> detg.mod <- glm(terms(Fr ~ M.user*Temp*Soft + Brand*M.user*Temp,


+
keep.order = TRUE),
+
family = poisson, data = detg)
> summary(detg.mod)
Call:
glm(formula = terms(Fr ~ M.user * Temp * Soft + Brand * M.user *
Temp, keep.order = TRUE), family = poisson, data = detg)
Deviance Residuals:
Min
1Q
-0.91365 -0.35585

Median
0.00253

3Q
0.33027

Max
0.92146

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept)
4.14887
0.10603 39.128 < 2e-16
M.userY
-0.40521
0.16188 -2.503 0.01231
TempHigh
-0.44275
0.17121 -2.586 0.00971
M.userY:TempHigh
-0.12692
0.26257 -0.483 0.62883
SoftMedium
0.05311
0.13308
0.399 0.68984
SoftSoft
0.05311
0.13308
0.399 0.68984
M.userY:SoftMedium
0.08323
0.19685
0.423 0.67245
M.userY:SoftSoft
0.12169
0.19591
0.621 0.53449
TempHigh:SoftMedium
-0.30442
0.22239 -1.369 0.17104
TempHigh:SoftSoft
-0.30442
0.22239 -1.369 0.17104
M.userY:TempHigh:SoftMedium 0.21189
0.31577
0.671 0.50220
M.userY:TempHigh:SoftSoft
-0.20387
0.32540 -0.627 0.53098
BrandM
-0.30647
0.10942 -2.801 0.00510
M.userY:BrandM
0.40757
0.15961
2.554 0.01066
TempHigh:BrandM
0.04411
0.18463
0.239 0.81119
M.userY:TempHigh:BrandM
0.44427
0.26673
1.666 0.09579
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
(Dispersion parameter for poisson family taken to be 1)

***
*
**

**
*
.

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

Null deviance: 118.627


Residual deviance:
5.656
AIC: 170.07

on 23
on 8

degrees of freedom
degrees of freedom

Number of Fisher Scoring iterations: 4

> summary(detg.mod, correlation = TRUE, symbolic.cor = TRUE)


Call:
glm(formula = terms(Fr ~ M.user * Temp * Soft + Brand * M.user *
Temp, keep.order = TRUE), family = poisson, data = detg)
Deviance Residuals:
Min
1Q
-0.91365 -0.35585

Median
0.00253

3Q
0.33027

Max
0.92146

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept)
4.14887
0.10603 39.128 < 2e-16
M.userY
-0.40521
0.16188 -2.503 0.01231
TempHigh
-0.44275
0.17121 -2.586 0.00971
M.userY:TempHigh
-0.12692
0.26257 -0.483 0.62883
SoftMedium
0.05311
0.13308
0.399 0.68984
SoftSoft
0.05311
0.13308
0.399 0.68984
M.userY:SoftMedium
0.08323
0.19685
0.423 0.67245
M.userY:SoftSoft
0.12169
0.19591
0.621 0.53449
TempHigh:SoftMedium
-0.30442
0.22239 -1.369 0.17104
TempHigh:SoftSoft
-0.30442
0.22239 -1.369 0.17104
M.userY:TempHigh:SoftMedium 0.21189
0.31577
0.671 0.50220
M.userY:TempHigh:SoftSoft
-0.20387
0.32540 -0.627 0.53098
BrandM
-0.30647
0.10942 -2.801 0.00510
M.userY:BrandM
0.40757
0.15961
2.554 0.01066
TempHigh:BrandM
0.04411
0.18463
0.239 0.81119
M.userY:TempHigh:BrandM
0.44427
0.26673
1.666 0.09579
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 118.627
Residual deviance:
5.656
AIC: 170.07

on 23
on 8

degrees of freedom
degrees of freedom

Number of Fisher Scoring iterations: 4


Correlation of Coefficients:
(Intercept)
M.userY
TempHigh
M.userY:TempHigh
SoftMedium

1
,
,
.
,

1
. 1
, , 1
. .
1

***
*
**

**
*
.

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

SoftSoft
,
M.userY:SoftMedium
.
M.userY:SoftSoft
.
TempHigh:SoftMedium
.
TempHigh:SoftSoft
.
M.userY:TempHigh:SoftMedium
M.userY:TempHigh:SoftSoft
BrandM
.
M.userY:BrandM
TempHigh:BrandM
M.userY:TempHigh:BrandM
attr(,"legend")
[1] 0 0.3 . 0.6 , 0.8

. .
.
,
. ,
,
. .
, . .
, . .
. . . .
. . .

1
. 1
, . 1
. .
1
.
. . 1
, . , . 1
. . , . , . 1
1
, 1
. . 1
. . , 1

.
. .
. .

+ 0.9 * 0.95 B 1

> anova(detg.m0, detg.mod)


Analysis of Deviance Table
Model 1:
Model 2:
Resid.
1
2

Fr ~ M.user * Temp * Soft + Brand


Fr ~ M.user * Temp * Soft + Brand * M.user * Temp
Df Resid. Dev Df Deviance
11
32.826
8
5.656 3
27.17

demo("lm.glm")
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

demo(lm.glm)
---- ~~~~~~
>
>
>
>
>
>
>
>

### Examples from: "An Introduction to Statistical Modelling"


###
By Annette Dobson
###
### == with some additions ==
#

Copyright (C) 1997-2008 The R Core Team

require(stats); require(graphics)

> ## Plant Weight Data (Page 9)


> ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
> trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
> group <- gl(2,10, labels=c("Ctl","Trt"))
> weight <- c(ctl,trt)
> anova (lm(weight~group))
Analysis of Variance Table
Response: weight
Df Sum Sq Mean Sq F value Pr(>F)
4

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

group
1 0.6882 0.68820
Residuals 18 8.7292 0.48496

1.4191

0.249

> summary(lm(weight~group -1))


Call:
lm(formula = weight ~ group - 1)
Residuals:
Min
1Q
-1.0710 -0.4938

Median
0.0685

3Q
0.2462

Max
1.3690

Coefficients:
Estimate Std. Error t value Pr(>|t|)
groupCtl
5.0320
0.2202
22.85 9.55e-15 ***
groupTrt
4.6610
0.2202
21.16 3.62e-14 ***
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 0.6964 on 18 degrees of freedom
Multiple R-squared: 0.9818, Adjusted R-squared: 0.9798
F-statistic: 485.1 on 2 and 18 DF, p-value: < 2.2e-16

> ## Birth Weight Data (Page 14)


> age <- c(40, 38, 40, 35, 36, 37, 41, 40, 37, 38, 40, 38,
+
40, 36, 40, 38, 42, 39, 40, 37, 36, 38, 39, 40)
> birthw <- c(2968, 2795, 3163, 2925, 2625, 2847, 3292, 3473, 2628, 3176,
+
3421, 2975, 3317, 2729, 2935, 2754, 3210, 2817, 3126, 2539,
+
2412, 2991, 2875, 3231)
> sex <- gl(2,12, labels=c("M","F"))
> plot(age, birthw, col=as.numeric(sex), pch=3*as.numeric(sex),
+
main="Dobsons Birth Weight Data")

Dobson's Birth Weight Data

2800
2400

birthw

3200

M
F

35

36

37

38

39

40

41

age

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

> lines(lowess(age[sex==M], birthw[sex==M]), col=1)


> lines(lowess(age[sex==F], birthw[sex==F]), col=2)
> legend("topleft", levels(sex), col=1:2, pch=3*(1:2), lty=1, bty="n")
> summary(l1 <- lm(birthw ~ sex + age),

correlation=TRUE)

Call:
lm(formula = birthw ~ sex + age)
Residuals:
Min
1Q
-257.49 -125.28

Median
-58.44

3Q
169.00

Max
303.98

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1610.28
786.08 -2.049
0.0532 .
sexF
-163.04
72.81 -2.239
0.0361 *
age
120.89
20.46
5.908 7.28e-06 ***
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 177.1 on 21 degrees of freedom
Multiple R-squared:
0.64, Adjusted R-squared: 0.6057
6

42

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

F-statistic: 18.67 on 2 and 21 DF,

p-value: 2.194e-05

Correlation of Coefficients:
(Intercept) sexF
sexF 0.07
age -1.00
-0.12

> summary(l0 <- lm(birthw ~ sex + age -1), correlation=TRUE)


Call:
lm(formula = birthw ~ sex + age - 1)
Residuals:
Min
1Q
-257.49 -125.28

Median
-58.44

3Q
169.00

Max
303.98

Coefficients:
Estimate Std. Error t value Pr(>|t|)
sexM -1610.28
786.08 -2.049
0.0532 .
sexF -1773.32
794.59 -2.232
0.0367 *
age
120.89
20.46
5.908 7.28e-06 ***
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 177.1 on 21 degrees of freedom
Multiple R-squared: 0.9969, Adjusted R-squared: 0.9965
F-statistic: 2258 on 3 and 21 DF, p-value: < 2.2e-16
Correlation of Coefficients:
sexM sexF
sexF 1.00
age -1.00 -1.00

> anova(l1,l0)
Analysis of Variance Table
Model 1:
Model 2:
Res.Df
1
21
2
21

birthw ~ sex + age


birthw ~ sex + age - 1
RSS Df Sum of Sq F Pr(>F)
658771
658771 0 -4.191e-09

> summary(li <- lm(birthw ~ sex + sex:age -1), correlation=TRUE)


Call:
lm(formula = birthw ~ sex + sex:age - 1)
Residuals:
Min
1Q
-246.69 -138.11

Median
-39.13

3Q
176.57

Max
274.28

Coefficients:

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

Estimate Std. Error t value Pr(>|t|)


sexM
-1268.67
1114.64 -1.138 0.268492
sexF
-2141.67
1163.60 -1.841 0.080574 .
sexM:age
111.98
29.05
3.855 0.000986 ***
sexF:age
130.40
30.00
4.347 0.000313 ***
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 180.6 on 20 degrees of freedom
Multiple R-squared: 0.9969, Adjusted R-squared: 0.9963
F-statistic: 1629 on 4 and 20 DF, p-value: < 2.2e-16
Correlation of Coefficients:
sexM sexF sexM:age
sexF
0.00
sexM:age -1.00 0.00
sexF:age 0.00 -1.00 0.00

> anova(li,l0)
Analysis of Variance Table
Model 1:
Model 2:
Res.Df
1
20
2
21

birthw
birthw
RSS
652425
658771

~ sex + sex:age - 1
~ sex + age - 1
Df Sum of Sq
F Pr(>F)
-1

-6346.2 0.1945 0.6639

> summary(zi <- glm(birthw ~ sex + age, family=gaussian()))


Call:
glm(formula = birthw ~ sex + age, family = gaussian())
Deviance Residuals:
Min
1Q
Median
-257.49 -125.28
-58.44

3Q
169.00

Max
303.98

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1610.28
786.08 -2.049
0.0532 .
sexF
-163.04
72.81 -2.239
0.0361 *
age
120.89
20.46
5.908 7.28e-06 ***
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
(Dispersion parameter for gaussian family taken to be 31370.04)
Null deviance: 1829873
Residual deviance: 658771
AIC: 321.39

on 23
on 21

degrees of freedom
degrees of freedom

Number of Fisher Scoring iterations: 2

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

> summary(z0 <- glm(birthw ~ sex + age - 1, family=gaussian()))


Call:
glm(formula = birthw ~ sex + age - 1, family = gaussian())
Deviance Residuals:
Min
1Q
Median
-257.49 -125.28
-58.44

3Q
169.00

Max
303.98

Coefficients:
Estimate Std. Error t value Pr(>|t|)
sexM -1610.28
786.08 -2.049
0.0532 .
sexF -1773.32
794.59 -2.232
0.0367 *
age
120.89
20.46
5.908 7.28e-06 ***
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
(Dispersion parameter for gaussian family taken to be 31370.04)
Null deviance: 213198964
Residual deviance:
658771
AIC: 321.39

on 24
on 21

degrees of freedom
degrees of freedom

Number of Fisher Scoring iterations: 2

> anova(zi, z0)


Analysis of Deviance Table
Model 1:
Model 2:
Resid.
1
2

birthw ~ sex + age


birthw ~ sex + age - 1
Df Resid. Dev Df
Deviance
21
658771
21
658771 0 5.8208e-10

> summary(z.o4 <- update(z0, subset = -4))


Call:
glm(formula = birthw ~ sex + age - 1, family = gaussian(), subset = -4)
Deviance Residuals:
Min
1Q
Median
-253.86 -129.46
-53.46

3Q
165.04

Max
251.14

Coefficients:
Estimate Std. Error t value Pr(>|t|)
sexM -2318.03
801.57 -2.892 0.00902 **
sexF -2455.44
803.79 -3.055 0.00625 **
age
138.50
20.71
6.688 1.65e-06 ***
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
(Dispersion parameter for gaussian family taken to be 26925.39)

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

Null deviance: 204643339


Residual deviance:
538508
AIC: 304.68

on 23
on 20

degrees of freedom
degrees of freedom

Number of Fisher Scoring iterations: 2

> summary(zz <- update(z0, birthw ~ sex+age-1 + sex:age))


Call:
glm(formula = birthw ~ sex + age + sex:age - 1, family = gaussian())
Deviance Residuals:
Min
1Q
Median
-246.69 -138.11
-39.13

3Q
176.57

Max
274.28

Coefficients:
Estimate Std. Error t value Pr(>|t|)
sexM
-1268.67
1114.64 -1.138 0.268492
sexF
-2141.67
1163.60 -1.841 0.080574 .
age
111.98
29.05
3.855 0.000986 ***
sexF:age
18.42
41.76
0.441 0.663893
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
(Dispersion parameter for gaussian family taken to be 32621.23)
Null deviance: 213198964
Residual deviance:
652425
AIC: 323.16

on 24
on 20

degrees of freedom
degrees of freedom

Number of Fisher Scoring iterations: 2

> anova(z0,zz)
Analysis of Deviance Table
Model 1:
Model 2:
Resid.
1
2

birthw ~ sex + age - 1


birthw ~ sex + age + sex:age - 1
Df Resid. Dev Df Deviance
21
658771
20
652425 1
6346.2

> ## Poisson Regression Data (Page 42)


> x <- c(-1,-1,0,0,0,0,1,1,1)
> y <- c(2,3,6,7,8,9,10,12,15)
> summary(glm(y~x, family=poisson(link="identity")))
Call:
glm(formula = y ~ x, family = poisson(link = "identity"))
Deviance Residuals:

10

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

Min
-0.7019

1Q
-0.3377

Median
-0.1105

3Q
0.2958

Max
0.7184

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept)
7.4516
0.8841
8.428 < 2e-16 ***
x
4.9353
1.0892
4.531 5.86e-06 ***
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 18.4206
Residual deviance: 1.8947
AIC: 40.008

on 8
on 7

degrees of freedom
degrees of freedom

Number of Fisher Scoring iterations: 3

> ## Calorie Data (Page 45)


> calorie <- data.frame(
+
carb = c(33,40,37,27,30,43,34,48,30,38,
+
50,51,30,36,41,42,46,24,35,37),
+
age
= c(33,47,49,35,46,52,62,23,32,42,
+
31,61,63,40,50,64,56,61,48,28),
+
wgt
= c(100, 92,135,144,140,101, 95,101, 98,105,
+
108, 85,130,127,109,107,117,100,118,102),
+
prot = c(14,15,18,12,15,15,14,17,15,14,
+
17,19,19,20,15,16,18,13,18,14))
> summary(lmcal <- lm(carb~age+wgt+prot, data= calorie))
Call:
lm(formula = carb ~ age + wgt + prot, data = calorie)
Residuals:
Min
1Q
-10.3424 -4.8203

Median
0.9897

3Q
3.8553

Max
7.9087

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 36.96006
13.07128
2.828 0.01213 *
age
-0.11368
0.10933 -1.040 0.31389
wgt
-0.22802
0.08329 -2.738 0.01460 *
prot
1.95771
0.63489
3.084 0.00712 **
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 5.956 on 16 degrees of freedom
Multiple R-squared: 0.4805, Adjusted R-squared: 0.3831
F-statistic: 4.934 on 3 and 16 DF, p-value: 0.01297

> ## Extended Plant Data (Page 59)

11

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

> ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)


> trtA <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
> trtB <- c(6.31,5.12,5.54,5.50,5.37,5.29,4.92,6.15,5.80,5.26)
> group <- gl(3, length(ctl), labels=c("Ctl","A","B"))
> weight <- c(ctl,trtA,trtB)
> anova(lmwg <- lm(weight~group))
Analysis of Variance Table
Response: weight
Df Sum Sq Mean Sq F value Pr(>F)
group
2 3.7663 1.8832 4.8461 0.01591 *
Residuals 27 10.4921 0.3886
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
> summary(lmwg)
Call:
lm(formula = weight ~ group)
Residuals:
Min
1Q Median
-1.0710 -0.4180 -0.0060

3Q
0.2627

Max
1.3690

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)
5.0320
0.1971 25.527
<2e-16 ***
groupA
-0.3710
0.2788 -1.331
0.1944
groupB
0.4940
0.2788
1.772
0.0877 .
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 0.6234 on 27 degrees of freedom
Multiple R-squared: 0.2641, Adjusted R-squared: 0.2096
F-statistic: 4.846 on 2 and 27 DF, p-value: 0.01591

> coef(lmwg)
(Intercept)
5.032

groupA
-0.371

groupB
0.494

> coef(summary(lmwg))#- incl. std.err, t- and P- values.


Estimate Std. Error
t value
Pr(>|t|)
(Intercept)
5.032 0.1971284 25.526514 1.936575e-20
groupA
-0.371 0.2787816 -1.330791 1.943879e-01
groupB
0.494 0.2787816 1.771996 8.768168e-02
> ## Fictitious Anova Data (Page 64)
> y <- c(6.8,6.6,5.3,6.1,7.5,7.4,7.2,6.5,7.8,9.1,8.8,9.1)

12

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

> a <- gl(3,4)


> b <- gl(2,2, length(a))
> anova(z <- lm(y~a*b))
Analysis of Variance Table
Response: y
Df Sum Sq Mean Sq F value
Pr(>F)
a
2 12.7400 6.3700 25.8243 0.001127 **
b
1 0.4033 0.4033 1.6351 0.248225
a:b
2 1.2067 0.6033 2.4459 0.167164
Residuals 6 1.4800 0.2467
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
> ## Achievement Scores (Page 70)
> y <- c(6,4,5,3,4,3,6, 8,9,7,9,8,5,7, 6,7,7,7,8,5,7)
> x <- c(3,1,3,1,2,1,4, 4,5,5,4,3,1,2, 3,2,2,3,4,1,4)
> m <- gl(3,7)
> anova(z <- lm(y~x+m))
Analysis of Variance Table
Response: y
Df Sum Sq Mean Sq F value
Pr(>F)
x
1 36.575 36.575 60.355 5.428e-07 ***
m
2 16.932
8.466 13.970 0.0002579 ***
Residuals 17 10.302
0.606
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
> ## Beetle Data (Page 78)
> dose <- c(1.6907, 1.7242, 1.7552, 1.7842, 1.8113, 1.8369, 1.861, 1.8839)
> x <- c( 6, 13, 18, 28, 52, 53, 61, 60)
> n <- c(59, 60, 62, 56, 63, 59, 62, 60)
> dead <- cbind(x, n-x)
> summary(

glm(dead ~ dose, family=binomial(link=logit)))

Call:
glm(formula = dead ~ dose, family = binomial(link = logit))
Deviance Residuals:
Min
1Q
Median
-1.5941 -0.3944
0.8329

3Q
1.2592

Max
1.5940

Coefficients:

13

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

Estimate Std. Error z value Pr(>|z|)


(Intercept) -60.717
5.181 -11.72
<2e-16 ***
dose
34.270
2.912
11.77
<2e-16 ***
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 284.202
Residual deviance: 11.232
AIC: 41.43

on 7
on 6

degrees of freedom
degrees of freedom

Number of Fisher Scoring iterations: 4

> summary(

glm(dead ~ dose, family=binomial(link=probit)))

Call:
glm(formula = dead ~ dose, family = binomial(link = probit))
Deviance Residuals:
Min
1Q
Median
-1.5714 -0.4703
0.7501

3Q
1.0632

Max
1.3449

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -34.935
2.648 -13.19
<2e-16 ***
dose
19.728
1.487
13.27
<2e-16 ***
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 284.20
Residual deviance: 10.12
AIC: 40.318

on 7
on 6

degrees of freedom
degrees of freedom

Number of Fisher Scoring iterations: 4

> summary(z <- glm(dead ~ dose, family=binomial(link=cloglog)))


Call:
glm(formula = dead ~ dose, family = binomial(link = cloglog))
Deviance Residuals:
Min
1Q
-0.80329 -0.55135

Median
0.03089

3Q
0.38315

Max
1.28883

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -39.572
3.240 -12.21
<2e-16 ***
dose
22.041
1.799
12.25
<2e-16 ***
---

14

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

Signif. codes:

0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

(Dispersion parameter for binomial family taken to be 1)


Null deviance: 284.2024
Residual deviance:
3.4464
AIC: 33.644

on 7
on 6

degrees of freedom
degrees of freedom

Number of Fisher Scoring iterations: 4

> anova(z, update(z, dead ~ dose -1))


Analysis of Deviance Table
Model 1: dead ~ dose
Model 2: dead ~ dose - 1
Resid. Df Resid. Dev Df Deviance
1
6
3.446
2
7
285.222 -1 -281.78
>
>
>
>
>
>

## Anther Data (Page 84)


## Note that the proportions below are not exactly
## in accord with the sample sizes quoted below.
## In particular, the value 0.555 does not seem sensible.
## [MM: huh? round(round(n*p)/n, 3) looks almost exactly like "p" !]
n <- c(102, 99,
108,
76,
81,
90)

> p <- c(0.539,0.525,0.528,0.724,0.617,0.555)


> x <- round(n*p)
> ## x <- n*p
> y <- cbind(x,n-x)
> f <- rep(c(40,150,350),2)
> (g <- gl(2,3))
[1] 1 1 1 2 2 2
Levels: 1 2
> summary(glm(y ~ g*f, family=binomial(link="logit")))
Call:
glm(formula = y ~ g * f, family = binomial(link = "logit"))
Deviance Residuals:
1
2
0.08269 -0.12998

3
0.04414

4
0.42320

5
-0.60082

6
0.19522

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.1456719 0.1975451
0.737
0.4609
g2
0.7963143 0.3125046
2.548
0.0108 *
f
-0.0001227 0.0008782 -0.140
0.8889

15

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

g2:f
-0.0020493 0.0013483 -1.520
0.1285
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 10.45197
Residual deviance: 0.60387
AIC: 38.172

on 5
on 2

degrees of freedom
degrees of freedom

Number of Fisher Scoring iterations: 3

> summary(glm(y ~ g + f, family=binomial(link="logit")))


Call:
glm(formula = y ~ g + f, family = binomial(link = "logit"))
Deviance Residuals:
1
2
3
-0.5507 -0.2781
0.7973

4
1.1558

5
-0.3688

6
-0.6584

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.306643
0.167629
1.829
0.0674 .
g2
0.405554
0.174560
2.323
0.0202 *
f
-0.000997
0.000665 -1.499
0.1338
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 10.4520
Residual deviance: 2.9218
AIC: 38.49

on 5
on 3

degrees of freedom
degrees of freedom

Number of Fisher Scoring iterations: 3

> ## The "final model"


> summary(glm.p84 <- glm(y~g,

family=binomial(link="logit")))

Call:
glm(formula = y ~ g, family = binomial(link = "logit"))
Deviance Residuals:
1
2
3
0.17150 -0.10947 -0.06177

4
1.77208

5
-0.19040

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept)
0.1231
0.1140
1.080
0.2801
g2
0.3985
0.1741
2.289
0.0221 *
---

16

6
-1.39686

##
##
##
##
##
##
##
##
##
##
##
##
##
##

Signif. codes:

0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

(Dispersion parameter for binomial family taken to be 1)


Null deviance: 10.452
Residual deviance: 5.173
AIC: 38.741

on 5
on 4

degrees of freedom
degrees of freedom

Number of Fisher Scoring iterations: 3

> op <- par(mfrow = c(2,2), oma = c(0,0,1,0))


> plot(glm.p84) # well ?

5
6

0.2

0.3

0.4

0.5

Normal QQ
4

10 10

Residuals

Residuals vs Fitted

Std. deviance resid.

glm(y ~ g)

1
6

1.0 0.5

0.4

0.5

Predicted values

##
##
##
##
##
##
##
##
##
##

0.3

1.0

Residuals vs Leverage
1
0.5
5

Cook's distance

4
6

Std. Pearson resid.

ScaleLocation

0.2

0.5

Theoretical Quantiles

Std. deviance resid.

Predicted values

0.0

0.0

0.1

0.2

0.3

Leverage

> par(op)
> ## Tumour Data (Page 92)
> counts <- c(22,2,10,16,54,115,19,33,73,11,17,28)
> type <- gl(4,3,12,labels=c("freckle","superficial","nodular","indeterminate"))
> site <- gl(3,1,12,labels=c("head/neck","trunk","extremities"))

17

0.5
1

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

> data.frame(counts,type,site)
counts
type
site
1
22
freckle
head/neck
2
2
freckle
trunk
3
10
freckle extremities
4
16
superficial
head/neck
5
54
superficial
trunk
6
115
superficial extremities
7
19
nodular
head/neck
8
33
nodular
trunk
9
73
nodular extremities
10
11 indeterminate
head/neck
11
17 indeterminate
trunk
12
28 indeterminate extremities
> summary(z <- glm(counts ~ type + site, family=poisson()))
Call:
glm(formula = counts ~ type + site, family = poisson())
Deviance Residuals:
Min
1Q
Median
-3.0453 -1.0741
0.1297

3Q
0.5857

Max
5.1354

Coefficients:
Estimate Std. Error z
(Intercept)
1.7544
0.2040
typesuperficial
1.6940
0.1866
typenodular
1.3020
0.1934
typeindeterminate
0.4990
0.2174
sitetrunk
0.4439
0.1554
siteextremities
1.2010
0.1383
--Signif. codes: 0 *** 0.001 ** 0.01

value Pr(>|z|)
8.600 < 2e-16 ***
9.079 < 2e-16 ***
6.731 1.68e-11 ***
2.295 0.02173 *
2.857 0.00427 **
8.683 < 2e-16 ***
* 0.05 . 0.1 1

(Dispersion parameter for poisson family taken to be 1)


Null deviance: 295.203
Residual deviance: 51.795
AIC: 122.91

on 11
on 6

degrees of freedom
degrees of freedom

Number of Fisher Scoring iterations: 5

> ## Randomized Controlled Trial (Page 93)


> counts <- c(18,17,15, 20,10,20, 25,13,12)
> outcome

<- gl(3, 1, length(counts))

> treatment <- gl(3, 3)


> summary(z <- glm(counts ~ outcome + treatment, family=poisson()))
Call:

18

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

glm(formula = counts ~ outcome + treatment, family = poisson())


Deviance Residuals:
1
2
3
-0.67125
0.96272 -0.16965
8
9
-0.09167 -0.96656

4
-0.21999

5
-0.95552

6
1.04939

7
0.84715

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 3.045e+00 1.709e-01 17.815
<2e-16 ***
outcome2
-4.543e-01 2.022e-01 -2.247
0.0246 *
outcome3
-2.930e-01 1.927e-01 -1.520
0.1285
treatment2
1.338e-15 2.000e-01
0.000
1.0000
treatment3
1.421e-15 2.000e-01
0.000
1.0000
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 10.5814
Residual deviance: 5.1291
AIC: 56.761

on 8
on 4

degrees of freedom
degrees of freedom

Number of Fisher Scoring iterations: 4

> ## Peptic Ulcers and Blood Groups


> counts <- c(579, 4219, 911, 4578, 246, 3775, 361, 4532, 291, 5261, 396, 6598)
> group <- gl(2, 1, 12, labels=c("cases","controls"))
> blood <- gl(2, 2, 12, labels=c("A","O"))
> city

<- gl(3, 4, 12, labels=c("London","Manchester","Newcastle"))

> cbind(group, blood, city, counts) # gives internal codes for the factors
group blood city counts
[1,]
1
1
1
579
[2,]
2
1
1
4219
[3,]
1
2
1
911
[4,]
2
2
1
4578
[5,]
1
1
2
246
[6,]
2
1
2
3775
[7,]
1
2
2
361
[8,]
2
2
2
4532
[9,]
1
1
3
291
[10,]
2
1
3
5261
[11,]
1
2
3
396
[12,]
2
2
3
6598
> summary(z1 <- glm(counts ~ group*(city + blood), family=poisson()))
Call:

19

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

glm(formula = counts ~ group * (city + blood), family = poisson())


Deviance Residuals:
1
2
3
-0.7520
3.0183
0.6099
9
10
11
0.9318 -2.2691 -0.7742

4
-2.8137
12
2.0648

5
0.1713

6
-0.4339

7
-0.1405

8
0.3977

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept)
6.39239
0.03476 183.92 < 2e-16
groupcontrols
1.90813
0.03691
51.69 < 2e-16
cityManchester
-0.89800
0.04815 -18.65 < 2e-16
cityNewcastle
-0.77420
0.04612 -16.79 < 2e-16
bloodO
0.40187
0.03867
10.39 < 2e-16
groupcontrols:cityManchester 0.84069
0.05052
16.64 < 2e-16
groupcontrols:cityNewcastle
1.07287
0.04822
22.25 < 2e-16
groupcontrols:bloodO
-0.23208
0.04043
-5.74 9.46e-09
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

***
***
***
***
***
***
***
***

(Dispersion parameter for poisson family taken to be 1)


Null deviance: 26717.157
Residual deviance:
29.241
AIC: 154.32

on 11
on 4

degrees of freedom
degrees of freedom

Number of Fisher Scoring iterations: 3

> summary(z2 <- glm(counts ~ group*city + blood, family=poisson()),


+
correlation = TRUE)
Call:
glm(formula = counts ~ group * city + blood, family = poisson())
Deviance Residuals:
1
2
3
-3.7688
3.7168
3.2813
9
10
11
-1.1458 -1.4687
1.0218

4
-3.4418
12
1.3275

5
-1.7675

6
0.2387

7
1.5565

8
-0.2174

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept)
6.51395
0.02663 244.60
<2e-16
groupcontrols
1.77563
0.02801
63.38
<2e-16
cityManchester
-0.89800
0.04815 -18.65
<2e-16
cityNewcastle
-0.77420
0.04612 -16.79
<2e-16
bloodO
0.18988
0.01128
16.84
<2e-16
groupcontrols:cityManchester 0.84069
0.05052
16.64
<2e-16
groupcontrols:cityNewcastle
1.07287
0.04822
22.25
<2e-16
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

20

***
***
***
***
***
***
***

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

(Dispersion parameter for poisson family taken to be 1)


Null deviance: 26717.157
Residual deviance:
62.558
AIC: 185.63

on 11
on 5

degrees of freedom
degrees of freedom

Number of Fisher Scoring iterations: 4


Correlation of Coefficients:
groupcontrols
cityManchester
cityNewcastle
bloodO
groupcontrols:cityManchester
groupcontrols:cityNewcastle

(Intercept) groupcontrols cityManchester


-0.90
-0.52
0.50
-0.55
0.52
0.30
-0.23
0.00
0.00
0.50
-0.55
-0.95
0.52
-0.58
-0.29
cityNewcastle bloodO

groupcontrols
cityManchester
cityNewcastle
bloodO
0.00
groupcontrols:cityManchester -0.29
0.00
groupcontrols:cityNewcastle -0.96
0.00
groupcontrols:cityManchester
groupcontrols
cityManchester
cityNewcastle
bloodO
groupcontrols:cityManchester
groupcontrols:cityNewcastle
0.32

> anova(z2, z1, test = "Chisq")


Analysis of Deviance Table
Model 1: counts ~ group * city + blood
Model 2: counts ~ group * (city + blood)
Resid. Df Resid. Dev Df Deviance Pr(>Chi)
1
5
62.558
2
4
29.241 1
33.318 7.827e-09 ***
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

demo("nlm")
##
##
##
##
##
##
##
##
##

demo(nlm)
---- ~~~
> # Copyright (C) 1997-2009 The R Core Team
>
> ### Helical Valley Function
> ### Page 362 Dennis + Schnabel
21

>
> require(stats); require(graphics)
> theta <- function(x1,x2) (atan(x2/x1) + (if(x1 <= 0) pi else 0))/ (2*pi)
> ## but this is easier :
> theta <- function(x1,x2) atan2(x2, x1)/(2*pi)
> f <- function(x) {
+
f1 <- 10*(x[3] - 10*theta(x[1],x[2]))
+
f2 <- 10*(sqrt(x[1]^2+x[2]^2)-1)
+
f3 <- x[3]
+
return(f1^2+f2^2+f3^2)
+ }
> ## explore surface {at x3 = 0}
> x <- seq(-1, 2, length.out=50)
> y <- seq(-1, 1, length.out=50)
> z <- apply(as.matrix(expand.grid(x, y)), 1, function(x) f(c(x, 0)))
> contour(x, y, matrix(log10(z), 50, 50))

1.0

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

0.5

3
0

2.5

0.0

0.5

1.0

0.5

1.5

1.0

0.5

0.0

0.5

##
22

1.0

1.5

2.0

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

> str(nlm.f <List of 6


$ minimum
:
$ estimate :
$ gradient :
$ hessian
:
$ code
:
$ iterations:

nlm(f, c(-1,0,0), hessian = TRUE))


num
num
num
num
int
int

1.24e-14
[1:3] 1.00 3.07e-09 -6.06e-09
[1:3] -3.76e-07 3.49e-06 -2.20e-06
[1:3, 1:3] 2.00e+02 -4.07e-02 9.77e-07 -4.07e-02 5.07e+02 ...
2
27

> points(rbind(nlm.f$estim[1:2]), col = "red", pch = 20)


>
>
>
+
+
+
+

### the Rosenbrock banana valley function

>
>
+
+
+
+

## explore surface
fx <- function(x)
{
## vectorized version of fR()
x1 <- x[,1]; x2 <- x[,2]
100*(x2 - x1*x1)^2 + (1-x1)^2
}

fR <- function(x)
{
x1 <- x[1]; x2 <- x[2]
100*(x2 - x1*x1)^2 + (1-x1)^2
}

> x <- seq(-2, 2, length.out=100)


> y <- seq(-0.5, 1.5, length.out=100)
> z <- fx(expand.grid(x, y))
> op <- par(mfrow = c(2,1), mar = 0.1 + c(3,3,0,0))
> contour(x, y, matrix(log10(z), length(x)))

> str(nlm.f2 <- nlm(fR, c(-1.2, 1), hessian = TRUE))


List of 6
$ minimum
: num 3.97e-12
$ estimate : num [1:2] 1 1
$ gradient : num [1:2] -6.54e-07 3.34e-07
$ hessian
: num [1:2, 1:2] 802 -400 -400 200
$ code
: int 1
$ iterations: int 23
> points(rbind(nlm.f2$estim[1:2]), col = "red", pch = 20)
> ## Zoom in :
> rect(0.9, 0.9, 1.1, 1.1, border = "orange", lwd = 2)
> x <- y <- seq(0.9, 1.1, length.out=100)
> z <- fx(expand.grid(x, y))
23

0.5 1.0 1.5

##
## > contour(x, y, matrix(log10(z), length(x)))

2
1.5

0.

0.5

1.5

0
zoomed in

1.10

1.5

2.

2.

0.5

1.05

1.10

.5

1.00

0.5

2
1

0.90

0.90

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

.5

2.

.5

.5

1
1

.5

0.95

1.00

> mtext("zoomed in");box(col = "orange")


> points(rbind(nlm.f2$estim[1:2]), col = "red", pch = 20)
> par(op)
> fg <- function(x)
+ {
+
gr <- function(x1, x2) {
+
c(-400*x1*(x2 - x1*x1)-2*(1-x1), 200*(x2 - x1*x1))
+
}
+
x1 <- x[1]; x2 <- x[2]
+
res<- 100*(x2 - x1*x1)^2 + (1-x1)^2
+
attr(res, "gradient") <- gr(x1, x2)
+
return(res)
+ }
> nlm(fg, c(-1.2, 1), hessian = TRUE)
$minimum
[1] 1.182096e-20

24

0.5

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

$estimate
[1] 1 1
$gradient
[1] 2.583521e-09 -1.201128e-09
$hessian
[,1]
[,2]
[1,] 802.24 -400.02
[2,] -400.02 200.00
$code
[1] 1
$iterations
[1] 24

> ## or use deriv to find the derivatives


>
> fd <- deriv(~ 100*(x2 - x1*x1)^2 + (1-x1)^2, c("x1", "x2"))
> fdd <- function(x1, x2) {}
> body(fdd) <- fd
> nlm(function(x) fdd(x[1], x[2]), c(-1.2,1), hessian = TRUE)
$minimum
[1] 1.182096e-20
$estimate
[1] 1 1
$gradient
[1] 2.583521e-09 -1.201128e-09
$hessian
[,1]
[,2]
[1,] 802.24 -400.02
[2,] -400.02 200.00
$code
[1] 1
$iterations
[1] 24

> fgh <- function(x)


+ {
+
gr <- function(x1, x2)
+
c(-400*x1*(x2 - x1*x1) - 2*(1-x1), 200*(x2 - x1*x1))
+
h <- function(x1, x2) {
+
a11 <- 2 - 400*x2 + 1200*x1*x1

25

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

+
+
+
+
+
+
+
+
+ }

a21 <- -400*x1


matrix(c(a11, a21, a21, 200), 2, 2)
}
x1 <- x[1]; x2 <- x[2]
res<- 100*(x2 - x1*x1)^2 + (1-x1)^2
attr(res, "gradient") <- gr(x1, x2)
attr(res, "hessian") <- h(x1, x2)
return(res)

> nlm(fgh, c(-1.2,1), hessian = TRUE)


$minimum
[1] 2.829175
$estimate
[1] -0.6786981

0.4711891

$gradient
[1] -0.4911201

2.1115987

$hessian
[,1]
[,2]
[1,] 366.1188 271.4593
[2,] 271.4593 200.0000
$code
[1] 4
$iterations
[1] 100

demo("smooth")
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

demo(smooth)
---- ~~~~~~
> ### This used to be in
example(smooth) before we had package-specific demos
> # Copyright (C) 1997-2009 The R Core Team
>
> require(stats); require(graphics); require(datasets)
> op <- par(mfrow = c(1,1))
> ## The help(smooth) examples:
> example(smooth, package="stats")
smooth> require(graphics)
smooth> ## see also
demo(smooth) !
smooth>
smooth> x1 <- c(4, 1, 3, 6, 6, 4, 1, 6, 2, 4, 2) # very artificial

26

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

smooth> (x3R <- smooth(x1, "3R")) # 2 iterations of "3"


3R Tukey smoother resulting from smooth(x = x1, kind = "3R")
used 2 iterations
[1] 3 3 3 6 6 4 4 4 2 2 2
smooth> smooth(x3R, kind = "S")
S Tukey smoother resulting from
changed
[1] 3 3 3 3 4 4 4 4 2 2 2

smooth(x = x3R, kind = "S")

smooth> sm.3RS <- function(x, ...)


smooth+
smooth(smooth(x, "3R", ...), "S", ...)
smooth> y <- c(1, 1, 19:1)
smooth> plot(y, main = "misbehaviour of \"3RSR\"", col.main = 3)

10
5

15

misbehaviour of "3RSR"

10

15

Index

##
##
##
##
##
##
##
##

smooth> lines(sm.3RS(y))
smooth> lines(smooth(y))
smooth> lines(smooth(y, "3RSR"), col = 3, lwd = 2)
smooth> x <- c(8:10, 10, 0, 0, 9, 9)
27

# the horror

20

##
## smooth> plot(x, main = "breakdown of

3R

and

and hence

3RSS")

10

breakdown of 3R and S and hence 3RSS

Index

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

smooth> matlines(cbind(smooth(x, "3R"), smooth(x, "S"), smooth(x, "3RSS"), smooth(x)))


smooth> presidents[is.na(presidents)] <- 0 # silly
smooth> summary(sm3 <- smooth(presidents, "3R"))
3R Tukey smoother resulting from
smooth(x = presidents, kind = "3R") ; n = 120
used 4 iterations
Min. 1st Qu. Median
Mean 3rd Qu.
Max.
0.0
44.0
57.0
54.2
71.0
82.0
smooth> summary(sm2 <- smooth(presidents,"3RSS"))
3RSS Tukey smoother resulting from
smooth(x = presidents, kind = "3RSS") ; n = 120
used 5 iterations
Min. 1st Qu. Median
Mean 3rd Qu.
Max.
0.00
44.00
57.00
55.45
69.00
82.00
smooth> summary(sm <- smooth(presidents))
3RS3R Tukey smoother resulting from
smooth(x = presidents) ; n = 120
28

##
##
##
##
##
##
##
##
##
##
##

used 7 iterations
Min. 1st Qu. Median
24.00
44.00
57.00

Mean 3rd Qu.


55.88
69.00

Max.
82.00

smooth> all.equal(c(sm2), c(smooth(smooth(sm3, "S"), "S")))


[1] TRUE
smooth> all.equal(c(sm),
[1] TRUE

# 3RSS

=== 3R S S

c(smooth(smooth(sm3, "S"), "3R"))) # 3RS3R === 3R S 3R

smooth> plot(presidents, main = "smooth(presidents0, *) :

3R and default 3RS3R")

60
40
0

20

presidents

80

smooth(presidents0, *) : 3R and default 3RS3R

1945

1950

1955

1960

1965

Time

##
##
##
##
##
##
##
##
##
##
##
##
##

smooth> lines(sm3, col = 3, lwd = 1.5)


smooth> lines(sm, col = 2, lwd = 1.25)
> ## Didactical investigation:
>
> showSmooth <- function(x, leg.x = 1, leg.y = max(x)) {
+
ss <- cbind(x, "3c" = smooth(x, "3", end="copy"),
+
"3"
= smooth(x, "3"),
+
"3Rc" = smooth(x, "3R", end="copy"),
+
"3R" = smooth(x, "3R"),
+
sm = smooth(x))
29

1970

1975

##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##
##

+
+
+
+
+
+
+
+
+
+
+
+ }

k <- ncol(ss) - 1
n <- length(x)
slwd <- c(1,1,4,1,3,2)
slty <- c(0, 2:(k+1))
matplot(ss, main = "Tukey Smoothers", ylab = "y ; sm(y)",
type= c("p",rep("l",k)), pch= par("pch"), lwd= slwd, lty= slty)
legend(leg.x, leg.y,
c("Data",
"3
(copy)", "3 (Tukey)",
"3R (copy)", "3R (Tukey)", "smooth()"),
pch= c(par("pch"),rep(-1,k)), col=1:(k+1), lwd= slwd, lty= slty)
ss

> ## 4 simple didactical examples, showing different steps in smooth():


>
> for(x in list(c(4, 6, 2, 2, 6, 3, 6, 6, 5, 2),
+
c(3, 2, 1, 4, 5, 1, 3, 2, 4, 5, 2),
+
c(2, 4, 2, 6, 1, 1, 2, 6, 3, 1, 6),
+
x1))
+
print(t(showSmooth(x)))

Data
3 (copy)
3 (Tukey)
3R (copy)
3R (Tukey)
smooth()

y ; sm(y)

Tukey Smoothers

##
## x
## 3c
## 3

[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
4
6
2
2
6
3
6
6
5
2
4
4
2
2
3
6
6
6
5
2
4
4
2
2
3
6
6
6
5
3
30

10

## 3Rc
## 3R
## sm

4
4
4

4
4
4

2
2
4

2
2
3

3
3
3

6
6
6

6
6
6

6
6
6

5
5
5

2
3
3

Data
3 (copy)
3 (Tukey)
3R (copy)
3R (Tukey)
smooth()

y ; sm(y)

Tukey Smoothers

##
##
##
##
##
##
##

x
3c
3
3Rc
3R
sm

[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
3
2
1
4
5
1
3
2
4
5
2
3
2
2
4
4
3
2
3
4
4
2
2
2
2
4
4
3
2
3
4
4
4
3
2
2
4
4
3
3
3
4
4
2
2
2
2
4
4
3
3
3
4
4
4
2
2
2
2
3
3
3
3
4
4
4

31

10

Data
3 (copy)
3 (Tukey)
3R (copy)
3R (Tukey)
smooth()

y ; sm(y)

Tukey Smoothers

##
##
##
##
##
##
##

x
3c
3
3Rc
3R
sm

[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
2
4
2
6
1
1
2
6
3
1
6
2
2
4
2
1
1
2
3
3
3
6
2
2
4
2
1
1
2
3
3
3
3
2
2
2
2
1
1
2
3
3
3
6
2
2
2
2
1
1
2
3
3
3
3
2
2
2
2
2
2
2
3
3
3
3

32

10

Data
3 (copy)
3 (Tukey)
3R (copy)
3R (Tukey)
smooth()

y ; sm(y)

Tukey Smoothers

##
##
##
##
##
##
##
##
##

x
3c
3
3Rc
3R
sm

[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
4
1
3
6
6
4
1
6
2
4
2
4
3
3
6
6
4
4
2
4
2
2
3
3
3
6
6
4
4
2
4
2
2
4
3
3
6
6
4
4
4
2
2
2
3
3
3
6
6
4
4
4
2
2
2
3
3
3
3
4
4
4
4
2
2
2

> par(op)

33

10