LECTURE7 ExtraordinaryRegression 2012-13
Extraordinary Regression: Non-Normal, Non-Parametric & Non-Straight Relationships
Andy Marshall
Practical 7 (Introductory Lecture)
Spring 2013
[Figure: Statistics Forum Summary Stats 2009-10, by date, day and time]
One-to-one Help
Remaining Help Sessions
Non-normal Methods
Identifying a non-normal response variable:
1) Data distribution
- Count data with few values, low mean or low sample size
[Figure: histogram and kernel density of Distance from Low Tide (m)]
2) Data exploration
- Skew
- Non-straight
[Figure: normal Q-Q plot of Distance (m), sample vs theoretical quantiles]
Non-normal Methods
Identifying a non-normal response variable:
3) Residual plots show curvature or heteroscedasticity
[Figure: residuals vs fitted, normal Q-Q, scale-location and Cook's distance plots]
4) Tests (not essential)
Non-Parametric Regression
(i.e. no defined error distribution)
Non-parametric Regression
Kendall’s robust line method:
• “Robust”, i.e. few assumptions
• Slope (z) = median of all possible slopes (for every pair of points)
• Simple (!)
z[i,j] = (yj - yi) / (xj - xi)
[Figure: scatter of y against x with the fitted robust line]
Cleveland (2006)
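A minimal sketch of the slope definition above in R: the median of every pairwise slope. The function and data here are invented purely for illustration, not taken from the practical.

# Kendall's robust line: slope = median of all pairwise slopes z[i,j]
kendall_line <- function(x, y) {
  ij <- combn(seq_along(x), 2)                                # every pair of points (i, j)
  z  <- (y[ij[2, ]] - y[ij[1, ]]) / (x[ij[2, ]] - x[ij[1, ]]) # z[i,j] = (yj - yi)/(xj - xi)
  b  <- median(z, na.rm = TRUE)                               # robust slope
  a  <- median(y - b * x)                                     # robust intercept
  c(intercept = a, slope = b)
}
kendall_line(x = c(0, 1, 2, 3, 4), y = c(5, 7, 8, 12, 15))    # example call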
Non-parametric Regression
Kendall’s robust line method: H0: Slope (mu; μ) = 0 (i.e. W– = W+)
• Less influenced by outliers than OLS regression
• Wilcoxon Signed-Rank (One-sample Inference):
- Critical value W
[Figure: scatter of y against x]
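A minimal sketch of the test above: a one-sample Wilcoxon signed-rank test applied to the pairwise slopes to test H0: slope = 0. The x and y values are invented for illustration.

x  <- c(0, 1, 2, 3, 4); y <- c(5, 7, 8, 12, 15)
ij <- combn(seq_along(x), 2)
z  <- (y[ij[2, ]] - y[ij[1, ]]) / (x[ij[2, ]] - x[ij[1, ]])   # all pairwise slopes
wilcox.test(z, mu = 0, exact = FALSE)   # W+ vs W-; small p-value rejects a zero slope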
Non-parametric Regression
Kendall’s robust line method: H0: Slope (mu; μ) = 0 (i.e. W– = W+)
• Cleveland (2006) gives other non-parametric alternatives
• 3 drawbacks:
- No measure of proportion of variance
- One predictor variable only
[Figure: scatter of y against x]
Cleveland (2006)
Non-Normal Parametric Regression
(These models are still parametric, i.e. response variable has defined distribution)
• ≥1 predictors
• Normal errors
• “Logistic regression”
• Probability distribution of two alternative outcomes
• E.g. presence/absence, 0/1, categorical data
• Models the probability of getting result 1
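A minimal sketch of a logistic (binomial) GLM in R; the variables presence and distance are hypothetical and simulated only to show the call.

set.seed(1)
distance <- runif(50, 0, 100)
presence <- rbinom(50, 1, plogis(2 - 0.04 * distance))   # simulated 0/1 response
m <- glm(presence ~ distance, family = binomial)
summary(m)                            # coefficients on the logit (log-odds) scale
head(predict(m, type = "response"))   # fitted probabilities of getting result 1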
• “Poisson regression”
• Example in practical and Crawley (2007)
E.g. Poisson regression model: log(y) = β0 + β1x1 + β2x2 + β3x3 + ...
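A minimal sketch of fitting this model in R with simulated data (the practical's real example, Cancers against Distance from Crawley, follows below).

set.seed(2)
x1 <- runif(100, 0, 10)
y  <- rpois(100, lambda = exp(0.5 + 0.2 * x1))   # counts with a log-linear mean
m  <- glm(y ~ x1, family = poisson)
summary(m)                                       # beta0, beta1 reported on the log scale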
> yv <- predict(model, list(Distance=xv))
> plot(Distance,Cancers)
> lines(xv,exp(yv))
[Figure: Cancers against Distance with the fitted Poisson curve]
IMPORTANT
One more step is needed if the data continue to defy the distribution:
Overdispersion…
(mean = variance → dispersion = 1)
Quasi-likelihood in GLMs
Overdispersion
• Used where there is greater variability than expected
• Rule of thumb: overdispersion is concerning if dispersion > 1.5
(Residual Deviance > 1.5 × residual degrees of freedom)
(often shows a funnel shape in residual diagnostic plots)
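The rule of thumb above can be checked directly from a fitted model; a minimal sketch with deliberately overdispersed simulated counts.

set.seed(3)
x <- runif(100)
y <- rnbinom(100, mu = exp(1 + x), size = 1)   # more variable than a Poisson
m <- glm(y ~ x, family = poisson)
deviance(m) / df.residual(m)                   # dispersion; > 1.5 is concerning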
Quasi-likelihood in GLMs
Example of overdispersion:
• Crawley (2005) clusters.txt
Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.186865   0.188728   0.990   0.3221
Distance    -0.006138   0.003667  -1.674   0.0941 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Quasi-likelihood in GLMs
Overdispersion
• Deviance is scaled by overdispersion coefficient (D/df)
• Poisson → quasipoisson
• Problems:
- Generally reduces power of the test
- Can’t use automated stepwise reduction…
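A minimal sketch of the quasipoisson refit and the F-based analysis of deviance it requires (data simulated for illustration, as above).

set.seed(3)
x <- runif(100)
y <- rnbinom(100, mu = exp(1 + x), size = 1)
mq <- glm(y ~ x, family = quasipoisson)
summary(mq)                                                 # std. errors scaled by D/df
anova(glm(y ~ 1, family = quasipoisson), mq, test = "F")    # compare nested models by F test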
Quasi-likelihood in GLMs
Alternatives to quasi-likelihood:
• Remove more intercorrelated predictors
• Poisson (p372 Q&K):
- Adjust parameters: √(χ²/ν)
- Negative binomial GLM (see prac)
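A minimal sketch of the negative binomial alternative using glm.nb() from the MASS package; the data are simulated for illustration (see the practical for the real example).

library(MASS)
set.seed(4)
x <- runif(100)
y <- rnbinom(100, mu = exp(1 + x), size = 1)
mnb <- glm.nb(y ~ x)
summary(mnb)      # theta models the extra-Poisson variation directly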
(1) Diagnostic Plots
[Figure: residual diagnostic plots (residuals vs fitted, normal Q-Q, standardised deviance residuals, Cook's distance) for Gaussian and Poisson/binomial models; Marshall et al. (2010)]
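A minimal sketch of producing these diagnostic plots for any fitted model in R (simulated Poisson data for illustration).

set.seed(5)
x <- runif(80)
y <- rpois(80, exp(1 + x))
m <- glm(y ~ x, family = poisson)
par(mfrow = c(2, 2))
plot(m)   # the standard residual diagnostic plots for the fitted model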
Non-straight Models
Clues for non-linearity
• Data exploration:
- Curve
- Hump
- Complex relationship
[Figures: scatterplots of carbon against prop90_ba and ht_dbh against elevation]
y = a + bx + cx²
y = a + bx + cx² + dx³
[Figure: example quadratic and cubic curves plotted against x]
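A minimal sketch of fitting the quadratic and cubic equations above by ordinary least squares (data simulated for illustration).

set.seed(6)
x <- seq(-4, 4, length = 50)
y <- 2 + 0.5 * x + 1.5 * x^2 + rnorm(50, sd = 2)
quad  <- lm(y ~ x + I(x^2))            # y = a + bx + cx^2
cubic <- lm(y ~ x + I(x^2) + I(x^3))   # y = a + bx + cx^2 + dx^3
anova(quad, cubic)                     # does the cubic term improve the fit?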
Non-linear Models
Generalised Additive Models (GAMs):
• Wiggly relationships
Non-linear Models
GAMs:
• Uses deviance and AIC, as for GLM, then use analysis of deviance: anova(simple,complex,test="F")
• Like a GLM, an error probability distribution is specified:
gam(y ~ s(x1) + s(x2) + s(x3), family = xxx)
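A minimal sketch of this workflow; the mgcv package is used here for illustration, whereas the practical uses package gam with fixed df inside s(). Data are simulated.

library(mgcv)
set.seed(7)
x1 <- runif(100); x2 <- runif(100)
y  <- sin(3 * x1) + rnorm(100, sd = 0.3)
simple  <- gam(y ~ s(x1))              # default Gaussian family
complex <- gam(y ~ s(x1) + s(x2))
anova(simple, complex, test = "F")     # analysis of deviance between nested GAMs
AIC(simple, complex)                   # parsimony check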
Non-linear Models
Over-fitting in GAMs:
• A GAM can have perfect fit
• BUT fit ≠ explanatory power (we want to represent the “parent population”)
• GAMs can overfit, so need to adjust effective degrees of freedom
• Use parsimony to decide between models (e.g. quadratic = 3 df)
[Figure: smoothers for elevation with ~8.65 vs 3.84 effective degrees of freedom]
[Figures: Marshall et al. (in review)]
What Next?
Some extensions to methods covered (all possible in R):
• Weighted GLM (under-dispersion; e.g. Ridout & Besbeas 2004)
Take Home Messages
i. There are several sound methods for distributions other than normal
ii. Diagnostic checks (plots and tests) are vital but not included in some statistical software!
iii. Don’t be afraid to try non-linear methods, but beware of over-fitting
Homework
i. Reading as shown on slides
ii. Assignment data analysis
iii. Complete all practical exercises
iv. Add requests for the two tutorials onto Stats Forum
• Diagnostics: plot(modelname)
[Figure: residual diagnostic plots with outlying points IT3265, BW2832 and LU3988 labelled]
• Interactions:
glm(y ~ x1*x2*x3, family=xxx)
glm(y ~ x1 + x2 + x3 + x1:x2:x3, family=xxx)
Multiple GLM
Model Reduction: (same as MLR)
• Before running first model, deal with intercorrelations:
Multiple GLM
Model Reduction: (same as MLR)
• Next run full model and remove any non-significant interactions (3-way > 2-way …)
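A minimal sketch of removing a non-significant three-way interaction and testing the reduction; variable names and data are invented for illustration.

set.seed(8)
d <- data.frame(x1 = runif(60), x2 = runif(60), x3 = runif(60))
d$y <- rpois(60, exp(0.5 + d$x1 + d$x2))
full    <- glm(y ~ x1 * x2 * x3, data = d, family = poisson)
reduced <- update(full, . ~ . - x1:x2:x3)     # drop the 3-way term first
anova(reduced, full, test = "Chisq")          # is the 3-way interaction needed?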
Quasi-likelihood in GLMs
Example of overdispersion:
model <- glm(Cancers ~ Distance, family = quasipoisson)
→ anova(simple,complex,test="F")
Non-straight Models
Clues for non-linearity
• Diagnostics example (see practical)
[Figure: residuals vs fitted and normal Q-Q plots, with plots PSP1, PSP9 and PSP12 highlighted]
• Significance remains
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 15.352888   1.539830   9.971 2.86e-08 ***
elevation   -0.002892   0.001227  -2.358   0.0314 *
Non-linear Models
Non-linear (least squares) regression:
• Polynomial regression essentially transforms the data, but if we can’t transform…
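When no transformation linearises the relationship, nls() fits by non-linear least squares; a minimal sketch with a simulated asymptotic curve (formula and starting values are illustrative only).

set.seed(9)
x <- runif(60, 0, 10)
y <- 20 * (1 - exp(-0.5 * x)) + rnorm(60)
m <- nls(y ~ a * (1 - exp(-b * x)), start = list(a = 15, b = 0.3))
summary(m)    # estimates of a (asymptote) and b (rate)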
Non-linear Models
Degrees of Freedom in GAMs:
• Df are not integers (“effective df”)
• Try various dfs using package gam – use s(variable,x.xx)
Non-linear Models
GAM with binary data
• Output is the probability of success (i.e. 1 rather than 0)
model <- gam(sp ~ ., family=binomial)
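A minimal sketch of a binomial GAM and its fitted probabilities, here using mgcv with simulated presence/absence data; the variable names are hypothetical.

library(mgcv)
set.seed(10)
elev <- runif(200, 500, 2000)
pres <- rbinom(200, 1, plogis(-4 + 0.004 * elev))
m <- gam(pres ~ s(elev), family = binomial)
head(predict(m, type = "response"))   # probability of a presence (1) at each site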
Non-linear Models
Multiple GAM variables
• Smoother determined from points either side, so increased error at extremes
• Interactions: s(x1,x2)
- More complicated than lm/glm
- E.g. Crawley 2005 contour plot, ozone.data.txt, p617