
Linear models

Looking for relationships


IFCA R 2016
Important stuff
• Data are plural?
• P value vs effect size – apply CST
• Means always have standard deviations
• Normality is important
• Non-parametric tests have assumptions too!
Model types
• ANOVA – differences between/among groups
• Simple linear models – are X and Y related? Weighted?
• ANCOVA – do these relationships between X and Y vary by factor?
• Multiple linear regressions – do these X variables describe Y?
• General linear models – do these X variables and multiple factors describe Y?
• Non-linear models – are X and Y related?
• LOESS line fits – does a descriptive line fit better?
ANOVA

Before vs after
By treatment
Treatment * before/after
Parametric equivalent of the Kruskal-Wallis test
Key Assumptions
• The response data are normally distributed (check with qqnorm, shapiro.test)
• Variance is assumed to be equal (check with var.test, or Levene's test, e.g. leveneTest in the car package)
• The data points are independent from each other (i.e. not a time series); otherwise use a repeated-measures ANOVA
[Figure: explained and unexplained variation around the trend and the mean]
a) Calculating total mean variation (mean square) = SS total / d.f.
b) Calculating explained mean square = SS model / d.f. (explained variability / degrees of freedom)
c) Calculating unexplained mean square = SS residual / d.f. (unexplained variability / degrees of freedom)
F-ratio = Explained MS / Unexplained MS (high F ratio is good)
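The F-ratio steps can be sketched in a few lines. This is a minimal illustration in plain Python for a one-way ANOVA, not the course's own code; the function name `one_way_anova_f` and the three small groups are made up for the example.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (explained MS, unexplained MS, F-ratio) for a list of groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Explained (model) SS: group means around the grand mean.
    ss_model = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Unexplained (residual) SS: observations around their own group mean.
    ss_resid = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_model = len(groups) - 1                 # number of groups - 1
    df_resid = len(all_values) - len(groups)   # observations - groups
    ms_model = ss_model / df_model
    ms_resid = ss_resid / df_resid
    return ms_model, ms_resid, ms_model / ms_resid

# Hypothetical data: three treatments, three observations each.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
ms_model, ms_resid, f_ratio = one_way_anova_f(groups)
print(ms_model, ms_resid, f_ratio)
```

For these made-up groups the explained mean square is about 13 and the unexplained mean square about 1, so the F-ratio is about 13.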
• Ordinary least squares
• Major axis
• Ranged major axis
• Reduced major axis
• All give slightly different results
Ordinary least squares (OLS)

• No uncertainty in independent variable (x axis)
• We are predicting values of dependent variable (Y-axis), based on absolute certainty of independent variable values (X-axis).
• Minimises vertical spread
• When distribution is not normal along both axes (i.e. x need not be normal)
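As a rough sketch of what "minimises vertical spread" means in practice, the OLS slope is the sum of cross-products divided by the sum of squares of x. The function name `ols_fit` and the five data points below are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
from statistics import mean

def ols_fit(x, y):
    """Ordinary least squares: minimises squared vertical distances to the line."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar   # line passes through (x_bar, y_bar)
    return intercept, slope

# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0]
intercept, slope = ols_fit(x, y)
print(intercept, slope)
```

With these points the slope comes out at about 0.8 and the intercept at about 0.6.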
Major axis

• When you want a good estimate of slope
• Distribution on both axes is normal
• Minimises sum of squares perpendicular to the regression line.
• When variance is equal for both parameters (usually in same units)
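The major axis slope can be written in closed form from the sums of squares and cross-products (it is the direction of the first principal axis of the scatter). The sketch below assumes that standard formula; `major_axis_fit` and the data are made up for illustration.

```python
from math import sqrt
from statistics import mean

def major_axis_fit(x, y):
    """Major axis: minimises squared perpendicular distances to the line."""
    x_bar, y_bar = mean(x), mean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Closed-form slope of the first principal axis; needs s_xy != 0.
    slope = (s_yy - s_xx + sqrt((s_yy - s_xx) ** 2 + 4 * s_xy ** 2)) / (2 * s_xy)
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data (same units and equal spread on both axes).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0]
intercept, slope = major_axis_fit(x, y)
print(intercept, slope)
```

For these points the major axis slope is about 1, steeper than the OLS slope of about 0.8 for the same data.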
Reduced Major Axis

• Want good estimation of slope
• Distribution is bivariate normal
• Minimises SS of the triangular areas
• Error variances are proportional to variable variances
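For reduced major axis, the slope is the ratio of the standard deviations of y and x, with the sign taken from the correlation. A minimal sketch under that definition follows; `reduced_major_axis_fit` and the data are hypothetical.

```python
from math import copysign, sqrt
from statistics import mean

def reduced_major_axis_fit(x, y):
    """Reduced major axis: slope = sign(r) * sd(y) / sd(x)."""
    x_bar, y_bar = mean(x), mean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # The (n - 1) divisors cancel, so sums of squares can be used directly.
    slope = copysign(1.0, s_xy) * sqrt(s_yy / s_xx)
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0]
intercept, slope = reduced_major_axis_fit(x, y)
print(intercept, slope)
```

Because x and y have equal spread in this example, the slope is about 1; the fit depends only on the two standard deviations and the sign of the relationship, not on how tight the scatter is.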
a) Random scatter of residuals
b) Wedge-shaped – variance is not homogeneous
c) Linear pattern – need another explanatory variable (a trend remains: should do something to make it flat again)
d) Curved pattern = curvilinear relationship
Key Assumptions
1) The relationship is linear (X and Y are linearly correlated).
2) The response data are normally distributed
3) Response variance is assumed to be equal
4) The data points are independent from each other (i.e. not a time series)
Intercept

Slope
Explained MS / Unexplained MS
Assuming parallel lines – no difference in slope

Coastal intercept

Slope for all

Addition to intercept for deep
-1.20749 + 0.10195 = -1.10554
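The way the parallel-lines output is read can be sketched as follows: the second group's intercept is the reference (coastal) intercept plus the reported addition, while one slope is shared by all groups. The two intercept values come from the slide; the shared slope of 0.5 and the function `predict` are placeholders, since the slope value is not shown here.

```python
coastal_intercept = -1.20749   # from the slide
deep_addition = 0.10195        # from the slide: addition to intercept for deep
deep_intercept = coastal_intercept + deep_addition

def predict(x, is_deep, slope):
    """Parallel-lines model: group-specific intercept, one common slope."""
    intercept = coastal_intercept + (deep_addition if is_deep else 0.0)
    return intercept + slope * x

shared_slope = 0.5             # hypothetical value, for illustration only
print(round(deep_intercept, 5))
print(predict(2.0, False, shared_slope), predict(2.0, True, shared_slope))
```

At any x, the two predicted values differ by exactly the addition to the intercept, which is what "no difference in slope" implies.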
Assuming different slopes and different intercepts for
each depth class

Coastal intercept
Coastal slope

Addition to intercept for freshwater

Addition to slope for freshwater