with the conditional distribution of y given x essentially the same as the distribution of the error term.

A linear regression model need not be affine, let alone linear, in the independent variables x. For example,

y = α + βx + γx² + ε

is a linear regression model, for the right-hand side is a linear combination of the parameters α, β, and γ. In this case it is useful to think of x² as a new independent variable, formed by modifying the original variable x. Indeed, any linear combination of functions f(x), g(x), h(x), ..., is a linear regression model, so long as these functions do not have any free parameters (otherwise the model is generally a nonlinear regression model). The least-squares estimates of α, β, and γ are linear in the response variable y, and nonlinear in x (they are nonlinear in x even if the γ and α terms are absent; if only β were present, then doubling all observed x values would multiply the least-squares estimate of β by 1/2).
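As an illustration, here is a minimal sketch of fitting such a model by ordinary least squares (assuming NumPy; the variable names and the invented data are purely illustrative and are not from the article):

```python
# Sketch: fitting y = alpha + beta*x + gamma*x**2 by ordinary least squares.
# The model is nonlinear in x but linear in the parameters, so the usual
# linear least-squares machinery applies once x**2 is treated as an extra
# independent variable. The data below are invented purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 50)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(scale=0.3, size=x.size)

# Design matrix with columns 1, x, x^2: each column is a fixed function of x
# with no free parameters, which is what keeps the model "linear".
X = np.column_stack([np.ones_like(x), x, x**2])

# Least-squares estimates of (alpha, beta, gamma).
(alpha_hat, beta_hat, gamma_hat), *_ = np.linalg.lstsq(X, y, rcond=None)
print(alpha_hat, beta_hat, gamma_hat)
```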
Parameter estimation
Often in linear regression problems statisticians rely on the Gauss-Markov assumptions:

- The random errors ε_i have expected value 0.
- The random errors ε_i are uncorrelated (this is weaker than an assumption of probabilistic independence).
- The random errors ε_i are "homoscedastic", i.e., they all have the same variance.

(See also Gauss-Markov theorem. That result says that under the assumptions above, least-squares estimators are in a certain sense optimal.)

Sometimes stronger assumptions are relied on:

- The random errors ε_i have expected value 0.
- They are independent.
- They are normally distributed.
- They all have the same variance.
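In symbols, the Gauss-Markov assumptions may be written as follows (a brief sketch; σ² denotes the common error variance, a symbol not used elsewhere in this excerpt):

```latex
% Gauss-Markov assumptions on the errors eps_i, with sigma^2 the common variance.
\[
\operatorname{E}[\varepsilon_i] = 0, \qquad
\operatorname{Cov}(\varepsilon_i, \varepsilon_j) = 0 \quad (i \neq j), \qquad
\operatorname{Var}(\varepsilon_i) = \sigma^2 \quad \text{for all } i.
\]
```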
If x_i is a vector we can take the product βx_i to be a scalar product (see "dot product").

A statistician will usually estimate the unobservable values of the parameters α and β by the method of least squares, which consists of finding the values of a and b that minimize the sum of squares of the residuals

ε̂_i = y_i − (a + bx_i),   i = 1, ..., n.

Those values of a and b are the "least-squares estimates." The residuals may be regarded as estimates of the errors; see also errors and residuals in statistics.
Notice that, whereas the errors are independent, the residuals cannot be independent, because the use of least-squares estimates implies that the sum of the residuals must be 0, and that the scalar product of the vector of residuals with the vector of x-values must be 0, i.e., we must have

Σ_i ε̂_i = 0   and   Σ_i x_i ε̂_i = 0.

These two linear constraints imply that the vector of residuals must lie within a certain (n − 2)-dimensional subspace of R^n; hence we say that there are "n − 2 degrees of freedom for error". If one assumes the errors are normally distributed