
Regression Analysis

1. What is regression analysis?

In statistics, regression analysis is a technique for modeling and analyzing the relationship between a dependent variable (or response variable) and one or more independent variables.

More specifically, regression analysis helps us understand how the value of the dependent variable changes when any one of the independent variables is varied while the other independent variables are held fixed.

Regression analysis can also be used to predict or forecast future values of the dependent variable when the values of the independent variables are given.
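As a concrete illustration of prediction, the following sketch fits a straight line to a small set of made-up example data (the x and y values here are assumptions for illustration, not data from this note) and then predicts the response at a new value of the independent variable:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # independent variable
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # dependent variable

# np.polyfit with degree 1 returns the least-squares slope and intercept.
slope, intercept = np.polyfit(x, y, 1)

# Predict the dependent variable at a new value of the independent variable.
y_new = intercept + slope * 6.0
```

Here the fitted line plays the role of the model f, and the prediction is simply f evaluated at the new input.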


2. History

The term “regression” was first used by Francis Galton in the 19th century to describe a biological phenomenon: the heights of children tend to regress towards a normal average (the population average).

3. Regression model

There are three types of variables involved in the regression model:
(1) the dependent variable Y,
(2) the independent variables X1, X2, ..., Xk,
(3) the unknown parameters β1, β2, ..., βk.

The regression model can be written in the following general form:

    Y = f(X1, X2, ..., Xk; β1, β2, ..., βk) + ε

or

    E(Y | X1, ..., Xk) = f(X1, X2, ..., Xk; β1, β2, ..., βk)

where ε is called the regression residual.

We need to specify f before the regression model can be used.


The choices of f most discussed are:

(1) Simple regression model:

In this case, the model is defined to be

    Y = f(X1; β0, β1) + ε = β0 + β1 X1 + ε,

where f is a linear function of β0, β1, and X1.

We use this model when the data show a roughly linear pattern.
(2) Multiple regression model:

In this model, f is defined to be

    f = β0 + β1 X1 + β2 X2 + ... + βk Xk.

As discussed in (1), f is still a linear function of β0, ..., βk and X1, ..., Xk, but in higher dimensions.
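A minimal sketch of fitting such a model with k = 2 independent variables, using NumPy's least-squares solver. The data below are generated exactly from the assumed relation Y = 1 + 2·X1 + X2, purely for illustration:

```python
import numpy as np

X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0])
Y  = np.array([5.0, 6.0, 11.0, 12.0, 17.0])  # exactly 1 + 2*X1 + X2

# Build the design matrix [1, X1, X2]; the column of ones carries beta_0.
A = np.column_stack([np.ones_like(X1), X1, X2])

# Least-squares estimates of (beta_0, beta_1, beta_2).
beta, *_ = np.linalg.lstsq(A, Y, rcond=None)
```

Because the data contain no noise, the recovered coefficients equal the assumed ones (1, 2, 1); with real data they would only approximate the true parameters.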

(3) Nonlinear regression model:

The regression model can also use a nonlinear function. For example, if the data show the following curved pattern, then we could assume the regression model to be

    Y = β0 + β1 X1 + β2 X1² + ε

[Figure: scatter plot of a data series showing a curved (quadratic) pattern; x-axis 0–20]
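Such a quadratic model can still be fitted by least squares, since it is linear in the parameters. A sketch with data generated (for illustration) from an assumed quadratic with no noise:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 0.5 * x + 0.25 * x**2   # exact quadratic, assumed for illustration

# polyfit with degree 2 returns coefficients highest power first: (b2, b1, b0).
b2, b1, b0 = np.polyfit(x, y, 2)
```

The point is that "nonlinear in X1" does not prevent least-squares fitting: treating X1² as just another regressor reduces this to the multiple regression case above.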

4. Basic assumptions:
There are some basic assumptions to be introduced when we analyze the regression model. We take the simple regression model

    Yi = β0 + β1 X1i + εi,  i = 1, 2, ..., n,

as an example, where n is the data size, i.e., the number of observations.
Usually, we assume the residuals εi have the following properties:
(1) εi ~ N(0, σ²), i = 1, 2, ..., n,
(2) εi and εj are independent for i ≠ j.
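The assumptions can be made concrete by simulating data that satisfies them: residuals drawn i.i.d. from N(0, σ²). The parameter values below (β0 = 1, β1 = 2, σ = 0.5) are assumptions chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

x = rng.uniform(0.0, 10.0, size=n)
# eps_i ~ N(0, 0.25), drawn independently of each other: assumptions (1) and (2).
eps = rng.normal(loc=0.0, scale=0.5, size=n)
y = 1.0 + 2.0 * x + eps
```

With a large n, the sample mean of the simulated residuals is close to 0 and their sample standard deviation close to σ, as the assumptions require.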

5. Parameter estimations:

There are many methods used in statistics to estimate the parameters, for example, unbiased estimators, the maximum likelihood method, the least squares method, the uniformly minimum variance unbiased estimator, and so on.

In regression analysis, we usually use the least squares estimators to estimate the parameters β0, β1, ..., βk. This method comes directly from the mathematical method of the same name. The idea of this method is to find the values of the parameters that minimize the sum of the squared residuals, that is,

    min over β0, β1 of Σ_{i=1}^{n} εi² = min over β0, β1 of Σ_{i=1}^{n} (Yi − β0 − β1 X1i)²

The following graph illustrates this idea.

[Figure: scatter plot of the data points with the fitted least-squares line; x-axis 0–10]

