
REGRESSION MODEL IN MATRIX NOTATION

Y = Xβ + ε  ⇒  Ŷ = Xβ̂ = X(X'X)⁻¹X'Y

Some Additional Properties of the Transpose and Inverse of a Matrix

We list for future reference some useful facts concerning the transpose and inverse
of a matrix. Proofs are omitted.
(A')' = A
(AB)' = B'A'
(A⁻¹)⁻¹ = A
(AB)⁻¹ = B⁻¹A⁻¹
(A')⁻¹ = (A⁻¹)'
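These identities are easy to check numerically. A minimal sketch with NumPy (any pair of invertible matrices will do; the two matrices below are arbitrary):

```python
import numpy as np

# Two arbitrary invertible 2x2 matrices
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
B = np.array([[1.0, 4.0],
              [0.0, 2.0]])

inv = np.linalg.inv

checks = {
    "(A')' = A":            np.allclose(A.T.T, A),
    "(AB)' = B'A'":         np.allclose((A @ B).T, B.T @ A.T),
    "(A^-1)^-1 = A":        np.allclose(inv(inv(A)), A),
    "(AB)^-1 = B^-1 A^-1":  np.allclose(inv(A @ B), inv(B) @ inv(A)),
    "(A')^-1 = (A^-1)'":    np.allclose(inv(A.T), inv(A).T),
}
print(checks)  # every identity holds
```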
FITTING A REGRESSION MODEL WITH DUMMY VARIABLES
What are dummy variables? QUALITATIVE INDEPENDENT VARIABLES / DUMMY VARIABLES

Frequently we wish to use nominal-scale variables, such as gender, whether the home has a swimming pool, or whether the sports team was the home or the visiting team, in our analysis. These are called qualitative variables. To include a categorical independent variable in a regression model, you use a dummy variable. A dummy variable recodes the categories of a categorical variable using the numeric values 0 and 1. Where appropriate, the value 0 is assigned to the absence of a characteristic and the value 1 to the presence of the characteristic.

Dummy variables, also called indicator variables, allow us to include categorical data (like gender) in regression models. A dummy variable can take only two values: 0 (absence of a category) and 1 (presence of a category).
Suppose we set the dummy variable for gender to 1 when the employee is female and 0 when the employee is not female. When interpreting the results for gender, remember that when the dummy variable is 0 (not female), we are talking about males.
EXAMPLE 1

Employee   Salary   Gender   Experience
1          7.5      Male     6
2          8.6      Male     10
3          9.1      Male     12
4          10.3     Male     18
5          13       Male     30
6          6.2      Female   5
7          8.7      Female   13
8          9.4      Female   15
9          9.8      Female   21

Coding: What would have happened if we had used 0 for females and 1 for males in
our data? Would our results be any different?
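The coding question can be answered empirically. A minimal sketch using the table above: fit the model under both codings with ordinary least squares and compare. Only the sign of the gender coefficient changes (and the intercept shifts by the same amount); the experience slope and all fitted values are identical.

```python
import numpy as np

salary = np.array([7.5, 8.6, 9.1, 10.3, 13.0, 6.2, 8.7, 9.4, 9.8])
experience = np.array([6, 10, 12, 18, 30, 5, 13, 15, 21], dtype=float)
female = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1], dtype=float)  # 1 = female

def fit(dummy):
    # Design matrix [1, experience, dummy]; OLS via least squares
    X = np.column_stack([np.ones_like(salary), experience, dummy])
    beta, *_ = np.linalg.lstsq(X, salary, rcond=None)
    return beta, X @ beta

beta_f, fitted_f = fit(female)       # coding: 1 = female, 0 = male
beta_m, fitted_m = fit(1 - female)   # coding: 1 = male, 0 = female

print(beta_f, beta_m)  # dummy coefficients have opposite signs
```

So the answer is no: the fitted values, predictions, and the size of the gender effect are unchanged; only the sign of the dummy coefficient and the baseline (intercept) category are swapped.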
EXAMPLE 2
Consider a business problem that involves developing a model for predicting the assessed value of houses ($000), based on the size of the house (in thousands of square feet) and whether the house has a swimming pool.

Assessed Value Size Swimming pool


234.4 2.00 Yes
227.4 1.71 No
225.7 1.45 No
235.9 1.76 Yes
229.1 1.93 No
220.4 1.20 Yes
225.8 1.55 Yes
235.9 1.93 Yes
228.5 1.59 Yes
229.2 1.50 Yes
236.7 1.90 Yes
229.3 1.39 Yes
224.5 1.54 No
233.8 1.89 Yes
226.8 1.59 No

Holding constant whether a house has a swimming pool, for each increase of
1.0 thousand square feet in the size of the house, the predicted assessed value is
estimated to increase by 16.1858 thousand dollars (i.e., $16,185.80).

Holding constant the size of the house, the presence of a swimming pool is
estimated to increase the predicted assessed value of the house by 3.8530 thousand
dollars (i.e., $3,853).
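These coefficients follow directly from the matrix formula β̂ = (X'X)⁻¹X'Y introduced at the start of this handout. A minimal NumPy sketch on the 15 observations above reproduces the 16.1858 (size) and 3.8530 (pool) estimates:

```python
import numpy as np

value = np.array([234.4, 227.4, 225.7, 235.9, 229.1, 220.4, 225.8, 235.9,
                  228.5, 229.2, 236.7, 229.3, 224.5, 233.8, 226.8])
size = np.array([2.00, 1.71, 1.45, 1.76, 1.93, 1.20, 1.55, 1.93,
                 1.59, 1.50, 1.90, 1.39, 1.54, 1.89, 1.59])
pool = np.array([1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0], dtype=float)

# Design matrix with intercept column, size, and the pool dummy
X = np.column_stack([np.ones(15), size, pool])

# beta-hat = (X'X)^-1 X'Y, computed via a linear solve for stability
beta = np.linalg.solve(X.T @ X, X.T @ value)
print(beta.round(4))  # [intercept, size slope, pool effect]
```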

EXAMPLE 3

Salary (y)   GPA (x1)   Discipline          X2 (Phys)   X3 (Eng)
41.5         2.95       Physical sciences   1           0
43.0         3.20       Physical sciences   1           0
44.1         3.40       Physical sciences   1           0
45.4         3.60       Physical sciences   1           0
44.2         3.20       Physical sciences   1           0
44.0         2.85       Engineering         0           1
47.0         3.10       Engineering         0           1
47.8         2.85       Engineering         0           1
44.7         3.05       Engineering         0           1
43.4         2.70       Engineering         0           1
40.5         2.75       Accounting          0           0
42.2         3.10       Accounting          0           0
44.0         3.15       Accounting          0           0
42.2         2.95       Accounting          0           0
41.8         2.75       Accounting          0           0

Ŷ = 25.997 + 5.491X1 - 0.312X2 + 3.405X3


 The configurations of these two dummy variables shown above represent the three disciplines. Notice that in this setup the accounting discipline is the reference (or default) category; that is, X2 = X3 = 0. This means that each of the other two academic disciplines is compared to accounting.
 The least squares estimate b2 = -0.312 means that, on average, a student in physical sciences earns R312 less than a student in accounting (salaries are in R1000s). Similarly, the least squares estimate b3 = 3.405 means that, on average, a student in engineering earns R3 405 more than a student in accounting.
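The fitted equation can be reproduced from the table with the same matrix formula. A sketch (accounting is the reference category, so it gets no dummy column):

```python
import numpy as np

salary = np.array([41.5, 43.0, 44.1, 45.4, 44.2,   # physical sciences
                   44.0, 47.0, 47.8, 44.7, 43.4,   # engineering
                   40.5, 42.2, 44.0, 42.2, 41.8])  # accounting
gpa = np.array([2.95, 3.20, 3.40, 3.60, 3.20,
                2.85, 3.10, 2.85, 3.05, 2.70,
                2.75, 3.10, 3.15, 2.95, 2.75])
x2 = np.array([1]*5 + [0]*10, dtype=float)          # 1 = physical sciences
x3 = np.array([0]*5 + [1]*5 + [0]*5, dtype=float)   # 1 = engineering

X = np.column_stack([np.ones(15), gpa, x2, x3])
beta = np.linalg.solve(X.T @ X, X.T @ salary)
print(beta.round(3))  # intercept, GPA slope, physical-sciences effect, engineering effect
```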

EXAMPLE 4
A team of research physicians conducted a study to determine the effect of health
education on the utilization of health services for hypertension patients (Drug Topics,
April 1993). Data collected for a sample of n=282 new HMO enrollers with
hypertension problems were used to fit the following regression model:

E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5

where
y  = annual health care expenditures (dollars)
x1 = age (years)
x2 = 1 if female, 0 if male
x3 = 1 if white, 0 if nonwhite
x4 = number of concomitant maintenance medications (regimen)
x5 = 1 if enrolled in health education program, 0 if not
The regression results are summarized below.

VARIABLE                β ESTIMATE   p-VALUE FOR TESTING H0: βk = 0
Intercept               64.82        < 0.05
Age (x1)                1.05         < 0.05
Gender (x2)             -10.53       < 0.05
Race (x3)               0.27         < 0.05
Regimen (x4)            9.46         < 0.05
Health education (x5)   -92.97       < 0.001

F = 37.84, R² = 0.4357

 Interpret the p-values shown above.


 Predict the annual health care expenditure of a 45-year-old male white
hypertension patient who maintains three medications, but who has not
enrolled in a health care education program.
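The prediction is a direct plug-in of the patient's values into the fitted equation ŷ = 64.82 + 1.05x1 - 10.53x2 + 0.27x3 + 9.46x4 - 92.97x5, sketched below:

```python
import numpy as np

# Estimated coefficients from the table above:
# intercept, age, gender, race, regimen, health education
beta = np.array([64.82, 1.05, -10.53, 0.27, 9.46, -92.97])

# 45-year-old (x1 = 45), male (x2 = 0), white (x3 = 1),
# three medications (x4 = 3), not enrolled (x5 = 0)
x_h = np.array([1, 45, 0, 1, 3, 0])

y_hat = x_h @ beta
print(round(y_hat, 2))  # 140.72, predicted annual expenditure in dollars
```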
STATISTICAL INTERVALS: Y ESTIMATE POINT AND INTERVAL: MEAN & PREDICTION

At times, the researcher wants to predict y_i based on specific x_i values. In estimating a mean response for Y, one needs to specify a vector of x_i values within the range in which the model was constructed.
Matrix Expressions for Confidence Intervals and Prediction Intervals

Point estimates and confidence intervals for the mean response, and prediction intervals for a future response, can also be expressed using matrix notation. The mean response for a specified value x_h of the explanatory variable is E(Y | x_h) = β0 + β1x_h. The estimated mean response, denoted Ŷ(x_h), can be written as the matrix product

Ŷ(x_h) = β̂0 + β̂1x_h = (1, x_h)(β̂0, β̂1)' = x_h'β̂

Recall that β̂ = (X'X)⁻¹X'Y, therefore x_h'β̂ = x_h'(X'X)⁻¹X'Y.

Let A = x_h'(X'X)⁻¹X' and A' = X(X'X)⁻¹x_h.

The statistical intervals for estimating the mean or predicting new observations in the simple linear regression case are easily extended to the multiple regression case. Here it is only necessary to present the formulas. First, let us define the vector of given predictors as

X_h = (1, x_h,1, x_h,2, x_h,3, ..., x_h,p-1)'

We are interested in either intervals for E(Y | X_h) or intervals for the value of a new response given that the observation has the particular value X_h. First we define the standard error of the fit at X_h, given by:

s.e.(ŷ_h) = √( MSE · X_h'(X'X)⁻¹X_h )
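A quick numerical sketch of this formula on synthetic simple-regression data. A useful sanity check: at X_h = (1, x̄)' the leverage term X_h'(X'X)⁻¹X_h reduces to 1/n, so the standard error of the fit at the mean of x is exactly √(MSE/n):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30
x = rng.uniform(0, 10, n)
y = 2.0 + 0.5 * x + rng.normal(0, 1, n)  # true line plus noise

X = np.column_stack([np.ones(n), x])
beta = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta
mse = resid @ resid / (n - 2)  # p = 2 parameters in simple regression

def se_fit(xh):
    """Standard error of the estimated mean response at x_h."""
    Xh = np.array([1.0, xh])
    return np.sqrt(mse * Xh @ np.linalg.solve(X.T @ X, Xh))

print(se_fit(x.mean()), np.sqrt(mse / n))  # the two agree at x-bar
```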

ALTERNATIVE METHOD TO THE t − TEST IS THE PARTIAL F-TEST

The partial F-test is similar to the F-test, except that individual or subsets of predictor
variables are evaluated for their contribution in the model to increase SSR or,
conversely, to decrease SSE.

In the current example (MAINTENANCE COST (y) and two independent variables, X1 - age and X2 - mileage), we ask, "What is the contribution of the individual X1 - age and X2 - mileage variables?" To determine this, we can evaluate the model first with X1 in the model, then with X2. We evaluate X1 in the model, not excluding X2 but holding it constant, and then we measure its effect with the sum-of-squares regression (or sum-of-squares error), and vice versa. That is, SSR(X1 | X2) is the sum-of-squares regression explained by adding X1 - age into the model already containing X2 - mileage.

The development of this idea is straightforward: SSR(x_k | x_1, x_2, ..., x_{k-1}) measures the effect of x_k's contribution to a model already containing the variables x_1, ..., x_{k-1}, or various other combinations.
For the present two-predictor variable model, let us assume that X1 - age is important, and we want to evaluate the contribution of X2 - mileage, given X1 - age is in the model. The general strategy of partial F-tests is to perform the following:
1. Regression with X1 only in the model.
2. Regression with X1 and X2 in the model.
3. Find the difference between the model containing only X1 and the model containing X2 given X1 is already in the model, X2 | X1; this measures the contribution of X2.
4. A regression model with k predictors in the model can be contrasted in a number of ways, e.g., (x_k | x_{k-1}) or (x_k | x_{k-1}, x_{k-2}, x_{k-3}), and so on.
NB: The new variable is included only if it significantly improves the model.
Determining the contribution of an independent variable to the regression
model

SSR(xi all variables except j) = SSR(all variables including j) -SSR(all variables except j)

Example: Suppose we have two independent variables in our model (use the selling price data).

Contribution of variable x1 given that x2 has been included:

SSR(x1 | x2) = SSR(x1 and x2) - SSR(x2) = SSR - SSR(x2)

Contribution of variable x2 given that x1 has been included:

SSR(x2 | x1) = SSR(x1 and x2) - SSR(x1)

SSR(x1 | x2) = sum of squares of the contribution of x1 to the regression model given that variable x2 has been included in the model.

SSR(x2 | x1) = sum of squares of the contribution of x2 to the regression model given that variable x1 has been included in the model.

SSR(x1 and x2) = regression sum of squares when variables x1 and x2 are both included in the multiple regression model.

The null and the alternative hypotheses are:

H0: Variable x1 does not significantly improve the model after x2 has been included.
Ha: Variable x1 significantly improves the model after x2 has been included.

The partial F-statistic is used in this instance and is defined as:

F = SSR(x_j | all variables except j) / MSE ~ F(1, n-p)

Relationship between a t-statistic and an F-statistic:

t_a² = F(1, a)
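Both the partial F-statistic and the t² = F relationship can be verified numerically. The sketch below uses synthetic maintenance-cost-style data (the variable names and true coefficients are illustrative, not from the course dataset): it fits the reduced (X1 only) and full (X1 and X2) models, computes F = SSR(x2 | x1)/MSE, and checks that it equals the square of the t-statistic for X2 in the full model.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
age = rng.uniform(1, 10, n)
mileage = rng.uniform(10, 100, n)
cost = 50 + 8 * age + 0.6 * mileage + rng.normal(0, 5, n)

def fit(X, y):
    """Return SSR, beta-hat, and fitted values for an OLS fit."""
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    fitted = X @ beta
    return np.sum((fitted - y.mean()) ** 2), beta, fitted

X_full = np.column_stack([np.ones(n), age, mileage])
X_red = np.column_stack([np.ones(n), age])

ssr_full, beta, fitted = fit(X_full, cost)
ssr_red, _, _ = fit(X_red, cost)

p = X_full.shape[1]                            # number of parameters
mse = np.sum((cost - fitted) ** 2) / (n - p)   # MSE of the full model

F = (ssr_full - ssr_red) / mse                 # partial F for x2 | x1

# t-statistic for mileage in the full model: b / s.e.(b)
cov = mse * np.linalg.inv(X_full.T @ X_full)
t = beta[2] / np.sqrt(cov[2, 2])

print(F, t ** 2)  # the two values agree
```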

The ANOVA table dividing the regression sum of squares into components to determine the contribution of variable x1:

Source of variation   Sum of squares                  Degrees of freedom   Mean square     F
Regression            SSR(x1 and x2)                  2
  x2                  SSR(x2)                         1
  x1 | x2             SSR(x1 | x2) = SSR - SSR(x2)    1                    MSR(x1 | x2)    MSR(x1 | x2)/MSE
Error                 SSE                             n-3                  MSE
Total                 SST                             n-1

The ANOVA table dividing the regression sum of squares into components to determine the contribution of variable x2:

Source of variation   Sum of squares   Degrees of freedom   Mean square     F
Regression            SSR(x1 and x2)   2
  x1                  SSR(x1)          1
  x2 | x1             SSR(x2 | x1)     1                    MSR(x2 | x1)    MSR(x2 | x1)/MSE
Error                 SSE              n-3                  MSE
Total                 SST              n-1

H0: Variable x2 does not significantly improve the model after x1 has been included.
Ha: Variable x2 significantly improves the model after x1 has been included.

Dataset for Practice

X1 X2 Y
17 42 90
19 45 71 76
20 29 63 63 80 80
21 93 80 64 82 66
25 34 75 82
27 98 99
28 9 73
30 73 67 74
