Chapter 12 introduces criteria for model selection and comparison and discusses some standard

methods of choosing models. Model selection depends on the objectives of the study. Ramsey and

Schafer identify three possible objectives (pp. 345-6) that will influence how you select a model or

models:

1. Adjusting for a large set of explanatory variables. We want to examine the effect of a

particular variable or variables after adjusting for the effect of other variables which we know

may affect the response.

2. Fishing for explanation. Which variables are important in explaining the response?

3. Prediction. All that is desired is a model that predicts the response well from the values of the

explanatory variables. Interpretation of the model is not a goal.

Before considering some criteria by which a “best” model might be selected among a class of models,

let’s review model development.

• Variable Selection: identification of response variable(s) and all candidate explanatory variables.

This is done in the planning stages of a study. Note that a general rule of thumb is that you need

5 to 10 times as many observations as there are explanatory variables in order to do a good job of

model selection and fitting.

• Model Formation: fitting and comparing models based on some selection criteria to determine

one or more candidate models.

• Model Diagnostics: checking for problems in the model and/or its assumptions.

1. Residual Analysis - identifying outliers, missing variables, model lack of fit, and

violation of assumptions.

2. Influence Statistics - identifying influential observations, or those which have a great

effect on the form of the model.

Example: Suppose we are studying differences in abundance of bird species in 3 forest habitats. The

habitats represent various levels of prescribed burns. The experiment itself consists of counting the

number of birds of each species type heard from a station within 100 meters in a 10-minute period.

Many stations were used in the study for replication.

Explanatory variables? Habitat type, Neighboring habitat type, elevation, slope, aspect, visibility, etc.

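Suppose we have 8 candidate explanatory variables $X_1, \dots, X_8$. How many possible first-order models are there? Each variable is either in or out of the model, so there are $2^8 - 1 = 255$ models containing at least one variable.

With 20 variables, there are $2^{20} - 1 = 1{,}048{,}575$ models, and these are only the first-order models. Clearly, fitting all possible models is not a feasible prospect.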
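The count of first-order models can be checked with a short sketch. The function names here are illustrative only and are not part of the handout; the count excludes the constant-only model.

```python
from itertools import combinations


def count_first_order_models(k):
    """Number of non-empty subsets of k candidate variables."""
    return 2 ** k - 1


def list_first_order_models(names):
    """Enumerate every non-empty subset of the variable names."""
    return [subset
            for size in range(1, len(names) + 1)
            for subset in combinations(names, size)]


print(count_first_order_models(8))    # 255
print(count_first_order_models(20))   # 1048575
print(len(list_first_order_models(["X1", "X2", "X3"])))  # 7
```

Enumeration is practical only for a handful of variables, which is why the count grows out of reach so quickly.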

Chapter 12, page 2

Criteria for selecting models

1. R²: R² cannot decrease when variables are added so the model maximizing R² is the one with all the variables. Maximizing R² is equivalent to minimizing SSE. R² is an appropriate way to compare models with the same number of explanatory variables (as long as the response variable is the same). Be aware that measures like R² based on correlations are sensitive to outliers.

2. MSE = SSE/(n − p): MSE can increase when variables are added to the model so minimizing MSE is a reasonable procedure. However, minimizing MSE is equivalent to maximizing adjusted R² (discussed below) and tends to overfit (include too many variables).

3. Adjusted R²: This statistic adjusts R² by including a penalty for the number of parameters in the model. This statistic is closely related to both R² and MSE, as shown below.

$$\text{Adjusted } R^2 = \frac{\text{Total mean square} - \text{Residual mean square}}{\text{Total mean square}} = \frac{MST - MSE}{MST} = 1 - \frac{MSE}{MST} = R^2 - \frac{(p-1)(1-R^2)}{n-p}$$

where p is the number of coefficients (including the intercept) in the model. The third expression shows that maximizing adjusted R² is equivalent to minimizing MSE since MST is fixed (it’s simply the variance of the response variable).

• Adjusted R² tends to select models with too many variables (overfitting). This can be seen from the fact that adjusted R² will increase when a variable is added if the F statistic for comparing the two models is greater than 1. This is a very generous criterion: for a single added variable, F > 1 is the same as |t| > 1, which corresponds to a significance level of roughly .3.
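As a numerical check on the two forms of adjusted R², the sketch below uses values from the Ozone table later in this handout: SSE = 21.534 for the model with p = 7 coefficients, and SSE = 70.695 for the constant-only model, which is the total sum of squares, with n = 110. The function names are hypothetical.

```python
def adjusted_r2_from_mean_squares(sse, sst, n, p):
    """Adjusted R^2 as 1 - MSE/MST."""
    mse = sse / (n - p)   # residual mean square
    mst = sst / (n - 1)   # total mean square (variance of the response)
    return 1 - mse / mst


def adjusted_r2_from_r2(r2, n, p):
    """Adjusted R^2 as R^2 minus the penalty (p - 1)(1 - R^2)/(n - p)."""
    return r2 - (p - 1) * (1 - r2) / (n - p)


n, p = 110, 7
sse, sst = 21.534, 70.695
r2 = 1 - sse / sst

a = adjusted_r2_from_mean_squares(sse, sst, n, p)
b = adjusted_r2_from_r2(r2, n, p)
print(round(r2, 3), round(a, 3), round(b, 3))  # 0.695 0.678 0.678
```

Both routes give the same number, as the chain of equalities requires.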

4. Mallows’ Cp: The Cp statistic assumes that the full model with all variables fits. Then Cp is

computed for a reduced model as

$$C_p = p + (n - p)\,\frac{\hat{\sigma}^2 - \hat{\sigma}^2_{\text{full}}}{\hat{\sigma}^2_{\text{full}}} = (n - p)\,\frac{\hat{\sigma}^2}{\hat{\sigma}^2_{\text{full}}} + 2p - n$$

where p is the number of coefficients (including the intercept) in the reduced model.

• Note that σ̂² is simply the MSE (mean square error or mean square residual) for a model.

• Models with small values of Cp are considered better and, ideally, we look for the smallest model with a Cp of around p or smaller. Some statistics programs will compute Cp for a large set of models and plot Cp versus p, as in Display 12.9 on p. 357. Unfortunately, SPSS does not compute Cp automatically.

• Cp assumes that the full model fits and satisfies all the regression model assumptions. Outliers, unexplained nonlinearity, and nonconstant variance may seriously affect the performance of Cp as a model selection tool.

• Mallows’ Cp is closely related to AIC. AIC has come to be preferred by many statisticians in recent years.
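Since SPSS does not report Cp, it can be computed by hand from the SSE of the reduced and full models. The sketch below is only an illustration: it treats the six-term interaction model from the Ozone table in this handout as the "full" model, which is an assumption made for the example, and the function name is hypothetical.

```python
def mallows_cp(sse_reduced, p_reduced, sse_full, p_full, n):
    """Cp = (n - p) * sigma2_reduced / sigma2_full + 2p - n.

    Because (n - p) * sigma2_reduced equals the reduced model's SSE,
    the first term simplifies to SSE_reduced / sigma2_full.
    """
    sigma2_full = sse_full / (n - p_full)   # MSE of the full model
    return sse_reduced / sigma2_full + 2 * p_reduced - n


n = 110
sse_full, p_full = 21.534, 7    # W + T + S + W:T + W:S + T:S
sse_red, p_red = 21.897, 5      # W + T + S + T:S

cp_full = mallows_cp(sse_full, p_full, sse_full, p_full, n)
cp_red = mallows_cp(sse_red, p_red, sse_full, p_full, n)
print(round(cp_full, 2), round(cp_red, 2))  # 7.0 4.74
```

The full model always has Cp equal to its own p. The reduced model here has Cp below its p of 5, which is the pattern the bullet above describes as desirable.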

5. Akaike's Information Criterion (AIC): The AIC statistic for a model is given by:

$$\text{AIC} = n \ln\left(\frac{SSE}{n}\right) + 2p$$

where SSE = the error SS for the model under consideration and ln is natural log.

• The term 2p is the penalty for the number of parameters in the model.

• Ripley: “AIC has been criticized in asymptotic studies and simulation studies for tending to over-fit, that is, choose a model at least as large as the true model. That is a virtue, not a deficiency: this is a prediction-based criterion, not an explanation based one.” BIC (below) is a criterion based on an “explanation” approach and places a bigger penalty on the number of parameters.

• AIC can only be used to compare models. It is not an absolute measure of fit of the model like R² is. The model with the smallest AIC among those you examined may fit the data best, but this does not mean it's a good model. Therefore, selecting which models to consider (which variables, transformations, form of the model) and making sure the models satisfy the regression model assumptions is very important.

• Since AIC is not an absolute measure of fit, many authors suggest reporting ∆AIC, the

difference between the AIC of each model and the AIC of the best fitting model. A further

suggestion is to consider all models with ∆AIC less than about 2 as having essentially equal

support.

• Neither AIC nor Cp nor R² nor adjusted R² can be used to compare models with different response variables.

• AIC is based on the assumption that the models satisfy the regression model assumptions

and can be greatly affected by outliers.
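The AIC formula and the ∆AIC suggestion can be illustrated with a few rows of the Ozone table from this handout (n = 110). This is a minimal sketch; the function name and the choice of which four models to compare are illustrative.

```python
import math


def aic(sse, n, p):
    """AIC = n ln(SSE/n) + 2p for a least-squares regression model."""
    return n * math.log(sse / n) + 2 * p


n = 110
# (model, p, SSE) taken from the Ozone table
models = [
    ("W + T + S + T:S", 5, 21.897),
    ("W + T + S + W:T + T:S", 6, 21.537),
    ("W + T + S", 4, 23.069),
    ("W + T", 3, 26.995),
]

aics = {name: aic(sse, n, p) for name, p, sse in models}
best = min(aics.values())
delta = {name: value - best for name, value in aics.items()}

for name in aics:
    print(f"{name:24s} AIC = {aics[name]:8.2f}  dAIC = {delta[name]:5.2f}")
```

Among these four, the two models containing T:S are within 2 AIC units of each other, so by the rule of thumb above they have essentially equal support. The computed values agree with the table to within the rounding of the tabulated SSE.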

6. Bayesian Information Criterion (BIC). BIC is similar to AIC but the penalty on the number of parameters is p ln(n), where ln is the natural log. That is,

$$\text{BIC} = n \ln\left(\frac{SSE}{n}\right) + p \ln(n)$$

BIC is motivated by a Bayesian approach to model selection and is said not to tend to overfit

like AIC. Therefore, it may be better for model selection for “explanation.” The purpose of

having the penalty depend on the sample size n is to reduce the likelihood that small and

relatively unimportant parameters are included (which is more likely with large n).
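To see how the heavier penalty changes a comparison, the sketch below computes both criteria for two neighboring models from the Ozone table in this handout (n = 110). The function names are hypothetical.

```python
import math


def aic(sse, n, p):
    return n * math.log(sse / n) + 2 * p


def bic(sse, n, p):
    """BIC = n ln(SSE/n) + p ln(n)."""
    return n * math.log(sse / n) + p * math.log(n)


n = 110
small = ("W + T + S + T:S", 5, 21.897)
large = ("W + T + S + W:T + T:S", 6, 21.537)

# How much worse the larger model scores on each criterion
aic_gap = aic(large[2], n, large[1]) - aic(small[2], n, small[1])
bic_gap = bic(large[2], n, large[1]) - bic(small[2], n, small[1])

print(round(math.log(n), 2))                  # per-parameter BIC penalty: 4.7
print(round(aic_gap, 2), round(bic_gap, 2))   # 0.18 2.88
```

With n = 110 each extra coefficient costs about 4.7 under BIC versus 2 under AIC, so the extra W:T term that AIC barely penalizes is clearly disfavored by BIC.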

7. PRESS Statistic (not in text): Another prediction-based model selection statistic is the PRESS statistic. It is calculated as follows: Remove the ith observation and fit the model with the remaining n − 1 observations. Then use this model to calculate a predicted value for the left-out observation; call this predicted value $Y_i^*$. Compute $Y_i - Y_i^*$, the difference between the observed response and the predicted response from the model without the ith observation in it. Repeat this process for each data value. The PRESS statistic is then defined as:

$$\text{PRESS} = \sum_{i=1}^{n} \left(Y_i - Y_i^*\right)^2$$

• Leaving one item out at a time is known as n-fold cross-validation or leave-one-out cross-validation.

• The $Y_i - Y_i^*$ are called “deleted” residuals in SPSS. So the PRESS statistic can be computed in SPSS by saving the deleted residuals, creating a new variable which is the square of the deleted residuals, then computing the sum of this new variable using Analyze…Descriptive Statistics…Descriptives and choosing Sum under Options.

• PRESS is similar to SSE, but is based on the deleted residuals rather than the raw residuals. Unlike SSE, it’s possible for PRESS to increase when variables are added to the model.
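The leave-one-out recipe can be written out directly. The sketch below is restricted to simple linear regression (one explanatory variable) to stay self-contained, and the data are made up for illustration; the function names are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def sse(x, y):
    """Sum of squared raw residuals from the fit to all the data."""
    a, b = fit_line(x, y)
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


def press(x, y):
    """Sum of squared deleted residuals: refit without case i, predict case i."""
    total = 0.0
    for i in range(len(x)):
        x_rest = x[:i] + x[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        a, b = fit_line(x_rest, y_rest)
        total += (y[i] - (a + b * x[i])) ** 2
    return total


# Made-up data for illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
print(sse(x, y), press(x, y))
```

PRESS is never smaller than SSE for the same model, because each case is predicted by a fit that did not get to see it.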

The PRESS statistic is an example of the general idea of using cross-validation to assess the predictive power of models. A model will generally predict the data it's based on better than new data, and bigger models will necessarily do at least as good a job of predicting the data they’re based on as smaller models: SSE never increases as more terms are added to the model. A less biased way of assessing the predictive power of a model is to use the following general idea: fit a model using a subset of the data, then validate the model using the remainder of the data. This is called cross-validation (abbreviated CV).

In k-fold CV, the data are randomly split into k approximately equal-sized subsets. Each subset is left out in turn and the model based on the remaining subsets is used to predict for the left-out subset. The PRESS statistic is based on n-fold CV, that is, only one observation at a time is left out. Simulations have suggested that smaller values of k may work better; 10-fold CV has become a standard method of cross-validation. Cross-validation is most useful as a way to compare models rather than as an absolute measure of how good the predictions will be. This is because the model used for prediction of each subset is different from the model based on all the data that will actually be used to predict future observations. Each of the models being compared should use the same splits of the data. It’s also best to repeat the 10-fold CV several times and average the results.
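A minimal sketch of k-fold CV, again limited to simple linear regression and made-up data. The function names and the `seed` argument are illustrative; fixing the seed is one way to make sure every model being compared sees the same splits.

```python
import random


def fit_line(x, y):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - slope * xbar, slope


def kfold_cv_sse(x, y, k, seed=0):
    """Sum of squared prediction errors over k held-out folds."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)          # random split, reproducible
    folds = [idx[j::k] for j in range(k)]     # k roughly equal-sized subsets
    total = 0.0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        total += sum((y[i] - (a + b * x[i])) ** 2 for i in fold)
    return total


# Made-up data for illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [2.3, 3.8, 6.1, 8.4, 9.7, 12.2, 13.6, 16.3, 18.1, 19.8]

print(kfold_cv_sse(x, y, k=5))         # 5-fold CV
print(kfold_cv_sse(x, y, k=len(x)))    # n-fold CV, i.e. the PRESS statistic
```

Repeating the call with several different seeds and averaging the results corresponds to the last recommendation in the paragraph above.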

Example

Ozone data without case 17. n = 110 cases. Dependent variable is log10(ozone).

Model                          p     SSE     R²    MSE      AIC      BIC  PRESS
W + T + S + W:T + W:S + T:S    7  21.534  0.695  0.209  -165.39  -146.49  25.62
W + T + S + W:T + W:S          6  22.152  0.687  0.213  -164.28  -148.08  25.56
W + T + S + W:T + T:S          6  21.537  0.695  0.207  -167.38  -151.17  24.51
W + T + S + W:S + T:S          6  21.867  0.691  0.210  -165.70  -149.50  25.44
W + T + S + W:T                   22.182                                  24.55
W + T + S + W:S                5  22.726  0.679  0.216  -163.47  -149.96  25.63
W + T + S + T:S                5  21.897  0.690  0.209  -167.56  -154.05  24.54
W + T + S                      4  23.069  0.674  0.218  -163.82  -153.02  25.20
W + T + W:T                    4  26.372  0.627  0.249  -149.10  -138.30  28.54
W + T                          3  26.995  0.618  0.252  -148.53  -140.43  28.78
W + S + W:S                    4  36.121  0.489  0.341  -114.50  -103.69  39.39
W + S                          3  36.410  0.485  0.340  -115.62  -107.52  38.70
T + S + T:S                    4  27.029  0.618  0.255  -146.39  -135.59  29.22
T + S                          3  28.038  0.603  0.262  -144.36  -136.26  29.68
W                              2  44.985  0.364  0.417   -94.36   -88.95  46.84
T                              2  31.908  0.549  0.295  -132.14  -126.74  32.98
S                              2  57.974  0.180  0.537   -66.45   -61.05  60.15
Constant                       1  70.695  0.000  0.649   -46.63   -43.93  72.00

Model                          p     SSE     R²    MSE      AIC      BIC  PRESS
W + T + S + W^2 + T^2 + S^2    7  20.175  0.715  0.196  -172.56  -153.66  23.57
W + T + S + W^2 + T^2          6  20.754  0.706  0.200  -171.45  -155.25  23.79
W + T + S + W^2 + S^2          6  20.875  0.705  0.201  -170.81  -154.61  23.51
W + T + S + T^2 + S^2          6  21.270  0.699  0.205  -168.75  -152.55  24.15
W + T + S + W^2                   21.393                                  23.65
W + T + S + T^2                5  21.818  0.691  0.208  -167.95  -154.45  24.36
W + T + S + S^2                5  22.614  0.680  0.215  -164.01  -150.51  25.12
W + T + W^2 + T^2              5  24.924  0.647  0.237  -153.31  -139.81  28.19
W + T + W^2                    4  25.390  0.641  0.240  -153.27  -142.47  27.68
W + T + T^2                    4  25.998  0.632  0.245  -150.67  -139.87  28.33
W + S + W^2 + S^2              5  29.996  0.576  0.286  -132.94  -119.43  32.79
W + S + W^2                    4  32.958  0.534  0.311  -124.58  -113.78  35.31
W + S + S^2                    4  33.350  0.528  0.315  -123.28  -112.47  36.12
T + S + T^2 + S^2              5  25.466  0.640  0.243  -150.95  -137.44  28.14
T + S + T^2                    4  26.418  0.626  0.249  -148.91  -138.11  28.58
T + S + S^2                    4  27.207  0.615  0.257  -145.67  -134.87  29.39
W + W^2                        3  41.263  0.416  0.386  -101.86   -93.76  43.98
T + T^2                        3  30.579  0.567  0.286  -134.82  -126.72  32.32
S + S^2                        3  49.093  0.306  0.459   -82.74   -74.64  51.72

Approaches to choosing a model

There are a number of possible approaches to model selection using the measures above to compare

and select models:

• Choose several models a priori that make scientific sense. Use criteria above (like AIC and

BIC) to compare models.

• Examine all possible models involving the variables, including interactions and/or quadratic

terms or both (this is what was done with Ozone data). Generally feasible only up to 3 or 4

variables.

• Examine all main effects models only (there are $2^k - 1$ possible models where k is the number of variables). Consider interactions or other higher order terms only after the main effects have been selected.

• If the number of variables is large, select a subset of the variables first, perhaps based on the

correlation of each of the variables individually with the response and/or eliminating redundant

variables (ones which are highly correlated with another variable). Then proceed with one of

the above approaches.

• If the number of variables is large, use stepwise regression to select possible models. Stepwise

regression does not require examination of all models.

Some authors do not believe in stepwise methods and other procedures that search for “good-fitting”

models because they are essentially searching through many tens or hundreds of possible models,

whether they make any scientific sense or not, and picking the “best” ones. The more models you

consider, the higher the likelihood you will select the “wrong” one. Therefore, they believe, you should

select a few models a priori that you will compare. Others argue that there is no “right” model and that

if the goal is prediction, it does not matter if the model makes physical sense. In that case, cross-

validation (discussed above) might be an important tool.

Stepwise regression

Stepwise regression methods attempt to find models minimizing or maximizing some criterion without examining every possible model. Stepwise methods are not guaranteed to find the best model (in terms of the criterion selected), but simply try to find good models using a one-step-at-a-time approach.

The three most common types of subset selection methods employed are outlined below. The criterion

used in these descriptions is the F statistic for comparing two nested models, but stepwise methods can

also use the associated P-value, or AIC or BIC as a criterion. The latter two are now generally

preferred to the F statistic or P-value. SPSS, however, only does stepwise regression with the F

statistic or P-value.

Forward Selection

1. Start with the model with only the constant.

2. Consider all models which consist of the current model plus one more term. For each term not

in the model, calculate its “F-to-enter” (the extra sum-of-squares F statistic). Identify the

variable with the largest F-to-enter. Higher order terms (interactions, quadratic terms) are

eligible for entry only if all lower order terms involved in them are already in the model. For

example, do not consider the interaction AxB for entry unless both A and B individually are

already in the model.

3. If the largest F-to-enter is greater than 4 (or some other user-specified number), add this

variable to get a new current model and return to step 2. If the largest F-to-enter is less than the

user-specified number, stop.

The criterion could also be the P-value for the F-test, in which case a term is added only if its P-value is less than the user-specified cutoff (usually somewhere between .05 and .20). If a variable is a categorical variable with more than 2 levels, we add all the indicator variables for this variable at once. Note that once a variable has been entered it cannot be removed, even if its coefficient becomes statistically nonsignificant with the addition of other variables, which is possible.
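The forward selection steps above can be sketched in a few dozen lines. This is an illustration under simplifying assumptions, not how SPSS implements it: it handles only numeric first-order terms (no hierarchy rule for interactions, no grouped indicator variables), fits by solving the normal equations directly, and uses made-up data. All function names are hypothetical.

```python
def sse(columns, y):
    """Residual SS from regressing y on an intercept plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations [X'X | X'y]
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    for k in range(p):                      # Gaussian elimination, partial pivoting
        piv = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[piv] = A[piv], A[k]
        for r in range(k + 1, p):
            f = A[r][k] / A[k][k]
            for c in range(k, p + 1):
                A[r][c] -= f * A[k][c]
    b = [0.0] * p
    for k in reversed(range(p)):            # back-substitution
        b[k] = (A[k][p] - sum(A[k][c] * b[c] for c in range(k + 1, p))) / A[k][k]
    return sum((y[i] - sum(X[i][j] * b[j] for j in range(p))) ** 2 for i in range(n))


def forward_select(candidates, y, f_to_enter=4.0):
    """candidates maps a name to its column; returns names in order of entry."""
    n = len(y)
    chosen = []
    current_sse = sse([], y)                # step 1: constant-only model
    while len(chosen) < len(candidates):
        best = None
        for name in candidates:             # step 2: F-to-enter for each term
            if name in chosen:
                continue
            cols = [candidates[c] for c in chosen + [name]]
            new_sse = sse(cols, y)
            p_new = len(cols) + 1           # coefficients including intercept
            f = (current_sse - new_sse) / (new_sse / (n - p_new))
            if best is None or f > best[1]:
                best = (name, f, new_sse)
        if best[1] <= f_to_enter:           # step 3: stop if largest F is too small
            break
        chosen.append(best[0])
        current_sse = best[2]
    return chosen


# Made-up data: y depends on x1; x2 is unrelated
x1 = [float(i) for i in range(12)]
x2 = [1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0]
noise = [0.4, -0.2, 0.1, -0.5, 0.3, 0.0, -0.1, 0.2, -0.4, 0.5, -0.3, 0.0]
y = [2.0 + 3.0 * a + e for a, e in zip(x1, noise)]

print(forward_select({"x1": x1, "x2": x2}, y))
```

Backward elimination follows the same pattern in reverse: start from all terms, compute each term's F-to-remove, and drop the smallest one while it is at or below the cutoff.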

Backward Elimination

1. Start with the model with all of the candidate variables and any higher order terms which might

be important.

2. Calculate the F-to-remove for each variable in the current model (the extra-sum-of-squares test

statistic). Identify the variable with the smallest F-to-remove. A lower order term is eligible

for removal only if all higher order terms involving that variable have already been removed.

For example, the variable A is not eligible for removal if AxB is still in the model.

3. If the smallest F-to-remove is 4 (or some other user-specified number) or less, then remove that

variable to get a new current model and return to step 2. If the smallest F-to-remove is greater

than the user-specified number, stop.

• Again, the criterion for removal could be the P-value (remove a variable only if its P-value is

greater than the cutoff).

• Backward elimination is preferred to forward selection by many users because it does not

eliminate a term unless there is good reason to (forward selection, on the other hand, does not

include a term unless there is convincing evidence to include it).

Stepwise Selection

This method is a hybrid of the previous two, involving both forward selection and backward

elimination.

1. Start with the model with only the constant.

2. Do one step of forward selection.

3. Do one step of backward elimination.

4. Repeat steps 2 and 3 until no changes occur during one cycle of steps 2 and 3.

The F-to-enter must be greater than the F-to-remove; otherwise, you could have a never-ending cycle

of a variable being entered, then eliminated. If a P-value cutoff is used, then the P for entry must be

smaller than the P for removal.

Forward selection in SAT data (Case study 12.1) using P of .05 or less to enter. Preliminary analysis

presented in text suggested that log of percent taking exam (log(takers)) should be used in place of

takers.

Coefficients (a)

                         Unstandardized           Standardized
                         Coefficients             Coefficients
Model                    B           Std. Error   Beta          t          Sig.
1   (Constant)           1112.248    12.275                     90.611     .000
    Log10(takers)        -135.896     9.476       -.900        -14.340     .000
2   (Constant)           1060.351    15.539                     68.239     .000
    Log10(takers)        -148.061     8.459       -.981        -17.504     .000
    expend                  2.900      .646        .252          4.488     .000
3   (Constant)            851.315    87.022                      9.783     .000
    Log10(takers)        -143.383     8.272       -.950        -17.333     .000
    expend                  2.698      .620        .234          4.350     .000
    years                  12.833     5.265        .127          2.438     .019

a. Dependent Variable: sat

Excluded Variables (d)

                                                  Partial        Collinearity Statistics
Model             Beta In     t         Sig.      Correlation    Tolerance
1   income         .078a       .997     .324       .144           .648
    years          .157a      2.592     .013       .354           .960
    public         .048a       .755     .454       .109           .980
    expend         .252a      4.488     .000       .548           .897
    rank           .221a      1.028     .309       .148           .086
2   income        -.057b      -.783     .438      -.115           .533
    years          .127b      2.438     .019       .338           .943
    public        -.014b      -.254     .801      -.037           .916
    rank           .101b       .546     .588       .080           .084
3   income        -.051c      -.726     .472      -.108           .532
    public         .056c       .938     .353       .138           .727
    rank           .369c      1.939     .059       .278           .067

a. Predictors in the Model: (Constant), Log10(takers)
b. Predictors in the Model: (Constant), Log10(takers), expend
c. Predictors in the Model: (Constant), Log10(takers), expend, years
d. Dependent Variable: sat

These three stepwise methods will not necessarily lead to the same model. In addition, changes in the F

or P-to-enter and F or P-to-remove can result in more or fewer variables in the final model.

The SPSS stepwise regression procedure has some disadvantages. SPSS has no way of knowing that

some variables may be higher order terms that involve lower order terms. Therefore, it cannot enforce

the restriction that higher order terms cannot be added before the corresponding lower order terms

have been added, nor that lower order terms cannot be eliminated until all higher order terms involving

them have been eliminated (that is why I used the SAT data and not the Ozone data with higher order

terms in this example). SPSS also cannot treat the set of indicator variables corresponding to a

categorical variable as one set of variables that should all be added or eliminated at once.

However, SPSS does allow you to define blocks of explanatory variables which can be treated differently in stepwise regression. Therefore, for the ozone data, where I wanted to look at adding two-way interactions and quadratic terms, I defined Block 1 to be Wind, MaxTemp and SolarRad and Block 2 to be all the two-way interactions and quadratic terms. I defined the “Method” for Block 1 to be “Enter”, which means these variables will be in the starting model and cannot be eliminated. I defined the “Method” for Block 2 to be “Stepwise”, which means these variables can be added or eliminated. The P-to-enter and P-to-remove were the default values of .05 and .10, respectively.

Ozone data, case #17 deleted: stepwise regression; Wind, MaxTemp and SolarRad forced to be in the model.

Coefficients (a)

                                   Unstandardized          Standardized
                                   Coefficients            Coefficients
Model                              B         Std. Error    Beta          t         Sig.
1   (Constant)                      .114      .226                        .504     .615
    Wind speed (mph)               -.030      .006         -.308        -4.779     .000
    Maximum temperature (F)         .019      .002          .519         7.830     .000
    Solar radiation (langleys)      .001      .000          .245         4.248     .000
2   (Constant)                      .518      .260                       1.992     .049
    Wind speed (mph)               -.096      .024         -.980        -4.040     .000
    Maximum temperature (F)         .018      .002          .489         7.534     .000
    Solar radiation (langleys)      .001      .000          .247         4.429     .000
    Wind^2                          .003      .001          .676         2.868     .005

a. Dependent Variable: Log10(Ozone)

Excluded Variables

                                                  Partial        Collinearity Statistics
Model              Beta In     t         Sig.     Correlation    Tolerance
1   Wind^2           .676      2.868     .005      .270           .052
    MaxTemp^2       1.929      2.454     .016      .233           .005
    SolarRad^2      -.359     -1.453     .149     -.140           .050
    WindTemp        -.776     -2.049     .043     -.196           .021
    WindSolar       -.256     -1.258     .211     -.122           .074
    TempSolar       1.198      2.371     .020      .225           .012
2   MaxTemp^2       1.431      1.789     .076      .173           .004
    SolarRad^2      -.384     -1.606     .111     -.156           .050
    WindTemp        -.021      -.038     .969     -.004           .010
    WindSolar       -.117      -.572     .568     -.056           .069
    TempSolar        .933      1.846     .068      .178           .011

One significant problem with using the F statistic or P-value is that the addition and elimination of

variables is not based on a criterion for comparing models – the final model is not necessarily

“optimal” in any sense. Why not add or eliminate variables based on one of the measures considered

in the first part of this handout, such as AIC or BIC?

The stepAIC function in the MASS library of S-Plus does stepwise regression using AIC (or BIC) as

the criterion. In forward selection, it looks for the single variable which reduces AIC the most; if no

variable reduces AIC, then it stops. In backward elimination, the goal is the same: find the variable

whose elimination reduces AIC the most. If no variable reduces AIC when it's eliminated, then stop.

In stepwise using both directions, find the addition or deletion which reduces AIC the most. Using AIC

has the additional appeal of not having to set arbitrary criteria for entering and removing variables.

The stepAIC function also handles categorical variables and interactions properly: an interaction

cannot be added unless all the variables involved in the interaction have been added; similarly, a

variable cannot be eliminated unless all higher order interactions involving that variable have been

eliminated. Unfortunately, stepAIC does not handle quadratic terms properly.
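To make the "add the variable that reduces AIC the most" rule concrete, the sketch below reproduces the first forward step of the SAT example from the residual sums of squares printed in the stepAIC output in this handout. It assumes n = 49, the value implied by the `k=log(49)` call used for BIC; the function name is hypothetical and this is not how stepAIC is implemented internally.

```python
import math


def aic(rss, n, p):
    """AIC = n ln(RSS/n) + 2p, the form that matches the printed stepAIC values."""
    return n * math.log(rss / n) + 2 * p


n = 49
current_aic = aic(245376.12, n, 1)          # sat ~ 1 (constant only)

# RSS after adding each single variable, copied from the output
rss_after_adding = {
    "log(takers)": 46369.26,
    "rank": 55079.38,
    "income": 143349.72,
    "years": 219037.88,
    "public": 244144.39,
    "expend": 244990.54,
}

scores = {name: aic(rss, n, 2) for name, rss in rss_after_adding.items()}
best = min(scores, key=scores.get)

print(round(current_aic, 2))                # 419.42
print(best, round(scores[best], 2))         # log(takers) 339.78
print(sorted(name for name, value in scores.items() if value > current_aic))
```

The last line lists the variables whose addition would raise AIC above the current model (public and expend), matching their position below `<none>` in the printed table.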

> summary(m0)

Residuals:

Min 1Q Median 3Q Max

-158.4 -59.45 19.55 50.55 139.6

Coefficients:

Value Std. Error t value Pr(>|t|)

(Intercept) 948.4490 10.2140 92.8574 0.0000

Multiple R-Squared: 2.465e-029

F-statistic: Inf on 0 and 48 degrees of freedom, the p-value is NA

Start: AIC= 419.42

sat ~ 1

+ log(takers) 1 199006.8593 46369.26 339.7760

+ rank 1 190296.7388 55079.38 348.2108

+ income 1 102026.4049 143349.72 395.0799

+ years 1 26338.2438 219037.88 415.8538

<none> NA NA 245376.12 419.4176

+ public 1 1231.7335 244144.39 421.1710

+ expend 1 385.5838 244990.54 421.3406

sat ~ log(takers)

+ expend 1 20523.4615 25845.80 313.1361

+ years 1 6363.5198 40005.74 334.5429

<none> NA NA 46369.26 339.7760

+ rank 1 871.1345 45498.13 340.8467

+ income 1 785.0507 45584.21 340.9393

+ public 1 448.9059 45920.36 341.2993

- log(takers) 1 199006.8593 245376.12 419.4176

sat ~ log(takers) + expend

+ years 1 1248.184463 24597.62 312.7106

+ rank 1 1053.599508 24792.20 313.0967

<none> NA NA 25845.80 313.1361

+ income 1 53.329409 25792.47 315.0349

+ public 1 1.292761 25844.51 315.1336

- expend 1 20523.461462 46369.26 339.7760

- log(takers) 1 219144.737003 244990.54 421.3406

sat ~ log(takers) + expend + years

+ rank 1 2675.51301 21922.10 309.0681

<none> NA NA 24597.62 312.7106

- years 1 1248.18446 25845.80 313.1361

+ public 1 287.82166 24309.80 314.1339

+ income 1 19.19044 24578.43 314.6724

- expend 1 15408.12616 40005.74 334.5429

- log(takers) 1 190946.97826 215544.60 417.0660

sat ~ log(takers) + expend + years + rank

<none> NA NA 21922.10 309.0681

+ income 1 505.3684 21416.74 309.9253

+ public 1 185.0259 21737.08 310.6528

- rank 1 2675.5130 24597.62 312.7106

- years 1 2870.0980 24792.20 313.0967

- log(takers) 1 5094.3405 27016.44 317.3067

- expend 1 13619.6111 35541.72 330.7455

Call:

lm(formula = sat ~ log(takers) + expend + years + rank, data = case1201)

Coefficients:

(Intercept) log(takers) expend years rank

399.1147 -38.1005 3.995661 13.14731 4.400277

Residual standard error: 22.32106

Stepwise regression starting with the main effects model and allowing all two-way interactions.

> stepAIC(mfull,list(upper=~.^2,lower=~1))

Start: AIC= 311.88

sat ~ log(takers) + income + years + public + expend + rank

+ years:public 1 5027.807692 16368.93 300.7547

+ log(takers):public 1 3617.792915 17778.95 304.8035

+ income:public 1 1977.822427 19418.92 309.1269

+ income:years 1 1804.755461 19591.98 309.5617

- public 1 19.997447 21416.74 309.9253

+ public:rank 1 1452.863422 19943.87 310.4340

- income 1 340.339906 21737.08 310.6528

+ log(takers):years 1 1197.663996 20199.07 311.0570

+ log(takers):income 1 1194.412626 20202.33 311.0649

+ income:rank 1 1046.006240 20350.73 311.4235

<none> NA NA 21396.74 311.8795

+ years:rank 1 485.951497 20910.79 312.7538

+ log(takers):expend 1 447.951860 20948.79 312.8428

+ expend:rank 1 323.487437 21073.25 313.1330

+ years:expend 1 93.688852 21303.05 313.6645

+ public:expend 1 51.522079 21345.22 313.7614

+ log(takers):rank 1 44.248267 21352.49 313.7781

+ income:expend 1 9.445369 21387.29 313.8579

- log(takers) 1 2150.004922 23546.74 314.5712

- years 1 2531.615348 23928.35 315.3590

- rank 1 2679.046601 24075.78 315.6599

- expend 1 10964.372896 32361.11 330.1517

sat ~ log(takers) + income + years + public + expend + rank + years:public

- income 1 193.844212 16562.77 299.3315

+ log(takers):public 1 923.331155 15445.60 299.9097

+ income:rank 1 869.194138 15499.74 300.0811

<none> NA NA 16368.93 300.7547

+ public:rank 1 587.095100 15781.84 300.9649

+ expend:rank 1 513.555766 15855.37 301.1927

+ log(takers):expend 1 496.074306 15872.86 301.2467

+ log(takers):income 1 417.822552 15951.11 301.4877

+ income:public 1 119.187306 16249.74 302.3966

+ log(takers):rank 1 96.896741 16272.03 302.4638

+ income:expend 1 16.336369 16352.59 302.7058

+ income:years 1 10.664688 16358.27 302.7227

+ log(takers):years 1 9.199796 16359.73 302.7271

+ public:expend 1 4.688396 16364.24 302.7406

+ years:rank 1 4.080195 16364.85 302.7425

+ years:expend 1 3.618119 16365.31 302.7439

- log(takers) 1 2319.536747 18688.47 305.2482

- rank 1 2533.477921 18902.41 305.8060

- years:public 1 5027.807692 21396.74 311.8795

- expend 1 13670.486641 30039.42 328.5038

sat ~ log(takers) + years + public + expend + rank + years:public

+ log(takers):public 1 7.036022e+002 15859.17 299.2045

<none> NA NA 16562.77 299.3315

+ expend:rank 1 6.439627e+002 15918.81 299.3884

+ log(takers):expend 1 6.224671e+002 15940.31 299.4545

+ public:rank 1 4.726451e+002 16090.13 299.9129

+ income 1 1.938442e+002 16368.93 300.7547

+ public:expend 1 3.375877e+000 16559.40 301.3216

+ log(takers):rank 1 1.935137e+000 16560.84 301.3258

+ years:expend 1 1.528711e+000 16561.25 301.3270

+ years:rank 1 8.679866e-001 16561.91 301.3290

+ log(takers):years 1 5.202697e-002 16562.72 301.3314

- rank 1 2.456165e+003 19018.94 304.1071

- log(takers) 1 2.985168e+003 19547.94 305.4514

- years:public 1 5.174303e+003 21737.08 310.6528

- expend 1 1.615704e+004 32719.81 330.6919

Step: AIC= 299.2

sat ~ log(takers) + years + public + expend + rank + years:public +

log(takers):public

<none> NA NA 15859.17 299.2045

+ expend:rank 1 602.5956096 15256.58 299.3063

- log(takers):public 1 703.6021875 16562.77 299.3315

+ log(takers):expend 1 549.9128359 15309.26 299.4752

+ income 1 413.5731794 15445.60 299.9097

+ years:rank 1 141.9104795 15717.26 300.7640

+ log(takers):years 1 102.4165565 15756.76 300.8870

+ public:rank 1 54.7708444 15804.40 301.0350

+ public:expend 1 39.7984090 15819.37 301.0813

+ log(takers):rank 1 6.6716882 15852.50 301.1839

+ years:expend 1 0.8878288 15858.28 301.2017

- years:public 1 2725.3253513 18584.50 304.9749

- rank 1 3086.8696076 18946.04 305.9190

- expend 1 12860.9171063 28720.09 326.3031

Call:

lm(formula = sat ~ log(takers) + years + public + expend + rank + years:public +

log(takers):public, data = case1201)

Coefficients:

(Intercept) log(takers) years public expend rank years:public

2590.556 19.42852 -134.2278 -26.43972 4.347684 5.991911 1.661026

log(takers):public

-0.5848999

Residual standard error: 19.66746

> stepAIC(mfull,list(upper=~.^2,lower=~1),k=log(49))

Start: AIC= 325.12

sat ~ log(takers) + income + years + public + expend + rank

+ years:public 1 5027.807692 16368.93 315.8892

+ log(takers):public 1 3617.792915 17778.95 319.9381

- public 1 19.997447 21416.74 321.2762

- income 1 340.339906 21737.08 322.0037

+ income:public 1 1977.822427 19418.92 324.2615

+ income:years 1 1804.755461 19591.98 324.6963

<none> NA NA 21396.74 325.1222

+ public:rank 1 1452.863422 19943.87 325.5686

- log(takers) 1 2150.004922 23546.74 325.9221

+ log(takers):years 1 1197.663996 20199.07 326.1916

+ log(takers):income 1 1194.412626 20202.33 326.1995

+ income:rank 1 1046.006240 20350.73 326.5581

- years 1 2531.615348 23928.35 326.7099

- rank 1 2679.046601 24075.78 327.0109

+ years:rank 1 485.951497 20910.79 327.8884

+ log(takers):expend 1 447.951860 20948.79 327.9773

+ expend:rank 1 323.487437 21073.25 328.2676

+ years:expend 1 93.688852 21303.05 328.7990

+ public:expend 1 51.522079 21345.22 328.8959

+ log(takers):rank 1 44.248267 21352.49 328.9126

+ income:expend 1 9.445369 21387.29 328.9924

- expend 1 10964.372896 32361.11 341.5027

sat ~ log(takers) + income + years + public + expend + rank + years:public

Df Sum of Sq RSS AIC

- income 1 193.844212 16562.77 312.5743

<none> NA NA 16368.93 315.8892

+ log(takers):public 1 923.331155 15445.60 316.9361

+ income:rank 1 869.194138 15499.74 317.1075

+ public:rank 1 587.095100 15781.84 317.9913

+ expend:rank 1 513.555766 15855.37 318.2191

+ log(takers):expend 1 496.074306 15872.86 318.2731

- log(takers) 1 2319.536747 18688.47 318.4910

+ log(takers):income 1 417.822552 15951.11 318.5141

- rank 1 2533.477921 18902.41 319.0487

+ income:public 1 119.187306 16249.74 319.4230

+ log(takers):rank 1 96.896741 16272.03 319.4901

+ income:expend 1 16.336369 16352.59 319.7321

+ income:years 1 10.664688 16358.27 319.7491

+ log(takers):years 1 9.199796 16359.73 319.7535

+ public:expend 1 4.688396 16364.24 319.7670

+ years:rank 1 4.080195 16364.85 319.7688

+ years:expend 1 3.618119 16365.31 319.7702

- years:public 1 5027.807692 21396.74 325.1222

- expend 1 13670.486641 30039.42 341.7466

sat ~ log(takers) + years + public + expend + rank + years:public

Df Sum of Sq RSS AIC

<none> NA NA 16562.77 312.5743

+ log(takers):public 1 7.036022e+002 15859.17 314.3390

+ expend:rank 1 6.439627e+002 15918.81 314.5230

+ log(takers):expend 1 6.224671e+002 15940.31 314.5891

+ public:rank 1 4.726451e+002 16090.13 315.0475

- rank 1 2.456165e+003 19018.94 315.4581

+ income 1 1.938442e+002 16368.93 315.8892

+ public:expend 1 3.375877e+000 16559.40 316.4561

+ log(takers):rank 1 1.935137e+000 16560.84 316.4604

+ years:expend 1 1.528711e+000 16561.25 316.4616

+ years:rank 1 8.679866e-001 16561.91 316.4635

+ log(takers):years 1 5.202697e-002 16562.72 316.4659

- log(takers) 1 2.985168e+003 19547.94 316.8024

- years:public 1 5.174303e+003 21737.08 322.0037

- expend 1 1.615704e+004 32719.81 342.0428

Call:

lm(formula = sat ~ log(takers) + years + public + expend + rank + years:public,

data = case1201)

Coefficients:

(Intercept) log(takers) years public expend rank years:public

3274.012 -34.05226 -164.8157 -33.8661 4.651103 5.040749 2.042115

Residual standard error: 19.85829

> m1 <- lm(log(ozone)~wind+temp+solar,data=Ozone)

> summary(m1)

Residuals:

Min 1Q Median 3Q Max

-1.0203 -0.31515 -0.0093072 0.32296 1.1222

Coefficients:

Value Std. Error t value Pr(>|t|)

(Intercept) 0.26236 0.52033 0.50423 0.61515

wind -0.06931 0.01450 -4.77854 0.00001

temp 0.04445 0.00568 7.82953 0.00000

solar 0.00219 0.00052 4.24768 0.00005

Multiple R-Squared: 0.67369

F-statistic: 72.947 on 3 and 106 degrees of freedom, the p-value is 0

Stepwise regression using AIC: start with the main effects model and allow all two-way interactions and quadratic terms. The “lower” argument specifies the smallest allowable model, which here is the main effects model.

> stepAIC(m1,list(upper=~.^2+wind^2+temp^2+solar^2,lower=m1))

Start: AIC= -163.82

log(ozone) ~ wind + temp + solar

Df Sum of Sq RSS AIC

+ I(wind^2) 1 1.67592921 21.392844 -170.11663

+ I(temp^2) 1 1.25107360 21.817700 -167.95347

+ temp:solar 1 1.17208023 21.896693 -167.55592

+ wind:temp 1 0.88700820 22.181765 -166.13308

+ I(solar^2) 1 0.45453682 22.614236 -164.00908

<none> NA NA 23.068773 -163.82005

+ wind:solar 1 0.34252408 22.726249 -163.46557

log(ozone) ~ wind + temp + solar + I(wind^2)

Df Sum of Sq RSS AIC

+ temp:solar 1 0.67869427353 20.714150 -171.66297

+ I(temp^2) 1 0.63901036417 20.753834 -171.45243

+ I(solar^2) 1 0.51800644492 20.874838 -170.81295

<none> NA NA 21.392844 -170.11663

+ wind:solar 1 0.06713886979 21.325705 -168.46239

+ wind:temp 1 0.00030265311 21.392541 -168.11818

- I(wind^2) 1 1.67592920662 23.068773 -163.82005

log(ozone) ~ wind + temp + solar + I(wind^2) + temp:solar

Df Sum of Sq RSS AIC

+ I(solar^2) 1 0.7474978978 19.966652 -173.70586

<none> NA NA 20.714150 -171.66297

+ I(temp^2) 1 0.2793246140 20.434825 -171.15638

- temp:solar 1 0.6786942735 21.392844 -170.11663

+ wind:temp 1 0.0536327944 20.660517 -169.94815

+ wind:solar 1 0.0015866564 20.712563 -169.67139

- I(wind^2) 1 1.1825432544 21.896693 -167.55592

log(ozone) ~ wind + temp + solar + I(wind^2) + I(solar^2) + temp:solar

Df Sum of Sq RSS AIC

<none> NA NA 19.966652 -173.70586

+ I(temp^2) 1 0.2687418912 19.697910 -173.19646

+ wind:temp 1 0.0981394822 19.868512 -172.24786

+ wind:solar 1 0.0051540289 19.961498 -171.73426

- I(solar^2) 1 0.7474978978 20.714150 -171.66297

- temp:solar 1 0.9081857264 20.874838 -170.81295

- I(wind^2) 1 1.1811295810 21.147781 -169.38399

Call:

lm(formula = log(ozone) ~ wind + temp + solar + I(wind^2) + I(solar^2) + temp:

solar, data = Ozone)

Coefficients:

(Intercept) wind temp solar I(wind^2)

2.7000915 -0.19764083 0.016191722 -0.0024656831 0.0059294158

I(solar^2) temp:solar

-0.000012334129 0.0001202964

Residual standard error: 0.44028512

> stepAIC(m1,list(upper=~.^2+wind^2+temp^2+solar^2,lower=m1),k=log(110))

Start: AIC= -153.02

log(ozone) ~ wind + temp + solar

Df Sum of Sq RSS AIC

+ I(wind^2) 1 1.67592921 21.392844 -156.61423

+ I(temp^2) 1 1.25107360 21.817700 -154.45107

+ temp:solar 1 1.17208023 21.896693 -154.05352

<none> NA NA 23.068773 -153.01813

+ wind:temp 1 0.88700820 22.181765 -152.63068

+ I(solar^2) 1 0.45453682 22.614236 -150.50668

+ wind:solar 1 0.34252408 22.726249 -149.96317

log(ozone) ~ wind + temp + solar + I(wind^2)

Df Sum of Sq RSS AIC

<none> NA NA 21.392844 -156.61423

+ temp:solar 1 0.67869427353 20.714150 -155.46008

+ I(temp^2) 1 0.63901036417 20.753834 -155.24955

+ I(solar^2) 1 0.51800644492 20.874838 -154.61006

- I(wind^2) 1 1.67592920662 23.068773 -153.01813

+ wind:solar 1 0.06713886979 21.325705 -152.25951

+ wind:temp 1 0.00030265311 21.392541 -151.91530

Call:

lm(formula = log(ozone) ~ wind + temp + solar + I(wind^2), data = Ozone)

Coefficients:

(Intercept) wind temp solar I(wind^2)

1.1932358 -0.22081888 0.041915712 0.0022096915 0.0068982286

Residual standard error: 0.45137719
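
The two stepwise runs above differ only in the penalty `k`. The following is a minimal sketch of the same idea in R, not a reproduction of the output above. It assumes R's built-in `airquality` data as a stand-in for the handout's `Ozone` data frame, and it uses `step` from the stats package in place of `stepAIC` from MASS. In R the quadratic terms have to be wrapped in `I()` inside the formula.

```r
# Stand-in data (an assumption): R's built-in airquality, complete cases only.
aq <- na.omit(airquality[, c("Ozone", "Wind", "Temp", "Solar.R")])
n  <- nrow(aq)

# Main effects model: the starting point and the smallest allowable model.
m1 <- lm(log(Ozone) ~ Wind + Temp + Solar.R, data = aq)

# Largest allowable model: all two-way interactions plus quadratic terms.
scope <- list(
  upper = ~ .^2 + I(Wind^2) + I(Temp^2) + I(Solar.R^2),
  lower = ~ Wind + Temp + Solar.R
)

aic_fit <- step(m1, scope = scope, trace = 0)              # penalty k = 2 (AIC)
bic_fit <- step(m1, scope = scope, k = log(n), trace = 0)  # penalty k = log(n) (BIC)

print(formula(aic_fit))
print(formula(bic_fit))
```

Because `lower` is the main effects model, neither search can drop `Wind`, `Temp`, or `Solar.R`. Only the interaction and quadratic terms are candidates for entry and removal.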

Model p SSE R2 MSE AIC BIC PRESS exp(-BIC) exp(-BIC)/total

W + T + S + W:T + W:S + T:S 7 21.534 0.695 0.209 -165.39 -146.49 25.62 4.16676E+63 0.00002

W + T + S + W:T + W:S 6 22.152 0.687 0.213 -164.28 -148.08 25.56 2.04328E+64 0.00012

W + T + S + W:T + T:S 6 21.537 0.695 0.207 -167.38 -151.17 24.51 4.49052E+65 0.00254

W + T + S + W:S + T:S 6 21.867 0.691 0.210 -165.70 -149.50 25.44 8.45328E+64 0.00048

W + T + S + W:T 5 22.182 0.686 0.211 -166.13 -152.63 24.55 1.93360E+66 0.01093

W + T + S + W:S 5 22.726 0.679 0.216 -163.47 -149.96 25.63 1.33906E+65 0.00076

W + T + S + T:S 5 21.897 0.690 0.209 -167.56 -154.05 24.54 7.99954E+66 0.04522

W+T+S 4 23.069 0.674 0.218 -163.82 -153.02 25.20 2.85589E+66 0.01614

W + T + W:T 4 26.372 0.627 0.249 -149.10 -138.30 28.54 1.15592E+60 0.00000

W+T 3 26.995 0.618 0.252 -148.53 -140.43 28.78 9.72689E+60 0.00000

W + S + W:S 4 36.121 0.489 0.341 -114.50 -103.69 39.39 1.07645E+45 0.00000

W+S 3 36.410 0.485 0.340 -115.62 -107.52 38.70 4.95841E+46 0.00000

T + S + T:S 4 27.029 0.618 0.255 -146.39 -135.59 29.22 7.69111E+58 0.00000

T+S 3 28.038 0.603 0.262 -144.36 -136.26 29.68 1.50302E+59 0.00000

W 2 44.985 0.364 0.417 -94.36 -88.95 46.84 4.27065E+38 0.00000

T 2 31.908 0.549 0.295 -132.14 -126.74 32.98 1.10276E+55 0.00000

S 2 57.974 0.180 0.537 -66.45 -61.05 60.15 3.26346E+26 0.00000

Constant 1 70.695 0.000 0.649 -46.63 -43.93 72.00 1.19828E+19 0.00000

W + T + S + W^2 + T^2 + S^2 7 20.175 0.715 0.196 -172.56 -153.66 23.57 5.41614E+66 0.03062

W + T + S + W^2 + T^2 6 20.754 0.706 0.200 -171.45 -155.25 23.79 2.65594E+67 0.15014

W + T + S + W^2 + S^2 6 20.875 0.705 0.201 -170.81 -154.61 23.51 1.40046E+67 0.07917

W + T + S + T^2 + S^2 6 21.270 0.699 0.205 -168.75 -152.55 24.15 1.78494E+66 0.01009

W + T + S + W^2 5 21.393 0.697 0.204 -170.12 -156.61 23.65 1.03481E+68 0.58499

W + T + S + T^2 5 21.818 0.691 0.208 -167.95 -154.45 24.36 1.19339E+67 0.06746

W + T + S + S^2 5 22.614 0.680 0.215 -164.01 -150.51 25.12 2.32093E+65 0.00131

W + T + W^2 + T^2 5 24.924 0.647 0.237 -153.31 -139.81 28.19 5.23253E+60 0.00000

W + T + W^2 4 25.390 0.641 0.240 -153.27 -142.47 27.68 7.48057E+61 0.00000

W + T + T^2 4 25.998 0.632 0.245 -150.67 -139.87 28.33 5.55609E+60 0.00000

W + S + W^2 + S^2 5 29.996 0.576 0.286 -132.94 -119.43 32.79 7.37547E+51 0.00000

W + S + W^2 4 32.958 0.534 0.311 -124.58 -113.78 35.31 2.59434E+49 0.00000

W + S + S^2 4 33.350 0.528 0.315 -123.28 -112.47 36.12 7.00004E+48 0.00000

T + S + T^2 + S^2 5 25.466 0.640 0.243 -150.95 -137.44 28.14 4.89140E+59 0.00000

T + S + T^2 4 26.418 0.626 0.249 -148.91 -138.11 28.58 9.55897E+59 0.00000

T + S + S^2 4 27.207 0.615 0.257 -145.67 -134.87 29.39 3.74366E+58 0.00000

W + W^2 3 41.263 0.416 0.386 -101.86 -93.76 43.98 5.24144E+40 0.00000

T + T^2 3 30.579 0.567 0.286 -134.82 -126.72 32.32 1.08093E+55 0.00000

S + S^2 3 49.093 0.306 0.459 -82.74 -74.64 51.72 2.60459E+32 0.00000

1.76893E+68 1.00000

0.207

on the average SAT score and a number of other variables. The other
variables were collected to help explain the discrepancies among the
states' SAT averages. For example, many midwestern states (Montana included)
have much higher average SAT scores than states in other regions. A closer
look reveals that this difference arises primarily because only the better
students in these states actually take the SAT exam. Hence it is important
to examine which factors affect each state's average SAT score. Some of the
variables considered as ``explanatory'' variables were:

\begin{enumerate}
\item Percentage of eligible students who took the exam (TAKERS)
\item Median income of families of test-takers (INCOME)
\item Average number of years of study in social science, natural science,
and humanities among the test-takers (YEARS)
\item Percentage of test-takers in public schools (PUBLIC)
\item State expenditures in hundreds of dollars per student (EXPEND)
\item Median percentile ranking of test-takers within their schools (RANK)
\end{enumerate}

Before fitting any models, it is a good idea to examine the
relationships between all pairs of variables; a scatterplot matrix
and a correlation matrix are useful for this. TAKERS appears to have
a nonlinear relationship with SAT score, so we may want to transform
it: the log of TAKERS appears to work well. There also appear to be
a couple of outliers; Alaska is a particularly extreme outlier on
state expenditures (EXPEND).
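
As a rough illustration of this screening step, the sketch below builds a correlation matrix and a scatterplot matrix in R. The data values are hypothetical and chosen only so that the response is curved in `takers`; they are not the SAT data, and the column names are borrowed from the handout for readability.

```r
# Hypothetical values for illustration only (not the SAT data from the handout).
takers <- c(2, 3, 5, 8, 12, 20, 35, 50, 65, 80)
wiggle <- rep(c(5, -5), times = 5)             # small made-up deviations
dat <- data.frame(
  sat    = 1100 - 60 * log(takers) + wiggle,   # curved in takers by construction
  takers = takers,
  rank   = seq(90, 72, by = -2)
)
dat$log_takers <- log(dat$takers)

cors <- cor(dat)                # correlation matrix for all pairs of variables
print(round(cors, 2))

if (interactive()) pairs(dat)   # scatterplot matrix, drawn only in an interactive session
```

With data shaped like this, the correlation of `sat` with `log_takers` is stronger than with `takers`, which is the kind of evidence that would suggest the log transformation.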

For this data set, there are other possible objectives besides
finding good models for predicting SAT score. For example:

\begin{quotation}
{\em After accounting for the percentage of students who took the
test (Log(TAKERS)) and the median class rank of the test-takers
(RANK), which variables are important predictors of state SAT
scores?}
\end{quotation}

\begin{quotation}
{\em After accounting for the percentage of students who took the
test (TAKERS) and the median class rank of the test-takers (RANK),
which states performed best for the amount of money they spend?}
\end{quotation}

To address the first question, we can look at the correlations
between SAT score and the other variables after adjusting for TAKERS
and RANK. Added variable plots and partial residual plots (available
in S-Plus on the regression menu) allow us to look at this visually;
these plots should be obtained by adding each variable separately to
the model with TAKERS and RANK.
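
One way to compute the quantities behind an added variable plot by hand is sketched below in R. The data are simulated, so every number here is hypothetical. The use of `log(takers)` rather than `takers`, and of `expend` as the candidate variable, are assumptions made for the example.

```r
# Hypothetical data: column names follow the handout, values are simulated.
set.seed(12)
n <- 40
d <- data.frame(
  takers = runif(n, 2, 70),
  rank   = runif(n, 70, 90),
  expend = runif(n, 15, 35)
)
d$sat <- 1000 - 50 * log(d$takers) + 2 * d$rank + 3 * d$expend + rnorm(n, sd = 20)

# Residuals of the response and of the candidate variable,
# each adjusted for log(takers) and rank.
ry <- resid(lm(sat    ~ log(takers) + rank, data = d))
rx <- resid(lm(expend ~ log(takers) + rank, data = d))

partial_cor <- cor(ry, rx)                 # correlation after adjustment
av_slope    <- coef(lm(ry ~ rx))[["rx"]]   # slope of the added variable plot
full_slope  <- coef(lm(sat ~ log(takers) + rank + expend, data = d))[["expend"]]

if (interactive()) {
  plot(rx, ry, xlab = "expend | log(takers), rank", ylab = "sat | log(takers), rank")
}
```

The slope of the added variable plot equals the coefficient of `expend` in the model that contains all three terms, so `av_slope` and `full_slope` agree up to rounding.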

The second question could be answered in this way. Fit the
regression model involving the TAKERS and RANK variables. What do
the resulting residuals tell us? Each residual is the difference
between a state's observed SAT score and the score predicted from
TAKERS and RANK. A positive residual means the SAT score is higher
than predicted, and a negative residual means it is lower than
predicted, based on these two variables. The states could then be
ranked based on these residuals.
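
The ranking described above can be sketched in R as follows. The eight "states" and all their values are made up for the example, and `log(takers)` is used in place of `takers` as an assumption, following the transformation suggested earlier.

```r
# Hypothetical states and values, for illustration only.
d <- data.frame(
  state  = c("A", "B", "C", "D", "E", "F", "G", "H"),
  takers = c(3, 5, 10, 20, 40, 55, 60, 70),
  rank   = c(88, 86, 83, 80, 76, 74, 72, 70),
  sat    = c(1050, 1010, 980, 940, 905, 890, 900, 880)
)

# Fit the adjustment model and keep the residuals:
# observed SAT minus the SAT predicted from log(takers) and rank.
fit     <- lm(sat ~ log(takers) + rank, data = d)
d$resid <- resid(fit)

# Rank the states from largest positive residual to largest negative one.
ranking <- d[order(d$resid, decreasing = TRUE), c("state", "sat", "resid")]
print(ranking)
```

States at the top of `ranking` score higher than their test-taking rate and class rank would predict, and states at the bottom score lower.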

\end{document}

S-Plus in the MASS library. The AIC of any fitted linear model can
be obtained with the command \textbf{extractAIC(m)}, and the BIC with
\textbf{extractAIC(m,k=log(n))}, where $m$ is a fitted model and
$n$ is the sample size. Stepwise regression using AIC or BIC is
carried out with the \textbf{stepAIC} command, which is illustrated
on a separate handout.
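
The AIC and BIC values in the tables of this handout agree with $n \log(\mathrm{SSE}/n) + 2p$ and $n \log(\mathrm{SSE}/n) + p \log(n)$, where $p$ counts the regression coefficients including the intercept. The sketch below checks this in R. It first uses the W+T+S row of the ozone table, then compares the formula with `extractAIC`, which is in the stats package in R. The `airquality` data set is only a stand-in and is not assumed to match the handout's data.

```r
# Check against the W+T+S row of the handout's table (n = 110, p = 4, SSE = 23.069).
n_tab <- 110; p_tab <- 4; sse_tab <- 23.069
aic_table <- n_tab * log(sse_tab / n_tab) + 2 * p_tab            # table lists -163.82
bic_table <- n_tab * log(sse_tab / n_tab) + log(n_tab) * p_tab   # table lists -153.02

# Same formulas compared with extractAIC on stand-in data (an assumption).
aq <- na.omit(airquality[, c("Ozone", "Wind", "Temp", "Solar.R")])
m  <- lm(log(Ozone) ~ Wind + Temp + Solar.R, data = aq)

n   <- nrow(aq)
p   <- length(coef(m))
sse <- sum(resid(m)^2)

aic_by_hand <- n * log(sse / n) + 2 * p
bic_by_hand <- n * log(sse / n) + log(n) * p

aic <- extractAIC(m)[2]               # extractAIC returns c(edf, criterion)
bic <- extractAIC(m, k = log(n))[2]

print(c(aic_table = aic_table, bic_table = bic_table))
print(c(aic = aic, aic_by_hand = aic_by_hand, bic = bic, bic_by_hand = bic_by_hand))
```

Note that these values differ from those returned by R's `AIC()` and `BIC()` functions by an additive constant that depends only on $n$, so comparisons between models fitted to the same data are unaffected.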

Example

Ozone data without case 17. n = 110 cases. Dependent variable is log(ozone) (natural log).

All possible models with main effects and two-way interactions

Model p SSE R2 MSE AIC BIC PRESS

W + T + S + W:T + W:S + T:S 7 21.534 0.695 0.209 -165.39 -146.49 25.62

W + T + S + W:T + W:S 6 22.152 0.687 0.213 -164.28 -148.08 25.56

W + T + S + W:T + T:S 6 21.537 0.695 0.207 -167.38 -151.17 24.51

W + T + S + W:S + T:S 6 21.867 0.691 0.210 -165.70 -149.50 25.44

W + T + S + W:T 5 22.182 0.686 0.211 -166.13 -152.63 24.55

W + T + S + W:S 5 22.726 0.679 0.216 -163.47 -149.96 25.63

W + T + S + T:S 5 21.897 0.690 0.209 -167.56 -154.05 24.54

W+T+S 4 23.069 0.674 0.218 -163.82 -153.02 25.20

W + T + W:T 4 26.372 0.627 0.249 -149.10 -138.30 28.54

W+T 3 26.995 0.618 0.252 -148.53 -140.43 28.78

W + S + W:S 4 36.121 0.489 0.341 -114.50 -103.69 39.39

W+S 3 36.410 0.485 0.340 -115.62 -107.52 38.70

T + S + T:S 4 27.029 0.618 0.255 -146.39 -135.59 29.22

T+S 3 28.038 0.603 0.262 -144.36 -136.26 29.68

W 2 44.985 0.364 0.417 -94.36 -88.95 46.84

T 2 31.908 0.549 0.295 -132.14 -126.74 32.98

S 2 57.974 0.180 0.537 -66.45 -61.05 60.15

Constant 1 70.695 0.000 0.649 -46.63 -43.93 72.00

Models with main effects and quadratic terms

Model p SSE R2 MSE AIC BIC PRESS

W + T + S + W^2 + T^2 + S^2 7 20.175 0.715 0.196 -172.56 -153.66 23.57

W + T + S + W^2 + T^2 6 20.754 0.706 0.200 -171.45 -155.25 23.79

W + T + S + W^2 + S^2 6 20.875 0.705 0.201 -170.81 -154.61 23.51

W + T + S + T^2 + S^2 6 21.270 0.699 0.205 -168.75 -152.55 24.15

W + T + S + W^2 5 21.393 0.697 0.204 -170.12 -156.61 23.65

W + T + S + T^2 5 21.818 0.691 0.208 -167.95 -154.45 24.36

W + T + S + S^2 5 22.614 0.680 0.215 -164.01 -150.51 25.12

W + T + W^2 + T^2 5 24.924 0.647 0.237 -153.31 -139.81 28.19

W + T + W^2 4 25.390 0.641 0.240 -153.27 -142.47 27.68

W + T + T^2 4 25.998 0.632 0.245 -150.67 -139.87 28.33

W + S + W^2 + S^2 5 29.996 0.576 0.286 -132.94 -119.43 32.79

W + S + W^2 4 32.958 0.534 0.311 -124.58 -113.78 35.31

W + S + S^2 4 33.350 0.528 0.315 -123.28 -112.47 36.12

T + S + T^2 + S^2 5 25.466 0.640 0.243 -150.95 -137.44 28.14

T + S + T^2 4 26.418 0.626 0.249 -148.91 -138.11 28.58

T + S + S^2 4 27.207 0.615 0.257 -145.67 -134.87 29.39

W + W^2 3 41.263 0.416 0.386 -101.86 -93.76 43.98

T + T^2 3 30.579 0.567 0.286 -134.82 -126.72 32.32

S + S^2 3 49.093 0.306 0.459 -82.74 -74.64 51.72
