Topic 16 covered

© All Rights Reserved

0 views

Topic 16

Topic 16 covered

© All Rights Reserved

- 4. (English) Gaya Kepemimpinan Trasnformasional 4
- Multiple Linear Regression and Checking for Collinearity Using SAS
- Synergies Created by Strategic Fit Between Business Strategies and Human Resource Strategies in Transnational Tea Firms in Kenya
- 00000chen- Linear Regression Analysis3
- Co Mark Anthony ECONMET Project (Poverty Line)
- Correlation and Regression Measuring and Predicting Relationships
- ECO4016F+2011+Tutorial+7
- Capital Structure
- 40 Interview Questions Asked at Startups in Machine Learning _ Data Science
- Optimization of Growth
- Final Draft (10) to Print
- Factors Influencing
- 7 Regression
- Aggregated vs. Disaggregated Data in Regression
- A Study of Relationship of Academic Achievement With Aptitude Attitude and Anxiety by Mrs Santwana G Mishra Dr K L Chincholikar
- Simple Linear Regression Part 1
- Cash Flow in Bankruptcy Prediction
- Multiple Linear Regrssion - Chapter3.3
- Strategi Fleksibility.pdf
- model data science

You are on page 1of 4

However, how can we tell if the dropped variable(s) give us any useful

Ridge Regression (Section 10.2)

information.

If the variable is important, the parameter estimates become biased up.

Some Remedial Measures for Multicollinearity Sometimes, observations can be designed to break the multicollinearity.

Get coefficient estimates from additional data from other contexts.

Restrict the use of the regression model to infererence on values of predictor

For instance, if the model is

variables that follow the same pattern of multicollinearity.

For example, suppose a model has three predictors: X1 , X2 , X3 . The Yi = 0 + 1 Xi,1 + 2 Xi,2 + i ,

distribution of (X1 , X2 , X3 ) is N (, ) for some mean vector and

and you have an estimator b1 (for 1 based on another data set, you can

covariance matrix . If future predictor values come from this 0

estimate 2 by regressing the adjusted variable Yi = Yi b1 Xi,1 on Xi,2 .

distribution, even if there is serious multicollinearity, inferences for the (Common example: in economics, using cross-sectional data to estimate

predictions using this model are still useful. parameters for a time-dependent model.)

If the model is a polynomial regression model, use centered variables. Use the first few principal components (or factor loadings) of the predictor

Drop one or more predictor variables (i.e., variable selection). variables. (Limitation: may lose interpretability.)

Standard errors on the parameter estimates decrease. Biased Regression or Coefficient Shrinkage (Example: Ridge Regression)

Page 1 Page 2

Statistics 512: Applied Regression Analysis Purdue University Statistics 512: Applied Regression Analysis Purdue University

Professor Jennings Spring 2004 Professor Jennings Spring 2004

Equivalent Representation:

2

Two Equivalent Formulations of Ridge Regression X

p

ridge = arg min yi 0 xi,j j ,

P

Ridge regression shrinks estimators by penalyzing their size. (Penalty: j2 ) j=1

X

p

subject to j2 s.

Penalized Residual Sum of Squares:

j=1

XN X

p X

p There is a direct relationship between and s (although we will usually talk

ridge = arg min (Yi 0 xi,j j )2 + j2

about ).

i=1 j=1 j=1

controls the amount of shrinkage of the parameter estimates

Matrix Representation of Solution

Large greater shrinkage (toward zero)

ridge = (X0 X + I)1 X0 y

Page 3 Page 4

Statistics 512: Applied Regression Analysis Purdue University Statistics 512: Applied Regression Analysis Purdue University

Professor Jennings Spring 2004 Professor Jennings Spring 2004

NKNW Example page 261

data bodyfat;

SAS code in ridge.sas infile H:\System\Desktop\CH07TA01.dat;

input skinfold thigh midarm fat;

20 healthy female subjects ages 25-34 proc print data = bodyfat;

proc reg data = bodyfat;

Y is fraction body fat model fat = skinfold thigh midarm;

Sum of Mean

X2 is thigh circumference Source DF Squares Square F Value Pr > F

Model 3 396.98461 132.32820 21.52 <.0001

X3 is midarm circumference Error 16 98.40489 6.15031

Corrected Total 19 495.38950

Conclusion from previous analysis: could have good model with thigh only or Root MSE 2.47998 R-Square 0.8014

midarm and thickness only. Dependent Mean 20.19500 Adj R-Sq 0.7641

Coeff Var 12.28017

Page 5 Page 6

Statistics 512: Applied Regression Analysis Purdue University Statistics 512: Applied Regression Analysis Purdue University

Professor Jennings Spring 2004 Professor Jennings Spring 2004

Parameter Estimates

Parameter Standard

Variable DF Estimate Error t Value Pr > |t|

Intercept 1 117.08469 99.78240 1.17 0.2578

skinfold 1 4.33409 3.01551 1.44 0.1699 Try Ridge Regression

thigh 1 -2.85685 2.58202 -1.11 0.2849

midarm 1 -2.18606 1.59550 -1.37 0.1896

proc reg data = bodyfat

None of the p-values are significant. outest = bfout ridge = 0 to 0.1 by 0.003;

Pearson Correlation Coefficients, N = 20 model fat = skinfold thigh midarm / noprint;

skinfold thigh midarm fat plot / ridgeplot nomodel nostat;

skinfold 1.00000 0.92384 0.45778 0.84327

thigh 0.92384 1.00000 0.08467 0.87809

midarm 0.45778 0.08467 1.00000 0.14244

fat 0.84327 0.87809 0.14244 1.00000

Page 7 Page 8

Statistics 512: Applied Regression Analysis Purdue University Statistics 512: Applied Regression Analysis Purdue University

Professor Jennings Spring 2004 Professor Jennings Spring 2004

Ridge Trace

How to Choose

Estimated coefficients should be stable

look for only modest change in R2 or

.

title2 Variance Inflation Factors;

proc gplot data = bfout;

plot (skinfold thigh midarm)* _RIDGE_ / overlay;

where _TYPE_ = RIDGEVIF;

run;

Each value of (or Ridge k in SAS) gives different values of the parameter

estimates. (Note the instability of the estimate values for small .)

Page 9 Page 10

Statistics 512: Applied Regression Analysis Purdue University Statistics 512: Applied Regression Analysis Purdue University

Professor Jennings Spring 2004 Professor Jennings Spring 2004

var _RIDGE_ skinfold thigh midarm;

where _TYPE_ = RIDGEVIF;

proc print data = bfout;

var _RIDGE_ _RMSE_ Intercept skinfold thigh midarm;

where _TYPE_ = RIDGE;

Page 11 Page 12

Statistics 512: Applied Regression Analysis Purdue University Statistics 512: Applied Regression Analysis Purdue University

Professor Jennings Spring 2004 Professor Jennings Spring 2004

Parameter Estimates

Variance Inflation Factors Obs _RIDGE_ _RMSE_ Intercept skinfold thigh midarm

Obs _RIDGE_ skinfold thigh midarm 3 0.000 2.47998 117.085 4.33409 -2.85685 -2.18606

2 0.000 708.843 564.343 104.606 5 0.002 2.54921 22.277 1.46445 -0.40119 -0.67381

4 0.002 50.559 40.448 8.280 7 0.004 2.57173 7.725 1.02294 -0.02423 -0.44083

6 0.004 16.982 13.725 3.363 9 0.006 2.58174 1.842 0.84372 0.12820 -0.34604

8 0.006 8.503 6.976 2.119 11 0.008 2.58739 -1.331 0.74645 0.21047 -0.29443

10 0.008 5.147 4.305 1.624 13 0.010 2.59104 -3.312 0.68530 0.26183 -0.26185

12 0.010 3.486 2.981 1.377 15 0.012 2.59360 -4.661 0.64324 0.29685 -0.23934

14 0.012 2.543 2.231 1.236 17 0.014 2.59551 -5.637 0.61249 0.32218 -0.22278

16 0.014 1.958 1.764 1.146 19 0.016 2.59701 -6.373 0.58899 0.34131 -0.21004

18 0.016 1.570 1.454 1.086 21 0.018 2.59822 -6.946 0.57042 0.35623 -0.19991

20 0.018 1.299 1.238 1.043 23 0.020 2.59924 -7.403 0.55535 0.36814 -0.19163

22 0.020 1.103 1.081 1.011 25 0.022 2.60011 -7.776 0.54287 0.37786 -0.18470

24 0.022 0.956 0.963 0.986 27 0.024 2.60087 -8.083 0.53233 0.38590 -0.17881

26 0.024 0.843 0.872 0.966 29 0.026 2.60156 -8.341 0.52331 0.39265 -0.17372

28 0.026 0.754 0.801 0.949 31 0.028 2.60218 -8.559 0.51549 0.39837 -0.16926

30 0.028 0.683 0.744 0.935 33 0.030 2.60276 -8.746 0.50864 0.40327 -0.16531

32 0.030 0.626 0.697 0.923

Note that at RIDGE = 0.020, the RM SE is only increase by 5% (so SSE

Note that at RIDGE = 0.020, the VIFs are close to 1.

increase by about 10%), and the parameter estimates are closer to making sense.

Page 13 Page 14

Professor Jennings Spring 2004

Conclusion

So the solution at = 0.02 with parameter estimates (-7.4, 0.56, 0.37, -0.19)

seems to make the most sense.

Notes

The book makes a big deal about standardizing the variables... SAS does this

for you in the ridge option.

the region of the predictor variables: less affected by small changes in the

data. (Ordinary LS estimates can be highly unstable when there is lots of

multicollinearity.)

P

Other procedures use different penalties, e.g. Lasso penalty: |j |.

Page 15

- 4. (English) Gaya Kepemimpinan Trasnformasional 4Uploaded byRickaArdila
- Multiple Linear Regression and Checking for Collinearity Using SASUploaded byRobin
- Synergies Created by Strategic Fit Between Business Strategies and Human Resource Strategies in Transnational Tea Firms in KenyaUploaded byEditor IJTSRD
- 00000chen- Linear Regression Analysis3Uploaded byTommy Ngo
- Co Mark Anthony ECONMET Project (Poverty Line)Uploaded byMac Co
- Correlation and Regression Measuring and Predicting RelationshipsUploaded byDr Rushen Singh
- ECO4016F+2011+Tutorial+7Uploaded bySwazzy12
- Capital StructureUploaded bysimmi33
- 40 Interview Questions Asked at Startups in Machine Learning _ Data ScienceUploaded byVinod Manghani
- Optimization of GrowthUploaded bysiamak77
- Final Draft (10) to PrintUploaded byzakwan01
- Factors InfluencingUploaded byJojo Lam
- 7 RegressionUploaded bysumit
- Aggregated vs. Disaggregated Data in RegressionUploaded byEpsonminces
- A Study of Relationship of Academic Achievement With Aptitude Attitude and Anxiety by Mrs Santwana G Mishra Dr K L ChincholikarUploaded byWindeeNuñal
- Simple Linear Regression Part 1Uploaded by_vanityk
- Cash Flow in Bankruptcy PredictionUploaded bysohail0779
- Multiple Linear Regrssion - Chapter3.3Uploaded byJorge Gonza
- Strategi Fleksibility.pdfUploaded byindra
- model data scienceUploaded byMid Subs
- 101192Uploaded bydwirrel
- STATS 1900 Sem 2 2013 Major Assignment_Solution.docUploaded byPandurang Thatkar
- Data Anna 1Uploaded bySuaeba Nur
- Research paper.docUploaded byFiza Amber
- lecture2.621Uploaded byRui Magalhães
- OUTPUTTESISUploaded byYopafreza
- level-2-r-11Uploaded byQuyen
- 1Uploaded bykhilanvekaria
- Jurnal Terbit Di IJSTR Tim Zulfityana dan I Ketut Budaraga,5-9-2016Uploaded byketut2
- Wilson, Hardy and Harwood (2006)Uploaded byKylie Wilson

- SolutionsUploaded bymazamniazi
- ARDL_GoodSlides.pdfUploaded bymazamniazi
- Pakistan FloodUploaded bymazamniazi
- pakistan_flood.pdfUploaded bymazamniazi
- Panel DataUploaded bykuldeep_0405
- p_167.pdfUploaded byharoonshaikh
- Basmati Value Chain PunjabUploaded bymazamniazi
- 2-3SLS-Good18-simul.pdfUploaded bymazamniazi
- Functional FormUploaded bymazamniazi
- PriceIndices-ISDA.pdfUploaded bymazamniazi
- Consumer Theory Burkey AcademyUploaded bymazamniazi
- Supplemental CostminimizationUploaded bymazamniazi
- 59_National Nutrition Survey-2011Uploaded bymazamniazi
- Poultry Role RurUploaded bymazamniazi
- phutti oct15Uploaded bymazamniazi
- PanelUploaded byPär Sjölander
- UncertaintyUploaded bymazamniazi
- cn7Uploaded bymalaika_heaven
- 2013-14 wheat,Uploaded bymazamniazi
- Adaptive ExpectationsUploaded bymazamniazi
- Crops Tsls 3slsUploaded bymazamniazi
- PriceIndices-ISDAUploaded bymazamniazi
- CGE SlidesUploaded bymazamniazi
- ardl-kkOYCKtRANSFORMATIONUploaded bymazamniazi
- ch21_ansUploaded bymazamniazi
- David GilesUploaded bymazamniazi

- FPE7e AppendicesUploaded byJohn
- ESTAÑO ACIDOUploaded byhumbertotorresr
- Buddy Assignment Abang and SarthishUploaded byStanley Eighat
- Abdul, H N H, Mee, K L K (2011) Adsorption of Basic Red 46 by Granular Activated Carbon in a Fixed-bed ColumnUploaded bymatsonenglish
- STUDY OF THE BEHAVIOUR OF CLAY SOIL MIXED WITH POLYTHENE STRIPS.Uploaded byIJAR Journal
- Site Investigation Code of PracticeUploaded byLiam Culligan
- boosterUploaded byD'todoMoquegua
- 10-TimeUploaded bySelect dsouza
- hi902_manual automatic potentionmetric titrator.pdfUploaded byVăn Đại - BKHN
- Factors Affecting Water Solubility in OilsUploaded byArgenomSaubi
- IP Multicast Troubleshooting GuideUploaded bysurge031
- Bunkering Oil on the ShipUploaded byibnuhary
- Kooltherm K7 Pitched Roof BoardUploaded byCaminito Mallorca
- angularjs-cookbook.pdfUploaded byniurkam
- 99 MORE Level 700+ GMAT QuestionsUploaded byamit_k214283
- Minimum and Maximum ModesUploaded bySubash Basnyat
- Performance Analysis of 20Gbps Optical Transmission System Using Fiber Bragg GratingUploaded byIJSRP ORG
- o chem 2 labUploaded byJesus Eddy Peña Melissaratos
- Negative FeedbackUploaded byCharlene Elio
- 29F010.pdfUploaded byJosé Adelino
- Menu Analysis 3.5LUploaded byMelisa Lin
- Vlan Trunking Dtp Vtp Private VlanUploaded byTạ Ngọc Hưng
- DataStage Advanced All Labs1-11Uploaded byVishali Nerai
- 03_G130&G150&S150_TECH-DAT_CLUploaded byANDRES CISTERNAS
- vlsi lab VTUUploaded byKeith Fernandes
- Adaptadores a Distintos Tipos de ConectoresUploaded byFederico Mejia
- Scientific Career of Dr. Jaime Lagunez OteroUploaded byFrente Civico
- Potentiometric Titration of an Acid MixtureUploaded byRuth Danielle Gascon
- TT100_user_manual_2012_ENGL.pdfUploaded byErika Sears
- CMOS Glucose SensorUploaded byMaverick Ramos