
Introduction to Data Analytics

Lecture: Adj R2 and Regularization/Coefficient shrinkage


NPTEL MOOC
By
Prof. Nandan Sudarsanam, DoMS, IIT-M and
Prof. B. Ravindran, CS&E, IIT-M
Motivation
• Metrics to evaluate a Regression Model
• We have so far discussed the p-value from the F-test of an ANOVA
• What about R2 and Adj R2?
• R2 is nothing but SSM/SST
• From the ANOVA output: SST, SSM (Regression), SSE (Residual)

• SST $= \sum_{i=1}^{n} (y_i - \bar{y})^2$, SSM $= \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$, SSE $= \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$

• Adj R2 is nothing but: $\text{Adj } R^2 = 1 - \dfrac{(1 - R^2)(n - 1)}{n - k - 1}$, where $n$ is the number of observations and $k$ the number of predictors (see the sketch below)
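
A minimal sketch (not part of the original slides) of how these sums of squares, R2, and Adj R2 can be computed with NumPy; the synthetic data and variable names are assumptions added purely for illustration.

```python
# Illustrative sketch: R^2 and adjusted R^2 from the ANOVA sums of squares
# for an ordinary least-squares fit. Data below are made up for demonstration.
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 2                               # n observations, k predictors
X = rng.normal(size=(n, k))
y = 3.0 + 1.5 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

# Fit y = b0 + b1*x1 + b2*x2 by least squares.
X_design = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(X_design, y, rcond=None)
y_hat = X_design @ beta

sst = np.sum((y - y.mean()) ** 2)          # total sum of squares
ssm = np.sum((y_hat - y.mean()) ** 2)      # model (regression) sum of squares
sse = np.sum((y - y_hat) ** 2)             # residual (error) sum of squares

r2 = ssm / sst
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.3f}")
```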
Regularization Techniques
• Going beyond variable selection, what about coefficient shrinkage?
• Multicollinearity and the potential for many forms of a regression equation
• Y = 4A-2B or Y = 10A-8B
• Ridge Regression
• $\hat{\beta}^{\text{ridge}} = \arg\min_{\beta} \sum_{i=1}^{n} \left( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij} \beta_j \right)^2$
  subject to $\sum_{j=1}^{p} \beta_j^2 \le s$
• Lasso Regression
• $\hat{\beta}^{\text{lasso}} = \arg\min_{\beta} \sum_{i=1}^{n} \left( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij} \beta_j \right)^2$
  subject to $\sum_{j=1}^{p} |\beta_j| \le s$
  (a short code sketch comparing the two fits follows below)
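
A hedged sketch (not from the lecture) of ridge and lasso fits using scikit-learn. Note that sklearn solves the equivalent penalized (Lagrangian) forms with a tuning parameter alpha rather than the constrained forms above; the data and parameter values below are illustrative assumptions.

```python
# Illustrative sketch: ridge and lasso regression with scikit-learn.
# sklearn minimizes the penalized objectives
#   ridge: ||y - Xb||^2 + alpha * sum_j b_j^2
#   lasso: (1/(2n)) ||y - Xb||^2 + alpha * sum_j |b_j|
# which correspond to the constrained forms above for some value of s.
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)
n, p = 100, 5
X = rng.normal(size=(n, p))             # only the first two predictors matter
y = 4.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.3, size=n)

ridge = Ridge(alpha=1.0).fit(X, y)      # larger alpha => stronger shrinkage toward 0
lasso = Lasso(alpha=0.1).fit(X, y)      # L1 penalty can set coefficients exactly to 0

print("ridge coefficients:", np.round(ridge.coef_, 3))
print("lasso coefficients:", np.round(lasso.coef_, 3))
```

With these settings the ridge coefficients are shrunk but stay nonzero, while the lasso typically drives the irrelevant coefficients exactly to zero, which is why the L1 penalty also acts as a form of variable selection.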
