This action might not be possible to undo. Are you sure you want to continue?

On this page!

What is Generalized Linear Model Lasso Regularization?

Regularize Poisson Regression

Regularize Logistic Regression

Regularize Wide Data in Parallel

Generalized Linear Model Lasso and Elastic Net

References

What is Generalized Linear Model Lasso Regularization?

Lasso is a regularization technique. Use lassoglm to:

Reduce the number of predictors in a generalized linear model.

Identify important predictors.

Select among redundant predictors.

Produce shrinkage estimates with potentially lower predictive errors than ordinary least squares.

Elastic net is a related technique. Use it when you have several highly correlated variables. lassoglm provides

elastic net regularization when you set the Alpha name-value pair to a number strictly between 0 and 1.

For details about lasso and elastic net computations and algorithms, see Generalized Linear Model Lasso and

Elastic Net. For a discussion of generalized linear models, see What Are Generalized Linear Models?.

Regularize Poisson Regression

This example shows how to identify and remove redundant predictors from a generalized linear model.

Create data with 20 predictors, and Poisson responses using just three of the predictors, plus a constant.

rng('default') % for reproducibility

X = randn(100,20);

mu = exp(X(:,[5 10 15])*[.4;.2;.3] + 1);

y = poissrnd(mu);

Construct a cross-validated lasso regularization of a Poisson regression model of the data.

[B FitInfo] = lassoglm(X,y,'poisson','CV',10);

Examine the cross-validation plot to see the effect of the Lambda regularization parameter.

lassoPlot(B,FitInfo,'plottype','CV');

Lasso Regularization of Generalized Linear Models - MATLAB... http://www.mathworks.com/help/stats/lasso-regularization-of-g...

1 of 14 6/2/14 9:57 AM

The green circle and dashed line locate the Lambda with minimal cross-validation error. The blue circle and dashed

line locate the point with minimal cross-validation error plus one standard deviation.

Find the nonzero model coefficients corresponding to the two identified points.

minpts = find(B(:,FitInfo.IndexMinDeviance))

minpts =

3

5

6

10

11

15

16

min1pts = find(B(:,FitInfo.Index1SE))

min1pts =

5

10

15

Lasso Regularization of Generalized Linear Models - MATLAB... http://www.mathworks.com/help/stats/lasso-regularization-of-g...

2 of 14 6/2/14 9:57 AM

The coefficients from the minimal plus one standard error point are exactly those coefficients used to create the

data.

Find the values of the model coefficients at the minimal plus one standard error point.

B(min1pts,FitInfo.Index1SE)

ans =

0.2903

0.0789

0.2081

The values of the coefficients are, as expected, smaller than the original [0.4,0.2,0.3]. Lasso works by

"shrinkage," which biases predictor coefficients toward zero.

The constant term is in the FitInfo.Intercept vector.

FitInfo.Intercept(FitInfo.Index1SE)

ans =

1.0879

The constant term is near 1, which is the value used to generate the data.

Regularize Logistic Regression

This example shows how to regularize binomial regression. The default (canonical) link function for binomial

regression is the logistic function.

Step 1. Prepare the data.

Step 2. Create a cross-validated fit.

Step 3. Examine plots to find appropriate regularization.

Step 4. Create a regularized model.

Step 5. Examine residuals.

Step 6. Alternative: Use identified predictors in a least-squares generalized linear model.

Step 1. Prepare the data.

Load the ionosphere data. The response Y is a cell array of 'g' or 'b' strings. Convert the cells to logical values,

with true representing 'g'. Remove the first two columns of X because they have some awkward statistical

properties, which are beyond the scope of this discussion.

load ionosphere

Ybool = strcmp(Y,'g');

X = X(:,3:end);

Lasso Regularization of Generalized Linear Models - MATLAB... http://www.mathworks.com/help/stats/lasso-regularization-of-g...

3 of 14 6/2/14 9:57 AM

Step 2. Create a cross-validated fit.

Construct a regularized binomial regression using 25 Lambda values and 10-fold cross validation. This process can

take a few minutes.

rng('default') % for reproducibility

[B,FitInfo] = lassoglm(X,Ybool,'binomial',...

'NumLambda',25,'CV',10);

Step 3. Examine plots to find appropriate regularization.

lassoPlot can give both a standard trace plot and a cross-validated deviance plot. Examine both plots.

lassoPlot(B,FitInfo,'PlotType','CV');

The plot identifies the minimum-deviance point with a green circle and dashed line as a function of the regularization

parameter Lambda. The blue circled point has minimum deviance plus no more than one standard deviation.

lassoPlot(B,FitInfo,'PlotType','Lambda','XScale','log');

Lasso Regularization of Generalized Linear Models - MATLAB... http://www.mathworks.com/help/stats/lasso-regularization-of-g...

4 of 14 6/2/14 9:57 AM

The trace plot shows nonzero model coefficients as a function of the regularization parameter Lambda. Because

there are 32 predictors and a linear model, there are 32 curves. As Lambda increases to the left, lassoglm sets

various coefficients to zero, removing them from the model.

The trace plot is somewhat compressed. Zoom in to see more detail.

xlim([.01 .1])

ylim([-3 3])

Lasso Regularization of Generalized Linear Models - MATLAB... http://www.mathworks.com/help/stats/lasso-regularization-of-g...

5 of 14 6/2/14 9:57 AM

As Lambda increases toward the left side of the plot, fewer nonzero coefficients remain.

Find the number of nonzero model coefficients at the Lambda value with minimum deviance plus one standard

deviation point. The regularized model coefficients are in column FitInfo.Index1SE of the B matrix.

indx = FitInfo.Index1SE;

B0 = B(:,indx);

nonzeros = sum(B0 ~= 0)

nonzeros =

14

When you set Lambda to FitInfo.Index1SE, lassoglm removes over half of the 32 original predictors.

Step 4. Create a regularized model.

The constant term is in the FitInfo.Index1SE entry of the FitInfo.Intercept vector. Call that value cnst.

The model is logit(mu) = log(mu/(1 - mu)) X*B0 + cnst . Therefore, for predictions, mu = exp(X*B0 +

cnst)/(1+exp(x*B0 + cnst)).

The glmval function evaluates model predictions. It assumes that the first model coefficient relates to the constant

term. Therefore, create a coefficient vector with the constant term first.

cnst = FitInfo.Intercept(indx);

B1 = [cnst;B0];

Lasso Regularization of Generalized Linear Models - MATLAB... http://www.mathworks.com/help/stats/lasso-regularization-of-g...

6 of 14 6/2/14 9:57 AM

Step 5. Examine residuals.

Plot the training data against the model predictions for the regularized lassoglm model.

preds = glmval(B1,X,'logit');

hist(Ybool - preds) % plot residuals

title('Residuals from lassoglm model')

Step 6. Alternative: Use identified predictors in a least-squares generalized linear model.

Instead of using the biased predictions from the model, you can make an unbiased model using just the identified

predictors.

predictors = find(B0); % indices of nonzero predictors

mdl = fitglm(X,Ybool,'linear',...

'Distribution','binomial','PredictorVars',predictors)

mdl =

Generalized Linear regression model:

y ~ [Linear formula with 15 terms in 14 predictors]

Distribution = Binomial

Estimated Coefficients:

Estimate SE tStat pValue

Lasso Regularization of Generalized Linear Models - MATLAB... http://www.mathworks.com/help/stats/lasso-regularization-of-g...

7 of 14 6/2/14 9:57 AM

_________ _______ ________ __________

(Intercept) -2.9367 0.50926 -5.7666 8.0893e-09

x1 2.492 0.60795 4.099 4.1502e-05

x3 2.5501 0.63304 4.0284 5.616e-05

x4 0.48816 0.50336 0.9698 0.33215

x5 0.6158 0.62192 0.99015 0.3221

x6 2.294 0.5421 4.2317 2.3198e-05

x7 0.77842 0.57765 1.3476 0.1778

x12 1.7808 0.54316 3.2786 0.0010432

x16 -0.070993 0.50515 -0.14054 0.88823

x20 -2.7767 0.55131 -5.0365 4.7402e-07

x24 2.0212 0.57639 3.5067 0.00045372

x25 -2.3796 0.58274 -4.0835 4.4363e-05

x27 0.79564 0.55904 1.4232 0.15467

x29 1.2689 0.55468 2.2876 0.022162

x32 -1.5681 0.54336 -2.8859 0.0039035

351 observations, 336 error degrees of freedom

Dispersion: 1

Chi^2-statistic vs. constant model: 262, p-value = 1e-47

Plot the residuals of the model.

plotResiduals(mdl)

Lasso Regularization of Generalized Linear Models - MATLAB... http://www.mathworks.com/help/stats/lasso-regularization-of-g...

8 of 14 6/2/14 9:57 AM

As expected, residuals from the least-squares model are slightly smaller than those of the regularized model.

However, this does not mean that mdl is a better predictor for new data.

Regularize Wide Data in Parallel

This example shows how to regularize a model with many more predictors than observations. Wide data is data with

more predictors than observations. Typically, with wide data you want to identify important predictors. Use lassoglm

as an exploratory or screening tool to select a smaller set of variables to prioritize your modeling and research. Use

parallel computing to speed up cross validation.

Load the ovariancancer data. This data has 216 observations and 4000 predictors in the obs workspace variable.

The responses are binary, either 'Cancer' or 'Normal', in the grp workspace variable. Convert the responses to

binary for use in lassoglm.

load ovariancancer

y = strcmp(grp,'Cancer');

Set options to use parallel computing. Prepare to compute in parallel using parpool.

opt = statset('UseParallel',true);

parpool()

Starting parpool using the 'local' profile ... connected to 2

workers.

ans =

Lasso Regularization of Generalized Linear Models - MATLAB... http://www.mathworks.com/help/stats/lasso-regularization-of-g...

9 of 14 6/2/14 9:57 AM

Pool with properties:

AttachedFiles: {0x1 cell}

NumWorkers: 2

IdleTimeout: 30

Cluster: [1x1 parallel.cluster.Local]

RequestQueue: [1x1 parallel.RequestQueue]

SpmdEnabled: 1

Fit a cross-validated set of regularized models. Use the Alpha parameter to favor retaining groups of highly

correlated predictors, as opposed to eliminating all but one member of the group. Commonly, you use a relatively

large value of Alpha.

rng('default') % for reproducibility

tic

[B,S] = lassoglm(obs,y,'binomial','NumLambda',100, ...

'Alpha',0.9,'LambdaRatio',1e-4,'CV',10,'Options',opt);

toc

Elapsed time is 398.635386 seconds.

Examine cross-validation plot.

lassoPlot(B,S,'PlotType','CV');

Examine trace plot.

lassoPlot(B,S,'PlotType','Lambda','XScale','log')

Lasso Regularization of Generalized Linear Models - MATLAB... http://www.mathworks.com/help/stats/lasso-regularization-of-g...

10 of 14 6/2/14 9:57 AM

The right (green) vertical dashed line represents the Lambda providing the smallest cross-validated deviance. The

left (blue) dashed line has the minimal deviance plus no more than one standard deviation. This blue line has many

fewer predictors:

[S.DF(S.Index1SE) S.DF(S.IndexMinDeviance)]

ans =

50 86

You asked lassoglm to fit using 100 different Lambda values. How many did it use?

size(B)

ans =

4000 84

lassoglm stopped after 84 values because the deviance was too small for small Lambda values. To avoid overfitting,

lassoglm halts when the deviance of the fitted model is too small compared to the deviance in the binary

responses, ignoring the predictor variables.

You can force lassoglm to include more terms by explicitly providing a set of Lambda values.

minLambda = min(S.Lambda);

explicitLambda = [minLambda*[.1 .01 .001] S.Lambda];

[B2,S2] = lassoglm(obs,y,'binomial','Lambda',explicitLambda,...

'LambdaRatio',1e-4, 'CV',10,'Options',opt);

Lasso Regularization of Generalized Linear Models - MATLAB... http://www.mathworks.com/help/stats/lasso-regularization-of-g...

11 of 14 6/2/14 9:57 AM

length(S2.Lambda)

ans =

87

lassoglm used the three smaller values in fitting.

To save time, you can use:

Fewer Lambda, meaning fewer fits

Fewer cross-validation folds

A larger value for LambdaRatio

Use serial computation and all three of these time-saving methods:

tic

[Bquick,Squick] = lassoglm(obs,y,'binomial','NumLambda',25,...

'LambdaRatio',1e-2,'CV',5);

toc

Elapsed time is 51.708074 seconds.

Graphically compare the new results to the first results.

lassoPlot(Bquick,Squick,'PlotType','CV');

lassoPlot(Bquick,Squick,'PlotType','Lambda','XScale','log')

Lasso Regularization of Generalized Linear Models - MATLAB... http://www.mathworks.com/help/stats/lasso-regularization-of-g...

12 of 14 6/2/14 9:57 AM

The number of nonzero coefficients in the lowest plus one standard deviation model is around 50, similar to the first

computation.

Generalized Linear Model Lasso and Elastic Net

Overview of Lasso and Elastic Net

Lasso is a regularization technique for estimating generalized linear models. Lasso includes a penalty term that

constrains the size of the estimated coefficients. Therefore, it resembles ridge regression. Lasso is a shrinkage

estimator: it generates coefficient estimates that are biased to be small. Nevertheless, a lasso estimator can have

smaller error than an ordinary maximum likelihood estimator when you apply it to new data.

Unlike ridge regression, as the penalty term increases, the lasso technique sets more coefficients to zero. This

means that the lasso estimator is a smaller model, with fewer predictors. As such, lasso is an alternative to

stepwise regression and other model selection and dimensionality reduction techniques.

Elastic net is a related technique. Elastic net is akin to a hybrid of ridge regression and lasso regularization. Like

lasso, elastic net can generate reduced models by generating zero-valued coefficients. Empirical studies suggest

that the elastic net technique can outperform lasso on data with highly correlated predictors.

Definition of Lasso for Generalized Linear Models

For a nonnegative value of !, lasso solves the problem

where

Deviance is the deviance of the model fit to the responses using intercept "

0

and predictor coefficients ". The

formula for Deviance depends on the distr parameter you supply to lassoglm. Minimizing the !-penalized

deviance is equivalent to maximizing the !-penalized log likelihood.

N is the number of observations.

! is a nonnegative regularization parameter corresponding to one value of Lambda.

Parameters "

0

and " are scalar and p-vector respectively.

Lasso Regularization of Generalized Linear Models - MATLAB... http://www.mathworks.com/help/stats/lasso-regularization-of-g...

13 of 14 6/2/14 9:57 AM

As ! increases, the number of nonzero components of " decreases.

The lasso problem involves the L

1

norm of ", as contrasted with the elastic net algorithm.

Definition of Elastic Net for Generalized Linear Models

For an # strictly between 0 and 1, and a nonnegative !, elastic net solves the problem

where

Elastic net is the same as lasso when # = 1. For other values of #, the penalty term P

#

(") interpolates between the

L

1

norm of " and the squared L

2

norm of ". As # shrinks toward 0, elastic net approaches ridge regression.

References

[1] Tibshirani, R. Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society, Series

B, Vol. 58, No. 1, pp. 267–288, 1996.

[2] Zou, H. and T. Hastie. Regularization and Variable Selection via the Elastic Net. Journal of the Royal Statistical

Society, Series B, Vol. 67, No. 2, pp. 301–320, 2005.

[3] Friedman, J., R. Tibshirani, and T. Hastie. Regularization Paths for Generalized Linear Models via Coordinate

Descent. Journal of Statistical Software, Vol. 33, No. 1, 2010. http://www.jstatsoft.org/v33/i01

(http://www.jstatsoft.org/v33/i01)

[4] Hastie, T., R. Tibshirani, and J. Friedman. The Elements of Statistical Learning, 2nd edition. Springer, New York,

2008.

[5] McCullagh, P., and J. A. Nelder. Generalized Linear Models, 2nd edition. Chapman & Hall/CRC Press, 1989.

Lasso Regularization of Generalized Linear Models - MATLAB... http://www.mathworks.com/help/stats/lasso-regularization-of-g...

14 of 14 6/2/14 9:57 AM

Sign up to vote on this title

UsefulNot usefulLasso regularization in matlab

Lasso regularization in matlab

- Deming Regression
- Diagnostic Testing
- Multiple linear regression
- polyfit.pdf
- Model Validiation
- Group 8 MAIN 2014-Data Analysis Report
- intro R
- Session 1-Upload ET
- Linear Regression - MATLAB & Simulink
- sinex2
- Spss_web
- STAT 786 Assi-2
- [Christopher_F_Baum]_OLS,_IV,_IV–GMM_and_DPD_Est(BookZa.org)
- hw3sol
- STA302_Mid_2012S
- Multiple Regression - Handout
- Ch19.Linear
- II
- diagnostics.pdf
- Part3. 实用教程--Practical Regression and ANOVA using R
- Regression lecture notes
- Relative Valuation_Ashwath Damodaran
- bms.pdf
- multiple regression mechanics
- 9781461452386-c1 (1)
- Diagnostics
- Linear Least Squares (Mathematics) - Wikipedia, The Free Encyclopedia
- Black Business Statistics Study Guide Ch12
- Classical 1
- Lasso Regularization of Generalized Linear Models - MATLAB & Simulink