2 thoughts on “Regularization Part 2: Lasso Regression”

  1. Hi Josh,

    Thank you so much for the videos! I’ve been working on a data analysis course and ridge regression came up and your videos are a godsend.

    Could you clarify why increasing lambda decreases the slope? My understanding is that in order to decrease the amount of penalty, we can only decrease the slope of the regression line.

    However, I’m also confused by how the slopes shrink in multivariate regressions. And why can ridge regression’s slope never equal zero while lasso regression’s slope can?

    Thank you!


    • If your parameter is 2 and lambda = 1, then the lasso penalty = 2. If lambda = 2, then the lasso penalty = 4, and if lambda = 3, then the lasso penalty = 6. So the more we increase lambda, the larger the penalty becomes. To compensate for this, we can decrease the parameter value. This may increase the sum of the squared residuals, but perhaps not by as much as it reduces the lasso penalty.
      An Introduction to Statistical Learning (free download, just google it) has a discussion of why lasso can shrink parameters to 0 and ridge cannot. Intuitively, once the parameter values get below 1 in absolute value, the ridge penalty, by squaring the parameters, makes their contribution even smaller, and thus there is less need to shrink them. Thus, they don’t go all the way to zero. In contrast, the lasso penalty uses the absolute value of the parameters as is, so the incentive to keep shrinking them does not fade.
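      To make the arithmetic in this reply concrete, here is a minimal Python sketch (it is not from the video). It assumes the simplest possible case: one slope, no intercept, and a cost equal to the sum of the squared residuals plus the penalty. The function names and the toy data are made up for illustration.

```python
import math


def lasso_penalty(lam, slope):
    """Lasso penalty: lambda times the absolute value of the slope."""
    return lam * abs(slope)


def ridge_penalty(lam, slope):
    """Ridge penalty: lambda times the squared slope."""
    return lam * slope ** 2


def ridge_slope(x, y, lam):
    """Slope minimizing sum((y - b*x)^2) + lam * b^2 (no intercept)."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    # Lambda only inflates the denominator, so the slope shrinks toward
    # zero but is exactly zero only if sxy itself is zero.
    return sxy / (sxx + lam)


def lasso_slope(x, y, lam):
    """Slope minimizing sum((y - b*x)^2) + lam * |b| (no intercept)."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    # Lambda is subtracted from the numerator (soft-thresholding), so a
    # large enough lambda sets the slope to exactly zero.
    shrunk = max(abs(sxy) - lam / 2.0, 0.0)
    return math.copysign(shrunk, sxy) / sxx


if __name__ == "__main__":
    # Penalties for a slope of 2, as in the reply above.
    for lam in (1, 2, 3):
        print(lam, lasso_penalty(lam, 2), ridge_penalty(lam, 2))

    # Hypothetical toy data whose least squares slope is 2.
    x = [1.0, 2.0, 3.0]
    y = [2.0, 4.0, 6.0]
    for lam in (0, 10, 56, 100):
        print(lam, lasso_slope(x, y, lam), ridge_slope(x, y, lam))
```

      With this toy data, the lasso slope reaches exactly zero once lambda is large enough, while the ridge slope keeps getting smaller but stays above zero for any finite lambda.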

