← Back to all posts

Reading time - 7 mins

1. ML Picks of the Week

A weekly dose of ML tools, concepts & interview prep. All in under 1 minute.

2. Technical ML Section

Learn how to choose between LASSO and Ridge Regression Models.

3. Career ML Section

Learn 3 ML career tips of the month that can help in yearly (1-4 y.e.) career

1. ML Picks of the Week

🥇ML Tool

Python Library DARTS

DARTS makes time series forecasting simple, flexible, and production-ready — all in Python.

It provides a unified API for classical models (ARIMA, Exponential Smoothing) and modern deep learning models (RNNs, TCN, Transformers).

DARTS is especially powerful for:

Quickly prototyping with multiple forecasting models
Handling multivariate and probabilistic forecasts
Backtesting, ensembling, and model comparison — all out of the box

If you’re still building time series models from scratch or mixing different libraries, start using Darts to streamline your forecasting workflow.

📈 ML Concept

SHAP Values

SHAP (SHapley Additive exPlanations) is a powerful method to interpret ML model predictions by assigning each feature an importance value.

Based on game theory, they show how much each feature contributed to a prediction
SHAP works with any ML model: tree-based, linear, neural networks, etc.
SHAP helps explain individual predictions and overall model behavior

SHAP is especially useful for debugging, building trust with stakeholders, and meeting model transparency requirements.

Read a crystal clear breakdown on how SHAP works HERE.

🤔 ML Interview Question

What is the major difference between Gradient Boosting & Random Forest?

Both are ensemble learning methods that combine decision trees, but they differ in how they build and combine those trees.

Random Forest builds many trees independently using random subsets of data and features.

Gradient Boosting builds trees sequentially, where each new tree tries to correct the errors made by the previous trees.

Key differences:

Random Forest relies on averaging many uncorrelated trees to improve stability
Gradient Boosting creates a strong learner by combining many weak learners in a targeted way

2. Technical ML Section

How to choose between Ridge and LASSO?

(Read a more detailed article here)

You’re building a regression model & you know it might overfit.

So you decide to add regularization. But now you're stuck:

L1 (LASSO) or L2 (Ridge)?

They sound similar.
They both penalize model complexity.
They both help prevent overfitting

In this newsletter, you'll learn:

What exactly is the difference between LASSO and Ridge
Which one to use (and when not to)
A quick cheat sheet to remember it all

Note: LASSO or Ridge are not constrained to linear models.

But for simplicity, in this newsletter, we will consider the linear form only.

1️⃣ What is LASSO?

Imagine that we want to fit (i.e., find weights) a linear model of the following form:

To fit the model and find the weights, LASSO minimizes the following objective function:

We see that, in addition to the MSE error, the objective function has a penalty term expressed as the sum of the weights' absolute values.

When the optimization problem is formed, this penalty term creates a constrained region in the form of a diamond shape in the optimization space.

Due to the constrained shape nature, LASSO naturally drives part of the weights towards zero, see the image below.

(Read a more detailed description in this article)

So, often after the LASSO model is fitted, some of the feature coefficients (weights) can become exactly zeros.

As such, the direct result of LASSO model fitting is feature selection!

2️⃣ What is Ridge?

Similar to LASSO, Ridge minimizes the constrained objective function, however, this function has a different form:

While the absolute weight values created a diamond shape for LASSO, for Ridge it creates a circle, see the image below.

Which difference does it make compared to LASSO?

In this case, the weights also get smaller compared to the case with the unconstrained objective. However, they are NOT forced to become zero!

3️⃣ So, what is the main difference?

As we see above, the main difference between LASSO and Ridge is the fact that LASSO naturally drives part of the weights towards zero. This naturally creates the process of feature selection.

Ridge, on the other hand, does not drive weights to zero while it drives them to become small.

Now, 3 questions arise:

Q1: Can Ridge give you zero weights?

Answer: Yes, sure!

Q2: Does LASSO create more zero weights than Ridge?

Answer: Yes, and this is exactly the main difference between them!

Q3: When to use what?

Answer: See below!

4️⃣ When to use LASSO vs Ridge?

Here’s a practical decision guide based on the behavior of both methods:

✅ Use LASSO when:

You expect that only a few features are truly relevant
You want automatic feature selection
You’re working with high-dimensional data

📌 Good for: sparse models, simplifying features, quick filtering

❌ Avoid LASSO when:

You have many collinear (correlated) features
LASSO tends to pick one and ignore the rest, and that choice can change with small shifts in the data

✅ Use Ridge when:

You believe most features are relevant, even if weakly.
Your dataset includes many correlated features.

Ridge distributes the weight across correlated predictors instead of picking one. That makes it more stable and more robust.

📌 Example use cases: regularizing large models, preventing multicollinearity issues.

❌ Avoid Ridge when:

You believe some features are irrelevant and want the model to ignore them.

🔥 Cheat sheet:

2. ML Career Section

3 ML Career Tips of the Month

These are the ML Career tips I wish I knew when I was starting my career.

These tips are relevant to all Data Scientists and Machine Learning Engineers with up to 4 years of experience.

The tips are:

1. Focus on end-to-end ML skills

2. Learn 1-2 ML and business domains deeply

3. Join a good team & team lead, not a company brand

Why are these tips important?

-> Focusing on end-to-end ML skills will help to:

See the bigger picture when building models
Learn a lot more than just data analysis
Get skills that are in high demand.

-> Focusing on 1-2 ML and business domains will give real expertise for which companies pay more, not less.

Why not learn many ML domains?

My take: "Learning everything = knowing nothing". ALWAYS TRUE.

-> Having a good team and team lead over company brands matters because:

You learn from people, not company names
Big corporations slowly adapt to new technologies
Knowledge is worth a lot more than a company brand

Hope this helps in your ML Career Journey!

LASSO regression vs Ridge regression - when to use what?

That is it for this week!

If you haven’t yet, follow me on LinkedIn where I share Technical and Career ML content every day!

Whenever you're ready, there are 3 ways I can help you:

Join Maistermind for 1 weekly piece with 2 ML guides:

1. Technical ML tutorial or skill learning guide
2. Tips list to grow ML career, LinkedIn, income

Join here!

#9: How to choose Ridge vs LASSO? | 3 ML Career Tips of the Month

Reading time - 7 mins