#6: 5 Tips to Write ML Pipelines | 3 Mistakes to Avoid on ML Interviews
Reading time - 4 mins
1. Technical ML Section
Learn 5 best practices for writing Machine Learning Pipelines.
The full article is here (reading time - 6 mins).
2. Career ML Section
Learn how to avoid 3 (unspoken) mistakes on ML interviews
1. Technical ML Section:
(For a more detailed discussion, read the full blog article!)
The ML Pipeline Challenge
Most Data Scientists don’t come from a Software Engineering background, so key coding best practices are often unfamiliar to them.
Yet, these days, Data Scientists are often expected to build end-to-end ML solutions, including pipelines that run in production.
Without solid Software Engineering principles, these pipelines often end up messy, inflexible, and fragile, especially without proper code reviews from experienced ML/MLOps engineers.
I’ve seen many poorly written ML Pipelines, so I’ve selected the 5 most common bad coding practices in ML Pipelines, with tips on how to fix them.
→ Bad Practice 1: Hardcoding Parameters in Code
🟠 Hardcoding leads to:
- Difficulty in making changes if the codebase is big
- Difficulty in adapting parameters for similar modeled objects without lengthy "if-else" statements.
- Need for re-deployment if the parameters need to be changed.
✅ Solution:
Use a configuration (config) file and write the ML Pipeline so that every parameter that might change is read from this file. Here is an example:
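As a minimal sketch of this idea, assuming a JSON config and Python's standard library only: the parameter names (`target_column`, `test_size`, `n_estimators`, `random_seed`) and the file name mentioned in the comment are hypothetical and chosen purely for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineConfig:
    """Parameters that are likely to change between runs or deployments."""

    target_column: str
    test_size: float
    n_estimators: int
    random_seed: int


def load_config(raw_json: str) -> PipelineConfig:
    """Parse a JSON document into a typed, read-only config object."""
    return PipelineConfig(**json.loads(raw_json))


# In a real pipeline this text would live in a file such as config.json
# (hypothetical name) and be read with pathlib.Path(...).read_text().
EXAMPLE_CONFIG = """
{
    "target_column": "churned",
    "test_size": 0.2,
    "n_estimators": 200,
    "random_seed": 42
}
"""

config = load_config(EXAMPLE_CONFIG)
print(config.n_estimators)
```

With this layout, changing a parameter means editing the config file rather than the code, and a second modeled object can get its own config file instead of another "if-else" branch.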
→ Bad Practice 2: Ignoring Modularization
🟠 Ignoring modularization leads to:
- Poor code readability and difficulty in following data transformation
- Difficult maintenance and testing
- Code repetition and limited reusability
✅ Solution:
Split the code into functions (or class methods). Ideally, each function should perform exactly one task.
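One possible way to structure this is sketched below, using plain lists of dictionaries so that it runs without extra libraries. The column names (`price`, `area_sqm`) and the three steps are illustrative assumptions, not a prescribed pipeline.

```python
def drop_incomplete_rows(rows: list[dict]) -> list[dict]:
    """Keep only rows in which no value is missing (None)."""
    return [row for row in rows if all(v is not None for v in row.values())]


def add_price_per_sqm(rows: list[dict]) -> list[dict]:
    """Add a derived feature without mutating the input rows."""
    return [{**row, "price_per_sqm": row["price"] / row["area_sqm"]} for row in rows]


def scale_column(rows: list[dict], column: str) -> list[dict]:
    """Min-max scale one column to the range [0, 1]."""
    if not rows:
        return []
    values = [row[column] for row in rows]
    low, high = min(values), max(values)
    span = (high - low) or 1.0  # avoid division by zero for constant columns
    return [{**row, column: (row[column] - low) / span} for row in rows]


def preprocess(rows: list[dict]) -> list[dict]:
    """Compose the single-purpose steps into one readable pipeline."""
    rows = drop_incomplete_rows(rows)
    rows = add_price_per_sqm(rows)
    return scale_column(rows, "price_per_sqm")


raw = [
    {"price": 100.0, "area_sqm": 50.0},
    {"price": None, "area_sqm": 40.0},
    {"price": 300.0, "area_sqm": 60.0},
]
print(preprocess(raw))
```

Because each step is its own function, it can be tested, reused, or swapped out independently, and `preprocess` reads as a summary of the data transformation.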
→ Bad Practice 3: Skipping Type Annotations and Code Documentation
🟠 Skipping type annotations and documentation leads to:
- Poor code readability, especially for other developers
- Linters and type checkers cannot catch type-related issues
- Poor maintainability, especially in big codebases
✅ Solution:
Add type annotations, docstrings, and comments to your code. Below is an example.
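A small sketch of what this can look like for a single preprocessing function. The function `clip_outliers` is a hypothetical example, and the docstring layout (Args / Returns / Raises) is one common convention rather than a requirement.

```python
from typing import Sequence


def clip_outliers(values: Sequence[float], lower: float, upper: float) -> list[float]:
    """Clamp every value to the closed interval [lower, upper].

    Args:
        values: Raw feature values, for example sensor readings.
        lower: Smallest value allowed in the output.
        upper: Largest value allowed in the output.

    Returns:
        A new list with the same length and order as ``values``.

    Raises:
        ValueError: If ``lower`` is greater than ``upper``.
    """
    if lower > upper:
        raise ValueError(f"lower ({lower}) must not exceed upper ({upper})")
    # In-range values pass through unchanged; the rest are set to the nearest bound.
    return [min(max(value, lower), upper) for value in values]


print(clip_outliers([-5.0, 0.5, 9.0], lower=0.0, upper=1.0))
```

With the annotations in place, a static type checker can flag a call such as `clip_outliers("abc", 0, 1)` before the pipeline ever runs.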
→ Bad Practice 4: Avoiding Unit & Integration Tests
🟠 Avoiding unit and integration tests leads to:
- Errors in data preprocessing or feature engineering going unnoticed
- Pipeline failures in production & unexpected outputs due to untested edge cases
✅ Solution:
- Use Unit Tests to validate individual components, such as data preprocessing functions or feature engineering steps.
- Use Integration Tests to check how multiple components work together, such as ensuring the model correctly processes preprocessed data or handles edge cases.
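The two kinds of tests described above might look like the following sketch. It is written in the plain `test_*` function style that pytest collects, but uses only built-in `assert` so it has no dependencies. Both `fill_missing_with_mean` and `threshold_model` are hypothetical stand-ins for a real preprocessing step and a real trained model.

```python
from typing import Optional


def fill_missing_with_mean(values: list[Optional[float]]) -> list[float]:
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def threshold_model(features: list[float], threshold: float = 0.5) -> list[int]:
    """Stand-in for a trained model: predicts 1 when a feature exceeds the threshold."""
    return [1 if x > threshold else 0 for x in features]


# Unit tests: one component in isolation, including an edge case.
def test_fill_missing_with_mean_replaces_none():
    assert fill_missing_with_mean([1.0, None, 3.0]) == [1.0, 2.0, 3.0]


def test_fill_missing_with_mean_rejects_empty_column():
    try:
        fill_missing_with_mean([None, None])
    except ValueError:
        return
    raise AssertionError("expected ValueError for a column with no observed values")


# Integration test: the preprocessing output is fed straight into the model.
def test_preprocessing_output_feeds_model():
    features = fill_missing_with_mean([0.2, None, 1.0])
    predictions = threshold_model(features)
    assert len(predictions) == 3
    assert predictions == [0, 1, 1]
```

Saved in a file such as `test_pipeline.py` (hypothetical name), these functions would be picked up by running `pytest` in the same directory.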
→ Bad Practice 5: Avoiding Logging
🟠 Neglecting proper logging leads to:
- Difficult debugging: no clear insights into where and why failures occur
- Lack of visibility: hard to track pipeline execution and performance
✅ Solution:
Use tools like Loguru, Structlog, or Python's built-in logging module to log each pipeline step and gain visibility into execution. Here is an example of logging:
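A minimal sketch using only Python's built-in `logging` module. The logger name `ml_pipeline`, the helper `run_step`, and the message format are assumptions made for illustration.

```python
import logging
import time
from typing import Callable

logger = logging.getLogger("ml_pipeline")


def configure_logging(level: int = logging.INFO) -> None:
    """Send timestamped log records to the console."""
    logging.basicConfig(
        level=level,
        format="%(asctime)s | %(levelname)s | %(name)s | %(message)s",
    )


def run_step(name: str, func: Callable[[list], list], data: list) -> list:
    """Run one pipeline step and log its start, duration, and any failure."""
    logger.info("Starting step '%s' with %d records", name, len(data))
    start = time.perf_counter()
    try:
        result = func(data)
    except Exception:
        # logger.exception records the full traceback at ERROR level.
        logger.exception("Step '%s' failed", name)
        raise
    elapsed = time.perf_counter() - start
    logger.info("Finished step '%s' in %.3fs, %d records out", name, elapsed, len(result))
    return result


configure_logging()
deduplicated = run_step("deduplicate", lambda rows: sorted(set(rows)), [3, 1, 3])
```

Wrapping every step this way gives a consistent record of where time is spent and which step failed, which addresses both the debugging and the visibility problems listed above.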
2. Career ML Section
3 Mistakes to Avoid on ML Interviews
🟠 Mistake 1: Lying about your experience.
Trust me, an experienced interviewer will notice this quickly.
This happens because an interviewer usually starts with simple questions and then dives deeper to understand your actual experience.
If you are lying, you can usually answer the simple questions but not the deeper ones.
For example, if you falsely claim Data Science Leadership experience, a question like “Can you describe how you handled a situation where a delegated task failed?” will sink you straight away.
🟠 Mistake 2: Trying to answer questions by guessing
If you don’t know the answer, just say so.
For example, if asked, “How does a median filter work for time series?” and you’ve never used one, don’t guess—it looks terrible.
Instead, say: “I haven’t worked with median filters, but I have experience with X and Y processing methods for time series.”
🟠 Mistake 3: Trying to answer the question straight away
If the question is open-ended, for instance a case study or an ML System Design question, don’t try to answer right away.
Instead, ask clarifying questions first. This will:
- Show you’re focused on understanding the bigger picture
- Help you avoid misinterpretations or missing key details
- Give you time to think while the interviewer responds
That is it for this week!
If you haven’t yet, follow me on LinkedIn where I share Technical and Career ML content every day!