Q1
Walk me through how you would design a machine learning pipeline for real-time fraud detection on financial transactions. What tools and frameworks would you use, and how would you handle concept drift in production?
Why they ask this: They're evaluating your ability to architect end-to-end ML systems, understand production constraints, and address real-world challenges like data distribution shift that are critical in data-heavy industries.
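One way to make the concept-drift part of an answer concrete is to show a drift monitor. The sketch below computes the Population Stability Index (PSI) between a reference sample of a feature and a recent sample. It is only an illustration: the function name `psi`, the ten equal-width bins, and the `eps` smoothing value are assumptions made for this example, and a production system would likely rely on a monitoring library and tuned thresholds.

```python
import math


def psi(reference, current, n_bins=10, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature.

    Bins are equal-width and derived from the reference sample (an assumption
    of this sketch; quantile bins are another common choice).
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values
            counts[idx] += 1
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


# Hypothetical feature values: a training-time sample and a shifted live sample.
reference_sample = [i / 100 for i in range(100)]
shifted_sample = [v + 0.5 for v in reference_sample]

psi_same = psi(reference_sample, reference_sample)
psi_shifted = psi(reference_sample, shifted_sample)
```

A score near zero suggests the live distribution still matches the reference, while a larger score is a signal to investigate and possibly retrain. The cut-off that triggers an alert is a judgment call for each feature.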
Q2
Explain the trade-offs between L1 and L2 regularization, and describe a situation where you chose one over the other in a past project. How did you measure its impact?
Why they ask this: They're assessing your understanding of fundamental ML concepts, model optimization, and your ability to make data-driven decisions backed by experimentation and metrics.
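A small worked example can anchor the L1 versus L2 trade-off. For a single weight with unpenalized optimum `z`, minimizing `0.5 * (w - z)**2 + lam * abs(w)` gives the soft-threshold solution, while minimizing `0.5 * (w - z)**2 + 0.5 * lam * w**2` gives `z / (1 + lam)`. The sketch below applies both to a handful of made-up weights; the function names and the values of `weights` and `lam` are assumptions chosen for illustration, not results from a real model.

```python
def l1_shrink(z, lam):
    """Closed-form minimizer of 0.5*(w - z)**2 + lam*|w| (soft thresholding)."""
    magnitude = max(abs(z) - lam, 0.0)
    return magnitude if z >= 0 else -magnitude


def l2_shrink(z, lam):
    """Closed-form minimizer of 0.5*(w - z)**2 + 0.5*lam*w**2."""
    return z / (1.0 + lam)


# Hypothetical unpenalized weights and penalty strength.
weights = [3.0, 0.4, -0.2, -2.0]
lam = 0.5

l1_weights = [l1_shrink(w, lam) for w in weights]
l2_weights = [l2_shrink(w, lam) for w in weights]

# L1 sets weights with |z| <= lam exactly to zero (sparsity);
# L2 scales every weight toward zero but leaves none exactly zero.
l1_zero_count = sum(1 for w in l1_weights if w == 0)
l2_zero_count = sum(1 for w in l2_weights if w == 0)
```

In this toy setting L1 removes the two small weights entirely, which is why it is often described as doing feature selection, whereas L2 keeps every feature with a reduced weight.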
Q3
You're working with a highly imbalanced dataset (95% negative class). What techniques would you use to handle this during training and evaluation, and why is accuracy alone insufficient?
Why they ask this: This tests knowledge of practical challenges in real-world datasets, appropriate evaluation metrics (precision, recall, F1, AUC-ROC), and whether you understand the business impact of class imbalance.
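The point about accuracy can be shown with a few lines of arithmetic. The sketch below builds a synthetic label set with the 95% negative split from the question and scores a trivial classifier that always predicts the negative class. The helper `summarize` and the data are assumptions for illustration; in practice a library such as scikit-learn would compute these metrics.

```python
def summarize(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Report 0.0 when a metric is undefined (no predicted or actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Synthetic data: 95 negatives and 5 positives, as in the question.
y_true = [0] * 95 + [1] * 5

# A classifier that ignores its input and always predicts "negative".
always_negative = [0] * 100

baseline = summarize(y_true, always_negative)
```

The always-negative baseline reaches 95% accuracy while catching none of the positive cases, so its recall and F1 are both zero. That gap is the reason to report precision, recall, or F1 alongside (or instead of) accuracy on imbalanced data.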
Q4
Describe your experience with feature engineering in a previous role. How did you identify which features to create, and what methods did you use to validate their importance?