Building a Customer Churn Prediction Pipeline with MLflow

Priyanka Asnani
Senior Machine Learning Engineer, Fidelity Investments
Data Summit 2025, Boston, MA

Agenda

- Problem Statement
- Dataset Overview
- Exploratory Data Analysis
- High-Level System Design
- Why MLflow Projects?
- Chaining Components in MLflow
- MLflow Project Anatomy
- Data Preparation Component: Cleaning & Artifact Logging
- Training Pipeline Component: Preprocessing + XGBoost
- Saving Trained Model with MLflow and W&B Artifacts
- XGBoost Hyperparameter Optimization via Hydra Sweeps
- Experiment Tracking with Weights & Biases
- Saving the Best Model with Weights & Biases Model Registry
- Serving Models with MLflow

Problem Statement

Objective

Predict whether a customer will churn (leave the company) based on their demographics, service subscriptions, and account information.

Why This Problem Matters

- Customer churn is expensive: acquiring new customers costs 5x more than retaining existing ones
- Early identification allows proactive customer retention strategies:
  - Offering targeted promotions
  - Improving service quality
  - Personalized engagement

Dataset Overview

- Dataset: Telco Customer Churn
- Problem Type: Binary classification (predict if customer churns)
- Records: 7000 customers
- Target Variable: churn (Yes/No)

Key Features

Basic Preprocessing Steps

- Dropped irrelevant column: customerID
- Converted TotalCharges to numeric: handled missing values
- Removed rows: where tenure = 0 (invalid customers)
- Stratified splits: to maintain churn proportions across train/test sets

Exploratory Data Analysis

High-Level System Design

- Reproducible: Capture code, environment, and parameters to ensure consistent results across runs and platforms
- Reusable: Define once, run anytime; projects can be easily shared and reused across teams
- Portable: Run the same projec