MLOps with Databricks
Doing it "the right way"
Maria Vechtomova
10 June 2025

MLOps is a paradigm that aims to deploy and maintain ML models in production reliably and efficiently.

"Production" means that the output of the ML model is consistently delivered to the end user and brings business value.

POC: data scientist, product manager, fulfillment

Data scientist: Schedule

Preprocess data -> Train model -> Compute predictions -> Write predictions (mlops_prd.demand_forecast.predictions)
Fulfillment team: Read predictions

Is it production? Yes.
Is it efficient? Maybe.
Is it reliable? No.

What about version control and code quality checks?
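To make the question concrete, here is a minimal sketch of the scheduled notebook described above, written as one sequential job. Only the four step names and the table name `mlops_prd.demand_forecast.predictions` come from the slides. The function names, the toy "model" (a plain average), and the dictionary standing in for the table are hypothetical and chosen only for illustration.

```python
from statistics import mean

# Table name taken from the slide; everything else here is a hypothetical stand-in.
TARGET_TABLE = "mlops_prd.demand_forecast.predictions"


def preprocess_data(raw):
    """Drop missing values (stand-in for real preprocessing)."""
    return [x for x in raw if x is not None]


def train_model(history):
    """'Train' a toy model that always predicts the historical average."""
    average = mean(history)
    return lambda horizon: [average] * horizon


def compute_predictions(model, horizon):
    """Produce one prediction per future period."""
    return model(horizon)


def write_predictions(predictions, table, storage):
    """Overwrite the target table; a dict stands in for the real catalog."""
    storage[table] = list(predictions)


def run_notebook(raw, horizon, storage):
    """The whole job as a single unit, the way a scheduled notebook runs it."""
    clean = preprocess_data(raw)
    model = train_model(clean)
    predictions = compute_predictions(model, horizon)
    write_predictions(predictions, TARGET_TABLE, storage)
    return predictions
```

In this shape, a change to any one step reaches the table the fulfillment team reads with no review or test in between, which is the gap the question points at.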
Version history of a notebook is not enough!

Real solution: version control and CI/CD (CI pipeline, CD pipeline)
- Pre-commit hooks
- Unit tests
- Integration tests
- Deploys to acc/prd environments

Help to prevent mistakes from happening
PR reviews: require approval
Deployment protection: requires approval

Developers should not have direct access to higher environments

Catalog setup
Why it matters
- Full control: deployment to production only goes through CD pipelines
- Quality assurance: code goes through quality checks and tests, and only then can be deployed

Unit testing requires code packaging

demo_project.data.DataLoader

Preprocess data -> Train model -> Compute predictions
What if your code breaks here? You will need to start all over again!
Orchestration matters!

Preprocess data -> Train model -> Compute predictions
Benefits:
- Splitting one big notebook into tasks allows for task run retries
- Different tasks may have different requirements (processing needs Spark; model training often does not need distributed compute)

Preprocess data -> Train model -> Compute predictions
demo_project
project_config.yml
Databricks Asset Bundles to the rescue!

Databricks ass