Delta and Databricks as a Performant Exabyte-Scale Application Backend
Scott Schenkein
VP, Distinguished Engineer
Capital One Cyber Technology
6/10/2025

About me
- Interest in systems performance optimization for over 25 years
- Joined Capital One in 2009
- Cyber Security Data Focus for 10 years
- Lead Engineer for Capital One Cyber
- Sponsoring Exec for the Capital One Data Engineering Community

Today's Lightning Talk
- Discuss trade-offs between performance, scale, and cost for lake applications
- Explore Capital One Cyber's lake architecture, and the pain points we needed to address around “fast search”
- Discuss our efforts to better understand our users and our tech stack, and apply these insights to build “fast search” using Delta
- Take a deeper dive into how our destination technology components worked together to produce great results
- View our simplified technical architecture, and discuss how we're building on top of it for the future

Apps with scale real-time backends are complex and expensive
Scale Data / Cheap and Simple / High Performance: “Pick any 2”
Today we explore how Capital One's Cyber team challenged that assumption

After we converged on Delta Lake
[Diagram labels: Modernized Delta Lakehouse, Data Lake, Load, Tables, Tables, Stream, Batch, Warehouse, $, Ingest Tier, Special-Purpose Search Tier]

After we converged on Delta Lake, we still leveraged a costly point solution for “fast search”
[Diagram labels: Modernized Delta Lakehouse, Data Lake, Load, Tables, Tables, Stream, Batch, Warehouse, Stream Broker, Online Search Indexes, Load, $, Operations Web UI, Ingest Tier, Efficient Access Layer]

We wanted to replace our fast retrieval solution with Delta, like we did for streaming and warehouse
[Diagram labels: Modernized Delta Lakehouse, Data Lake, Load, Tables, Tables, Stream, Batch, Warehouse, $, Operations Web UI, Ingest Tier]

Our known options for meeting our latency and cost goals with Delta and Databricks were not d