Drastically Reducing Processing Costs with Delta Lake


Generoso Pagano & Mauricio Jost
10-13 June 2024
2024 Databricks Inc. All rights reserved
Amadeus IT Group and its affiliates and subsidiaries

About us
Generoso Pagano and Mauricio Jost
Principal Data Engineers, Amadeus
Mostly having fun with Scala, Spark and Delta Lake

Making travel simpler, smarter and smoother.
Online travel agencies
Travel agencies
Travel management companies
Metasearch
Tour operators
Media players
Others
Strategic Alliances and Partners

Our product
100s of output tables
Several years of historical data
History consolidation
A complex application
Challenging requirements
Join/merge intensive
1000s of Spark jobs

Our cost reduction journey
[Chart: pilot daily cost (%) by milestone]
M1: baseline (cost: 100%)
M2: addressed thread contention
M3: z-order, dfp
M4: photon, dv
M5: revised history consolidation (cost: 1%)
Overall change: -99%

M1: Baseline (beginning of our journey)
Functional correctness
Technical stability
Throughput below expectations
CPU usage below 10%
[Journey tracker #1: cost by milestone, M1 to M5]

Why is CPU usage so low?
But... JSON parsing is CPU intensive!
Post-mortem Spark UI
  Spark job not retained in UI
  Unnamed jobs
  Little workers information
Live Spark UI
  What are workers doing?
  Most task threads BLOCKED
  Thread contention (shared lock)
[Figure: Task Threads in Worker JVM.]
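
The "most task threads BLOCKED" observation above can be reproduced in miniature on any JVM. The following is a minimal sketch, not the presenters' code: the class name, the thread names and the SHARED_LOCK object are all hypothetical, and the lock simply stands in for whatever shared lock the task threads contended on. It parks several threads on one monitor and counts how many report Thread.State.BLOCKED, which is the same state a thread dump displays.

```java
import java.util.concurrent.CountDownLatch;

class BlockedThreadCount {
    // Hypothetical shared lock, standing in for whatever the task threads contended on.
    private static final Object SHARED_LOCK = new Object();

    // Counts live threads whose name starts with namePrefix and whose state is BLOCKED,
    // i.e. waiting to enter a synchronized block or method.
    static long countBlocked(String namePrefix) {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.getName().startsWith(namePrefix))
                .filter(t -> t.getState() == Thread.State.BLOCKED)
                .count();
    }

    // Starts `tasks` threads that all need SHARED_LOCK while another thread holds it,
    // then reports how many of them are BLOCKED.
    static long countBlockedUnderContention(int tasks) {
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            synchronized (SHARED_LOCK) {
                lockHeld.countDown();
                try {
                    release.await(); // keep the lock until the measurement is done
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "demo-lock-holder");
        Thread[] workers = new Thread[tasks];
        try {
            holder.start();
            lockHeld.await(); // make sure the lock is taken before the tasks start
            for (int i = 0; i < tasks; i++) {
                workers[i] = new Thread(() -> {
                    synchronized (SHARED_LOCK) { /* the "work" needs the lock */ }
                }, "demo-task-" + i);
                workers[i].start();
            }
            // Poll until every task thread has reached the lock (bounded wait).
            long deadline = System.currentTimeMillis() + 5_000;
            while (countBlocked("demo-task-") < tasks && System.currentTimeMillis() < deadline) {
                Thread.sleep(10);
            }
            long blocked = countBlocked("demo-task-");
            release.countDown(); // let everything finish
            holder.join();
            for (Thread w : workers) {
                w.join();
            }
            return blocked;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int tasks = 4;
        System.out.println("BLOCKED task threads: "
                + countBlockedUnderContention(tasks) + " of " + tasks);
    }
}
```

While the lock is held, none of the task threads consumes CPU, which is consistent with the low CPU usage the slides report despite CPU-intensive work being queued.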
