Selectively Overwriting Data with Delta Lake's Dynamic Insert Overwrite (PDF, 24 pages)
Slide 1: Dynamic Insert Overwrite
Thang Long Vu, Bart Samwel
Databricks R&D
June 11th, 2025

Slide 2: Speakers
Look who's talking!
- Thang Long Vu, Software Engineer, Databricks Amsterdam
- Bart Samwel, Principal Software Engineer, Databricks Amsterdam

Slide 3: Introduction
What is it and why do I care?
Atomically overwriting the latest date partitions is a common ETL process.

history_store_sales (History of sales per day)
date        cutlery     quantity
2025-06-08  chopsticks  10
2025-06-09  spoon       4
2025-06-10  fork        2

latest_updated_sales (Updated sales of most recent dates)
date        cutlery     quantity
2025-06-09  spoon       5
2025-06-10  fork        3
2025-06-10  knife       4

Slide 4: Example

history_store_sales
date        cutlery     quantity
2025-06-08  chopsticks  10
2025-06-09  spoon       4
2025-06-10  fork        2

latest_updated_sales
date        cutlery     quantity
2025-06-09  spoon       5
2025-06-10  fork        3
2025-06-10  knife       4

Slide 5: Example

history_store_sales
date        cutlery     quantity
2025-06-08  chopsticks  10
2025-06-09  spoon       5
2025-06-10  fork        3
2025-06-10  knife       4

latest_updated_sales
date        cutlery     quantity
2025-06-09  spoon       5
2025-06-10  fork        3
2025-06-10  knife       4

Slide 6: Example

history_store_sales
date        cutlery     quantity
2025-06-08  chopsticks  10
2025-06-09  spoon       5
2025-06-10  fork        3
2025-06-10  knife       4

latest_updated_sales
date        cutlery     quantity
2025-06-10  fork        3
2025-06-10  knife       2
2025-06-10  pick        2
2025-06-11  scoop       6

Slide 7: Example

history_store_sales
date        cutlery     quantity
2025-06-08  chopsticks  10
2025-06-09  spoon       5
2025-06-10  fork        3
2025-06-10  knife       2
2025-06-10  pick        2
2025-06-11  scoop       6

latest_updated_sales
date        cutlery     quantity
2025-06-10  fork        3
2025-06-10  knife       2
2025-06-10  pick        2
2025-06-11  scoop       6

Slide 8: Challenges & Solution Requirements
We are presented with the following 3 challenges:
1. Only touch the latest changed date partitions, instead of whole table.
2. Automatically detect the latest changed date partitions.
3. Simple and intuitive for everyone.
= Our solution needs to satisfy the fo