Data Evolution at Inari: Harnessing Delta Live Tables & Unity Catalog
Kishore Sundar, Sr. Manager, Data Engineering
Jonathon Long, Staff Data Engineer

Speakers
Kishore Sundar, Sr. Manager, Data Engineering, Inari
Jonathon Long, Staff Data Engineer, Inari

Inari Agriculture
About Inari

WHAT TO EXPECT
Introduction
What is our talk about?
  o How the Data Engineering team at Inari onboarded their first project onto Databricks.
  o How we developed a strategy, using Unity Catalog and Delta Live Tables, that would guide future data engineering projects at Inari.
Who is it for?
  o Data Engineering
  o Data Management
  o Data Governance

Early Data Landscape (2022-23)
DATA LANDSCAPE
A major LIMS migration project in mid-2023 gave us the opportunity to design a Databricks-centric solution.
Existing data views were created with complex queries (5 hours execution time) running as cron jobs within SQL databases attached to the existing LIMS software.
These data views were critical to key decisions made across our entire product pipeline.

Opportunity

SPECS & REQUIREMENTS
Source data:
  o 25+ tables, with 10-50 columns in each table and 500k-20M records each, always growing
Expected outcome:
  o 15 tables, each the product of joining and transforming several source tables and meant to answer specific operational or scientific questions
  o Must follow FAIR data principles
  o Centralized governance and sharing
  o Ensure data quality and freshness

Opportunity

Unity Catalog Approach
Unified governance and management solution for all data assets.
Allows sharing, governing, and managing data across all workspaces and external applications using SQL warehouses and service principals.
Auditing and lineage capabilities.
Enables FAIR data.
It became clear that features and improvements to Databricks would be built around Unity Catalog.

Why Unity Catalog?
Unity Ca