Unity Catalog, Delta Sharing and Data Mesh on Databricks Lakehouse
Thomas Roach
Surya Sai Turaga
Databricks
2023

Current state of data in an enterprise
  Data distributed in many different systems
  Each use-case starts with an ETL process
  Many copies of the same data
  Hard to understand how data is derived

Data lakehouse, a future-proof architecture
  Data Lakes for Data Engineering and AI
  Data Warehousing for traditional reporting
  Data Warehouses and Data Lakes are difficult to keep in sync
  Security & Governance
  Agility
  Efficiency
  Access to scalable compute
  Experiment and fail fast
  Shadow IT
  Data fragmentation
  Security & Governance
  Data Silos
  Data Warehouses and Data Lakes
  Cloud Proliferation

Data Mesh Principles
  #1 Domain-oriented data ownership
    Distributed architecture where domain teams, the data producers, have autonomy over their data and analytics workloads
  #2 Data as a product
    Applying product thinking to data, to make quality data easily accessible to data consumers via standardised interfaces
  #3 Self-service infrastructure platform
    Domain-agnostic approach to building, executing, and maintaining interoperable data products through common tooling
  #4 Federated computational governance
    Creating an ecosystem that adheres to global rules through automatic execution of decisions by the platform

The Four Principles of Data Mesh
  1. Domain Ownership
    Data domains need to host and serve their domain data sets in an easily consumable way.
  2. Data as a Product
    Apply product thinking to data, by making data a first-class citizen.
    Supporting operations with its owner and development behind it.
    Easy to discover, read and understand - documentation
    Versioning, Security, Monitoring, Logging and Alerting
  3. Self-Serve Data Infrastructure
    Provide tools and user-friendly interfaces to develop analytical data products for both analytical end-user