Ursa: Augment Your Lakehouse With Kafka-Compatible Data Streaming Capabilities

Sijie Guo
Co-founder & CEO, StreamNative
Apache Pulsar PMC Member

Gaurav Saxena
Director of Engineering, Automotive Industry
Apache Pulsar Contributor

Agenda
1. Background & Enterprise Requirements
2. Existing Solution & Challenges
3. Ursa + Iceberg: Cost-Efficient Streaming Lakehouse
4. Conclusion

The Evolving Data Landscape
- Growth of real-time data sources (IoT, streaming events, user logs)
- Need for timely analytics and the shift from batch-only to hybrid streaming + batch
- Traditional connectors and ETL pipelines are getting complicated and can become bottlenecks

Enterprise Priorities
- Governance & Compliance: Single pane of governance for real-time data and batch data
- Cost-Effectiveness: Avoid unnecessary data transfers, reduce duplication, control costs
- Scalability: Handling high-volume streaming data
- Interoperability: Long-term flexibility; avoid vendor lock-in

Example: Connected Vehicles
- Massive Data Volumes: Millions of onboard sensors produce substantial, continuous telemetry data.
- Complex Ingestion: Multiple Kafka clusters must handle high-velocity data in real time.
- Critical Analytics: Real-time insights fuel traffic optimization, predictive maintenance, and safety features.
- Impact of Delays: Slow analytics can disrupt operations, reduce vehicle efficiency, or risk passenger safety.
- Spiky Workloads: Data surges from unpredictable traffic patterns or vehicle events require scalable, resilient infrastructure.

Kafka + Lakehouse
[Diagram labels: Kafka Connect, Pulsar I/O, Other Connectors]

Expensive Data Silos: Kafka and the lakehouse operate as separate systems, leading to duplicated data, incon