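Flink Table Store v0.2: Use Cases and Core Features
Jingsong Li (李劲松), Alibaba

Contents
    Use cases
    Core features
    Future outlook
    Project information

Architecture
    [Figure: architecture diagram]
    Write side: Flink Streaming Insert, Flink Batch Insert
    Flink Table Store:
        Lake Store: Manifest-LSM Files, on DFS/Cloud Storage
        Log System (Kafka)
    Read side: Flink Streaming Query, Flink Batch Query, Hive Query, Spark Query, Presto Query

Use case 1: Offline data warehouse acceleration
    Streaming Write -> Table Store Lake (Manifest-LSM Files) -> Batch Read
    Write side:
        1. Stateless updates
        2. High-performance updates
        Update with PK / Update without PK / Append Only
    Read side:
        3. High-performance MOR (merge-on-read)
        4. Primary-key index acceleration

Use case 2: Partial Update* (COALESCE)
    Streaming Write -> Table Store Lake (Manifest-LSM Files) -> Batch Read
    Builds a wide table keyed on the primary key:

        CREATE TABLE MyTable (
            pk BIGINT PRIMARY KEY NOT ENFORCED,
            column_1 DOUBLE,
            column_2 BIGINT
        ) WITH (
            'merge-engine' = 'partial-update'
        );

        INSERT INTO MyTable
        SELECT pk, column_1, NULL FROM Src1
        UNION ALL
        SELECT pk, NULL, column_2 FROM Src2;

    Write side:
        1. Stateless updates
        2. High-performance updates
    Read side:
        3. High-performance MOR (merge-on-read)
        4. Primary-key index acceleration

Use case 3: Pre-aggregation (Rollup)
    Streaming Write -> Table Store Lake (Manifest-LSM Files) -> Batch Read

        CREATE TABLE MyTable (
            pk BIGINT PRIMARY KEY NOT ENFORCED,
            column_1 DOUBLE,
            column_2 BIGINT
        ) WITH (
            'merge-engine' = 'aggregation',
            'column_1.aggregate' = 'sum',
            'column_2.aggregate' = 'max'
        );

    Write side:
        1. Stateless updates
        2. High-performance updates
    Read side:
        3. High-performance MOR (merge-on-read)
        4. Primary-key index acceleration

Use case 4: Real-time data warehouse enhancement
    Streaming Write -> Table Store Lake (Manifest-LSM Files) + Log System (Kafka) -> Streaming Read / Query
    Data is written to both the lake store and the log system, and the offsets are recorded.

        CREATE TABLE MyTable (
            column_1 DOUBLE,
            column_2 BIGINT,
            dt STRING
        ) PARTITIONED BY (dt) WITH (
            'write-mode' = 'append-only',
            'log.system' = 'kafka',
            'log.topic' = 'my_topic',
            'log.kafka.bootstrap.servers' = '...'
        );

    Hybrid: Backfill
    AppendOnly: preserves input order
    Intermediate tables are queryable

Core features

Flink Table Store v0.1: Lake storage structure
    Snapshot-level transactional semantics
    Support for large-scale data storage on object storage

Flink Table Store v0.1: Inside a partition
    Partition: 2022-05-20
        Bucket-0: LSM Tree
        Bucket-1: LSM Tree
        Bucket-2: LSM Tree

Table Store Catalog
    Flink SQL:

        CREATE CATALOG MyCatalog WITH (
            'type' = 'table-store',
            'root-path' = '...',
            'metastore.type' = 'hive',
            'metastore.uri' = '...'
        );
        U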