Talk title: Introduction to the Latest Advanced Features of Iceberg
Speaker: Chen Junjie (陈俊杰), Tencent, Senior R&D Engineer

Agenda
01 Advanced features from the Iceberg community
02 New scenarios unlocked by Iceberg's advanced features
03 Applying the advanced features at Tencent
04 Q/A

Part 01
Advanced features from the Iceberg community

Branch and Tag

New Table API

    createBranch(String name, long snapshotId);
    createTag(String name, long snapshotId);

[Figure: snapshot lineage with labels A-B-C (master), (tag1), D-E (archive branch), F-G (test branch)]

- Create a branch/tag for a table

    ALTER TABLE table CREATE TAG/BRANCH tagName
    AS OF VERSION snapshotId
    RETAIN interval DAYS|HOURS|MINUTES

- Read from a branch

    SELECT * FROM table BRANCH/TAG branch_name

- Insert into a branch

    INSERT INTO table BRANCH branch_name SELECT ...

    spark().read().format("iceberg").option("branch", branchName).load(table)
    spark().write().format("iceberg").option("branch", branchName).mode(SaveMode.Append).save(table)

Puffin format

A file format designed to store information, such as indexes and statistics, about data managed in an Iceberg table that cannot be stored directly within the Iceberg manifest.

    public interface UpdateStatistics extends PendingUpdate<List<StatisticsFile>> {
      /**
       * Set the table's statistics file for the given snapshot, replacing the previous statistics
       * file for the snapshot if any exists.
       *
       * @return this for method chaining
       */
      UpdateStatistics setStatistics(long snapshotId, StatisticsFile statisticsFile);

      /**
       * Remove the table's statistics file for given snapshot.
       *
       * @return this for method chaining
       */
      UpdateStatistics removeStatistics(long snapshotId);
    }

Statistics

Table statistics
- Number of rows
- Number of distinct values in a column
- The fraction of NULL values in a column
- Min/max value in a column
- The average data size of a column

How do statistics help the CBO?

View

A view is a logical table that can be referenced by future queries. The Iceberg view definition standardizes the view metadata so that views can be shared easily across engines.

Part 02
New scenarios unlocked by Iceberg's advanced features

Scenario 1 unlocked by BRANCH: CDC ingestion into the lake

Write raw CDC events to the change branch, produce change log feed from the bran