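Exploration and Practice of Unifying Batch and Streaming in JD
Han Fei, Senior Technical Expert, JD

Agenda
#1 Overall Thinking
#2 Technical Solutions and Optimizations
#3 Cases
#4 Future Plan

#1 Overall Thinking

Cognition of Unifying Batch and Streaming
- Unified computing
- Unified storage

Challenges
- Data freshness: end-to-end data latency, which is essentially a performance problem
- Compatibility with "batch": compatibility at the development level, scheduling
- Resources: hybrid deployment of streaming and batch jobs, elastic scaling
- User mindset: risks versus benefits from the user's point of view

#2 Technical Solutions and Optimizations

JRC (JD Realtime Computing) platform (diagram labels)
- Engine and infrastructure: Flink, JDOS, HDFS/CFS, Zookeeper
- Job types: SQL, JAR
- Platform services: Logging, Metrics, Config, Debugging, Metadata
- Connected systems: JDQ, JMQ, HBase, JimDB, Hive, Relational DB, Datalake, Clickhouse, Doris, ElasticSearch

Overall Architecture (diagram labels)
- Sources: JDQ, offline data warehouse
- Unified model
- Unified batch and streaming computing: FlinkSQL + UDF
- Unified batch and streaming storage: Iceberg
- Stream storage: Topic
- Batch storage: Hive Table

Compatibility of Batch Processing

Scheduled tasks shown in the diagram: exe_bdm_xxx, exe_fdm_xxx, exe_gdm_xxx, exe_adm_xxx, exe_gdm_xxx_1

Tables and fields shown in the diagram:
gdm_biz_order_m: order_id, item_id, seller_id, o_amounts
gdm_rt_order_m: order_id_m, item_id, seller_id, order_amounts
gdm_order_m: order_id, item_id, seller_id, order_amounts

Computing: FlinkSQL + UDF

- The unified model solves the field-mapping problem
- SQL development targets the unified model layer
- Integrated with the scheduling system
- Flink batch jobs are supported as a data-processing stage
- Custom Hive UDFs, UDAFs, and UDTFs can be reused
- Functions are registered temporarily:
    create catalog xxx
    use catalog xxx
    (drop function xxx)
    create function xxx
    insert into xxx (call function)

Diagram labels: upload UDF jar, Hive Metastore, ExtFunctionModule, Hive Table, Topic

Hybrid Deploy and Auto Scaling

Diagram components: Metrics System, Flink Cluster on JDOS, AutoScaling Service, JRC fabric
Diagram arrows: metrics, metrics, enable, result, adjust

- Compute resource usage is naturally staggered (from 00:00 to 08:00, streaming is off-peak while batch is at its peak)
- A JDOS Zone in which streaming and batch jobs are deployed together
- A unified Flink engine plus automatic elastic scaling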
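The adjustment loop on the Hybrid Deploy and Auto Scaling slide (metrics flow into the AutoScaling Service, which then adjusts the Flink cluster) could be sketched roughly as below. This is only an illustration of a generic utilization-based scaling rule, not JD's actual policy: the `ScalingPolicy` fields, the `decide_parallelism` function, the 0.5 target utilization, and the parallelism bounds are all hypothetical names and values that do not appear in the slides.

```python
import math
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical policy knobs; none of these values come from the slides."""
    target_utilization: float = 0.5  # assumed desired average utilization
    min_parallelism: int = 1         # assumed lower bound
    max_parallelism: int = 64        # assumed upper bound


def decide_parallelism(current: int, utilization: float,
                       policy: ScalingPolicy) -> int:
    """Return the parallelism that would bring utilization near the target.

    `current` is the job's present parallelism and `utilization` is an
    observed load ratio (0.0 = idle, 1.0 = saturated) from a metrics system.
    """
    if current < 1:
        raise ValueError("current parallelism must be at least 1")
    if utilization < 0:
        raise ValueError("utilization must be non-negative")
    # Proportional rule: scale parallelism by observed / target load.
    desired = math.ceil(current * utilization / policy.target_utilization)
    # Clamp to the allowed range so one noisy sample cannot over-scale.
    return max(policy.min_parallelism, min(policy.max_parallelism, desired))


if __name__ == "__main__":
    policy = ScalingPolicy()
    # A busy streaming job during the day: scale out.
    print(decide_parallelism(8, 0.9, policy))
    # The same job in the 00:00 to 08:00 streaming trough: scale in,
    # which frees capacity for batch jobs in the shared zone.
    print(decide_parallelism(8, 0.2, policy))
```

Under this assumed rule, scaling streaming jobs in during their off-peak hours is what would release resources for the batch peak described on the slide.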
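#2 Technical Solutions and Optimizations
- Dimension table optimization
- Join optimization
- Window optimization
- Iceberg Connector optimization

Dimension Table Optimization - Rebalance
- Relies on the platform's topology preview feature
- Forward -> Rebalance, achieved by adjusting parallelism
- Rebalance -> Dynamic Rebalance
uapus.dynamic.rebalance.enable=true

Dimension Table Optimization - Keyby
Operators shown in the diagram: StreamExecSink, StreamExecCalc, StreamExecLookupJoin, StreamExecTableSou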