Apache Gluten Intro.
A middle layer responsible for offloading the execution of JVM-based SQL engines to native engines.

Speaker: Weiting Chen, Intel Software Manager
DataFunSummit 2025

CONTENTS
01 Apache Gluten Introduction: Introducing the Apache Gluten Framework and the Background
02 Performance Result: Demonstrating Gluten Performance Results
03 Customer Success Stories: Sharing Gluten Status and Customer Cases
04 Roadmap: Gluten's Roadmap and Future Work

Basic Operators Stop Growing
[Chart: relative performance of HashAgg, HashJoin, and TableScan on Spark 1.6, 2.0, 2.4, 3.0, and 3.2, normalized to Spark 1.6. Y-axis: Relative Performance, 0 to 2. Higher is better.]

CPU Becomes the Bottleneck
Test environment:
- Nodes: 3
- CPU: 2 x 36C Intel(R) Xeon(R) Platinum 8360Y 3.5GHz
- Memory: 512GB DDR4
- Disk: 4 x INTEL SSDPE2KE016T8
- NIC: 25Gbps
- Dataset: 3T

Apache Gluten
- 2nd-generation native SQL engine for Spark, initiated by Intel and Kyligence
- Transforms Spark's whole-stage physical plan into a Substrait plan and sends it to the native engine
- Offloads performance-critical data processing to a native library
- Defines clear JNI interfaces for native libraries
- Switches between native backends easily
- Reuses Spark's distributed control flow
- Manages data sharing between the JVM and native code
- Extends support to native accelerators

Gluten: A Middle Layer for Native SQL Engines
Motivation
- Open-source native SQL engines and explore more possibilities for integrating Intel products and technologies
Gluten
- A middle layer using Meta's Velox as the default backend
- A Spark plugin that accelerates Spark SQL by offloading to a native engine
Gluten has achieved
- 3.3x speedup in TPC-H
- 2.7x speedup in TPC-DS
Customer Cases
- Alibaba EMR has integrated Gluten into EMR 5.11.1
- Alibaba ADB has adopted Gluten into its MySQL Spark product
- Baidu CloudStorage has used Gluten in a private cloud with a 1.59x speedup
- Baidu NetworkDiskTech has achieved a 1.83x speedup
- BO