breaking-boundaries-tacc-as-an-unified-cloud-native-infra-for-ai-hpc-wu-dui-zha-daeptaccai-hpcni-chang-27dya-shi-peter-pan-daocloud-kaiqiang-xu-hong-kong-university-of-science-and-technology.pdf

Uploaded by: 山海

ID: 627231

2025-04-21

PDF, 33 pages, 2.04 MB


Towards an Unified Cloud-Native Infra for AI & HPC
Kaiqiang Xu: HKUST
Peter Pan: DaoCloud

TACC - A Five-Year Journey
TACC: Turing AI Computing Cloud

About Us
Kaiqiang Xu: PhD Researcher in Computer Systems for Machine Learning, Hong Kong University of Science and Technology
Peter Pan: R&D Engineer Lead, Open Source Advocate, DaoCloud

TACC: Intro
High-performance and highly scalable AI computing infrastructure
TACC is an AI computing infrastructure designed for machine learning applications. It supports and accelerates the constantly evolving research in machine learning at both the software and hardware levels. Thanks to system-level optimizations targeted specifically at ML/DL programs, TACC outperforms traditional HPC clusters in both performance and stability.

Turing AI Computing Cloud - TACC
[Diagram labels: ML Researchers; ML Applications; TensorFlow, Megatron-LM, PyTorch; Dataset; Frameworks; New Research; Smart City Applications; Recommendation; FineWeb; QA Pairs; Privacy-preserving AI; Computer Vision; NLP; ImageNet]

TACC: Intro
Underpinning Research
ML frameworks enhance model development through advanced parallelization and distributed training, efficiently handling complex computations and large datasets.
The cluster resource scheduler optimizes cluster-wide resource allocation across AI tasks, boosting overall job throughput and other efficiency factors in AI clusters.
AI-centric networking improves data flow and reduces latency by efficiently managing large model transport and using FPGAs for compute offloading.

Optimizing ML Applications
[Diagram labels: Compiler Approach: Transformations & Optimizations, Executables; Framework Approach: Optimized API Implementations; User Code; User Code; High-level API; Operating System; Standard Libraries; Software Accelerations; Soft. & Hard. Co-Design; cuDNN, cuBLAS; RDMA; Customized Operations & Drivers; Specialized CPU, GPU, Network implementations; F]

