SmartNICs and DPUs Accelerate Generative AI at Data Center Scale
Kevin Deierling, VP of Networking | SmartNICs Summit, June 2023

Democratizing AI Across Diverse Fields

AI Workloads Accelerating Data Center Transformation
ChatGPT is the fastest-growing application in history, reaching 100 million users in roughly two months.
[Chart: Time to 100 Million Users (months) — ChatGPT compared with WhatsApp, Facebook, Snapchat, Instagram, and TikTok; values range from 2 to 49 months]

NVIDIA Full Stack Compute and Networking
Fueling giant-scale AI infrastructure: GPU compute plus DPU-accelerated networking.

Modern AI is a Data Center Scale Computing Workload
Data centers are becoming AI factories: data as input, intelligence as output.

AI Training Computational Requirements
[Chart: Training compute (petaFLOPs), log scale from 100 to 10,000,000,000, 2011-2023 — models including AlexNet, VGG-19, Seq2Seq, ResNet, InceptionV3, Xception, ResNeXt, DenseNet201, ELMo, MoCo ResNet50, Wav2Vec 2.0, Transformer, GPT-1, BERT Large, GPT-2, XLNet, Megatron-NLG, Microsoft T-NLG, GPT-3, MT-NLG 530B, BLOOM, Chinchilla, and PaLM]
Before Transformers, training compute grew 8x every 2 years; with Transformers, it grows 215x every 2 years. Hardware has scaled accordingly: a single GPU, then an HGX 8-GPU system, then 100s-1000s of HGX 8-GPU systems for workloads like ChatGPT.

Networking for AI Data Centers
- AI Factories: single or few users | extremely large AI models | NVLink and InfiniBand AI fabric
- AI Clouds: multi-tenant | variety of workloads | Ethernet network

The Core of AI Factories: NVIDIA AI Compute Networking
[Chart: throughput vs. number of GPUs — Ethernet (AI Cloud), InfiniBand, and InfiniBand + NVLink (AI Factory)]
AI factories and clouds require different infrastructure networking.

AI Clouds Going Through a Major Change
Generative AI workloads require a new class of Ethernet:
- Loosely coupled applications → distributed computing
- TCP (low-bandwidth flows and utilization) → RoCE (high-bandwidth flows and utilization)
- High jitter tolerance → low jitter tolerance
- Oversubscribed topologies → performance-optimized topologies
- Heterogeneous traffic → Average Multi-P
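The shift from high to low jitter tolerance follows from how distributed training synchronizes: every collective step (e.g. an all-reduce) blocks until the slowest worker finishes, so step time is set by tail latency, not average latency, and the penalty grows with cluster size. A minimal simulation sketch of that effect — all latency numbers and function names here are hypothetical, not from the slides:

```python
import random

def simulate_step_times(num_workers, num_steps, base_ms=10.0, jitter_ms=5.0, seed=0):
    """Average per-step time of a synchronized (tightly coupled) job.

    Each worker's latency is a hypothetical base_ms of compute plus
    uniform network jitter; the step completes only when the slowest
    worker does, because collectives barrier on every participant.
    """
    rng = random.Random(seed)
    step_times = []
    for _ in range(num_steps):
        worker_latencies = [base_ms + rng.uniform(0.0, jitter_ms)
                            for _ in range(num_workers)]
        step_times.append(max(worker_latencies))  # pay the tail, not the mean
    return sum(step_times) / num_steps

# A loosely coupled workload sees roughly the average latency
# (base + jitter/2); a synchronized training job converges toward
# base + jitter as the worker count grows.
for n in (2, 64, 1024):
    print(f"{n:5d} workers: avg step time {simulate_step_times(n, 200):.2f} ms")
```

Running this shows the average step time climbing toward the worst-case latency as workers are added, which is why a lossy, jittery fabric that is fine for loosely coupled cloud traffic stalls large-scale training.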