HYVE Demonstrates Building a Next-Generation Switch Architecture for AI Hyperscale Infrastructure Through Open-Standards Implementation and Performance Optimization (PDF, 11 pages)
Hyve Solutions Confidential

Powering Next Generation Switch Architecture For AI Hyperscale Infrastructure
Michael Lane, VP, Networking, Hyve Solutions
OCP Global Summit 2025

Why Next-Gen Networking for AI

Traditional Data Center Network Struggles
o Congestion with large AI models
o High overhead in scaling
o Inefficient GPU/CPU interconnects

AI Workloads Are Data-Intensive and Require
o High bandwidth for training
o Ultra-low latency
o Scalable fabric for multi-tenants

AI Cluster Networking Requirements
Five Key Factors for Successfully Building a Network for an AI Cluster
1. Bandwidth: Multi-terabits per rack
2. Latency: Sub-microsecond within racks, low inter-rack
3. Telemetry: Real-time monitoring and dynamic traffic optimization
4. Security: Zero-trust for distributed AI workloads
5. Topology: Clos and Dragonfly networks for scalability

Topology: Clos vs. Dragonfly networks for scalability
[Figure: Dragonfly topology (router, node, group, local link, global link) alongside a Clos topology (Core, Aggregation, Edge layers)]

Emerging Trends
o Software Defined Networking (SDN): Managing traffic dynamically
o Massive Scale-Out Traffic: For exascale AI clusters
o Infrastructure Alignment: DLC-based network switches
o Edge Networking: Extending network infrastructure to the edge

Considerations
Challenge | Mitigation
o Cost Optimization
o Interoperability
o Scalability
o Cabling Infrastructure
o Thermals
Align D