Design, Build and Test an OCP AI Network with Industry-Leading Performance.pdf

ID: 1012015. PDF, 13 pages, 1.14 MB.

Design, Build and Test an OCP AI Cloud Network with Industry Leading Performance
Networking. OCP Special Focus: Artificial Intelligence (AI)

Panel Discussion
JM Hands, CEO, FarmGPU
Matt Roman, Sr Director, PLM, Celestica
Marc Austin, CEO, Hedgehog

Design
OCP Networking: Celestica DS5000 800GbE Switch
2U 64-port 800GbE Data Center Switch
AI/ML & Big Data Analytics
Hyperscale Data Centers & Cloud Computing
High-Performance Computing (HPC)
Network Backbone (800GbE Data Center Leaf/Spine)
OCP Networking Software

Build
17 Day Crash Course on AI Networking
[Timeline figure. Date labels: May 23, Jul 16, Jul 17, Aug 1, Aug 15, Aug 20. Milestone labels: Equipment Ordered, NCCL Test, Equipment On Site, Optics Issue, Lots of Collaboration, Go Live, Sold Out.]

AI Network is a Lot More Than a Switch
Columns: Component / Lesson Learned / Better Next Time

Cabling
  Lesson learned: Easy to make mistakes, different types of MPO, dust, etc.
  Better next time: Use host and switch software to confirm cabling.

Optics
  Lesson learned: Very little interoperability, need to validate EVERY optic with switch.
  Better next time: Validate BOM to ensure compatible optics. Management software to provide detailed optics status. Software to identify anomalies.

BIOS
  Lesson learned: Disable IOMMU and PCIe ACS for max performance on NCCL.
  Better next time: Management software to validate host BIOS settings.

OS kernel
  Lesson learned: Blackwell NVIDIA driver workaround for Ubuntu 24.04/Kernel 6.8.
  Better next time: Management software to validate versions and check known issues.

Drivers
  Lesson learned: Mellanox OFED drivers, RDMA setup, Blackwell support.
  Better next time: Management software to automate configuration of host networking.

Kernel module
  Lesson learned: nvidia-peermem, DOCA.
  Better next time: (See above)

NIC
  Lesson learned: MST tools, disable autoneg, 400G force link, tur
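
The cabling row above recommends using host and switch software to confirm cabling. A minimal sketch of that idea follows, assuming the intended cabling plan and the observed neighbors (for example, collected from LLDP, which is an assumption and not stated in the slides) are both available as simple mappings. The function name diff_cabling, the data layout, and all switch, port, and host names are hypothetical and only illustrate the comparison; this is not the tooling used by the presenters.

```python
from typing import Dict, List, Tuple

# A link endpoint is (device name, port name). All names here are hypothetical.
Endpoint = Tuple[str, str]


def diff_cabling(
    expected: Dict[Endpoint, Endpoint],
    observed: Dict[Endpoint, Endpoint],
) -> List[str]:
    """Compare a cabling plan with observed neighbors and list the mismatches.

    Both mappings go from a switch endpoint to the host endpoint on the
    other end of the cable.
    """
    problems: List[str] = []

    # Check every planned link against what was actually seen.
    for local, want in sorted(expected.items()):
        got = observed.get(local)
        if got is None:
            problems.append(
                f"{local[0]}/{local[1]}: no neighbor seen, "
                f"expected {want[0]}/{want[1]}"
            )
        elif got != want:
            problems.append(
                f"{local[0]}/{local[1]}: connected to {got[0]}/{got[1]}, "
                f"expected {want[0]}/{want[1]}"
            )

    # Report links that exist physically but are not in the plan.
    for local in sorted(set(observed) - set(expected)):
        got = observed[local]
        problems.append(
            f"{local[0]}/{local[1]}: unexpected neighbor {got[0]}/{got[1]}"
        )

    return problems


# Hypothetical plan: three cables from one leaf switch to two GPU hosts.
expected_plan = {
    ("leaf-01", "E1/1"): ("gpu-node-01", "nic0"),
    ("leaf-01", "E1/2"): ("gpu-node-01", "nic1"),
    ("leaf-01", "E1/3"): ("gpu-node-02", "nic0"),
}

# Hypothetical observation: two cables swapped, one plugged into the wrong port.
observed_links = {
    ("leaf-01", "E1/1"): ("gpu-node-01", "nic1"),
    ("leaf-01", "E1/2"): ("gpu-node-01", "nic0"),
    ("leaf-01", "E1/4"): ("gpu-node-02", "nic0"),
}

for line in diff_cabling(expected_plan, observed_links):
    print(line)
```

Keeping the comparison a pure function over two mappings means the same check could run against any data source, whether neighbor tables exported from the switches or interface data gathered on the hosts.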
