Confidential

Addressing Better Optical Connections in AI Networks
Dec. 8, 2023

AI Model Sizes Are Growing 10x Annually
GPT-3 example:
- 50,257-word vocabulary
- 2,048-word sequence length
- 175B parameters: 1 TB to store the model
- 300B tokens in the training data set
- Training required 10,000 NVIDIA V100 GPUs for 1 month in 2021
- Power: 4.6 MW
Microsoft's Zaid Khan, GM Cloud AI, Apr. 2023: "We're now training models on 75 MW"

Introducing the Back-End Network
- Unlike traditional general compute networks, AI clusters are built with two separate networks
- The front-end network is used for data I/O to the cluster
- The back-end network creates a communication fabric between all GPUs
- The back-end network can be 10-20x denser than the front-end network
- The result is a huge number of new optical connections at 400G or 800G
- Given the huge volume of these optical components, improvements in energy efficiency are essential as cluster sizes continue to expand
[Figure: switches connected to groups of servers]

Credo: Addressing Every High-Speed Connection in the HSDC

Credo Solutions
Product Solutions:
- Line Card PHYs
- Optical DSPs
- DDC and TOR-to-NIC HiWire AECs
IP and Chiplet Solutions:
- SerDes IP Licensing
- 2.5D and MCM SerDes Chiplets

Core Technology Drives Competitive Advantage
- Signal Integrity
- Power Efficiency
- Cost Effectiveness

LPO Challenges and a Unique Solution
- The LPO story is appealing, but there are many challenges
[Figure: Switches A-D from Vendors A-D, with different switch ASICs, different traces, different optics, and different conditions. A traditional 800G transceiver with an optical DSP is compared with a linear (LPO) 800G transceiver with the DSP removed.]
Result: Huge Variation in Optical Signal Quality and Link P