FuriosaAI Inc., AI Hardware Summit 2024

Beyond Just Hardware: Full-stack Optimization Towards Efficient AI Inference
Hyunsik Choi, Head of SW Platform
Jihoon Yoon, Product Marketing Manager

2017-2021: FuriosaAI founded and Gen 1 vision NPU launched
2021: GPT3 inspired RNGD
2022: RNGD development kick-off
May 2024: RNGD raw silicon sample arrival
July 2024: First LLM demo

Key Points
01 Mass AI adoption is bottlenecked
02 Energy efficient AI inference
03 Full-stack optimization for achieving efficiency

AI has broken energy efficiency
[Chart; accelerators shown: V100, Gaudi 1, A100, MI100, MI250X, H100, Gaudi 2, MI300X, Gaudi 3, B200]
Source: Masanet et al. (2020), Cisco, IEA, Goldman Sachs Research

Electricity is already a huge financial and environmental burden on data centers
Source: HARTING White Paper (2024)

AI inference will be everywhere. But is our infrastructure ready?

"Average server rack densities are increasing but remain below 8 kW. The majority of facilities do not have racks above 30 kW, and those that do have only a few."
- Uptime Institute Global Datacenter Summary 2024

What if there were a more energy-efficient AI inference solution that could be deployed anywhere within existing infrastructure?

FuriosaAI's Mission
Make AI computing sustainable, enabling access to powerful AI for everyone on Earth

RNGD: Powerfully Efficient AI Inference
Data center AI accelerator built for the era of LLMs and other generative AI models

512 TFLOPS: 64 TFLOPS (FP8) x 8 Processing Elements
48 GB Memory Capacity
256 MB SR