On-Premise vs Cloud: Generative AI Total Cost of Ownership

Positioning Information

In recent years, Generative AI, including Large Language Models (LLMs) and vision models, has emerged as a transformative technology in artificial intelligence, driving innovation across industries. However, deploying these models, whether for training, fine-tuning, or inference, poses significant computational challenges.

The scale of data in GenAI is staggering. Models like Llama 3.1, trained on over 15 trillion tokens using a custom-built GPU cluster for 39.3 million GPU hours, illustrate the immense computational demands. Such training can be prohibitively expensive when relying on cloud services: hypothetically, running it on AWS P5 instances (H100 systems) would cost over $483M in cloud charges, even ignoring the storage requirements of the training data. Organizations must therefore carefully evaluate deployment strategies, weighing the total cost of ownership (TCO) of on-premises infrastructure against cloud services.

Figure 1. Lenovo ThinkSystem SR675 V3 with support for eight double-wide GPUs is an ideal on-prem Generative AI server

GenAI models typically operate in two key phases: training and inference. Training involves processing massive datasets, often measured in tens of trillions of tokens, requiring substantial compute resources over long periods. Inference, though less compute-intensive per request, demands continuous, low-latency responses at scale, especially as user demand grows. For both prolonged training and persistent inference at high throughput, on-premises infrastructure offers significant advantages. The fixed nature of capital expenditure (CapEx), combined with optimized utilization of dedicated GPUs, makes on-prem a more cost-efficient option over time. In contrast, cloud costs scale linearly with usage, making them ideal for short-term or bursty workloads.
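The cited $483M figure can be sanity-checked with simple arithmetic. The sketch below assumes an on-demand rate of roughly $98.32/hour for an 8-GPU AWS p5.48xlarge instance; this rate is an assumption on our part (actual pricing varies by region and contract) and is not taken from this paper.

```python
# Sketch of the cloud-cost arithmetic behind the $483M estimate above.
# Assumed inputs (not from the paper): p5.48xlarge on-demand rate of
# ~$98.32/hour for 8 H100 GPUs; real pricing varies by region and contract.

GPU_HOURS = 39.3e6             # Llama 3.1 training budget cited above
P5_HOURLY_RATE = 98.32         # assumed on-demand $/hour, 8-GPU instance
GPUS_PER_INSTANCE = 8

cost_per_gpu_hour = P5_HOURLY_RATE / GPUS_PER_INSTANCE   # ~$12.29/GPU-hour
cloud_training_cost = GPU_HOURS * cost_per_gpu_hour      # ~$483M total

print(f"Cloud cost per GPU-hour: ${cost_per_gpu_hour:.2f}")
print(f"Total training cost:     ${cloud_training_cost / 1e6:.0f}M")
```

Under these assumptions the total lands at roughly $483M, matching the figure quoted in the text, and this is before adding storage, networking, or data-transfer charges.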
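The CapEx-versus-linear-cloud-cost trade-off described above can be made concrete with a break-even calculation. All numbers in the sketch below are hypothetical placeholders, not figures from this paper: an assumed $400,000 acquisition cost for an 8-GPU server and an assumed cloud rate of $12.29 per GPU-hour.

```python
# Hypothetical break-even sketch: after how many hours of sustained use does
# a fixed on-prem CapEx undercut linearly scaling cloud spend? All inputs
# are illustrative assumptions, not figures from the paper.

ONPREM_CAPEX = 400_000.0          # assumed 8-GPU server cost (hypothetical)
CLOUD_RATE_PER_GPU_HOUR = 12.29   # assumed cloud $/GPU-hour (hypothetical)
GPUS = 8

# Cloud spend grows linearly with hours used; the CapEx is fixed.
breakeven_hours = ONPREM_CAPEX / (GPUS * CLOUD_RATE_PER_GPU_HOUR)
breakeven_months = breakeven_hours / (24 * 30)  # assuming 24/7 utilization

print(f"Break-even at ~{breakeven_hours:,.0f} instance-hours "
      f"(~{breakeven_months:.1f} months of continuous use)")
```

Under these illustrative inputs, on-prem becomes cheaper after roughly 4,000 instance-hours (a few months of continuous utilization), which is why sustained training and high-throughput inference favor on-prem while short-term or bursty workloads favor cloud.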