In-SRAM Computing For Lower Power LLMs
George Williams, Head of Embedded AI, GSI Technology
SNIA Storage Developer Conference 2023

Generative AI In The News
"It was the best of times, it was the worst of times."

Generative AI Impact
"It was the best of times": McKinsey & Co., 2023

Energy Costs of Advanced Computing
"It was the worst of times...": https://www.nnlabs.org/power-requirements-of-large-language-models

Agenda
- Next Word Prediction
- Transformer Essentials
- Von Neumann Architecture & Bottleneck
- New Paradigm: Adding Compute Into SRAM
- Associative Compute Grid Power
- Modular IP For Size and Power Budgets
- Token Rates
- Try It Out!
Next Word Prediction

Neural Language Modeling
- A next-token predictor
- Task: next-token prediction; the idea dates back to the 70s
- 90s: RNNs, LSTMs, GRUs
- Nothing works well until the Transformer
- Wait... just the next token?
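To make the next-token-prediction task concrete, here is a minimal sketch of the autoregressive loop: a model scores every vocabulary entry, the highest-scoring token is appended, and the extended sequence is fed back in. The `model` function, `VOCAB`, and greedy argmax decoding below are illustrative assumptions standing in for a real Transformer forward pass, not anything specified in the deck.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = ["the", "cat", "sat", "on", "mat", "<eos>"]

def model(tokens):
    # Hypothetical stand-in: a real LLM conditions on the whole
    # context and returns one logit per vocabulary entry.
    return rng.normal(size=len(VOCAB))

def generate(prompt, max_new_tokens=5):
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        logits = model(tokens)            # score every vocabulary entry
        next_id = int(np.argmax(logits))  # greedy: pick the most likely token
        tokens.append(VOCAB[next_id])
        if VOCAB[next_id] == "<eos>":     # stop at end-of-sequence
            break
    return tokens

print(generate(["the", "cat"]))
```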
Neural Language Modeling
- Inference: more "context" is better
- Positional encoding: the same word at different positions must be told apart ("it" in the 1st position vs. "it" in the 7th)
- Attention: weighted focus across the context, e.g. weights 0.2, 0.4, 0.3, 0.02, 0.02, 0.04, 0.02
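The deck does not say which positional encoding scheme is meant; as one common instance, here is a sketch of the sinusoidal encoding from the original Transformer paper. The point it illustrates is the slide's: the same token embedding gets a different positional vector at position 1 than at position 7, so the model can tell the two occurrences of "it" apart. The dimensions below are arbitrary choices for the example.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    # Sinusoidal encoding ("Attention Is All You Need"):
    #   PE[pos, 2i]   = sin(pos / 10000**(2i/d_model))
    #   PE[pos, 2i+1] = cos(pos / 10000**(2i/d_model))
    pos = np.arange(seq_len)[:, None]
    i = np.arange(0, d_model, 2)[None, :]
    angles = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

pe = positional_encoding(seq_len=8, d_model=16)
# "it" at position 0 and "it" at position 6 receive different vectors,
# so the model sees them differently even though the word is the same.
print(np.allclose(pe[0], pe[6]))  # False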
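And a sketch of attention as "weighted focus", assuming the standard scaled dot-product formulation: a query for the current token is scored against a key per context token, and softmax turns the scores into weights that sum to 1, like the slide's 0.2/0.4/0.3/0.02/... example. The query, key, and value vectors here are random placeholders.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

d = 4
rng = np.random.default_rng(0)
q = rng.normal(size=d)        # query for the current token
K = rng.normal(size=(7, d))   # one key per context token
V = rng.normal(size=(7, d))   # one value per context token

scores = K @ q / np.sqrt(d)   # scaled dot-product similarity
weights = softmax(scores)     # weighted focus over the 7 tokens, sums to 1
output = weights @ V          # context vector: weighted sum of values
print(weights.round(2), weights.sum())
```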