HotChips 2024 Tutorial (v4)
Hans Bouwmeester, PrimisAI

Outline
1. Introduction to LLMs
2. Introduction to RAG
3. From LLM+RAG to EDA AI-Agent

1. Introduction to LLMs

What is an LLM?
Large Language Model
Architecture
- Neural network
- Layers of weights
- Functions operate on the weights
- Just a few hundred lines of code!
Parameters (weights)
- Data file with a value for each weight
- From 7 billion to 1 trillion parameters
Open vs. Closed
- Y/N public availability of:
  - Architecture
  - Architecture + weights
Training
- Weights are adjusted using training data
- For example, Meta Llama 3.1:
  - 10 trillion words in the training set
  - 60 days training time
  - 20,000 Nvidia H100 GPUs
  - $30M compute cost (assuming H100 = $1/hr)
Inference
- Weights are used to compute the response
Quantization
- Compress the weights: size vs. accuracy trade-off
- For example, Meta Llama-70B:
  - 70 billion float16 weights → 140GB file
  - 4-bit quantized → 35GB file
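The cost and file-size figures above follow from simple arithmetic; the minimal sketch below reproduces them. The $1/hr H100 rate is the slide's own assumption, and the byte counts ignore file-format overhead.

```python
# Back-of-envelope numbers behind the Llama 3.1 and Llama-70B figures above.

# Training cost: GPUs x days x 24 h x hourly rate ($1/hr per H100, as assumed on the slide).
gpus, days, usd_per_gpu_hour = 20_000, 60, 1.0
train_cost = gpus * days * 24 * usd_per_gpu_hour
print(f"Training compute cost: ~${train_cost / 1e6:.1f}M")  # ~$28.8M, i.e. roughly $30M

# Quantization: bytes per weight drive the checkpoint size (file-format overhead ignored).
params = 70e9                    # Llama-70B
size_fp16 = params * 2 / 1e9     # float16 = 2 bytes/weight  -> ~140 GB
size_4bit = params * 0.5 / 1e9   # 4-bit   = 0.5 bytes/weight -> ~35 GB
print(f"float16 file: ~{size_fp16:.0f} GB, 4-bit file: ~{size_4bit:.0f} GB")
```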
Example LLM use for code generation
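The slide's own example is not reproduced here; as an illustrative sketch of the pattern, the snippet below prompts a hosted LLM to generate a small piece of HDL. It assumes the OpenAI Python client with an API key in the OPENAI_API_KEY environment variable; the model name and prompt are placeholders, not taken from the slide.

```python
# Illustrative sketch only: prompting a hosted LLM for code generation.
# Assumes `pip install openai` and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

prompt = ("Write synthesizable Verilog for an 8-bit unsigned adder with a "
          "registered sum output and an active-high synchronous reset.")

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name; any code-capable LLM works
    messages=[
        {"role": "system", "content": "You are an experienced digital design engineer."},
        {"role": "user", "content": prompt},
    ],
)

print(response.choices[0].message.content)  # generated code, to be reviewed and simulated
```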
Where do LLMs fit in?
Nested scope: AI ⊃ ML ⊃ DL ⊃ Gen-AI ⊃ LLM
- Artificial Intelligence: reason, learn, and act autonomously
- Machine Learning: train on input data, predict on unseen data
- Deep Learning: build the model using a neural network
- Generative AI: generate new data, similar to the data used for training
- Large Language Model: a model that can perform (N)LP tasks
- Natural Language Processing: interpret natural language

LLM History (NLP and LLM milestones)
1966  ELIZA        1st NLP model, rule-based (MIT)
1972  RNN          Recurrent Neural Network (NN with a feedback loop and internal state)
1997  LSTM         Long Short-Term Memory (RNN that selectively retains past information → attention mechanism)
2017  Transformer  "Attention is all you need" (non-recurrent, processes the entire sequence simultaneously) (Google)
2018  GPT-1        117M parameters (OpenAI)
2018  BERT         340M parameters, Bidirectional Encoder Representations from Transformers
2019  GPT-2        1.5B parameters
2020  GPT-3        175B parameters
2021  Codex        Code generation (GPT-3-based), GitHub Copilot
2022  GPT-3.5      175B parameters
2023  GPT-4        1.7T parameters