Towards Revealing the Mystery behind Chain of Thought: A Theoretical Perspective
Liwei Wang, Peking University

Index
1 Introduction
2 Preliminary
3 CoT is the Key to Solving Math Problems
4 CoT is the Key to Solving General Problems
5 Experiments

Introduction

Capabilities of LLMs

Large Language Models (LLMs) have demonstrated emergent capabilities in various aspects:
- Generation: translation, summary, composition, ...
- Question answering
- Mathematics
- Coding
- Reasoning, Planning, Decision-making, ...

Autoregressive Transformers

Most LLMs follow the autoregressive design paradigm [Radford et al., 2019, Brown et al., 2020, OpenAI, 2023, Zhang et al., 2022, Touvron et al., 2023, Chowdhery et al., 2022, Rae et al., 2021, Scao et al., 2022]. Main idea: various tasks can be uniformly treated as sequence generation problems.
- The input, along with the task description, is encoded together as a sequence of tokens, called the prompt.
- The answer is generated by predicting subsequent tokens conditioned on the prompt in an autoregressive way (a minimal decoding sketch follows below).
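As a brief aside not on the original slides: the autoregressive paradigm factorizes the answer distribution as p(y | x) = prod_t p(y_t | x, y_{<t}), and generation emits one token at a time conditioned on everything produced so far. The minimal Python sketch below illustrates greedy decoding under this paradigm; next_token_logits, VOCAB_SIZE, and EOS are hypothetical stand-ins (a random toy scorer) for a real Transformer forward pass and its vocabulary.

    import numpy as np

    VOCAB_SIZE = 16  # toy vocabulary size (assumption for illustration)
    EOS = 0          # hypothetical end-of-sequence token id

    def next_token_logits(tokens):
        # Stand-in for a Transformer forward pass: returns one logit per
        # vocabulary item, deterministically derived from the context.
        rng = np.random.default_rng(seed=sum(tokens))
        return rng.normal(size=VOCAB_SIZE)

    def generate(prompt, max_new_tokens=8):
        # Greedy autoregressive decoding: repeatedly append the most likely
        # next token, conditioning each prediction on the prompt plus all
        # tokens generated so far.
        tokens = list(prompt)
        for _ in range(max_new_tokens):
            logits = next_token_logits(tokens)
            next_tok = int(np.argmax(logits))
            tokens.append(next_tok)
            if next_tok == EOS:
                break
        return tokens

    print(generate([3, 1, 4]))  # prompt token ids -> prompt + generated answer

In a real LLM, next_token_logits would be a Transformer evaluated on the full token sequence, and the argmax would typically be replaced by temperature or nucleus sampling; the loop structure is the same.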