Introduction to Generative AI and Large Language Models
Meng Fang, RL China 2023

What is Generative AI?
Artificial intelligence systems that can produce high-quality content, specifically text, images, and audio.

The rise of generative AI
Zhao, Wayne Xin, et al. "A Survey of Large Language Models." arXiv preprint arXiv:2303.18223 (2023).

T5 (Text-to-Text Transfer Transformer)
Raffel, Colin, et al. "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer." arXiv preprint arXiv:1910.10683 (2019).

GPT-3 (Generative Pre-trained Transformer 3), OpenAI, 2020.

Language modelling
Language modelling is the task of predicting what word comes next. More formally, given a sequence of words, a language model computes a probability distribution over the next word: P(foods | I like trying new), P(hobbies | I like trying new), and so on. A system that does this is called a language model.

Example: "I like trying new ___" (foods, hobbies, products, activities).
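A minimal sketch of this in code, assuming the Hugging Face transformers package and the public gpt2 checkpoint (neither is named in the slides): the model maps the context "I like trying new" to a distribution over its whole vocabulary, from which we can read off approximate values of P(foods | context), P(hobbies | context), and so on.

```python
# Minimal sketch: next-word probabilities from a pretrained language model.
# Assumes the Hugging Face "transformers" package and the public "gpt2" checkpoint.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

context = "I like trying new"
inputs = tokenizer(context, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits          # (1, seq_len, vocab_size)

next_token_logits = logits[0, -1]            # scores for the word after the context
probs = torch.softmax(next_token_logits, dim=-1)

# P(foods | I like trying new), P(hobbies | ...), etc.
for word in [" foods", " hobbies", " products", " activities"]:
    token_id = tokenizer.encode(word)[0]     # first sub-token of the candidate word
    print(f"P({word.strip()} | {context!r}) ~ {probs[token_id].item():.4f}")
```

Candidate words are looked up here by their first sub-word token, which is only an approximation for words that tokenize into several pieces.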
Language Models are everywhere

Generative Pre-trained Transformers (GPT)
Decoder-based transformers. The first GPT model, introduced in 2018 by OpenAI, was just the decoder part of the original transformer.

Input: What is NLP?
Output: NLP stands for Natural Language Processing, which is a subfield of artificial intelligence (AI) that focuses on the interaction between computers and ...

Transformers
Standard transformers use the encoder-decoder architecture (Vaswani, Ashish, et al. "Attention Is All You Need." 2017), built from repeated transformer blocks. GPT-style models are decoder-based transformers.
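As a rough illustration of what a single decoder block contains (masked self-attention followed by a position-wise feed-forward network, each wrapped in a residual connection with layer normalisation), here is a PyTorch sketch; the sizes are placeholders rather than the exact GPT configuration, and real GPT variants differ in details such as where the layer norm is placed.

```python
# Illustrative decoder block: masked (causal) self-attention + feed-forward,
# with residual connections and layer norm. Sizes are placeholders, not GPT's.
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    def __init__(self, d_model=768, n_heads=12, d_ff=3072, dropout=0.1):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout,
                                          batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model),
        )

    def forward(self, x):
        # Causal mask: position i may only attend to positions <= i.
        seq_len = x.size(1)
        mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool,
                                     device=x.device), diagonal=1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out                      # residual around attention
        x = x + self.ff(self.ln2(x))          # residual around feed-forward
        return x
```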
GPT Architecture
A stack of decoders (decoder blocks); see the sketch below.

GPT models from OpenAI
GPT-2, GPT-3, and GPT-4 have mostly just been larger versions of the original GPT, with the key differences coming from the training data and the training process.

GPT / GPT-1 (2018): 12 decoder blocks, trained on BooksCorpus
GPT-2 (2019): 48 decoder blocks, trained on WebText
GPT-3 (2020): 96 decoder blocks, trained on WebText 2
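Putting the pieces together: a GPT-style model is token and position embeddings, a stack of such decoder blocks, and a final projection back onto the vocabulary, and text is generated by repeatedly predicting the next token and appending it to the input. The sketch below reuses the illustrative DecoderBlock class from the previous snippet and uses greedy decoding for simplicity; it is a sketch of the idea, not OpenAI's implementation.

```python
# Sketch of a GPT-style stack plus greedy autoregressive decoding.
# n_layers would be 12 for GPT-1, up to 48 for GPT-2, 96 for GPT-3;
# the defaults here are just placeholders. Reuses DecoderBlock from above.
import torch
import torch.nn as nn

class TinyGPT(nn.Module):
    def __init__(self, vocab_size, d_model=768, n_layers=12, max_len=1024):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        self.blocks = nn.ModuleList([DecoderBlock(d_model) for _ in range(n_layers)])
        self.ln_f = nn.LayerNorm(d_model)
        self.head = nn.Linear(d_model, vocab_size, bias=False)

    def forward(self, token_ids):                      # (batch, seq_len)
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        x = self.tok_emb(token_ids) + self.pos_emb(positions)
        for block in self.blocks:
            x = block(x)
        return self.head(self.ln_f(x))                 # (batch, seq_len, vocab)

@torch.no_grad()
def generate(model, token_ids, max_new_tokens=20):
    # Greedy decoding: append the most likely next token, one step at a time.
    for _ in range(max_new_tokens):
        logits = model(token_ids)[:, -1]               # scores for the next token
        next_id = logits.argmax(dim=-1, keepdim=True)
        token_ids = torch.cat([token_ids, next_id], dim=1)
    return token_ids
```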