Introduction to Generative AI and Large Language Models
Meng Fang
RL China 2023

What is Generative AI?
Artificial intelligence systems that can produce high-quality content, specifically text, images, and audio.

The rise of generative AI
Zhao, Wayne Xin, et al. A survey of large language models. arXiv preprint arXiv:2303.18223 (2023).

T5
Text-to-Text Transfer Transformer
Raffel, Colin, et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. arXiv preprint arXiv:1910.10683 (2019).

GPT-3
Generative Pre-trained Transformer 3
OpenAI, 2020

Language modelling
Language modelling is the task of predicting what word comes next.
More formally, given a sequence of words, a language model computes the probability distribution of the next word:
P(foods | I like trying new), P(hobbies | I like trying new), ...
A system that does this is called a Language Model.
I like trying new ____
Candidates: foods, hobbies, products, activities

Language Models are everywhere

Generative Pre-trained Transformers (GPT)
Decoder-based transformers
The first GPT model, introduced in 2018 by OpenAI, was just the decoder part of the original transformer.
Input: What is NLP?
Output: NLP stands for Natural Language Processing, which is a subfield of artificial intelligence (AI) that focuses on the interaction between computers and ...
(Image from the Internet)

Transformers
Standard transformers: encoder-decoder architecture
Vaswani, Ashish, et al. Attention Is All You Need. 2017.
Transformer block

Decoder-based transformers
(Image from the Internet)

GPT Architecture
A stack of decoders (decoder blocks)

GPT models from OpenAI
GPT-2, GPT-3, and GPT-4 have mostly just been larger versions of the original GPT, with the key differences coming from training data and training processes.

Model        Year   Decoder blocks   Training data
GPT / GPT-1  2018   12x              BooksCorpus
GPT-2        2019   48x              WebText
GPT-3        2020   96x              WebText 2

The rise of generative AI
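
To make the earlier language modelling definition concrete, the sketch below estimates the next-word distribution for the context "I like trying new" by simple counting. It is only an illustration: the toy corpus and the helper name next_word_distribution are assumptions invented here, and GPT-style models learn this distribution with a neural network rather than by counting.

```python
from collections import Counter

# Hypothetical toy corpus, invented for illustration only.
corpus = [
    "i like trying new foods",
    "i like trying new foods",
    "i like trying new hobbies",
    "i like trying new products",
    "we like trying new activities",
]


def next_word_distribution(context, sentences):
    """Estimate P(word | context) by counting the words that follow `context`."""
    context_tokens = context.lower().split()
    n = len(context_tokens)
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        # Slide a window over the sentence; record the word after each match.
        for i in range(len(tokens) - n):
            if tokens[i:i + n] == context_tokens:
                counts[tokens[i + n]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}  # context never seen in the corpus
    return {word: count / total for word, count in counts.items()}


dist = next_word_distribution("I like trying new", corpus)
print(dist)  # {'foods': 0.5, 'hobbies': 0.25, 'products': 0.25}
```

The last sentence of the toy corpus starts with "we", so it does not match the full context and "activities" gets no probability mass. A counting model like this cannot generalise to unseen contexts, which is one reason neural language models are used instead.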