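之江实验室 Zhejiang Lab
Research Institute of Basic Theories, Artificial Intelligence and Security Team

White Paper on the Security and Privacy of Generative Large Models

Authors: Xu Xiaogang (徐晓刚), Wu Huiwen (吴慧雯), Liu Zhusen (刘竹森), Li Xiang (李想), Tu Wenxuan (涂文轩), Liang Weixuan (梁伟轩), Zhang Yi (张毅), Liu Zhe (刘哲)

Copyright belongs to Zhejiang Lab. Comments and discussion are welcome.
June 6, 2023

The development of generative AI, e.g., large language models (LLMs), has attracted worldwide attention in both academic and industrial communities, especially the ChatGPT series. The success of ChatGPT and GPT4 has shown a future direction for developing AGI. However, large generative models also suffer from data/model security and privacy issues. We should note that, even as they demonstrate great power in changing our lives, large generative models bring many security and privacy problems, such as data leakage and the propagation of fake news. In this white paper, we first review the development of large generative models, including their effects and social influence. Then, we summarize the current security and privacy problems in existing large generative models, e.g., data and model security, copyright problems, and ethical issues. Finally, we give corresponding suggestions for addressing these security and privacy problems. The suggestions can be used to point out future research and development directions, and can also serve as references for government decision-making.

Contents

1 Introduction
2 The Development of Generative Large Models
2.1. Predecessors of ChatGPT and GPT4
2.1.1 GPT1
2.1.2 GPT2
2.1.3 GPT3
2.1.4 GPT3.5
2.1.5 InstructGPT
2.1.6 Google Bert
2.2. ChatGPT and GPT4
2.2.1 ChatGPT
2.2.2 GPT4
2.3. Models Released after ChatGPT and GPT4
2.3.1 Facebook: LLaMa
2.3.2 Stanford: Alpaca
2.3.3 Baidu: Wenxin Yiyan (文心一言)
2.3.4 Alibaba: Tongyi Qianwen (通义千问)
2.3.5 Tsinghua: ChatGLM
3 Changes Brought About by Generative Large Models
3.1. Application 1: Supporting Human-Computer Interaction
3.2. Application 2: Supporting Information Resource Management
3.3. Application 3: Supporting Scientific Research
3.4. Application 4: Supporting Content Creation
4 Security Problems of Generative Large Models
4.1. Data Security of Generative Large Models
4.1.1 Explicit Leakage of Private Information When Using Generative Large Models
4.1.2 Implicit Leakage of Private Information When Using Generative Large Models
4.2. Usage Norms for Generative Large Models
4.2.1 Generative Large Models Used to Write False and Malicious Information/Software
4.2.2 Generative Large Models Violating Local Laws and Regulations
4.2.3 Generative Large Models Lacking an Early-Warning Mechanism
4.2.4 Safety Optimization of Generative Large Models Not Covering Gray Areas
4.3. Trustworthiness and Ethical Issues of Generative Large Models
4.3.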