ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

Team GLM
¹Zhipu AI  ²Tsinghua University

Abstract

We introduce ChatGLM, an evolving family of large language models that we have been developing over time. This report primarily focuses on the GLM-4 language series, which includes GLM-4, GLM-4-Air, and GLM-4-9B. They represent our most capable models, trained with all the insights and lessons gained from the preceding three generations of ChatGLM. To date, the GLM-4 models are pre-trained on ten trillion tokens, mostly in Chinese and English, along with a small corpus in 24 other languages, and are aligned primarily for Chinese and English usage. The high-quality alignment is achieved via a multi-stage post-training process, which involves supervised fine-tuning and learning from human feedback. Evaluations show that GLM-4: 1) closely rivals or outperforms GPT-4 on general metrics such as MMLU, GSM8K, MATH, BBH, GPQA, and HumanEval; 2) gets close to GPT-4-Turbo in instruction following as measured by IFEval; 3) matches GPT-4 Turbo (128K) and Claude 3 on long-context tasks; and 4) outperforms GPT-4 in Chinese alignment as measured by AlignBench. The GLM-4 All Tools model is further aligned to understand user intent and to autonomously decide when and which tool(s) to use, including web browser, Python interpreter, text-to-image model, and user-defined functions, in order to effectively complete complex tasks. In practical applications, it matches and even surpasses GPT-4 All Tools on tasks such as accessing online information via web browsing and solving math problems with the Python interpreter. Over the course of this work, we have open-sourced a series of models, including ChatGLM-6B (three generations), GLM-4-9B (128K, 1M), GLM-4V-9B, WebGLM, and CodeGeeX, attracting over 10 million downloads on Hugging Face in the year 202