TAO and Reinforcement Learning: Building AI with the Data You Have.pdf

PDF, 43 pages, 3.30 MB

TAO: Building AI with the data you have
Brandon Cui & Jonathan Frankle
June 12, 2025

So you want to customize an LLM

The data you're supposed to have
Evals. Benchmarks that perfectly measure AI performance on your task.
Labeled examples. Example behaviors of an LLM on your task. Diverse, representative inputs and golden outputs. Ideally 10,000+.
The reality: nobody has this.

The data you actually have
LLM inputs. If you've deployed a prompt-engineered LLM, you've collected this already.
Relevant context and documents. All of your other data, which provides implicit context for the problems you're trying to solve.
Human judgment. Your expertise and that of your subject matter experts.
This talk: Customizing AI with the data you have.

TLDR
TAO. New LLM finetuning method from Databricks.
What's special? It finetunes without labels. It's much easier to get this data than the data needed for traditional finetuning.
How? Test-time compute, reinforcement learning, Databricks reward model (DBRM).

Databricks is the Data Intelligence Platform
Finetuning. A common way to specialize AI with your data.
Problem. Finding high-quality data is demanding. We need: (1) LLM inputs and (2) clean, high-quality example LLM outputs.

“Customers self-select out or are guided out of finetuning because they don't have enough data”

Data Quality Matters
When doing finetuning, data quality really matters. Higher-quality data leads to better results. (Raghavendra et al. 2024)

Databricks is the Data Intelligence Platform
Finetuning. A common way to specialize AI with your data.
Problem. Finding high-quality data is demanding. We need: (1) LLM inputs and (2) clean, high-quality example LLM outputs.
Solution. What if we could finetune only using inputs?

Example: Traditional Finetuning
1. Set up an LLM for people to interact with.
2. Collect the inputs provided to the LLM.
3. Hand wr
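The slides name three ingredients for finetuning from inputs alone: test-time compute, reinforcement learning, and a reward model. The sketch below illustrates only the general idea of turning unlabeled inputs into a training signal by sampling several candidate responses and scoring them. It is not TAO's actual procedure, which this preview does not describe. All names (`build_training_signal`, `toy_generate`, `toy_reward`) are hypothetical, the generator and reward function are toy stand-ins for a real LLM and for a reward model such as DBRM, and the chosen/rejected pairing is an assumed way to package the scores for a later RL or preference-style update.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ScoredResponse:
    prompt: str
    response: str
    reward: float


def build_training_signal(
    prompts: List[str],
    generate_fn: Callable[[str, int], List[str]],
    reward_fn: Callable[[str, str], float],
    n_samples: int = 4,
) -> List[Dict[str, object]]:
    """For each unlabeled prompt, sample candidates, score them, keep best and worst."""
    records = []
    for prompt in prompts:
        # Test-time compute: spend extra inference to get several candidates.
        candidates = generate_fn(prompt, n_samples)
        # Reward model: score every candidate; no human-written label is used.
        scored = sorted(
            (ScoredResponse(prompt, c, reward_fn(prompt, c)) for c in candidates),
            key=lambda s: s.reward,
            reverse=True,
        )
        best, worst = scored[0], scored[-1]
        records.append(
            {
                "prompt": prompt,
                "chosen": best.response,
                "rejected": worst.response,
                "margin": best.reward - worst.reward,
            }
        )
    return records


# Toy stand-ins (assumptions for illustration only, not real model calls).
def toy_generate(prompt: str, n: int) -> List[str]:
    return [f"{prompt} -> draft {i}" + " detail" * i for i in range(n)]


def toy_reward(prompt: str, response: str) -> float:
    # Pretend that longer answers are better; a real reward model would be learned.
    return float(len(response.split()))


records = build_training_signal(
    ["Summarize the contract"], toy_generate, toy_reward, n_samples=3
)
print(records[0]["chosen"])
```

In a real pipeline the records would feed a training step that updates the model toward higher-reward responses. The point of the sketch is only that the inputs (prompts) are the sole data collected from users.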
