TAO: Building AI with the data you have
Brandon Cui & Jonathan Frankle
June 12, 2025

So you want to customize an LLM

The data you're supposed to have
Evals. Benchmarks that perfectly measure AI performance on your task.
Labeled examples. Example behaviors of an LLM on your task. Diverse, representative inputs and golden outputs. Ideally 10,000+.
The reality: nobody has this.

The data you actually have
LLM inputs. If you've deployed a prompt-engineered LLM, you've collected this already.
Relevant context and documents. All of your other data, which provides implicit context for the problems you're trying to solve.
Human judgment. Your expertise and that of your subject matter experts.
This talk: Customizing AI with the data you have.

TLDR
TAO. New LLM finetuning method from Databricks.
What's special? It finetunes without labels. It's much easier to get this data vs. the data that is needed for traditional finetuning.
How? Test-time compute, reinforcement learning, Databricks reward model (DBRM).

Databricks is the Data Intelligence Platform
Finetuning. A common way to specialize AI with your data.
Problem. Finding high quality data is demanding. We need: (1) LLM inputs and (2) clean, high quality example LLM outputs.

"Customers self-select out or are guided out of finetuning because they don't have enough data"

Data Quality Matters
When doing finetuning
Data quality really matters. Higher quality data leads to better results.
Raghavendra et al. 2024

Databricks is the Data Intelligence Platform
Finetuning. A common way to specialize AI with your data.
Problem. Finding high quality data is demanding. We need: (1) LLM inputs and (2) clean, high quality example LLM outputs.
Solution. What if we could finetune only using inputs?

Example: Traditional Finetuning
1. Set up an LLM for people to interact with.
2. Collect the inputs provided to the LLM.
3. Hand wr