AI Evaluation from First Principles: You Can't Manage What You Can't Measure.pdf

Document ID: 718764 | PDF | 40 pages | 1,002.38 KB


Pallavi Koppol, Research Scientist
Jonathan Frankle, Chief AI Scientist
Thursday, June 12

AI Evaluation from First Principles: You Can't Manage What You Can't Measure

Motivation: A Metaphor
You're building a new software product.
You write a 1000 line script in Python.
You play with it a bit.
"Seems good."
You ship it.

This would be crazy
You write a design doc.
You break into modules with abstraction boundaries.
You write unit tests for typical cases and corner cases.
You write integration tests.
You know it will work before you ship it.

Motivation: A Metaphor
You're building a new AI product.
You write a 1000 word prompt.
You check the vibes.
"Seems good."
You ship it.

2025 is the year of AI Engineering
This is the year we move from AI demos to AI engineering.
The watchword is reliability.
How do you build an AI system that will still exist in a year?
How do you build an AI system that multiple people can work on simultaneously?
How do you build a "million line" equivalent AI system?
We only know one way to do this: modularity, abstraction, and specification.
AI isn't software, so we need to figure out what these concepts mean in this new context.
Our belief at Databricks: it all starts with evals.

Outline
1. Challenge: Measuring quality is important but difficult
2. Framework: 3x3 approach for understanding evaluation needs
3. Solution: Recipe for building gold standard evaluations
4. Takeaways: Discussion & next steps

1. Challenge: Me


