2024 Databricks Inc. All rights reserved

HOW TO COOK GOOD AI PRODUCTS WITH WHAT YOU ALREADY HAVE IN YOUR DATA WAREHOUSE
Julia Neagu | CEO, Quotient AI

“90% of enterprises are not confident going to production with gen AI solutions”

why?

HOW IT STARTED
Writing, testing, and shipping code with deterministic behavior is easier

HOW IT'S GOING
but having realistic, reproducible, and comprehensive testing for LLMs is not

THE 5 RULES OF EVALS
Evaluations that lead to consistent product ships must be:
1. Realistic. Reflecting actual production scenarios accurately.
2. Aligned. Correlated with human judgement.
3. Comprehensive. Encompassing a wide range of production scenarios.
4. Reproducible. Producing the same results under unchanged conditions.
5. Secret. Not part of the training data.

how?

automatic evaluations
w/ human-in-the-loop feedback
starting from real data

WAREHOUSE-TO-REFERENCE