Scaling RAG and Embedding Computation with Ray and Pinecone.pdf

PDF, 57 pages, 3.67 MB


Scaling RAG and Embedding Computation with Ray and Pinecone
Cheng Su, Anyscale
Roy Miara, Pinecone
2024 Databricks Inc. All rights reserved

ABOUT US
Roy Miara
Engineering Manager, Generative AI, Pinecone
Previously worked on Data/ML infra (Spark, DBT, Entity Knowledge Graphs)
Cheng Su
Engineering Manager, Data, Anyscale
Previously worked on Data Infra (Spark, Hadoop), Meta

AGENDA
Sections: Intro, Ray & Anyscale, Pinecone
"The Problem"
RAG: Retrieval Augmented Generation
Vector Database & Embedding
Ray & Anyscale
Embedding
LLM Offline Inference
Serverless Architecture
Scale and Cost
Quality of RAG vs Training

THE "PROBLEM"
What did we try to solve together?
Evaluate a large scale RAG solution
Data: Falcon RefinedWeb, 1B documents from Common Crawl
Embedding Model: gte-large, dimension 1024
Process and Embed with Ray
Upload and Index on Pinecone Serverless
Run a large scale RAG Evaluation

INTRO to RAG

WHAT IS RAG, WHY WE RAG?
Motivation
LLMs don't know what they do not know
LLMs hallucinate even when they know the answer
RAG solves these issues by providing models with factually correct context*
*Errors and omissions excepted

WHAT IS RAG, WHY WE RAG?
New information
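
To give a feel for the scale in the problem statement (1B documents, gte-large embeddings of dimension 1024), here is a rough back-of-envelope sketch of the raw embedding volume. The slides give only the document count and the dimension. The float32 storage width and the one-vector-per-document ratio are assumptions made for illustration, and the function name is hypothetical. Index overhead, metadata, and any chunking are ignored.

```python
# Back-of-envelope size of the raw embedding matrix.
# From the slides: 1B documents, embedding dimension 1024.
NUM_DOCUMENTS = 1_000_000_000
EMBEDDING_DIM = 1024

# Assumptions (not stated in the slides):
BYTES_PER_VALUE = 4        # float32 storage
VECTORS_PER_DOCUMENT = 1   # one embedding per document, no chunking


def raw_embedding_bytes(num_documents, dim, bytes_per_value, vectors_per_document=1):
    """Return the bytes needed to store the embedding values alone."""
    return num_documents * vectors_per_document * dim * bytes_per_value


total_bytes = raw_embedding_bytes(
    NUM_DOCUMENTS, EMBEDDING_DIM, BYTES_PER_VALUE, VECTORS_PER_DOCUMENT
)
total_terabytes = total_bytes / 1e12  # decimal terabytes

print(f"{total_bytes:,} bytes = {total_terabytes:.3f} TB")
```

Under these assumptions the raw vectors alone come to about 4.1 TB. Splitting documents into several chunks each would multiply that figure by the average number of chunks per document.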
