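DataFunSummit #2024
RAG 2.0 Engine: Design Challenges and Implementation
Zhang Yingfeng (张颖峰), Founder, InfiniFlow

Agenda
- RAG 1.0 pain points and solution directions
- How to chunk effectively
- How to recall accurately
- Advanced RAG and preprocessing
- How RAG will evolve

01 RAG 1.0 Pain Points and Solution Directions

RAG architecture pattern
- Stages: Extraction, Indexing, Retrieval, Generation
- Diagram labels: Chunking, Chunks, Embedding model, Embeddings, VectorDB, Question, Relevant chunks, prompts, Answer
- Applications: Search, Recommender, Conversational AI

Challenges facing RAG
- Challenge 1: vector recall alone cannot meet requirements
- Challenge 2: document structure is complex and the data is messy (Garbage In, Garbage Out)
- Challenge 3: the question and the document holding the answer are only loosely related, so it is hard to find the right document from the question

Next-generation RAG architecture
- Offline: document layout model, table layout model, data extraction model, knowledge graph construction, chunking, Embedding model (vectors, sparse vectors)
- Indexes: full-text index, vector index, sparse vector index, graph index
- Online: question, keywords, query rewriting model, Embedding model, Tensor Reranker, LLM, answer and citation generation
- Storage: AI Native Database
- Pain points addressed: Garbage In, Garbage Out; vector recall cannot meet requirements; semantic gap between question and answer

Infinity + RAGFlow = InfiniFlow
- Stages: Extraction, Indexing, Retrieval, Generation
- RAGFlow: Document structure recognition model, Table structure recognition model, Knowledge graph construction model, Document Clustering, Document parsing, Document semantic pre-processing
- Infinity: Tensor, Sparse Vector, Dense Vector, Full Text, Graph embedding, Graph query, Structured data query, Fused Ranking
- Retrieval Augmentation: Query rewriting model, Reranking model

02 How to Chunk Effectively

Overview
- Documents go through a document structure recognition model that identifies headers and footers, paragraphs, images, and tables
- Scanned? (Y/N): OCR
- Text line-break detection
- Heading completion, image cropping
- Table structure recognition model
- Flowcharts, pie charts, bar charts: multimodal model
- Output: chunking results

RAGFlow comparison after tuning the extraction model
[Chart: Accuracy (0.0 to 1.0) for RAGFlow Pro, RAGFlow, Opensource naive RAG, and Commercial RAG product; two metrics, fully accurate rate and partially accurate rate. Values in extraction order, pairing with bars not recoverable: 0.85, 0.65, 0.8, 0.97, 0.35, 0.65, 0.15, 0.5]

Table recognition model
- Cell boundary detection
- Header detection
- Merged-cell detection
- Cross-page table detection

Table recognition model (architecture)
- Diagram labels: Image, CNN Encoder, Code Book, CNN Decoder, VAE (Encoder, Decoder), Transformer Encoder, Transformer Decoder

Document "large" model
- Inputs: tables, flowcharts, pie charts, bar charts
- Diagram labels: Vision Encoder, Transformer Encoder, Transformer Decoder, HTML, Text Decoder

03 How to Recall Accurately

Indexing Database
- Capabilities: multi-way recall, structured data query, fused ranking
- Recall routes: Tensor, Sparse Vector, Dense Vector, Full Text Search
- Columnar Store column types: Numeric/String, Dense Vector, Text, Sparse Vector, Tensor
- Indexes: Secondary Index, Vector Index, Full text Index, Sparse Vector Index, Tensor Index

Benchmark: comparison of database choices for RAG (full-text search + vector)
[Chart: Efficiency vs. Effect for Elasticsearch, Vector Databases, Traditional Databases, Infinity, LanceDB, Weaviate]

How many recall routes?
[Chart: nDCG@10 (axis 40 to 80) on the MLDR long-document retrieval benchmark (English), Embedding Model: BGE-M3. Series in extraction order: Dense, Sparse, BM25+Dense+RRF, BM25+Dense+Sparse+RRF, Dense+Sparse+RRF, BM25+Dense+Sparse+ColBERT Reranker, BM25, BM25+Sparse+RRF. Values in extraction order, pairing with series not recoverable: 49.05, 61.64, 59.86, 63.52, 67.51, 74.54, 63.33, 66.72]

Ranking models
- Dual Encoder: Query and Document Passage each pass through a Transformer; token Embeddings are pooled (Pooling) into one Embedding per side; the score is their Similarity
- Cross Encoder: Query and Document Passage pass together through one Transformer, then an MLP produces the Score
- Late Interaction Encoder: Query and Document Passage each pass through a Transformer; token Embeddings are kept; MaxSim over token pairs produces the Score; document embeddings support Offline Indexing

Benefits of ColBERT
- Baseline: Question, VectorDB, Top 10 results, Top results
- VS: Question, Sparse Vector + Dense Vector + Full Text Search, Top 1000 results, Tensor Reranker, Top results

Benefits of ColBERT
[Chart: nDCG@10 (axis 40 to 80) on the MLDR long-document retrieval benchmark (English), Embedding Model: BGE-M3. Series in extraction order: Dense, Sparse, BM25+Dense+RRF, BM25+Dense+Sparse+RRF, Dense+Sparse+RRF, BM25+Dense+Sparse+ColBERT, BM25, BM25+Dense+ColBERT, BM25+ColBERT, Dense+ColBERT, Sparse+ColBERT, Dense+Sparse+ColBERT. Values in extraction order, pairing with series not recoverable: 49.05, 61.64, 59.86, 63.52, 63.33, 66.72, 73.35, 74.54, 65.63, 72.82, 73.45, 73.35]

ColBERT as ranker or reranker?
[Chart: nDCG@10 (axis 40 to 80) on the MLDR long-document retrieval benchmark (English), Embedding Model: BGE-M3. Series in extraction order: ColBERT EMVB Index, BM25+ColBERT Reranker, ColBERT Brute force. Values in extraction order: 72.23, 73.35, 74.11]

Late interaction is the future of RAG
[Chart: nDCG@10 (axis 40 to 80) on MIRACL. Series: Bge-m3, JaColBERT, Jina-ColBERT v2. Values as extracted: 73.878]

Late interaction is the future of RAG
- answerai-colbert-small-v1: based on JaColBERT, 33M parameters
- Outperforms BGE 110M
- 96 dimensions per token
- 12 bytes per token after binary quantization

04 Advanced RAG and Preprocessing

Document preprocessing for complex Q&A
- RAPTOR
- Diagram labels: original documents, Chunking, Chunks, Chunks and summaries across chunks, Flatten and Indexing, Query

Agentic RAG for complex Q&A
- Diagram labels: Query, Query Intent, Router 1, Router 2, Router 3, Web Search, Ask LLM, Retrieval, Grade, Relevant? (Yes/No), Query Rewrite, Generation, Answer question? (Yes/No), Answer

Knowledge graph for complex Q&A
- Diagram labels: Entity, Passage, Data, Entities, Graph Construction and Augmenta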
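
The recall benchmark slides fuse the BM25, dense, and sparse result lists with RRF (reciprocal rank fusion). The following is a minimal sketch of that fusion step, not Infinity's implementation. It assumes the commonly used constant k = 60, which the slides do not state, and the document ids and the `rrf_fuse` name are hypothetical.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of doc ids with reciprocal rank fusion.

    Each document receives 1 / (k + rank) from every list it appears in
    (rank starts at 1); documents are returned by descending fused score.
    k = 60 is an assumed default, not a value taken from the slides.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Break score ties by doc id so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical top-3 lists from three recall routes.
bm25_hits = ["d1", "d2", "d3"]
dense_hits = ["d3", "d1", "d4"]
sparse_hits = ["d1", "d4", "d2"]

fused = rrf_fuse([bm25_hits, dense_hits, sparse_hits])
print(fused)
```

RRF uses only ranks, so the BM25 scores and the vector similarities never need to be put on a common scale, which is one reason it is a convenient default for multi-way recall.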
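
The ranking-model slide names MaxSim as the scoring operator of the late interaction encoder. A minimal pure-Python sketch of MaxSim scoring and reranking follows, assuming cosine similarity between token embeddings and tiny hand-made 2-dimensional vectors. The function names and data are hypothetical, and a real system would use batched tensor operations rather than nested loops.

```python
import math


def _normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def maxsim_score(query_tokens, doc_tokens):
    """Late-interaction score: for each query token take the highest
    cosine similarity to any document token, then sum over query tokens."""
    q = [_normalize(v) for v in query_tokens]
    d = [_normalize(v) for v in doc_tokens]
    return sum(
        max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
        for qv in q
    )


def rerank(query_tokens, candidates):
    """Order candidate doc ids by descending MaxSim score.

    `candidates` maps doc id -> list of token embeddings, e.g. the
    results of a first-stage multi-way recall.
    """
    scored = [(maxsim_score(query_tokens, toks), doc_id)
              for doc_id, toks in candidates.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


# Hypothetical token embeddings: two query tokens, two candidate passages.
query = [[1.0, 0.0], [0.0, 1.0]]
candidates = {
    "a": [[1.0, 0.0], [0.0, 1.0]],  # matches both query tokens
    "b": [[1.0, 0.0]],              # matches only the first query token
}
order = rerank(query, candidates)
print(order)
```

Because document token embeddings do not depend on the query, they can be computed and stored at indexing time, which matches the "Offline Indexing" label on the slide.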
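
The slide on answerai-colbert-small-v1 states 96 dimensions per token and 12 bytes per token after binary quantization; the arithmetic is 96 bits / 8 = 12 bytes. The sketch below illustrates it under one assumption the slides do not confirm: that each dimension is reduced to its sign bit. The helper names are hypothetical.

```python
def binarize(vec):
    """Pack the sign of each dimension into one bit (1 if x > 0, else 0).

    Sign-based binarization is an assumption for illustration; the slides
    only say "binary quantization".
    """
    if len(vec) % 8 != 0:
        raise ValueError("dimension must be a multiple of 8")
    packed = bytearray(len(vec) // 8)
    for i, x in enumerate(vec):
        if x > 0:
            packed[i // 8] |= 1 << (7 - i % 8)
    return bytes(packed)


def hamming_distance(a, b):
    """Number of differing bits between two equal-length packed vectors."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Hypothetical 96-dimensional token embeddings.
token_a = [1.0] * 96
token_b = [-1.0] * 48 + [1.0] * 48

code_a = binarize(token_a)
code_b = binarize(token_b)
print(len(code_a))                       # bytes per token
print(hamming_distance(code_a, code_b))  # bits that differ
```

Stored as float32, the same 96-dimensional token would take 96 * 4 = 384 bytes, so the binary form is 32 times smaller per token under that assumption.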