Key Optimization Techniques and Practical Lessons for RAG in Production | 极客日志

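Summary (AI-generated): An in-depth look at key optimization techniques and practical lessons for RAG in production. It covers the knowledge processing pipeline (loading, chunking, multi-faceted information extraction, storage) and retrieval-path optimization (query rewriting, multi-strategy recall, reranking). It highlights enhancement techniques such as knowledge graphs, Doc Tree, and metadata filtering, as well as a RAG architecture that combines static knowledge with dynamic tools. Two real-world cases, operations and financial report analysis, show how rigorous process design and a hybrid recall mechanism improve a system's professionalism and accuracy. Suggested evaluation metrics are also provided.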

imJackJia · Published 2025/2/6 · Updated 2026/5/2919 views

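Over the past two years, Retrieval-Augmented Generation (RAG) has gradually become a core component for improving the capabilities of agents. By combining retrieval with generation, RAG brings in external knowledge and opens up more possibilities for applying large models in complex scenarios. In real deployments, however, retrieval accuracy is often low, noise is high, recall is incomplete, and answers lack domain expertise, all of which lead to severe LLM hallucination. This article focuses on the details of knowledge processing and retrieval in real RAG deployments, and on how to optimize the RAG pipeline to ultimately improve recall accuracy.

Quickly building a RAG question-answering application is easy, but putting it into production in a real business scenario still takes a great deal of preparation.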

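[Figure: RAG architecture diagram]

Source Code Walkthrough of Key RAG Flows

The key flows fall into two parts: knowledge processing and RAG retrieval.

[Figure: RAG flow]

1. Knowledge processing

Knowledge loading -> knowledge chunking -> information extraction -> knowledge processing (embedding/graph/keywords) -> knowledge storage

[Figure: Knowledge processing flow]

1.1 Knowledge loading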

# Instantiate through the knowledge factory
KnowledgeFactory -> create() -> load() -> Document

Supported formats include knowledge, markdown, pdf, docx, txt, html, pptx, url, and others.

How to extend:

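from abc import ABC
from typing import Any, List

class Knowledge(ABC):
    def load(self) -> List[Document]:
        """Load knowledge from data loader."""

    @classmethod
    def document_type(cls) -> Any:
        """Get document type."""

    @classmethod
    def support_chunk_strategy(cls) -> List[ChunkStrategy]:
        """Return the chunk strategies this knowledge type supports."""
        return [
            ChunkStrategy.CHUNK_BY_SIZE,
            ChunkStrategy.CHUNK_BY_PAGE,
            ChunkStrategy.CHUNK_BY_PARAGRAPH,
            ChunkStrategy.CHUNK_BY_MARKDOWN_HEADER,
            ChunkStrategy.CHUNK_BY_SEPARATOR,
        ]

    @classmethod
    def default_chunk_strategy(cls) -> ChunkStrategy:
        """Return the strategy used when none is configured."""
        return ChunkStrategy.CHUNK_BY_SIZE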
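As a minimal sketch of what an extension might look like, the block below defines a hypothetical `CsvKnowledge` loader. `Document`, `ChunkStrategy`, and `Knowledge` here are simplified stand-ins written for this example, not the framework's real classes, and the `CsvKnowledge` name, its constructor, and the row-to-text format are all assumptions.

```python
import csv
import io
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, List


@dataclass
class Document:  # stand-in for the framework's Document type
    content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


class ChunkStrategy(Enum):  # stand-in with only two strategies
    CHUNK_BY_SIZE = "size"
    CHUNK_BY_SEPARATOR = "separator"


class Knowledge(ABC):  # stand-in mirroring the interface described here
    @abstractmethod
    def load(self) -> List[Document]:
        """Load knowledge from a data source."""


class CsvKnowledge(Knowledge):  # hypothetical extension
    def __init__(self, text: str, source: str = "inline"):
        self._text = text
        self._source = source

    def load(self) -> List[Document]:
        # Turn every CSV row into one Document of "column: value" lines.
        reader = csv.DictReader(io.StringIO(self._text))
        docs = []
        for row_no, row in enumerate(reader):
            content = "\n".join(f"{k}: {v}" for k, v in row.items())
            docs.append(Document(content, {"source": self._source, "row": row_no}))
        return docs

    @classmethod
    def document_type(cls) -> Any:
        return "csv"

    @classmethod
    def support_chunk_strategy(cls) -> List[ChunkStrategy]:
        return [ChunkStrategy.CHUNK_BY_SIZE, ChunkStrategy.CHUNK_BY_SEPARATOR]

    @classmethod
    def default_chunk_strategy(cls) -> ChunkStrategy:
        return ChunkStrategy.CHUNK_BY_SIZE


docs = CsvKnowledge("name,role\nAlice,DBA\nBob,SRE\n").load()
```

Keeping the row number in `metadata` is a design choice for this sketch: it lets a later retrieval step trace a chunk back to its source row.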

1.2 Knowledge chunking

class ChunkManager:
    """Manager for chunks."""
    def __init__(
        self,
        knowledge: Knowledge,
        chunk_parameter: Optional[ChunkParameters] = None,
        extractor: Optional[Extractor] = None,
    ):
        """Create a new ChunkManager with the given knowledge."""
        self._knowledge = knowledge
        self._extractor = extractor
        self._chunk_parameters = chunk_parameter or ChunkParameters()
        self._chunk_strategy = (
            chunk_parameter.chunk_strategy
            if chunk_parameter and chunk_parameter.chunk_strategy
            else self._knowledge.default_chunk_strategy().name
        )
        self._text_splitter = self._chunk_parameters.text_splitter
        self._splitter_type = self._chunk_parameters.splitter_type

class ChunkStrategy(Enum):
    """Chunk Strategy Enum."""
    CHUNK_BY_SIZE: _STRATEGY_ENUM_TYPE = (
        RecursiveCharacterTextSplitter,
        [
            {
                "param_name": "chunk_size",
                "param_type": "int",
                "default_value": 512,
                "description": "The size of the data chunks used in processing.",
            },
            {
                "param_name": "chunk_overlap",
                "param_type": "int",
                "default_value": 50,
                "description": "The amount of overlap between adjacent data chunks.",
            },
        ],
        "chunk size",
        "split document by chunk size",
    )
    # ... other strategies like CHUNK_BY_PAGE, CHUNK_BY_PARAGRAPH etc.
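To make the `chunk_size` and `chunk_overlap` parameters concrete, here is a minimal sliding-window sketch. It is not the `RecursiveCharacterTextSplitter` algorithm, which also splits on separators; the `chunk_by_size` function is a hypothetical helper that only shows how the two parameters interact.

```python
from typing import List


def chunk_by_size(text: str, chunk_size: int = 512, chunk_overlap: int = 50) -> List[str]:
    """Split text into fixed-size windows that overlap by chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start : start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


chunks = chunk_by_size("abcdefghij", chunk_size=4, chunk_overlap=1)
# chunks == ["abcd", "defg", "ghij"]: each chunk repeats the last character
# of the previous one.
```

The overlap trades some storage for continuity: a sentence cut at a window boundary still appears whole in at least one neighbouring chunk when the overlap is large enough.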

1.3 Knowledge extraction

@abstractmethod
def embed_documents(self, texts: List[str]) -> List[List[float]]:
    """Embed search docs."""

@abstractmethod
def embed_query(self, text: str) -> List[float]:
    """Embed query text."""

# EMBEDDING_MODEL=proxy_openai
# proxy_openai_proxy_server_url=https://api.openai.com/v1
# proxy_openai_proxy_api_key={your-openai-sk}
# proxy_openai_proxy_backend=text-embedding-ada-002
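The two methods `embed_documents` and `embed_query` are all a retriever needs from an embedding model. The sketch below fills them in with a toy hashing embedding so the interface can be exercised without a model or network call. `HashingEmbeddings` and `cosine` are hypothetical names, and a real deployment would call an actual embedding model instead.

```python
import hashlib
import math
from typing import List


class HashingEmbeddings:
    """Toy bag-of-words embedding: each token is hashed into one of `dim` buckets."""

    def __init__(self, dim: int = 64):
        self.dim = dim

    def _embed(self, text: str) -> List[float]:
        vec = [0.0] * self.dim
        for token in text.lower().split():
            digest = hashlib.md5(token.encode("utf-8")).digest()
            vec[int.from_bytes(digest[:4], "big") % self.dim] += 1.0
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        return [v / norm for v in vec]  # L2-normalize so dot product equals cosine

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [self._embed(t) for t in texts]

    def embed_query(self, text: str) -> List[float]:
        return self._embed(text)


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two already-normalized vectors."""
    return sum(x * y for x, y in zip(a, b))


emb = HashingEmbeddings()
doc_vecs = emb.embed_documents(["restart the database", "quarterly revenue report"])
query_vec = emb.embed_query("database restart")
scores = [cosine(query_vec, v) for v in doc_vecs]
```

Documents and queries go through the same `_embed` path here. Some real models use different prompts or prefixes for the two, which is one reason the interface keeps them as separate methods.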

class TripletExtractor(LLMExtractor):
    """TripletExtractor class."""
    def __init__(self, llm_client: LLMClient, model_name: str):
        super().__init__(llm_client, model_name, TRIPLET_EXTRACT_PT)

TRIPLET_EXTRACT_PT = (
    "Some text is provided below. Given the text, "
    "extract up to knowledge triplets as more as possible "
    "in the form of (subject, predicate, object).\n"
    "Avoid stopwords. The subject, predicate, object can not be none.\n"
    "---------------------\n"
    "Example:\n"
    "Text: Alice is Bob's mother.\n"
    "Triplets:\n(Alice, is mother of, Bob)\n"
    # ... more examples
)
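The prompt asks the model to answer in the form `(subject, predicate, object)`, so the reply has to be parsed before anything can be stored. The sketch below is one possible parser. `parse_triplets` and its `limit` parameter are hypothetical, and the sample model output is made up for illustration.

```python
import re
from typing import List, Tuple

_TRIPLET_RE = re.compile(r"\(([^()]*)\)")


def parse_triplets(llm_output: str, limit: int = 10) -> List[Tuple[str, str, str]]:
    """Extract well-formed (subject, predicate, object) tuples from LLM text."""
    triplets = []
    for match in _TRIPLET_RE.finditer(llm_output):
        parts = [p.strip() for p in match.group(1).split(",")]
        # Keep only groups with exactly three non-empty fields.
        if len(parts) != 3 or not all(parts):
            continue
        triplets.append((parts[0], parts[1], parts[2]))
        if len(triplets) >= limit:
            break
    return triplets


output = "Triplets:\n(Alice, is mother of, Bob)\n(Bob, , Alice)\n(Bob, is son of, Alice)"
triplets = parse_triplets(output)
# The malformed middle triplet, whose predicate is empty, is dropped.
```

A known limitation of this sketch is that an entity containing a comma would be split into too many fields and discarded, so a stricter output format would be needed for such data.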

1.4 Knowledge storage

class VectorStoreBase(IndexStoreBase, ABC):
    """Vector store base class."""
    @abstractmethod
    def load_document(self, chunks: List[Chunk]) -> List[str]:
        """Load document in index database."""
    
    @abstractmethod
    async def aload_document(self, chunks: List[Chunk]) -> List[str]:
        """Load document in index database."""
    
    @abstractmethod
    def similar_search_with_scores(
        self,
        text,
        topk,
        score_threshold: float,
        filters: Optional[MetadataFilters] = None,
    ) -> List[Chunk]:
        """Similar search with scores in index database."""

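To show how `topk` and `score_threshold` interact in `similar_search_with_scores`, here is a minimal in-memory sketch. `InMemoryVectorStore`, the `Chunk` stand-in, and the letter-count `letter_embed` function are assumptions made for the example; metadata filters are left out to keep it short.

```python
import math
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Chunk:  # stand-in for the framework's Chunk type
    content: str
    metadata: Dict[str, Any] = field(default_factory=dict)
    score: float = 0.0


def letter_embed(text: str) -> List[float]:
    """Toy embedding: a 26-dimensional count of the letters a-z."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class InMemoryVectorStore:
    def __init__(self, embed_fn: Callable[[str], List[float]]):
        self._embed = embed_fn
        self._rows: List[Tuple[List[float], Chunk]] = []

    def load_document(self, chunks: List[Chunk]) -> List[str]:
        ids = []
        for chunk in chunks:
            ids.append(str(len(self._rows)))  # the row index doubles as the id
            self._rows.append((self._embed(chunk.content), chunk))
        return ids

    def similar_search_with_scores(
        self, text: str, topk: int, score_threshold: float
    ) -> List[Chunk]:
        q = self._embed(text)
        scored = [Chunk(c.content, c.metadata, cosine(q, v)) for v, c in self._rows]
        # Drop weak matches first, then keep the topk best of what remains.
        scored = [c for c in scored if c.score >= score_threshold]
        scored.sort(key=lambda c: c.score, reverse=True)
        return scored[:topk]


store = InMemoryVectorStore(letter_embed)
ids = store.load_document([Chunk("aaa"), Chunk("bbb"), Chunk("aab")])
hits = store.similar_search_with_scores("aa", topk=2, score_threshold=0.5)
```

Applying the threshold before the cut means the call may return fewer than `topk` chunks, which is usually preferable to padding the context with irrelevant text.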
def insert_triplet(self, subj: str, rel: str, obj: str) -> None:
    """Add triplet."""
    subj_query = f"MERGE (n1:{self._node_label} {{id:'{subj}'}})"
    obj_query = f"MERGE (n1:{self._node_label} {{id:'{obj}'}})"
    rel_query = (
        f"MERGE (n1:{self._node_label} {{id:'{subj}'}})"
        f"-[r:{self._edge_label} {{id:'{rel}'}}]->"
        f"(n2:{self._node_label} {{id:'{obj}'}})"
    )
    self.conn.run(query=subj_query)
    self.conn.run(query=obj_query)
    self.conn.run(query=rel_query)
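Because `insert_triplet` interpolates `subj`, `rel`, and `obj` directly into the query string, a value containing a single quote (for example `Bob's mother`) would break the statement. The sketch below builds a parameterized statement instead. `build_triplet_merge` is a hypothetical helper, and whether the target graph database and its driver accept `$name` parameters is an assumption to verify. Labels are validated rather than parameterized because labels and relationship types generally cannot be passed as query parameters.

```python
import re
from typing import Dict, Tuple

_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_triplet_merge(
    node_label: str, edge_label: str, subj: str, rel: str, obj: str
) -> Tuple[str, Dict[str, str]]:
    """Return a MERGE statement plus parameters for one (subj, rel, obj) triplet."""
    for name in (node_label, edge_label):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe label: {name!r}")
    query = (
        f"MERGE (n1:{node_label} {{id: $subj}}) "
        f"MERGE (n2:{node_label} {{id: $obj}}) "
        f"MERGE (n1)-[r:{edge_label} {{id: $rel}}]->(n2)"
    )
    return query, {"subj": subj, "rel": rel, "obj": obj}


query, params = build_triplet_merge("entity", "rel", "Bob's mother", "is mother of", "Bob")
```

Unlike the excerpt, this version issues one statement instead of three; the values travel in `params`, so quoting in entity names no longer matters.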

{
    "analysis": {"analyzer": {"default": {"type": "standard"}}},
    "similarity": {
        "custom_bm25": {
            "type": "BM25",
            "k1": self._k1,
            "b": self._b,
        }
    }
}
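The `k1` and `b` settings above control term-frequency saturation and document-length normalization. The sketch below implements the textbook Okapi BM25 formula to show their effect; it is not necessarily identical to what the search engine computes internally. The defaults `k1=1.2` and `b=0.75` are the commonly used values, and `bm25_scores` is a hypothetical helper.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(
    query: List[str], docs: List[List[str]], k1: float = 1.2, b: float = 0.75
) -> List[float]:
    """Score every tokenized document against the tokenized query."""
    if not docs:
        return []
    n_docs = len(docs)
    avgdl = (sum(len(d) for d in docs) / n_docs) or 1.0
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        # b = 0 disables length normalization; b = 1 applies it fully.
        length_norm = 1 - b + b * len(doc) / avgdl
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # k1 caps how much repeated occurrences of a term can add.
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


docs = [["disk", "full", "alert"], ["disk", "latency"], ["revenue", "report"]]
scores = bm25_scores(["disk", "full"], docs)
```

In this example the first document matches both query terms, the second matches only the more common term, and the third matches nothing, so the scores come out in that order.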

2. Knowledge retrieval

2.1 EmbeddingRetriever

class EmbeddingRetriever(BaseRetriever):
    """Embedding retriever."""
    def __init__(
        self,
        index_store: IndexStoreBase,
        top_k: int = 4,
        query_rewrite: Optional[QueryRewrite] = None,
        rerank: Optional[Ranker] = None,
        retrieve_strategy: Optional[RetrieverStrategy] = RetrieverStrategy.EMBEDDING,
    ):
        pass

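    async def _aretrieve_with_score(
        self,
        query: str,
        score_threshold: float,
        filters: Optional[MetadataFilters] = None,
    ) -> List[Chunk]:
        """Retrieve knowledge chunks with score."""
        queries = [query]
        if self._query_rewrite:
            # `context` is not defined in this excerpt; presumably it is built
            # from a first-pass search on the original query.
            new_queries = await self._query_rewrite.rewrite(
                origin_query=query, context=context, nums=1
            )
            queries.extend(new_queries)
        # Build one search task per query (the original plus its rewrites),
        # assuming `_similarity_search_with_score` is a coroutine.
        candidates_with_score = [
            self._similarity_search_with_score(
                q, score_threshold, filters, root_tracer.get_current_span_id()
            )
            for q in queries
        ]
        # Await each search and flatten the per-query results into one list.
        new_candidates_with_score: List[Chunk] = []
        for task in candidates_with_score:
            new_candidates_with_score.extend(await task)
        # Rerank the merged candidates against the original query.
        new_candidates_with_score = await self._rerank.arank(
            new_candidates_with_score, query
        )
        return new_candidates_with_score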
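The retrieval flow above has three steps: rewrite the query, search once per query, and rerank the merged candidates. The sketch below reproduces that shape with plain functions so it can be run in isolation. `retrieve`, the `fake_*` helpers, the tiny corpus, and the word-overlap score are all assumptions made for the example; deduplicating by chunk text is a design choice of this sketch rather than something the excerpt shows.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple

Scored = Tuple[str, float]  # (chunk text, score)


async def retrieve(
    query: str,
    rewrite: Callable[[str], Awaitable[List[str]]],
    search: Callable[[str], Awaitable[List[Scored]]],
    rerank: Callable[[str, List[Scored]], List[Scored]],
    top_k: int = 4,
) -> List[Scored]:
    queries = [query] + await rewrite(query)
    # Run one search per query concurrently.
    per_query = await asyncio.gather(*(search(q) for q in queries))
    # Merge results, keeping the best score seen for each distinct chunk.
    best: Dict[str, float] = {}
    for hits in per_query:
        for text, score in hits:
            best[text] = max(score, best.get(text, float("-inf")))
    return rerank(query, list(best.items()))[:top_k]


CORPUS = ["disk alert runbook", "disk troubleshooting guide", "revenue report"]


async def fake_rewrite(query: str) -> List[str]:
    return [f"{query} troubleshooting"]


async def fake_search(query: str) -> List[Scored]:
    words = set(query.lower().split())
    hits = []
    for text in CORPUS:
        overlap = len(words & set(text.lower().split()))
        if overlap:
            hits.append((text, overlap / len(words)))
    return hits


def rerank_by_score(query: str, candidates: List[Scored]) -> List[Scored]:
    return sorted(candidates, key=lambda c: c[1], reverse=True)


result = asyncio.run(retrieve("disk alert", fake_rewrite, fake_search, rerank_by_score))
```

The rewritten query raises the score of the troubleshooting guide, which the original query matched only weakly, while the runbook keeps its higher score from the original query.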

class MetadataFilter(BaseModel):
    key: str = Field(..., description="The key of metadata to filter.")
    operator: FilterOperator = Field(default=FilterOperator.EQ, description="The operator of metadata filter.")
    value: Union[str, int, float, List[str], List[int], List[float]] = Field(..., description="The value of metadata to filter.")
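A metadata filter narrows the candidate set before or alongside vector search. The sketch below shows one way such filters could be evaluated against a chunk's metadata. Only `FilterOperator.EQ` appears in the excerpt; the other operators, the dataclass stand-in for `MetadataFilter`, and the AND semantics of `matches` are assumptions.

```python
import operator as op
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, List


class FilterOperator(Enum):  # only EQ appears in the excerpt; the rest are assumed
    EQ = "=="
    GT = ">"
    LT = "<"
    IN = "in"


_OPS = {
    FilterOperator.EQ: op.eq,
    FilterOperator.GT: op.gt,
    FilterOperator.LT: op.lt,
    FilterOperator.IN: lambda actual, allowed: actual in allowed,
}


@dataclass
class MetadataFilter:  # simplified stand-in for the pydantic model
    key: str
    value: Any
    operator: FilterOperator = FilterOperator.EQ


def matches(metadata: Dict[str, Any], filters: List[MetadataFilter]) -> bool:
    """True when every filter holds (AND semantics); a missing key never matches."""
    for f in filters:
        if f.key not in metadata:
            return False
        if not _OPS[f.operator](metadata[f.key], f.value):
            return False
    return True


chunks_metadata = [
    {"year": 2023, "type": "report"},
    {"year": 2024, "type": "report"},
    {"type": "faq"},
]
filters = [
    MetadataFilter("type", "report"),
    MetadataFilter("year", 2023, FilterOperator.GT),
]
kept = [m for m in chunks_metadata if matches(m, filters)]
```

Treating a missing key as a non-match keeps chunks without the relevant metadata out of filtered results, which is the safer default when a filter expresses a hard constraint such as a reporting year.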

2.2 Graph RAG

KEYWORD_EXTRACT_PT = (
    "A question is provided below. Given the question, extract up to "
    "keywords from the text. Focus on extracting the keywords that we can use "
    "to best lookup answers to the question.\n"
    "Generate as more as possible synonyms or alias of the keywords "
    "considering possible cases of capitalization, pluralization, "
    "common expressions, etc.\n"
    "Avoid stopwords.\n"
    "Provide the keywords and synonyms in comma-separated format."
    "Formatted keywords and synonyms text should be separated by a semicolon.\n"
    "---------------------\n"
    "Example:\n"
    "Text: Alice is Bob's mother.\n"
    "Keywords:\nAlice,mother,Bob;mummy\n"
    # ... more examples
)
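The prompt's example reply, `Alice,mother,Bob;mummy`, separates keywords with commas and the synonym group with a semicolon. A minimal sketch of turning such a reply into a flat, deduplicated keyword list follows. `parse_keywords` is a hypothetical helper, and treating both separators the same way is an assumption.

```python
import re
from typing import List


def parse_keywords(llm_output: str) -> List[str]:
    """Flatten 'k1,k2;syn1' style output into unique keywords, order preserved."""
    # Ignore anything before the "Keywords:" marker if the model echoes it.
    text = llm_output.split("Keywords:", 1)[-1]
    seen = set()
    keywords = []
    for part in re.split(r"[;,\n]", text):
        keyword = part.strip()
        if keyword and keyword.lower() not in seen:
            seen.add(keyword.lower())  # case-insensitive deduplication
            keywords.append(keyword)
    return keywords


keywords = parse_keywords("Keywords:\nAlice,mother,Bob;mummy\n")
# keywords == ["Alice", "mother", "Bob", "mummy"]
```

The resulting keywords can then serve as the starting entities for a graph lookup.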

def explore(
    self,
    subs: List[str],
    direct: Direction = Direction.BOTH,
    depth: Optional[int] = None,
    fan: Optional[int] = None,
    limit: Optional[int] = None,
) -> Graph:
    """Explore on graph."""

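The `explore` signature suggests a bounded traversal starting from the extracted keywords. The sketch below is a breadth-first version over an in-memory list of triplets. The meanings given to `depth` (maximum hops), `fan` (maximum new edges expanded per node), and `limit` (maximum edges returned) are assumptions, as is treating edges as undirected, which corresponds to the `Direction.BOTH` default.

```python
from collections import deque
from typing import Dict, List, Optional, Set, Tuple

Triplet = Tuple[str, str, str]  # (subject, relation, object)


def explore(
    triplets: List[Triplet],
    subs: List[str],
    depth: Optional[int] = None,
    fan: Optional[int] = None,
    limit: Optional[int] = None,
) -> List[Triplet]:
    """Collect edges reachable from the start nodes, breadth first."""
    adjacency: Dict[str, List[Triplet]] = {}
    for t in dict.fromkeys(triplets):  # drop duplicate triplets, keep order
        adjacency.setdefault(t[0], []).append(t)
        if t[2] != t[0]:
            adjacency.setdefault(t[2], []).append(t)
    seen_nodes: Set[str] = set(subs)
    seen_edges: Set[Triplet] = set()
    result: List[Triplet] = []
    queue = deque((s, 0) for s in subs)
    while queue:
        node, hops = queue.popleft()
        if depth is not None and hops >= depth:
            continue  # do not expand beyond the hop budget
        edges = [t for t in adjacency.get(node, []) if t not in seen_edges]
        if fan is not None:
            edges = edges[:fan]
        for t in edges:
            if limit is not None and len(result) >= limit:
                return result
            seen_edges.add(t)
            result.append(t)
            neighbor = t[2] if t[0] == node else t[0]
            if neighbor not in seen_nodes:
                seen_nodes.add(neighbor)
                queue.append((neighbor, hops + 1))
    return result


T = [
    ("Alice", "is mother of", "Bob"),
    ("Bob", "works at", "Acme"),
    ("Acme", "located in", "Berlin"),
    ("Alice", "lives in", "Paris"),
]
one_hop = explore(T, ["Alice"], depth=1)
```

Bounding all three dimensions matters in practice: without them a single well-connected entity can pull a large part of the graph into the prompt.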
2.3 DBSchemaRetriever

def _similarity_search(
    self, query, filters: Optional[MetadataFilters] = None
) -> List[Chunk]:
    """Similar search."""
    table_chunks = self._table_vector_store_connector.similar_search_with_scores(
        query, self._top_k, 0, filters
    )
    not_sep_chunks = [
        chunk for chunk in table_chunks if not chunk.metadata.get("separated")
    ]
    separated_chunks = [
        chunk for chunk in table_chunks if chunk.metadata.get("separated")
    ]
    if not separated_chunks:
        return not_sep_chunks
    tasks = [
        lambda c=chunk: self._retrieve_field(c, query) for chunk in separated_chunks
    ]
    separated_result = run_tasks(tasks, concurrency_limit=3)
    return not_sep_chunks + separated_result

async def aretrieve(
    self, query: str, filters: Optional[MetadataFilters] = None
) -> List[Chunk]:
    """Retrieve knowledge chunks."""
    return await self._aretrieve(query, filters)

## Rerank model
#RERANK_MODEL=bce-reranker-base
#RERANK_TOP_K=5

Article outline

Source Code Walkthrough of Key RAG Flows

1. Knowledge processing

1.1 Knowledge loading

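1.2 Knowledge chunking

1.3 Knowledge extraction

1.4 Knowledge storage

2. Knowledge retrieval

2.1 EmbeddingRetriever

2.2 Graph RAG

2.3 DBSchemaRetriever

Optimization Approaches for Knowledge Processing and Retrieval

1. Knowledge processing optimization

1.1 Knowledge loading optimization

1.2 Keep chunks as complete as possible

1.3 Diversified information extraction

1.4 Knowledge processing workflow

2. RAG flow optimization

2.1 Static-knowledge RAG optimization

(1) Original question processing

(2) Metadata filtering

(3) Multi-strategy hybrid recall

(4) Post-filtering

(5) Display optimization + fallback responses / topic guidance

2.2 Dynamic-knowledge RAG optimization

(1) Tool asset library

(2) Tool recall

2.3 RAG evaluation

RAG Case Studies

1. RAG for data infrastructure

1.1 Background of the operations agent

1.2 Rigorous, professional RAG

1.3 Knowledge processing

1.4 Knowledge retrieval

1.5 AWEL + Agent

2. RAG for financial report analysis

Summary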
