Building an Adaptive RAG System: LangGraph, FastAPI, and Streamlit in Practice | 极客日志

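This walkthrough uses LangGraph to orchestrate the workflow, with a FastAPI backend and a Streamlit frontend, to build an assistant that adjusts retrieval depth to query complexity. It covers building a FAISS vector store, the adaptive retrieval logic, and state-graph management, and it is meant as a base for production extensions such as hybrid retrieval, hallucination monitoring, and containerized deployment.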

By 赛博朋克 | Published 2026/4/8 | Updated 2026/5/218 views

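We will build a technical-support assistant that interprets the user's query, picks a retrieval depth based on the question's complexity (Adaptive RAG), runs the reasoning workflow in LangGraph, returns the result through FastAPI, and renders the response in a Streamlit UI. The scenario targets a real pain point: when a team works against a large document set, conventional RAG often misses the point on vague queries or multi-step questions.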

Technical Overview

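Adaptive RAG can be thought of as RAG that "thinks before it searches". Simple queries get lightweight retrieval; complex questions automatically switch to multi-hop deep search, reranking, or query expansion, accepting some extra latency in exchange for higher accuracy.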
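As a rough illustration of "thinking before searching", the sketch below maps cheap surface features of a query to a shallow or deep retrieval plan. The `RetrievalPlan` class, the `COMPLEX_CUES` list, and `plan_retrieval` are hypothetical names introduced here for illustration; only the k values of 3 and 8 come from the article's own heuristic, and a real system would likely need a stronger complexity signal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievalPlan:
    """Hypothetical container for the retrieval settings chosen per query."""
    k: int              # number of chunks to fetch
    expand_query: bool  # whether to generate query variants first
    rerank: bool        # whether to rerank the fetched chunks


# Hypothetical cue words hinting at a multi-part or comparative question.
COMPLEX_CUES = ("compare", "difference", "versus", "why", "step", "and then")


def plan_retrieval(query: str) -> RetrievalPlan:
    """Pick a shallow or deep plan from cheap surface features of the query."""
    text = query.lower()
    word_count = len(text.split())
    has_cue = any(cue in text for cue in COMPLEX_CUES)
    if word_count < 6 and not has_cue:
        # Short, direct question: lightweight retrieval is assumed to suffice.
        return RetrievalPlan(k=3, expand_query=False, rerank=False)
    # Longer or cue-bearing question: spend more latency on a deeper search.
    return RetrievalPlan(k=8, expand_query=True, rerank=True)
```

Substring matching on cue words is deliberately crude (for example, "step" also matches "steps"); it only shows where a routing decision would sit.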

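LangGraph is a framework for building stateful, multi-step AI workflows. Unlike a conventional chain of calls, it models the LLM workflow as a graph: each node is one step (retrieve → reason → verify → respond), and the graph natively supports retries, memory, loops, and failover. When behavior has to stay predictable in production, this abstraction is considerably more flexible than a linear chain.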
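To make the loop idea concrete, here is a minimal sketch of a graph that routes back to retrieval when a check after the reasoning step decides the context was too thin. Everything in it is an assumption for illustration: `LoopState`, `MAX_ATTEMPTS`, the stand-in `retrieve` and `reason` nodes, and the "at least 6 chunks" check are hypothetical and are not part of the article's workflow. LangGraph is imported inside `build_graph` so the node functions can be exercised without it installed.

```python
from typing import List, TypedDict

MAX_ATTEMPTS = 2  # hypothetical retry budget


class LoopState(TypedDict):
    question: str
    docs: List[str]
    answer: str
    attempts: int


def retrieve(state: LoopState) -> dict:
    # Stand-in retriever: widen the search on each attempt.
    attempts = state.get("attempts", 0) + 1
    docs = [f"chunk-{i}" for i in range(3 * attempts)]
    return {"docs": docs, "attempts": attempts}


def reason(state: LoopState) -> dict:
    # Stand-in for the LLM call.
    return {"answer": f"Answer based on {len(state['docs'])} chunks"}


def route_after_verify(state: LoopState) -> str:
    """Loop back to retrieval until enough context was seen or the budget is spent."""
    enough_context = len(state["docs"]) >= 6
    if enough_context or state["attempts"] >= MAX_ATTEMPTS:
        return "done"
    return "retry"


def build_graph():
    # Imported lazily so the functions above can be tested without langgraph.
    from langgraph.graph import END, StateGraph

    graph = StateGraph(LoopState)
    graph.add_node("retrieve", retrieve)
    graph.add_node("reason", reason)
    graph.set_entry_point("retrieve")
    graph.add_edge("retrieve", "reason")
    # The router's return value selects the next node: loop back or finish.
    graph.add_conditional_edges(
        "reason", route_after_verify, {"retry": "retrieve", "done": END}
    )
    return graph.compile()
```

With LangGraph installed, `build_graph().invoke({"question": "..."})` should run retrieve and reason, then either loop or stop depending on the router. A linear chain has no natural place for that decision, which is the flexibility the paragraph above refers to.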

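FastAPI wraps Adaptive RAG and LangGraph behind an HTTP API, handles request dispatch, and is a natural fit for asynchronous I/O. The frontend is built with Streamlit: a chat-style interface that needs no HTML or CSS, which is enough for a POC demo.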

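System Architecture

The data flows as follows:

User → Streamlit UI: submits the query
Streamlit UI → FastAPI: sends the request
FastAPI → LangGraph: passes the query
LangGraph → Retriever: runs Adaptive RAG
Retriever → Vector DB: gets chunks
Vector DB → LangGraph: returns results
LangGraph → FastAPI: generates the final response
FastAPI → Streamlit UI → User: sends the response to the UI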

Project Structure

The project structure is kept as lean as possible; the core files are laid out as follows:

ai-poc/
├── backend/              # Backend logic
│   ├── app.py            # FastAPI API server
│   ├── rag_pipeline.py   # Adaptive RAG retrieval
│   ├── graph_workflow.py # LangGraph workflow
│   ├── config.py         # Configuration and environment settings
│   ├── data/             # Source documents
│   └── __init__.py       # Package initializer
├── frontend/             # UI layer
│   ├── ui.py             # Streamlit interface
│   └── __init__.py       # Package initializer
├── .env                  # API keys and secrets
├── requirements.txt      # Project dependencies
└── README.md             # Setup instructions
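The tree lists `config.py`, but the article does not show its contents. The sketch below is one plausible shape for it, offered purely as an assumption: the `Settings` fields and the environment variable names `EMBEDDING_MODEL`, `RAG_SHALLOW_K`, and `RAG_DEEP_K` are hypothetical, and only `OPENAI_API_KEY`, the embedding model name, and the default k values of 3 and 8 appear elsewhere in the article.

```python
# backend/config.py (hypothetical contents; the article does not show this file)
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    embedding_model: str
    shallow_k: int  # chunks fetched for short queries
    deep_k: int     # chunks fetched for longer queries


def load_settings(env=None) -> Settings:
    """Read settings from an environment mapping, falling back to defaults."""
    env = os.environ if env is None else env
    return Settings(
        openai_api_key=env.get("OPENAI_API_KEY", ""),
        embedding_model=env.get(
            "EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2"
        ),
        shallow_k=int(env.get("RAG_SHALLOW_K", "3")),
        deep_k=int(env.get("RAG_DEEP_K", "8")),
    )
```

Accepting the environment as a parameter keeps the loader easy to test; in the running app it would be called with no arguments after `load_dotenv()`.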

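requirements.txt lists the following dependencies. langchain-openai and langchain-text-splitters are listed explicitly so that the ChatOpenAI and text-splitter imports resolve:

fastapi
uvicorn[standard]
streamlit
requests
pydantic
langchain
langchain-community
langchain-openai
langchain-text-splitters
langgraph
faiss-cpu
sentence-transformers
openai
python-dotenv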

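Core Implementation

Adaptive Retrieval Pipeline (rag_pipeline.py)

The key idea here is to adjust retrieval depth dynamically based on query length. We define a class that wraps the FAISS vector store.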

# backend/rag_pipeline.py
from typing import List

from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter


class AdaptiveRAG:
    """Adaptive retrieval pipeline: picks the number of chunks (k) per query."""

    def __init__(self, vector_db: FAISS):
        self.db = vector_db

    def retrieve(self, query: str) -> List[Document]:
        if not query.strip():
            return []
        # Heuristic: choose k from the whitespace-separated word count.
        # This is a rough proxy for complexity, not a real token count.
        word_count = len(query.split())
        k = 3 if word_count < 6 else 8
        return self.db.similarity_search(query, k=k)


def build_vector_store(texts: List[str]) -> FAISS:
    """Build a FAISS index from raw texts (POC stage)."""
    embeddings = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-MiniLM-L6-v2"
    )
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000,
        chunk_overlap=100,
    )
    # Split every source text and collect the chunks into one flat list.
    chunks = [chunk for text in texts for chunk in splitter.split_text(text)]
    return FAISS.from_texts(chunks, embeddings)
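The k-selection heuristic can be checked without downloading an embedding model by swapping in a stub store. The sketch below is an illustration under that assumption: `StubStore` and `adaptive_retrieve` are hypothetical names, and `adaptive_retrieve` simply restates the word-count rule from `AdaptiveRAG.retrieve` as a free function.

```python
from typing import List


class StubStore:
    """Stand-in for a vector store; records the k it was asked for."""

    def __init__(self, chunks: List[str]):
        self.chunks = chunks
        self.last_k = None

    def similarity_search(self, query: str, k: int = 4) -> List[str]:
        self.last_k = k
        return self.chunks[:k]


def adaptive_retrieve(store, query: str) -> List[str]:
    """Same word-count heuristic as AdaptiveRAG.retrieve, as a free function."""
    if not query.strip():
        return []
    k = 3 if len(query.split()) < 6 else 8
    return store.similarity_search(query, k=k)


store = StubStore([f"chunk {i}" for i in range(10)])

# A 3-word query falls under the threshold and takes the shallow path.
short_hits = adaptive_retrieve(store, "What is LangGraph?")
short_k = store.last_k

# An 11-word query crosses the threshold and takes the deep path.
long_hits = adaptive_retrieve(
    store, "How do I configure retries and fallbacks in a LangGraph workflow?"
)
long_k = store.last_k
```

One caveat follows directly from `str.split()`: text written without spaces, such as most Chinese queries, yields very few whitespace-separated words, so it would take the shallow path regardless of length. A character count or a real tokenizer would be a better signal for such input.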

LangGraph Workflow (graph_workflow.py)

# backend/graph_workflow.py
from typing import List, TypedDict

from langchain_core.documents import Document
from langchain_openai import ChatOpenAI
from langgraph.graph import END, StateGraph


class GraphState(TypedDict):
    question: str
    docs: List[Document]
    answer: str


def create_workflow(rag):
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
    workflow = StateGraph(GraphState)

    async def retrieve_node(state: GraphState):
        # Fetch chunks with the adaptive retriever and store them in the state.
        docs = rag.retrieve(state["question"])
        return {"docs": docs}

    async def reasoning_node(state: GraphState):
        question = state["question"]
        docs = state.get("docs", [])
        context = "\n\n".join(d.page_content for d in docs)
        # Built line by line so no source indentation leaks into the prompt.
        prompt = (
            "You are a technical assistant. Use ONLY the context below to answer the question.\n"
            "If the answer is not in the context, say you don't know.\n\n"
            f"Context:\n{context}\n\n"
            f"Question: {question}"
        )
        response = await llm.ainvoke(prompt)
        return {"answer": response.content}

    # Linear graph: retrieve -> reason -> END.
    workflow.add_node("retrieve", retrieve_node)
    workflow.add_node("reason", reasoning_node)
    workflow.set_entry_point("retrieve")
    workflow.add_edge("retrieve", "reason")
    workflow.add_edge("reason", END)
    return workflow.compile()
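The reasoning node joins every retrieved chunk into the prompt as-is. With k=8 and 1000-character chunks that can grow large, so one option is to number the chunks and cap the context size. The helper below is a sketch under that assumption; `build_prompt` and its `max_chars` budget are hypothetical and not part of the article's code.

```python
from textwrap import dedent
from typing import List


def build_prompt(question: str, chunks: List[str], max_chars: int = 4000) -> str:
    """Number each chunk and stop adding chunks once the character budget is hit."""
    parts, used = [], 0
    for i, chunk in enumerate(chunks, start=1):
        if used + len(chunk) > max_chars:
            break
        parts.append(f"[{i}] {chunk}")
        used += len(chunk)
    context = "\n\n".join(parts) if parts else "(no context retrieved)"
    # Same instructions as the article's prompt, dedented to drop source indentation.
    header = dedent("""\
        You are a technical assistant. Use ONLY the context below to answer the question.
        If the answer is not in the context, say you don't know.""")
    return f"{header}\n\nContext:\n{context}\n\nQuestion: {question}"
```

Inside `reasoning_node` this would be called as `build_prompt(question, [d.page_content for d in docs])`. Numbering the chunks also gives the model something to cite, which may help when checking answers against sources.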

FastAPI Backend (app.py)

# backend/app.py
import logging

from dotenv import load_dotenv
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

# Relative imports, because the server is started as `uvicorn backend.app:app`.
from .graph_workflow import create_workflow
from .rag_pipeline import AdaptiveRAG, build_vector_store

load_dotenv()
logger = logging.getLogger(__name__)
app = FastAPI(title="Adaptive RAG API")
workflow = None  # compiled LangGraph workflow, set on startup


class AskRequest(BaseModel):
    query: str


@app.on_event("startup")
async def startup_event():
    global workflow
    # Sample knowledge base (replace with real documents in production).
    sample_docs = [
        "LangGraph supports stateful workflows and retry logic.",
        "Adaptive RAG dynamically changes retrieval depth based on query complexity.",
        "FastAPI is a high-performance async Python framework.",
    ]
    vector_db = build_vector_store(sample_docs)
    rag = AdaptiveRAG(vector_db)
    workflow = create_workflow(rag)


@app.post("/ask")
async def ask(payload: AskRequest):
    if not payload.query.strip():
        raise HTTPException(status_code=400, detail="Query cannot be empty")
    try:
        result = await workflow.ainvoke({"question": payload.query})
        return {"response": result["answer"]}
    except Exception as e:
        # Log the real cause server-side; return a generic message to the client.
        logger.exception("RAG workflow failed")
        raise HTTPException(status_code=500, detail="Internal RAG processing error") from e

Streamlit UI (ui.py)

# frontend/ui.py
import requests
import streamlit as st

API_URL = "http://localhost:8000/ask"

st.set_page_config(page_title="Adaptive RAG Assistant")
st.title("Adaptive RAG Support Assistant")

query = st.text_input("Enter your question")
if st.button("Ask"):
    if not query.strip():
        st.warning("Please enter a question.")
    else:
        try:
            # Show a spinner only while waiting for the backend.
            with st.spinner("Thinking..."):
                response = requests.post(API_URL, json={"query": query}, timeout=60)
                response.raise_for_status()
                answer = response.json()["response"]
            st.markdown("### Answer:")
            st.write(answer)
        except Exception as e:
            # Covers connection errors, HTTP errors, and malformed replies.
            st.error(f"Error: {e}")

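Running the Project

Install the dependencies and set the API key, then start the backend and the frontend in two separate terminals from the project root:

pip install -r requirements.txt
export OPENAI_API_KEY="your_key_here"
uvicorn backend.app:app --reload
streamlit run frontend/ui.py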
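Once the backend is up, the `/ask` endpoint can be exercised without the UI. The snippet below is a minimal standard-library sketch of such a smoke test; `build_request` and `ask` are hypothetical helper names, and the URL assumes the default uvicorn address that `ui.py` also uses.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/ask"  # same address the Streamlit UI uses


def build_request(query: str, url: str = API_URL) -> urllib.request.Request:
    """Build the POST request carrying the JSON body the /ask endpoint expects."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(query: str, url: str = API_URL, timeout: float = 60.0) -> str:
    """Send the question and return the "response" field of the JSON reply."""
    with urllib.request.urlopen(build_request(query, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]
```

Calling `ask("What does LangGraph support?")` from a Python shell should return the generated answer as a string when the server is running and the API key is valid; otherwise `urllib` raises an error describing the connection or HTTP failure.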
