Building a Local Private Knowledge Base with Llama3 and LangChain: A RAG Walkthrough | 极客日志

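Summary (AI-generated): This article shows how to build a local private knowledge base with the Llama3 model and the LangChain framework, backed by the Weaviate vector database. It uses RAG to mitigate the staleness and hallucination problems of large language models, and covers environment setup, corpus loading and chunking, vectorization and storage, retrieval augmentation, and the question-answering chain. Parameter errors in the original code have been corrected, and the article provides a complete Python example plus a troubleshooting guide for common problems, to help developers get started quickly with local AI applications.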

By 雾岛听风. Published 2025/2/6, updated 2026/5/2321 views.

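LLMs suffer from stale knowledge and hallucination, and RAG (retrieval-augmented generation) is a core technique for addressing both. This article shows how to build a local private knowledge base on Llama3 and LangChain for personalized knowledge management.

Prerequisites

Before starting, make sure the following are set up:

Install Ollama and the Llama3 model: follow the official documentation for local deployment.
Install Python 3.9+: make sure a working Python environment is available.
Install LangChain: it orchestrates the LLM application logic.
Install the Weaviate client: it connects to the vector database.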

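# langchain-community provides the loaders, embeddings, and vector store imported below.
# The examples use the v3-style weaviate.Client API, hence the version pin (adjust if your setup differs).
pip3 install langchain langchain-community "weaviate-client<4" requests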
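
Before running the examples, it can help to confirm that the packages are actually importable in the current environment. The following is a minimal sketch using only the standard library. The helper name `missing_modules` and the `REQUIRED` list are illustrative assumptions, not part of LangChain or Weaviate. Note that it checks import names (for example `weaviate`), which can differ from pip package names (`weaviate-client`).

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# Import names used by the examples in this article (an assumption based on
# the import statements in the code, not an official requirements list).
REQUIRED = ["langchain", "langchain_community", "weaviate", "requests"]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

This only verifies that the modules can be located. It does not check versions, and it does not check that the Ollama service is running or that the Llama3 model has been pulled.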

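RAG workflow

The core of RAG is to retrieve relevant context from a vector database and pass it to the LLM for generation. The preparation steps are: gather the text, split it into chunks, then embed the chunks and store them in the vector database.

1. Download and load the corpus

We use a public text file as the example. First download the file, then load it with TextLoader.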

import requests
from langchain_community.document_loaders import TextLoader

# Download the file
url = "https://raw.githubusercontent.com/langchain-ai/langchain/master/docs/docs/modules/state_of_the_union.txt"
res = requests.get(url, timeout=30)
res.raise_for_status()  # fail early instead of saving an error page
with open("state_of_the_union.txt", "w", encoding="utf-8") as f:
    f.write(res.text)

# Load the file
loader = TextLoader("./state_of_the_union.txt", encoding="utf-8")
documents = loader.load()

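2. Chunk the corpus

The raw document may exceed the LLM's context window, so it has to be split into chunks of a suitable size. LangChain offers several splitters; here we use CharacterTextSplitter.

from langchain.text_splitter import CharacterTextSplitter

# chunk_size sets the target size of each chunk (in characters);
# chunk_overlap sets how many characters adjacent chunks share
text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = text_splitter.split_documents(documents)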
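
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a simplified fixed-window chunker in plain Python. It is not how CharacterTextSplitter is implemented (that class splits on a separator first and then merges the pieces). The function name `chunk_text` is hypothetical, and the sketch only illustrates the sliding-window idea behind the two parameters.

```python
def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into windows of chunk_size characters that share
    chunk_overlap characters with the previous window."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
        start += step
    return chunks


if __name__ == "__main__":
    print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1))
    # ['abcd', 'defg', 'ghij']
```

The overlap means a sentence that straddles a chunk boundary is more likely to appear intact in at least one chunk, at the cost of storing some text twice.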

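3. Embed and store in the vector database

To support semantic search, the text chunks are converted to vectors and stored. Here Ollama generates the vectors and Weaviate serves as the vector database.

import weaviate
from weaviate.embedded import EmbeddedOptions
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Weaviate

# Start an embedded Weaviate instance
client = weaviate.Client(
    embedded_options=EmbeddedOptions()
)

# Embed the chunks with Ollama and store them in Weaviate
vectorstore = Weaviate.from_documents(
    client=client,
    documents=chunks,
    embedding=OllamaEmbeddings(model="llama3"),
    by_text=False
)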
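
Conceptually, semantic search compares the query vector with every stored vector and returns the closest chunks. The sketch below shows that idea with cosine similarity over a tiny in-memory list. It is an illustration under simplifying assumptions: the function names `cosine_similarity` and `top_k` are hypothetical, the vectors are hand-written rather than produced by an embedding model, and a real vector database uses an index instead of a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, stored, k=2):
    """stored is a list of (text, vector) pairs; return the k most similar texts."""
    ranked = sorted(
        stored,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings", written by hand for illustration only.
    stored = [
        ("economy", [1.0, 0.0]),
        ("weather", [0.0, 1.0]),
        ("economy and weather", [1.0, 1.0]),
    ]
    print(top_k([1.0, 0.1], stored, k=2))
    # ['economy', 'economy and weather']
```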

4. Retrieval and augmentation

from langchain.prompts import ChatPromptTemplate

# Define the retriever
retriever = vectorstore.as_retriever()

# Define the chat prompt template
template = """You are an assistant for question-answering tasks.
   Use the following pieces of retrieved context to answer the question.
   If you don't know the answer, just say that you don't know.
   Use three sentences maximum and keep the answer concise.
   Question: {question}
   Context: {context}
   Answer:
   """
prompt = ChatPromptTemplate.from_template(template)
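
The "augmentation" step amounts to placing the retrieved text into the `{context}` slot and the user question into the `{question}` slot. The following plain-Python sketch shows that substitution with `str.format`. The names `TEMPLATE`, `format_docs`, and `build_prompt` are hypothetical, the template is a shortened stand-in, and the chunks are plain strings here, whereas a LangChain retriever returns document objects whose text would need to be extracted first.

```python
# Shortened stand-in for the prompt template (an assumption for brevity).
TEMPLATE = (
    "Use the following pieces of retrieved context to answer the question.\n"
    "Question: {question}\n"
    "Context: {context}\n"
    "Answer:"
)


def format_docs(chunks):
    """Join retrieved chunks into one context string, separated by blank lines."""
    return "\n\n".join(chunk.strip() for chunk in chunks)


def build_prompt(question, chunks):
    """Fill the template with the question and the joined context."""
    return TEMPLATE.format(question=question, context=format_docs(chunks))


if __name__ == "__main__":
    retrieved = ["  Retrieval happens first. ", "Generation happens second."]
    print(build_prompt("What is the order of steps in RAG?", retrieved))
```

Joining the chunks into a clean string before they reach the prompt keeps metadata and object representations out of the text the model sees.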

5. Generating the answer

from langchain_community.chat_models import ChatOllama
from langchain.schema.runnable import RunnablePassthrough
from langchain.schema.output_parser import StrOutputParser

# Initialize the LLM. temperature controls randomness: 0.7 is a common setting,
# and lower values give more deterministic answers
llm = ChatOllama(model="llama3", temperature=0.7)

# Build the RAG chain
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# Run a query
query = "What did the president mainly say?"
result = rag_chain.invoke(query)
print(result)
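
The chain is essentially a composition of three stages: retrieve context for the question, build a prompt from both, and generate an answer from the prompt. The sketch below expresses that data flow with ordinary functions. It is not LangChain's implementation: `make_rag_chain` is a hypothetical helper, and the retriever and model are stubs so that the example runs without Ollama or Weaviate.

```python
def make_rag_chain(retrieve, build_prompt, generate):
    """Compose the three RAG stages into a single callable."""
    def chain(question):
        context = retrieve(question)              # stage 1: fetch relevant chunks
        prompt = build_prompt(question, context)  # stage 2: augment the question
        return generate(prompt)                   # stage 3: produce the answer
    return chain


if __name__ == "__main__":
    # Stubs standing in for the vector store retriever and the LLM.
    def fake_retrieve(question):
        return ["The speech focused on the economy."]

    def simple_prompt(question, context):
        return f"Question: {question}\nContext: {' '.join(context)}\nAnswer:"

    def fake_generate(prompt):
        return f"(stub answer based on a prompt of {len(prompt)} characters)"

    rag = make_rag_chain(fake_retrieve, simple_prompt, fake_generate)
    print(rag("What did the president mainly say?"))
```

Keeping the stages separate makes each one replaceable: a different retriever, prompt, or model can be swapped in without touching the others.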

Complete code example

from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings
import weaviate
from weaviate.embedded import EmbeddedOptions
from langchain.prompts import ChatPromptTemplate
from langchain_community.chat_models import ChatOllama
from langchain.schema.runnable import RunnablePassthrough
from langchain.schema.output_parser import StrOutputParser
from langchain_community.vectorstores import Weaviate
import requests

# 1. Download the data
url = "https://raw.githubusercontent.com/langchain-ai/langchain/master/docs/docs/modules/state_of_the_union.txt"
res = requests.get(url, timeout=30)
res.raise_for_status()  # fail early instead of saving an error page
with open("state_of_the_union.txt", "w", encoding="utf-8") as f:
    f.write(res.text)

# 2. Load the data
loader = TextLoader("./state_of_the_union.txt", encoding="utf-8")
documents = loader.load()

# 3. Split the text into chunks
text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = text_splitter.split_documents(documents)

# 4. Initialize the vector database and embed the documents
client = weaviate.Client(
    embedded_options=EmbeddedOptions()
)
vectorstore = Weaviate.from_documents(
    client=client,
    documents=chunks,
    embedding=OllamaEmbeddings(model="llama3"),
    by_text=False
)

# 5. Retriever
retriever = vectorstore.as_retriever()

# 6. LLM prompt template
template = """You are an assistant for question-answering tasks.
   Use the following pieces of retrieved context to answer the question.
   If you don't know the answer, just say that you don't know.
   Use three sentences maximum and keep the answer concise.
   Question: {question}
   Context: {context}
   Answer:
   """
prompt = ChatPromptTemplate.from_template(template)

# 7. Initialize the LLM
llm = ChatOllama(model="llama3", temperature=0.7)

# 8. Build the RAG chain
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# 9. Query and generate
query = "What did the president mainly say?"
print(rag_chain.invoke(query))
