Knowledge Base Q&A Bot: A Complete Implementation with Spring AI + RAG | 极客日志


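Summary (AI-generated): A complete implementation plan for building a knowledge base Q&A bot with Spring AI and RAG. It covers RAG principles, project structure, a custom vector store, document chunking and vectorization, the core Q&A service logic, and controller development. By integrating Tika document parsing, HanLP Chinese word segmentation, and a Zhipu large model, it implements document upload, retrieval-augmented generation, and streaming responses. The approach includes two modes, an in-memory vector store implementation and the standard Spring AI vector store, and is suitable for rapid prototyping and production deployment.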

邪神洛基 · Published 2026/4/6 · Updated 2026/5/2634 views

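I. Introduction

With the rapid development of large language models, RAG (Retrieval-Augmented Generation) has become one of the core techniques for building knowledge base Q&A systems. This article shows how to build, from scratch, a knowledge base Q&A bot with document upload support using the Spring AI framework.

1.1 What is RAG?

RAG (Retrieval-Augmented Generation) combines information retrieval with text generation. The basic workflow is:

1. The user asks a question.
2. The system retrieves relevant information from the knowledge base.
3. The large language model generates an answer based on the retrieved information.

From a system design perspective, the core role of RAG is to have the system dynamically construct a "minimal and relevant knowledge context" before the LLM is called to generate a response.

Dynamic: every question is different, so the retrieved knowledge differs as well.
Minimal: only the necessary information is injected, which avoids context-window overflow and attention competition.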
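The "dynamic" and "minimal" properties can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: the class `MinimalContextSketch`, its methods, and the word-overlap score are hypothetical and are not part of the project, which uses a vector store instead. The `\W+` split only suits English toy text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: select the few passages relevant to a question and build a prompt. */
class MinimalContextSketch {

    /** Toy relevance score: number of distinct question words that also occur in the passage. */
    static int overlap(String question, String passage) {
        Set<String> words = new HashSet<>(Arrays.asList(question.toLowerCase().split("\\W+")));
        words.remove("");
        words.retainAll(new HashSet<>(Arrays.asList(passage.toLowerCase().split("\\W+"))));
        return words.size();
    }

    /** Dynamic: the result depends on the question. Minimal: at most topK passages, none with score 0. */
    static List<String> retrieve(String question, List<String> knowledgeBase, int topK) {
        List<String> candidates = new ArrayList<>();
        for (String passage : knowledgeBase) {
            if (overlap(question, passage) > 0) candidates.add(passage);
        }
        candidates.sort(Comparator.comparingInt((String p) -> overlap(question, p)).reversed());
        return new ArrayList<>(candidates.subList(0, Math.min(topK, candidates.size())));
    }

    /** Places the retrieved passages in front of the question before the LLM call. */
    static String buildPrompt(String question, List<String> context) {
        return "Context:\n" + String.join("\n", context) + "\n\nQuestion: " + question;
    }
}
```

Passages that share no word with the question never reach the prompt, which is the point of keeping the context minimal.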

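1.2 Where RAG sits in the interaction chain

In an enterprise knowledge base scenario, RAG sits between the user's question and the request sent to the LLM, where it retrieves related documents to build the context.

[Figure: RAG interaction chain]

1.3 How RAG works

RAG relies on retrieval and generation working together; for the detailed flow, refer to the relevant technical documentation.

[Figure: How RAG works]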

II. Core Implementation

2.1 Project structure overview

D05-rag-qa-bot/
├── src/main/java/com/git/hui/springai/app/
│   ├── D05Application.java # application entry point
│   ├── mvc/
│   │   ├── QaApiController.java # API controller
│   │   └── QaController.java # page controller
│   ├── qa/QaBoltService.java # Q&A service
│   └── vectorstore/
│       ├── DocumentChunker.java # document chunking utility
│       ├── DocumentQuantizer.java # document quantizer
│       └── TextBasedVectorStore.java # text-based vector store
├── src/main/resources/
│   ├── application.yml # configuration file
│   ├── prompts/qa-prompts.pt # prompt template
│   └── templates/chat.html # front-end page
└── pom.xml # dependency configuration

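2.2 Project initialization

2.2.1 Maven dependency configuration

Configure the required dependencies in pom.xml. The vector store and Tika document parsing are core dependencies. HanLP is used for Chinese word segmentation. The example uses Zhipu's free large model; you can switch to the OpenAI starter instead.

<!-- Spring artifact versions are assumed to be managed by the Spring Boot / Spring AI BOMs -->
<dependencies>
    <!-- vector store advisors -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-advisors-vector-store</artifactId>
    </dependency>
    <!-- Tika document parsing -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-tika-document-reader</artifactId>
    </dependency>
    <!-- PDF document parsing -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-pdf-document-reader</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-rag</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- Zhipu large model -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-model-zhipuai</artifactId>
    </dependency>
    <!-- front-end page rendering -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-thymeleaf</artifactId>
    </dependency>
    <!-- HanLP Chinese word segmentation -->
    <dependency>
        <groupId>com.hankcs</groupId>
        <artifactId>hanlp</artifactId>
        <version>portable-1.8.4</version>
    </dependency>
</dependencies>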

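The configuration in application.yml:

spring:
  ai:
    zhipuai:
      # resolved from the Spring environment, e.g. a system property or startup argument
      api-key: ${zhipuai-api-key}
      # chat options belong under spring.ai.zhipuai
      chat:
        options:
          model: GLM-4-Flash
          temperature: 0.1
  thymeleaf:
    cache: false
  servlet:
    multipart:
      max-file-size: 10MB
      max-request-size: 50MB
logging:
  level:
    org.springframework.ai.chat.client.advisor.SimpleLoggerAdvisor: debug
    org.springframework.ai.chat.client: DEBUG
server:
  port: 8080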

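TextBasedVectorStore, the custom in-memory vector store:

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArraySet;
import java.util.function.Predicate;

import lombok.Getter;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.SimpleVectorStoreContent;
import org.springframework.ai.vectorstore.observation.AbstractObservationVectorStore;
import org.springframework.util.CollectionUtils;

// Excerpt: the builder, constructor, doDelete and doFilterPredicate are not shown here.
public class TextBasedVectorStore extends AbstractObservationVectorStore {
    // Chunk id -> stored chunk (text, metadata, embedding)
    @Getter
    protected Map<String, SimpleVectorStoreContent> store = new ConcurrentHashMap<>();
    // MD5 hashes of files that have already been indexed
    private Set<String> persistMd5 = new CopyOnWriteArraySet<>();

    @Override
    public void doAdd(List<Document> documents) {
        if (CollectionUtils.isEmpty(documents)) return;
        // Skip documents whose file hash is already indexed; documents without a hash are always indexed.
        List<Document> mutableDocuments = new ArrayList<>();
        for (Document document : documents) {
            String md5 = (String) document.getMetadata().get("md5");
            if (md5 == null || !persistMd5.contains(md5)) {
                mutableDocuments.add(document);
            }
        }
        if (CollectionUtils.isEmpty(mutableDocuments)) return;

        // Split into chunks, vectorize each chunk, and store it under the chunk id.
        List<Document> chunkers = DocumentChunker.DEFAULT_CHUNKER.chunkDocuments(mutableDocuments);
        chunkers.forEach(document -> {
            float[] embedding = DocumentQuantizer.quantizeDocument(document);
            if (embedding.length == 0) return; // nothing to index for empty text
            SimpleVectorStoreContent storeContent = new SimpleVectorStoreContent(
                document.getId(), document.getText(), document.getMetadata(), embedding
            );
            this.store.put(document.getId(), storeContent);
        });
        mutableDocuments.forEach(document -> {
            String md5 = (String) document.getMetadata().get("md5");
            if (md5 != null) persistMd5.add(md5);
        });
    }

    @Override
    public List<Document> doSimilaritySearch(SearchRequest request) {
        Predicate<SimpleVectorStoreContent> documentFilterPredicate = this.doFilterPredicate(request);
        final float[] userQueryEmbedding = this.getUserQueryEmbedding(request.getQuery());
        // Score every stored chunk, drop those below the threshold, return the topK best.
        return this.store.values().stream()
            .filter(documentFilterPredicate)
            .map(content -> content.toDocument(DocumentQuantizer.calculateCosineSimilarity(userQueryEmbedding, content.getEmbedding())))
            .filter(document -> document.getScore() >= request.getSimilarityThreshold())
            .sorted(Comparator.comparing(Document::getScore).reversed())
            .limit((long) request.getTopK())
            .toList();
    }

    private float[] getUserQueryEmbedding(String query) {
        return DocumentQuantizer.quantizeQuery(query);
    }
}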
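The ranking step of the similarity search (filter by threshold, sort by score descending, keep the top K) can be tried in isolation. The following is a minimal sketch with a hypothetical `TopKSketch` class that works on plain id-to-score pairs rather than on Spring AI `Document` objects.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical sketch of the threshold + topK selection used in a similarity search. */
class TopKSketch {

    /** Returns the ids whose score is at least threshold, best first, at most topK of them. */
    static List<String> select(Map<String, Double> scores, double threshold, int topK) {
        return scores.entrySet().stream()
                .filter(entry -> entry.getValue() >= threshold)
                .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
                .limit(topK)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

With scores a=0.9, b=0.4, c=0.7, d=0.55, a threshold of 0.5 and topK of 2, the call returns a and c: b is below the threshold and d is cut by the limit.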

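DocumentChunker, which splits documents into chunks:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

import org.springframework.ai.document.Document;

public class DocumentChunker {
    // Shared instance referenced by TextBasedVectorStore
    public static final DocumentChunker DEFAULT_CHUNKER = new DocumentChunker();

    private final int maxChunkSize;
    private final int overlapSize;

    public DocumentChunker() {
        this(500, 50); // defaults: at most 500 characters per chunk, 50 characters of overlap
    }

    public DocumentChunker(int maxChunkSize, int overlapSize) {
        this.maxChunkSize = maxChunkSize;
        this.overlapSize = overlapSize;
    }

    // Not shown in the original excerpt; assumed to chunk each document in turn.
    public List<Document> chunkDocuments(List<Document> documents) {
        List<Document> result = new ArrayList<>();
        for (Document document : documents) result.addAll(chunkDocument(document));
        return result;
    }

    public List<Document> chunkDocument(Document document) {
        String content = document.getText();
        if (content == null || content.trim().isEmpty()) return List.of(document);

        List<String> chunks = splitText(content);
        List<Document> chunkedDocuments = new ArrayList<>();
        for (int i = 0; i < chunks.size(); i++) {
            String chunkId = document.getId() + "_chunk_" + i;
            // Each chunk copies the parent metadata and records its position in the parent.
            Document chunkDoc = new Document(chunkId, chunks.get(i), new HashMap<>(document.getMetadata()));
            chunkDoc.getMetadata().put("chunk_index", i);
            chunkDoc.getMetadata().put("total_chunks", chunks.size());
            chunkDoc.getMetadata().put("original_document_id", document.getId());
            chunkedDocuments.add(chunkDoc);
        }
        return chunkedDocuments;
    }

    private List<String> splitText(String text) {
        List<String> chunks = new ArrayList<>();
        // Split after sentence-ending punctuation (Chinese and ASCII) and after blank lines.
        String[] sentences = text.split("(?<=[。！!？?])|(?<=\\n\\n)");
        StringBuilder currentChunk = new StringBuilder();

        for (String sentence : sentences) {
            if (sentence.trim().isEmpty()) continue;
            if (currentChunk.length() > 0 && currentChunk.length() + sentence.length() > maxChunkSize) {
                // Flush the full chunk, then seed the next one with its tail as overlap.
                String previous = currentChunk.toString();
                chunks.add(previous);
                currentChunk = new StringBuilder();
                if (sentence.length() + overlapSize <= maxChunkSize) {
                    currentChunk.append(previous.substring(Math.max(0, previous.length() - overlapSize)));
                }
            }
            if (currentChunk.length() + sentence.length() <= maxChunkSize) {
                currentChunk.append(sentence);
            } else {
                // A single sentence longer than maxChunkSize (currentChunk is empty here): hard split.
                List<String> subChunks = forceSplit(sentence, maxChunkSize);
                chunks.addAll(subChunks.subList(0, subChunks.size() - 1));
                currentChunk.append(subChunks.get(subChunks.size() - 1));
            }
        }
        if (currentChunk.length() > 0) chunks.add(currentChunk.toString());
        return chunks;
    }

    // Not shown in the original excerpt; assumed to cut an oversized sentence into fixed-size pieces.
    private static List<String> forceSplit(String text, int size) {
        List<String> pieces = new ArrayList<>();
        for (int start = 0; start < text.length(); start += size) {
            pieces.add(text.substring(start, Math.min(text.length(), start + size)));
        }
        return pieces;
    }
}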
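What "overlap" means is easiest to see on a character-level sliding window, without sentence boundaries. The sketch below is a simplified, hypothetical `OverlapChunkSketch`, not the chunker used in the project: every chunk starts `overlapSize` characters before the previous chunk ended, so text cut at a boundary still appears intact in one of the two neighbouring chunks.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: fixed-size character windows that overlap by overlapSize characters. */
class OverlapChunkSketch {

    static List<String> chunk(String text, int maxChunkSize, int overlapSize) {
        if (maxChunkSize <= 0 || overlapSize < 0 || overlapSize >= maxChunkSize) {
            throw new IllegalArgumentException("require 0 <= overlapSize < maxChunkSize");
        }
        List<String> chunks = new ArrayList<>();
        int step = maxChunkSize - overlapSize; // how far the window advances each time
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(text.length(), start + maxChunkSize);
            chunks.add(text.substring(start, end));
            if (end == text.length()) break; // the last window reached the end of the text
        }
        return chunks;
    }
}
```

For the input "abcdefghij" with a window of 4 and an overlap of 1, the chunks are "abcd", "defg" and "ghij": each chunk begins with the last character of the one before it.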

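DocumentQuantizer, which turns text into a fixed-length word-frequency vector:

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

import com.hankcs.hanlp.HanLP;
import com.hankcs.hanlp.seg.Segment;
import com.hankcs.hanlp.seg.common.Term;
import org.springframework.ai.document.Document;

public class DocumentQuantizer {
    private static final Segment SEGMENT = HanLP.newSegment();
    // Placeholder stop-word list (an assumption; the original list is not shown in the excerpt).
    private static final Set<String> STOP_WORDS = Set.of("的", "了", "是", "在", "和");

    // Assumed thin wrappers: the excerpt calls these methods but does not show them.
    public static float[] quantizeDocument(Document document) {
        return quantizeText(document.getText());
    }

    public static float[] quantizeQuery(String query) {
        return quantizeText(query);
    }

    public static float[] quantizeText(String text) {
        if (text == null || text.trim().isEmpty()) return new float[0];
        String[] words = preprocessText(text);
        Map<String, Integer> wordFreq = countWordFrequency(words);
        return generateFixedLengthVector(wordFreq, 128);
    }

    // Segment with HanLP, then drop stop words and punctuation (natures starting with "w").
    private static String[] preprocessText(String text) {
        List<Term> termList = SEGMENT.seg(text);
        return termList.stream()
            .filter(term -> !isStopWord(term.word))
            .filter(term -> !term.nature.toString().startsWith("w"))
            .map(term -> term.word.toLowerCase())
            .toArray(String[]::new);
    }

    // Assumed helper, not shown in the excerpt.
    private static boolean isStopWord(String word) {
        return STOP_WORDS.contains(word);
    }

    // Assumed helper, not shown in the excerpt.
    private static Map<String, Integer> countWordFrequency(String[] words) {
        Map<String, Integer> wordFreq = new HashMap<>();
        for (String word : words) wordFreq.merge(word, 1, Integer::sum);
        return wordFreq;
    }

    // Slot i holds the count of the i-th most frequent word, so a slot encodes
    // frequency rank rather than word identity.
    private static float[] generateFixedLengthVector(Map<String, Integer> wordFreq, int length) {
        float[] vector = new float[length];
        List<Map.Entry<String, Integer>> sortedEntries = wordFreq.entrySet().stream()
            .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
            .limit(length)
            .collect(Collectors.toList());
        for (int i = 0; i < sortedEntries.size(); i++) {
            vector[i] = sortedEntries.get(i).getValue();
        }
        return vector;
    }

    public static double calculateCosineSimilarity(float[] vectorA, float[] vectorB) {
        if (vectorA == null || vectorB == null || vectorA.length == 0 || vectorB.length == 0) return 0.0;
        // Compare only the dimensions both vectors have.
        int minLength = Math.min(vectorA.length, vectorB.length);
        double dotProduct = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < minLength; i++) {
            dotProduct += vectorA[i] * vectorB[i];
            normA += vectorA[i] * vectorA[i];
            normB += vectorB[i] * vectorB[i];
        }
        normA = Math.sqrt(normA);
        normB = Math.sqrt(normB);
        if (normA == 0 || normB == 0) return 0.0;
        return dotProduct / (normA * normB);
    }
}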
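Because the vector above is filled by frequency rank, two texts with entirely different words but the same frequency profile get a cosine similarity of 1. One common alternative is feature hashing, where each word is mapped to a slot by its hash so that the same word always lands in the same dimension. The sketch below is a swapped-in technique, not the author's method; `HashedVectorSketch` and its methods are hypothetical names, and the input is assumed to be an already segmented word list.

```java
import java.util.List;

/** Hypothetical sketch: bag-of-words vector built with feature hashing, plus cosine similarity. */
class HashedVectorSketch {

    /** Adds 1 to the slot chosen by the word's hash; the same word always hits the same slot. */
    static float[] vectorize(List<String> words, int length) {
        float[] vector = new float[length];
        for (String word : words) {
            vector[Math.floorMod(word.hashCode(), length)] += 1f;
        }
        return vector;
    }

    /** Cosine similarity over the dimensions both vectors share; 0 if either vector is all zeros. */
    static double cosine(float[] a, float[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < Math.min(a.length, b.length); i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) return 0.0;
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

With 128 slots, the words "cat" and "dog" hash to different slots, so their vectors are orthogonal, while the same words in a different order give a similarity of about 1. Distinct words can still collide in one slot, and a production setup would normally use a real embedding model instead.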

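Registering the vector store. Mode 1 uses the custom in-memory TextBasedVectorStore:

@Bean
public VectorStore vectorStore() {
    return TextBasedVectorStore.builder().build();
}

Mode 2 uses the standard Spring AI SimpleVectorStore, which relies on an EmbeddingModel. Register only one of the two beans:

@Bean
public VectorStore vectorStore(EmbeddingModel embeddingModel) {
    return SimpleVectorStore.builder(embeddingModel).build();
}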

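QaBoltService, the Q&A service that wires the ChatClient with its advisors:

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.MessageChatMemoryAdvisor;
import org.springframework.ai.chat.client.advisor.SimpleLoggerAdvisor;
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.model.ModelOptionsUtils;
import org.springframework.ai.rag.advisor.RetrievalAugmentationAdvisor;
import org.springframework.ai.rag.generation.augmentation.ContextualQueryAugmenter;
import org.springframework.ai.rag.preretrieval.query.transformation.RewriteQueryTransformer;
import org.springframework.ai.rag.retrieval.search.VectorStoreDocumentRetriever;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.core.io.Resource;
import org.springframework.stereotype.Service;

@Service
public class QaBoltService {
    private final ChatClient chatClient;
    private final ChatMemory chatMemory;
    private final VectorStore vectorStore;

    // System prompt template loaded from the classpath
    @Value("classpath:/prompts/qa-prompts.pt")
    private Resource boltPrompts;

    public QaBoltService(ChatClient.Builder builder, VectorStore vectorStore, ChatMemory chatMemory) {
        this.vectorStore = vectorStore;
        this.chatMemory = chatMemory;
        this.chatClient = builder.defaultAdvisors(
            // Logs requests and responses as pretty-printed JSON
            new SimpleLoggerAdvisor(ModelOptionsUtils::toJsonStringPrettyPrinter, ModelOptionsUtils::toJsonStringPrettyPrinter, 0),
            // Keeps per-conversation chat history
            MessageChatMemoryAdvisor.builder(chatMemory).build(),
            // RAG: rewrite the query, retrieve from the vector store, augment the prompt
            RetrievalAugmentationAdvisor.builder()
                .queryTransformers(RewriteQueryTransformer.builder().chatClientBuilder(builder.build().mutate()).build())
                // allowEmptyContext(true): still answer when nothing is retrieved
                .queryAugmenter(ContextualQueryAugmenter.builder().allowEmptyContext(true).build())
                .documentRetriever(VectorStoreDocumentRetriever.builder()
                    .similarityThreshold(0.50)
                    .vectorStore(vectorStore)
                    .build())
                .build()
        ).build();
    }
    // processFiles(...) and ask(...), shown next, are also members of this class.
}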

Processing uploaded files (a method of QaBoltService):

// ProceedInfo, calculateHash and ATTACHMENT_TEMPLATE are defined elsewhere in the class and are not shown.
private ProceedInfo processFiles(String chatId, Collection<MultipartFile> files) {
    StringBuilder context = new StringBuilder("\n\n");
    List<Media> mediaList = new ArrayList<>(); // never filled in this excerpt
    files.forEach(file -> {
        try {
            byte[] bytes = file.getBytes();
            var data = new ByteArrayResource(bytes);
            // The hash lets the vector store skip files it has already indexed.
            var md5 = calculateHash(chatId, bytes);
            String contentType = file.getContentType();
            if (contentType == null) return; // unknown type: skip this file
            // getOriginalFilename() is the uploaded file's name; getName() would be the form field name.
            String fileName = file.getOriginalFilename();
            MimeType mime = MimeType.valueOf(contentType);
            if (mime.equalsTypeAndSubtype(MediaType.APPLICATION_PDF)) {
                // PDF: one Document per page
                PagePdfDocumentReader pdfReader = new PagePdfDocumentReader(data, PdfDocumentReaderConfig.builder()
                    .withPageTopMargin(0)
                    .withPageExtractedTextFormatter(ExtractedTextFormatter.builder().withNumberOfTopTextLinesToDelete(0).build())
                    .withPagesPerDocument(1).build());
                List<Document> documents = pdfReader.read();
                documents.forEach(document -> {
                    document.getMetadata().put("md5", md5);
                    if (document.getMetadata().containsKey("file_name") && document.getMetadata().get("file_name") == null) {
                        document.getMetadata().put("file_name", fileName);
                    }
                });
                vectorStore.add(documents);
                var content = String.join("\n", documents.stream().map(Document::getText).toList());
                context.append(String.format(ATTACHMENT_TEMPLATE, fileName, content));
            } else if ("text".equalsIgnoreCase(mime.getType())) {
                // text/*: parsed with Tika; other content types are ignored
                List<Document> documents = new TikaDocumentReader(data).read();
                documents.forEach(document -> document.getMetadata().put("md5", md5));
                vectorStore.add(documents);
                var content = String.join("\n", documents.stream().map(Document::getText).toList());
                context.append(String.format(ATTACHMENT_TEMPLATE, fileName, content));
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    });
    return new ProceedInfo(context.toString(), mediaList);
}
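The `calculateHash(chatId, bytes)` helper is called above but its body is not shown. A minimal sketch of what such a helper could look like, assuming it is an MD5 digest over the chat id followed by the file bytes (the key is named `md5`, but the actual implementation is unknown):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch of a per-conversation file hash; not the project's actual helper. */
class FileHashSketch {

    /** Returns the lowercase hex MD5 of the chat id (UTF-8) followed by the file content. */
    static String calculateHash(String chatId, byte[] content) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            md5.update(chatId.getBytes(StandardCharsets.UTF_8));
            md5.update(content);
            StringBuilder hex = new StringBuilder();
            for (byte b : md5.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException(e);
        }
    }
}
```

Including the chat id means the same file uploaded in two conversations gets two different hashes, so deduplication applies per conversation. MD5 is adequate for deduplication here but should not be used where collision resistance matters.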

Answering a question with a streaming response (a method of QaBoltService):

public Flux<String> ask(String chatId, String question, Collection<MultipartFile> files) {
    // Index the uploaded files first; the returned ProceedInfo is not used in this excerpt.
    processFiles(chatId, files);
    // Use <...> as placeholder delimiters; <query> and <question_answer_context> are filled by the advisor.
    PromptTemplate customPromptTemplate = PromptTemplate.builder()
        .renderer(StTemplateRenderer.builder().startDelimiterToken('<').endDelimiterToken('>').build())
        .template("""
            <query>
            Context information is below.
            ---------------------
            <question_answer_context>
            ---------------------
            Given the context information and no prior knowledge, answer the query. Follow these rules:
            1. If the answer is not in the context, just say that you don't know.
            2. Avoid statements like "Based on the context..." or "The provided information...".
            """).build();

    // Retrieve at most 3 chunks with similarity >= 0.5 and inject them through the template above.
    var qaAdvisor = QuestionAnswerAdvisor.builder(vectorStore)
        .searchRequest(SearchRequest.builder().similarityThreshold(0.5d).topK(3).build())
        .promptTemplate(customPromptTemplate)
        .build();

    var requestSpec = chatClient.prompt()
        .system(boltPrompts)
        .user(question)
        .advisors(qaAdvisor)
        // Bind chat memory to this conversation
        .advisors(a -> a.param(ChatMemory.CONVERSATION_ID, chatId));

    // Stream the answer; line breaks become <br/> so they survive in the HTML page.
    return requestSpec.stream().content().map(s -> s.replace("\n", "<br/>"));
}
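To make the role of the `<query>` and `<question_answer_context>` placeholders concrete, here is a toy renderer that substitutes angle-bracket placeholders in a single pass. It is only a sketch with a hypothetical `AngleTemplateSketch` class; it does not reproduce how `StTemplateRenderer` or StringTemplate work internally.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: replace <name> placeholders with values, leaving unknown names untouched. */
class AngleTemplateSketch {

    private static final Pattern PLACEHOLDER = Pattern.compile("<([A-Za-z_]+)>");

    static String render(String template, Map<String, String> values) {
        // Single pass over the template, so text inside a substituted value is never re-expanded.
        return PLACEHOLDER.matcher(template).replaceAll(match ->
                Matcher.quoteReplacement(values.getOrDefault(match.group(1), match.group())));
    }
}
```

Rendering the template "<query>\n<question_answer_context>" with the values query=Q and question_answer_context=C gives "Q\nC". Angle-bracket delimiters are presumably chosen so that literal braces in a prompt, for example JSON snippets, are not mistaken for placeholders.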

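QaApiController, the API controller:

import java.util.Collection;
import java.util.Collections;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.MediaType;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.multipart.MultipartFile;
import reactor.core.publisher.Flux;

@RestController
@RequestMapping("/api")
public class QaApiController {
    @Autowired
    private QaBoltService qaBolt;

    // Plain question without attachments, answered as a server-sent event stream
    @GetMapping(path = "/chat/{chatId}", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public Flux<String> qaGet(@PathVariable("chatId") String chatId, @RequestParam("question") String question) {
        return qaBolt.ask(chatId, question, Collections.emptyList());
    }

    // Question with optional file uploads
    @PostMapping(path = "/chat/{chatId}", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public Flux<String> qaPost(@PathVariable("chatId") String chatId,
                               @RequestParam("question") String question,
                               @RequestParam(value = "files", required = false) Collection<MultipartFile> files) {
        if (files == null) files = Collections.emptyList();
        return qaBolt.ask(chatId, question, files);
    }
}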
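The GET endpoint can be exercised from any HTTP client that reads the response line by line. The following is a minimal sketch using the JDK's `java.net.http.HttpClient`; the class `ChatStreamClientSketch`, the chat id `demo`, and the base URL `http://localhost:8080` (taken from the configured port) are assumptions for illustration, and the `main` method needs the application to be running.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of a client for GET /api/chat/{chatId}?question=... */
class ChatStreamClientSketch {

    /** Builds the request URI; the question is URL-encoded, the chat id is assumed to be URL-safe. */
    static URI chatUri(String baseUrl, String chatId, String question) {
        String encoded = URLEncoder.encode(question, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/api/chat/" + chatId + "?question=" + encoded);
    }

    public static void main(String[] args) throws Exception {
        HttpRequest request = HttpRequest
                .newBuilder(chatUri("http://localhost:8080", "demo", "What is RAG?"))
                .header("Accept", "text/event-stream")
                .GET()
                .build();
        // Print each line of the event stream as it arrives.
        HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofLines())
                .body()
                .forEach(System.out::println);
    }
}
```

Each streamed fragment is expected to arrive as a server-sent event line prefixed with `data:`, so a real client would strip that prefix before displaying the text.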

D05Application, the application entry point:

@SpringBootApplication
public class D05Application {
    // Registers the custom in-memory vector store
    @Bean
    public VectorStore vectorStore() {
        return TextBasedVectorStore.builder().build();
    }

    public static void main(String[] args) {
        SpringApplication.run(D05Application.class, args);
        System.out.println("Started. Front-end test URL: http://localhost:8080/chat");
    }
}

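The system prompt template, prompts/qa-prompts.pt:

## Role
You are an intelligent Q&A assistant responsible for answering accurately and extracting information based on the documents the user provides.

## Core tasks
- Read and understand the documents uploaded by the user carefully
- Answer the user's questions based on the information in the documents
- Provide answers that are accurate, relevant, and grounded in the documents
- When a question goes beyond the documents, state clearly that the information is not mentioned in them

## Workflow
1. Analyze the uploaded documents and extract the key information
2. Understand the user's question
3. Look up the information in the documents that is relevant to the question
4. Combine the relevant information into a structured answer
5. If no relevant information can be found in the documents, say so

## Answer guidelines
- Answer strictly based on the documents; do not make up information
- When citing specific information from the documents, keep it faithful to the original text
- If a question covers several points, list them clearly one by one
- For uncertain content, express the uncertainty honestly instead of guessing
- Keep answers concise while making sure the information is complete

## Notes
- Do not answer beyond the content of the documents
- When a question is vague or unclear, you may ask the user for more detail
- If the documents contain nothing relevant, you must tell the user explicitly
- Maintain a professional and polite tone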
