
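Spring AI Full-Stack Development and RAG (Retrieval-Augmented Generation) in Practice

Summary (AI-generated): Spring AI is the Spring team's official framework for building AI applications, with support for model integration, vector databases, and the full RAG pipeline. Using Spring Boot 3.3.5, Spring AI 1.2.0, the Milvus vector database, and the Doubao large language model, this article walks through building an enterprise-grade RAG application: environment setup, document parsing and chunking, vectorized storage, retrieval, and the question-answering endpoint. It also covers how RAG works, prompt engineering, hybrid retrieval strategies, and keeping the vector store and the relational database consistent in production.

Author: 并发大师 | Published 2026/3/24 | Updated 2026/5/5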

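Preface

As generative AI moves into large-scale use, enterprise AI development has shifted from proof of concept to production deployment. Java is the mainstream language for enterprise development, yet it long lacked an AI framework that fits natively into the Spring ecosystem. Developers had to integrate several heterogeneous SDKs and write their own adapter logic, which slowed the delivery of core AI features. Spring AI addresses this gap: it follows Spring's design conventions and provides a unified abstraction for model access, end-to-end RAG support, and Spring Boot auto-configuration, so Java developers can build and ship enterprise AI applications at low cost.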

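I. Core Technology Stack: Principles and Selection

1.1 Spring AI Core Architecture and Design Philosophy

Spring AI is the open-source AI application framework from the Spring team. It follows the design philosophy of the Spring ecosystem and offers portable API abstractions covering mainstream model services, vector databases, document processing, RAG, and Function Calling. Its architecture has four layers:

Integration layer: wraps the SDKs of mainstream chat models, embedding models, and vector databases, hiding their differences
Abstraction layer: defines core interfaces such as ChatModel, EmbeddingModel, VectorStore, and DocumentReader, decoupling business code from the underlying services
Capability layer: provides the RAG pipeline features, including document splitting, text embedding, vector retrieval, prompt engineering, Function Calling, and memory management
Ecosystem layer: integrates with Spring Boot, Spring Security, Spring Cloud, and related components, with support for auto-configuration, dependency injection, and transaction management

The main advantages: switching the chat model or vector database is mostly a configuration change rather than a change to business code; the framework targets Spring Boot 3.x and fits into existing enterprise Java projects; and it ships with production-oriented exception handling, retry, and observability support.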
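The decoupling described above can be illustrated with a minimal sketch. Everything in it is hypothetical: `TextModel`, `EchoModel`, `UpperModel`, and `fromConfig` are stand-ins invented for illustration and are not Spring AI types. The point is only that business code depends on an interface while a configuration value selects the implementation.

```java
import java.util.Locale;
import java.util.Map;

public class PortableModelSketch {
    // Hypothetical stand-in for an abstraction such as ChatModel; not the Spring AI interface.
    interface TextModel {
        String call(String prompt);
    }

    // Two fake providers, used only to show that callers never see the concrete type.
    static final class EchoModel implements TextModel {
        public String call(String prompt) {
            return "echo:" + prompt;
        }
    }

    static final class UpperModel implements TextModel {
        public String call(String prompt) {
            return prompt.toUpperCase(Locale.ROOT);
        }
    }

    // Selects an implementation from a configuration value (assumed keys: "echo", "upper").
    static TextModel fromConfig(String provider) {
        Map<String, TextModel> registry = Map.of("echo", new EchoModel(), "upper", new UpperModel());
        TextModel model = registry.get(provider);
        if (model == null) {
            throw new IllegalArgumentException("unknown provider: " + provider);
        }
        return model;
    }

    // Business code depends only on the interface, so swapping providers needs no change here.
    static String answer(TextModel model, String question) {
        return model.call(question);
    }

    public static void main(String[] args) {
        System.out.println(answer(fromConfig("echo"), "hi"));
        System.out.println(answer(fromConfig("upper"), "hi"));
    }
}
```

In a real Spring Boot application the registry lookup would be replaced by auto-configuration and dependency injection; the sketch only mirrors the shape of that arrangement.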

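1.2 How RAG (Retrieval-Augmented Generation) Works

RAG (Retrieval-Augmented Generation) targets three pain points of large language models: hallucination, stale knowledge, and safe access to private data. Before the model generates an answer, the system retrieves context related to the user's question from a private knowledge base, combines that context with the question into a single prompt, and sends the prompt to the model. Because the model answers from the retrieved private data, the likelihood of hallucination drops. The RAG workflow has two main stages, shown in the following diagram: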

[Figure: RAG end-to-end flow diagram]
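The "retrieve, then combine with the question" step can be sketched in a few lines. This is a minimal illustration, not the article's implementation: `buildPrompt`, the instruction wording, and the sample chunks are assumptions, and the retrieval itself is taken as already done.

```java
import java.util.List;

public class RagPromptSketch {
    // Joins the retrieved chunks and places them ahead of the user's question.
    static String buildPrompt(List<String> retrievedChunks, String question) {
        String context = retrievedChunks.isEmpty()
                ? "No relevant reference content"
                : String.join("\n\n", retrievedChunks);
        return "Answer strictly from the context below.\n"
                + "Context:\n" + context + "\n\n"
                + "Question: " + question;
    }

    public static void main(String[] args) {
        // Sample chunks standing in for the output of a vector search.
        List<String> chunks = List.of(
                "HashMap is not thread-safe.",
                "ConcurrentHashMap supports concurrent access.");
        System.out.println(buildPrompt(chunks, "How do HashMap and ConcurrentHashMap differ?"));
    }
}
```

Keeping prompt assembly in one small function makes it easy to test the empty-retrieval case separately from the model call.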

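RAG and fine-tuning solve different problems, and confusing them leads to poor technology choices. The key differences:

Aspect	RAG (Retrieval-Augmented Generation)	Fine-tuning
Core capability	Brings private data in at query time; addresses stale knowledge and hallucination	Improves the model's generation style and ability in a specific domain
Data updates	Immediate; new documents need no retraining	Every data update needs another fine-tuning run, which is costly and slow
Data security	Private data never enters the model's training environment, which helps compliance	Training data must be sent to the model's training environment, with a risk of data leakage
Development cost	Low; a working system can be built in hours	High; needs large amounts of labeled data and compute
Typical use cases	Enterprise knowledge bases, customer-service bots, document Q&A, private-data queries	Domain-specific style tuning, stronger performance on vertical tasks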

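1.3 Technology Selection and Version Conventions

All hands-on content in this article uses the versions pinned below, which the author describes as the latest stable GA releases at the time of writing:

Development environment: JDK 17 (LTS)
Application framework: Spring Boot 3.3.5
AI framework: Spring AI 1.2.0
Build tool: Maven 3.9.x
Persistence framework: MyBatis-Plus 3.5.7
Database: MySQL 8.0 (LTS)
Vector database: Milvus 2.4.x (open-source, enterprise-grade, supported natively by Spring AI)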

II. Project Initialization and Environment Setup

2.1 Maven Core Dependencies

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>3.3.5</version>
        <relativePath/>
    </parent>
    <groupId>com.jam.demo</groupId>
    <artifactId>spring-ai-rag-demo</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>spring-ai-rag-demo</name>
    <description>Spring AI RAG enterprise hands-on project</description>
    <properties>
        <java.version>17</java.version>
        <spring-ai.version>1.2.0</spring-ai.version>
        <mybatis-plus.version>3.5.7</mybatis-plus.version>
        <milvus.version>2.4.5</milvus.version>
        <tika.version>2.9.2</tika.version>
        <fastjson2.version>2.0.53</fastjson2.version>
        <guava.version>33.2.1-jre</guava.version>
        <springdoc.version>2.6.0</springdoc.version>
    </properties>
    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.springframework.ai</groupId>
                <artifactId>spring-ai-bom</artifactId>
                <version>${spring-ai.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
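        <!-- Doubao starter as named by the author. If it does not resolve from the Spring AI BOM,
             one alternative (an assumption, not from the article) is spring-ai-starter-model-openai
             pointed at Doubao's OpenAI-compatible Ark endpoint. -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-starter-doubao</artifactId>
        </dependency>
        <!-- Milvus vector store starter (artifact name as of Spring AI 1.0 GA) -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-starter-vector-store-milvus</artifactId>
        </dependency>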
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-validation</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-jdbc</artifactId>
        </dependency>
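        <!-- Spring Boot 3.x needs the boot3 starter; mybatis-plus-boot-starter targets Spring Boot 2.x -->
        <dependency>
            <groupId>com.baomidou</groupId>
            <artifactId>mybatis-plus-spring-boot3-starter</artifactId>
            <version>${mybatis-plus.version}</version>
        </dependency>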
        <dependency>
            <groupId>com.mysql</groupId>
            <artifactId>mysql-connector-j</artifactId>
            <scope>runtime</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.tika</groupId>
            <artifactId>tika-core</artifactId>
            <version>${tika.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.tika</groupId>
            <artifactId>tika-parsers-standard-package</artifactId>
            <version>${tika.version}</version>
            <type>pom</type>
        </dependency>
        <dependency>
            <groupId>com.alibaba.fastjson2</groupId>
            <artifactId>fastjson2</artifactId>
            <version>${fastjson2.version}</version>
        </dependency>
        <dependency>
            <groupId>com.google.guava</groupId>
            <artifactId>guava</artifactId>
            <version>${guava.version}</version>
        </dependency>
        <dependency>
            <groupId>org.springdoc</groupId>
            <artifactId>springdoc-openapi-starter-webmvc-ui</artifactId>
            <version>${springdoc.version}</version>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <version>1.18.34</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
                <configuration>
                    <excludes>
                        <exclude>
                            <groupId>org.projectlombok</groupId>
                            <artifactId>lombok</artifactId>
                        </exclude>
                    </excludes>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>

2.2 Application Configuration

server:
  port: 8080
  servlet:
    context-path: /ai
spring:
  application:
    name: spring-ai-rag-demo
  datasource:
    driver-class-name: com.mysql.cj.jdbc.Driver
    url: jdbc:mysql://127.0.0.1:3306/ai_rag_db?useUnicode=true&characterEncoding=utf8&useSSL=false&serverTimezone=Asia/Shanghai&allowPublicKeyRetrieval=true
    username: root
    password: root
  ai:
    doubao:
      api-key: your-doubao-api-key
      base-url: https://ark.cn-beijing.volces.com/api/v3
      chat:
        options:
          model: doubao-pro-32k
          temperature: 0.3
          top-p: 0.9
          max-tokens: 4096
      embedding:
        options:
          model: doubao-embedding-text-240515
    # Milvus settings are read from spring.ai.vectorstore.milvus (prefix as of Spring AI 1.0 GA)
    vectorstore:
      milvus:
        client:
          host: 127.0.0.1
          port: 19530
          username: root
          password: milvus
        database-name: default
mybatis-plus:
  mapper-locations: classpath*:/mapper/**/*.xml
  type-aliases-package: com.jam.demo.entity
  configuration:
    map-underscore-to-camel-case: true
    cache-enabled: false
    log-impl: org.apache.ibatis.logging.stdout.StdOutImpl
springdoc:
  api-docs:
    enabled: true
    path: /v3/api-docs
  swagger-ui:
    enabled: true
    path: /swagger-ui.html
    tags-sorter: alpha
    operations-sorter: alpha

2.3 Database Initialization

CREATE DATABASE IF NOT EXISTS ai_rag_db DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
USE ai_rag_db;
DROP TABLE IF EXISTS tb_document_chunk;
CREATE TABLE tb_document_chunk (
    id BIGINT NOT NULL AUTO_INCREMENT COMMENT 'Primary key ID',
    document_id VARCHAR(64) NOT NULL COMMENT 'Unique document ID',
    document_name VARCHAR(255) NOT NULL COMMENT 'Document name',
    chunk_id INT NOT NULL COMMENT 'Chunk sequence number',
    chunk_content TEXT NOT NULL COMMENT 'Chunk text content',
    vector_id VARCHAR(64) NOT NULL COMMENT 'ID of the matching vector in the vector database',
    create_time DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'Creation time',
    update_time DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Last update time',
    PRIMARY KEY (id),
    UNIQUE KEY uk_document_chunk (document_id, chunk_id),
    KEY idx_document_id (document_id),
    KEY idx_create_time (create_time)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci COMMENT='Document chunk metadata table';

2.4 Application Entry Point and Base Configuration

package com.jam.demo;

import org.mybatis.spring.annotation.MapperScan;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
@MapperScan("com.jam.demo.mapper")
public class SpringAiRagDemoApplication {
    public static void main(String[] args) {
        SpringApplication.run(SpringAiRagDemoApplication.class, args);
    }
}

package com.jam.demo.config;

import io.swagger.v3.oas.models.OpenAPI;
import io.swagger.v3.oas.models.info.Contact;
import io.swagger.v3.oas.models.info.Info;
import io.swagger.v3.oas.models.info.License;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class SpringDocConfig {
    @Bean
    public OpenAPI openAPI() {
        return new OpenAPI()
                .info(new Info()
                        .title("Spring AI RAG Hands-On Project API")
                        .description("API documentation for the enterprise AI application and RAG retrieval system")
                        .version("v1.0.0")
                        .contact(new Contact().name("ken").email("[email protected]"))
                        .license(new License().name("Apache 2.0").url("https://www.apache.org/licenses/LICENSE-2.0")))
                ;
    }
}

package com.jam.demo.config;

import io.milvus.client.MilvusServiceClient;
import io.milvus.param.IndexType;
import io.milvus.param.MetricType;
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.ai.vectorstore.milvus.MilvusVectorStore;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class MilvusConfig {
    // Must equal the output dimension of the configured embedding model
    private static final int VECTOR_DIMENSION = 1024;
    private static final String COLLECTION_NAME = "rag_document_vector";

    // Builder method names follow the Spring AI 1.0 GA API; the MilvusServiceClient bean
    // is expected to come from the Milvus starter's auto-configuration
    @Bean
    public VectorStore vectorStore(MilvusServiceClient milvusClient, EmbeddingModel embeddingModel) {
        return MilvusVectorStore.builder(milvusClient, embeddingModel)
                .collectionName(COLLECTION_NAME)
                .embeddingDimension(VECTOR_DIMENSION)
                .indexType(IndexType.HNSW)
                .metricType(MetricType.COSINE)
                .initializeSchema(true)
                .build();
    }
}

III. Connecting Spring AI to a Large Model API

3.2 Basic Chat

package com.jam.demo.dto;

import io.swagger.v3.oas.annotations.media.Schema;
import jakarta.validation.constraints.NotBlank;
import lombok.Data;

@Data
@Schema(description = "Chat request DTO")
public class ChatRequestDTO {
    @NotBlank(message = "The user question must not be blank")
    @Schema(description = "User question", requiredMode = Schema.RequiredMode.REQUIRED, example = "What is the difference between HashMap and ConcurrentHashMap in Java?")
    private String query;
    @Schema(description = "Session ID, used for multi-turn chat", example = "123456789")
    private String sessionId;
}

package com.jam.demo.dto;

import io.swagger.v3.oas.annotations.media.Schema;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

@Data
@NoArgsConstructor
@AllArgsConstructor
@Schema(description = "Chat response DTO")
public class ChatResponseDTO {
    @Schema(description = "Answer generated by the model")
    private String answer;
    @Schema(description = "Input (prompt) tokens consumed")
    private Long promptTokens;
    @Schema(description = "Output (completion) tokens consumed")
    private Long completionTokens;
    @Schema(description = "Total tokens consumed")
    private Long totalTokens;
}

package com.jam.demo.service;

import com.jam.demo.dto.ChatRequestDTO;
import com.jam.demo.dto.ChatResponseDTO;
import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.messages.Message;
import org.springframework.ai.chat.messages.SystemMessage;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.metadata.Usage;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.stereotype.Service;
import org.springframework.util.StringUtils;
import java.util.ArrayList;
import java.util.List;

@Service
@Slf4j
public class ChatService {
    private final ChatClient chatClient;

    public ChatService(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder.build();
    }

    /**
     * Basic single-turn chat
     */
    public ChatResponseDTO singleChat(ChatRequestDTO requestDTO) {
        String query = requestDTO.getQuery();
        log.info("Received chat request, query:{}", query);
        ChatResponse chatResponse = chatClient.prompt()
                .user(query)
                .call()
                .chatResponse();
        ChatResponseDTO responseDTO = toResponseDTO(chatResponse);
        log.info("Model response complete, total tokens:{}", responseDTO.getTotalTokens());
        return responseDTO;
    }

    /**
     * Chat with an optional system prompt
     */
    public ChatResponseDTO chatWithSystemPrompt(ChatRequestDTO requestDTO, String systemPrompt) {
        String query = requestDTO.getQuery();
        log.info("Received chat request with system prompt, query:{}", query);
        // Message classes expose constructors; there is no static from(...) factory
        List<Message> messages = new ArrayList<>();
        if (StringUtils.hasText(systemPrompt)) {
            messages.add(new SystemMessage(systemPrompt));
        }
        messages.add(new UserMessage(query));
        ChatResponse chatResponse = chatClient.prompt(new Prompt(messages)).call().chatResponse();
        return toResponseDTO(chatResponse);
    }

    /**
     * Maps the model response to the DTO; longValue() works whether the usage getters
     * return Integer or Long in the Spring AI version in use
     */
    private ChatResponseDTO toResponseDTO(ChatResponse chatResponse) {
        String answer = chatResponse.getResult().getOutput().getText();
        Usage usage = chatResponse.getMetadata().getUsage();
        return new ChatResponseDTO(
                answer,
                usage.getPromptTokens().longValue(),
                usage.getCompletionTokens().longValue(),
                usage.getTotalTokens().longValue());
    }
}

package com.jam.demo.controller;

import com.jam.demo.dto.ChatRequestDTO;
import com.jam.demo.dto.ChatResponseDTO;
import com.jam.demo.service.ChatService;
import io.swagger.v3.oas.annotations.Operation;
import io.swagger.v3.oas.annotations.Parameter;
import io.swagger.v3.oas.annotations.tags.Tag;
import jakarta.validation.Valid;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/chat")
@Tag(name = "Model chat API", description = "Basic chat endpoints backed by Spring AI")
public class ChatController {
    private final ChatService chatService;

    public ChatController(ChatService chatService) {
        this.chatService = chatService;
    }

    @PostMapping("/single")
    @Operation(summary = "Single-turn chat", description = "Basic single-turn chat with no conversation memory")
    public ResponseEntity<ChatResponseDTO> singleChat(@Valid @RequestBody ChatRequestDTO requestDTO) {
        return ResponseEntity.ok(chatService.singleChat(requestDTO));
    }

    @PostMapping("/system")
    @Operation(summary = "Chat with a system prompt", description = "Accepts a custom system prompt that sets the model's role and behavior")
    public ResponseEntity<ChatResponseDTO> chatWithSystemPrompt(
            @Valid @RequestBody ChatRequestDTO requestDTO,
            @Parameter(description = "System prompt", example = "You are a senior Java expert. Answer only Java questions, and keep answers professional, concise, and backed by code examples.")
            @RequestParam(required = false) String systemPrompt
    ) {
        return ResponseEntity.ok(chatService.chatWithSystemPrompt(requestDTO, systemPrompt));
    }
}

3.3 Streaming Chat

// Add to ChatService
/**
 * Streaming chat; returns the response as a Flux
 */
public reactor.core.publisher.Flux<String> streamChat(ChatRequestDTO requestDTO) {
    String query = requestDTO.getQuery();
    log.info("Received streaming chat request, query:{}", query);
    return chatClient.prompt()
            .user(query)
            .stream()
            .content();
}

// Add to ChatController
@GetMapping(value = "/stream", produces = "text/event-stream")
@Operation(summary = "Streaming chat", description = "SSE streaming response that lets the front end render the answer as it is typed out")
public reactor.core.publisher.Flux<String> streamChat(
        @Parameter(description = "User question", required = true)
        @RequestParam String query
) {
    ChatRequestDTO requestDTO = new ChatRequestDTO();
    requestDTO.setQuery(query);
    return chatService.streamChat(requestDTO);
}
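For readers debugging a streaming endpoint, it can help to see what an SSE frame looks like on the wire. The sketch below is illustrative only: `toSseFrame` is a hypothetical helper, and with `produces = "text/event-stream"` the framework is expected to do this framing itself. Each event consists of one or more `data:` lines followed by a blank line.

```java
public class SseFrameSketch {
    // Formats one payload as a Server-Sent Events frame.
    // A multi-line payload becomes one "data:" line per line; a blank line ends the event.
    static String toSseFrame(String payload) {
        StringBuilder frame = new StringBuilder();
        for (String line : payload.split("\n", -1)) {
            frame.append("data: ").append(line).append('\n');
        }
        return frame.append('\n').toString();
    }

    public static void main(String[] args) {
        System.out.print(toSseFrame("Hello"));
        System.out.print(toSseFrame("line one\nline two"));
    }
}
```

If a client shows truncated or merged tokens, comparing the raw response against this shape is a quick first check.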

IV. Building the RAG Pipeline End to End

4.1 Document Processing and Text Chunking

package com.jam.demo.util;

import lombok.extern.slf4j.Slf4j;
import org.apache.tika.Tika;
import org.apache.tika.exception.TikaException;
import org.springframework.stereotype.Component;
import org.springframework.util.ObjectUtils;
import java.io.IOException;
import java.io.InputStream;

@Component
@Slf4j
public class DocumentParseUtil {
    private final Tika tika = new Tika();

    /**
     * Extracts the text content of a document from an input stream
     */
    public String extractText(InputStream inputStream, String fileName) throws IOException, TikaException {
        if (ObjectUtils.isEmpty(inputStream)) {
            throw new IllegalArgumentException("The document input stream must not be null");
        }
        if (!org.springframework.util.StringUtils.hasText(fileName)) {
            throw new IllegalArgumentException("The document name must not be blank");
        }
        log.info("Parsing document, fileName:{}", fileName);
        String content = tika.parseToString(inputStream);
        log.info("Document parsed, text length:{}", content.length());
        return content;
    }
}

package com.jam.demo.util;

import org.springframework.ai.document.Document;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.stereotype.Component;
import org.springframework.util.StringUtils;
import java.util.List;
import java.util.Map;

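@Component
public class TextSplitUtil {
    // Target chunk size in tokens
    private static final int DEFAULT_CHUNK_SIZE = 800;
    // Minimum chunk size in characters. TokenTextSplitter takes this as its second argument;
    // it has no chunk-overlap parameter, so 150 is not an overlap here
    private static final int MIN_CHUNK_SIZE_CHARS = 150;
    private final TokenTextSplitter tokenTextSplitter = new TokenTextSplitter(
            DEFAULT_CHUNK_SIZE,
            MIN_CHUNK_SIZE_CHARS,
            5,      // minimum chunk length to embed
            10000,  // maximum number of chunks
            true    // keep separators
    );

    /**
     * Splits text into chunks, copying the metadata onto each chunk
     */
    public List<Document> splitText(String content, Map<String, Object> metadata) {
        if (!StringUtils.hasText(content)) {
            throw new IllegalArgumentException("The text content must not be blank");
        }
        Document document = new Document(content, metadata);
        return tokenTextSplitter.split(List.of(document));
    }
}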
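Chunking with overlap, a common way to keep context across chunk boundaries, can be sketched with a character-based sliding window. This is a generic illustration under stated assumptions: `chunk`, its parameters, and the character-based counting are hypothetical, and it does not reproduce how Spring AI's `TokenTextSplitter` counts tokens or picks split points.

```java
import java.util.ArrayList;
import java.util.List;

public class OverlapChunkSketch {
    // Splits text into windows of chunkSize characters; consecutive windows share `overlap` characters.
    static List<String> chunk(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("require chunkSize > 0 and 0 <= overlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        int step = chunkSize - overlap; // how far the window advances each time
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + chunkSize, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last window reached the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        System.out.println(chunk("abcdefghij", 4, 1));
    }
}
```

A larger overlap repeats more text across chunks, which raises storage and embedding cost but lowers the chance that a sentence is cut off from its context.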

4.2 Document Metadata Entity and Mapper Layer

package com.jam.demo.entity;

import com.baomidou.mybatisplus.annotation.IdType;
import com.baomidou.mybatisplus.annotation.TableId;
import com.baomidou.mybatisplus.annotation.TableName;
import io.swagger.v3.oas.annotations.media.Schema;
import lombok.Data;
import java.time.LocalDateTime;

@Data
@TableName("tb_document_chunk")
@Schema(description = "Document chunk metadata entity")
public class DocumentChunkEntity {
    @TableId(type = IdType.AUTO)
    @Schema(description = "Primary key ID")
    private Long id;
    @Schema(description = "Unique document ID")
    private String documentId;
    @Schema(description = "Document name")
    private String documentName;
    @Schema(description = "Chunk sequence number")
    private Integer chunkId;
    @Schema(description = "Chunk text content")
    private String chunkContent;
    @Schema(description = "ID of the matching vector in the vector database")
    private String vectorId;
    @Schema(description = "Creation time")
    private LocalDateTime createTime;
    @Schema(description = "Last update time")
    private LocalDateTime updateTime;
}

package com.jam.demo.mapper;

import com.baomidou.mybatisplus.core.mapper.BaseMapper;
import com.jam.demo.entity.DocumentChunkEntity;
import org.apache.ibatis.annotations.Mapper;

@Mapper
public interface DocumentChunkMapper extends BaseMapper<DocumentChunkEntity> {
}

4.3 Document Upload and Index Building Service

package com.jam.demo.service;

import com.jam.demo.entity.DocumentChunkEntity;
import com.jam.demo.mapper.DocumentChunkMapper;
import com.jam.demo.util.DocumentParseUtil;
import com.jam.demo.util.TextSplitUtil;
import lombok.extern.slf4j.Slf4j;
import org.apache.tika.exception.TikaException;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Service;
import org.springframework.transaction.support.TransactionTemplate;
import org.springframework.util.CollectionUtils;
import org.springframework.util.StringUtils;
import org.springframework.web.multipart.MultipartFile;
import java.io.IOException;
import java.util.List;
import java.util.Map;
import java.util.UUID;

@Service
@Slf4j
public class DocumentIndexService {
    private final DocumentParseUtil documentParseUtil;
    private final TextSplitUtil textSplitUtil;
    private final VectorStore vectorStore;
    private final DocumentChunkMapper documentChunkMapper;
    private final TransactionTemplate transactionTemplate;

    public DocumentIndexService(DocumentParseUtil documentParseUtil, TextSplitUtil textSplitUtil,
                                VectorStore vectorStore, DocumentChunkMapper documentChunkMapper,
                                TransactionTemplate transactionTemplate) {
        this.documentParseUtil = documentParseUtil;
        this.textSplitUtil = textSplitUtil;
        this.vectorStore = vectorStore;
        this.documentChunkMapper = documentChunkMapper;
        this.transactionTemplate = transactionTemplate;
    }

    /**
     * Full flow: upload a document, parse it, chunk it, embed it, and persist the metadata
     */
    public String uploadDocumentAndBuildIndex(MultipartFile file) throws IOException, TikaException {
        if (file.isEmpty()) {
            throw new IllegalArgumentException("The uploaded file must not be empty");
        }
        String fileName = file.getOriginalFilename();
        if (!StringUtils.hasText(fileName)) {
            throw new IllegalArgumentException("The file name must not be blank");
        }
        String documentId = UUID.randomUUID().toString().replace("-", "");
        log.info("Processing document, documentId:{}, fileName:{}", documentId, fileName);
        String content = documentParseUtil.extractText(file.getInputStream(), fileName);
        if (!StringUtils.hasText(content)) {
            throw new RuntimeException("The document is empty; cannot build an index");
        }
        Map<String, Object> metadata = Map.of("documentId", documentId, "fileName", fileName);
        List<Document> documentList = textSplitUtil.splitText(content, metadata);
        if (CollectionUtils.isEmpty(documentList)) {
            throw new RuntimeException("Chunking produced no chunks; cannot build an index");
        }
        log.info("Document chunked, chunk count:{}", documentList.size());
        // Step 1: write vectors. The vector store does not take part in the database transaction.
        vectorStore.add(documentList);
        // Step 2: persist chunk metadata inside a database transaction.
        Boolean executeResult = transactionTemplate.execute(status -> {
            try {
                for (int i = 0; i < documentList.size(); i++) {
                    Document doc = documentList.get(i);
                    DocumentChunkEntity entity = new DocumentChunkEntity();
                    entity.setDocumentId(documentId);
                    entity.setDocumentName(fileName);
                    entity.setChunkId(i + 1);
                    entity.setChunkContent(doc.getText());
                    entity.setVectorId(doc.getId());
                    // BaseMapper has no insertBatch method; insert row by row within the transaction
                    documentChunkMapper.insert(entity);
                }
                return Boolean.TRUE;
            } catch (Exception e) {
                status.setRollbackOnly();
                log.error("Failed to persist document metadata, documentId:{}", documentId, e);
                return Boolean.FALSE;
            }
        });
        if (!Boolean.TRUE.equals(executeResult)) {
            // Compensation: remove the vectors written in step 1 so the two stores stay consistent
            List<String> vectorIdList = documentList.stream().map(Document::getId).toList();
            vectorStore.delete(vectorIdList);
            throw new RuntimeException("Index build failed; the transaction was rolled back");
        }
        log.info("Document index built, documentId:{}", documentId);
        return documentId;
    }
}
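The consistency approach in this service is a compensating action: the vector write cannot join the database transaction, so a failed metadata write is followed by deleting the vectors already written. A minimal, framework-free sketch of that pattern follows; `indexWithCompensation`, the in-memory list standing in for the vector store, and the `Consumer` standing in for the database write are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class CompensationSketch {
    // Writes to the non-transactional store first, then runs the transactional step;
    // if that step throws, the first write is undone.
    static boolean indexWithCompensation(List<String> ids, List<String> vectorStore,
                                         Consumer<List<String>> metadataWriter) {
        vectorStore.addAll(ids);            // step 1: vector write (no rollback support)
        try {
            metadataWriter.accept(ids);     // step 2: metadata write (may fail)
            return true;
        } catch (RuntimeException e) {
            vectorStore.removeAll(ids);     // compensation: delete what step 1 wrote
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> store = new ArrayList<>();
        boolean ok = indexWithCompensation(List.of("v1", "v2"), store, ids -> { });
        System.out.println(ok + " " + store);
        boolean failed = indexWithCompensation(List.of("v3"), store, ids -> {
            throw new IllegalStateException("database unavailable");
        });
        System.out.println(failed + " " + store);
    }
}
```

The compensation itself can also fail (for example, if the vector store is unreachable), so a production system would typically log the orphaned IDs or queue them for a later cleanup job.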

package com.jam.demo.controller;

import com.jam.demo.service.DocumentIndexService;
import io.swagger.v3.oas.annotations.Operation;
import io.swagger.v3.oas.annotations.tags.Tag;
import org.apache.tika.exception.TikaException;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;
import java.io.IOException;

@RestController
@RequestMapping("/document")
@Tag(name = "Document management API", description = "RAG document upload, index building, and management endpoints")
public class DocumentController {
    private final DocumentIndexService documentIndexService;

    public DocumentController(DocumentIndexService documentIndexService) {
        this.documentIndexService = documentIndexService;
    }

    @PostMapping("/upload")
    @Operation(summary = "Upload a document and build its index", description = "Supports txt, md, pdf, docx, and other formats; parsing, chunking, embedding, and index building run automatically")
    public ResponseEntity<String> uploadDocument(@RequestParam("file") MultipartFile file) throws IOException, TikaException {
        String documentId = documentIndexService.uploadDocumentAndBuildIndex(file);
        return ResponseEntity.ok("Document index built, documentId:" + documentId);
    }
}

4.4 Retrieval and RAG Question Answering

package com.jam.demo.service;

import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Service;
import org.springframework.util.StringUtils;
import java.util.List;

@Service
@Slf4j
public class RagRetrievalService {
    private final VectorStore vectorStore;
    private static final int DEFAULT_TOP_K = 4;
    private static final double DEFAULT_SIMILARITY_THRESHOLD = 0.75;

    public RagRetrievalService(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    /**
     * Basic vector similarity retrieval
     */
    public List<Document> retrieval(String query) {
        if (!StringUtils.hasText(query)) {
            throw new IllegalArgumentException("The retrieval query must not be blank");
        }
        log.info("Running RAG retrieval, query:{}", query);
        SearchRequest searchRequest = SearchRequest.builder()
                .query(query)
                .topK(DEFAULT_TOP_K)
                .similarityThreshold(DEFAULT_SIMILARITY_THRESHOLD)
                .build();
        List<Document> documentList = vectorStore.similaritySearch(searchRequest);
        log.info("RAG retrieval complete, documents recalled:{}", documentList.size());
        return documentList;
    }

    /**
     * Retrieval with caller-supplied parameters
     */
    public List<Document> retrievalWithParams(String query, int topK, double similarityThreshold) {
        if (!StringUtils.hasText(query)) {
            throw new IllegalArgumentException("The retrieval query must not be blank");
        }
        if (topK < 1 || topK > 20) {
            throw new IllegalArgumentException("topK must be between 1 and 20");
        }
        if (similarityThreshold < 0 || similarityThreshold > 1) {
            throw new IllegalArgumentException("The similarity threshold must be between 0 and 1");
        }
        SearchRequest searchRequest = SearchRequest.builder()
                .query(query)
                .topK(topK)
                .similarityThreshold(similarityThreshold)
                .build();
        return vectorStore.similaritySearch(searchRequest);
    }
}
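What `topK` and the similarity threshold do during retrieval can be shown with a small in-memory search. This is an illustrative sketch, not the vector database's algorithm: `search`, `cosine`, and the toy two-dimensional vectors are assumptions, and a real store such as Milvus uses an approximate index instead of scanning every vector. Vectors are assumed to be non-zero.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class TopKSearchSketch {
    // Cosine similarity of two equal-length, non-zero vectors.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the IDs of at most topK vectors whose similarity is >= threshold, best match first.
    static List<String> search(Map<String, double[]> index, double[] query, int topK, double threshold) {
        List<Map.Entry<String, Double>> hits = new ArrayList<>();
        for (Map.Entry<String, double[]> entry : index.entrySet()) {
            double score = cosine(query, entry.getValue());
            if (score >= threshold) {
                hits.add(Map.entry(entry.getKey(), score));
            }
        }
        hits.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < Math.min(topK, hits.size()); i++) {
            ids.add(hits.get(i).getKey());
        }
        return ids;
    }

    public static void main(String[] args) {
        Map<String, double[]> index = Map.of(
                "a", new double[]{1, 0},
                "b", new double[]{0, 1},
                "c", new double[]{1, 1});
        System.out.println(search(index, new double[]{1, 0}, 4, 0.5));
    }
}
```

Raising the threshold trims weakly related chunks before they reach the prompt, while `topK` caps how much context is sent to the model.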

package com.jam.demo.service;

import com.jam.demo.dto.ChatRequestDTO;
import com.jam.demo.dto.ChatResponseDTO;
import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.document.Document;
import org.springframework.stereotype.Service;
import org.springframework.util.CollectionUtils;
import java.util.List;
import java.util.stream.Collectors;

@Service
@Slf4j
public class RagChatService {
    private final ChatClient chatClient;
    private final RagRetrievalService ragRetrievalService;
    private static final String RAG_SYSTEM_PROMPT = """
            You are a professional question-answering assistant. Answer the user's question strictly from the reference context provided below, and never make up information.
            1. If the reference context contains the answer, base your answer on it; the answer must be accurate, complete, and logically clear
            2. If the reference context contains nothing relevant, reply exactly "No relevant content was found in the reference material, so this question cannot be answered", and do not make up anything
            3. Do not use any knowledge outside the reference context, even if you know the answer
            4. Do not mention "reference context", "reference material", or similar terms in your answer; give the answer directly
            Reference context:
            {context}
            """;

    public RagChatService(ChatClient.Builder chatClientBuilder, RagRetrievalService ragRetrievalService) {
        this.chatClient = chatClientBuilder.build();
        this.ragRetrievalService = ragRetrievalService;
    }

    /**
     * Core RAG question-answering method
     */
    public ChatResponseDTO ragChat(ChatRequestDTO requestDTO) {
        String query = requestDTO.getQuery();
        log.info("Received RAG question, query:{}", query);
        List<Document> documentList = ragRetrievalService.retrieval(query);
        String context;
        if (CollectionUtils.isEmpty(documentList)) {
            context = "No relevant reference content";
        } else {
            context = documentList.stream()
                    .map(Document::getText)
                    .collect(Collectors.joining("\n\n"));
        }
        String systemPrompt = RAG_SYSTEM_PROMPT.replace("{context}", context);
        ChatResponse chatResponse = chatClient.prompt()
                .system(systemPrompt)
                .user(query)
                .call()
                .chatResponse();
        String answer = chatResponse.getResult().getOutput().getText();
        // longValue() works whether the usage getters return Integer or Long
        Long promptTokens = chatResponse.getMetadata().getUsage().getPromptTokens().longValue();
        Long completionTokens = chatResponse.getMetadata().getUsage().getCompletionTokens().longValue();
        Long totalTokens = chatResponse.getMetadata().getUsage().getTotalTokens().longValue();
        log.info("RAG answer complete, total tokens:{}", totalTokens);
        return new ChatResponseDTO(answer, promptTokens, completionTokens, totalTokens);
    }
}

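// Add to ChatController (requires an injected RagChatService field named ragChatService)
@PostMapping("/rag")
@Operation(summary = "RAG question answering", description = "Answers questions from the private knowledge base, retrieving related content automatically to reduce hallucination")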
public ResponseEntity<ChatResponseDTO> ragChat(@Valid @RequestBody ChatRequestDTO requestDTO) {
    return ResponseEntity.ok(ragChatService.ragChat(requestDTO));
}

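V. Production Best Practices and Pitfalls

5.2 Common Pitfalls and Fixes

Pitfall	Cause	Fix
Vector writes fail with a dimension mismatch	The embedding model's output dimension differs from the dimension configured in the vector database	When switching embedding models, update the vector database dimension setting at the same time so the two match exactly
The model hallucinates often and answers contradict the context	Weak system-prompt constraints, noisy retrieval results, or poor chunking	Tighten the system prompt to constrain the model; raise the similarity threshold to cut noise; refine chunking so each chunk is semantically complete
Streaming responses break off or show garbled text	The front-end EventSource is misconfigured, or the back-end response does not follow the SSE format	Set produces to text/event-stream; make sure each event starts with data: and ends with \n\n; point the front-end EventSource at the correct URL
Parsed documents are garbled and Chinese text displays incorrectly	Incompatible document encoding or a misconfigured Tika parser	Upgrade Tika to the latest stable release and specify UTF-8 as the document encoding; avoid uploading encrypted documents
The model API is rate-limited under high concurrency	The model API enforces a QPS limit and concurrent requests exceed it	Integrate Resilience4j for rate limiting and retries; pool requests to cap concurrency; process non-real-time requests asynchronously
Vector search performance degrades	The vector count is too large or the index configuration is poorly chosen	Use an HNSW index for large datasets and tune its parameters; partition data by time or business dimension and search within partitions; optimize the vector database indexes periodically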
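The retry advice for rate-limited model calls can be sketched with the standard library alone. This is a simplified stand-in, not Resilience4j: `callWithRetry`, its parameters, and the doubling delay are assumptions chosen for illustration, and a production setup would normally use a library that also handles jitter, rate limiting, and metrics.

```java
import java.util.function.Supplier;

public class RetrySketch {
    // Calls the supplier up to maxAttempts times, doubling the wait after each failure.
    static <T> T callWithRetry(Supplier<T> call, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(delay);
                    } catch (InterruptedException interrupted) {
                        Thread.currentThread().interrupt(); // preserve the interrupt flag
                        throw new IllegalStateException("retry interrupted", interrupted);
                    }
                    delay *= 2; // exponential backoff
                }
            }
        }
        throw last; // every attempt failed; surface the last error
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("rate limited");
            }
            return "ok";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

Retrying only makes sense for transient failures such as rate limiting or timeouts; errors like an invalid API key should fail immediately.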

