Spring AI: A Power Tool for Java Developers Building AI Applications

Ne0inhk

15 Mar 2026 — 15 min read

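With generative AI sweeping the industry, Java developers often find themselves in an awkward spot: to add AI capabilities to an existing Spring project, they are pushed to learn LangChain or LlamaIndex from the Python ecosystem, and then to adapt to the differing API formats of OpenAI, Tongyi Qianwen (通义千问), and other models. It is like turning an unfamiliar screw with a familiar tool: slow and error-prone.

Spring AI changes this. As the enterprise AI framework from the Spring team itself, it carries Spring's usual design philosophy of decoupling through abstraction and working out of the box into the AI domain, so Java developers can build stable, extensible AI applications in familiar Spring style without switching technology stacks. This article walks through the core concepts, hands-on examples, and enterprise practices you need to get productive with Spring AI.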

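1. Understanding Spring AI: What Problem Does It Actually Solve?

Before writing any code, one point needs to be clear: Spring AI is not "yet another AI model". It is the bridge between enterprise systems and AI models. Its core value is that it addresses three pain points Java developers hit when integrating AI:

1.1 Why Do We Need Spring AI?

Pain point 1: fragmented model APIs. OpenAI's ChatCompletion, Tongyi Qianwen's ChatCompletions, and Gemini's GenerateContent each have their own API format, parameter names, and response structure, so switching models means refactoring code.
Pain point 2: disconnection from the Spring ecosystem. Traditional AI frameworks such as LangChain come from the Python world. Integrating them into a Spring Boot project takes a lot of glue code and cannot use Spring's dependency injection, auto-configuration, or security features.
Pain point 3: missing enterprise capabilities. Production systems need secure API key management, call monitoring, retry and fallback, vector store integration, and so on. Traditional frameworks either do not support these or leave you to implement them by hand.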

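Spring AI targets each of these problems:

Unified API abstraction: ChatClient fronts every chat model and EmbeddingModel fronts every embedding model, so switching models is mostly a matter of changing the dependency and configuration.
Native Spring experience: it supports auto-configuration (@EnableAutoConfiguration) and dependency injection, and integrates with Spring Boot and Spring Cloud.
Built-in enterprise features: observability (Micrometer metrics), security (key encryption), tool calling, RAG pipeline orchestration, and other production-grade capabilities.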
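
To make the "unified API abstraction" point concrete, here is a minimal plain-Java sketch of the idea, not Spring AI's actual types. The interface ChatBackend, the class UnifiedChatSketch, and the two fake providers are hypothetical names invented for illustration; the only thing the sketch is meant to show is that business code depends on one interface while the provider is picked by a configuration value.

```java
// Hypothetical stand-in for a unified chat interface (not a Spring AI type).
interface ChatBackend {
    String reply(String userMessage);
}

final class UnifiedChatSketch {

    // Chooses a provider from a configuration value. The two "providers" are
    // fakes that just echo the message; real ones would call a remote API.
    static ChatBackend fromConfig(String provider) {
        switch (provider) {
            case "openai":
                return msg -> "[openai] echo: " + msg;
            case "tongyi":
                return msg -> "[tongyi] echo: " + msg;
            default:
                throw new IllegalArgumentException("Unknown provider: " + provider);
        }
    }

    // Business code: it never mentions a concrete provider type.
    static String answer(String provider, String question) {
        return fromConfig(provider).reply(question);
    }

    public static void main(String[] args) {
        System.out.println(answer("openai", "hello"));
        System.out.println(answer("tongyi", "hello"));
    }
}
```

Swapping "openai" for "tongyi" changes which backend answers without touching answer(), which is the property the framework provides at a much larger scale.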

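1.2 Core Components and Layered Architecture

Spring AI's architecture follows the principle of layered separation of responsibilities. From bottom to top it can be divided into 5 layers, each solving a specific problem:

Layer	Core components	Responsibility
External services	OpenAI, Tongyi Qianwen, Ollama, etc.	The models and vector databases (such as Pinecone or Milvus) that actually provide the AI capability
Data support	VectorStore, DocumentReader	Handling enterprise data: vector storage (for embeddings), document readers (parsing PDF/Word), text splitters
Model abstraction	ChatClient, EmbeddingModel	A unified interface to AI capabilities: hides API differences between models and offers synchronous and streaming calls
Capability enhancement	PromptTemplate, RAG orchestration	Improving AI output: structured prompts, retrieval-augmented generation, tool calling (for example, calling enterprise APIs)
Application	Intelligent customer service, document Q&A, etc.	Business applications built on the components above

(The architecture diagram that accompanied this section in the original post is not reproduced here.)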
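1.3 Comparison with Mainstream AI Frameworks

Many developers ask how Spring AI differs from LangChain. The table below compares the key differences:

Framework	Language	Spring integration	Enterprise features	Best suited for
Spring AI	Java	★★★★★ (native)	★★★★★ (built in)	Enterprise Java AI applications; upgrading existing Spring projects
LangChain	Python	★★☆☆☆ (needs glue code)	★★★☆☆ (needs extensions)	AI prototyping in the Python ecosystem; personal projects
LangChain4j	Java	★★★☆☆ (partial)	★★★☆☆ (basic)	Simple Java AI applications without complex enterprise requirements
Dify	Low-code	★★☆☆☆ (via API)	★★★★☆ (partial)	Non-technical users who want to assemble an AI application quickly

Conclusion: if you are a Java developer, or you need to integrate AI into an enterprise project built on the Spring ecosystem, Spring AI is the best fit.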

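2. Hands-On: Building a Spring AI Application from Scratch

Next, three hands-on cases take us from the basics to more advanced use:

Basic case: plain chat (against OpenAI or Tongyi Qianwen)
Intermediate case: streaming chat (real-time output, like ChatGPT's typing effect)
Enterprise case: RAG document Q&A (combined with a vector store, so the AI reads your documents)

2.1 Environment Setup

First confirm that your development environment meets these requirements:

JDK 17+ (the minimum Spring AI requires; JDK 17 is recommended)
Spring Boot 3.2+ (must be compatible with your Spring AI version; this article uses 3.2.5)
Maven 3.8+ or Gradle 8.0+
An API key for an AI model (for example, an OpenAI API key or a Tongyi Qianwen API key)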

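2.2 Basic Case 1: Plain Chat (with OpenAI)

Requirement: build an HTTP endpoint that accepts a user question and returns the AI's answer.

Step 1: Add dependencies

Add the Spring AI OpenAI starter and Spring Web to pom.xml. Note that milestone versions such as 1.0.0-M1 are published to the Spring milestone repository (https://repo.spring.io/milestone) rather than Maven Central, so that repository must also be declared in the build.

    <dependencies>
        <!-- Spring AI OpenAI starter (auto-configures the chat client builder) -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
            <version>1.0.0-M1</version> <!-- check the official site for the latest stable release -->
        </dependency>
        <!-- Spring Web (provides the HTTP endpoint) -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
    </dependencies>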

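Step 2: Configure the API key

Configure the OpenAI key and model in src/main/resources/application.properties. Comments go on their own lines: in a .properties file, a # that follows a value is read as part of the value, not as a comment.

    # OpenAI API configuration
    # Replace with your own key
    spring.ai.openai.api-key=sk-your-openai-api-key
    # Model name
    spring.ai.openai.chat.options.model=gpt-3.5-turbo
    # Creativity: 0 is the most deterministic; higher values give more varied output
    spring.ai.openai.chat.options.temperature=0.7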

Step 3: Write the controller

Using the ChatClient.Builder that Spring AI auto-configures, write a simple HTTP endpoint:

    import org.springframework.ai.chat.client.ChatClient;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RequestParam;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    public class AIChatController {

        private final ChatClient chatClient;

        // Constructor injection (the style Spring recommends).
        // Spring AI auto-configures a ChatClient.Builder; the client is built from it.
        public AIChatController(ChatClient.Builder chatClientBuilder) {
            this.chatClient = chatClientBuilder.build();
        }

        /**
         * Plain chat endpoint: takes a question and returns the AI's answer.
         * Example: http://localhost:8080/ai/chat?question=用Java写HelloWorld
         */
        @GetMapping("/ai/chat")
        public String chat(@RequestParam String question) {
            // Fluent call chain
            return chatClient.prompt()   // start building the prompt
                    .user(question)      // set the user message
                    .call()              // call the model synchronously
                    .content();          // extract the answer text
        }
    }

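Step 4: Test the endpoint

Start the Spring Boot application (a main class annotated with @SpringBootApplication is all it needs).
Open http://localhost:8080/ai/chat?question=用Java写HelloWorld in a browser or Postman.

Expected response (the exact wording varies from call to call, and the model usually adds some explanation around the code):

    public class HelloWorld {
        public static void main(String[] args) {
            System.out.println("Hello, World!");
        }
    }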

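Extension: switching to a model hosted in China (Tongyi Qianwen)

If you cannot reach OpenAI, only the dependency and configuration change; the business code stays the same.

Replace the dependency with the Tongyi Qianwen starter. The coordinates below are the ones given in the original post; Tongyi Qianwen support is maintained outside the core Spring AI project (in the Spring Cloud Alibaba / Spring AI Alibaba family), so verify the current groupId, artifactId, and version before using them.

    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-tongyi-spring-boot-starter</artifactId>
        <version>1.0.0-M1</version>
    </dependency>

Update the configuration file (property names depend on the starter you end up using):

    # Tongyi Qianwen configuration
    # Replace with your own Tongyi Qianwen key
    spring.ai.tongyi.api-key=your-tongyi-api-key
    # Tongyi Qianwen model name
    spring.ai.tongyi.chat.model=qwen-turbo

Restart and call the same endpoint to get an answer from Tongyi Qianwen. This is the payoff of Spring AI's unified API.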

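2.3 Intermediate Case 2: Streaming Chat (Real-Time Output)

Plain chat waits for the model to finish the whole answer before returning anything, which makes for a poor experience. Streaming chat works like ChatGPT: text is sent as soon as it is generated, so the user sees the first words much sooner.

Spring AI implements streaming output with reactive programming (Flux). The steps are:

Step 1: Add a streaming endpoint to the controller

Add a streaming endpoint to the existing controller. Note produces = MediaType.TEXT_EVENT_STREAM_VALUE, which selects SSE (Server-Sent Events), the protocol that lets the server push data to the browser in real time:

    import org.springframework.http.MediaType;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RequestParam;
    import org.springframework.web.bind.annotation.RestController;
    import reactor.core.publisher.Flux;

    @RestController
    public class AIChatController {

        // existing code omitted...

        /**
         * Streaming chat endpoint: emits the AI's answer as it is generated.
         * Example: http://localhost:8080/ai/chat/stream?question=解释Spring Boot自动配置原理
         */
        @GetMapping(
                value = "/ai/chat/stream",
                produces = MediaType.TEXT_EVENT_STREAM_VALUE // enable SSE streaming
        )
        public Flux<String> streamChat(@RequestParam String question) {
            // Key point: use stream() instead of call();
            // content() then yields a Flux<String> of text chunks.
            return chatClient.prompt()
                    .user(question)
                    .stream()   // start a streaming call
                    .content(); // emit each generated piece of text
        }
    }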

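Step 2: Test the streaming behaviour

After starting the application, open http://localhost:8080/ai/chat/stream?question=解释Spring Boot自动配置原理 in a browser.
The text appears on the page piece by piece, like ChatGPT's real-time output.

Front-end integration tip: to show the stream in a web page, listen for SSE events with JavaScript. EventSource has no onclose handler; when the server ends the stream the browser fires onerror and would otherwise reconnect, so the connection is closed there.

    <!DOCTYPE html>
    <html>
    <body>
    <div id="answer"></div>
    <script>
      // Open the SSE connection (the query is encoded so non-ASCII text is safe)
      const question = encodeURIComponent("解释Spring Boot自动配置原理");
      const eventSource = new EventSource("http://localhost:8080/ai/chat/stream?question=" + question);

      // Handle each pushed message
      eventSource.onmessage = function (event) {
        document.getElementById("answer").innerText += event.data;
      };

      // Fired on failure and when the server completes the stream
      eventSource.onerror = function () {
        eventSource.close();
        console.log("Streaming connection closed");
      };
    </script>
    </body>
    </html>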
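
For context on what the browser's event.data actually receives, the sketch below formats text chunks as SSE events by hand. It is plain Java for illustration only (the class SseFrameSketch and method toSseEvent are hypothetical names, and in a real application the web framework does this framing for you). Each payload line becomes a "data:" field, and a blank line ends the event.

```java
import java.util.List;

final class SseFrameSketch {

    // Formats one chunk as an SSE "message" event. The client strips the single
    // space after "data:", so leading spaces inside the chunk are preserved.
    static String toSseEvent(String chunk) {
        StringBuilder sb = new StringBuilder();
        for (String line : chunk.split("\n", -1)) {
            sb.append("data: ").append(line).append('\n');
        }
        return sb.append('\n').toString(); // the blank line terminates the event
    }

    public static void main(String[] args) {
        // Three chunks, as a model might emit them one after another
        for (String chunk : List.of("Spring Boot ", "auto-configuration ", "works by...")) {
            System.out.print(toSseEvent(chunk));
        }
    }
}
```

Because every chunk arrives as its own event, the page can append event.data to the answer as soon as each event is parsed, which is what produces the typing effect.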

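2.4 Enterprise Case 3: RAG Document Q&A (Let the AI Read Your Documents)

Plain chat has two limits: the model's knowledge may be out of date, and it cannot answer questions about internal documents such as product manuals or API docs. RAG (retrieval-augmented generation) addresses both by retrieving the relevant enterprise documents first and then having the AI answer from them.

We use PDF document Q&A as the example and implement RAG with Spring AI and Pinecone (a vector database):

Step 1: Understand the core flow

RAG has two stages, data preparation and question answering:

Data preparation stage:
- Read the PDF → split it into small text chunks (to stay within the model's context limit) → generate vectors with an embedding model → store them in Pinecone.
Question answering stage:
- User asks a question → generate a vector for the question → retrieve similar chunks from Pinecone → send "similar chunks + user question" to the AI as the prompt → the AI generates an answer grounded in the documents.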
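
The two stages above can be sketched end to end in plain Java, with no model or database involved. Everything here is a stand-in chosen for illustration: RagFlowSketch is a hypothetical class, the "embedding" is just a word-count map rather than a real embedding model, and the "vector store" is an in-memory list. The retrieval logic (cosine similarity, top k) and the prompt assembly are the parts that carry over conceptually.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RagFlowSketch {

    // Toy "embedding": lower-cased word counts. A real system calls an embedding model.
    static Map<String, Integer> embed(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Cosine similarity between two sparse count vectors (0 if either is empty).
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
            normA += e.getValue() * e.getValue();
        }
        for (int v : b.values()) {
            normB += v * v;
        }
        return (normA == 0 || normB == 0) ? 0.0 : dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Retrieval: rank stored chunks by similarity to the question, keep the top k.
    static List<String> retrieve(List<String> chunks, String question, int k) {
        Map<String, Integer> q = embed(question);
        List<String> ranked = new ArrayList<>(chunks);
        ranked.sort(Comparator.comparingDouble((String c) -> cosine(embed(c), q)).reversed());
        return new ArrayList<>(ranked.subList(0, Math.min(k, ranked.size())));
    }

    // Prompt assembly: retrieved context plus the user question.
    static String buildPrompt(List<String> context, String question) {
        return "Answer using only the context below.\nContext:\n"
                + String.join("\n", context) + "\nQuestion: " + question;
    }

    public static void main(String[] args) {
        List<String> chunks = List.of(
                "Spring AI supports vector stores such as Pinecone",
                "The cafeteria opens at nine",
                "Maven builds Java projects");
        String question = "Which vector stores does Spring AI support";
        System.out.println(buildPrompt(retrieve(chunks, question, 1), question));
    }
}
```

In the real pipeline the embedding call and the similarity search both happen inside the vector store integration; the sketch only makes visible what "retrieve similar chunks" means.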

Step 2: Add dependencies

In addition to the earlier dependencies, add PDF parsing and the Pinecone vector store:

    <dependencies>
        <!-- existing dependencies omitted... -->
        <!-- PDF document parsing -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-pdf-document-reader</artifactId>
            <version>1.0.0-M1</version>
        </dependency>
        <!-- Pinecone vector store -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-pinecone-store-spring-boot-starter</artifactId>
            <version>1.0.0-M1</version>
        </dependency>
    </dependencies>

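Step 3: Configure Pinecone

Add the Pinecone settings to application.properties (create the index on the Pinecone website first). The property names below follow the spring.ai.vectorstore.pinecone prefix used by the Pinecone starter; check the vector store reference for your Spring AI version for any further required settings.

    # Pinecone configuration
    # Your Pinecone key
    spring.ai.vectorstore.pinecone.api-key=your-pinecone-api-key
    # Pinecone environment (for example gcp-starter)
    spring.ai.vectorstore.pinecone.environment=gcp-starter
    # Your Pinecone index name
    spring.ai.vectorstore.pinecone.index-name=spring-ai-demo

    # Embedding model configuration (OpenAI's embedding model generates the vectors)
    spring.ai.openai.embedding.options.model=text-embedding-3-small
    # Vector dimensions (must match the Pinecone index)
    spring.ai.openai.embedding.options.dimensions=1536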

Step 4: Write the RAG service class

Create RagService to encapsulate the data preparation and question answering logic. The class names follow the 1.0.0-M1 API (PagePdfDocumentReader, SearchRequest.query(...), Document::getContent); later releases rename some of these, so adjust to the version you use.

    import org.springframework.ai.chat.client.ChatClient;
    import org.springframework.ai.document.Document;
    import org.springframework.ai.reader.pdf.PagePdfDocumentReader;
    import org.springframework.ai.transformer.splitter.TokenTextSplitter;
    import org.springframework.ai.vectorstore.SearchRequest;
    import org.springframework.ai.vectorstore.VectorStore;
    import org.springframework.beans.factory.annotation.Value;
    import org.springframework.core.io.Resource;
    import org.springframework.stereotype.Service;

    import java.util.List;
    import java.util.stream.Collectors;

    @Service
    public class RagService {

        private final VectorStore vectorStore;
        private final ChatClient chatClient;

        // The PDF to index (put your PDF under src/main/resources/docs/)
        @Value("classpath:docs/spring-ai-manual.pdf")
        private Resource pdfResource;

        // Constructor injection. The auto-configured VectorStore is backed by Pinecone
        // and uses the configured embedding model, so no embedding client is needed here.
        public RagService(VectorStore vectorStore, ChatClient.Builder chatClientBuilder) {
            this.vectorStore = vectorStore;
            this.chatClient = chatClientBuilder.build();
        }

        /**
         * Data preparation: read PDF → split → generate vectors → store in Pinecone.
         */
        public void prepareDocumentData() {
            try {
                // 1. Read the PDF document
                List<Document> documents = new PagePdfDocumentReader(pdfResource).get();

                // 2. Split into token-based chunks that fit the model's context window
                //    (default chunk size; see TokenTextSplitter's constructors to tune it)
                List<Document> splitDocuments = new TokenTextSplitter().apply(documents);

                // 3. Store the chunks in Pinecone (vectors are generated automatically)
                vectorStore.add(splitDocuments);
                System.out.println("PDF processed, stored " + splitDocuments.size() + " text chunks");
            } catch (Exception e) {
                throw new RuntimeException("Failed to process PDF document", e);
            }
        }

        /**
         * RAG Q&A: answer the question from documents retrieved from Pinecone.
         */
        public String ragChat(String question) {
            // 1. Retrieve the 3 most similar chunks
            List<Document> similarDocs =
                    vectorStore.similaritySearch(SearchRequest.query(question).withTopK(3));

            // 2. Build the prompt from the retrieved content and the user question
            String context = similarDocs.stream()
                    .map(Document::getContent)
                    .collect(Collectors.joining("\n"));
            String prompt = String.format("""
                    Answer the user's question based on the document content below. Do not make up information.
                    Document content: %s
                    User question: %s
                    """, context, question);

            // 3. Call the AI to generate the answer
            return chatClient.prompt().user(prompt).call().content();
        }
    }
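
Chunk size and overlap are the two knobs that matter most in the splitting step: chunks must fit the model's context window, and letting neighbouring chunks share a little text helps avoid cutting an idea in half at a boundary. The sketch below shows the sliding-window idea in plain Java. It is an illustration under simplifying assumptions, not Spring AI's splitter: OverlapChunkerSketch is a hypothetical class, and it counts whitespace-separated words where a real splitter would count model tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class OverlapChunkerSketch {

    // Splits text into windows of `size` words; consecutive windows share `overlap` words.
    static List<String> chunk(String text, int size, int overlap) {
        if (size <= 0 || overlap < 0 || overlap >= size) {
            throw new IllegalArgumentException("need size > 0 and 0 <= overlap < size");
        }
        List<String> chunks = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return chunks;
        }
        String[] words = trimmed.split("\\s+");
        int step = size - overlap; // how far the window advances each time
        for (int start = 0; start < words.length; start += step) {
            int end = Math.min(start + size, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
            if (end == words.length) {
                break; // the last window reached the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Window of 4 words with 1 shared word between neighbours
        for (String c : chunk("a b c d e f g", 4, 1)) {
            System.out.println(c);
        }
    }
}
```

With a window of 4 and an overlap of 1, the word at each boundary appears in both neighbouring chunks, so a sentence that straddles the boundary is still retrievable from at least one of them.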

Step 5: Write the RAG endpoints

Add the RAG endpoints in a controller:

    import org.springframework.web.bind.annotation.*;

    @RestController
    @RequestMapping("/ai/rag")
    public class RagController {

        private final RagService ragService;

        public RagController(RagService ragService) {
            this.ragService = ragService;
        }

        /**
         * Initialize the document data: only needs to be called once.
         * Example: http://localhost:8080/ai/rag/init
         */
        @GetMapping("/init")
        public String initDocument() {
            ragService.prepareDocumentData();
            return "Document initialization complete!";
        }

        /**
         * RAG Q&A endpoint: answers questions based on the PDF document.
         * Example: http://localhost:8080/ai/rag/chat?question=Spring AI 如何集成向量存储
         */
        @GetMapping("/chat")
        public String ragChat(@RequestParam String question) {
            return ragService.ragChat(question);
        }
    }

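Step 6: Test the RAG behaviour

First call the initialization endpoint, http://localhost:8080/ai/rag/init, and wait for the PDF to be processed.
Then call the Q&A endpoint: http://localhost:8080/ai/rag/chat?question=Spring AI 如何集成向量存储.
Expected result: the AI answers from the content of the PDF you provided rather than from general knowledge.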

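3. Enterprise Practice: Making Spring AI Applications More Stable and Secure

Once the basic development is done, the key production concerns still need attention:

3.1 Secure API Key Management

Hard-coding an API key (for example spring.ai.openai.api-key=xxx) risks leaking it. For enterprise projects, consider:

Option 1: inject via environment variables. Set the environment variable OPENAI_API_KEY on the server and reference it in the configuration file (shown below).
Option 2: centralized management in a configuration center. Store the key in Spring Cloud Config, Nacos, or Apollo, and refresh configuration dynamically.
Option 3: encrypted storage. Use Spring Cloud Config Server with JCE to encrypt sensitive values in configuration files.

For option 1, the configuration looks like this. The part after the colon is an optional default value; for a real secret it is usually better to leave the default out so a missing variable fails at startup.

    spring.ai.openai.api-key=${OPENAI_API_KEY:default-key}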
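
To show what the ${NAME:default} syntax means, here is a small plain-Java sketch of that lookup rule. It is a simplified illustration, not Spring's actual placeholder resolver (which also handles nesting, multiple property sources, and relaxed name matching); PlaceholderSketch and resolve are hypothetical names, and the environment is passed in as a Map so the behaviour is easy to see.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PlaceholderSketch {

    // Matches ${NAME} or ${NAME:default}
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([^:}]+)(?::([^}]*))?\\}");

    // Replaces each placeholder with the value from env, else its default,
    // and fails fast when neither is available.
    static String resolve(String value, Map<String, String> env) {
        Matcher m = PLACEHOLDER.matcher(value);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String name = m.group(1);
            String fallback = m.group(2); // null when no ":default" part is present
            String resolved = env.getOrDefault(name, fallback);
            if (resolved == null) {
                throw new IllegalStateException("Missing required setting: " + name);
            }
            m.appendReplacement(out, Matcher.quoteReplacement(resolved));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> env = Map.of("OPENAI_API_KEY", "sk-from-env");
        System.out.println(resolve("${OPENAI_API_KEY:default-key}", env)); // value from env wins
        System.out.println(resolve("${OPENAI_API_KEY:default-key}", Map.of())); // falls back
    }
}
```

The fail-fast branch is the reason to omit the default for secrets: a missing key stops the application at startup instead of letting it run with a placeholder value.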

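3.2 Observability: Monitoring AI Calls

Spring AI has built-in Micrometer support, which lets you track the number of AI calls, their response time, and their error rate.

Add the Actuator and Prometheus registry dependencies:

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>
    <dependency>
        <groupId>io.micrometer</groupId>
        <artifactId>micrometer-registry-prometheus</artifactId>
    </dependency>

Configure the monitoring endpoints (Spring Boot 3 uses the management.prometheus.metrics.export prefix):

    # Expose the Prometheus endpoint
    management.endpoints.web.exposure.include=prometheus,health,info
    management.prometheus.metrics.export.enabled=true

After startup, open http://localhost:8080/actuator/prometheus to see the AI call metrics (for example spring_ai_chat_client_calls_seconds_count; the exact metric names, and which Spring AI versions emit them, depend on the release you use).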
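
The three numbers named above (call count, response time, error rate) can be illustrated with a tiny hand-rolled recorder. This is a plain-Java sketch of the bookkeeping only, with hypothetical names (CallMetricsSketch, record); in an actual application Micrometer collects and exports these values and there is no need to write this yourself.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

final class CallMetricsSketch {

    private final AtomicLong calls = new AtomicLong();
    private final AtomicLong errors = new AtomicLong();
    private final AtomicLong totalNanos = new AtomicLong();

    // Wraps one call: counts it, times it, and counts it as an error if it throws.
    <T> T record(Supplier<T> call) {
        long start = System.nanoTime();
        try {
            return call.get();
        } catch (RuntimeException e) {
            errors.incrementAndGet();
            throw e;
        } finally {
            calls.incrementAndGet();
            totalNanos.addAndGet(System.nanoTime() - start);
        }
    }

    long calls() {
        return calls.get();
    }

    double errorRate() {
        long n = calls.get();
        return n == 0 ? 0.0 : (double) errors.get() / n;
    }

    double meanMillis() {
        long n = calls.get();
        return n == 0 ? 0.0 : totalNanos.get() / 1_000_000.0 / n;
    }

    public static void main(String[] args) {
        CallMetricsSketch metrics = new CallMetricsSketch();
        metrics.record(() -> "a fake model answer");
        try {
            metrics.record(() -> { throw new IllegalStateException("simulated timeout"); });
        } catch (IllegalStateException expected) {
            // the failure is still counted
        }
        System.out.println("calls=" + metrics.calls() + " errorRate=" + metrics.errorRate());
    }
}
```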

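3.3 Performance Optimization: Vector Caching and Batch Processing

Vector caching: when the same questions are asked frequently, cache the retrieval results (for example in Redis) to avoid repeated calls to the vector database.
Batch processing: when handling many documents, pass the chunks to vectorStore.add() as one list per batch rather than one document per call, to reduce the number of network requests.
Model selection: in non-critical scenarios use a lightweight model (such as gpt-3.5-turbo or qwen-turbo) to reduce latency and cost.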
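
The caching idea can be sketched with an in-memory LRU map standing in for Redis. This is a minimal illustration under stated assumptions: RetrievalCacheSketch is a hypothetical class, the search function is whatever performs the real vector lookup, and questions are treated as "the same" after trimming and lower-casing, which is a deliberately crude normalization.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class RetrievalCacheSketch {

    private final Map<String, List<String>> cache;
    private final Function<String, List<String>> search;
    private int misses = 0;

    RetrievalCacheSketch(int capacity, Function<String, List<String>> search) {
        this.search = search;
        // An access-ordered LinkedHashMap drops the least recently used entry
        // once the capacity is exceeded.
        this.cache = new LinkedHashMap<>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, List<String>> eldest) {
                return size() > capacity;
            }
        };
    }

    // Returns cached results when available; otherwise runs the search and stores the result.
    synchronized List<String> lookup(String question) {
        String key = question.trim().toLowerCase();
        List<String> hit = cache.get(key);
        if (hit != null) {
            return hit;
        }
        misses++;
        List<String> result = search.apply(question);
        cache.put(key, result);
        return result;
    }

    synchronized int misses() {
        return misses;
    }

    public static void main(String[] args) {
        RetrievalCacheSketch cache =
                new RetrievalCacheSketch(2, q -> List.of("chunk for: " + q));
        cache.lookup("How do I configure Pinecone?");
        cache.lookup("how do i configure pinecone? "); // same key after normalization
        System.out.println("vector store lookups: " + cache.misses());
    }
}
```

A shared cache such as Redis adds what this sketch lacks: entries survive restarts, are visible to every application instance, and can expire after a time-to-live so that re-indexed documents are picked up.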

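4. Summary: Spring AI Opens a New AI Era for Java Developers

Spring AI does more than wrap AI APIs. It integrates AI capabilities deeply with the Spring ecosystem, so Java developers can adopt generative AI at low cost. Its core value lies in:

Lower barrier to entry: build AI applications in familiar Spring style, with no new language or framework to learn.
Higher efficiency: unified API abstraction, auto-configuration, and built-in enterprise features cut down on repeated work.
Stability: support for observability, security, and fallback meets production requirements.

For Java developers, Spring AI is something like a Swiss Army knife: simple chat features and complex RAG document Q&A alike can be implemented quickly. As Spring AI continues to strengthen its multimodal (image, speech) and agent capabilities, it is well positioned to become the de facto standard for enterprise AI application development.

Get started now: begin with a simple chat endpoint, then explore RAG, tool calling, and other advanced features step by step, and give your Spring project an intelligent brain.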