LangChain in Practice: Tool Calling and Structured Output | 极客日志

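Tool calling and structured output are key to building intelligent applications with LangChain. This article introduces three ways to create tools, walks through integrating a local tool and a third-party tool, and compares three structured-output schemas: Pydantic, TypedDict, and JSON Schema. Worked examples (resume parsing, intent recognition, and a weather assistant) show how to turn unstructured data into a fixed format, so that the model moves from conversing to acting.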

Author: 片刻. Published 2026/3/22, updated 2026/6/24.

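Tool calling is one of LangChain's core features: it lets a model call external functions or APIs to complete a specific task. Weather is a typical case. An LLM has no access to real-time information on its own, but with a tool it can query an external search service and answer from the result.

[Figure: tool-calling diagram]

Database tables are another case. An LLM cannot read them directly, but a tool that talks to the database can run the query on its behalf.

[Figure: database interaction diagram]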

Ways to create a tool

Decorate a function with `@tool`

The simplest approach, suited to small tools.

from langchain_core.tools import tool

@tool
def add(a: int, b: int) -> int:
    """Add two integers."""  # @tool uses the docstring as the tool description
    return a + b
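
To make concrete what the decorator has to work with, here is a rough standard-library sketch of deriving a tool description from a plain function: the name from `__name__`, the description from the docstring, and the argument types from the annotations. `describe_function` and its output layout are hypothetical and are not LangChain's implementation; they only illustrate why type hints and a docstring matter.

```python
import inspect
from typing import Any, Callable, Dict, get_type_hints

# Map Python annotations to JSON-Schema-style type names (illustrative subset).
_TYPE_NAMES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def describe_function(func: Callable[..., Any]) -> Dict[str, Any]:
    """Build a tool-like description from a function's signature (hypothetical helper)."""
    hints = get_type_hints(func)
    params = inspect.signature(func).parameters
    properties = {
        name: {"type": _TYPE_NAMES.get(hints.get(name), "string")}
        for name in params
    }
    # Parameters without a default value are treated as required.
    required = [
        name for name, p in params.items() if p.default is inspect.Parameter.empty
    ]
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def add(a: int, b: int = 0) -> int:
    """Add two integers."""
    return a + b


spec = describe_function(add)
print(spec)
```

Without annotations every argument would fall back to `"string"` here, and without a docstring the description would be empty, which gives the model very little to decide on.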

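`@tool` with a custom argument schema (Pydantic)

The arguments are more explicit, and each one can carry a detailed description.

from pydantic import BaseModel, Field
from langchain_core.tools import tool

class AddInput(BaseModel):
    """Arguments for the add tool."""
    a: int = Field(description="First addend")
    b: int = Field(description="Second addend")

@tool(args_schema=AddInput)
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b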

Subclass `BaseTool`

from langchain_core.tools import BaseTool

class AddTool(BaseTool):
    # BaseTool is a Pydantic model, so overridden fields need type annotations
    name: str = "add"
    description: str = "Adds two numbers"

    def _run(self, a: int, b: int) -> int:
        return a + b

Local custom tool in practice

Define the tool

from langchain_core.tools import tool

@tool
def sum_to_n(n: int) -> int:
    """Return the sum of the integers from 0 to n."""
    total = 0
    for i in range(n + 1):
        total += i
    return total

Bind the tool to the model

from langchain_deepseek import ChatDeepSeek

llm = ChatDeepSeek(
    model_name="deepseek-chat",
    api_key="your-api-key",
    temperature=0.7,
    max_tokens=8192,
)
# Bind the tool
bound_llm = llm.bind_tools([sum_to_n])

Tool-calling flow

from langchain_core.messages import HumanMessage, SystemMessage, ToolMessage

# 1. Build the messages
messages = [
    SystemMessage(content="You are a math assistant. When the user gives you a number n, use the sum_to_n tool to compute the sum from 0 to n."),
    HumanMessage(content="10"),
]

# 2. The model decides whether to call a tool
ai_message = bound_llm.invoke(messages)
messages.append(ai_message)

# 3. Execute the tool calls
for tool_call in ai_message.tool_calls:
    if tool_call["name"] == "sum_to_n":
        result = sum_to_n.invoke(tool_call["args"])
        messages.append(
            ToolMessage(content=str(result), tool_call_id=tool_call["id"])
        )

# 4. Get the final reply
final_msg = bound_llm.invoke(messages).content
print(final_msg)
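
With more than one tool, the `if tool_call["name"] == ...` check grows into a chain. A common alternative is a registry that maps tool names to callables. The sketch below uses plain functions and a hand-written list in place of a real `ai_message.tool_calls`; `TOOLS` and `run_tool_calls` are illustrative names, and with LangChain tool objects the looked-up tool would be called with `.invoke(args)` rather than `**args`.

```python
from typing import Any, Callable, Dict, List


def sum_to_n(n: int) -> int:
    """Return the sum of the integers from 0 to n."""
    return sum(range(n + 1))


def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b


# Registry: tool name -> callable (hypothetical; stands in for LangChain tool objects).
TOOLS: Dict[str, Callable[..., Any]] = {"sum_to_n": sum_to_n, "add": add}


def run_tool_calls(tool_calls: List[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Execute each requested call and return tool-message-like dicts."""
    results = []
    for call in tool_calls:
        func = TOOLS.get(call["name"])
        if func is None:
            # Report the problem back to the model instead of crashing.
            content = f"Unknown tool: {call['name']}"
        else:
            content = str(func(**call["args"]))
        results.append({"tool_call_id": call["id"], "content": content})
    return results


# Hand-written stand-in for ai_message.tool_calls.
calls = [
    {"name": "sum_to_n", "args": {"n": 10}, "id": "call_1"},
    {"name": "multiply", "args": {"a": 2, "b": 3}, "id": "call_2"},
]
outputs = run_tool_calls(calls)
print(outputs)
```

Returning an error string for an unknown tool keeps every `tool_call_id` answered, which chat APIs generally expect before the next model turn.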

Anatomy of the AI response

{
    'content': "I'll compute the sum from 0 to 10.",
    'tool_calls': [
        {
            'name': 'sum_to_n',
            'args': {'n': 10},
            'id': 'call_00_xxx',
            'type': 'tool_call'
        }
    ],
    'response_metadata': {
        'token_usage': {...},
        'model_name': 'deepseek-chat',
        'finish_reason': 'tool_calls'
    }
}

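Third-party tool integration (Tavily search)

Integrate the third-party tool

from langchain_tavily import TavilySearch

# Create the Tavily search tool
tavily_tool = TavilySearch(
    max_results=4,
    tavily_api_key="your-tavily-api-key"
)
# Bind it to the model
bound_llm = llm.bind_tools([tavily_tool])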

Multi-round tool calling

messages = [
    SystemMessage(content="You are a weather assistant. Use the tavily_search tool to look up the weather."),
    HumanMessage(content="Weather in Beijing on March 2, 2026"),
]

# Tool-calling loop (at most 3 rounds)
max_rounds = 3
for round_num in range(1, max_rounds + 1):
    # Call the model
    ai_message = bound_llm.invoke(messages)
    messages.append(ai_message)

    # No tool calls means the model has produced its final answer
    if not ai_message.tool_calls:
        print(ai_message.content)
        break

    # Execute the tool calls
    for tool_call in ai_message.tool_calls:
        result = tavily_tool.invoke(tool_call["args"])
        tool_message = ToolMessage(
            content=str(result), tool_call_id=tool_call["id"]
        )
        messages.append(tool_message)
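
A loop of this shape exits silently if the model is still requesting tools when the round limit is reached. One way to make that case explicit is Python's `for ... else`, whose `else` branch runs only when the loop finishes without `break`. The sketch replaces the model and the search tool with stubs (`fake_model` and `fake_search`, both hypothetical, returning canned text) so that it runs without API keys.

```python
from typing import Any, Dict, List


def fake_model(messages: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Stub model: requests one search, then answers once a tool result is present."""
    if any(m["role"] == "tool" for m in messages):
        return {"role": "ai", "content": "Sunny, 5 to 14 C.", "tool_calls": []}
    call = {"name": "search", "args": {"query": messages[-1]["content"]}, "id": "c1"}
    return {"role": "ai", "content": "", "tool_calls": [call]}


def fake_search(query: str) -> str:
    """Stub search tool returning canned text."""
    return f"results for: {query}"


def run(question: str, max_rounds: int = 3) -> str:
    messages: List[Dict[str, Any]] = [{"role": "human", "content": question}]
    for _ in range(max_rounds):
        ai = fake_model(messages)
        messages.append(ai)
        if not ai["tool_calls"]:
            answer = ai["content"]
            break  # final answer reached
        for call in ai["tool_calls"]:
            result = fake_search(**call["args"])
            messages.append({"role": "tool", "content": result, "tool_call_id": call["id"]})
    else:
        # Reached only when every round ended in another tool request.
        answer = "No final answer within the round limit."
    return answer


print(run("Beijing weather"))
print(run("Beijing weather", max_rounds=1))
```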

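Structured output

Pydantic BaseModel (recommended)

Define the Pydantic model

from pydantic import BaseModel, Field

class TestOutput(BaseModel):
    """Pydantic model for city information"""
    test: str = Field(description="Detailed current weather in the city, including conditions, temperature, and wind")
    test2: str = Field(description="City name")
    test3: str = Field(description="A famous person born in this city, with a short introduction")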

Bind structured output

from langchain_deepseek import ChatDeepSeek

llm_deepseek = ChatDeepSeek(
    model_name="deepseek-chat",
    api_key="your-api-key",
    base_url="https://api.deepseek.com/v1",  # DeepSeek's OpenAI-compatible endpoint
)

# Bind structured output
struct_output_model = llm_deepseek.with_structured_output(TestOutput)

# Call the model
result = struct_output_model.invoke("Shanghai")

# Access the structured data
print(f"Weather: {result.test}")
print(f"City: {result.test2}")
print(f"Famous person: {result.test3}")

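Sample output

# The return value is a Pydantic object
test='Shanghai is currently sunny, 25°C, southeast wind force 3-4, humidity 65%, good air quality'
test2='Shanghai'
test3='Yao Ming, a famous Chinese basketball player, born in Shanghai in 1980, formerly of the NBA Houston Rockets'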

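TypedDict

Define the TypedDict

from typing_extensions import Annotated, TypedDict

class TestOutputDict(TypedDict):
    """City information as a dict type"""
    # Layout is Annotated[type, default, description]; `...` means no default (required).
    # With only two arguments, the string would be read as the default value.
    weather: Annotated[str, ..., "Detailed current weather in the city, including conditions, temperature, and wind"]
    city_name: Annotated[str, ..., "City name"]
    famous_person: Annotated[str, ..., "A famous person born in this city, with a short introduction"]
    population: Annotated[int, ..., "City population (in units of 10,000 people)"]

Use the TypedDict

# Bind structured output
struct_output_model_dict = llm_deepseek.with_structured_output(TestOutputDict)

# Call the model
result_dict = struct_output_model_dict.invoke("Shanghai")

# Access the dict data
print(f"Weather: {result_dict['weather']}")
print(f"City: {result_dict['city_name']}")
print(f"Population: {result_dict['population']} (x10,000 people)")

Sample output

# The return value is a plain dict
{'city_name': 'Shanghai', 'weather': 'Shanghai is currently sunny, 18-25°C, southeast wind force 3-4', 'population': 2487, 'famous_person': 'Yao Ming, a famous Chinese basketball player and former NBA Houston Rockets player'}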
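
The extra arguments passed to `Annotated` are ordinary runtime metadata, which is what lets a library turn a `TypedDict` into a schema for the model. A standard-library sketch of reading them back follows; `CityDict` and `field_metadata` are illustrative names, not part of LangChain, and the `...` in the default slot follows the `Annotated[type, default, description]` layout that LangChain documents for TypedDict schemas.

```python
from typing import Annotated, Dict, Tuple, TypedDict, get_type_hints


class CityDict(TypedDict):
    """Illustrative TypedDict with (default, description) metadata on each field."""
    city_name: Annotated[str, ..., "City name"]
    population: Annotated[int, ..., "City population (in units of 10,000 people)"]


def field_metadata(td: type) -> Dict[str, Tuple[type, tuple]]:
    """Return {field: (base_type, metadata)} for an Annotated TypedDict."""
    # include_extras=True keeps the Annotated wrapper instead of stripping it.
    hints = get_type_hints(td, include_extras=True)
    return {name: (hint.__origin__, hint.__metadata__) for name, hint in hints.items()}


meta = field_metadata(CityDict)
for name, (base, extras) in meta.items():
    print(name, base.__name__, extras)
```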

JSON Schema

Define the JSON Schema

import json

json_schema = {
    "title": "CityInfo",
    "description": "City information in JSON format",
    "type": "object",
    "properties": {
        "city": {"type": "string", "description": "City name"},
        "weather": {"type": "string", "description": "Current weather, including temperature, conditions, and wind"},
        "attractions": {
            "type": "array",
            "description": "List of the city's famous attractions",
            "items": {"type": "string"}
        },
        "gdp": {"type": "number", "description": "City GDP (in units of 100 million CNY)"}
    },
    "required": ["city", "weather", "attractions", "gdp"]
}

Use the JSON Schema

# Bind structured output
struct_output_model_json = llm_deepseek.with_structured_output(json_schema)

# Call the model
result_json = struct_output_model_json.invoke("Beijing")

# Pretty-print the result
print(json.dumps(result_json, ensure_ascii=False, indent=2))

Sample output

{
  "city": "Beijing",
  "weather": "Sunny, 15°C, wind force 2",
  "attractions": ["Forbidden City", "Tiananmen Square", "Great Wall", "Summer Palace", "Temple of Heaven", "Old Summer Palace"],
  "gdp": 40269.6
}
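
A dict that comes back from a JSON Schema binding may not have been validated the way a Pydantic object is, so a quick check before using it can catch missing keys or wrong types. Below is a minimal hand-rolled sketch that covers only `required` and top-level `type`; a full validator such as the `jsonschema` package handles far more. `check_against_schema` is a hypothetical helper.

```python
from typing import Any, Dict, List

# JSON Schema type name -> Python type(s) accepted for it (illustrative subset).
_PY_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "array": list,
    "object": dict,
    "boolean": bool,
}


def check_against_schema(data: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the basic checks passed."""
    problems = []
    for key in schema.get("required", []):
        if key not in data:
            problems.append(f"missing required key: {key}")
    for key, spec in schema.get("properties", {}).items():
        expected = _PY_TYPES.get(spec.get("type"))
        if key in data and expected and not isinstance(data[key], expected):
            problems.append(f"wrong type for {key}: expected {spec['type']}")
    return problems


schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "attractions": {"type": "array"},
        "gdp": {"type": "number"},
    },
    "required": ["city", "attractions", "gdp"],
}

good = {"city": "Beijing", "attractions": ["Forbidden City"], "gdp": 40269.6}
bad = {"city": "Beijing", "gdp": "unknown"}

print(check_against_schema(good, schema))
print(check_against_schema(bad, schema))
```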

Optional structured output (dynamic type selection)

Implementation

from pydantic import BaseModel, Field
from typing import Literal

# Define the individual data structures
class PersonInfo(BaseModel):
    """Person information"""
    name: str = Field(description="Name")
    age: int = Field(description="Age")
    occupation: str = Field(description="Occupation")

class CityInfo(BaseModel):
    """City information"""
    city: str = Field(description="City name")
    population: int = Field(description="Population (in units of 10,000 people)")

class NormalAnswer(BaseModel):
    """Ordinary answer"""
    answer: str = Field(description="A normal reply to the user's question")

# Define a wrapper class so the model can choose the return type
class Response(BaseModel):
    """Response structure; the model picks person, city, or an ordinary answer based on the question"""
    type: Literal["person", "city", "normal"] = Field(
        description="Response type: person for a person, city for a city, normal for an ordinary answer"
    )
    person: PersonInfo | None = Field(default=None, description="Person information, filled only when type is person")
    city: CityInfo | None = Field(default=None, description="City information, filled only when type is city")
    normal: NormalAnswer | None = Field(default=None, description="Ordinary answer, filled only when type is normal")

Usage example

from langchain_deepseek import ChatDeepSeek

llm = ChatDeepSeek(
    model_name="deepseek-chat",
    api_key="your-api-key",
    base_url="https://api.deepseek.com/v1",  # DeepSeek's OpenAI-compatible endpoint
)

# Bind structured output
model = llm.with_structured_output(Response)

# Test 1: ask about a city
result1 = model.invoke("Shanghai")
if result1.type == "city":
    print(f"City: {result1.city.city}")
    print(f"Population: {result1.city.population} (x10,000 people)")

# Test 2: an ordinary question
result2 = model.invoke("Which model are you?")
if result2.type == "normal":
    print(f"Answer: {result2.normal.answer}")
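
Nothing in the wrapper class forces the model to fill the field that matches `type`, so `result1.city` can in principle be `None` even when `type == "city"`. A defensive accessor is one way to handle that. The sketch uses dataclasses as stand-ins for the Pydantic models so that it runs without LangChain; `payload_for` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class CityInfo:
    city: str
    population: int


@dataclass
class NormalAnswer:
    answer: str


@dataclass
class Response:
    type: str
    city: Optional[CityInfo] = None
    normal: Optional[NormalAnswer] = None


def payload_for(resp: Response) -> Union[CityInfo, NormalAnswer]:
    """Return the payload named by resp.type, or raise if it was left empty."""
    payload = getattr(resp, resp.type, None)
    if payload is None:
        raise ValueError(f"type is {resp.type!r} but the matching field is empty")
    return payload


ok = Response(type="city", city=CityInfo(city="Shanghai", population=2487))
broken = Response(type="city", normal=NormalAnswer(answer="hello"))

print(payload_for(ok))
try:
    payload_for(broken)
except ValueError as exc:
    print(exc)
```

With Pydantic the same rule could live in a model validator instead, so that an inconsistent response fails at parse time.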

Three practical scenarios for structured output

Scenario 1: as an information extractor

from pydantic import BaseModel, Field
from typing import List

class Education(BaseModel):
    """Education history"""
    school: str = Field(description="School name")
    degree: str = Field(description="Degree: bachelor, master, doctorate, etc.")
    major: str = Field(description="Major")
    start_year: int = Field(description="Year of enrollment")
    end_year: int = Field(description="Year of graduation")

class WorkExperience(BaseModel):
    """Work experience"""
    company: str = Field(description="Company name")
    position: str = Field(description="Position")
    start_date: str = Field(description="Start date")
    end_date: str = Field(description="End date, or 'present' for the current job")
    responsibilities: List[str] = Field(description="List of main responsibilities")

class ResumeInfo(BaseModel):
    """Resume information"""
    name: str = Field(description="Name")
    phone: str = Field(description="Phone number")
    email: str = Field(description="Email address")
    education: List[Education] = Field(description="List of education entries")
    work_experience: List[WorkExperience] = Field(description="List of work experience entries")
    skills: List[str] = Field(description="List of skills")

# Usage example
model = llm.with_structured_output(ResumeInfo)
resume_text = """
Zhang San, phone: 138****1234, email: [email protected]
Education:
- 2015-2019 Tsinghua University, Computer Science and Technology, bachelor
- 2019-2021 Tsinghua University, Artificial Intelligence, master
Work experience:
- 2021.07-2023.06 ByteDance, Algorithm Engineer, built the recommendation system and raised click-through rate by 20%
- 2023.07-present Alibaba, Senior Algorithm Engineer, builds large-model applications
Skills: Python, PyTorch, LangChain, machine learning
"""

result = model.invoke(f"Extract the information from the following resume:\n{resume_text}")

# The extracted structured data can be stored directly in a database
print(f"Name: {result.name}")
print(f"Contact: {result.phone} / {result.email}")
print(f"Education entries: {len(result.education)}")
print(f"Work experience entries: {len(result.work_experience)}")
print(f"Skills: {', '.join(result.skills)}")
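
The step of storing the extracted data in a database can be sketched with the standard-library `sqlite3` module. The `extracted` dict below is a hand-written stand-in for what `result.model_dump()` would return for a Pydantic model; the table layout and the sample values are illustrative assumptions.

```python
import json
import sqlite3

# Stand-in for result.model_dump(); values are illustrative.
extracted = {
    "name": "Zhang San",
    "phone": "138****1234",
    "skills": ["Python", "PyTorch", "LangChain"],
    "education": [
        {"school": "Tsinghua University", "degree": "Bachelor", "start_year": 2015, "end_year": 2019},
    ],
}

conn = sqlite3.connect(":memory:")  # in-memory database for the demo
conn.execute(
    "CREATE TABLE resumes (id INTEGER PRIMARY KEY, name TEXT, phone TEXT, skills TEXT, education TEXT)"
)
# Scalars go into their own columns; nested lists are stored as JSON text.
conn.execute(
    "INSERT INTO resumes (name, phone, skills, education) VALUES (?, ?, ?, ?)",
    (
        extracted["name"],
        extracted["phone"],
        json.dumps(extracted["skills"]),
        json.dumps(extracted["education"]),
    ),
)
conn.commit()

row = conn.execute("SELECT name, skills FROM resumes").fetchone()
stored_skills = json.loads(row[1])
print(row[0], stored_skills)
```

Storing nested lists as JSON text keeps the sketch short; a normalized design would give education and work experience their own tables.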

Scenario 2: as a prompt enhancer

from pydantic import BaseModel, Field
from typing import List, Literal

class SearchIntent(BaseModel):
    """Search intent analysis"""
    intent_type: Literal["informational", "transactional", "navigational", "comparison"] = Field(
        description="Intent type: informational (looking up information), transactional (purchase), navigational (reaching a site), comparison (comparing options)"
    )
    keywords: List[str] = Field(description="List of extracted keywords")
    filters: dict = Field(description="Filter conditions such as price range, brand, or region")
    urgency: Literal["low", "medium", "high"] = Field(description="Urgency")
    clarification_needed: bool = Field(description="Whether further clarification is needed")
    suggested_questions: List[str] = Field(description="List of suggested clarifying questions")

# Usage example
model = llm.with_structured_output(SearchIntent)
user_query = "I want a good-value laptop, budget around 5000, mainly for writing code"

# Step 1: understand the user's intent
intent = model.invoke(f"Analyze the following user request: {user_query}")
print(f"Intent type: {intent.intent_type}")
print(f"Keywords: {', '.join(intent.keywords)}")
print(f"Filters: {intent.filters}")
print(f"Urgency: {intent.urgency}")

# Step 2: run a targeted search based on the parsed intent
if intent.intent_type == "transactional":
    search_params = {
        "category": "laptop",
        "price_range": intent.filters.get("price_range", ""),
        "keywords": intent.keywords,
        "sort_by": "price_performance_ratio"
    }
    print(f"\nRunning search: {search_params}")
elif intent.clarification_needed:
    print("\nQuestions to clarify:")
    for q in intent.suggested_questions:
        print(f" - {q}")

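Scenario 3: combined with tools

from langchain.chat_models import init_chat_model
from langchain_core.messages import HumanMessage, ToolMessage
from langchain_tavily import TavilySearch
from pydantic import BaseModel, Field

# Initialize the model
llm_tavily = init_chat_model(
    model="deepseek-chat",
    model_provider="deepseek",
    api_key="your-api-key",
    temperature=0.7,
    max_tokens=8192,
)

# Create the search tool
tavily_tool = TavilySearch(
    max_results=4,
    tavily_api_key="your-tavily-api-key"
)

# Define the structured output format
class WeatherResult(BaseModel):
    """Weather query result"""
    location: str = Field(description="Location")
    date: str = Field(description="Date")
    temperature: str = Field(description="Temperature range")
    weather: str = Field(description="Weather conditions")
    wind: str = Field(description="Wind force and direction")
    warning: str = Field(description="Warning information, or 'none' if there is none")

# Step 1: bind the tool and let the model request a search
bound_llm = llm_tavily.bind_tools([tavily_tool])
messages = [HumanMessage(content="Weather in Beijing on March 2, 2026")]
ai_message = bound_llm.invoke(messages)
messages.append(ai_message)
# Step 2: run the requested searches and feed the results back
for tool_call in ai_message.tool_calls:
    search_result = tavily_tool.invoke(tool_call["args"])
    messages.append(ToolMessage(content=str(search_result), tool_call_id=tool_call["id"]))
# Step 3: format the answer with structured output on the base model.
# A single invoke never executes a tool, so chaining with_structured_output
# directly onto bound_llm cannot search and format in one call.
structured_llm = llm_tavily.with_structured_output(WeatherResult)
result = structured_llm.invoke(messages)

# Print the structured result
print(f"Location: {result.location}")
print(f"Date: {result.date}")
print(f"Temperature: {result.temperature}")
print(f"Weather: {result.weather}")
print(f"Wind: {result.wind}")
print(f"Warning: {result.warning}")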

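Summary

Comparison of the three scenarios

| Scenario | Core value | Main use | Typical applications | Technical characteristics |
| --- | --- | --- | --- | --- |
| Information extractor | Unstructured → structured | Data extraction and conversion | Resume parsing, contract analysis, invoice recognition | One-way conversion, data normalization |
| Prompt enhancement | Vague intent → explicit instruction | Intent understanding and clarification | Search optimization, task decomposition, parameter extraction | Two-way interaction, intent disambiguation |
| Combined with tools | Intelligent decisions + precise execution | Complex task automation | Smart assistants, automated workflows, data analysis | Multi-step, tool collaboration |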

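Comparison of the four structured-output approaches

| Approach | Pros | Cons | Best for | Return type |
| --- | --- | --- | --- | --- |
| Pydantic BaseModel | Type safety, field validation, good IDE support, `Field` available | Requires defining a class | Complex data structures, validation needed | Pydantic object |
| TypedDict | Lightweight, returns a plain dict, descriptions via `Annotated` | Weaker type checking, no `Field` | Simple dict structures | dict |
| JSON Schema | Standardized, cross-language, flexible, supports complex nesting | Verbose to define, no IDE hints | Cross-system integration, complex nesting | dict |
| Optional structured output | Flexible, explicit types, easy to extend, concise code | Requires a wrapper class | Multi-type dynamic responses | Pydantic object |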
