LangChain 实战：工具调用与结构化输出 | 极客日志

PythonAI算法

LangChain 实战：工具调用与结构化输出

介绍 LangChain 中工具调用（Tool Calling）和结构化输出（Structured Output）的实战应用。涵盖工具的三种创建方式、本地及第三方工具集成流程。详解 Pydantic、TypedDict、JSON Schema 三种结构化输出方法，以及可选结构化输出的动态类型选择。结合简历提取、搜索意图理解、智能天气助手三大场景，对比不同方案的优劣与适用性，帮助开发者实现 AI 从聊天到实际任务的转变。

战神发布于 2026/4/5更新于 2026/6/241 浏览

工具调用（Tool Calling）

工具调用是 LangChain 的核心功能之一，允许 AI 模型调用外部函数或 API 来完成特定任务。

例如，当我们希望获取当前天气情况时，由于 LLM 无法获取实时信息，此时我们就可以借助工具，通过外部服务进行搜索完成查询。

再例如，当我们希望获取数据库表中的数据时，由于 LLM 无法直接获取表数据，此时我们就可以借助工具，通过与数据库交互完成查询。

1. Tool 创建的三种方式

1.1. 直接用 `@tool` 装饰函数

最简单，适合小工具。

@tool
def add(a: int, b: int) -> int:
    return a + b

1.2. 用 `@tool` + 自定义参数结构（Pydantic）

参数更清晰，能写详细说明。

@tool(args_schema=AddInput)
def add(a: int, b: int) -> int:
    return a + b

1.3. 继承 `BaseTool` 写类

最灵活，适合复杂或异步操作（比如调 API）。

class AddTool(BaseTool):
    def _run(self, a: int, b: int):
        return a + b

2. 本地自定义工具

示例文件： [email protected]

2.1 定义工具

使用 @tool 装饰器定义一个本地工具：

from langchain_core.tools import tool

@tool
def sum_to_n() -> :
    
    sum_val = 
     i  (n + ):
        sum_val += i
     sum_val

相关免费在线工具

加密/解密文本
使用加密算法（如AES、TripleDES、Rabbit或RC4）加密和解密文本明文。在线工具，加密/解密文本在线工具，online
RSA密钥对生成器
生成新的随机RSA私钥和公钥pem证书。在线工具，RSA密钥对生成器在线工具，online
Mermaid 预览与可视化编辑
基于 Mermaid.js 实时预览流程图、时序图等图表，支持源码编辑与即时渲染。在线工具，Mermaid 预览与可视化编辑在线工具，online
随机西班牙地址生成器
随机生成西班牙地址（支持马德里、加泰罗尼亚、安达卢西亚、瓦伦西亚筛选），支持数量快捷选择、显示全部与下载。在线工具，随机西班牙地址生成器在线工具，online
Gemini 图片去水印
基于开源反向 Alpha 混合算法去除 Gemini/Nano Banana 图片水印，支持批量处理与下载。在线工具，Gemini 图片去水印在线工具，online
curl 转代码
解析常见 curl 参数并生成 fetch、axios、PHP curl 或 Python requests 示例代码。在线工具，curl 转代码在线工具，online

from langchain.chat_models import init_chat_model

llm = init_chat_model(
    model="deepseek-chat",
    model_provider="deepseek",
    api_key="your-api-key",
    temperature=0.7,
    max_tokens=8192,
)
# 绑定工具
bound_llm = llm.bind_tools([sum_to_n])

from langchain_core.messages import HumanMessage, SystemMessage, ToolMessage

# 1. 构建用户消息
message = [
    SystemMessage(content="你是一个数学助手。当用户给你一个数字 n 时，使用 sum_to_n 工具计算从 0 累加到 n 的结果。"),
    HumanMessage(content="10"),
]
# 2. 模型决定是否调用工具
ai_message = bound_llm.invoke(message)
message.append(ai_message)
# 3. 执行工具调用
for tool_call in ai_message.tool_calls:
    if tool_call["name"] == "sum_to_n":
        result = sum_to_n.invoke(tool_call["args"])
        message.append(
            ToolMessage(
                content=str(result),
                tool_call_id=tool_call["id"]
            )
        )
# 4. 获取最终回复
final_msg = bound_llm.invoke(message).content
print(final_msg)

用户输入 → AI 分析 → 决定调用工具 → 执行工具 → 返回结果 → AI 生成最终回答

# ai_message 结构（经过 LangChain 框架整合）
{
    'content': '我来计算从 0 累加到 10 的结果。',
    'tool_calls': [
        {
            'name': 'sum_to_n',  # 工具名称
            'args': {'n': 10},   # 工具参数
            'id': 'call_00_xxx', # 调用 ID
            'type': 'tool_call'
        }
    ],
    'response_metadata': {
        'token_usage': {...},
        'model_name': 'deepseek-chat',
        'finish_reason': 'tool_calls'  # 表示模型决定调用工具
    }
}

from langchain_tavily import TavilySearch

# 创建 Tavily 搜索工具
tavily_tool = TavilySearch(
    max_results=4,
    tavily_api_key="your-tavily-api-key"
)
# 绑定到模型
bound_llm = llm_tavily.bind_tools([tavily_tool])

message = [
    SystemMessage(content="你是一个天气助手。使用 tavily_search 工具搜索天气情况。"),
    HumanMessage(content="2026 年 3 月 2 日北京天气情况"),
]
# 工具调用循环（最多 3 轮）
max_rounds = 3
for round_num in range(1, max_rounds + 1):
    # 调用模型
    ai_message = bound_llm.invoke(message)
    message.append(ai_message)
    # 如果没有工具调用，说明得到最终答案了
    if not ai_message.tool_calls:
        print(ai_message.content)
        break
    # 执行工具调用
    for tool_call in ai_message.tool_calls:
        result = tavily_tool.invoke(tool_call["args"])
        tool_message = ToolMessage(
            content=str(result),
            tool_call_id=tool_call["id"]
        )
        message.append(tool_message)

第 1 轮：用户提问 → AI 调用搜索工具 → 获取搜索结果
第 2 轮：AI 分析结果 → 生成最终答案（不再调用工具）

根据搜索结果，2026 年 3 月 2 日北京天气情况：
白天天气：
- 天气状况：阴天，大部分地区有小雪或雨夹雪
- 风向风力：北风转南风 2-3 级
- 最高气温：5-6℃
夜间天气：
- 天气状况：阴天，山区有小雪
- 风向风力：南风转北风 1-2 级
- 最低气温：-1℃到 0℃
特别预警：
北京市气象台发布大雾黄色预警信号，夜间至次日上午能见度小于 1000 米。

from pydantic import BaseModel, Field

class TestOutput(BaseModel):
    """城市信息的 Pydantic 模型"""
    test: str = Field(description="这个城市现在的详细天气情况，包括天气状况、温度、风力等")
    test2: str = Field(description="城市名称")
    test3: str = Field(description="在这个城市出生的一个著名名人，包括简短介绍")

from langchain_deepseek import ChatDeepSeek

llm_deepseek = ChatDeepSeek(
    model_name="deepseek-chat",
    api_key="your-api-key",
    base_url="https://api.deepseek.cn/v1",
)
# 绑定结构化输出
struct_output_model = llm_deepseek.with_structured_output(TestOutput)
# 调用模型
result = struct_output_model.invoke("上海")
# 访问结构化数据
print(f"天气：{result.test}")
print(f"城市：{result.test2}")
print(f"名人：{result.test3}")

# 返回的是 Pydantic 对象
test='上海现在天气晴朗，温度 25°C，东南风 3-4 级，湿度 65%，空气质量良好'
test2='上海'
test3='姚明，中国著名篮球运动员，1980 年出生于上海，曾效力于 NBA 休斯顿火箭队'

from typing import TypedDict, Annotated

class TestOutputDict(TypedDict):
    """城市信息的字典类型"""
    weather: Annotated[str, "这个城市现在的详细天气情况，包括天气状况、温度、风力等"]
    city_name: Annotated[str, "城市名称"]
    famous_person: Annotated[str, "在这个城市出生的一个著名名人，包括简短介绍"]
    population: Annotated[int, "城市人口数量（万人）"]

# 绑定结构化输出
struct_output_model_dict = llm_deepseek.with_structured_output(TestOutputDict)
# 调用模型
result_dict = struct_output_model_dict.invoke("上海")
# 访问字典数据
print(f"天气：{result_dict['weather']}")
print(f"城市：{result_dict['city_name']}")
print(f"人口：{result_dict['population']} 万人")

# 返回的是纯字典
{'city_name':'上海','weather':'上海现在天气晴朗，温度 18-25°C，东南风 3-4 级','population':2487,'famous_person':'姚明，中国著名篮球运动员，前 NBA 休斯顿火箭队球员'}

import json

json_schema = {
    "title": "CityInfo",
    "description": "城市信息的 JSON 格式",
    "type": "object",
    "properties": {
        "city": {"type": "string", "description": "城市名称"},
        "weather": {"type": "string", "description": "当前天气情况，包括温度、天气状况、风力"},
        "attractions": {
            "type": "array",
            "description": "城市的著名景点列表",
            "items": {"type": "string"}
        },
        "gdp": {"type": "number", "description": "城市 GDP（亿元）"}
    },
    "required": ["city", "weather", "attractions", "gdp"]
}

# 绑定结构化输出
struct_output_model_json = llm_deepseek.with_structured_output(json_schema)
# 调用模型
result_json = struct_output_model_json.invoke("北京")
# 格式化输出
print(json.dumps(result_json, ensure_ascii=False, indent=2))

{"city":"北京","weather":"晴，温度 15°C，风力 2 级","attractions":["故宫","天安门广场","长城","颐和园","天坛","圆明园"],"gdp":40269.6}

from pydantic import BaseModel, Field
from typing import Literal

# 定义不同的数据结构
class PersonInfo(BaseModel):
    """人物信息"""
    name: str = Field(description="姓名")
    age: int = Field(description="年龄")
    occupation: str = Field(description="职业")

class CityInfo(BaseModel):
    """城市信息"""
    city: str = Field(description="城市名称")
    population: int = Field(description="人口（万人）")

class NormalAnswer(BaseModel):
    """普通回答"""
    answer: str = Field(description="根据用户的提问正常答复的内容")

# 定义包装类，让模型选择返回类型
class Response(BaseModel):
    """响应结构，模型根据问题选择返回人物、城市信息或普通回答"""
    type: Literal["person", "city", "normal"] = Field(
        description="响应类型：person 表示人物，city 表示城市，normal 表示普通回答"
    )
    person: PersonInfo | None = Field(default=None, description="人物信息，仅当 type 为 person 时填充")
    city: CityInfo | None = Field(default=None, description="城市信息，仅当 type 为 city 时填充")
    normal: NormalAnswer | None = Field(default=None, description="普通回答，仅当 type 为 normal 时填充")

from langchain_deepseek import ChatDeepSeek

llm = ChatDeepSeek(
    model_name="deepseek-chat",
    api_key="your-api-key",
    base_url="https://api.deepseek.cn/v1",
)
# 绑定结构化输出
model = llm.with_structured_output(Response)
# 测试 1：询问城市
result1 = model.invoke("上海")
if result1.type == "city":
    print(f"城市：{result1.city.city}")
    print(f"人口：{result1.city.population}万")
# 测试 2：普通问题
result2 = model.invoke("你是哪个模型")
if result2.type == "normal":
    print(f"回答：{result2.normal.answer}")

类型：city
城市：上海
人口：2489 万

类型：normal
回答：我是 DeepSeek 最新版本的 AI 助手，由深度求索公司开发...

from pydantic import BaseModel, Field

class Education(BaseModel):
    """教育经历"""
    school: str = Field(description="学校名称")
    degree: str = Field(description="学位：本科、硕士、博士等")
    major: str = Field(description="专业")
    start_year: int = Field(description="入学年份")
    end_year: int = Field(description="毕业年份")

class WorkExperience(BaseModel):
    """工作经历"""
    company: str = Field(description="公司名称")
    position: str = Field(description="职位")
    start_date: str = Field(description="入职日期")
    end_date: str = Field(description="离职日期，如果是当前工作则为'至今'")
    responsibilities: list[str] = Field(description="主要职责列表")

class ResumeInfo(BaseModel):
    """简历信息"""
    name: str = Field(description="姓名")
    phone: str = Field(description="电话")
    email: str = Field(description="邮箱")
    education: list[Education] = Field(description="教育经历列表")
    work_experience: list[WorkExperience] = Field(description="工作经历列表")
    skills: list[str] = Field(description="技能列表")

# 使用示例
model = llm.with_structured_output(ResumeInfo)
resume_text = """
张三，电话：138****1234，邮箱：[email protected]
教育背景：
- 2015-2019 清华大学 计算机科学与技术 本科
- 2019-2021 清华大学 人工智能 硕士
工作经历：
- 2021.07-2023.06 字节跳动 算法工程师 负责推荐系统开发，优化点击率提升 20%
- 2023.07-至今 阿里巴巴 高级算法工程师 负责大模型应用开发
技能：Python, PyTorch, LangChain, 机器学习
"""
result = model.invoke(f"请从以下简历中提取信息：\n{resume_text}")
# 提取后的结构化数据可以直接存入数据库
print(f"姓名：{result.name}")
print(f"联系方式：{result.phone} / {result.email}")
print(f"教育经历：{len(result.education)} 条")
print(f"工作经历：{len(result.work_experience)} 条")
print(f"技能：{', '.join(result.skills)}")

姓名：张三
联系方式：138****1234 / zhangsan@example.com
教育经历：2 条
工作经历：2 条
技能：Python, PyTorch, LangChain, 机器学习

from pydantic import BaseModel, Field
from typing import Literal

class SearchIntent(BaseModel):
    """搜索意图分析"""
    intent_type: Literal["informational", "transactional", "navigational", "comparison"] = Field(
        description="意图类型：informational(查询信息)、transactional(购买交易)、navigational(导航访问)、comparison(对比比较)"
    )
    keywords: list[str] = Field(description="提取的关键词列表")
    filters: dict[str, str] = Field(description="筛选条件，如价格范围、品牌、地区等")
    urgency: Literal["low", "medium", "high"] = Field(description="紧急程度")
    clarification_needed: bool = Field(description="是否需要进一步澄清")
    suggested_questions: list[str] = Field(description="建议的澄清问题列表")

# 使用示例
model = llm.with_structured_output(SearchIntent)
user_query = "我想买个性价比高的笔记本电脑，预算 5000 左右，主要用来写代码"
# 第一步：理解用户意图
intent = model.invoke(f"分析以下用户需求：{user_query}")
print(f"意图类型：{intent.intent_type}")
print(f"关键词：{', '.join(intent.keywords)}")
print(f"筛选条件：{intent.filters}")
print(f"紧急程度：{intent.urgency}")
# 第二步：根据理解的意图执行精准搜索
if intent.intent_type == "transactional":
    # 执行购买导向的搜索
    search_params = {
        "category": "笔记本电脑",
        "price_range": intent.filters.get("price_range", ""),
        "keywords": intent.keywords,
        "sort_by": "price_performance_ratio"
    }
    print(f"\n执行搜索：{search_params}")
elif intent.clarification_needed:
    # 需要进一步澄清
    print(f"\n需要澄清的问题:")
    for q in intent.suggested_questions:
        print(f" - {q}")

意图类型：transactional
关键词：笔记本电脑，性价比，写代码，编程
筛选条件：{'price_range': '4000-6000', 'usage': '编程开发', 'priority': '性价比'}
紧急程度：medium
执行搜索：{
    'category': '笔记本电脑',
    'price_range': '4000-6000',
    'keywords': ['笔记本电脑', '性价比', '写代码', '编程'],
    'sort_by': 'price_performance_ratio'
}

from langchain.chat_models import init_chat_model
from langchain_tavily import TavilySearch
from pydantic import BaseModel, Field

# 初始化模型
llm_tavily = init_chat_model(
    model="deepseek-chat",
    model_provider="deepseek",
    api_key="your-api-key",
    temperature=0.7,
    max_tokens=8192,
)
# 创建搜索工具
tavily_tool = TavilySearch(
    max_results=4,
    tavily_api_key="your-tavily-api-key"
)
# 定义结构化输出格式
class WeatherResult(BaseModel):
    """天气查询结果"""
    location: str = Field(description="地点")
    date: str = Field(description="日期")
    temperature: str = Field(description="温度范围")
    weather: str = Field(description="天气状况")
    wind: str = Field(description="风力风向")
    warning: str = Field(description="预警信息，如果没有则为'无'")

# 步骤 1：先绑定工具
bound_llm = llm_tavily.bind_tools([tavily_tool])
# 步骤 2：再绑定结构化输出
structured_llm = bound_llm.with_structured_output(WeatherResult)
# 步骤 3：直接调用，自动完成工具调用和结构化输出
result = structured_llm.invoke("2026 年 3 月 2 日北京天气情况")
# 输出结构化结果
print(f"地点：{result.location}")
print(f"日期：{result.date}")
print(f"温度：{result.temperature}")
print(f"天气：{result.weather}")
print(f"风力：{result.wind}")
print(f"预警：{result.warning}")

地点：北京
日期：2026 年 3 月 2 日
温度：-1℃到 6℃
天气：阴天，有小雪或雨夹雪
风力：北风转南风 2-3 级，夜间南风转北风 1-2 级
预警：大雾黄色预警，能见度小于 1000 米

场景	核心价值	主要用途	典型应用	技术特点
信息提取器	非结构化 → 结构化	数据提取与转换	简历解析、合同分析、发票识别	单向转换、数据标准化
提示词增强	模糊意图 → 明确指令	意图理解与澄清	搜索优化、任务分解、参数提取	双向交互、意图明确化
Tool 联合使用	智能决策 + 精准执行	复杂任务自动化	智能助手、自动化工作流、数据分析	多步骤、工具协作

方式	优点	缺点	适用场景	返回类型
Pydantic BaseModel	类型安全、字段验证、IDE 支持好、可使用 Field	需要定义类	复杂数据结构、需要验证	Pydantic 对象
TypedDict	轻量级、返回纯字典、使用 Annotated 添加描述	类型检查较弱、不支持 Field	简单字典结构	dict
JSON Schema	标准化、跨语言、灵活、支持复杂嵌套	定义繁琐、无 IDE 提示	跨系统对接、复杂嵌套	dict
可选结构化输出	灵活性高、类型明确、易扩展、代码简洁	需要定义包装类	多类型动态响应	Pydantic 对象

LangChain 实战：工具调用与结构化输出

工具调用（Tool Calling）

1. Tool 创建的三种方式

1.1. 直接用 @tool 装饰函数

1.2. 用 @tool + 自定义参数结构（Pydantic）

1.3. 继承 BaseTool 写类

2. 本地自定义工具

2.1 定义工具

微信扫一扫，关注极客日志

更多推荐文章

相关免费在线工具

2.2 绑定工具到模型

2.3 工具调用流程

2.4 AI 响应结构解析

3. 第三方工具集成（Tavily 搜索）

3.1 集成第三方工具

3.2 多轮工具调用

3.3 实际输出示例

结构化输出（Structured Output）

1. Pydantic BaseModel（推荐）

1.1 定义 Pydantic 模型

1.2 绑定结构化输出

1.3 输出示例

2. TypedDict

2.1 定义 TypedDict

2.2 使用 TypedDict

2.3 输出示例

3. JSON Schema

3.1 定义 JSON Schema

3.2 使用 JSON Schema

3.3 输出示例

4. 可选结构化输出（动态类型选择）

4.1 应用场景

4.2 实现方式

4.3 使用示例

4.4 输出示例

4.5 关键特性

4.6 优势

结构化输出的三大实际应用场景

场景 1：作为信息提取器 - 将非结构化文本转化为结构化数据

核心价值

典型应用

实现示例：简历信息提取

场景 2：作为提示词增强 - 帮助 AI 更好理解用户意图

核心价值

典型应用

实现示例：搜索意图理解

场景 3：与 Tool 联合使用 - 结构化输出 + 工具调用的完美组合

核心价值

典型应用

实现示例：智能天气助手（工具调用 + 结构化输出整合）

三大场景对比总结

选择建议

四种结构化输出方式对比总结

参考资源

微信扫一扫，关注极客日志

更多推荐文章

相关免费在线工具

1.1. 直接用 `@tool` 装饰函数

1.2. 用 `@tool` + 自定义参数结构（Pydantic）

1.3. 继承 `BaseTool` 写类