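import getpass
import os

# Enable LangSmith tracing and read the API key interactively so it is not hard-coded.
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = getpass.getpass("LangChain API key: ")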
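Documents
The Document object in LangChain is the basic unit for working with text. It has two main attributes:
page_content: a string holding the actual text.
metadata: a dictionary of metadata, which can store details such as the source, author, or date.
A single Document usually represents one chunk of a larger document. The following example creates several Document objects: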
from langchain_core.documents import Document

documents = [
    Document(
        page_content="Dogs are great companions, known for their loyalty and friendliness.",
        metadata={"source": "mammal-pets-doc", "type": "pet"},
    ),
    Document(
        page_content="Cats are independent pets that often enjoy their own space.",
        metadata={"source": "mammal-pets-doc", "type": "pet"},
    ),
    Document(
        page_content="Goldfish are popular pets for beginners, requiring relatively simple care.",
        metadata={"source": "fish-pets-doc", "type": "pet"},
    ),
    Document(
        page_content="Parrots are intelligent birds capable of mimicking human speech.",
        metadata={"source": "bird-pets-doc", "type": "pet"},
    ),
    Document(
        page_content="Rabbits are social animals that need plenty of space to hop around.",
        metadata={"source": "mammal-pets-doc", "type": "pet"},
    ),
]
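To see how the two attributes are used once documents exist, here is a minimal sketch that groups page contents by the "source" key in their metadata. The helper name group_by_source and the "unknown" default are assumptions made for illustration and are not part of LangChain. The fallback dataclass is only there so the sketch runs without LangChain installed.

```python
from collections import defaultdict

try:
    from langchain_core.documents import Document
except ImportError:
    # Stand-in for illustration only (an assumption): mirrors the two
    # attributes described above so the sketch runs without LangChain.
    from dataclasses import dataclass, field

    @dataclass
    class Document:
        page_content: str
        metadata: dict = field(default_factory=dict)


def group_by_source(docs):
    """Hypothetical helper: map each metadata["source"] to its page contents."""
    grouped = defaultdict(list)
    for doc in docs:
        # Documents without a "source" key are collected under "unknown".
        source = doc.metadata.get("source", "unknown")
        grouped[source].append(doc.page_content)
    return dict(grouped)


sample_docs = [
    Document(
        page_content="Dogs are great companions, known for their loyalty and friendliness.",
        metadata={"source": "mammal-pets-doc", "type": "pet"},
    ),
    Document(
        page_content="Goldfish are popular pets for beginners, requiring relatively simple care.",
        metadata={"source": "fish-pets-doc", "type": "pet"},
    ),
    Document(
        page_content="Rabbits are social animals that need plenty of space to hop around.",
        metadata={"source": "mammal-pets-doc", "type": "pet"},
    ),
]

grouped = group_by_source(sample_docs)
for source, texts in grouped.items():
    print(f"{source}: {len(texts)} document(s)")
```

Reading metadata with .get rather than direct indexing keeps the helper tolerant of documents that lack a "source" entry, since metadata is a free-form dictionary.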