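ReAct comes from the paper "ReAct: Synergizing Reasoning and Acting in Language Models". It proposes interleaving reasoning and acting within a language model to solve a diverse set of language reasoning and decision-making tasks. The authors evaluate ReAct on question answering (HotpotQA), fact verification (FEVER), a text-based game (ALFWorld), and web navigation (WebShop), and show advantages over existing methods in few-shot settings. A series of ablations and analyses examines how much acting matters in reasoning tasks and how much reasoning matters in interactive tasks. ReAct also yields a decision and reasoning process that is easier for humans to understand, diagnose, and control. Its typical flow, shown in the figure below, can be described as a loop: Thought → Action → Observation, or the TAO loop for short.
In my view, a good prompt needs a clear task description, complete descriptions of the input and the output, format requirements, and examples; for ReAct it also needs a scratchpad. Taking the question-answering task above as an example, the prompt is designed as follows. In each example, the Thought should name the entity to search for, the Action then extracts that entity directly, and the Observation gives the result that comes back. About 4 to 5 examples are used.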
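Solve a question answering task with interleaving Thought, Action, Observation steps. Thought can reason about the current situation, and Action must be one of the following three types:
(1) Search[entity], which searches the exact entity on Wikipedia and returns the first paragraph if it exists. If not, it will return some similar entities to search.
(2) Lookup[keyword], which returns the next sentence containing keyword in the last passage successfully found by Search.
(3) Finish[answer], which returns the answer and finishes the task.
You may take as many steps as necessary. Your responses must strictly follow the above formats; in particular, each Action must be one of the three types above.
Here are some examples:
Question: What is the elevation range for the area that the eastern sector of the Colorado orogeny extends into?
Thought 1: I need to search Colorado orogeny, find the area that the eastern sector of the Colorado orogeny extends into, then find the elevation range of the area.
Action 1: Search[Colorado orogeny]
Observation 1: The Colorado orogeny was an episode of mountain building (an orogeny) in Colorado and surrounding areas.
Thought 2: It does not mention the eastern sector. So I need to look up eastern sector.
...
(END OF EXAMPLES)
Question: {question}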
{scratchpad}
REACT_INSTRUCTION = """Solve a question answering task with interleaving Thought, Action, Observation steps. Thought can reason about the current situation, and Action must be three types:
(1) Search[entity], which searches the exact entity on Wikipedia and returns the first paragraph if it exists. If not, it will return some similar entities to search.
(2) Lookup[keyword], which returns the next sentence containing keyword in the last passage successfully found by Search.
(3) Finish[answer], which returns the answer and finishes the task.
You may take as many steps as necessary. Your responses MUST strictly follow the above formats; in particular, Action must be one of the three types.
Here are some examples:
{examples}
(END OF EXAMPLES)
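Question: {question}{scratchpad}"""

from langchain.prompts import PromptTemplate

# The plain ReAct prompt takes no reflections, so the template
# uses exactly the three variables declared here.
react_agent_prompt = PromptTemplate(
    input_variables=["examples", "question", "scratchpad"],
    template=REACT_INSTRUCTION,
)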
WEBTHINK_SIMPLE6 = """Question: What is the elevation range for the area that the eastern sector of the Colorado orogeny extends into?
Thought 1: I need to search Colorado orogeny, find the area that the eastern sector of the Colorado orogeny extends into, then find the elevation range of the area.
Action 1: Search[Colorado orogeny]
Observation 1: The Colorado orogeny was an episode of mountain building (an orogeny) in Colorado and surrounding areas.
....
"""
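How the instruction template, the few-shot examples, the question, and the scratchpad combine into one prompt can be sketched with plain string formatting. This is only an illustration: `TEMPLATE` and `EXAMPLES` are shortened, hypothetical stand-ins for `REACT_INSTRUCTION` and `WEBTHINK_SIMPLE6`, and `build_prompt` is a helper name introduced here, not part of the original code.

```python
# Shortened, hypothetical stand-ins for REACT_INSTRUCTION and WEBTHINK_SIMPLE6.
TEMPLATE = (
    "Solve a question answering task with interleaving Thought, Action, "
    "Observation steps.\n"
    "Here are some examples:\n"
    "{examples}\n"
    "(END OF EXAMPLES)\n"
    "Question: {question}{scratchpad}"
)

EXAMPLES = (
    "Question: ...\n"
    "Thought 1: ...\n"
    "Action 1: Search[Colorado orogeny]\n"
    "Observation 1: ..."
)


def build_prompt(question: str, scratchpad: str = "") -> str:
    """Fill the template; the scratchpad starts empty and grows with each step."""
    return TEMPLATE.format(
        examples=EXAMPLES, question=question, scratchpad=scratchpad
    )


# Appending "\nThought 1:" to the scratchpad asks the model to continue
# with its first thought.
prompt = build_prompt("Who created Wallace and Gromit?", "\nThought 1:")
print(prompt)
```

Because the scratchpad is the last placeholder, whatever was appended most recently is what the model continues from.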
Solve a question answering task with interleaving Thought, Action, Observation steps. Thought can reason about the current situation, and Action can be three types:
...
(END OF EXAMPLES)
Question: The creator of "Wallace and Gromit" also created what animation comedy that matched animated zoo animals with a soundtrack of people talking about their homes?
Thought 1:
Initialize docstore as DocstoreExplorer(docstore), where the wrapped docstore is LangChain's built-in tool for accessing Wikipedia.
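To make the Search and Lookup semantics concrete, here is a minimal in-memory sketch. `ToyDocstoreExplorer` and its page data are hypothetical and are not LangChain's implementation; the real tool queries Wikipedia, and the sentence splitting here is deliberately naive.

```python
from typing import Dict, List


class ToyDocstoreExplorer:
    """Hypothetical in-memory stand-in for a Wikipedia-backed explorer."""

    def __init__(self, pages: Dict[str, str]) -> None:
        self.pages = pages
        self.sentences: List[str] = []  # sentences of the last page found
        self.keyword = ""
        self.cursor = 0

    def search(self, entity: str) -> str:
        """Return the first paragraph of an exact-title match, else suggestions."""
        page = self.pages.get(entity)
        if page is None:
            similar = [t for t in self.pages if entity.lower() in t.lower()]
            return f"Could not find {entity}. Similar: {similar}"
        # Naive sentence split; good enough for the illustration.
        self.sentences = [s.strip() for s in page.split(". ") if s.strip()]
        self.keyword, self.cursor = "", 0
        return page.split("\n\n")[0]

    def lookup(self, keyword: str) -> str:
        """Return the next sentence of the last found page containing keyword."""
        if keyword != self.keyword:
            self.keyword, self.cursor = keyword, 0
        matches = [s for s in self.sentences if keyword.lower() in s.lower()]
        if self.cursor >= len(matches):
            return "No more results."
        result = matches[self.cursor]
        self.cursor += 1
        return result


explorer = ToyDocstoreExplorer({
    "Nick Park": "Nick Park is an English filmmaker. "
                 "He created Creature Comforts. "
                 "He also created Wallace and Gromit."
})
print(explorer.search("Nick Park"))
print(explorer.lookup("created"))
print(explorer.lookup("created"))
```

Repeating a Lookup with the same keyword advances to the next matching sentence, which is what "returns the next sentence containing keyword" means in the prompt.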
Solve a question answering task with interleaving Thought, Action, Observation steps. Thought can reason about the current situation, and Action can be three types:
...
(END OF EXAMPLES)
Question: The creator of "Wallace and Gromit" also created what animation comedy that matched animated zoo animals with a soundtrack of people talking about their homes?
Thought 1:
Thought 1: The creator of "Wallace and Gromit" is Nick Park. I need to search for other animation comedies by Nick Park that match this description.
Action 1:
Calling action = self.prompt_agent() assigns action the value
Search[Nick Park zoo animals talking about their homes]
The scratchpad is then updated to
Thought 1: The creator of "Wallace and Gromit" is Nick Park. I need to search for other animation comedies by Nick Park that match this description.
Action 1: Search[Nick Park zoo animals talking about their homes]
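Next, the regular expression pattern = r'^(\w+)\[(.+)\]$' extracts the action type, Search, together with the query string inside the square brackets. In the step method, an Action of type Search triggers a Wikipedia search. The implementation of the Wikipedia tool itself is not covered in detail here.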
Nicholas Wulstan Park (born 6 December 1958) is an English filmm...
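The action-parsing step can be sketched as follows. `parse_action` is a helper name introduced here for illustration. Note that the square brackets must be escaped in the pattern; left unescaped, `[(.+)]` is a character class that matches a single `(`, `.`, `+`, or `)` character.

```python
import re
from typing import Optional, Tuple

# Action type (word characters) followed by an argument in square brackets.
ACTION_PATTERN = re.compile(r'^(\w+)\[(.+)\]$')


def parse_action(text: str) -> Optional[Tuple[str, str]]:
    """Split 'Search[query]' into ('Search', 'query'); return None if malformed."""
    match = ACTION_PATTERN.match(text.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_action("Search[Nick Park zoo animals talking about their homes]"))
# ('Search', 'Nick Park zoo animals talking about their homes')
print(parse_action("Finish[Creature Comforts]"))
# ('Finish', 'Creature Comforts')
print(parse_action("I will search for Nick Park"))
# None
```

Returning None for malformed output lets the caller write an "invalid action" observation into the scratchpad instead of crashing.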
This completes one full step; the resulting scratchpad is
Thought 1: The creator of "Wallace and Gromit" is Nick Park. I need to search for other animation comedies by Nick Park that match this description.
Action 1: Search[Nick Park zoo animals talking about their homes]
Observation 1: Nicholas Wulstan Park (born 6 December 1958) is an English filmmaker and ...
3.7 Iterating ReAct
step is called in a loop until an exit condition is met. The final scratchpad is as follows
Thought 1: The creator of "Wallace and Gromit" is Nick Park. I need to search for other animation comedies by Nick Park that match this description.
Action 1: Search[Nick Park zoo animals talking about their homes]
Observation 1: Nicholas Wulstan Park (born 6 December 1958) is an English filmmaker ...
Thought 2: Nick Park also created Creature Comforts, which is the animation comedy that matched animated zoo animals with a soundtrack of people talking about their homes.
Action 2: Finish[Creature Comforts]
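The whole loop can be sketched end to end with a scripted stand-in for the model. This is an illustration under stated assumptions, not the original agent class: `run_react`, `make_scripted_llm`, the `tools` dictionary, and the `max_steps` limit are all hypothetical names, and the canned replies replace real LLM calls.

```python
import re
from typing import Callable, Dict, Iterable, Optional, Tuple

ACTION_RE = re.compile(r'^(\w+)\[(.+)\]$')


def make_scripted_llm(replies: Iterable[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a model call: returns canned replies in order."""
    iterator = iter(replies)
    return lambda prompt: next(iterator)


def run_react(question: str,
              llm: Callable[[str], str],
              tools: Dict[str, Callable[[str], str]],
              max_steps: int = 6) -> Tuple[Optional[str], str]:
    """Run Thought -> Action -> Observation until Finish or max_steps."""
    scratchpad = ""
    for n in range(1, max_steps + 1):
        # Thought: ask the model to continue after "Thought n:".
        scratchpad += f"\nThought {n}:"
        scratchpad += " " + llm(question + scratchpad).strip()
        # Action: ask again, now continuing after "Action n:".
        scratchpad += f"\nAction {n}:"
        action = llm(question + scratchpad).strip()
        scratchpad += " " + action
        match = ACTION_RE.match(action)
        if match and match.group(1) == "Finish":
            return match.group(2), scratchpad
        # Observation: run the tool, or report a malformed/unknown action.
        tool = tools.get(match.group(1)) if match else None
        observation = tool(match.group(2)) if tool else "Invalid Action."
        scratchpad += f"\nObservation {n}: {observation}"
    return None, scratchpad


llm = make_scripted_llm([
    'The creator of "Wallace and Gromit" is Nick Park. I need to search Nick Park.',
    "Search[Nick Park]",
    "Nick Park also created Creature Comforts.",
    "Finish[Creature Comforts]",
])
tools = {"Search": lambda entity: "Nicholas Wulstan Park is an English filmmaker ..."}
answer, scratchpad = run_react("Question: ...", llm, tools)
print(answer)
print(scratchpad)
```

The step limit is one possible exit condition besides Finish; without it, a model that never emits Finish would loop indefinitely.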