Agent Study Notes: A Complete Guide to OpenAI Function Calling

Table of contents
What is function calling
Some common questions
JSON mode
How to use OpenAI function calling
A simple weather-query example
Executing SQL via function calling
Building an agent with function calling: automatically retrieving and analyzing arXiv papers
Function calling via open source LLMs
What the LLM goes through when it performs function calling
References

Before we begin

When we talk about LLM-based agents, what are we actually talking about? According to Lilian Weng's article "LLM Powered Autonomous Agents", an agent needs several core capabilities: planning, memory, and tool use. Progress on the tool-use front in particular, driven by the function calling feature OpenAI exposes in its API, has opened up new possibilities.

LLM Powered Autonomous Agents


OpenAI function calling is the basic mechanism by which an agent interacts with external tools, and it is an essential skill for every practitioner. As the technology matures, what we want is not just an LLM we can chat with, but an intelligent partner that helps us use all kinds of tools and make decisions.

One thing worth pointing out: OpenAI has recently deprecated "functions" in the Chat Completions API in favor of "tools". This change is meant to broaden the range of functionality that can be integrated with the LLM and to pave the way for more complex interaction patterns, such as multi-agent systems in which agents interact with one another.




Even so, out of habit this article will keep using the term "function calling" to describe OpenAI's tool-use feature, since that is the name everyone already knows.

Overview of the core content

  1. What function calling is: what it means and what role it plays in how an agent works.
  2. OpenAI Cookbook examples: hands-on function calling examples that show how it is used in practice.
  3. Tool use with open-source LLMs: how tool use can be implemented on open-source large language models, and what the LLM actually goes through when it uses a tool.
  4. ~~Evaluation and training: how to evaluate the tool-use ability of open-source models, and how to train an LLM to use tools effectively.~~

Since these notes can't keep up with how fast things are changing, the fourth part will be written up separately.

What is function calling

In one sentence: function calling is not, strictly speaking, the tool call itself. It is the prelude to the tool call: it steers the LLM toward a more structured output that supplies the arguments for a concrete function to be executed locally.

Concretely, function calling lets the LLM pick a specific function to call, together with the arguments for it, as part of its response. This enables code reuse and modularity, and it yields more reliable, structured data from the model. In the API call, the developer describes the functions that are available and lets the model intelligently choose to output a JSON object containing the required arguments. The Chat Completions API itself never executes any function; it only generates the JSON from which the developer's own code can make the actual call.

Function calling has a wide range of applications, for example:

  • Smart assistants: answering questions by calling external APIs.
  • Instruction conversion: turning natural-language instructions into API calls.
  • Data extraction: pulling structured data out of text.

The function calling workflow runs from defining a set of functions, to having the model generate a JSON string that follows your custom schema, to parsing that string in your own code and invoking the corresponding function. This chain automates the interaction while keeping execution in your code, where you can validate and constrain what actually runs. A minimal sketch of the whole loop is shown below.
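The sketch assumes an OPENAI_API_KEY is set in the environment and uses a dummy get_current_weather as the local function; the weather example later in this article walks through the same loop in more detail.

import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def get_current_weather(location, format="celsius"):
    # Dummy local implementation; a real one would call an actual weather API.
    return json.dumps({"location": location, "temperature": "22", "format": format})

tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"},
                "format": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather like in Shanghai?"}]

# 1. The model decides whether a tool is needed and, if so, emits the name plus JSON arguments.
response = client.chat.completions.create(model="gpt-3.5-turbo", messages=messages, tools=tools)
tool_call = response.choices[0].message.tool_calls[0]  # assumes the model chose to call the tool

# 2. Our own code, not the API, executes the function with the parsed arguments.
result = get_current_weather(**json.loads(tool_call.function.arguments))

# 3. The result goes back as a "tool" message so the model can produce the final answer.
messages.append(response.choices[0].message)
messages.append({"role": "tool", "tool_call_id": tool_call.id, "name": tool_call.function.name, "content": result})
final = client.chat.completions.create(model="gpt-3.5-turbo", messages=messages, tools=tools)
print(final.choices[0].message.content)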

Some common questions

JSON mode

What is the relationship between JSON mode and tool use? If we have JSON mode, do we still need tool use?

At its core, JSON mode mostly amounts to adding something like "please respond in JSON format" to the system prompt, plus a JSON validity check and format conversion on the LLM's output. To use it, you simply add response_format={ "type": "json_object" } to client.chat.completions.create.

So when is JSON mode useful? Typically for text and content extraction. Take a RAG scenario as an example: when we want the LLM to rewrite the user's query for us, we want it to return the rewritten query as clean JSON that can be used directly, rather than output like this:

"""
Sure, here is my rewritten query:
```
real-rewrite-query
```
"""

which is padded with the polite filler the model likes to produce, so we then have to fish out the content we actually want with regular expressions or similar tricks. With JSON mode the model outputs exactly what we need, and the extra extraction step disappears. Before JSON mode existed we also tried handling this with tool use, but that was overkill.

Clearly, the real power of tool use is not output formatting: it is letting the model choose, from the tools you provide, which one it needs to call.
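As a minimal sketch of that query-rewriting use case, assuming a model that supports JSON mode (e.g. gpt-3.5-turbo-1106 or later) and the OpenAI client created as in the next section:

import json

completion = client.chat.completions.create(
    model="gpt-3.5-turbo-1106",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "Rewrite the user's query for retrieval. Respond in JSON with a single key 'rewritten_query'."},
        {"role": "user", "content": "how do i make my python web thing faster??"},
    ],
)

# The content is guaranteed to be valid JSON, so it can be parsed directly.
rewritten = json.loads(completion.choices[0].message.content)["rewritten_query"]
print(rewritten)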

How to use OpenAI function calling

A simple weather-query example

Environment setup

First, install a few necessary Python libraries. They let us talk to the OpenAI API and cover some auxiliary functionality.

!pip install scipy --quiet
!pip install tenacity --quiet
!pip install tiktoken --quiet
!pip install termcolor --quiet
!pip install openai --quiet

os.environ["OPENAI_API_KEY"] = "..."
from openai import OpenAI
from tenacity import retry, wait_random_exponential, stop_after_attempt
from termcolor import colored

client = OpenAI()

Utility functions

Next, we define some utility functions that make it easy to send requests to the Chat Completions API and to manage and track the conversation state.

@retry(wait=wait_random_exponential(multiplier=1, max=40), stop=stop_after_attempt(3))
def chat_completion_request(messages, tools=None, tool_choice=None, model="gpt-3.5-turbo"):
    try:
        response = client.chat.completions.create(
            model=model,
            messages=messages,
            tools=tools,
            tool_choice=tool_choice,
        )
        return response
    except Exception as e:
        print("Unable to generate ChatCompletion response")
        print(f"Exception: {e}")
        return e


def pretty_print_conversation(messages):
    role_to_color = {
        "system": "red",
        "user": "green",
        "assistant": "blue",
        "function": "magenta",
    }

    for message in messages:
        if message["role"] == "system":
            print(colored(f"system: {message['content']}\n", role_to_color[message["role"]]))
        elif message["role"] == "user":
            print(colored(f"user: {message['content']}\n", role_to_color[message["role"]]))
        elif message["role"] == "assistant" and message.get("function_call"):
            print(colored(f"assistant: {message['function_call']}\n", role_to_color[message["role"]]))
        elif message["role"] == "assistant" and not message.get("function_call"):
            print(colored(f"assistant: {message['content']}\n", role_to_color[message["role"]]))
        elif message["role"] == "function":
            print(colored(f"function ({message['name']}): {message['content']}\n", role_to_color[message["role"]]))

Function specifications

We also need to define the function specifications that describe the interface to our hypothetical weather API.

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "format": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The temperature unit to use. Infer this from the users location.",
                    },
                },
                "required": ["location", "format"],
            },
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_n_day_weather_forecast",
            "description": "Get an N-day weather forecast",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "format": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The temperature unit to use. Infer this from the users location.",
                    },
                    "num_days": {
                        "type": "integer",
                        "description": "The number of days to forecast",
                    }
                },
                "required": ["location", "format", "num_days"]
            },
        }
    },
]

Conversation example

When we ask about the current weather, the LLM asks us to clarify missing parameter information such as the location:

messages = []
messages.append({"role": "system", "content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous."})
messages.append({"role": "user", "content": "What's the weather like today"})
chat_response = chat_completion_request(
    messages, tools=tools
)
assistant_message = chat_response.choices[0].message
messages.append(assistant_message)
assistant_message

output:

ChatCompletionMessage(content='Sure, may I know your current location?', role='assistant', function_call=None, tool_calls=None)

Once the missing information is supplied, the LLM generates the appropriate function arguments:

messages.append({"role": "user", "content": "I'm in Shanghai, China."})
chat_response = chat_completion_request(
    messages, tools=tools
)
assistant_message = chat_response.choices[0].message
messages.append(assistant_message)
assistant_message

output:

ChatCompletionMessage(content=None, role='assistant', function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='call_VdqOOMp9pagf5ho39Y2HmYV4', function=Function(arguments='{\n  "location": "San Francisco, CA",\n  "format": "celsius",\n  "num_days": 4\n}', name='get_n_day_weather_forecast'), type='function')])

The full flow looks like this:

messages = []
messages.append({"role": "system", "content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous."})
messages.append({"role": "user", "content": "what is the weather going to be like in Glasgow, Scotland over the next x days"})
messages.append({"role": "user", "content": "in 5 days"})

chat_response = chat_completion_request(
    messages, tools=tools
)
assistant_message = chat_response.choices[0].message
messages.append(assistant_message)
assistant_message

Of course, we can also force the model to use a specific function:

messages = []
messages.append({"role": "system", "content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous."})
messages.append({"role": "user", "content": "Give me a weather report for Toronto, Canada."})
chat_response = chat_completion_request(
    messages, tools=tools, tool_choice={"type": "function", "function": {"name": "get_n_day_weather_forecast"}}
)
chat_response.choices[0].message

output:

ChatCompletionMessage(content=None, role='assistant', function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='call_0ldecDpV8Vdq8mGPoUewlue3', function=Function(arguments='{\n  "location": "Toronto, Canada",\n  "format": "celsius",\n  "num_days": 1\n}', name='get_n_day_weather_forecast'), type='function')])

Parallel function calling

Certain models, such as gpt-4-turbo-preview, gpt-4-0125-preview, gpt-4-1106-preview, gpt-3.5-turbo-0125, and gpt-3.5-turbo-1106, support parallel function calling, which lets the model request multiple function calls in a single turn; a sketch of how to execute all of the returned calls and reply to the model follows the example output below.

messages = []
messages.append({"role": "system", "content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous."})
messages.append({"role": "user", "content": "what is the weather going to be like in San Francisco and Glasgow over the next 4 days"})
chat_response = chat_completion_request(
    messages, tools=tools
)

assistant_message = chat_response.choices[0].message.tool_calls
assistant_message

output:

[ChatCompletionMessageToolCall(id='call_tfl8eTCW64sHvHjiiatoYzku', function=Function(arguments='{"location": "San Francisco, CA", "format": "fahrenheit", "num_days": 4}', name='get_n_day_weather_forecast'), type='function'),
 ChatCompletionMessageToolCall(id='call_bAqj55RygP2Y1T85RHqgskku', function=Function(arguments='{"location": "Glasgow, UK", "format": "celsius", "num_days": 4}', name='get_n_day_weather_forecast'), type='function')]
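To answer a turn like this, each requested call has to be executed and given its own "tool" message, matched by tool_call_id, before asking the model for its final reply. A minimal sketch, assuming json is imported and the dummy implementations defined in the next subsection are collected in the available_functions dict:

assistant_message = chat_response.choices[0].message
messages.append(assistant_message)

# Execute every requested call and append one "tool" message per tool_call_id.
for tool_call in assistant_message.tool_calls:
    function_name = tool_call.function.name
    arguments = json.loads(tool_call.function.arguments)
    result = available_functions[function_name](**arguments)
    messages.append({
        "role": "tool",
        "tool_call_id": tool_call.id,
        "name": function_name,
        "content": result,
    })

final_response = chat_completion_request(messages, tools=tools)
print(final_response.choices[0].message.content)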

Executing the function locally

First we build two dummy functions for the demo:

import json

def get_current_weather(location, format="fahrenheit"):
    """
    Simulates getting the current weather for a given location.
    The response is hardcoded for demonstration purposes.

    Args:
        location (str): The city and state, e.g., San Francisco, CA.
        format (str, optional): The temperature unit to use. Defaults to "fahrenheit".

    Returns:
        str: JSON string with the current weather data.
    """
    if "tokyo" in location.lower():
        return json.dumps({"location": "Tokyo", "temperature": "10", "format": format, "description": "Partly Cloudy"})
    elif "san francisco" in location.lower():
        return json.dumps({"location": "San Francisco", "temperature": "72", "format": format, "description": "Sunny"})
    elif "paris" in location.lower():
        return json.dumps({"location": "Paris", "temperature": "22", "format": format, "description": "Rainy"})
    else:
        return json.dumps({"location": location, "temperature": "unknown", "format": format, "description": "Data Unavailable"})

def get_n_day_weather_forecast(location, num_days, format="fahrenheit"):
    """
    Simulates getting an N-day weather forecast for a given location.
    The response is hardcoded for demonstration purposes.

    Args:
        location (str): The city and state, e.g., San Francisco, CA.
        num_days (int): The number of days to forecast.
        format (str, optional): The temperature unit to use. Defaults to "fahrenheit".

    Returns:
        str: JSON string with the N-day weather forecast data.
    """
    # This example just returns a fixed response regardless of the input.
    # In a real scenario, the response would depend on the location, num_days, and format.
    forecast = [
        {"day": 1, "temperature": "22", "format": format, "description": "Sunny"},
        {"day": 2, "temperature": "18", "format": format, "description": "Cloudy"},
        {"day": 3, "temperature": "15", "format": format, "description": "Rainy"}
    ]

    # Return only the forecast for the requested number of days.
    return json.dumps(forecast[:num_days])

available_functions = {
    "get_current_weather": get_current_weather,
    "get_n_day_weather_forecast": get_n_day_weather_forecast,
}

Try asking for the weather:

messages = []
messages.append({"role": "user", "content": "What's the weather like in Tokyo!"})
chat_response = chat_completion_request(messages, tools=tools, tool_choice="auto")
assistant_message = chat_response.choices[0].message
assistant_message = json.loads(assistant_message.model_dump_json())
assistant_message["content"] = str(assistant_message["tool_calls"][0]["function"])

#a temporary patch but this should be handled differently
# remove "function_call" from assistant message
del assistant_message["function_call"]

assistant_message

"""
{'content': '{\'arguments\': \'{\\n  "location": "Tokyo",\\n  "format": "celsius"\\n}\', \'name\': \'get_current_weather\'}',
 'role': 'assistant',
 'tool_calls': [{'id': 'call_Tz8S1HgvnaBzf6CFZP1u4d1J',
   'function': {'arguments': '{\n  "location": "Tokyo",\n  "format": "celsius"\n}',
    'name': 'get_current_weather'},
   'type': 'function'}]}

"""
messages.append(assistant_message)

As you can see, the LLM returned the name of the function to call plus its arguments in JSON format (really a string). With that information we can now call the function defined earlier:

# get the weather information to pass back to the model
function_name_to_call = assistant_message['tool_calls'][0]['function']['name']
function_arguments = json.loads(assistant_message['tool_calls'][0]['function']['arguments'])
weather = available_functions[function_name_to_call](**function_arguments)

Send the function's result, together with the conversation history, back to the LLM:

messages.append({"role": "tool",
                 "tool_call_id": assistant_message["tool_calls"][0]["id"],
                 "name": assistant_message["tool_calls"][0]["function"]["name"],
                 "content": weather})

messages
"""
[{'role': 'user', 'content': "What's the weather like in Tokyo!"},
 {'content': '{\'arguments\': \'{\\n  "location": "Tokyo",\\n  "format": "celsius"\\n}\', \'name\': \'get_current_weather\'}',
  'role': 'assistant',
  'tool_calls': [{'id': 'call_Tz8S1HgvnaBzf6CFZP1u4d1J',
    'function': {'arguments': '{\n  "location": "Tokyo",\n  "format": "celsius"\n}',
     'name': 'get_current_weather'},
    'type': 'function'}]},
 {'role': 'tool',
  'tool_call_id': 'call_Tz8S1HgvnaBzf6CFZP1u4d1J',
  'name': 'get_current_weather',
  'content': '{"location": "Tokyo", "temperature": "10", "format": "fahrenheit"}'}]

"""

Final output (for the user):

final_response = chat_completion_request(messages, tools=tools)
final_response.choices[0].message.content

output:

'The current weather in Tokyo is partly cloudy with a temperature of 10°C (50°F).'

Executing SQL via function calling

Download the SQLite database used for the demo:

!wget https://www.sqlitetutorial.net/wp-content/uploads/2018/03/chinook.zip -O chinook.zip
!unzip chinook.zip

Connect to the database and define a few helper functions:

import sqlite3

conn = sqlite3.connect("/content/chinook.db")
print("Opened database successfully")

def get_table_names(conn):
    """Return a list of table names."""
    table_names = []
    tables = conn.execute("SELECT name FROM sqlite_master WHERE type='table';")
    for table in tables.fetchall():
        table_names.append(table[0])
    return table_names


def get_column_names(conn, table_name):
    """Return a list of column names."""
    column_names = []
    columns = conn.execute(f"PRAGMA table_info('{table_name}');").fetchall()
    for col in columns:
        column_names.append(col[1])
    return column_names


def get_database_info(conn):
    """Return a list of dicts containing the table name and columns for each table in the database."""
    table_dicts = []
    for table_name in get_table_names(conn):
        columns_names = get_column_names(conn, table_name)
        table_dicts.append({"table_name": table_name, "column_names": columns_names})
    return table_dicts

Get the database schema:

database_schema_dict = get_database_info(conn)
database_schema_string = "\n".join(
    [
        f"Table: {table['table_name']}\nColumns: {', '.join(table['column_names'])}"
        for table in database_schema_dict
    ]
)

Define the tools list:

tools = [
    {
        "type": "function",
        "function": {
            "name": "ask_database",
            "description": "Use this function to answer user questions about music. Input should be a fully formed SQL query.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": f"""
                                SQL query extracting info to answer the user's question.
                                SQL should be written using this database schema:
                                {database_schema_string}
                                The query should be returned in plain text, not in JSON.
                                """,
                    }
                },
                "required": ["query"],
            },
        }
    }
]

Define the function that executes the SQL:

def ask_database(conn, query):
    """Function to query SQLite database with a provided SQL query."""
    try:
        results = str(conn.execute(query).fetchall())
    except Exception as e:
        results = f"query failed with error: {e}"
    return results

def execute_function_call(message):
    if message.tool_calls[0].function.name == "ask_database":
        query = json.loads(message.tool_calls[0].function.arguments)["query"]
        results = ask_database(conn, query)
    else:
        results = f"Error: function {message.tool_calls[0].function.name} does not exist"
    return results

messages = []
messages.append({"role": "system", "content": "Answer user questions by generating SQL queries against the Chinook Music Database."})
messages.append({"role": "user", "content": "Hi, who are the top 5 artists by number of tracks?"})
chat_response = chat_completion_request(messages, tools)
assistant_message = chat_response.choices[0].message
messages.append(assistant_message)

if assistant_message.tool_calls:
    results = execute_function_call(assistant_message)
    messages.append({"role": "tool", "tool_call_id": assistant_message.tool_calls[0].id, "name": assistant_message.tool_calls[0].function.name, "content": results})

The messages list at this point:

[{'role': 'system',
  'content': 'Answer user questions by generating SQL queries against the Chinook Music Database.'},
 {'role': 'user',
  'content': 'Hi, who are the top 5 artists by number of tracks?'},
 ChatCompletionMessage(content='Function(arguments=\'{\\n  "query": "SELECT artists.Name, COUNT(tracks.TrackId) AS num_tracks FROM artists JOIN albums ON artists.ArtistId = albums.ArtistId JOIN tracks ON albums.AlbumId = tracks.AlbumId GROUP BY artists.ArtistId ORDER BY num_tracks DESC LIMIT 5"\\n}\', name=\'ask_database\')', role='assistant', function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='call_vSLysEQncbGvMgXGhttIow6v', function=Function(arguments='{\n  "query": "SELECT artists.Name, COUNT(tracks.TrackId) AS num_tracks FROM artists JOIN albums ON artists.ArtistId = albums.ArtistId JOIN tracks ON albums.AlbumId = tracks.AlbumId GROUP BY artists.ArtistId ORDER BY num_tracks DESC LIMIT 5"\n}', name='ask_database'), type='function')]),
 {'role': 'tool',
  'tool_call_id': 'call_vSLysEQncbGvMgXGhttIow6v',
  'name': 'ask_database',
  'content': "[('Iron Maiden', 213), ('U2', 135), ('Led Zeppelin', 114), ('Metallica', 112), ('Deep Purple', 92)]"}]

Final output:

final_response = chat_completion_request(messages, tools=tools)
final_response.choices[0].message.content
# output
"""
The top 5 artists by number of tracks are:
1. Iron Maiden - 213 tracks
2. U2 - 135 tracks
3. Led Zeppelin - 114 tracks
4. Metallica - 112 tracks
5. Deep Purple - 92 tracks
"""

Building an agent with function calling: automatically retrieving and analyzing arXiv papers

This section shows how to build an agent that can search arXiv for papers, download them, and analyze and summarize them. The agent not only helps users quickly catch up on the latest research in a given area, it can also dig into and summarize the core content of a selected paper.

Core agent capabilities

  • Fetching arXiv papers: get_articles

The agent searches arXiv for papers on a given topic via the arxiv library and returns brief summaries and links for the user.

  • Reading and summarizing a paper: read_article_and_summarize

Using the PyPDF2 library to read the PDF of the selected paper, the agent distills the paper's main arguments, supporting evidence, and conclusions.

Environment setup

!pip install scipy --quiet
!pip install tenacity --quiet
!pip install tiktoken==0.3.3 --quiet
!pip install termcolor --quiet
!pip install openai --quiet
!pip install arxiv --quiet
!pip install pandas --quiet
!pip install PyPDF2 --quiet
!pip install tqdm --quiet
import arxiv
import ast
import concurrent.futures
import json
import os
import pandas as pd
import tiktoken
from csv import writer
from IPython.display import display, Markdown, Latex
from openai import OpenAI
from PyPDF2 import PdfReader
from scipy import spatial
from tenacity import retry, wait_random_exponential, stop_after_attempt
from tqdm import tqdm
from termcolor import colored

GPT_MODEL = "gpt-3.5-turbo-0613"
EMBEDDING_MODEL = "text-embedding-ada-002"
client = OpenAI()

All downloaded papers are stored locally in ./data/papers, and the details of each paper (including its embedding vector) are recorded in arxiv_library.csv.

directory = './data/papers'

# Check if the directory already exists
if not os.path.exists(directory):
    # If the directory doesn't exist, create it and any necessary intermediate directories
    os.makedirs(directory)
    print(f"Directory '{directory}' created successfully.")
else:
    # If the directory already exists, print a message indicating it
    print(f"Directory '{directory}' already exists.")

# Set a directory to store downloaded papers
data_dir = os.path.join(os.curdir, "data", "papers")
paper_dir_filepath = "./data/arxiv_library.csv"

# Generate a blank dataframe where we can store downloaded files
df = pd.DataFrame(list())
df.to_csv(paper_dir_filepath)

We will define some utility functions for:

  1. Fetching and storing papers: get_articles searches for papers on a topic, downloads them, and records the key metadata plus an embedding vector.
  2. Paper selection and text extraction: based on the user's query, the most relevant paper is chosen by embedding similarity and its text is extracted.
  3. Chunking and summarizing: the long text is split into smaller chunks, each of which is summarized independently.
  4. Combining summaries: the per-chunk summaries are rolled up into one comprehensive summary that better answers the user's query.

@retry(wait=wait_random_exponential(min=1, max=40), stop=stop_after_attempt(3))
def embedding_request(text):
    response = client.embeddings.create(input=text, model=EMBEDDING_MODEL)
    return response


@retry(wait=wait_random_exponential(min=1, max=40), stop=stop_after_attempt(3))
def get_articles(query, library=paper_dir_filepath, top_k=5):
    """This function gets the top_k articles based on a user's query, sorted by relevance.
    It also downloads the files and stores them in arxiv_library.csv to be retrieved by the read_article_and_summarize.
    """
    client = arxiv.Client()
    search = arxiv.Search(
        query = query,
        max_results = top_k,
        sort_by = arxiv.SortCriterion.SubmittedDate
    )
    result_list = []
    for result in client.results(search):
        result_dict = {}
        result_dict.update({"title": result.title})
        result_dict.update({"summary": result.summary})

        # Taking the first url provided
        result_dict.update({"article_url": [x.href for x in result.links][0]})
        result_dict.update({"pdf_url": [x.href for x in result.links][1]})
        result_list.append(result_dict)

        # Store references in library file
        response = embedding_request(text=result.title)
        file_reference = [
            result.title,
            result.download_pdf(data_dir),
            response.data[0].embedding,
        ]

        # Write to file
        with open(library, "a") as f_object:
            writer_object = writer(f_object)
            writer_object.writerow(file_reference)
            f_object.close()
    return result_list

def strings_ranked_by_relatedness(
    query: str,
    df: pd.DataFrame,
    relatedness_fn=lambda x, y: 1 - spatial.distance.cosine(x, y),
    top_n: int = 100,
) -> list[str]:
    """Returns a list of strings and relatednesses, sorted from most related to least."""
    query_embedding_response = embedding_request(query)
    query_embedding = query_embedding_response.data[0].embedding
    strings_and_relatednesses = [
        (row["filepath"], relatedness_fn(query_embedding, row["embedding"]))
        for i, row in df.iterrows()
    ]
    strings_and_relatednesses.sort(key=lambda x: x[1], reverse=True)
    strings, relatednesses = zip(*strings_and_relatednesses)
    return strings[:top_n]

def read_pdf(filepath):
    """Takes a filepath to a PDF and returns a string of the PDF's contents"""
    # creating a pdf reader object
    reader = PdfReader(filepath)
    pdf_text = ""
    page_number = 0
    for page in reader.pages:
        page_number += 1
        pdf_text += page.extract_text() + f"\nPage Number: {page_number}"
    return pdf_text


# Split a text into smaller chunks of size n, preferably ending at the end of a sentence
def create_chunks(text, n, tokenizer):
    """Returns successive n-sized chunks from provided text."""
    tokens = tokenizer.encode(text)
    i = 0
    while i < len(tokens):
        # Find the nearest end of sentence within a range of 0.5 * n and 1.5 * n tokens
        j = min(i + int(1.5 * n), len(tokens))
        while j > i + int(0.5 * n):
            # Decode the tokens and check for full stop or newline
            chunk = tokenizer.decode(tokens[i:j])
            if chunk.endswith(".") or chunk.endswith("\n"):
                break
            j -= 1
        # If no end of sentence found, use n tokens as the chunk size
        if j == i + int(0.5 * n):
            j = min(i + n, len(tokens))
        yield tokens[i:j]
        i = j


def extract_chunk(content, template_prompt):
    """This function applies a prompt to some input content. In this case it returns a summarized chunk of text"""
    prompt = template_prompt + content
    response = client.chat.completions.create(
        model=GPT_MODEL, messages=[{"role": "user", "content": prompt}], temperature=0
    )
    return response.choices[0].message.content


def summarize_text(query):
    """This function does the following:
    - Reads in the arxiv_library.csv file in including the embeddings
    - Finds the closest file to the user's query
    - Scrapes the text out of the file and chunks it
    - Summarizes each chunk in parallel
    - Does one final summary and returns this to the user"""

    # A prompt to dictate how the recursive summarizations should approach the input paper
    summary_prompt = """Summarize this text from an academic paper. Extract any key points with reasoning.\n\nContent:"""

    # If the library is empty (no searches have been performed yet), we perform one and download the results
    library_df = pd.read_csv(paper_dir_filepath).reset_index()
    if len(library_df) == 0:
        print("No papers searched yet, downloading first.")
        get_articles(query)
        print("Papers downloaded, continuing")
        library_df = pd.read_csv(paper_dir_filepath).reset_index()
    library_df.columns = ["title", "filepath", "embedding"]
    library_df["embedding"] = library_df["embedding"].apply(ast.literal_eval)
    strings = strings_ranked_by_relatedness(query, library_df, top_n=1)
    print("Chunking text from paper")
    pdf_text = read_pdf(strings[0])

    # Initialise tokenizer
    tokenizer = tiktoken.get_encoding("cl100k_base")
    results = ""

    # Chunk up the document into 1500 token chunks
    chunks = create_chunks(pdf_text, 1500, tokenizer)
    text_chunks = [tokenizer.decode(chunk) for chunk in chunks]
    print("Summarizing each chunk of text")

    # Parallel process the summaries
    with concurrent.futures.ThreadPoolExecutor(
        max_workers=len(text_chunks)
    ) as executor:
        futures = [
            executor.submit(extract_chunk, chunk, summary_prompt)
            for chunk in text_chunks
        ]
        with tqdm(total=len(text_chunks)) as pbar:
            for _ in concurrent.futures.as_completed(futures):
                pbar.update(1)
        for future in futures:
            data = future.result()
            results += data

    # Final summary
    print("Summarizing into overall summary")
    response = client.chat.completions.create(
        model=GPT_MODEL,
        messages=[
            {
                "role": "user",
                "content": f"""Write a summary collated from this collection of key points extracted from an academic paper.
                        The summary should highlight the core argument, conclusions and evidence, and answer the user's query.
                        User query: {query}
                        The summary should be structured in bulleted lists following the headings Core Argument, Evidence, and Conclusions.
                        Key points:\n{results}\nSummary:\n""",
            }
        ],
        temperature=0,
    )
    return response

Implement a Conversation class to support multi-turn conversations with the API:

@retry(wait=wait_random_exponential(min=1, max=40), stop=stop_after_attempt(3))
def chat_completion_request(messages, functions=None, model=GPT_MODEL):
    try:
        response = client.chat.completions.create(
            model=model,
            messages=messages,
            functions=functions,
        )
        return response
    except Exception as e:
        print("Unable to generate ChatCompletion response")
        print(f"Exception: {e}")
        return e


class Conversation:
    def __init__(self):
        self.conversation_history = []

    def add_message(self, role, content):
        message = {"role": role, "content": content}
        self.conversation_history.append(message)

    def display_conversation(self, detailed=False):
        role_to_color = {
            "system": "red",
            "user": "green",
            "assistant": "blue",
            "function": "magenta",
        }
        for message in self.conversation_history:
            print(
                colored(
                    f"{message['role']}: {message['content']}\n\n",
                    role_to_color[message["role"]],
                )
            )

With the groundwork done, we get to the core of the agent.

Define the tools list:

arxiv_functions = [
    {
        "name": "get_articles",
        "description": """Use this function to get academic papers from arXiv to answer user questions.""",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": f"""
                            User query in JSON. Responses should be summarized and should include the article URL reference
                            """,
                }
            },
            "required": ["query"],
        },
    },
    {
        "name": "read_article_and_summarize",
        "description": """Use this function to read whole papers and provide a summary for users.
        You should NEVER call this function before get_articles has been called in the conversation.""",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": f"""
                            Description of the article in plain text based on the user's query
                            """,
                }
            },
            "required": ["query"],
        },
    }
]

Define the function that executes the tool calls:

def chat_completion_with_function_execution(messages, functions=None):
    """This function makes a ChatCompletion API call with the option of adding functions"""
    response = chat_completion_request(messages, functions)
    full_message = response.choices[0]
    if full_message.finish_reason == "function_call":
        print(f"Function generation requested, calling function")
        return call_arxiv_function(messages, full_message)
    else:
        print(f"Function not required, responding to user")
        return response


def call_arxiv_function(messages, full_message):
    """Function calling function which executes function calls when the model believes it is necessary.
    Currently extended by adding clauses to this if statement."""

    if full_message.message.function_call.name == "get_articles":
        try:
            parsed_output = json.loads(
                full_message.message.function_call.arguments
            )
            print("Getting search results")
            results = get_articles(parsed_output["query"])
        except Exception as e:
            print(parsed_output)
            print(f"Function execution failed")
            print(f"Error message: {e}")
        messages.append(
            {
                "role": "function",
                "name": full_message.message.function_call.name,
                "content": str(results),
            }
        )
        try:
            print("Got search results, summarizing content")
            response = chat_completion_request(messages)
            return response
        except Exception as e:
            print(type(e))
            raise Exception("Function chat request failed")

    elif (
        full_message.message.function_call.name == "read_article_and_summarize"
    ):
        parsed_output = json.loads(
            full_message.message.function_call.arguments
        )
        print("Finding and reading paper")
        summary = summarize_text(parsed_output["query"])
        return summary

    else:
        raise Exception("Function does not exist and cannot be called")

Start the arXiv conversation with a system message:

paper_system_message = """You are arXivGPT, a helpful assistant that pulls academic papers to answer user questions.
You summarize the papers clearly so the customer can decide which to read to answer their question.
You always provide the article_url and title so the user can understand the name of the paper and click through to access it.
Begin!"""
paper_conversation = Conversation()
paper_conversation.add_message("system", paper_system_message)
# Add a user message
paper_conversation.add_message("user", "Hi, how does PPO reinforcement learning work?")
chat_response = chat_completion_with_function_execution(
    paper_conversation.conversation_history, functions=arxiv_functions
)
assistant_message = chat_response.choices[0].message.content
paper_conversation.add_message("assistant", assistant_message)
display(Markdown(assistant_message))

output:

Function generation requested, calling function
Getting search results
Got search results, summarizing content
I found several papers related to PPO reinforcement learning. Here are a few summaries:

Title: "Bandit Profit-maximization for Targeted Marketing"

Summary: This paper presents near-optimal algorithms for optimizing profit over multiple demand curves, which are dependent on different ancillary variables while maintaining the same price. It is relevant to PPO reinforcement learning as it tackles a sequential profit-maximization problem.
Article URL: Link
Title: "Inferring potential landscapes: A Schrödinger bridge approach to Maximum Caliber"

Summary: This work extends Schrödinger bridges to account for integral constraints along paths, specifically in the context of Maximum Caliber, a Maximum Entropy principle applied in a dynamic context. While not directly related to PPO reinforcement learning, it can provide insights into stochastic dynamics and inference of time-varying potential landscapes.
Article URL: Link
Title: "a-DCF: an architecture agnostic metric with application to spoofing-robust speaker verification"

Summary: This paper proposes an architecture-agnostic detection cost function (a-DCF) for evaluating spoofing-robust automatic speaker verification (ASV) systems. Although it does not focus on PPO reinforcement learning, it provides a metric for evaluating ASV systems in the presence of spoofing attacks.
Article URL: Link
These papers should provide insights into different aspects of reinforcement learning and related topics.

Add another user message to induce our system to use the second tool

paper_conversation.add_message(
    "user",
    "Can you read the PPO sequence generation paper for me and give me a summary",
)
updated_response = chat_completion_with_function_execution(
    paper_conversation.conversation_history, functions=arxiv_functions
)
display(Markdown(updated_response.choices[0].message.content))

output:

Function generation requested, calling function
Finding and reading paper
Chunking text from paper
Summarizing each chunk of text
100%|██████████| 4/4 [00:04<00:00,  1.11s/it]
Summarizing into overall summary
Core Argument:

The paper discusses the potential of using a general-purpose large language model (LLM) to learn the structural biophysics of DNA.
The authors show that fine-tuning a LLM, specifically chatGPT 3.5-turbo, can enhance its ability to analyze and design DNA sequences and their structures.
The study focuses on the formation of secondary structures in DNA, which are governed by base pairing and stacking bonds.
The authors propose a method that involves chaining together models fine-tuned for subtasks and using a chain-of-thought approach to improve the model's performance.
Evidence:

The authors use the NUPACK software suite to provide data for training and validation.
The expert pipeline approach involves using models that have been fine-tuned for subtasks and feeding their outputs into each other.
The models perform better when they explicitly consider the nearest neighbor window and the reverse complement of the sequences.
The pipeline approach, where a separate model determines the reverse complement and feeds it to another model for secondary structure prediction, enhances the accuracy of the predictions.
The performance of the models improves with larger training sets.
Conclusions:

The study demonstrates the potential of using LLMs to learn DNA structural biophysics.
Integrating experimental data and machine learning is important in scientific research.
The expert pipeline approach and breaking down the problem into smaller subtasks improve the performance of the models in DNA sequence analysis.
The combination of chain-of-thought and model pipeline provides the best results in analysis tasks.
The CoT approach, combined with the reverse complement transformation, yields the highest accuracy in design tasks.
The addition of an error checking layer further improves accuracy in design tasks.
Sequence design is more challenging than analysis, but error correction can compensate for the increased difficulty.
Larger training sets benefit design tasks more.
Future research directions include exploring chaining smaller models for performance improvement and using an LLM architecture involving both an encoder and decoder for direct sequence comparison.

Function calling via open source LLMs

With cost and privacy in mind, we may prefer to implement function calling on open-source large language models (LLMs). Several frameworks currently support tool calls through an OpenAI-API-compatible interface:

  • Xinference
  • Text Generation Inference (TGI)

On the model side, the main open-source LLMs that support tool calling include:

  • Llama-3
  • Mixtral-8x7B-Instruct-v0.1
  • Qwen
  • ChatGLM3-6B
  • NexusRaven-13B
  • gorilla-openfunctions-v1
  • and so on

Below we take Xinference and ChatGLM3-6B as an example and explore how to use an open-source model's function calling through the OpenAI API format.

Environment

%pip install -U -q  xinference[transformers] openai langchain
!pip install typing-extensions --upgrade
## Start Local Server

!nohup xinference-local  > xinference.log 2>&1 &

Load the model

!xinference launch -u my-llm --model-name chatglm3 --size-in-billions 6 --model-format pytorch
## Interact with the running model
import openai

messages=[
    {
        "role": "user",
        "content": "Who are you?"
    }
]

client = openai.Client(api_key="empty", base_url=f"http://0.0.0.0:9997/v1")
client.chat.completions.create(
    model="my-llm",
    messages=messages,
)

# ChatCompletion(id='chatda6056ac-da01-11ee-b92e-0242ac1c000c', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="I am an AI assistant named ChatGLM3-6B, which is developed based on the language model jointly trained by Tsinghua University KEG Lab and Zhipu AI Company in 2023. My job is to provide appropriate answers and support to users' questions and requests.", role='assistant', function_call=None, tool_calls=None))], created=1709541198, model='my-llm', object='chat.completion', system_fingerprint=None, usage=CompletionUsage(completion_tokens=-1, prompt_tokens=-1, total_tokens=-1))

Without tool use

completion = client.chat.completions.create(
    model="my-llm",
    messages=[{"role": "user", "content": "What is the weather like in London?"}]
)
# print(handle_response(completion))
print(completion.choices[0].message.content)

"""
London has a temperate climate with warm summers and cool winters. The average temperature during the summer months (June to August) is around 18°C, while the winter months (December to February) are around 6°C. The city experiences heavy rainfall throughout the year, with an annual precipitation of around 350 mm. The average precipitation on the weekends is around 40 mm. London's cloudy skies are common throughout the year, but they are especially prevalent in December and January.

"""

With tool use

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }
]

def get_completion(messages, model="my-llm", temperature=0, max_tokens=500, tools=None, tool_choice=None):
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature,
        max_tokens=max_tokens,
        tools=tools,
        tool_choice=tool_choice
    )
    return response.choices[0].message
import json

# Defines a dummy function to get the current weather
def get_current_weather(location, unit="fahrenheit"):
    """Get the current weather in a given location"""
    weather = {
        "location": location,
        "temperature": "50",
        "unit": unit,
    }

    return json.dumps(weather)

messages = []
messages.append({"role": "user", "content": "What's the weather like in Boston!"})
assistant_message = get_completion(messages, tools=tools, tool_choice="auto")
assistant_message = json.loads(assistant_message.model_dump_json())
assistant_message["content"] = str(assistant_message["tool_calls"][0]["function"])

#a temporary patch but this should be handled differently
# remove "function_call" from assistant message
del assistant_message["function_call"]

messages.append(assistant_message)


# get the weather information to pass back to the model
weather = get_current_weather(**json.loads(messages[1]["tool_calls"][0]["function"]["arguments"]))

messages.append({"role": "tool",
                 "tool_call_id": assistant_message["tool_calls"][0]["id"],
                 "name": assistant_message["tool_calls"][0]["function"]["name"],
                 "content": weather})


final_response = get_completion(messages, tools=tools)

final_response

"""
ChatCompletionMessage(content='The current weather in Boston is 50 degrees Fahrenheit.', role='assistant', function_call=None, tool_calls=[])
"""

What the LLM goes through when it performs function calling

Unfortunately we cannot see what happens on OpenAI's servers, but through open-source models and inference frameworks we can, to some extent, get a look at the logic behind an LLM performing function calling.

The answers can be found in the source code of the inference frameworks and the open-source models.

Based on the Xinference source code (github.com/xorbitsai/inference):

We focus mainly on ChatGLM3 and Qwen.

Suppose we use the following hypothetical conversation:

messages=[
    {
      "role": "user",
      "content": "今天北京的天气怎么样?"
    }
]
tools = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
                },
                "unit": {"type": "string"},
            },
            "required": ["location"],
        },
    }
]

According to GLM's official documentation, the final prompt handed to the ChatGLM3 model should look like this:

<|system|>
Answer the following questions as best as you can. You have access to the following tools:
[
    {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
                },
                "unit": {"type": "string"},
            },
            "required": ["location"],
        },
    }
]
<|user|>
今天北京的天气怎么样?
<|assistant|>
好的,让我们来查看今天的天气
<|assistant|>get_current_weather
```python
tool_call(location="beijing", unit="celsius")
```
<|observation|>
{"temperature": 22}
<|assistant|>
根据查询结果,今天北京的气温为 22 摄氏度。
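The model's tool-call turn is nothing more than text in this special format, so the inference framework has to parse it back into an OpenAI-style tool_calls structure. The sketch below is a simplified illustration of that post-processing step, an assumption about the general approach rather than Xinference's actual implementation:

import ast
import json
import re

def parse_chatglm3_tool_call(generated_text):
    """Simplified sketch: split the assistant turn into the function-name line and the
    python code block, then read the keyword arguments of tool_call()."""
    match = re.search(r"```python\s*(.*?)```", generated_text, re.DOTALL)
    if not match:
        return None  # plain-text answer, no tool call requested
    name = generated_text.split("```")[0].strip().splitlines()[-1].strip()
    call = ast.parse(match.group(1).strip(), mode="eval").body  # tool_call(location=..., unit=...)
    arguments = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    return {"name": name, "arguments": json.dumps(arguments, ensure_ascii=False)}

print(parse_chatglm3_tool_call(
    'get_current_weather\n```python\ntool_call(location="beijing", unit="celsius")\n```'
))
# {'name': 'get_current_weather', 'arguments': '{"location": "beijing", "unit": "celsius"}'}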

According to the Qwen-related code in Xinference:

elif prompt_style.style_name == "QWEN":
            if tools:
                tool_desc = """{name_for_model}: Call this tool to interact with the {name_for_human} API. What is the {name_for_human} API useful for? {description_for_model} Parameters: {parameters} Format the arguments as a JSON object."""

                react_instruction = """Answer the following questions as best you can. You have access to the following APIs:

{tools_text}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tools_name_text}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!"""
                tools_text = []
                tools_name_text = []
                for func_info in tools:
                    parameters = []
                    required_parameters = func_info["function"]["parameters"].get(
                        "required", []
                    )
                    for name, p in func_info["function"]["parameters"][
                        "properties"
                    ].items():
                        param = dict({"name": name}, **p)
                        if name in required_parameters:
                            param["required"] = True
                        parameters.append(param)

                    name = func_info["function"]["name"]
                    desc = func_info["function"]["description"]
                    tool_string = tool_desc.format(
                        name_for_model=name,
                        name_for_human=name,
                        # Hint: You can add the following format requirements in description:
                        #   "Format the arguments as a JSON object."
                        #   "Enclose the code within triple backticks (`) at the beginning and end of the code."
                        description_for_model=desc,
                        parameters=json.dumps(parameters, ensure_ascii=False),
                    )
                    tools_text.append(tool_string)
                    tools_name_text.append(name)
                tools_text_string = "\n\n".join(tools_text)
                tools_name_text_string = ", ".join(tools_name_text)
                tool_system = react_instruction.format(
                    tools_text=tools_text_string,
                    tools_name_text=tools_name_text_string,
                )
            else:
                tool_system = ""

            ret = f"<|im_start|>system\n{prompt_style.system_prompt}<|im_end|>"
            for message in chat_history:
                role = get_role(message["role"])
                content = message["content"]

                ret += prompt_style.intra_message_sep
                if tools:
                    if role == "user":
                        if tool_system:
                            content = tool_system + f"\n\nQuestion: {content}"
                            tool_system = ""
                        else:
                            content = f"Question: {content}"
                    elif role == "assistant":
                        tool_calls = message.get("tool_calls")
                        if tool_calls:
                            func_call = tool_calls[0]["function"]
                            f_name, f_args = (
                                func_call["name"],
                                func_call["arguments"],
                            )
                            content = f"Thought: I can use {f_name}.\nAction: {f_name}\nAction Input: {f_args}"
                        elif content:
                            content = f"Thought: I now know the final answer.\nFinal answer: {content}"
                    elif role == "tool":
                        role = "function"
                        content = f"Observation: {content}"
                    else:
                        raise Exception(f"Unsupported message role: {role}")
                if content:
                    content = content.lstrip("\n").rstrip()
                    ret += f"<|im_start|>{role}\n{content}<|im_end|>"
                else:
                    ret += f"<|im_start|>{role}\n"
            return ret

This one is a bit more involved: it uses the ReAct style of chain-of-thought prompting (the react_instruction in the code above), asking the model to work through a sequence of Thought, Action, Action Input, and Observation steps before giving the final answer to the question, which improves correctness.

Suppose we use the following tool list and chat history:

# Tool list
tools = [
    {
        "function": {
            "name": "geo_lookup",
            "description": "Retrieves geographical information.",
            "parameters": {
                "required": ["query"],
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The query to lookup."
                    }
                }
            }
        }
    }
]

chat_history = [
    {
        "role": "user",
        "content": "What is the population of Tokyo?"
    },
    {
        "role": "assistant",
        "tool_calls": [
            {
                "function": {
                    "name": "geo_lookup",
                    "arguments": '{"query": "Tokyo"}'
                }
            }
        ]
    },
    {
        "role": "tool",
        "content": "The population of Tokyo is about 14 million."
    },
    {
        "role": "assistant",
        "content": "The population of Tokyo is about 14 million."
    }
]

From this we can infer that the final prompt fed to Qwen should look roughly like this:

system

Answer the following questions as best you can. You have access to the following APIs:

geo_lookup: Call this tool to interact with the geo_lookup API. What is the geo_lookup API useful for? Retrieves geographical information. Parameters: [{"name": "query", "type": "string", "description": "The query to lookup.", "required": true}] Format the arguments as a JSON object.

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [geo_lookup]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

user
Question: What is the population of Tokyo?
assistant
Thought: I can use geo_lookup to find the information.
Action: geo_lookup
Action Input: {"query": "Tokyo"}
Observation: The population of Tokyo is about 14 million.
Thought: I now know the final answer.
Final answer: The population of Tokyo is about 14 million.

In this example, the user asks for the population of Tokyo. The assistant uses the geo_lookup tool, and the concrete steps are:

  • Thought: the assistant decides it can use the geo_lookup tool to find the information.
  • Action: the actual call to the geo_lookup tool.
  • Action Input: the arguments passed to the tool, namely the query "Tokyo".
  • Observation: the observed result, here that Tokyo's population is about 14 million.
  • Final thought: based on the observation, the assistant arrives at the final answer.
  • Final answer: the answer given to the user, i.e. the population of Tokyo.
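On the way back, the framework likewise has to turn the generated ReAct text into an OpenAI-style tool call. A simplified sketch of that parsing step (again an assumption about the general approach, not Xinference's exact code):

def parse_react_tool_call(generated_text):
    """Extract the first Action / Action Input pair from a ReAct-style completion."""
    if "\nAction:" not in generated_text:
        return None  # the model went straight to a Final Answer
    action_part = generated_text.split("\nAction:", 1)[1]
    name = action_part.split("\n", 1)[0].strip()
    args_part = action_part.split("\nAction Input:", 1)[1]
    # Generation is usually stopped at "Observation:"; strip it if it slipped through.
    arguments = args_part.split("\nObservation:", 1)[0].strip()
    return {"name": name, "arguments": arguments}

print(parse_react_tool_call(
    "Thought: I can use geo_lookup.\nAction: geo_lookup\nAction Input: {\"query\": \"Tokyo\"}"
))
# {'name': 'geo_lookup', 'arguments': '{"query": "Tokyo"}'}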

For more detail, I recommend this Zhihu article: Qwen Function Calling 的对话模板及训练方法总结, as well as Qwen's official documentation: the ReAct Prompting example.

References

OpenAI function calling

How to call functions with chat models

How_to_call_functions_for_knowledge_retrieval

Qwen Function Calling 的对话模板及训练方法总结

Qwen official documentation

GLM official documentation

tool-using via Groq API

Json mode in Groq

OpenAI JSON Mode & Seeding

OpenAI API Guide: Using JSON Mode
