Harnessing OpenAI Function Calling for Advanced AI Interaction
Introduction to Function Calling
OpenAI's GPT-3.5 and GPT-4 have introduced an innovative feature known as "Function Calling." This functionality empowers the AI to execute specific actions based on user input expressed in natural language, facilitating smooth integration with various applications. This article delves into the capabilities of using natural language as an interface through Function Calling.
Understanding the Mechanism of Function Calling
Function Calling enables the AI to invoke designated functions in response to user queries. For example, if a user requests, “Schedule a meeting for next Friday and send an email to confirm,” the AI can trigger a function that accesses Gmail to send the email. Similarly, a user might say, “Log an expense of 600 yen for a taxi ride on 5/20,” prompting the AI to register that expense. This capability marks a significant evolution from traditional applications.
In an API context, you can define functions for models like gpt-3.5-turbo-0613 and gpt-4-0613, allowing the model to intelligently generate a JSON object with the necessary arguments to execute those functions. Notably, while the Chat Completions API doesn't directly invoke the function, it produces JSON that you can utilize to call the function in your own code.
The 0613 versions (gpt-3.5-turbo-0613 and gpt-4-0613) have been fine-tuned to detect when a function should be called based on user input and to respond with JSON that conforms to the function's signature. However, this functionality carries potential risks, so it is advisable to ask the user for confirmation before executing actions that could significantly affect them (e.g., sending emails, making purchases).
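One way to add such a confirmation step is to gate the functions you consider risky behind a yes/no callback. The sketch below is only an illustration: SENSITIVE_FUNCTIONS, execute_with_confirmation, and the registry and confirm parameters are hypothetical names, not part of the OpenAI API.

```python
from typing import Callable, Dict

# Hypothetical list of function names that should never run without approval.
SENSITIVE_FUNCTIONS = {"send_email", "make_purchase"}


def execute_with_confirmation(
    name: str,
    arguments: dict,
    registry: Dict[str, Callable[..., str]],
    confirm: Callable[[str], bool],
) -> str:
    """Run registry[name](**arguments), asking `confirm` first if it is sensitive."""
    if name not in registry:
        return f"Unknown function: {name}"
    if name in SENSITIVE_FUNCTIONS:
        # `confirm` receives a human-readable prompt and returns True or False.
        if not confirm(f"Run {name} with {arguments}?"):
            return f"{name} was cancelled by the user."
    return registry[name](**arguments)
```

In an interactive script, confirm could be as simple as a function that calls input() and checks for "y". The returned string can be passed back to the model as the function result either way.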
Function Calling allows for more consistent retrieval of structured data from the model. For example, you can:
- Develop chatbots that respond to inquiries by invoking external APIs (similar to ChatGPT Plugins), such as defining functions like send_email(to: string, body: string) or get_current_weather(location: string, unit: 'celsius' | 'fahrenheit').
- Transform natural language into API requests, e.g., converting “Who are my top customers?” into get_customers(min_revenue: int, created_before: string, limit: int) for internal API calls.
- Extract structured information from text, e.g., creating functions like extract_data(name: string, birthday: string) or sql_query(query: string).
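Signatures like the ones above are passed to the API as JSON Schema objects. As a minimal sketch, get_current_weather(location: string, unit: 'celsius' | 'fahrenheit') could be described as follows; the description strings and the choice to make only location required are assumptions made for illustration.

```python
import json

# Function definition in the shape expected by the `functions` request parameter.
get_current_weather_schema = {
    "name": "get_current_weather",
    "description": "Get the current weather for a location",  # assumed wording
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City name, e.g. Tokyo"},
            # An enum restricts the model to one of the listed values.
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location"],  # assumption: unit is optional
    },
}

print(json.dumps(get_current_weather_schema, indent=2))
```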
The standard procedure for using Function Calling involves the following steps:
- Invoke the model with the user query and a predefined set of functions.
- If the model opts to call a function, the response will be a JSON object formatted according to your custom schema (be mindful that the model may produce invalid JSON or fabricate parameters).
- Parse this string into JSON in your code, then execute the function with the provided arguments if they exist.
- Call the model again, appending the function's response as a new message, allowing the model to summarize the results for the user.
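Because the model may return invalid JSON or invent parameters, it can help to validate the arguments string before calling anything. The helper below is a sketch of that parsing step under simple assumptions; parse_function_arguments and its allowed and required parameters are hypothetical names introduced for illustration.

```python
import json


def parse_function_arguments(raw_arguments, allowed, required):
    """Return (arguments, None) on success or (None, error_message) on failure."""
    try:
        parsed = json.loads(raw_arguments)
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc.msg}"
    if not isinstance(parsed, dict):
        return None, "arguments must be a JSON object"
    # Reject parameters the schema never declared (possible hallucinations).
    unexpected = sorted(set(parsed) - set(allowed))
    if unexpected:
        return None, f"unexpected parameters: {unexpected}"
    # Reject calls that omit a required parameter.
    missing = sorted(set(required) - set(parsed))
    if missing:
        return None, f"missing parameters: {missing}"
    return parsed, None
```

When an error is returned, one option is to send the error text back to the model as the function result so that it can retry with corrected arguments.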
The Impact of Function Calling
The introduction of Function Calling significantly enhances user experience. By enabling users to simply express requests like “Do this for me,” the AI can leverage the appropriate tools to fulfill those tasks, bringing us closer to a future where AI takes on more responsibilities on our behalf.
Function Calling in Practice
To illustrate the capabilities of Function Calling, a demonstration integrating Slack's Webhook and Google Spreadsheet was created. Imagine automating the logging of expenses in a Google Spreadsheet while also notifying a Slack channel. This can be accomplished by defining two functions: one to write to the spreadsheet and another to post to Slack.
Here is how this can be done with the OpenAI API in Python. The two functions are defined as follows:
import json

import gspread
import requests
from oauth2client.service_account import ServiceAccountCredentials

# OAuth scopes for the service account (an assumption; the original
# snippet does not show how `scope` was defined)
scope = [
    "https://spreadsheets.google.com/feeds",
    "https://www.googleapis.com/auth/drive",
]

# Function to write expenses to Google Spreadsheet
def write_expense_to_spreadsheet(date, amount, description):
    credentials = ServiceAccountCredentials.from_json_keyfile_name("service_account.json", scope)
    client = gspread.authorize(credentials)
    spreadsheet_id = "Your Spreadsheet ID"
    sheet = client.open_by_key(spreadsheet_id).sheet1
    # Add expense information to the spreadsheet
    sheet.append_row([date, amount, description])
    return "Expense recorded in spreadsheet successfully."

# Function to post a message to Slack via a webhook
WEBHOOK_URL = "Your Webhook URL"

def post_message_to_slack_via_webhook(message):
    # Prepare the data to be sent
    payload = {'text': message}
    # Send a POST request to the Webhook URL
    response = requests.post(
        WEBHOOK_URL, data=json.dumps(payload),
        headers={'Content-Type': 'application/json'}
    )
    # Verify the response
    if response.status_code == 200:
        return "Message sent to Slack successfully."
    else:
        return f"Failed to send message: {response.content}"
With the functions in place, the next step is to describe them to the model and handle whichever function call it returns.
import json

import openai  # legacy SDK (openai<1.0), which provides openai.ChatCompletion

def run_conversation():
    # STEP 1: Send the user input and the function definitions to the model
    user_input = input("user_input:")
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo-0613",
        messages=[
            {"role": "system", "content": "You are the best assistant ever!"},
            {"role": "user", "content": user_input},
        ],
        functions=[
            {
                "name": "write_expense_to_spreadsheet",
                "description": "Record an expense in the Google Spreadsheet",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "date": {"type": "string", "format": "date"},
                        "amount": {"type": "string"},
                        "description": {"type": "string"},
                    },
                    "required": ["date", "amount", "description"],
                },
            },
            {
                "name": "post_message_to_slack_via_webhook",
                "description": "Post a message to Slack",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "message": {"type": "string"},
                    },
                    "required": ["message"],
                },
            },
        ],
        function_call="auto",
    )
    message = response["choices"][0]["message"]

    # STEP 2: If the model requested a function call, run that function
    if message.get("function_call"):
        function_name = message["function_call"]["name"]
        # The arguments arrive as a JSON string and may be malformed
        arguments = json.loads(message["function_call"]["arguments"])
        print(arguments)
        if function_name == "write_expense_to_spreadsheet":
            function_response = write_expense_to_spreadsheet(
                date=arguments.get("date"),
                amount=arguments.get("amount"),
                description=arguments.get("description"),
            )
        elif function_name == "post_message_to_slack_via_webhook":
            function_response = post_message_to_slack_via_webhook(
                message=arguments.get("message"),
            )
        else:
            raise NotImplementedError(f"Unknown function: {function_name}")

        # STEP 3: Send the function result back so the model can summarize it
        second_response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo-0613",
            messages=[
                {"role": "user", "content": user_input},
                message,
                {
                    "role": "function",
                    "name": function_name,
                    "content": str(function_response),
                },
            ],
        )
        return second_response
    else:
        # The model answered directly without calling a function
        return response

print("response:", run_conversation()["choices"][0]["message"]["content"], "\n\n")
Conclusion
The advent of Function Calling has greatly expanded the possibilities for employing natural language as an interface. While there remains much to discover, this article offers a foundational understanding of how Function Calling operates and its potential to enhance AI interactions.
Frequently Asked Questions
What is OpenAI’s function calling feature?
OpenAI’s function calling feature allows developers to define functions, with the model generating a JSON output that includes the necessary arguments. It does not execute functions directly but provides the JSON that can be used to invoke functions within your code.
How does OpenAI’s function calling feature work?
Developers specify functions as part of the Chat Completions request. The model generates JSON arguments that your code can use to execute the specified function. The API is called repeatedly until a response finishes with the “stop” reason. When a response finishes with the “function_call” reason instead, your code executes the designated function and relays the result back to the API.
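The repeated-call pattern can be sketched as a small loop. To keep the example runnable without an API key, the model is replaced by a stub: run_until_stop, fake_model, and the add function are hypothetical names, and the dictionaries only mimic the "finish_reason" and "message" fields of a Chat Completions choice.

```python
import json


def run_until_stop(call_model, functions, messages, max_turns=5):
    """Call the model until it finishes with something other than a function call."""
    for _ in range(max_turns):
        choice = call_model(messages)
        message = choice["message"]
        if choice["finish_reason"] != "function_call":
            return message["content"]
        # Execute the requested function and relay its result to the model.
        call = message["function_call"]
        result = functions[call["name"]](**json.loads(call["arguments"]))
        messages.append(message)
        messages.append({"role": "function", "name": call["name"], "content": str(result)})
    raise RuntimeError("model did not finish within max_turns")


def fake_model(messages):
    """Stand-in for the API: requests `add` once, then answers with the result."""
    if messages[-1]["role"] == "function":
        content = f"The sum is {messages[-1]['content']}."
        return {"finish_reason": "stop", "message": {"role": "assistant", "content": content}}
    call = {"name": "add", "arguments": '{"a": 2, "b": 3}'}
    return {
        "finish_reason": "function_call",
        "message": {"role": "assistant", "content": None, "function_call": call},
    }


answer = run_until_stop(
    fake_model,
    {"add": lambda a, b: a + b},
    [{"role": "user", "content": "What is 2 + 3?"}],
)
print(answer)  # The sum is 5.
```

The max_turns limit is a design choice in this sketch to guard against a model that keeps requesting function calls indefinitely.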
What are the applications of OpenAI’s function calling feature?
OpenAI’s function calling feature has numerous applications, including the creation of chatbots that respond to inquiries by utilizing external APIs, converting natural language into structured JSON data, extracting organized information from text, and addressing complex challenges that require multi-step reasoning and various calculations.