Commit be94419

feat: openai#1760 Add SIP support for realtime agent runner (openai#1993)
1 parent 1466ddb commit be94419

10 files changed

+535
-2
lines changed

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
1+
# Twilio SIP Realtime Example
2+
3+
This example shows how to handle OpenAI Realtime SIP calls with the Agents SDK. Incoming calls are accepted through the Realtime Calls API, a triage agent answers with a fixed greeting, and handoffs route the caller to specialist agents (FAQ lookup and record updates) similar to the realtime UI demo.
4+
5+
## Prerequisites
6+
7+
- Python 3.9+
8+
- An OpenAI API key with Realtime API access
9+
- A configured webhook secret for your OpenAI project
10+
- A Twilio account with a phone number and Elastic SIP Trunking enabled
11+
- A public HTTPS endpoint for local development (for example, [ngrok](https://ngrok.com/))
12+
13+
## Configure OpenAI
14+
15+
1. In [platform settings](https://platform.openai.com/settings) select your project.
16+
2. Create a webhook pointing to `https://<your-public-host>/openai/webhook` with the `realtime.call.incoming` event type, and note the signing secret. The example verifies each webhook with `OPENAI_WEBHOOK_SECRET`.
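
The example itself delegates verification to `client.webhooks.unwrap`. As a rough illustration of what a signing-secret check involves, here is a minimal sketch that assumes a Standard Webhooks style scheme. The `whsec_` base64 secret layout, the `id.timestamp.body` signed payload, and the `v1,<signature>` header format are assumptions not confirmed by this README, and `verify_signature` is a hypothetical helper.

```python
import base64
import hashlib
import hmac


def verify_signature(
    secret: str, msg_id: str, timestamp: str, body: bytes, header: str
) -> bool:
    """Return True if any signature in `header` matches the payload (hypothetical helper)."""
    # Assumption: the secret is base64 data behind an optional "whsec_" prefix.
    key = base64.b64decode(secret.removeprefix("whsec_"))
    # Assumption: the signed content is "<id>.<timestamp>.<raw body>".
    signed = f"{msg_id}.{timestamp}.".encode() + body
    expected = base64.b64encode(hmac.new(key, signed, hashlib.sha256).digest()).decode()
    # Assumption: the header holds space-separated "v1,<base64 signature>" entries.
    for entry in header.split():
        version, _, candidate = entry.partition(",")
        if version == "v1" and hmac.compare_digest(candidate, expected):
            return True
    return False
```

A production check would also reject stale timestamps. In practice, prefer the SDK helper used by the example over a hand-rolled check.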
17+
18+
## Configure Twilio Elastic SIP Trunking
19+
20+
1. Create (or edit) an Elastic SIP trunk.
21+
2. On the **Origination** tab, add an origination SIP URI of `sip:proj_<your_project_id>@sip.api.openai.com;transport=tls` so Twilio sends inbound calls to OpenAI. (The Termination SIP URI always ends with `.pstn.twilio.com`, so leave it unchanged.)
22+
3. Add at least one phone number to the trunk so inbound calls are forwarded to OpenAI.
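
If you script your trunk setup, the origination URI from step 2 can be assembled from the project ID. This is a small sketch: `origination_uri` is a hypothetical helper, and the `proj_` prefix check simply mirrors the URI shape shown above.

```python
def origination_uri(project_id: str) -> str:
    """Build the origination SIP URI for an OpenAI project (hypothetical helper)."""
    # The README shows the user part of the URI as "proj_<your_project_id>".
    if not project_id.startswith("proj_"):
        raise ValueError("expected a project ID starting with 'proj_'")
    return f"sip:{project_id}@sip.api.openai.com;transport=tls"


print(origination_uri("proj_abc123"))
```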
23+
24+
## Setup
25+
26+
1. Install dependencies:
27+
```bash
28+
uv pip install -r examples/realtime/twilio_sip/requirements.txt
29+
```
30+
2. Export required environment variables:
31+
```bash
32+
export OPENAI_API_KEY="sk-..."
33+
export OPENAI_WEBHOOK_SECRET="whsec_..."
34+
```
35+
3. (Optional) Adjust the multi-agent logic in `examples/realtime/twilio_sip/agents.py` if you want
36+
to change the specialist agents or tools.
37+
4. Run the FastAPI server:
38+
```bash
39+
uv run uvicorn examples.realtime.twilio_sip.server:app --host 0.0.0.0 --port 8000
40+
```
41+
5. Expose the server publicly (example with ngrok):
42+
```bash
43+
ngrok http 8000
44+
```
45+
46+
## Test a Call
47+
48+
1. Place a call to the Twilio number attached to the SIP trunk.
49+
2. Twilio sends the call to `sip.api.openai.com`; OpenAI fires `realtime.call.incoming`, which this example accepts.
50+
3. The triage agent greets the caller, then either keeps the conversation or hands off to:
51+
- **FAQ Agent** – answers common questions via `faq_lookup_tool`.
52+
- **Records Agent** – writes short notes using `update_customer_record`.
53+
4. The background task attaches to the call and logs transcripts plus basic events in the console.
54+
55+
You can edit `agents.py` to change instructions or add tools, and `server.py` to integrate with internal systems once the SIP session is active.
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
"""OpenAI Realtime SIP example package."""
Lines changed: 87 additions & 0 deletions
@@ -0,0 +1,87 @@
1+
"""Realtime agent definitions shared by the Twilio SIP example."""
2+
3+
from __future__ import annotations
4+
5+
import asyncio
6+
7+
from agents import function_tool
8+
from agents.extensions.handoff_prompt import RECOMMENDED_PROMPT_PREFIX
9+
from agents.realtime import RealtimeAgent, realtime_handoff
10+
11+
# --- Tools -----------------------------------------------------------------
12+
13+
14+
WELCOME_MESSAGE = "Hello, this is ABC customer service. How can I help you today?"
15+
16+
17+
@function_tool(
18+
name_override="faq_lookup_tool", description_override="Lookup frequently asked questions."
19+
)
20+
async def faq_lookup_tool(question: str) -> str:
21+
"""Fetch FAQ answers for the caller."""
22+
23+
await asyncio.sleep(3)
24+
25+
q = question.lower()
26+
if "plan" in q or "wifi" in q or "wi-fi" in q:
27+
return "We provide complimentary Wi-Fi. Join the ABC-Customer network." # demo data
28+
if "billing" in q or "invoice" in q:
29+
return "Your latest invoice is available in the ABC portal under Billing > History."
30+
if "hours" in q or "support" in q:
31+
return "Human support agents are available 24/7; transfer to the specialist if needed."
32+
return "I'm not sure about that. Let me transfer you back to the triage agent."
33+
34+
35+
@function_tool
36+
async def update_customer_record(customer_id: str, note: str) -> str:
37+
"""Record a short note about the caller."""
38+
39+
await asyncio.sleep(1)
40+
return f"Recorded note for {customer_id}: {note}"
41+
42+
43+
# --- Agents ----------------------------------------------------------------
44+
45+
46+
faq_agent = RealtimeAgent(
47+
name="FAQ Agent",
48+
handoff_description="Handles frequently asked questions and general account inquiries.",
49+
instructions=f"""{RECOMMENDED_PROMPT_PREFIX}
50+
You are an FAQ specialist. Always rely on the faq_lookup_tool for answers and keep replies
51+
concise. If the caller needs hands-on help, transfer back to the triage agent.
52+
""",
53+
tools=[faq_lookup_tool],
54+
)
55+
56+
records_agent = RealtimeAgent(
57+
name="Records Agent",
58+
handoff_description="Updates customer records with brief notes and confirmation numbers.",
59+
instructions=f"""{RECOMMENDED_PROMPT_PREFIX}
60+
You handle structured updates. Confirm the customer's ID, capture their request in a short
61+
note, and use the update_customer_record tool. For anything outside data updates, return to the
62+
triage agent.
63+
""",
64+
tools=[update_customer_record],
65+
)
66+
67+
triage_agent = RealtimeAgent(
68+
name="Triage Agent",
69+
handoff_description="Greets callers and routes them to the most appropriate specialist.",
70+
instructions=(
71+
f"{RECOMMENDED_PROMPT_PREFIX} "
72+
"Always begin the call by saying exactly: '"
73+
f"{WELCOME_MESSAGE}' "
74+
"before collecting details. Once the greeting is complete, gather context and hand off to "
75+
"the FAQ or Records agents when appropriate."
76+
),
77+
handoffs=[faq_agent, realtime_handoff(records_agent)],
78+
)
79+
80+
faq_agent.handoffs.append(triage_agent)
81+
records_agent.handoffs.append(triage_agent)
82+
83+
84+
def get_starting_agent() -> RealtimeAgent:
85+
"""Return the agent used to start each realtime call."""
86+
87+
return triage_agent
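
The keyword matching inside `faq_lookup_tool` is order-sensitive: the first matching branch wins. The sketch below reproduces only that branching so it can be exercised without the SDK. `match_faq` and its topic labels are hypothetical names introduced for illustration.

```python
def match_faq(question: str) -> str:
    """Return the topic label the FAQ tool's branching would select (hypothetical helper)."""
    q = question.lower()
    # Same keyword checks, in the same order, as faq_lookup_tool.
    if "plan" in q or "wifi" in q or "wi-fi" in q:
        return "wifi"
    if "billing" in q or "invoice" in q:
        return "billing"
    if "hours" in q or "support" in q:
        return "support"
    return "fallback"


# A question that mentions both billing and support hits the billing branch first.
print(match_faq("What are the billing support hours?"))
```

Note that a question containing "plan" lands on the Wi-Fi answer, which is worth keeping in mind if you replace the demo data.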
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
fastapi>=0.120.0
2+
openai>=2.2,<3
3+
uvicorn[standard]>=0.38.0
Lines changed: 211 additions & 0 deletions
@@ -0,0 +1,211 @@
1+
"""Minimal FastAPI server for handling OpenAI Realtime SIP calls with Twilio."""
2+
3+
from __future__ import annotations
4+
5+
import asyncio
6+
import logging
7+
import os
8+
9+
import websockets
10+
from fastapi import FastAPI, HTTPException, Request, Response
11+
from openai import APIStatusError, AsyncOpenAI, InvalidWebhookSignatureError
12+
13+
from agents.realtime.config import RealtimeSessionModelSettings
14+
from agents.realtime.items import (
15+
AssistantAudio,
16+
AssistantMessageItem,
17+
AssistantText,
18+
InputText,
19+
UserMessageItem,
20+
)
21+
from agents.realtime.model_inputs import RealtimeModelSendRawMessage
22+
from agents.realtime.openai_realtime import OpenAIRealtimeSIPModel
23+
from agents.realtime.runner import RealtimeRunner
24+
25+
from .agents import WELCOME_MESSAGE, get_starting_agent
26+
27+
logging.basicConfig(level=logging.INFO)
28+
29+
logger = logging.getLogger("twilio_sip_example")
30+
31+
32+
def _get_env(name: str) -> str:
33+
value = os.getenv(name)
34+
if not value:
35+
raise RuntimeError(f"Missing environment variable: {name}")
36+
return value
37+
38+
39+
OPENAI_API_KEY = _get_env("OPENAI_API_KEY")
40+
OPENAI_WEBHOOK_SECRET = _get_env("OPENAI_WEBHOOK_SECRET")
41+
42+
client = AsyncOpenAI(api_key=OPENAI_API_KEY, webhook_secret=OPENAI_WEBHOOK_SECRET)
43+
44+
# Build the multi-agent graph (triage + specialist agents) from agents.py.
45+
assistant_agent = get_starting_agent()
46+
47+
app = FastAPI()
48+
49+
# Track background tasks so repeated webhooks do not spawn duplicates.
50+
active_call_tasks: dict[str, asyncio.Task[None]] = {}
51+
52+
53+
async def accept_call(call_id: str) -> None:
54+
"""Accept the incoming SIP call and configure the realtime session."""
55+
56+
# The starting agent uses static instructions, so we can forward them directly to the accept
57+
# call payload. If someone swaps in a dynamic prompt, fall back to a sensible default.
58+
instructions_payload = (
59+
assistant_agent.instructions
60+
if isinstance(assistant_agent.instructions, str)
61+
else "You are a helpful triage agent for ABC customer service."
62+
)
63+
64+
try:
65+
# AsyncOpenAI does not yet expose high-level helpers like client.realtime.calls.accept, so
66+
# we call the REST endpoint directly via client.post(). Keep this until the SDK grows an
67+
# async helper.
68+
await client.post(
69+
f"/realtime/calls/{call_id}/accept",
70+
body={
71+
"type": "realtime",
72+
"model": "gpt-realtime",
73+
"instructions": instructions_payload,
74+
},
75+
cast_to=dict,
76+
)
77+
except APIStatusError as exc:
78+
if exc.status_code == 404:
79+
# Webhook deliveries are occasionally retried after the caller hangs up; treat as a no-op so
80+
# the webhook still returns 200.
81+
logger.warning(
82+
"Call %s no longer exists when attempting accept (404). Skipping.", call_id
83+
)
84+
return
85+
86+
detail = exc.message
87+
if exc.response is not None:
88+
try:
89+
detail = exc.response.text
90+
except Exception: # noqa: BLE001
91+
detail = str(exc.response)
92+
93+
logger.error("Failed to accept call %s: %s %s", call_id, exc.status_code, detail)
94+
raise HTTPException(status_code=500, detail="Failed to accept call") from exc
95+
96+
logger.info("Accepted call %s", call_id)
97+
98+
99+
async def observe_call(call_id: str) -> None:
100+
"""Attach to the realtime session and log conversation events."""
101+
102+
runner = RealtimeRunner(assistant_agent, model=OpenAIRealtimeSIPModel())
103+
104+
try:
105+
initial_model_settings: RealtimeSessionModelSettings = {
106+
"turn_detection": {
107+
"type": "semantic_vad",
108+
"interrupt_response": True,
109+
}
110+
}
111+
async with await runner.run(
112+
model_config={
113+
"call_id": call_id,
114+
"initial_model_settings": initial_model_settings,
115+
}
116+
) as session:
117+
# Trigger an initial greeting so callers hear the agent right away.
118+
# Issue a response.create immediately after the WebSocket attaches so the model speaks
119+
# before the caller says anything. Using the raw client message keeps latency low
120+
# and avoids threading the greeting through history.
121+
await session.model.send_event(
122+
RealtimeModelSendRawMessage(
123+
message={
124+
"type": "response.create",
125+
"other_data": {
126+
"response": {
127+
"instructions": (
128+
"Say exactly '"
129+
f"{WELCOME_MESSAGE}"
130+
"' now before continuing the conversation."
131+
)
132+
}
133+
},
134+
}
135+
)
136+
)
137+
138+
async for event in session:
139+
if event.type == "history_added":
140+
item = event.item
141+
if isinstance(item, UserMessageItem):
142+
for user_content in item.content:
143+
if isinstance(user_content, InputText) and user_content.text:
144+
logger.info("Caller: %s", user_content.text)
145+
elif isinstance(item, AssistantMessageItem):
146+
for assistant_content in item.content:
147+
if (
148+
isinstance(assistant_content, AssistantText)
149+
and assistant_content.text
150+
):
151+
logger.info("Assistant (text): %s", assistant_content.text)
152+
elif (
153+
isinstance(assistant_content, AssistantAudio)
154+
and assistant_content.transcript
155+
):
156+
logger.info(
157+
"Assistant (audio transcript): %s",
158+
assistant_content.transcript,
159+
)
160+
elif event.type == "error":
161+
logger.error("Realtime session error: %s", event.error)
162+
163+
except websockets.exceptions.ConnectionClosedError:
164+
# Callers hanging up causes the WebSocket to close without a frame; log at info level so it
165+
# does not surface as an error.
166+
logger.info("Realtime WebSocket closed for call %s", call_id)
167+
except Exception as exc: # noqa: BLE001 - demo logging only
168+
logger.exception("Error while observing call %s", call_id)
169+
finally:
170+
logger.info("Call %s ended", call_id)
171+
active_call_tasks.pop(call_id, None)
172+
173+
174+
def _track_call_task(call_id: str) -> None:
175+
existing = active_call_tasks.get(call_id)
176+
if existing:
177+
if not existing.done():
178+
logger.info(
179+
"Call %s already has an active observer; ignoring duplicate webhook delivery.",
180+
call_id,
181+
)
182+
return
183+
# Remove completed tasks so a new observer can start for a fresh call.
184+
active_call_tasks.pop(call_id, None)
185+
186+
task = asyncio.create_task(observe_call(call_id))
187+
active_call_tasks[call_id] = task
188+
189+
190+
@app.post("/openai/webhook")
191+
async def openai_webhook(request: Request) -> Response:
192+
body = await request.body()
193+
194+
try:
195+
event = client.webhooks.unwrap(body, request.headers)
196+
except InvalidWebhookSignatureError as exc:
197+
raise HTTPException(status_code=400, detail="Invalid webhook signature") from exc
198+
199+
if event.type == "realtime.call.incoming":
200+
call_id = event.data.call_id
201+
await accept_call(call_id)
202+
_track_call_task(call_id)
203+
return Response(status_code=200)
204+
205+
# Ignore other webhook event types for brevity.
206+
return Response(status_code=200)
207+
208+
209+
@app.get("/")
210+
async def healthcheck() -> dict[str, str]:
211+
return {"status": "ok"}
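
The duplicate-delivery guard in `_track_call_task` can be tried in isolation. The following is a minimal sketch of the same pattern with a stand-in observer: `observe`, `track`, and `demo` are hypothetical names, and the short sleep stands in for the realtime session loop.

```python
import asyncio

# Maps call IDs to their observer tasks, like active_call_tasks in server.py.
active: dict[str, asyncio.Task] = {}


async def observe(call_id: str, started: list) -> None:
    """Stand-in observer: records that it started, then cleans up after itself."""
    started.append(call_id)
    try:
        await asyncio.sleep(0.01)  # placeholder for the session event loop
    finally:
        active.pop(call_id, None)


def track(call_id: str, started: list) -> bool:
    """Start an observer unless one is already running; return whether one was started."""
    existing = active.get(call_id)
    if existing is not None and not existing.done():
        return False  # duplicate webhook delivery, ignore it
    active[call_id] = asyncio.create_task(observe(call_id, started))
    return True


async def demo() -> tuple:
    started: list = []
    first = track("call_1", started)
    second = track("call_1", started)  # ignored while the first observer runs
    await asyncio.gather(*list(active.values()))
    third = track("call_1", started)  # allowed again once the observer finished
    await asyncio.gather(*list(active.values()))
    return first, second, third, started


print(asyncio.run(demo()))
```

Because the registry lives in process memory, this guard only works within a single server process. Running several workers would need shared state instead.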

src/agents/realtime/model.py

Lines changed: 7 additions & 0 deletions
@@ -139,6 +139,13 @@ class RealtimeModelConfig(TypedDict):
139139
is played to the user.
140140
"""
141141

142+
call_id: NotRequired[str]
143+
"""Attach to an existing realtime call instead of creating a new session.
144+
145+
When provided, the transport connects using the `call_id` query string parameter rather than a
146+
model name. This is used for SIP-originated calls that are accepted via the Realtime Calls API.
147+
"""
148+
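
To illustrate the `call_id` docstring above, here is a rough sketch of how a transport might pick its query string. `build_realtime_url` is a hypothetical helper and the `wss://api.openai.com/v1/realtime` base URL is an assumption. The SDK's actual connection code is not shown in this hunk and may differ.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumption: base WebSocket endpoint for realtime sessions.
BASE_URL = "wss://api.openai.com/v1/realtime"


def build_realtime_url(
    model_name: Optional[str] = None, call_id: Optional[str] = None
) -> str:
    """Prefer call_id over a model name when attaching to an existing call (hypothetical)."""
    if call_id:
        # SIP-originated call: attach to the already accepted call.
        query = {"call_id": call_id}
    else:
        # New session: select the model by name instead.
        query = {"model": model_name or "gpt-realtime"}
    return f"{BASE_URL}?{urlencode(query)}"


print(build_realtime_url(call_id="call_123"))
print(build_realtime_url(model_name="gpt-realtime"))
```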
142149

143150
class RealtimeModel(abc.ABC):
144151
"""Interface for connecting to a realtime model and sending/receiving events."""
