AsyncAPI 2.6 description of Retell AI's publicly documented WebSocket surfaces. All events are sourced from the official Retell AI documentation (https://docs.retellai.com) and cover:

* Custom LLM WebSocket - bidirectional channel between Retell's voice infrastructure and a developer-operated LLM server. Retell connects out to the developer's server using the call_id as a path parameter.
  Source: https://docs.retellai.com/api-references/llm-websocket https://docs.retellai.com/integrate-llm/setup-websocket-server
* Audio WebSocket (deprecated) - bidirectional audio/control channel between a frontend client and Retell, hosted at wss://api.retellai.com/audio-websocket/{call_id}.
  Source: https://docs.retellai.com/api-references/audio-websocket

The Web Call experience is delivered through the RetellWebClient SDK and does not expose a publicly documented WebSocket protocol; it is therefore not modeled here. Only events documented by Retell AI are included - no fabricated fields.
Custom LLM WebSocket channel. Retell AI opens a connection to the developer's WebSocket server and exchanges JSON messages identified by `interaction_type` (Retell to LLM) or `response_type` (LLM to Retell).
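The routing described above can be sketched as a small dispatcher on the developer's server. This is a minimal sketch, not Retell's reference implementation: the wire values (`ping_pong`, `response_required`, `reminder_required`, `response`) are assumed from the message names in this description, the field names `timestamp`, `response_id`, `content` and `content_complete` are assumptions, and `handle_retell_message` is a hypothetical helper. Answering a heartbeat with a heartbeat is likewise an assumption.

```python
import json
from typing import Any, Dict, Optional

# Assumed wire values for messages that expect an LLM reply.
REPLY_TYPES = {"response_required", "reminder_required"}


def handle_retell_message(raw: str) -> Optional[Dict[str, Any]]:
    """Return the reply for one Retell -> LLM message, or None if no reply is needed."""
    event = json.loads(raw)
    kind = event.get("interaction_type")

    if kind == "ping_pong":
        # Heartbeat: answer with an LLM -> Retell heartbeat.
        # `timestamp` is an assumed field name.
        return {"response_type": "ping_pong", "timestamp": event.get("timestamp")}

    if kind in REPLY_TYPES:
        # Placeholder text stands in for real LLM output.
        # `response_id`, `content`, `content_complete` are assumed field names.
        return {
            "response_type": "response",
            "response_id": event.get("response_id"),
            "content": "Hello, how can I help?",
            "content_complete": True,
        }

    # call_details, update_only and unknown types do not require a reply.
    return None
```

A WebSocket handler would call `handle_retell_message` for each text frame and send `json.dumps(reply)` back whenever the result is not `None`.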
audio/{call_id}
publish clientToRetell
Messages sent from the frontend client to Retell.
Audio WebSocket channel (deprecated). Carries raw audio bytes from the frontend microphone to Retell, and a mix of raw audio bytes plus JSON / string control events from Retell back to the frontend.
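Because Retell sends a mix of binary and text frames on this channel, a frontend has to classify each incoming frame before acting on it. The sketch below shows one way to do that, assuming binary frames are always agent audio, the text frame `clear` is the flush signal, and any other text frame is a JSON event. `classify_frame` and its return labels are hypothetical names, not part of Retell's SDK.

```python
import json
from typing import Any, Tuple, Union


def classify_frame(frame: Union[bytes, bytearray, str]) -> Tuple[str, Any]:
    """Label one Retell -> Frontend frame as audio, clear, json, or unknown."""
    if isinstance(frame, (bytes, bytearray)):
        # Binary frames carry raw agent audio bytes.
        return ("audio", bytes(frame))
    if frame == "clear":
        # String literal: flush any buffered agent audio.
        return ("clear", None)
    try:
        # Remaining text frames are assumed to be JSON events
        # (update, audio alignment, metadata).
        return ("json", json.loads(frame))
    except json.JSONDecodeError:
        return ("unknown", frame)
```

Checking for the `clear` literal before attempting to parse JSON keeps the interruption path cheap, which matters because buffered audio should stop as soon as the user speaks.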
Messages
✉
RetellPingPong
Ping Pong (Retell -> LLM)
Heartbeat ping sent by Retell to the LLM server.
✉
RetellCallDetails
Call Details (Retell -> LLM)
Initial call metadata sent by Retell when `call_details` is enabled via the LLM-side config message.
✉
RetellUpdateOnly
Update Only (Retell -> LLM)
Transcript / turn-taking update that does not require an LLM response.
✉
RetellResponseRequired
Response Required (Retell -> LLM)
Retell expects the LLM server to produce a response.
✉
RetellReminderRequired
Reminder Required (Retell -> LLM)
Retell signals that a reminder response is required after silence.
✉
LlmConfig
Config (LLM -> Retell)
LLM server-side configuration sent to Retell.
✉
LlmUpdateAgent
Update Agent (LLM -> Retell)
Update agent-level runtime settings.
✉
LlmPingPong
Ping Pong (LLM -> Retell)
Heartbeat ping sent from the LLM server to Retell.
✉
LlmResponse
Response (LLM -> Retell)
Streamed agent response chunk.
✉
LlmAgentInterrupt
Agent Interrupt (LLM -> Retell)
Agent-initiated interruption.
✉
LlmToolCallInvocation
Tool Call Invocation (LLM -> Retell)
LLM reports a tool / function call invocation.
✉
LlmToolCallResult
Tool Call Result (LLM -> Retell)
LLM reports the result of a tool / function call.
✉
LlmMetadata
Metadata (LLM -> Retell)
Custom metadata sent by the LLM server to Retell.
✉
ClientAudioFrame
Client Audio Frame (Frontend -> Retell)
Raw microphone audio bytes streamed in 20-250 ms chunks.
✉
AgentAudioFrame
Agent Audio Frame (Retell -> Frontend)
Raw binary agent audio response bytes, emitted when `enable_audio_alignment=false`.
✉
AudioClear
Clear (Retell -> Frontend)
String literal "clear" sent when the user interrupts the agent so the client can flush any buffered agent audio.
✉
AudioUpdate
Update (Retell -> Frontend)
Live call update containing transcript and optional turn-taking info.
✉
AudioAlignment
Audio Alignment (Retell -> Frontend)
Emitted when `enable_audio_alignment=true`. JSON envelope containing base64-encoded agent audio aligned with the corresponding text.
✉
AudioMetadata
Metadata (Retell -> Frontend)
Custom metadata forwarded from the LLM server to the frontend.
Developer-hosted Custom LLM WebSocket server. Retell AI connects to this server using the call_id as a trailing path parameter (see the `custom_llm` channel). The endpoint format documented by Retell is `wss://your_domain_name/llm-websocket/{call_id}` (or `ws://localhost:3000/llm-websocket/{call_id}` during local testing).
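Since the call_id arrives as the trailing path segment, the developer's server has to recover it from the request path. A minimal sketch using only the standard library follows; `call_id_from_path` is a hypothetical helper, and rejecting paths with extra segments is a design assumption rather than documented Retell behavior.

```python
from typing import Optional
from urllib.parse import urlsplit

# Path prefix taken from the documented endpoint format.
PREFIX = "/llm-websocket/"


def call_id_from_path(url_or_path: str) -> Optional[str]:
    """Extract the trailing call_id from a full URL or a bare request path."""
    path = urlsplit(url_or_path).path
    if not path.startswith(PREFIX):
        return None
    call_id = path[len(PREFIX):].strip("/")
    # Reject an empty id or additional path segments.
    if not call_id or "/" in call_id:
        return None
    return call_id
```

For example, `call_id_from_path("ws://localhost:3000/llm-websocket/abc123")` yields `"abc123"`, while a path outside `/llm-websocket/` yields `None` so the server can refuse the connection.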
Retell-hosted Audio WebSocket (deprecated). Frontend clients connect to `wss://api.retellai.com/audio-websocket/{call_id}` to stream microphone audio to Retell and receive agent audio plus control events.
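Client audio frames are described as raw bytes streamed in 20-250 ms chunks, so a client needs to convert a duration into a byte count before sending. The sketch below does that arithmetic under loudly stated assumptions: 16-bit mono PCM at 16 kHz, which this description does not specify, and a hypothetical helper name `chunk_audio`.

```python
from typing import Iterator


def chunk_audio(
    pcm: bytes,
    sample_rate: int = 16000,   # assumed sample rate, not taken from this description
    bytes_per_sample: int = 2,  # assumed 16-bit mono PCM
    chunk_ms: int = 100,
) -> Iterator[bytes]:
    """Yield consecutive slices of `pcm`, each covering `chunk_ms` of audio."""
    if not 20 <= chunk_ms <= 250:
        raise ValueError("chunk_ms must be within the 20-250 ms range")
    chunk_bytes = sample_rate * bytes_per_sample * chunk_ms // 1000
    for start in range(0, len(pcm), chunk_bytes):
        # The final slice may be shorter than chunk_ms if the buffer
        # length is not a multiple of chunk_bytes.
        yield pcm[start:start + chunk_bytes]
```

Under these assumptions a 100 ms chunk is 3200 bytes, and each yielded slice would be sent as one binary frame on the Audio WebSocket.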