Azure OpenAI Service - Streaming and Realtime APIs
Version 2025-04-01-preview
AsyncAPI 2.6 description of the asynchronous and streaming surfaces of the Azure OpenAI Service (part of Microsoft Foundry Models): * The **Realtime API** over a WebSocket connection used for low-latency, bidirectional, multimodal (audio + text + function-calling) conversations with GPT-class realtime-enabled models. * The **Chat Completions streaming API** over HTTP + Server-Sent Events used for incremental delivery of `chat.completion.chunk` deltas while a chat completion is being generated. The Azure Realtime API follows the OpenAI Realtime API specification. Azure deviates from the OpenAI reference in that the `model` field inside `input_audio_transcription` configuration must reference a deployed Azure model (a deployment name such as `my-gpt-4o-transcribe-deployment`) rather than the raw OpenAI model identifier.
View SpecView on GitHubAILLMGenerative AIAzureOpenAIFoundation ModelsChat CompletionsEmbeddingsAsyncAPIWebhooksEvents
Channels
/openai/realtime#client
publishsendRealtimeClientEvent
Publish any Realtime client event to the server.
Aggregate stream of client events sent by the application to the Azure OpenAI Realtime service over the WebSocket connection.
/openai/realtime#server
subscribereceiveRealtimeServerEvent
Subscribe to any Realtime server event from the service.
Aggregate stream of server events emitted by the Azure OpenAI Realtime service to the connected client.
Azure OpenAI chat completions endpoint. When the request body contains `"stream": true`, the response is delivered as a `text/event-stream` stream of `chat.completion.chunk` events, terminated by a `data: [DONE]` sentinel.
Messages
✉
SessionUpdate
session.update
Update the session configuration on the server.
✉
InputAudioBufferAppend
input_audio_buffer.append
Append base64-encoded audio bytes to the user's input audio buffer.
✉
InputAudioBufferCommit
input_audio_buffer.commit
Commit the user input audio buffer, creating a new user message.
✉
InputAudioBufferClear
input_audio_buffer.clear
Clear the audio bytes in the input audio buffer.
✉
OutputAudioBufferClear
output_audio_buffer.clear
Clear the model output audio buffer (WebRTC-only on OpenAI; included for parity).
✉
ConversationItemCreate
conversation.item.create
Add a new item (message, function call, function output) to the conversation.
✉
ConversationItemRetrieve
conversation.item.retrieve
Retrieve the server's representation of an item in the conversation.
✉
ConversationItemTruncate
conversation.item.truncate
Truncate a previous assistant message's audio.
✉
ConversationItemDelete
conversation.item.delete
Remove an item from the conversation history.
✉
ResponseCreate
response.create
Trigger a model response (out-of-band or normal).
✉
ResponseCancel
response.cancel
Cancel an in-progress response.
✉
TranscriptionSessionUpdate
transcription_session.update
Update the configuration of a realtime transcription session.
✉
Error
error
An error occurred on the server.
✉
SessionCreated
session.created
A new realtime session was created.
✉
SessionUpdated
session.updated
A session was updated.
✉
TranscriptionSessionUpdated
transcription_session.updated
A transcription session was updated.
✉
ConversationCreated
conversation.created
A conversation was created.
✉
ConversationItemAdded
conversation.item.added
An item was added to the conversation.
✉
ConversationItemCreated
conversation.item.created
A conversation item was created (in response to `conversation.item.create`).
Azure OpenAI Realtime WebSocket endpoint. The connection is authenticated with either an Azure OpenAI `api-key` header or a Microsoft Entra ID bearer token via the `Authorization` header.
Azure OpenAI data-plane HTTPS endpoint. Streaming responses for chat completions are returned as `text/event-stream` Server-Sent Events when the request body sets `"stream": true`.