OpenAI · AsyncAPI Specification

OpenAI Realtime API

Version 2024-10-01

The OpenAI Realtime API provides low-latency, bidirectional, event-driven communication with multimodal models that natively support speech-to-speech, text, and audio in a single conversation. This AsyncAPI document describes the **WebSocket** transport for the Realtime API, including all documented client-to-server events and server-to-client events.

The Realtime API is currently in beta. Clients must include the `OpenAI-Beta: realtime=v1` header when connecting.

Connection URL: `wss://api.openai.com/v1/realtime?model={model}`

Events flow over a single full-duplex WebSocket connection. Every event has a top-level `type` and most events also carry an `event_id` correlation id.
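As a rough illustration of the connection requirements above, the following Python sketch builds the WebSocket URL and the headers for the upgrade request. The helper name `build_connection` and the placeholder key are hypothetical; the `Authorization: Bearer` header follows the server bindings in the specification further down. The sketch only prepares the values and leaves the actual connection to whichever WebSocket client library is in use.

```python
from urllib.parse import urlencode

REALTIME_HOST = "api.openai.com"
REALTIME_PATH = "/v1/realtime"


def build_connection(api_key: str, model: str) -> tuple[str, dict[str, str]]:
    """Return the WebSocket URL and the headers for the upgrade request.

    Hypothetical helper: it only assembles values, it does not connect.
    """
    # The model is passed as a query string parameter.
    query = urlencode({"model": model})
    url = f"wss://{REALTIME_HOST}{REALTIME_PATH}?{query}"
    headers = {
        # Bearer token, as listed in the server bindings of the spec.
        "Authorization": f"Bearer {api_key}",
        # Beta opt-in header required while the API is in beta.
        "OpenAI-Beta": "realtime=v1",
    }
    return url, headers


if __name__ == "__main__":
    # "sk-example" is a placeholder, not a real key.
    url, headers = build_connection("sk-example", "gpt-4o-realtime-preview-2024-10-01")
    print(url)
    print(sorted(headers))
```

The returned pair can be handed to any WebSocket client that accepts custom headers on the handshake.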


Channels

session.update
publish sendSessionUpdate
Update session configuration.
Sent by the client to update the session's default configuration (modalities, instructions, voice, audio formats, turn detection, tools, tool_choice, temperature, max_response_output_tokens).
input_audio_buffer.append
publish sendInputAudioBufferAppend
Append audio bytes to the input buffer.
Sent by the client to append base64-encoded audio bytes to the input audio buffer. The default audio format is `pcm16` at 24 kHz.
input_audio_buffer.commit
publish sendInputAudioBufferCommit
Commit the input audio buffer.
Sent by the client to commit the input audio buffer to the conversation as a user message. Required in non-VAD modes before requesting a response.
input_audio_buffer.clear
publish sendInputAudioBufferClear
Clear the input audio buffer.
Sent by the client to clear the input audio buffer without committing it.
conversation.item.create
publish sendConversationItemCreate
Insert a conversation item.
Sent by the client to insert a conversation item (a message, function_call, or function_call_output) into the conversation history.
conversation.item.truncate
publish sendConversationItemTruncate
Truncate an in-progress assistant item.
Sent by the client to truncate the assistant audio of an in-progress response item. Used for interruption: audio after `audio_end_ms` is discarded and any text after that point is cleared.
conversation.item.delete
publish sendConversationItemDelete
Delete a conversation item.
Sent by the client to delete a conversation item by id.
response.create
publish sendResponseCreate
Trigger a model response.
Sent by the client to instruct the model to generate a response. Optionally overrides the session configuration for this single response.
response.cancel
publish sendResponseCancel
Cancel an in-progress response.
Sent by the client to cancel an in-progress response.
error
subscribe receiveError
Receive an error event.
Server-emitted error envelope. Sent whenever a client event is invalid or the server encounters a problem processing a request.
session.created
subscribe receiveSessionCreated
Receive session.created.
Emitted by the server immediately after the WebSocket connection is authenticated. Contains the initial session configuration.
session.updated
subscribe receiveSessionUpdated
Receive session.updated.
Emitted after the server applies a `session.update` from the client.
conversation.created
subscribe receiveConversationCreated
Receive conversation.created.
Emitted by the server when a new conversation is created on the session.
conversation.item.created
subscribe receiveConversationItemCreated
Receive conversation.item.created.
Emitted when a new conversation item has been added (either by the client or by the model generating a response).
conversation.item.input_audio_transcription.completed
subscribe receiveInputAudioTranscriptionCompleted
Receive input_audio_transcription.completed.
Emitted when input audio transcription for a user audio item has completed (requires `input_audio_transcription` enabled on the session).
conversation.item.input_audio_transcription.failed
subscribe receiveInputAudioTranscriptionFailed
Receive input_audio_transcription.failed.
Emitted when input audio transcription fails for a user audio item.
conversation.item.truncated
subscribe receiveConversationItemTruncated
Receive conversation.item.truncated.
Emitted after the server applies a `conversation.item.truncate` request from the client.
conversation.item.deleted
subscribe receiveConversationItemDeleted
Receive conversation.item.deleted.
Emitted after the server applies a `conversation.item.delete` request.
input_audio_buffer.committed
subscribe receiveInputAudioBufferCommitted
Receive input_audio_buffer.committed.
Emitted when the input audio buffer is committed (either explicitly by the client via `input_audio_buffer.commit`, or implicitly by server VAD).
input_audio_buffer.cleared
subscribe receiveInputAudioBufferCleared
Receive input_audio_buffer.cleared.
Emitted after the server clears the input audio buffer.
input_audio_buffer.speech_started
subscribe receiveSpeechStarted
Receive input_audio_buffer.speech_started.
Emitted in server VAD mode when speech is detected starting in the input audio buffer.
input_audio_buffer.speech_stopped
subscribe receiveSpeechStopped
Receive input_audio_buffer.speech_stopped.
Emitted in server VAD mode when speech is detected stopping in the input audio buffer.
response.created
subscribe receiveResponseCreated
Receive response.created.
Emitted when the server begins generating a response after a `response.create` (explicit) or after server VAD commits a user turn.
response.done
subscribe receiveResponseDone
Receive response.done.
Emitted when a response has finished (status `completed`, `cancelled`, `failed`, or `incomplete`). Carries usage and final output items.
response.output_item.added
subscribe receiveResponseOutputItemAdded
Receive response.output_item.added.
Emitted when a new output item is added to a response.
response.output_item.done
subscribe receiveResponseOutputItemDone
Receive response.output_item.done.
Emitted when an output item on a response is complete.
response.content_part.added
subscribe receiveResponseContentPartAdded
Receive response.content_part.added.
Emitted when a new content part (text, audio, or transcript) is added to an output item.
response.content_part.done
subscribe receiveResponseContentPartDone
Receive response.content_part.done.
Emitted when a content part on an output item is complete.
response.text.delta
subscribe receiveResponseTextDelta
Receive response.text.delta.
Streaming text delta for a `text` content part on an assistant item.
response.text.done
subscribe receiveResponseTextDone
Receive response.text.done.
Emitted when a `text` content part is fully generated.
response.audio_transcript.delta
subscribe receiveResponseAudioTranscriptDelta
Receive response.audio_transcript.delta.
Streaming transcript delta for an `audio` content part on an assistant item.
response.audio_transcript.done
subscribe receiveResponseAudioTranscriptDone
Receive response.audio_transcript.done.
Emitted when the transcript for an `audio` content part is fully generated.
response.audio.delta
subscribe receiveResponseAudioDelta
Receive response.audio.delta.
Streaming base64-encoded audio delta for an `audio` content part on an assistant item.
response.audio.done
subscribe receiveResponseAudioDone
Receive response.audio.done.
Emitted when an `audio` content part is fully generated. No final base64 payload is included; clients reassemble from the deltas.
response.function_call_arguments.delta
subscribe receiveResponseFunctionCallArgumentsDelta
Receive response.function_call_arguments.delta.
Streaming delta for a tool/function call's `arguments` string.
response.function_call_arguments.done
subscribe receiveResponseFunctionCallArgumentsDone
Receive response.function_call_arguments.done.
Emitted when the `arguments` string for a function call is complete.
rate_limits.updated
subscribe receiveRateLimitsUpdated
Receive rate_limits.updated.
Emitted periodically with the current rate limit state for the connection (requests and tokens, remaining and reset_seconds).
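
To make the client-side flow in the channel list more concrete, here is a minimal Python sketch of one manually committed user turn (append, commit, then `response.create`) and of an interruption via `conversation.item.truncate`. Only the event `type` values, the base64 encoding, the `pcm16` 24 kHz default, and `audio_end_ms` come from the descriptions above. The payload field names `audio`, `item_id`, and `content_index`, the mono-audio assumption, and all function names are assumptions made for illustration and should be checked against the payload schemas.

```python
import base64
import json

SAMPLE_RATE_HZ = 24_000  # default pcm16 sample rate stated in the channel list
BYTES_PER_SAMPLE = 2     # 16-bit samples; mono audio is assumed here


def append_event(pcm16_chunk: bytes) -> str:
    """Wrap raw PCM16 bytes in an input_audio_buffer.append event."""
    audio_b64 = base64.b64encode(pcm16_chunk).decode("ascii")
    # "audio" is an assumed name for the payload field.
    return json.dumps({"type": "input_audio_buffer.append", "audio": audio_b64})


def manual_turn(pcm16_chunks: list[bytes]) -> list[str]:
    """Events for one user turn when server VAD is not used."""
    events = [append_event(chunk) for chunk in pcm16_chunks]
    # Without VAD the buffer must be committed before asking for a response.
    events.append(json.dumps({"type": "input_audio_buffer.commit"}))
    # No per-response overrides are sent, so session defaults apply.
    events.append(json.dumps({"type": "response.create"}))
    return events


def pcm16_duration_ms(num_bytes: int) -> int:
    """Playback duration of num_bytes of 24 kHz mono PCM16 audio."""
    return num_bytes * 1000 // (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)


def truncate_event(item_id: str, played_bytes: int) -> str:
    """Build conversation.item.truncate from the amount of audio played."""
    return json.dumps({
        "type": "conversation.item.truncate",
        "item_id": item_id,   # assumed field name
        "content_index": 0,   # assumed field name
        "audio_end_ms": pcm16_duration_ms(played_bytes),
    })


if __name__ == "__main__":
    for message in manual_turn([b"\x00\x01\x02\x03"]):
        print(message)
    print(truncate_event("item_123", 24_000))
```

Each returned string is one JSON text frame to send over the WebSocket. Deriving `audio_end_ms` from the number of bytes actually played keeps the server's view of the assistant item aligned with what the user heard.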

Messages

SessionUpdate
session.update
Update session configuration.
InputAudioBufferAppend
input_audio_buffer.append
Append audio bytes to the input buffer.
InputAudioBufferCommit
input_audio_buffer.commit
Commit the input audio buffer.
InputAudioBufferClear
input_audio_buffer.clear
Clear the input audio buffer.
ConversationItemCreate
conversation.item.create
Insert a conversation item.
ConversationItemTruncate
conversation.item.truncate
Truncate an assistant item's audio.
ConversationItemDelete
conversation.item.delete
Delete a conversation item.
ResponseCreate
response.create
Trigger a model response.
ResponseCancel
response.cancel
Cancel an in-progress response.
Error
error
Server error.
SessionCreated
session.created
Session has been created.
SessionUpdated
session.updated
Session configuration updated.
ConversationCreated
conversation.created
Conversation created.
ConversationItemCreated
conversation.item.created
Conversation item created.
InputAudioTranscriptionCompleted
conversation.item.input_audio_transcription.completed
Input audio transcription completed.
InputAudioTranscriptionFailed
conversation.item.input_audio_transcription.failed
Input audio transcription failed.
ConversationItemTruncated
conversation.item.truncated
Conversation item truncated.
ConversationItemDeleted
conversation.item.deleted
Conversation item deleted.
InputAudioBufferCommitted
input_audio_buffer.committed
Input audio buffer committed.
InputAudioBufferCleared
input_audio_buffer.cleared
Input audio buffer cleared.
InputAudioBufferSpeechStarted
input_audio_buffer.speech_started
VAD speech started.
InputAudioBufferSpeechStopped
input_audio_buffer.speech_stopped
VAD speech stopped.
ResponseCreated
response.created
Response generation started.
ResponseDone
response.done
Response generation finished.
ResponseOutputItemAdded
response.output_item.added
New output item added to response.
ResponseOutputItemDone
response.output_item.done
Output item on response complete.
ResponseContentPartAdded
response.content_part.added
Content part added to output item.
ResponseContentPartDone
response.content_part.done
Content part on output item complete.
ResponseTextDelta
response.text.delta
Text delta for assistant message.
ResponseTextDone
response.text.done
Text content part complete.
ResponseAudioTranscriptDelta
response.audio_transcript.delta
Transcript delta for audio content part.
ResponseAudioTranscriptDone
response.audio_transcript.done
Transcript for audio content part complete.
ResponseAudioDelta
response.audio.delta
Base64 audio delta for audio content part.
ResponseAudioDone
response.audio.done
Audio content part complete.
ResponseFunctionCallArgumentsDelta
response.function_call_arguments.delta
Function-call arguments delta.
ResponseFunctionCallArgumentsDone
response.function_call_arguments.done
Function-call arguments complete.
RateLimitsUpdated
rate_limits.updated
Current rate limit state.
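
On the receiving side, the streaming messages listed above (`response.audio.delta`, `response.function_call_arguments.delta`, and their `done` counterparts) imply that the client accumulates deltas itself, since `response.audio.done` carries no final payload. The Python sketch below shows one possible way to do that by dispatching on the top-level `type`. The class name and the field names `item_id`, `delta`, `response.status`, and `error` are assumptions for illustration and are not confirmed by the listing above.

```python
import base64
import json
from collections import defaultdict


class ResponseAssembler:
    """Collects streamed deltas per item (hypothetical helper)."""

    def __init__(self) -> None:
        self.audio: dict[str, bytearray] = defaultdict(bytearray)
        self.arguments: dict[str, str] = defaultdict(str)
        self.statuses: list[str] = []

    def handle(self, raw_message: str) -> None:
        event = json.loads(raw_message)
        kind = event["type"]  # every event has a top-level type
        if kind == "response.audio.delta":
            # Assumed fields: "item_id" and a base64 string in "delta".
            self.audio[event["item_id"]] += base64.b64decode(event["delta"])
        elif kind == "response.function_call_arguments.delta":
            # Arguments arrive as fragments of a single JSON string.
            self.arguments[event["item_id"]] += event["delta"]
        elif kind == "response.done":
            # Assumed shape: {"response": {"status": ...}}.
            self.statuses.append(event.get("response", {}).get("status", "unknown"))
        elif kind == "error":
            raise RuntimeError(f"server error event: {event.get('error')}")
        # All other event types are ignored in this sketch.


if __name__ == "__main__":
    assembler = ResponseAssembler()
    chunk = base64.b64encode(b"\x01\x02").decode("ascii")
    assembler.handle(json.dumps(
        {"type": "response.audio.delta", "item_id": "item_1", "delta": chunk}
    ))
    print(len(assembler.audio["item_1"]), "bytes of audio buffered")
```

Function-call arguments should only be parsed with `json.loads` once the corresponding `response.function_call_arguments.done` event has arrived, because intermediate fragments are usually not valid JSON on their own.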

Servers

wss
production api.openai.com/v1/realtime
OpenAI Realtime WebSocket endpoint. The `model` query parameter selects the underlying realtime-capable model (for example `gpt-4o-realtime-preview-2024-10-01`).

AsyncAPI Specification

asyncapi: '2.6.0'
info:
  title: OpenAI Realtime API
  version: '2024-10-01'
  description: |
    The OpenAI Realtime API provides low-latency, bidirectional, event-driven
    communication with multimodal models that natively support speech-to-speech,
    text, and audio in a single conversation. This AsyncAPI document describes
    the **WebSocket** transport for the Realtime API, including all documented
    client-to-server events and server-to-client events.

    The Realtime API is currently in beta. Clients must include the
    `OpenAI-Beta: realtime=v1` header when connecting.

    Connection URL:

        wss://api.openai.com/v1/realtime?model={model}

    Events flow over a single full-duplex WebSocket connection. Every event has
    a top-level `type` and most events also carry an `event_id` correlation id.
  contact:
    name: OpenAI
    url: https://platform.openai.com/docs/api-reference/realtime
  license:
    name: Proprietary
    url: https://openai.com/policies/terms-of-use
  termsOfService: https://openai.com/policies/terms-of-use
  x-source:
    derivedFrom:
      - https://platform.openai.com/docs/guides/realtime
      - https://platform.openai.com/docs/api-reference/realtime-client-events
      - https://platform.openai.com/docs/api-reference/realtime-server-events
      - https://github.com/openai/openai-realtime-api-beta

defaultContentType: application/json

servers:
  production:
    url: api.openai.com/v1/realtime
    protocol: wss
    description: |
      OpenAI Realtime WebSocket endpoint. The `model` query parameter selects
      the underlying realtime-capable model (for example
      `gpt-4o-realtime-preview-2024-10-01`).
    variables:
      model:
        description: Realtime model identifier appended as a query string parameter.
        default: gpt-4o-realtime-preview-2024-10-01
        examples:
          - gpt-4o-realtime-preview
          - gpt-4o-realtime-preview-2024-10-01
          - gpt-4o-mini-realtime-preview
    security:
      - bearerAuth: []
        openAiBeta: []
    bindings:
      ws:
        bindingVersion: '0.1.0'
        headers:
          type: object
          required:
            - Authorization
            - OpenAI-Beta
          properties:
            Authorization:
              type: string
              description: Bearer token in the form `Bearer <OPENAI_API_KEY>`.
            OpenAI-Beta:
              type: string
              description: Beta opt-in header. Must be set to `realtime=v1`.
              enum:
                - realtime=v1
            OpenAI-Organization:
              type: string
              description: Optional organization identifier for billing.
            OpenAI-Project:
              type: string
              description: Optional project identifier for billing.
        query:
          type: object
          required:
            - model
          properties:
            model:
              type: string
              description: Realtime model identifier.

channels:

  # ---------------------------------------------------------------------------
  # Client -> Server events
  # ---------------------------------------------------------------------------

  session.update:
    description: |
      Sent by the client to update the session's default configuration
      (modalities, instructions, voice, audio formats, turn detection, tools,
      tool_choice, temperature, max_response_output_tokens).
    publish:
      operationId: sendSessionUpdate
      summary: Update session configuration.
      message:
        $ref: '#/components/messages/SessionUpdate'

  input_audio_buffer.append:
    description: |
      Sent by the client to append base64-encoded audio bytes to the input
      audio buffer. The default audio format is `pcm16` at 24 kHz.
    publish:
      operationId: sendInputAudioBufferAppend
      summary: Append audio bytes to the input buffer.
      message:
        $ref: '#/components/messages/InputAudioBufferAppend'

  input_audio_buffer.commit:
    description: |
      Sent by the client to commit the input audio buffer to the conversation
      as a user message. Required in non-VAD modes before requesting a response.
    publish:
      operationId: sendInputAudioBufferCommit
      summary: Commit the input audio buffer.
      message:
        $ref: '#/components/messages/InputAudioBufferCommit'

  input_audio_buffer.clear:
    description: |
      Sent by the client to clear the input audio buffer without committing it.
    publish:
      operationId: sendInputAudioBufferClear
      summary: Clear the input audio buffer.
      message:
        $ref: '#/components/messages/InputAudioBufferClear'

  conversation.item.create:
    description: |
      Sent by the client to insert a conversation item (a message,
      function_call, or function_call_output) into the conversation history.
    publish:
      operationId: sendConversationItemCreate
      summary: Insert a conversation item.
      message:
        $ref: '#/components/messages/ConversationItemCreate'

  conversation.item.truncate:
    description: |
      Sent by the client to truncate the assistant audio of an in-progress
      response item. Used for interruption: audio after `audio_end_ms` is
      discarded and any text after that point is cleared.
    publish:
      operationId: sendConversationItemTruncate
      summary: Truncate an in-progress assistant item.
      message:
        $ref: '#/components/messages/ConversationItemTruncate'

  conversation.item.delete:
    description: |
      Sent by the client to delete a conversation item by id.
    publish:
      operationId: sendConversationItemDelete
      summary: Delete a conversation item.
      message:
        $ref: '#/components/messages/ConversationItemDelete'

  response.create:
    description: |
      Sent by the client to instruct the model to generate a response.
      Optionally overrides the session configuration for this single response.
    publish:
      operationId: sendResponseCreate
      summary: Trigger a model response.
      message:
        $ref: '#/components/messages/ResponseCreate'

  response.cancel:
    description: |
      Sent by the client to cancel an in-progress response.
    publish:
      operationId: sendResponseCancel
      summary: Cancel an in-progress response.
      message:
        $ref: '#/components/messages/ResponseCancel'

  # ---------------------------------------------------------------------------
  # Server -> Client events
  # ---------------------------------------------------------------------------

  error:
    description: |
      Server-emitted error envelope. Sent whenever a client event is invalid
      or the server encounters a problem processing a request.
    subscribe:
      operationId: receiveError
      summary: Receive an error event.
      message:
        $ref: '#/components/messages/Error'

  session.created:
    description: |
      Emitted by the server immediately after the WebSocket connection is
      authenticated. Contains the initial session configuration.
    subscribe:
      operationId: receiveSessionCreated
      summary: Receive session.created.
      message:
        $ref: '#/components/messages/SessionCreated'

  session.updated:
    description: |
      Emitted after the server applies a `session.update` from the client.
    subscribe:
      operationId: receiveSessionUpdated
      summary: Receive session.updated.
      message:
        $ref: '#/components/messages/SessionUpdated'

  conversation.created:
    description: |
      Emitted by the server when a new conversation is created on the
      session.
    subscribe:
      operationId: receiveConversationCreated
      summary: Receive conversation.created.
      message:
        $ref: '#/components/messages/ConversationCreated'

  conversation.item.created:
    description: |
      Emitted when a new conversation item has been added (either by the
      client or by the model generating a response).
    subscribe:
      operationId: receiveConversationItemCreated
      summary: Receive conversation.item.created.
      message:
        $ref: '#/components/messages/ConversationItemCreated'

  conversation.item.input_audio_transcription.completed:
    description: |
      Emitted when input audio transcription for a user audio item has
      completed (requires `input_audio_transcription` enabled on the session).
    subscribe:
      operationId: receiveInputAudioTranscriptionCompleted
      summary: Receive input_audio_transcription.completed.
      message:
        $ref: '#/components/messages/InputAudioTranscriptionCompleted'

  conversation.item.input_audio_transcription.failed:
    description: |
      Emitted when input audio transcription fails for a user audio item.
    subscribe:
      operationId: receiveInputAudioTranscriptionFailed
      summary: Receive input_audio_transcription.failed.
      message:
        $ref: '#/components/messages/InputAudioTranscriptionFailed'

  conversation.item.truncated:
    description: |
      Emitted after the server applies a `conversation.item.truncate` request
      from the client.
    subscribe:
      operationId: receiveConversationItemTruncated
      summary: Receive conversation.item.truncated.
      message:
        $ref: '#/components/messages/ConversationItemTruncated'

  conversation.item.deleted:
    description: |
      Emitted after the server applies a `conversation.item.delete` request.
    subscribe:
      operationId: receiveConversationItemDeleted
      summary: Receive conversation.item.deleted.
      message:
        $ref: '#/components/messages/ConversationItemDeleted'

  input_audio_buffer.committed:
    description: |
      Emitted when the input audio buffer is committed (either explicitly by
      the client via `input_audio_buffer.commit`, or implicitly by server VAD).
    subscribe:
      operationId: receiveInputAudioBufferCommitted
      summary: Receive input_audio_buffer.committed.
      message:
        $ref: '#/components/messages/InputAudioBufferCommitted'

  input_audio_buffer.cleared:
    description: |
      Emitted after the server clears the input audio buffer.
    subscribe:
      operationId: receiveInputAudioBufferCleared
      summary: Receive input_audio_buffer.cleared.
      message:
        $ref: '#/components/messages/InputAudioBufferCleared'

  input_audio_buffer.speech_started:
    description: |
      Emitted in server VAD mode when speech is detected starting in the
      input audio buffer.
    subscribe:
      operationId: receiveSpeechStarted
      summary: Receive input_audio_buffer.speech_started.
      message:
        $ref: '#/components/messages/InputAudioBufferSpeechStarted'

  input_audio_buffer.speech_stopped:
    description: |
      Emitted in server VAD mode when speech is detected stopping in the
      input audio buffer.
    subscribe:
      operationId: receiveSpeechStopped
      summary: Receive input_audio_buffer.speech_stopped.
      message:
        $ref: '#/components/messages/InputAudioBufferSpeechStopped'

  response.created:
    description: |
      Emitted when the server begins generating a response after a
      `response.create` (explicit) or after server VAD commits a user turn.
    subscribe:
      operationId: receiveResponseCreated
      summary: Receive response.created.
      message:
        $ref: '#/components/messages/ResponseCreated'

  response.done:
    description: |
      Emitted when a response has finished (status `completed`, `cancelled`,
      `failed`, or `incomplete`). Carries usage and final output items.
    subscribe:
      operationId: receiveResponseDone
      summary: Receive response.done.
      message:
        $ref: '#/components/messages/ResponseDone'

  response.output_item.added:
    description: |
      Emitted when a new output item is added to a response.
    subscribe:
      operationId: receiveResponseOutputItemAdded
      summary: Receive response.output_item.added.
      message:
        $ref: '#/components/messages/ResponseOutputItemAdded'

  response.output_item.done:
    description: |
      Emitted when an output item on a response is complete.
    subscribe:
      operationId: receiveResponseOutputItemDone
      summary: Receive response.output_item.done.
      message:
        $ref: '#/components/messages/ResponseOutputItemDone'

  response.content_part.added:
    description: |
      Emitted when a new content part (text, audio, or transcript) is added
      to an output item.
    subscribe:
      operationId: receiveResponseContentPartAdded
      summary: Receive response.content_part.added.
      message:
        $ref: '#/components/messages/ResponseContentPartAdded'

  response.content_part.done:
    description: |
      Emitted when a content part on an output item is complete.
    subscribe:
      operationId: receiveResponseContentPartDone
      summary: Receive response.content_part.done.
      message:
        $ref: '#/components/messages/ResponseContentPartDone'

  response.text.delta:
    description: |
      Streaming text delta for a `text` content part on an assistant item.
    subscribe:
      operationId: receiveResponseTextDelta
      summary: Receive response.text.delta.
      message:
        $ref: '#/components/messages/ResponseTextDelta'

  response.text.done:
    description: |
      Emitted when a `text` content part is fully generated.
    subscribe:
      operationId: receiveResponseTextDone
      summary: Receive response.text.done.
      message:
        $ref: '#/components/messages/ResponseTextDone'

  response.audio_transcript.delta:
    description: |
      Streaming transcript delta for an `audio` content part on an
      assistant item.
    subscribe:
      operationId: receiveResponseAudioTranscriptDelta
      summary: Receive response.audio_transcript.delta.
      message:
        $ref: '#/components/messages/ResponseAudioTranscriptDelta'

  response.audio_transcript.done:
    description: |
      Emitted when the transcript for an `audio` content part is fully
      generated.
    subscribe:
      operationId: receiveResponseAudioTranscriptDone
      summary: Receive response.audio_transcript.done.
      message:
        $ref: '#/components/messages/ResponseAudioTranscriptDone'

  response.audio.delta:
    description: |
      Streaming base64-encoded audio delta for an `audio` content part on an
      assistant item.
    subscribe:
      operationId: receiveResponseAudioDelta
      summary: Receive response.audio.delta.
      message:
        $ref: '#/components/messages/ResponseAudioDelta'

  response.audio.done:
    description: |
      Emitted when an `audio` content part is fully generated. No final
      base64 payload is included; clients reassemble from the deltas.
    subscribe:
      operationId: receiveResponseAudioDone
      summary: Receive response.audio.done.
      message:
        $ref: '#/components/messages/ResponseAudioDone'

  response.function_call_arguments.delta:
    description: |
      Streaming delta for a tool/function call's `arguments` string.
    subscribe:
      operationId: receiveResponseFunctionCallArgumentsDelta
      summary: Receive response.function_call_arguments.delta.
      message:
        $ref: '#/components/messages/ResponseFunctionCallArgumentsDelta'

  response.function_call_arguments.done:
    description: |
      Emitted when the `arguments` string for a function call is complete.
    subscribe:
      operationId: receiveResponseFunctionCallArgumentsDone
      summary: Receive response.function_call_arguments.done.
      message:
        $ref: '#/components/messages/ResponseFunctionCallArgumentsDone'

  rate_limits.updated:
    description: |
      Emitted periodically with the current rate limit state for the
      connection (requests and tokens, remaining and reset_seconds).
    subscribe:
      operationId: receiveRateLimitsUpdated
      summary: Receive rate_limits.updated.
      message:
        $ref: '#/components/messages/RateLimitsUpdated'

components:

  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      bearerFormat: OpenAI API Key
      description: |
        Bearer token authentication using an OpenAI API key. The header
        `Authorization: Bearer <OPENAI_API_KEY>` must be sent on the
        WebSocket upgrade request.
    openAiBeta:
      type: apiKey
      in: header
      name: OpenAI-Beta
      description: |
        Beta opt-in header. Must be set to `realtime=v1` for the Realtime
        API while it remains in beta.

  messages:

    # ----- Client -> Server -------------------------------------------------

    SessionUpdate:
      name: SessionUpdate
      title: session.update
      summary: Update session configuration.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/SessionUpdateEvent'

    InputAudioBufferAppend:
      name: InputAudioBufferAppend
      title: input_audio_buffer.append
      summary: Append audio bytes to the input buffer.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/InputAudioBufferAppendEvent'

    InputAudioBufferCommit:
      name: InputAudioBufferCommit
      title: input_audio_buffer.commit
      summary: Commit the input audio buffer.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/InputAudioBufferCommitEvent'

    InputAudioBufferClear:
      name: InputAudioBufferClear
      title: input_audio_buffer.clear
      summary: Clear the input audio buffer.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/InputAudioBufferClearEvent'

    ConversationItemCreate:
      name: ConversationItemCreate
      title: conversation.item.create
      summary: Insert a conversation item.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ConversationItemCreateEvent'

    ConversationItemTruncate:
      name: ConversationItemTruncate
      title: conversation.item.truncate
      summary: Truncate an assistant item's audio.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ConversationItemTruncateEvent'

    ConversationItemDelete:
      name: ConversationItemDelete
      title: conversation.item.delete
      summary: Delete a conversation item.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ConversationItemDeleteEvent'

    ResponseCreate:
      name: ResponseCreate
      title: response.create
      summary: Trigger a model response.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseCreateEvent'

    ResponseCancel:
      name: ResponseCancel
      title: response.cancel
      summary: Cancel an in-progress response.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseCancelEvent'

    # ----- Server -> Client -------------------------------------------------

    Error:
      name: Error
      title: error
      summary: Server error.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ErrorEvent'

    SessionCreated:
      name: SessionCreated
      title: session.created
      summary: Session has been created.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/SessionCreatedEvent'

    SessionUpdated:
      name: SessionUpdated
      title: session.updated
      summary: Session configuration updated.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/SessionUpdatedEvent'

    ConversationCreated:
      name: ConversationCreated
      title: conversation.created
      summary: Conversation created.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ConversationCreatedEvent'

    ConversationItemCreated:
      name: ConversationItemCreated
      title: conversation.item.created
      summary: Conversation item created.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ConversationItemCreatedEvent'

    InputAudioTranscriptionCompleted:
      name: InputAudioTranscriptionCompleted
      title: conversation.item.input_audio_transcription.completed
      summary: Input audio transcription completed.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/InputAudioTranscriptionCompletedEvent'

    InputAudioTranscriptionFailed:
      name: InputAudioTranscriptionFailed
      title: conversation.item.input_audio_transcription.failed
      summary: Input audio transcription failed.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/InputAudioTranscriptionFailedEvent'

    ConversationItemTruncated:
      name: ConversationItemTruncated
      title: conversation.item.truncated
      summary: Conversation item truncated.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ConversationItemTruncatedEvent'

    ConversationItemDeleted:
      name: ConversationItemDeleted
      title: conversation.item.deleted
      summary: Conversation item deleted.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ConversationItemDeletedEvent'

    InputAudioBufferCommitted:
      name: InputAudioBufferCommitted
      title: input_audio_buffer.committed
      summary: Input audio buffer committed.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/InputAudioBufferCommittedEvent'

    InputAudioBufferCleared:
      name: InputAudioBufferCleared
      title: input_audio_buffer.cleared
      summary: Input audio buffer cleared.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/InputAudioBufferClearedEvent'

    InputAudioBufferSpeechStarted:
      name: InputAudioBufferSpeechStarted
      title: input_audio_buffer.speech_started
      summary: Server VAD detected the start of speech in the input audio buffer.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/InputAudioBufferSpeechStartedEvent'
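      # Illustrative example, not part of the upstream spec. It assumes the
      # event reports where speech began (`audio_start_ms`) and the ID of the
      # user item that will be created. All values are placeholders.
      examples:
        - payload:
            event_id: event_1516
            type: input_audio_buffer.speech_started
            audio_start_ms: 1000
            item_id: msg_003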

    InputAudioBufferSpeechStopped:
      name: InputAudioBufferSpeechStopped
      title: input_audio_buffer.speech_stopped
      summary: Server VAD detected the end of speech in the input audio buffer.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/InputAudioBufferSpeechStoppedEvent'

    ResponseCreated:
      name: ResponseCreated
      title: response.created
      summary: Response generation started.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseCreatedEvent'

    ResponseDone:
      name: ResponseDone
      title: response.done
      summary: Response generation finished.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseDoneEvent'

    ResponseOutputItemAdded:
      name: ResponseOutputItemAdded
      title: response.output_item.added
      summary: New output item added to response.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseOutputItemAddedEvent'

    ResponseOutputItemDone:
      name: ResponseOutputItemDone
      title: response.output_item.done
      summary: Response output item finished streaming.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseOutputItemDoneEvent'

    ResponseContentPartAdded:
      name: ResponseContentPartAdded
      title: response.content_part.added
      summary: Content part added to output item.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseContentPartAddedEvent'

    ResponseContentPartDone:
      name: ResponseContentPartDone
      title: response.content_part.done
      summary: Content part of an output item finished streaming.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseContentPartDoneEvent'

    ResponseTextDelta:
      name: ResponseTextDelta
      title: response.text.delta
      summary: Text delta for assistant message.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseTextDeltaEvent'
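      # Illustrative example, not part of the upstream spec. A sketch of one
      # streamed text fragment, assuming the delta is addressed by response,
      # item, output index, and content index. IDs and text are placeholders.
      examples:
        - payload:
            event_id: event_4142
            type: response.text.delta
            response_id: resp_001
            item_id: msg_007
            output_index: 0
            content_index: 0
            delta: "Sure, I can h"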

    ResponseTextDone:
      name: ResponseTextDone
      title: response.text.done
      summary: Text content part complete.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseTextDoneEvent'

    ResponseAudioTranscriptDelta:
      name: ResponseAudioTranscriptDelta
      title: response.audio_transcript.delta
      summary: Transcript delta for audio content part.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseAudioTranscriptDeltaEvent'

    ResponseAudioTranscriptDone:
      name: ResponseAudioTranscriptDone
      title: response.audio_transcript.done
      summary: Transcript for audio content part complete.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseAudioTranscriptDoneEvent'

    ResponseAudioDelta:
      name: ResponseAudioDelta
      title: response.audio.delta
      summary: Base64 audio delta for audio content part.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseAudioDeltaEvent'

    ResponseAudioDone:
      name: ResponseAudioDone
      title: response.audio.done
      summary: Audio content part complete.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseAudioDoneEvent'

    ResponseFunctionCallArgumentsDelta:
      name: ResponseFunctionCallArgumentsDelta
      title: response.function_call_arguments.delta
      summary: Function-call arguments delta.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseFunctionCallArgumentsDeltaEvent'

    ResponseFunctionCallArgumentsDone:
      name: ResponseFunctionCallArgumentsDone
      title: response.function_call_arguments.done
      summary: Function-call arguments complete.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseFunctionCallArgumentsDoneEvent'
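      # Illustrative example, not part of the upstream spec. It assumes the
      # completed arguments arrive as a JSON-encoded string alongside a
      # `call_id`. The location value and all IDs are hypothetical.
      examples:
        - payload:
            event_id: event_5556
            type: response.function_call_arguments.done
            response_id: resp_002
            item_id: fc_001
            output_index: 0
            call_id: call_001
            arguments: '{"location": "San Francisco"}'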

    RateLimitsUpdated:
      name: RateLimitsUpdated
      title: rate_limits.updated
      summary: Current rate limit state.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/RateLimitsUpdatedEvent'
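      # Illustrative example, not part of the upstream spec. It assumes the
      # event lists one entry per limit category. The numbers are
      # placeholders, not real quotas.
      examples:
        - payload:
            event_id: event_5758
            type: rate_limits.updated
            rate_limits:
              - name: requests
                limit: 1000
                remaining: 999
                reset_seconds: 60
              - name: tokens
                limit: 50000
                remaining: 49950
                reset_seconds: 60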

  schemas:

    # ----- Shared resources ------------------------------------------------

    AudioFormat:
      type: string
      enum:
        - pcm16
        - g711_ulaw
        - g711_alaw
      description: Supported realtime audio formats. The default is `pcm16` at 24 kHz.

    Voice:
      type: string
      enum:
        - alloy
        - ash
        - ballad
        - coral
        - echo
        - sage
        - shimmer
        - verse
      description: Realtime model voice.

    Modality:
      type: string
      enum:
        - text
        - audio

    TurnDetection:
      type: object
      description: Server-side voice activity detection config. Set to `null` to disable.
      nullable: true
      properties:
        type:
          type: string
          enum:
            - server_vad
        threshold:
          type: number
          minimum: 0
          maximum: 1
          description: VAD activation threshold from 0 to 1 (default 0.5). Higher values require louder audio to trigger.
        prefix_padding_ms:
          type: integer
          description: Amount of audio, in milliseconds, to include before detected speech starts (default 300).
        silence_duration_ms:
          type: integer
          description: Duration of silence, in milliseconds, after which the turn is considered ended (default 200).
      required:
        - type
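      # Illustrative example, not part of the upstream spec: a server VAD
      # configuration that spells out the defaults documented in the
      # property descriptions above.
      examples:
        - type: server_vad
          threshold: 0.5
          prefix_padding_ms: 300
          silence_duration_ms: 200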

    InputAudioTranscription:
      type: object
      description: |
        Input audio transcription config. Set to `null` to disable. When
        enabled, the server emits `conversation.item.input_audio_transcription.completed`
        for each user audio item, or
        `conversation.item.input_audio_transcription.failed` if transcription fails.
      nullable: true
      properties:
        model:
          type: string
          enum:
            - whisper-1
      required:
        - model

    ToolDefinition:
      type: object
      properties:
        type:
          type: string
          enum:
            - function
        name:
          type: string
        description:
          type: string
        parameters:
          type: object
          description: JSON Schema describing the tool's parameters.
      required:
        - name
        - parameters
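      # Illustrative example, not part of the upstream spec. The tool name
      # `get_weather` and its `city` parameter are hypothetical.
      examples:
        - type: function
          name: get_weather
          description: Look up the current weather for a city.
          parameters:
            type: object
            properties:
              city:
                type: string
            required:
              - city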

    ToolChoice:
      oneOf:
        - type: string
          enum:
            - auto
            - none
            - required
        - type: object
          properties:
            type:
              type: string
              enum:
                - function
            name:
              type: string
          required:
            - type
            - name
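      # Illustrative examples, not part of the upstream spec: one value for
      # each branch of the `oneOf`. The function name `get_weather` is
      # hypothetical.
      examples:
        - auto
        - type: function
          name: get_weather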

    MaxResponseOutputTokens:
      oneOf:
        - type: integer
          minimum: 1
        - type: string
          enum:
            - inf
      description: Maximum number of output tokens per response, or `inf` for no explicit limit.

    Session:
      type: object
      description: Server-side session configuration.
      properties:
        id:
          type: string
        object:
          type: string
          enum:
            - realtime.session
        model:
          type: string
        modalities:
          type: array
          items:
            $ref: '#/components/schemas/Modality'
        instructions:
          type: string
        voice:
          $ref: '#/components/schemas/Voice'
        input_audio_format:
          $ref: '#/components/schemas/AudioFormat'
        output_audio_format:
          $ref: '#/components/schemas/AudioFormat'
        input_audio_transcription:
          $ref: '#/components/schemas/InputAudioTranscription'
        turn_detection:
          $ref: '#/components/schemas/TurnDetection'
        tools:
          type: array
          items:
            $ref: '#/components/schemas/ToolDefinition'
        tool_choice:
          $ref: '#/components/schemas/ToolChoice'
        temperature:
          type: number
        max_response_output_tokens:
          $ref: '#/components/schemas/MaxResponseOutputTokens'

    SessionPatch:
      type: object
      description: |
        Subset of session fields that may be supplied on `session.update`.
        Only the properties included are modified; omitted properties keep
        their current values.
      properties:
        modalities:
          type: array
          items:
            $ref: '#/components/schemas/Modality'
        instructions:
          type: string
        voice:
          $ref: '#/components/schemas/Voice'
        input_audio_format:
          $ref: '#/components/schemas/AudioFormat'
        output_audio_format:
          $ref: '#/components/schemas/AudioFormat'
        input_audio_transcription:
          $ref: '#/components/schemas/InputAudioTranscription'
        turn_detection:
          $ref: '#/components/schemas/TurnDetection'
        tools:
          type: array
          items:
            $ref: '#/components/schemas/ToolDefinition'
        tool_choice:
          $ref: '#/components/schemas/ToolChoice'
        temperature:
          type: number
        max_response_output_tokens:
          $ref: '#/components/schemas/MaxResponseOutputTokens'
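      # Illustrative example, not part of the upstream spec: a partial update
      # that changes the instructions and voice, disables server VAD by
      # setting `turn_detection` to null, and caps output length. The
      # instruction text and token cap are arbitrary placeholder values.
      examples:
        - instructions: Answer briefly and in English.
          voice: alloy
          turn_detection: null
          max_response_output_tokens: 1024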

    Conversation:
      type: object
      properties:
        id:
          type: string
        object:
          type: string
          enum:
            - realtime.conversation

    ItemStatus:
      type: string
      enum:
        - in_progress
        - completed
        - incomplete

    InputTextContent:
      type: object
      properties:
        type:
          type: string
          enum:
            - input_text
        text:
          type: string
      required:
        - type
        - text

    InputAudioContent:
      type: object
      properties:
        type:
          type: string
          enum:
            - input_audio
        audio:
          type: string
          description: Base64-encoded audio bytes in the session's `input_audio_format`.
        transcript:
          type: string
          nullable: true
      required:
        - type

    TextContent:
      type: object
      properties:
        type:
          type: string
          enum:
            - text
        text:
          type: string
      required:
        - type
        - text

    AudioContent:
      type: object
      properties:
        type:
          type: string
          enum:
            - audio
        audio:
          type: string
          description: Base64-encoded audio bytes in the session's `output_audio_format`.
        transcript:
          type: string
          nullable: true
      required:
        - type

    ContentPart:


# --- truncated at 32 KB (53 KB total) ---
# Full source: https://raw.githubusercontent.com/api-evangelist/openai/refs/heads/main/asyncapi/openai-realtime-asyncapi.yml