OpenAI · AsyncAPI Specification

OpenAI Realtime API

Version 2024-10-01

The OpenAI Realtime API provides low-latency, bidirectional, event-driven communication with multimodal models that natively support speech-to-speech, text, and audio in a single conversation. This AsyncAPI document describes the **WebSocket** transport for the Realtime API, including all documented client-to-server events and server-to-client events. The Realtime API is currently in beta. Clients must include the `OpenAI-Beta: realtime=v1` header when connecting. Connection URL: wss://api.openai.com/v1/realtime?model={model} Events flow over a single full-duplex WebSocket connection. Every event has a top-level `type` and most events also carry an `event_id` correlation id.

View Spec View on GitHub AIArtificial IntelligenceLarge Language ModelsT1AsyncAPIWebhooksEvents

Channels

session.update

publish sendSessionUpdate

Update session configuration.

Send by the client to update the session's default configuration (modalities, instructions, voice, audio formats, turn detection, tools, tool_choice, temperature, max_response_output_tokens).

input_audio_buffer.append

publish sendInputAudioBufferAppend

Append audio bytes to the input buffer.

Send by the client to append base64-encoded audio bytes to the input audio buffer. The default audio format is `pcm16` at 24 kHz.

input_audio_buffer.commit

publish sendInputAudioBufferCommit

Commit the input audio buffer.

Send by the client to commit the input audio buffer to the conversation as a user message. Required in non-VAD modes before requesting a response.

input_audio_buffer.clear

publish sendInputAudioBufferClear

Clear the input audio buffer.

Send by the client to clear the input audio buffer without committing it.

conversation.item.create

publish sendConversationItemCreate

Insert a conversation item.

Send by the client to insert a conversation item (a message, function_call, or function_call_output) into the conversation history.

conversation.item.truncate

publish sendConversationItemTruncate

Truncate an in-progress assistant item.

Send by the client to truncate the assistant audio of an in-progress response item. Used for interruption: audio after `audio_end_ms` is discarded and any text after that point is cleared.

conversation.item.delete

publish sendConversationItemDelete

Delete a conversation item.

Send by the client to delete a conversation item by id.

response.create

publish sendResponseCreate

Trigger a model response.

Send by the client to instruct the model to generate a response. Optionally overrides the session configuration for this single response.

response.cancel

publish sendResponseCancel

Cancel an in-progress response.

Send by the client to cancel an in-progress response.

error

subscribe receiveError

Receive an error event.

Server-emitted error envelope. Sent whenever a client event is invalid or the server encounters a problem processing a request.

session.created

subscribe receiveSessionCreated

Receive session.created.

Emitted by the server immediately after the WebSocket connection is authenticated. Contains the initial session configuration.

session.updated

subscribe receiveSessionUpdated

Receive session.updated.

Emitted after the server applies a `session.update` from the client.

conversation.created

subscribe receiveConversationCreated

Receive conversation.created.

Emitted by the server when a new conversation is created on the session.

conversation.item.created

subscribe receiveConversationItemCreated

Receive conversation.item.created.

Emitted when a new conversation item has been added (either by the client or by the model generating a response).

conversation.item.input_audio_transcription.completed

subscribe receiveInputAudioTranscriptionCompleted

Receive input_audio_transcription.completed.

Emitted when input audio transcription for a user audio item has completed (requires `input_audio_transcription` enabled on the session).

conversation.item.input_audio_transcription.failed

subscribe receiveInputAudioTranscriptionFailed

Receive input_audio_transcription.failed.

Emitted when input audio transcription fails for a user audio item.

conversation.item.truncated

subscribe receiveConversationItemTruncated

Receive conversation.item.truncated.

Emitted after the server applies a `conversation.item.truncate` request from the client.

conversation.item.deleted

subscribe receiveConversationItemDeleted

Receive conversation.item.deleted.

Emitted after the server applies a `conversation.item.delete` request.

input_audio_buffer.committed

subscribe receiveInputAudioBufferCommitted

Receive input_audio_buffer.committed.

Emitted when the input audio buffer is committed (either explicitly by the client via `input_audio_buffer.commit`, or implicitly by server VAD).

input_audio_buffer.cleared

subscribe receiveInputAudioBufferCleared

Receive input_audio_buffer.cleared.

Emitted after the server clears the input audio buffer.

input_audio_buffer.speech_started

subscribe receiveSpeechStarted

Receive input_audio_buffer.speech_started.

Emitted in server VAD mode when speech is detected starting in the input audio buffer.

input_audio_buffer.speech_stopped

subscribe receiveSpeechStopped

Receive input_audio_buffer.speech_stopped.

Emitted in server VAD mode when speech is detected stopping in the input audio buffer.

response.created

subscribe receiveResponseCreated

Receive response.created.

Emitted when the server begins generating a response after a `response.create` (explicit) or after server VAD commits a user turn.

response.done

subscribe receiveResponseDone

Receive response.done.

Emitted when a response has finished (status `completed`, `cancelled`, `failed`, or `incomplete`). Carries usage and final output items.

response.output_item.added

subscribe receiveResponseOutputItemAdded

Receive response.output_item.added.

Emitted when a new output item is added to a response.

response.output_item.done

subscribe receiveResponseOutputItemDone

Receive response.output_item.done.

Emitted when an output item on a response is complete.

response.content_part.added

subscribe receiveResponseContentPartAdded

Receive response.content_part.added.

Emitted when a new content part (text, audio, or transcript) is added to an output item.

response.content_part.done

subscribe receiveResponseContentPartDone

Receive response.content_part.done.

Emitted when a content part on an output item is complete.

response.text.delta

subscribe receiveResponseTextDelta

Receive response.text.delta.

Streaming text delta for a `text` content part on an assistant item.

response.text.done

subscribe receiveResponseTextDone

Receive response.text.done.

Emitted when a `text` content part is fully generated.

response.audio_transcript.delta

subscribe receiveResponseAudioTranscriptDelta

Receive response.audio_transcript.delta.

Streaming transcript delta for an `audio` content part on an assistant item.

response.audio_transcript.done

subscribe receiveResponseAudioTranscriptDone

Receive response.audio_transcript.done.

Emitted when the transcript for an `audio` content part is fully generated.

response.audio.delta

subscribe receiveResponseAudioDelta

Receive response.audio.delta.

Streaming base64-encoded audio delta for an `audio` content part on an assistant item.

response.audio.done

subscribe receiveResponseAudioDone

Receive response.audio.done.

Emitted when an `audio` content part is fully generated. No final base64 payload is included; clients reassemble from the deltas.

response.function_call_arguments.delta

subscribe receiveResponseFunctionCallArgumentsDelta

Receive response.function_call_arguments.delta.

Streaming delta for a tool/function call's `arguments` string.

response.function_call_arguments.done

subscribe receiveResponseFunctionCallArgumentsDone

Receive response.function_call_arguments.done.

Emitted when the `arguments` string for a function call is complete.

rate_limits.updated

subscribe receiveRateLimitsUpdated

Receive rate_limits.updated.

Emitted periodically with the current rate limit state for the connection (requests and tokens, remaining and reset_seconds).

Messages

✉

SessionUpdate

session.update

Update session configuration.

✉

InputAudioBufferAppend

input_audio_buffer.append

Append audio bytes to the input buffer.

✉

InputAudioBufferCommit

input_audio_buffer.commit

Commit the input audio buffer.

✉

InputAudioBufferClear

input_audio_buffer.clear

Clear the input audio buffer.

✉

ConversationItemCreate

conversation.item.create

Insert a conversation item.

✉

ConversationItemTruncate

conversation.item.truncate

Truncate an assistant item's audio.

✉

ConversationItemDelete

conversation.item.delete

Delete a conversation item.

✉

ResponseCreate

response.create

Trigger a model response.

✉

ResponseCancel

response.cancel

Cancel an in-progress response.

✉

Error

error

Server error.

✉

SessionCreated

session.created

Session has been created.

✉

SessionUpdated

session.updated

Session configuration updated.

✉

ConversationCreated

conversation.created

Conversation created.

✉

ConversationItemCreated

conversation.item.created

Conversation item created.

✉

InputAudioTranscriptionCompleted

conversation.item.input_audio_transcription.completed

Input audio transcription completed.

✉

InputAudioTranscriptionFailed

conversation.item.input_audio_transcription.failed

Input audio transcription failed.

✉

ConversationItemTruncated

conversation.item.truncated

Conversation item truncated.

✉

ConversationItemDeleted

conversation.item.deleted

Conversation item deleted.

✉

InputAudioBufferCommitted

input_audio_buffer.committed

Input audio buffer committed.

✉

InputAudioBufferCleared

input_audio_buffer.cleared

Input audio buffer cleared.

✉

InputAudioBufferSpeechStarted

input_audio_buffer.speech_started

VAD speech started.

✉

InputAudioBufferSpeechStopped

input_audio_buffer.speech_stopped

VAD speech stopped.

✉

ResponseCreated

response.created

Response generation started.

✉

ResponseDone

response.done

Response generation finished.

✉

ResponseOutputItemAdded

response.output_item.added

New output item added to response.

✉

ResponseOutputItemDone

response.output_item.done

Output item on response complete.

✉

ResponseContentPartAdded

response.content_part.added

Content part added to output item.

✉

ResponseContentPartDone

response.content_part.done

Content part on output item complete.

✉

ResponseTextDelta

response.text.delta

Text delta for assistant message.

✉

ResponseTextDone

response.text.done

Text content part complete.

✉

ResponseAudioTranscriptDelta

response.audio_transcript.delta

Transcript delta for audio content part.

✉

ResponseAudioTranscriptDone

response.audio_transcript.done

Transcript for audio content part complete.

✉

ResponseAudioDelta

response.audio.delta

Base64 audio delta for audio content part.

✉

ResponseAudioDone

response.audio.done

Audio content part complete.

✉

ResponseFunctionCallArgumentsDelta

response.function_call_arguments.delta

Function-call arguments delta.

✉

ResponseFunctionCallArgumentsDone

response.function_call_arguments.done

Function-call arguments complete.

✉

RateLimitsUpdated

rate_limits.updated

Current rate limit state.

Servers

wss

production api.openai.com/v1/realtime

OpenAI Realtime WebSocket endpoint. The `model` query parameter selects the underlying realtime-capable model (for example `gpt-4o-realtime-preview-2024-10-01`).

AsyncAPI Specification

asyncapi: '2.6.0'
info:
  title: OpenAI Realtime API
  version: '2024-10-01'
  description: |
    The OpenAI Realtime API provides low-latency, bidirectional, event-driven
    communication with multimodal models that natively support speech-to-speech,
    text, and audio in a single conversation. This AsyncAPI document describes
    the **WebSocket** transport for the Realtime API, including all documented
    client-to-server events and server-to-client events.

    The Realtime API is currently in beta. Clients must include the
    `OpenAI-Beta: realtime=v1` header when connecting.

    Connection URL:

        wss://api.openai.com/v1/realtime?model={model}

    Events flow over a single full-duplex WebSocket connection. Every event has
    a top-level `type` and most events also carry an `event_id` correlation id.
  contact:
    name: OpenAI
    url: https://platform.openai.com/docs/api-reference/realtime
  license:
    name: Proprietary
    url: https://openai.com/policies/terms-of-use
  termsOfService: https://openai.com/policies/terms-of-use
  x-source:
    derivedFrom:
      - https://platform.openai.com/docs/guides/realtime
      - https://platform.openai.com/docs/api-reference/realtime-client-events
      - https://platform.openai.com/docs/api-reference/realtime-server-events
      - https://github.com/openai/openai-realtime-api-beta

defaultContentType: application/json

servers:
  production:
    url: api.openai.com/v1/realtime
    protocol: wss
    description: |
      OpenAI Realtime WebSocket endpoint. The `model` query parameter selects
      the underlying realtime-capable model (for example
      `gpt-4o-realtime-preview-2024-10-01`).
    variables:
      model:
        description: Realtime model identifier appended as a query string parameter.
        default: gpt-4o-realtime-preview-2024-10-01
        examples:
          - gpt-4o-realtime-preview
          - gpt-4o-realtime-preview-2024-10-01
          - gpt-4o-mini-realtime-preview
    security:
      - bearerAuth: []
        openAiBeta: []
    bindings:
      ws:
        bindingVersion: '0.1.0'
        headers:
          type: object
          required:
            - Authorization
            - OpenAI-Beta
          properties:
            Authorization:
              type: string
              description: Bearer token in the form `Bearer <OPENAI_API_KEY>`.
            OpenAI-Beta:
              type: string
              description: Beta opt-in header. Must be set to `realtime=v1`.
              enum:
                - realtime=v1
            OpenAI-Organization:
              type: string
              description: Optional organization identifier for billing.
            OpenAI-Project:
              type: string
              description: Optional project identifier for billing.
        query:
          type: object
          required:
            - model
          properties:
            model:
              type: string
              description: Realtime model identifier.

channels:

  # ---------------------------------------------------------------------------
  # Client -> Server events
  # ---------------------------------------------------------------------------

  session.update:
    description: |
      Send by the client to update the session's default configuration
      (modalities, instructions, voice, audio formats, turn detection, tools,
      tool_choice, temperature, max_response_output_tokens).
    publish:
      operationId: sendSessionUpdate
      summary: Update session configuration.
      message:
        $ref: '#/components/messages/SessionUpdate'

  input_audio_buffer.append:
    description: |
      Send by the client to append base64-encoded audio bytes to the input
      audio buffer. The default audio format is `pcm16` at 24 kHz.
    publish:
      operationId: sendInputAudioBufferAppend
      summary: Append audio bytes to the input buffer.
      message:
        $ref: '#/components/messages/InputAudioBufferAppend'

  input_audio_buffer.commit:
    description: |
      Send by the client to commit the input audio buffer to the conversation
      as a user message. Required in non-VAD modes before requesting a response.
    publish:
      operationId: sendInputAudioBufferCommit
      summary: Commit the input audio buffer.
      message:
        $ref: '#/components/messages/InputAudioBufferCommit'

  input_audio_buffer.clear:
    description: |
      Send by the client to clear the input audio buffer without committing it.
    publish:
      operationId: sendInputAudioBufferClear
      summary: Clear the input audio buffer.
      message:
        $ref: '#/components/messages/InputAudioBufferClear'

  conversation.item.create:
    description: |
      Send by the client to insert a conversation item (a message,
      function_call, or function_call_output) into the conversation history.
    publish:
      operationId: sendConversationItemCreate
      summary: Insert a conversation item.
      message:
        $ref: '#/components/messages/ConversationItemCreate'

  conversation.item.truncate:
    description: |
      Send by the client to truncate the assistant audio of an in-progress
      response item. Used for interruption: audio after `audio_end_ms` is
      discarded and any text after that point is cleared.
    publish:
      operationId: sendConversationItemTruncate
      summary: Truncate an in-progress assistant item.
      message:
        $ref: '#/components/messages/ConversationItemTruncate'

  conversation.item.delete:
    description: |
      Send by the client to delete a conversation item by id.
    publish:
      operationId: sendConversationItemDelete
      summary: Delete a conversation item.
      message:
        $ref: '#/components/messages/ConversationItemDelete'

  response.create:
    description: |
      Send by the client to instruct the model to generate a response.
      Optionally overrides the session configuration for this single response.
    publish:
      operationId: sendResponseCreate
      summary: Trigger a model response.
      message:
        $ref: '#/components/messages/ResponseCreate'

  response.cancel:
    description: |
      Send by the client to cancel an in-progress response.
    publish:
      operationId: sendResponseCancel
      summary: Cancel an in-progress response.
      message:
        $ref: '#/components/messages/ResponseCancel'

  # ---------------------------------------------------------------------------
  # Server -> Client events
  # ---------------------------------------------------------------------------

  error:
    description: |
      Server-emitted error envelope. Sent whenever a client event is invalid
      or the server encounters a problem processing a request.
    subscribe:
      operationId: receiveError
      summary: Receive an error event.
      message:
        $ref: '#/components/messages/Error'

  session.created:
    description: |
      Emitted by the server immediately after the WebSocket connection is
      authenticated. Contains the initial session configuration.
    subscribe:
      operationId: receiveSessionCreated
      summary: Receive session.created.
      message:
        $ref: '#/components/messages/SessionCreated'

  session.updated:
    description: |
      Emitted after the server applies a `session.update` from the client.
    subscribe:
      operationId: receiveSessionUpdated
      summary: Receive session.updated.
      message:
        $ref: '#/components/messages/SessionUpdated'

  conversation.created:
    description: |
      Emitted by the server when a new conversation is created on the
      session.
    subscribe:
      operationId: receiveConversationCreated
      summary: Receive conversation.created.
      message:
        $ref: '#/components/messages/ConversationCreated'

  conversation.item.created:
    description: |
      Emitted when a new conversation item has been added (either by the
      client or by the model generating a response).
    subscribe:
      operationId: receiveConversationItemCreated
      summary: Receive conversation.item.created.
      message:
        $ref: '#/components/messages/ConversationItemCreated'

  conversation.item.input_audio_transcription.completed:
    description: |
      Emitted when input audio transcription for a user audio item has
      completed (requires `input_audio_transcription` enabled on the session).
    subscribe:
      operationId: receiveInputAudioTranscriptionCompleted
      summary: Receive input_audio_transcription.completed.
      message:
        $ref: '#/components/messages/InputAudioTranscriptionCompleted'

  conversation.item.input_audio_transcription.failed:
    description: |
      Emitted when input audio transcription fails for a user audio item.
    subscribe:
      operationId: receiveInputAudioTranscriptionFailed
      summary: Receive input_audio_transcription.failed.
      message:
        $ref: '#/components/messages/InputAudioTranscriptionFailed'

  conversation.item.truncated:
    description: |
      Emitted after the server applies a `conversation.item.truncate` request
      from the client.
    subscribe:
      operationId: receiveConversationItemTruncated
      summary: Receive conversation.item.truncated.
      message:
        $ref: '#/components/messages/ConversationItemTruncated'

  conversation.item.deleted:
    description: |
      Emitted after the server applies a `conversation.item.delete` request.
    subscribe:
      operationId: receiveConversationItemDeleted
      summary: Receive conversation.item.deleted.
      message:
        $ref: '#/components/messages/ConversationItemDeleted'

  input_audio_buffer.committed:
    description: |
      Emitted when the input audio buffer is committed (either explicitly by
      the client via `input_audio_buffer.commit`, or implicitly by server VAD).
    subscribe:
      operationId: receiveInputAudioBufferCommitted
      summary: Receive input_audio_buffer.committed.
      message:
        $ref: '#/components/messages/InputAudioBufferCommitted'

  input_audio_buffer.cleared:
    description: |
      Emitted after the server clears the input audio buffer.
    subscribe:
      operationId: receiveInputAudioBufferCleared
      summary: Receive input_audio_buffer.cleared.
      message:
        $ref: '#/components/messages/InputAudioBufferCleared'

  input_audio_buffer.speech_started:
    description: |
      Emitted in server VAD mode when speech is detected starting in the
      input audio buffer.
    subscribe:
      operationId: receiveSpeechStarted
      summary: Receive input_audio_buffer.speech_started.
      message:
        $ref: '#/components/messages/InputAudioBufferSpeechStarted'

  input_audio_buffer.speech_stopped:
    description: |
      Emitted in server VAD mode when speech is detected stopping in the
      input audio buffer.
    subscribe:
      operationId: receiveSpeechStopped
      summary: Receive input_audio_buffer.speech_stopped.
      message:
        $ref: '#/components/messages/InputAudioBufferSpeechStopped'

  response.created:
    description: |
      Emitted when the server begins generating a response after a
      `response.create` (explicit) or after server VAD commits a user turn.
    subscribe:
      operationId: receiveResponseCreated
      summary: Receive response.created.
      message:
        $ref: '#/components/messages/ResponseCreated'

  response.done:
    description: |
      Emitted when a response has finished (status `completed`, `cancelled`,
      `failed`, or `incomplete`). Carries usage and final output items.
    subscribe:
      operationId: receiveResponseDone
      summary: Receive response.done.
      message:
        $ref: '#/components/messages/ResponseDone'

  response.output_item.added:
    description: |
      Emitted when a new output item is added to a response.
    subscribe:
      operationId: receiveResponseOutputItemAdded
      summary: Receive response.output_item.added.
      message:
        $ref: '#/components/messages/ResponseOutputItemAdded'

  response.output_item.done:
    description: |
      Emitted when an output item on a response is complete.
    subscribe:
      operationId: receiveResponseOutputItemDone
      summary: Receive response.output_item.done.
      message:
        $ref: '#/components/messages/ResponseOutputItemDone'

  response.content_part.added:
    description: |
      Emitted when a new content part (text, audio, or transcript) is added
      to an output item.
    subscribe:
      operationId: receiveResponseContentPartAdded
      summary: Receive response.content_part.added.
      message:
        $ref: '#/components/messages/ResponseContentPartAdded'

  response.content_part.done:
    description: |
      Emitted when a content part on an output item is complete.
    subscribe:
      operationId: receiveResponseContentPartDone
      summary: Receive response.content_part.done.
      message:
        $ref: '#/components/messages/ResponseContentPartDone'

  response.text.delta:
    description: |
      Streaming text delta for a `text` content part on an assistant item.
    subscribe:
      operationId: receiveResponseTextDelta
      summary: Receive response.text.delta.
      message:
        $ref: '#/components/messages/ResponseTextDelta'

  response.text.done:
    description: |
      Emitted when a `text` content part is fully generated.
    subscribe:
      operationId: receiveResponseTextDone
      summary: Receive response.text.done.
      message:
        $ref: '#/components/messages/ResponseTextDone'

  response.audio_transcript.delta:
    description: |
      Streaming transcript delta for an `audio` content part on an
      assistant item.
    subscribe:
      operationId: receiveResponseAudioTranscriptDelta
      summary: Receive response.audio_transcript.delta.
      message:
        $ref: '#/components/messages/ResponseAudioTranscriptDelta'

  response.audio_transcript.done:
    description: |
      Emitted when the transcript for an `audio` content part is fully
      generated.
    subscribe:
      operationId: receiveResponseAudioTranscriptDone
      summary: Receive response.audio_transcript.done.
      message:
        $ref: '#/components/messages/ResponseAudioTranscriptDone'

  response.audio.delta:
    description: |
      Streaming base64-encoded audio delta for an `audio` content part on an
      assistant item.
    subscribe:
      operationId: receiveResponseAudioDelta
      summary: Receive response.audio.delta.
      message:
        $ref: '#/components/messages/ResponseAudioDelta'

  response.audio.done:
    description: |
      Emitted when an `audio` content part is fully generated. No final
      base64 payload is included; clients reassemble from the deltas.
    subscribe:
      operationId: receiveResponseAudioDone
      summary: Receive response.audio.done.
      message:
        $ref: '#/components/messages/ResponseAudioDone'

  response.function_call_arguments.delta:
    description: |
      Streaming delta for a tool/function call's `arguments` string.
    subscribe:
      operationId: receiveResponseFunctionCallArgumentsDelta
      summary: Receive response.function_call_arguments.delta.
      message:
        $ref: '#/components/messages/ResponseFunctionCallArgumentsDelta'

  response.function_call_arguments.done:
    description: |
      Emitted when the `arguments` string for a function call is complete.
    subscribe:
      operationId: receiveResponseFunctionCallArgumentsDone
      summary: Receive response.function_call_arguments.done.
      message:
        $ref: '#/components/messages/ResponseFunctionCallArgumentsDone'

  rate_limits.updated:
    description: |
      Emitted periodically with the current rate limit state for the
      connection (requests and tokens, remaining and reset_seconds).
    subscribe:
      operationId: receiveRateLimitsUpdated
      summary: Receive rate_limits.updated.
      message:
        $ref: '#/components/messages/RateLimitsUpdated'

components:

  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      bearerFormat: OpenAI API Key
      description: |
        Bearer token authentication using an OpenAI API key. The header
        `Authorization: Bearer <OPENAI_API_KEY>` must be sent on the
        WebSocket upgrade request.
    openAiBeta:
      type: apiKey
      in: header
      name: OpenAI-Beta
      description: |
        Beta opt-in header. Must be set to `realtime=v1` for the Realtime
        API while it remains in beta.

  messages:

    # ----- Client -> Server -------------------------------------------------

    SessionUpdate:
      name: SessionUpdate
      title: session.update
      summary: Update session configuration.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/SessionUpdateEvent'

    InputAudioBufferAppend:
      name: InputAudioBufferAppend
      title: input_audio_buffer.append
      summary: Append audio bytes to the input buffer.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/InputAudioBufferAppendEvent'

    InputAudioBufferCommit:
      name: InputAudioBufferCommit
      title: input_audio_buffer.commit
      summary: Commit the input audio buffer.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/InputAudioBufferCommitEvent'

    InputAudioBufferClear:
      name: InputAudioBufferClear
      title: input_audio_buffer.clear
      summary: Clear the input audio buffer.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/InputAudioBufferClearEvent'

    ConversationItemCreate:
      name: ConversationItemCreate
      title: conversation.item.create
      summary: Insert a conversation item.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ConversationItemCreateEvent'

    ConversationItemTruncate:
      name: ConversationItemTruncate
      title: conversation.item.truncate
      summary: Truncate an assistant item's audio.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ConversationItemTruncateEvent'

    ConversationItemDelete:
      name: ConversationItemDelete
      title: conversation.item.delete
      summary: Delete a conversation item.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ConversationItemDeleteEvent'

    ResponseCreate:
      name: ResponseCreate
      title: response.create
      summary: Trigger a model response.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseCreateEvent'

    ResponseCancel:
      name: ResponseCancel
      title: response.cancel
      summary: Cancel an in-progress response.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseCancelEvent'

    # ----- Server -> Client -------------------------------------------------

    Error:
      name: Error
      title: error
      summary: Server error.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ErrorEvent'

    SessionCreated:
      name: SessionCreated
      title: session.created
      summary: Session has been created.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/SessionCreatedEvent'

    SessionUpdated:
      name: SessionUpdated
      title: session.updated
      summary: Session configuration updated.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/SessionUpdatedEvent'

    ConversationCreated:
      name: ConversationCreated
      title: conversation.created
      summary: Conversation created.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ConversationCreatedEvent'

    ConversationItemCreated:
      name: ConversationItemCreated
      title: conversation.item.created
      summary: Conversation item created.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ConversationItemCreatedEvent'

    InputAudioTranscriptionCompleted:
      name: InputAudioTranscriptionCompleted
      title: conversation.item.input_audio_transcription.completed
      summary: Input audio transcription completed.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/InputAudioTranscriptionCompletedEvent'

    InputAudioTranscriptionFailed:
      name: InputAudioTranscriptionFailed
      title: conversation.item.input_audio_transcription.failed
      summary: Input audio transcription failed.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/InputAudioTranscriptionFailedEvent'

    ConversationItemTruncated:
      name: ConversationItemTruncated
      title: conversation.item.truncated
      summary: Conversation item truncated.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ConversationItemTruncatedEvent'

    ConversationItemDeleted:
      name: ConversationItemDeleted
      title: conversation.item.deleted
      summary: Conversation item deleted.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ConversationItemDeletedEvent'

    InputAudioBufferCommitted:
      name: InputAudioBufferCommitted
      title: input_audio_buffer.committed
      summary: Input audio buffer committed.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/InputAudioBufferCommittedEvent'

    InputAudioBufferCleared:
      name: InputAudioBufferCleared
      title: input_audio_buffer.cleared
      summary: Input audio buffer cleared.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/InputAudioBufferClearedEvent'

    InputAudioBufferSpeechStarted:
      name: InputAudioBufferSpeechStarted
      title: input_audio_buffer.speech_started
      summary: VAD speech started.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/InputAudioBufferSpeechStartedEvent'

    InputAudioBufferSpeechStopped:
      name: InputAudioBufferSpeechStopped
      title: input_audio_buffer.speech_stopped
      summary: VAD speech stopped.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/InputAudioBufferSpeechStoppedEvent'

    ResponseCreated:
      name: ResponseCreated
      title: response.created
      summary: Response generation started.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseCreatedEvent'

    ResponseDone:
      name: ResponseDone
      title: response.done
      summary: Response generation finished.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseDoneEvent'

    ResponseOutputItemAdded:
      name: ResponseOutputItemAdded
      title: response.output_item.added
      summary: New output item added to response.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseOutputItemAddedEvent'

    ResponseOutputItemDone:
      name: ResponseOutputItemDone
      title: response.output_item.done
      summary: Output item on response complete.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseOutputItemDoneEvent'

    ResponseContentPartAdded:
      name: ResponseContentPartAdded
      title: response.content_part.added
      summary: Content part added to output item.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseContentPartAddedEvent'

    ResponseContentPartDone:
      name: ResponseContentPartDone
      title: response.content_part.done
      summary: Content part on output item complete.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseContentPartDoneEvent'

    ResponseTextDelta:
      name: ResponseTextDelta
      title: response.text.delta
      summary: Text delta for assistant message.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseTextDeltaEvent'

    ResponseTextDone:
      name: ResponseTextDone
      title: response.text.done
      summary: Text content part complete.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseTextDoneEvent'

    ResponseAudioTranscriptDelta:
      name: ResponseAudioTranscriptDelta
      title: response.audio_transcript.delta
      summary: Transcript delta for audio content part.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseAudioTranscriptDeltaEvent'

    ResponseAudioTranscriptDone:
      name: ResponseAudioTranscriptDone
      title: response.audio_transcript.done
      summary: Transcript for audio content part complete.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseAudioTranscriptDoneEvent'

    ResponseAudioDelta:
      name: ResponseAudioDelta
      title: response.audio.delta
      summary: Base64 audio delta for audio content part.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseAudioDeltaEvent'

    ResponseAudioDone:
      name: ResponseAudioDone
      title: response.audio.done
      summary: Audio content part complete.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseAudioDoneEvent'

    ResponseFunctionCallArgumentsDelta:
      name: ResponseFunctionCallArgumentsDelta
      title: response.function_call_arguments.delta
      summary: Function-call arguments delta.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseFunctionCallArgumentsDeltaEvent'

    ResponseFunctionCallArgumentsDone:
      name: ResponseFunctionCallArgumentsDone
      title: response.function_call_arguments.done
      summary: Function-call arguments complete.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/ResponseFunctionCallArgumentsDoneEvent'

    RateLimitsUpdated:
      name: RateLimitsUpdated
      title: rate_limits.updated
      summary: Current rate limit state.
      contentType: application/json
      payload:
        $ref: '#/components/schemas/RateLimitsUpdatedEvent'

  schemas:

    # ----- Shared resources ------------------------------------------------

    AudioFormat:
      type: string
      enum:
        - pcm16
        - g711_ulaw
        - g711_alaw
      description: Supported realtime audio codecs.

    Voice:
      type: string
      enum:
        - alloy
        - ash
        - ballad
        - coral
        - echo
        - sage
        - shimmer
        - verse
      description: Realtime model voice.

    Modality:
      type: string
      enum:
        - text
        - audio

    TurnDetection:
      type: object
      description: Server-side voice activity detection config. Set to `null` to disable.
      nullable: true
      properties:
        type:
          type: string
          enum:
            - server_vad
        threshold:
          type: number
          minimum: 0
          maximum: 1
          description: Activation threshold (default 0.5).
        prefix_padding_ms:
          type: integer
          description: Audio (ms) before speech start to include (default 300).
        silence_duration_ms:
          type: integer
          description: Silence (ms) before a turn is considered ended (default 200).
      required:
        - type

    InputAudioTranscription:
      type: object
      description: |
        Input audio transcription config. Set to `null` to disable. When
        enabled, the server emits `conversation.item.input_audio_transcription.completed`
        for each user audio item.
      nullable: true
      properties:
        model:
          type: string
          enum:
            - whisper-1
      required:
        - model

    ToolDefinition:
      type: object
      properties:
        type:
          type: string
          enum:
            - function
        name:
          type: string
        description:
          type: string
        parameters:
          type: object
          description: JSON Schema describing the tool's parameters.
      required:
        - name
        - parameters

    ToolChoice:
      oneOf:
        - type: string
          enum:
            - auto
            - none
            - required
        - type: object
          properties:
            type:
              type: string
              enum:
                - function
            name:
              type: string
          required:
            - type
            - name

    MaxResponseOutputTokens:
      oneOf:
        - type: integer
          minimum: 1
        - type: string
          enum:
            - inf
      description: Max output tokens per response, or `inf` for unlimited.

    Session:
      type: object
      description: Server-side session configuration.
      properties:
        id:
          type: string
        object:
          type: string
          enum:
            - realtime.session
        model:
          type: string
        modalities:
          type: array
          items:
            $ref: '#/components/schemas/Modality'
        instructions:
          type: string
        voice:
          $ref: '#/components/schemas/Voice'
        input_audio_format:
          $ref: '#/components/schemas/AudioFormat'
        output_audio_format:
          $ref: '#/components/schemas/AudioFormat'
        input_audio_transcription:
          $ref: '#/components/schemas/InputAudioTranscription'
        turn_detection:
          $ref: '#/components/schemas/TurnDetection'
        tools:
          type: array
          items:
            $ref: '#/components/schemas/ToolDefinition'
        tool_choice:
          $ref: '#/components/schemas/ToolChoice'
        temperature:
          type: number
        max_response_output_tokens:
          $ref: '#/components/schemas/MaxResponseOutputTokens'

    SessionPatch:
      type: object
      description: |
        Subset of session fields that may be supplied on `session.update`.
        Only included properties are modified.
      properties:
        modalities:
          type: array
          items:
            $ref: '#/components/schemas/Modality'
        instructions:
          type: string
        voice:
          $ref: '#/components/schemas/Voice'
        input_audio_format:
          $ref: '#/components/schemas/AudioFormat'
        output_audio_format:
          $ref: '#/components/schemas/AudioFormat'
        input_audio_transcription:
          $ref: '#/components/schemas/InputAudioTranscription'
        turn_detection:
          $ref: '#/components/schemas/TurnDetection'
        tools:
          type: array
          items:
            $ref: '#/components/schemas/ToolDefinition'
        tool_choice:
          $ref: '#/components/schemas/ToolChoice'
        temperature:
          type: number
        max_response_output_tokens:
          $ref: '#/components/schemas/MaxResponseOutputTokens'

    Conversation:
      type: object
      properties:
        id:
          type: string
        object:
          type: string
          enum:
            - realtime.conversation

    ItemStatus:
      type: string
      enum:
        - in_progress
        - completed
        - incomplete

    InputTextContent:
      type: object
      properties:
        type:
          type: string
          enum:
            - input_text
        text:
          type: string
      required:
        - type
        - text

    InputAudioContent:
      type: object
      properties:
        type:
          type: string
          enum:
            - input_audio
        audio:
          type: string
          description: Base64-encoded audio bytes in the session's `input_audio_format`.
        transcript:
          type: string
          nullable: true
      required:
        - type

    TextContent:
      type: object
      properties:
        type:
          type: string
          enum:
            - text
        text:
          type: string
      required:
        - type
        - text

    AudioContent:
      type: object
      properties:
        type:
          type: string
          enum:
            - audio
        audio:
          type: string
          description: Base64-encoded audio bytes in the session's `output_audio_format`.
        transcript:
          type: string
          nullable: true
      required:
        - type

    ContentPart:


# --- truncated at 32 KB (53 KB total) ---
# Full source: https://raw.githubusercontent.com/api-evangelist/openai/refs/heads/main/asyncapi/openai-realtime-asyncapi.yml