DeepL · AsyncAPI Specification
DeepL Voice API - WebSocket Streaming
Version 1.0.0
WebSocket streaming API for real-time voice transcription and translation. After obtaining a streaming URL and token via the REST API (POST /v3/voice/realtime), establish a WebSocket connection to stream audio data and receive real-time transcriptions and translations. This is a local mirror modelling the publicly documented surface; the authoritative AsyncAPI document is published by DeepL at https://developers.deepl.com/api-reference/voice.asyncapi.yaml.
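The hand-off from the REST call to the WebSocket connection can be sketched as below. This is a minimal illustration, not DeepL's client code: it assumes the REST response supplies a `wss://` streaming URL and a token, and that the token is passed as a `token` query parameter. The helper name `build_ws_url`, the parameter name, and the example host are all hypothetical; check the authoritative AsyncAPI document for the real connection details.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_ws_url(streaming_url: str, token: str) -> str:
    """Attach the session token to the streaming URL as a query parameter.

    Assumption: the token travels as a `token` query parameter.
    """
    parts = urlsplit(streaming_url)
    if parts.scheme != "wss":
        raise ValueError(f"expected a wss:// URL, got {parts.scheme!r}")
    # Preserve any query parameters already present on the streaming URL.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("token", token))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical values standing in for the REST response.
print(build_ws_url("wss://voice.example.invalid/stream", "abc123"))
```

The resulting URL would then be handed to whichever WebSocket client library the application already uses.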
Artificial Intelligence, Deep Learning, Glossaries, Localization, Machine Learning, Machine Translation, Translation, AsyncAPI, Webhooks, Events
Channels
voiceStream
Bidirectional channel for streaming source audio chunks to DeepL and receiving incremental source-language transcriptions, translated transcriptions, and (closed beta) synthesized translated audio.
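Streaming source audio in chunks, as the channel description puts it, might look like the following on the client side. This is only a sketch: the audio format (16 kHz, 16-bit mono PCM) and the 100 ms chunk duration are assumptions chosen for illustration, not values taken from the specification, and `iter_audio_chunks` is a hypothetical helper.

```python
from typing import Iterator


def iter_audio_chunks(pcm: bytes, sample_rate: int = 16000,
                      bytes_per_sample: int = 2,
                      chunk_ms: int = 100) -> Iterator[bytes]:
    """Yield consecutive slices of mono PCM audio, each about chunk_ms long.

    The defaults (16 kHz, 16-bit, 100 ms) are illustrative assumptions.
    """
    chunk_size = sample_rate * bytes_per_sample * chunk_ms // 1000
    if chunk_size <= 0:
        raise ValueError("chunk size must be positive")
    for start in range(0, len(pcm), chunk_size):
        # The final slice may be shorter than chunk_size.
        yield pcm[start:start + chunk_size]


one_second = bytes(16000 * 2)  # one second of silence
chunks = list(iter_audio_chunks(one_second))
print(len(chunks), len(chunks[0]))  # 10 3200
```

Small chunks keep latency low because the server can start transcribing before the speaker has finished; the trade-off is more messages per second.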
Messages
SourceMediaChunk
Chunk of audio data from the client.
EndOfSourceMedia
Client signals it has finished sending source audio.
SourceTranscriptUpdate
Incremental source-language transcription (concluded + tentative segments).
TargetTranscriptUpdate
Incremental target-language translation (concluded + tentative segments).
TargetMediaChunk
Synthesized translated audio (closed beta).
EndOfSourceTranscript
EndOfTargetTranscript
EndOfTargetMedia
EndOfStream
Server indicates all processing is complete; safe to close the connection.
ErrorMessage
Processing error reported by the server.
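The message flow above can be illustrated with a small client-side sketch. The wire format here is an assumption: each message is modelled as a JSON object keyed by a snake_case message name, audio is base64-encoded in a `data` field, and transcript updates carry `concluded` and `tentative` lists of segments with a `text` field. The sketch also assumes concluded segments are final and accumulate, while tentative segments are replaced by each update. None of these field names or semantics are confirmed by this page; `StreamState` and the helper functions are hypothetical.

```python
import base64
import json


def source_media_chunk(audio: bytes) -> str:
    # Assumption: JSON envelope keyed by message name, audio base64-encoded.
    data = base64.b64encode(audio).decode("ascii")
    return json.dumps({"source_media_chunk": {"data": data}})


def end_of_source_media() -> str:
    # Sent once the client has no more audio to stream.
    return json.dumps({"end_of_source_media": {}})


class StreamState:
    """Tracks transcripts, errors, and completion for one stream."""

    def __init__(self) -> None:
        self.concluded = {"source": [], "target": []}
        self.tentative = {"source": [], "target": []}
        self.errors = []
        self.done = False

    def handle(self, raw: str) -> None:
        for name, body in json.loads(raw).items():
            if name in ("source_transcript_update", "target_transcript_update"):
                side = name.split("_", 1)[0]  # "source" or "target"
                # Concluded text accumulates; tentative text is replaced.
                self.concluded[side] += [s["text"] for s in body.get("concluded", [])]
                self.tentative[side] = [s["text"] for s in body.get("tentative", [])]
            elif name == "error":
                self.errors.append(body)
            elif name == "end_of_stream":
                self.done = True  # safe to close the connection

    def text(self, side: str) -> str:
        return " ".join(self.concluded[side] + self.tentative[side])


state = StreamState()
state.handle(json.dumps({"source_transcript_update": {
    "concluded": [{"text": "Hello"}], "tentative": [{"text": "wor"}]}}))
state.handle(json.dumps({"source_transcript_update": {
    "concluded": [], "tentative": [{"text": "world"}]}}))
state.handle(json.dumps({"end_of_stream": {}}))
print(state.text("source"), state.done)  # Hello world True
```

Keeping tentative text separate from concluded text lets a UI render the unstable tail differently and avoids duplicating words when the server revises its guess.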
Servers
production (protocol: wss)
DeepL Voice API WebSocket endpoint.