
Real-time Audio Streaming

Stream audio over WebSocket and receive live transcripts in English and African languages. Ideal for live events, calls, virtual meetings, and interactive voice applications.

How streaming works

The OrbitalsAI SDKs expose dedicated streaming modules for WebSocket-based real-time transcription:

  • Python: orbitalsai.streaming — async/sync clients, event handlers, audio conversion helpers.
  • JavaScript/TypeScript: StreamingClient in the orbitalsai package — event callbacks, browser and Node.js support.

For batch transcription, see Python SDK or JavaScript SDK.

Installation

Python

Install the core SDK, then add optional extras depending on how you plan to stream:

Basic streaming (raw PCM):

bash
pip install orbitalsai

With audio conversion & microphone helpers:

bash
# Audio conversion utilities (files)
pip install "orbitalsai[audio]"

# All streaming + audio extras
pip install "orbitalsai[all]"
Dependencies
The streaming module uses WebSockets and NumPy under the hood. Extras like orbitalsai[audio] and orbitalsai[all] pull in optional packages such as sounddevice, soundfile, and librosa.

JavaScript / TypeScript

bash
npm install orbitalsai
Runtime support
The JavaScript SDK supports Node.js 16+ and modern browsers. Streaming uses the native WebSocket API.

Async streaming from raw PCM (recommended)

Use AsyncStreamingClient for non-blocking streaming of raw PCM16 mono audio. This is the lowest-latency way to integrate real-time transcription.

streaming_async_pcm.py
python
import asyncio

from orbitalsai.streaming import (
    AsyncStreamingClient,
    StreamingConfig,
    PrintingEventHandlers,
)


async def main():
    # Configure streaming session
    config = StreamingConfig(
        language="english",
        sample_rate=16000,
        chunk_size=8000,       # 500ms at 16kHz
        interim_results=True,  # Get partial transcripts
    )

    async with AsyncStreamingClient(api_key="your_api_key_here", config=config) as client:
        await client.connect(PrintingEventHandlers())

        # Stream raw PCM16 mono little-endian data
        with open("audio.pcm", "rb") as f:
            while chunk := f.read(config.bytes_per_chunk):
                await client.send_audio(chunk)
                # Optional: pace to real time
                await asyncio.sleep(config.chunk_duration_ms / 1000.0)

        # Ask server to flush remaining audio and send final transcripts
        await client.flush()


if __name__ == "__main__":
    asyncio.run(main())
PCM requirements
The streaming API expects PCM16 mono little-endian audio. If your source audio is MP3/WAV/etc., use AudioConverter as shown below.
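If your source is already a 16-bit mono WAV, the raw PCM16 little-endian bytes can be pulled out with nothing but the standard library. A minimal sketch (the wav_to_pcm16 helper below is illustrative, not part of the SDK; any other format or sample rate still needs AudioConverter or the [audio] extras):

```python
# Sketch (standard library only): extract raw PCM16 mono little-endian
# frames from a WAV file. Assumes the WAV is already 16-bit mono at the
# target sample rate.
import io
import struct
import wave

def wav_to_pcm16(wav_file) -> bytes:
    """Return the raw PCM16 frames of a 16-bit mono WAV."""
    with wave.open(wav_file, "rb") as wf:
        assert wf.getnchannels() == 1, "expected mono audio"
        assert wf.getsampwidth() == 2, "expected 16-bit samples"
        return wf.readframes(wf.getnframes())

# Demo: build a tiny in-memory 16 kHz mono WAV, then read its PCM back.
samples = [0, 1000, -1000, 32767, -32768] * 32
pcm = struct.pack("<%dh" % len(samples), *samples)
buf = io.BytesIO()
with wave.open(buf, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(16000)
    wf.writeframes(pcm)
buf.seek(0)
print(wav_to_pcm16(buf) == pcm)  # True
```

The bytes returned by wav_to_pcm16 can be chunked and passed straight to client.send_audio.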

Stream from audio files (MP3, WAV, M4A, FLAC, OGG)

Use AudioConverter to convert common audio formats into PCM16 and split them into appropriately sized chunks before streaming.

streaming_file_async.py
python
import asyncio
import os

from orbitalsai.streaming import (
    AsyncStreamingClient,
    StreamingConfig,
    PrintingEventHandlers,
    AudioConverter,
)

API_KEY = os.getenv("ORBITALSAI_API_KEY", "your_api_key_here")


async def stream_file(file_path: str, language: str = "english") -> None:
    config = StreamingConfig(
        language=language,
        sample_rate=16000,
        chunk_size=8000,
        interim_results=True,
    )

    # Load and convert audio file to PCM16 at 16kHz
    audio_bytes, _ = AudioConverter.from_file(
        file_path,
        target_sample_rate=16000,
    )

    # Split into 500ms chunks
    chunks = AudioConverter.split_chunks(audio_bytes, chunk_size=config.chunk_size)

    async with AsyncStreamingClient(api_key=API_KEY, config=config) as client:
        await client.connect(PrintingEventHandlers())
        for chunk in chunks:
            await client.send_audio(chunk)
            await asyncio.sleep(config.chunk_duration_ms / 1000.0)
        await client.flush()


if __name__ == "__main__":
    asyncio.run(stream_file("speech.mp3", language="english"))

Stream from microphone

Combine AsyncStreamingClient with sounddevice to stream live audio from a microphone. This mirrors the streaming_microphone.py example from the SDK.

streaming_microphone.py
python
import asyncio
import os
import sys

import numpy as np
import sounddevice as sd

from orbitalsai.streaming import (
    AsyncStreamingClient,
    StreamingConfig,
    PrintingEventHandlers,
)

API_KEY = os.getenv("ORBITALSAI_API_KEY", "your_api_key_here")


async def stream_microphone(
    language: str = "english",
    duration: int = 30,
    device: int | None = None,
) -> None:
    sample_rate = 16000
    blocksize = 8000  # 500ms chunks

    config = StreamingConfig(
        language=language,
        sample_rate=sample_rate,
        chunk_size=blocksize,
        interim_results=True,
    )

    audio_queue: asyncio.Queue[bytes] = asyncio.Queue()

    # Called in a separate thread by sounddevice
    def audio_callback(indata, frames, time_info, status) -> None:
        if status:
            print(f"Audio status: {status}", file=sys.stderr)
        audio_queue.put_nowait(indata.tobytes())

    async with AsyncStreamingClient(api_key=API_KEY, config=config) as client:
        await client.connect(PrintingEventHandlers(show_partials=True))

        with sd.InputStream(
            samplerate=sample_rate,
            channels=1,
            dtype="int16",
            blocksize=blocksize,
            device=device,
            callback=audio_callback,
        ):
            end_time = asyncio.get_event_loop().time() + duration
            while asyncio.get_event_loop().time() < end_time:
                try:
                    audio_data = await asyncio.wait_for(audio_queue.get(), timeout=1.0)
                except asyncio.TimeoutError:
                    continue
                await client.send_audio(audio_data)

        # Flush remaining audio and wait for final transcripts
        await client.flush()
        await asyncio.sleep(2.0)


if __name__ == "__main__":
    asyncio.run(stream_microphone(duration=30))
Local environment only
Microphone streaming requires local access to audio devices and is typically run from a desktop or server environment where sounddevice is available.

Streaming configuration

StreamingConfig controls audio, language, and connection behavior for each streaming session.

streaming_config.py
python
from orbitalsai.streaming import StreamingConfig

config = StreamingConfig(
    # Audio
    sample_rate=16000,        # 8000–48000 Hz (16kHz recommended)
    chunk_size=8000,          # Samples per chunk (~500ms at 16kHz)
    encoding="pcm_s16le",     # PCM16 mono little-endian
    # Language
    language="english",       # english, hausa, igbo, yoruba
    auto_detect_language=False,
    # Connection & retries
    max_retries=5,
    retry_delay=1.0,          # Initial retry delay (seconds)
    max_retry_delay=60.0,
    keepalive_interval=15.0,  # WebSocket ping interval
    connection_timeout=30.0,  # Connection timeout (seconds)
    # Processing
    interim_results=True,     # Receive partial transcripts
    auto_flush=True,          # Auto-flush on silence
)
Field | Type | Description
sample_rate | int | Audio sample rate in Hz (between 8000 and 48000).
chunk_size | int | Number of samples per chunk (at 16kHz, 8000 ≈ 500ms).
language | str | Target transcription language (english, hausa, igbo, yoruba).
max_retries | int | Maximum number of reconnection attempts after an unexpected disconnect.
retry_delay | float | Initial delay (in seconds) between reconnection attempts (exponential backoff).
connection_timeout | float | Timeout (in seconds) for establishing the WebSocket connection.
interim_results | bool | Whether to receive partial (non-final) transcripts as the user speaks.
auto_flush | bool | Whether the server should automatically flush on silence.
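The timing relationships above follow from simple arithmetic. A quick sketch (these helper names are illustrative; the SDK exposes equivalent derived values such as config.chunk_duration_ms and config.bytes_per_chunk):

```python
# Illustrative arithmetic behind chunk_size and sample_rate.
def chunk_duration_ms(chunk_size: int, sample_rate: int) -> float:
    """How much audio time one chunk covers, in milliseconds."""
    return chunk_size / sample_rate * 1000.0

def bytes_per_chunk(chunk_size: int) -> int:
    """PCM16 mono uses 2 bytes per sample."""
    return chunk_size * 2

print(chunk_duration_ms(8000, 16000))  # 500.0 (ms)
print(bytes_per_chunk(8000))           # 16000 (bytes)
```

This is why the examples sleep for chunk_duration_ms / 1000 seconds between sends: it paces the stream at real time.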

Error handling for streaming

The streaming module raises rich exception types and exposes them to your handlers so you can respond to authentication issues, connection drops, or exhausted credits.

streaming_errors.py
python
import asyncio

from orbitalsai.streaming import (
    AsyncStreamingClient,
    StreamingEventHandlers,
)
from orbitalsai.streaming.exceptions import (
    ConnectionError,
    AuthenticationError,
    InsufficientCreditsError,
    ReconnectionFailedError,
    SessionClosedError,
)


class MyHandlers(StreamingEventHandlers):
    def on_error(self, error: Exception) -> None:
        if isinstance(error, AuthenticationError):
            print("Invalid API key – check your ORBITALSAI_API_KEY.")
        elif isinstance(error, InsufficientCreditsError):
            print("Credits exhausted – please top up your account.")
        elif isinstance(error, ReconnectionFailedError):
            print(f"Failed to reconnect after {error.attempts} attempts.")
        else:
            print(f"Streaming error: {error}")


async def main() -> None:
    try:
        async with AsyncStreamingClient(api_key="your_api_key_here") as client:
            await client.connect(MyHandlers())
            # ... send audio with client.send_audio(...)
            await client.flush()
    except ConnectionError as e:
        print(f"Connection failed: {e}")
    except SessionClosedError:
        print("Session was closed; stop sending audio.")


if __name__ == "__main__":
    asyncio.run(main())
More details
See the Python SDK README for a complete list of streaming exception classes and examples.
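For context on ReconnectionFailedError: the retry_delay, max_retry_delay, and max_retries settings describe an exponential-backoff reconnection schedule. A rough reconstruction of such a schedule (illustrative only, not the SDK's internal code):

```python
# Sketch of an exponential-backoff schedule: the delay doubles on each
# attempt and is capped at max_retry_delay.
def backoff_schedule(retry_delay: float, max_retry_delay: float, max_retries: int) -> list[float]:
    """Delay (seconds) before each reconnection attempt."""
    return [min(retry_delay * (2 ** attempt), max_retry_delay) for attempt in range(max_retries)]

# With the defaults shown earlier (retry_delay=1.0, max_retry_delay=60.0, max_retries=5):
print(backoff_schedule(1.0, 60.0, 5))  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

Once every attempt in the schedule has failed, the client surfaces ReconnectionFailedError to your on_error handler.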

JavaScript / TypeScript Streaming

The OrbitalsAI npm package provides StreamingClient for real-time transcription in Node.js and browsers. It connects over WebSocket and sends raw PCM16 mono audio.

Async streaming from raw PCM

Use StreamingClient to stream PCM16 mono little-endian audio and receive partial and final transcripts via event callbacks.

streaming_async.js
javascript
import { StreamingClient } from "orbitalsai";

const client = new StreamingClient({
  apiKey: "your_api_key_here",
  language: "english",
  sampleRate: 16000,
});

// Handle incoming transcripts
client.on("partial", (data) => {
  console.log("Partial:", data.text);
});

client.on("final", (data) => {
  console.log("Final:", data.text, "Cost:", data.cost);
});

client.on("error", (err) => {
  console.error("Streaming error:", err);
});

await client.connect();

// Stream raw PCM16 mono data (e.g., from a file or audio buffer)
const pcmChunk = /* Uint8Array or ArrayBuffer of PCM16 */;
await client.sendAudio(pcmChunk);

// Flush remaining audio and receive final transcripts
await client.flush();
await client.disconnect();
PCM requirements
The streaming API expects PCM16 mono little-endian audio at 16 kHz. Use the Web Audio API in the browser, or an audio conversion library in Node.js, to convert your audio if needed.

Stream from microphone (browser)

Capture microphone audio with the Web Audio API, convert to PCM16, and stream in real time. This example works in modern browsers.

streaming_microphone_browser.js
javascript
import { StreamingClient } from "orbitalsai";

async function streamFromMicrophone() {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const audioContext = new AudioContext({ sampleRate: 16000 });
  const source = audioContext.createMediaStreamSource(stream);
  const processor = audioContext.createScriptProcessor(4096, 1, 1);

  const client = new StreamingClient({
    apiKey: "your_api_key_here",
    language: "english",
    sampleRate: 16000,
  });

  client.on("partial", (data) => console.log("Partial:", data.text));
  client.on("final", (data) => console.log("Final:", data.text));

  await client.connect();

  processor.onaudioprocess = (e) => {
    // Convert Float32 samples in [-1, 1] to PCM16
    const float32 = e.inputBuffer.getChannelData(0);
    const pcm16 = new Int16Array(float32.length);
    for (let i = 0; i < float32.length; i++) {
      const s = Math.max(-1, Math.min(1, float32[i]));
      pcm16[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
    }
    client.sendAudio(pcm16.buffer);
  };

  source.connect(processor);
  processor.connect(audioContext.destination);

  // Stop after 30 seconds
  setTimeout(async () => {
    stream.getTracks().forEach((t) => t.stop());
    await client.flush();
    await client.disconnect();
  }, 30000);
}

streamFromMicrophone();
Browser permissions
Microphone access requires a secure context (HTTPS or localhost) and user permission.

StreamingClient options

Pass configuration when creating a StreamingClient:

streaming_config.js
javascript
const client = new StreamingClient({
  apiKey: "your_api_key_here",
  // Audio
  sampleRate: 16000, // 8000–48000 Hz (16kHz recommended)
  // Language
  language: "english", // english, hausa, igbo, yoruba
  // Connection
  baseUrl: "https://api.orbitalsai.com", // Optional, for custom endpoints
});
Option | Type | Description
apiKey | string | Your OrbitalsAI API key (required).
sampleRate | number | Audio sample rate in Hz (default: 16000).
language | string | Target language (english, hausa, igbo, yoruba).
baseUrl | string | API base URL for custom endpoints.

Error handling (JavaScript)

Listen for the error event and handle connection failures, authentication issues, or exhausted credits.

streaming_errors.js
javascript
import { StreamingClient, AuthenticationError } from "orbitalsai";

const client = new StreamingClient({
  apiKey: process.env.ORBITALS_API_KEY,
  language: "english",
});

client.on("error", (err) => {
  if (err instanceof AuthenticationError) {
    console.error("Invalid API key – check ORBITALS_API_KEY.");
  } else if (err.message?.includes("credits")) {
    console.error("Credits exhausted – please top up your account.");
  } else {
    console.error("Streaming error:", err);
  }
});

try {
  await client.connect();
  // ... send audio
  await client.flush();
} catch (err) {
  console.error("Connection failed:", err);
} finally {
  await client.disconnect();
}
npm package
See the orbitalsai npm package for full API reference, types, and changelog.

Streaming use cases

Call Center Transcription

Real-time transcription of customer service calls in African languages for quality assurance, training, and compliance.

Live Event Captioning

Provide live captions for conferences, webinars, and broadcasts in real-time, making content accessible to broader audiences.

Virtual Meeting Assistants

Create AI assistants that transcribe and summarize virtual meetings in real-time, capturing action items and key decisions.

Voice Assistants

Build voice-controlled applications with low-latency speech recognition for African language speakers.