Real-time Audio Streaming
Stream audio over WebSocket and receive live transcripts in English and African languages. Ideal for live events, calls, virtual meetings, and interactive voice applications.
How streaming works
The OrbitalsAI SDKs expose dedicated streaming modules for WebSocket-based real-time transcription:
- Python: orbitalsai.streaming provides async/sync clients, event handlers, and audio conversion helpers.
- JavaScript/TypeScript: orbitalsai provides StreamingClient, event callbacks, and browser and Node.js support.
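Both SDKs follow the same session lifecycle: open a WebSocket session, push PCM16 audio chunks, receive partial and final transcripts through event callbacks, then flush. A minimal sketch of that flow with the Python client (names taken from the full examples below):

```python
import asyncio

from orbitalsai.streaming import AsyncStreamingClient, StreamingConfig, PrintingEventHandlers


async def run() -> None:
    config = StreamingConfig(language="english", sample_rate=16000)
    async with AsyncStreamingClient(api_key="your_api_key_here", config=config) as client:
        await client.connect(PrintingEventHandlers())  # registers transcript callbacks
        await client.send_audio(b"\x00\x00" * 8000)    # one 500ms chunk of PCM16 silence
        await client.flush()                           # request remaining final transcripts


asyncio.run(run())
```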
For batch transcription, see Python SDK or JavaScript SDK.
Installation
Python
Install the core SDK, then add optional extras depending on how you plan to stream:
Basic streaming (raw PCM):
```bash
pip install orbitalsai
```

With audio conversion & microphone helpers:

```bash
# Audio conversion utilities (files)
pip install "orbitalsai[audio]"

# All streaming + audio extras
pip install "orbitalsai[all]"
```

Dependencies

orbitalsai[audio] and orbitalsai[all] pull in optional packages such as sounddevice, soundfile, and librosa.

JavaScript / TypeScript
```bash
npm install orbitalsai
```

Runtime support

The npm package works in both Node.js and modern browsers.
Async streaming from raw PCM (recommended)
Use AsyncStreamingClient for non-blocking streaming of raw PCM16 mono audio. This is the lowest-latency way to integrate real-time transcription.
```python
import asyncio

from orbitalsai.streaming import (
    AsyncStreamingClient,
    StreamingConfig,
    PrintingEventHandlers,
)


async def main():
    # Configure streaming session
    config = StreamingConfig(
        language="english",
        sample_rate=16000,
        chunk_size=8000,       # 500ms at 16kHz
        interim_results=True,  # Get partial transcripts
    )

    async with AsyncStreamingClient(api_key="your_api_key_here", config=config) as client:
        await client.connect(PrintingEventHandlers())

        # Stream raw PCM16 mono little-endian data
        with open("audio.pcm", "rb") as f:
            while chunk := f.read(config.bytes_per_chunk):
                await client.send_audio(chunk)
                # Optional: pace to real time
                await asyncio.sleep(config.chunk_duration_ms / 1000.0)

        # Ask server to flush remaining audio and send final transcripts
        await client.flush()


if __name__ == "__main__":
    asyncio.run(main())
```

PCM requirements

Audio must be raw PCM16 mono little-endian at the sample rate you configure. To stream other formats, convert them first with AudioConverter as shown below.
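If you generate audio samples yourself rather than reading them from a file, the required format is easy to produce directly. A minimal sketch using plain numpy (independent of the SDK):

```python
import numpy as np

# 1 second of a 440 Hz sine tone as float samples in [-1.0, 1.0]
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
float_samples = 0.5 * np.sin(2 * np.pi * 440 * t)

# Clip to [-1, 1], scale to the int16 range, and emit little-endian bytes
pcm16_bytes = (np.clip(float_samples, -1.0, 1.0) * 32767).astype("<i2").tobytes()
```

Each 8000-sample slice of pcm16_bytes (16,000 bytes) is one 500ms chunk at 16kHz.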
Stream from audio files (MP3, WAV, M4A, FLAC, OGG)

Use AudioConverter to convert common audio formats into PCM16 and split them into appropriately sized chunks before streaming.
```python
import asyncio
import os

from orbitalsai.streaming import (
    AsyncStreamingClient,
    StreamingConfig,
    PrintingEventHandlers,
    AudioConverter,
)

API_KEY = os.getenv("ORBITALSAI_API_KEY", "your_api_key_here")


async def stream_file(file_path: str, language: str = "english") -> None:
    config = StreamingConfig(
        language=language,
        sample_rate=16000,
        chunk_size=8000,
        interim_results=True,
    )

    # Load and convert audio file to PCM16 at 16kHz
    audio_bytes, _ = AudioConverter.from_file(
        file_path,
        target_sample_rate=16000,
    )

    # Split into 500ms chunks
    chunks = AudioConverter.split_chunks(audio_bytes, chunk_size=config.chunk_size)

    async with AsyncStreamingClient(api_key=API_KEY, config=config) as client:
        await client.connect(PrintingEventHandlers())

        for chunk in chunks:
            await client.send_audio(chunk)
            await asyncio.sleep(config.chunk_duration_ms / 1000.0)

        await client.flush()


if __name__ == "__main__":
    asyncio.run(stream_file("speech.mp3", language="english"))
```

Stream from microphone
Combine AsyncStreamingClient with sounddevice to stream live audio from a microphone. This mirrors the streaming_microphone.py example from the SDK.
```python
import asyncio
import os
import sys

import sounddevice as sd

from orbitalsai.streaming import (
    AsyncStreamingClient,
    StreamingConfig,
    PrintingEventHandlers,
)

API_KEY = os.getenv("ORBITALSAI_API_KEY", "your_api_key_here")


async def stream_microphone(
    language: str = "english",
    duration: int = 30,
    device: int | None = None,
) -> None:
    sample_rate = 16000
    blocksize = 8000  # 500ms chunks

    config = StreamingConfig(
        language=language,
        sample_rate=sample_rate,
        chunk_size=blocksize,
        interim_results=True,
    )

    loop = asyncio.get_running_loop()
    audio_queue: asyncio.Queue[bytes] = asyncio.Queue()

    # Called in a separate thread by sounddevice; asyncio queues are not
    # thread-safe, so hand the data to the event loop thread-safely.
    def audio_callback(indata, frames, time_info, status) -> None:
        if status:
            print(f"Audio status: {status}", file=sys.stderr)
        loop.call_soon_threadsafe(audio_queue.put_nowait, indata.tobytes())

    async with AsyncStreamingClient(api_key=API_KEY, config=config) as client:
        await client.connect(PrintingEventHandlers(show_partials=True))

        with sd.InputStream(
            samplerate=sample_rate,
            channels=1,
            dtype="int16",
            blocksize=blocksize,
            device=device,
            callback=audio_callback,
        ):
            end_time = loop.time() + duration
            while loop.time() < end_time:
                try:
                    audio_data = await asyncio.wait_for(audio_queue.get(), timeout=1.0)
                except asyncio.TimeoutError:
                    continue
                await client.send_audio(audio_data)

        # Flush remaining audio and wait for final transcripts
        await client.flush()
        await asyncio.sleep(2.0)


if __name__ == "__main__":
    asyncio.run(stream_microphone(duration=30))
```
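The device parameter above takes a sounddevice device index. To find the right index, list the available devices with sounddevice's standard query helpers (not part of the OrbitalsAI SDK):

```python
import sounddevice as sd

# Prints every audio device with its index; pass the index of an
# input device as the `device` argument of stream_microphone().
print(sd.query_devices())

# The (input, output) device pair currently used by default
print(sd.default.device)
```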
Local environment only

Microphone capture must run on a local machine with a working input device where sounddevice is available.

Streaming configuration
StreamingConfig controls audio, language, and connection behavior for each streaming session.
```python
from orbitalsai.streaming import StreamingConfig

config = StreamingConfig(
    # Audio
    sample_rate=16000,           # 8000–48000 Hz (16kHz recommended)
    chunk_size=8000,             # Samples per chunk (~500ms at 16kHz)
    encoding="pcm_s16le",        # PCM16 mono little-endian

    # Language
    language="english",          # english, hausa, igbo, yoruba
    auto_detect_language=False,

    # Connection & retries
    max_retries=5,
    retry_delay=1.0,             # Initial retry delay (seconds)
    max_retry_delay=60.0,
    keepalive_interval=15.0,     # WebSocket ping interval
    connection_timeout=30.0,     # Connection timeout (seconds)

    # Processing
    interim_results=True,        # Receive partial transcripts
    auto_flush=True,             # Auto-flush on silence
)
```

| Field | Type | Description |
|---|---|---|
| sample_rate | int | Audio sample rate in Hz (between 8000 and 48000). |
| chunk_size | int | Number of samples per chunk (at 16kHz, 8000 ≈ 500ms). |
| language | str | Target transcription language (english, hausa, igbo, yoruba). |
| max_retries | int | Maximum number of reconnection attempts after an unexpected disconnect. |
| retry_delay | float | Initial delay (in seconds) between reconnection attempts (exponential backoff). |
| connection_timeout | float | Timeout (in seconds) for establishing the WebSocket connection. |
| interim_results | bool | Whether to receive partial (non-final) transcripts as the user speaks. |
| auto_flush | bool | Whether the server should automatically flush on silence. |
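To make the timing and retry numbers concrete, here is a small worked example of the chunk math and the reconnection schedule implied by the defaults above. The backoff sketch assumes the delay simply doubles per attempt and is capped at max_retry_delay; the SDK's exact backoff curve may differ:

```python
# Chunk timing for PCM16 mono audio (2 bytes per sample)
sample_rate = 16000
chunk_size = 8000

chunk_duration_ms = chunk_size / sample_rate * 1000  # 8000 / 16000 * 1000 = 500.0 ms
bytes_per_chunk = chunk_size * 2                     # 16,000 bytes per chunk

# Reconnection delays, assuming the delay doubles per attempt (capped)
max_retries = 5
retry_delay = 1.0
max_retry_delay = 60.0

delays = [min(retry_delay * 2**attempt, max_retry_delay) for attempt in range(max_retries)]
print(delays)  # [1.0, 2.0, 4.0, 8.0, 16.0]
```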
Error handling for streaming
The streaming module raises rich exception types and exposes them to your handlers so you can respond to authentication issues, connection drops, or exhausted credits.
```python
import asyncio

from orbitalsai.streaming import (
    AsyncStreamingClient,
    StreamingEventHandlers,
)
from orbitalsai.streaming.exceptions import (
    ConnectionError,
    AuthenticationError,
    InsufficientCreditsError,
    ReconnectionFailedError,
    SessionClosedError,
)


class MyHandlers(StreamingEventHandlers):
    def on_error(self, error: Exception) -> None:
        if isinstance(error, AuthenticationError):
            print("Invalid API key – check your ORBITALSAI_API_KEY.")
        elif isinstance(error, InsufficientCreditsError):
            print("Credits exhausted – please top up your account.")
        elif isinstance(error, ReconnectionFailedError):
            print(f"Failed to reconnect after {error.attempts} attempts.")
        else:
            print(f"Streaming error: {error}")


async def main() -> None:
    try:
        async with AsyncStreamingClient(api_key="your_api_key_here") as client:
            await client.connect(MyHandlers())
            # ... send audio with client.send_audio(...)
            await client.flush()
    except ConnectionError as e:
        print(f"Connection failed: {e}")
    except SessionClosedError:
        print("Session was closed; stop sending audio.")


if __name__ == "__main__":
    asyncio.run(main())
```
JavaScript / TypeScript Streaming
The OrbitalsAI npm package provides StreamingClient for real-time transcription in Node.js and browsers. It connects over WebSocket and sends raw PCM16 mono audio.
Async streaming from raw PCM
Use StreamingClient to stream PCM16 mono little-endian audio and receive partial and final transcripts via event callbacks.
```typescript
import { StreamingClient } from "orbitalsai";

const client = new StreamingClient({
  apiKey: "your_api_key_here",
  language: "english",
  sampleRate: 16000,
});

// Handle incoming transcripts
client.on("partial", (data) => {
  console.log("Partial:", data.text);
});

client.on("final", (data) => {
  console.log("Final:", data.text, "Cost:", data.cost);
});

client.on("error", (err) => {
  console.error("Streaming error:", err);
});

await client.connect();

// Stream raw PCM16 mono data (e.g., from a file or audio buffer)
const pcmChunk = /* Uint8Array or ArrayBuffer of PCM16 */;
await client.sendAudio(pcmChunk);

// Flush remaining audio and receive final transcripts
await client.flush();
await client.disconnect();
```

PCM requirements

Audio must be raw PCM16 mono little-endian at the configured sample rate. Convert other formats to PCM16 first (for example with ffmpeg) if needed.

Stream from microphone (browser)
Capture microphone audio with the Web Audio API, convert it to PCM16, and stream it in real time. This example works in modern browsers. (It uses ScriptProcessorNode, which is deprecated but still widely supported; consider AudioWorklet for new production code.)
```typescript
import { StreamingClient } from "orbitalsai";

async function streamFromMicrophone() {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const audioContext = new AudioContext({ sampleRate: 16000 });
  const source = audioContext.createMediaStreamSource(stream);
  const processor = audioContext.createScriptProcessor(4096, 1, 1);

  const client = new StreamingClient({
    apiKey: "your_api_key_here",
    language: "english",
    sampleRate: 16000,
  });

  client.on("partial", (data) => console.log("Partial:", data.text));
  client.on("final", (data) => console.log("Final:", data.text));

  await client.connect();

  processor.onaudioprocess = (e) => {
    const float32 = e.inputBuffer.getChannelData(0);
    const pcm16 = new Int16Array(float32.length);
    for (let i = 0; i < float32.length; i++) {
      const s = Math.max(-1, Math.min(1, float32[i]));
      pcm16[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
    }
    client.sendAudio(pcm16.buffer);
  };

  source.connect(processor);
  processor.connect(audioContext.destination);

  // Stop after 30 seconds
  setTimeout(async () => {
    stream.getTracks().forEach((t) => t.stop());
    await client.flush();
    await client.disconnect();
  }, 30000);
}

streamFromMicrophone();
```

Browser permissions

The browser will prompt the user for microphone access. getUserMedia requires a secure context (HTTPS or localhost).
StreamingClient options
Pass configuration when creating a StreamingClient:
```typescript
const client = new StreamingClient({
  apiKey: "your_api_key_here",

  // Audio
  sampleRate: 16000, // 8000–48000 Hz (16kHz recommended)

  // Language
  language: "english", // english, hausa, igbo, yoruba

  // Connection
  baseUrl: "https://api.orbitalsai.com", // Optional, for custom endpoints
});
```

| Option | Type | Description |
|---|---|---|
| apiKey | string | Your OrbitalsAI API key (required). |
| sampleRate | number | Audio sample rate in Hz (default: 16000). |
| language | string | Target language (english, hausa, igbo, yoruba). |
| baseUrl | string | API base URL for custom endpoints. |
Error handling (JavaScript)
Listen for the error event and handle connection failures, authentication issues, or exhausted credits.
```typescript
import { StreamingClient, AuthenticationError } from "orbitalsai";

const client = new StreamingClient({
  apiKey: process.env.ORBITALS_API_KEY,
  language: "english",
});

client.on("error", (err) => {
  if (err instanceof AuthenticationError) {
    console.error("Invalid API key – check ORBITALS_API_KEY.");
  } else if (err.message?.includes("credits")) {
    console.error("Credits exhausted – please top up your account.");
  } else {
    console.error("Streaming error:", err);
  }
});

try {
  await client.connect();
  // ... send audio
  await client.flush();
} catch (err) {
  console.error("Connection failed:", err);
} finally {
  await client.disconnect();
}
```
Streaming use cases
Call Center Transcription
Real-time transcription of customer service calls in African languages for quality assurance, training, and compliance.
Live Event Captioning
Provide live captions for conferences, webinars, and broadcasts in real time, making content accessible to broader audiences.
Virtual Meeting Assistants
Create AI assistants that transcribe and summarize virtual meetings in real time, capturing action items and key decisions.
Voice Assistants
Build voice-controlled applications with low-latency speech recognition for African language speakers.