Real-time Audio Streaming
Stream audio over WebSocket and receive live transcripts in English and African languages. Ideal for live events, calls, virtual meetings, and interactive voice applications.
How streaming works
The OrbitalsAI SDKs expose dedicated streaming modules for WebSocket-based real-time transcription:
- Python: orbitalsai.streaming provides async/sync clients, event handlers, and audio conversion helpers.
- JavaScript/TypeScript: orbitalsai provides StreamingClient, event callbacks, and browser and Node.js support.
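Both SDKs follow the same session lifecycle: open a WebSocket session, push PCM16 audio chunks, receive partial and final transcripts through event callbacks, then flush. A minimal sketch of that flow with the Python client (names taken from the full examples below):

```python
import asyncio

from orbitalsai.streaming import AsyncStreamingClient, StreamingConfig, PrintingEventHandlers


async def run() -> None:
    config = StreamingConfig(language="english", sample_rate=16000)
    async with AsyncStreamingClient(api_key="your_api_key_here", config=config) as client:
        await client.connect(PrintingEventHandlers())  # registers transcript callbacks
        await client.send_audio(b"\x00\x00" * 8000)    # one 500ms chunk of PCM16 silence
        await client.flush()                           # request remaining final transcripts


asyncio.run(run())
```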
For batch transcription, see Python SDK or JavaScript SDK.
Installation
Python
Install the core SDK, then add optional extras depending on how you plan to stream:
Basic streaming (raw PCM):
```bash
pip install orbitalsai
```

With audio conversion & microphone helpers:

```bash
# Audio conversion utilities (files)
pip install "orbitalsai[audio]"

# All streaming + audio extras
pip install "orbitalsai[all]"
```

Dependencies

orbitalsai[audio] and orbitalsai[all] pull in optional packages such as sounddevice, soundfile, and librosa.

JavaScript / TypeScript
```bash
npm install orbitalsai
```

Runtime support

The npm package works in both Node.js and modern browsers.
Async streaming from raw PCM (recommended)
Use AsyncStreamingClient for non-blocking streaming of raw PCM16 mono audio. This is the lowest-latency way to integrate real-time transcription.
```python
import asyncio

from orbitalsai.streaming import (
    AsyncStreamingClient,
    StreamingConfig,
    PrintingEventHandlers,
)


async def main():
    # Configure streaming session
    config = StreamingConfig(
        language="english",
        sample_rate=16000,
        chunk_size=8000,       # 500ms at 16kHz
        interim_results=True,  # Get partial transcripts
    )

    async with AsyncStreamingClient(api_key="your_api_key_here", config=config) as client:
        await client.connect(PrintingEventHandlers())

        # Stream raw PCM16 mono little-endian data
        with open("audio.pcm", "rb") as f:
            while chunk := f.read(config.bytes_per_chunk):
                await client.send_audio(chunk)
                # Optional: pace to real time
                await asyncio.sleep(config.chunk_duration_ms / 1000.0)

        # Ask server to flush remaining audio and send final transcripts
        await client.flush()


if __name__ == "__main__":
    asyncio.run(main())
```

PCM requirements

Audio must be raw PCM16 mono little-endian at the sample rate you configure. To stream other formats, convert them first with AudioConverter as shown below.
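If you generate audio samples yourself rather than reading them from a file, the required format is easy to produce directly. A minimal sketch using plain numpy (independent of the SDK):

```python
import numpy as np

# 1 second of a 440 Hz sine tone as float samples in [-1.0, 1.0]
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
float_samples = 0.5 * np.sin(2 * np.pi * 440 * t)

# Clip to [-1, 1], scale to the int16 range, and emit little-endian bytes
pcm16_bytes = (np.clip(float_samples, -1.0, 1.0) * 32767).astype("<i2").tobytes()
```

Each 8000-sample slice of pcm16_bytes (16,000 bytes) is one 500ms chunk at 16kHz.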
Stream from audio files (MP3, WAV, M4A, FLAC, OGG)

Use AudioConverter to convert common audio formats into PCM16 and split them into appropriately sized chunks before streaming.
```python
import asyncio
import os

from orbitalsai.streaming import (
    AsyncStreamingClient,
    StreamingConfig,
    PrintingEventHandlers,
    AudioConverter,
)

API_KEY = os.getenv("ORBITALSAI_API_KEY", "your_api_key_here")


async def stream_file(file_path: str, language: str = "english") -> None:
    config = StreamingConfig(
        language=language,
        sample_rate=16000,
        chunk_size=8000,
        interim_results=True,
    )

    # Load and convert audio file to PCM16 at 16kHz
    audio_bytes, _ = AudioConverter.from_file(
        file_path,
        target_sample_rate=16000,
    )

    # Split into 500ms chunks
    chunks = AudioConverter.split_chunks(audio_bytes, chunk_size=config.chunk_size)

    async with AsyncStreamingClient(api_key=API_KEY, config=config) as client:
        await client.connect(PrintingEventHandlers())

        for chunk in chunks:
            await client.send_audio(chunk)
            await asyncio.sleep(config.chunk_duration_ms / 1000.0)

        await client.flush()


if __name__ == "__main__":
    asyncio.run(stream_file("speech.mp3", language="english"))
```

Stream from microphone
Combine AsyncStreamingClient with sounddevice to stream live audio from a microphone. This mirrors the streaming_microphone.py example from the SDK.
```python
import asyncio
import os
import sys

import sounddevice as sd

from orbitalsai.streaming import (
    AsyncStreamingClient,
    StreamingConfig,
    PrintingEventHandlers,
)

API_KEY = os.getenv("ORBITALSAI_API_KEY", "your_api_key_here")


async def stream_microphone(
    language: str = "english",
    duration: int = 30,
    device: int | None = None,
) -> None:
    sample_rate = 16000
    blocksize = 8000  # 500ms chunks

    config = StreamingConfig(
        language=language,
        sample_rate=sample_rate,
        chunk_size=blocksize,
        interim_results=True,
    )

    loop = asyncio.get_running_loop()
    audio_queue: asyncio.Queue[bytes] = asyncio.Queue()

    # Called in a separate thread by sounddevice; asyncio queues are not
    # thread-safe, so hand the data to the event loop thread-safely.
    def audio_callback(indata, frames, time_info, status) -> None:
        if status:
            print(f"Audio status: {status}", file=sys.stderr)
        loop.call_soon_threadsafe(audio_queue.put_nowait, indata.tobytes())

    async with AsyncStreamingClient(api_key=API_KEY, config=config) as client:
        await client.connect(PrintingEventHandlers(show_partials=True))

        with sd.InputStream(
            samplerate=sample_rate,
            channels=1,
            dtype="int16",
            blocksize=blocksize,
            device=device,
            callback=audio_callback,
        ):
            end_time = loop.time() + duration
            while loop.time() < end_time:
                try:
                    audio_data = await asyncio.wait_for(audio_queue.get(), timeout=1.0)
                except asyncio.TimeoutError:
                    continue
                await client.send_audio(audio_data)

        # Flush remaining audio and wait for final transcripts
        await client.flush()
        await asyncio.sleep(2.0)


if __name__ == "__main__":
    asyncio.run(stream_microphone(duration=30))
```
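The device parameter above takes a sounddevice device index. To find the right index, list the available devices with sounddevice's standard query helpers (not part of the OrbitalsAI SDK):

```python
import sounddevice as sd

# Prints every audio device with its index; pass the index of an
# input device as the `device` argument of stream_microphone().
print(sd.query_devices())

# The (input, output) device pair currently used by default
print(sd.default.device)
```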
Local environment only

Microphone capture must run on a local machine with a working input device where sounddevice is available.

Streaming configuration
StreamingConfig controls audio, language, and connection behavior for each streaming session.
```python
from orbitalsai.streaming import StreamingConfig

config = StreamingConfig(
    # Audio
    sample_rate=16000,           # 8000–48000 Hz (16kHz recommended)
    chunk_size=8000,             # Samples per chunk (~500ms at 16kHz)
    encoding="pcm_s16le",        # PCM16 mono little-endian

    # Language
    language="english",          # english, hausa, igbo, yoruba
    auto_detect_language=False,

    # Connection & retries
    max_retries=5,
    retry_delay=1.0,             # Initial retry delay (seconds)
    max_retry_delay=60.0,
    keepalive_interval=15.0,     # WebSocket ping interval
    connection_timeout=30.0,     # Connection timeout (seconds)

    # Processing
    interim_results=True,        # Receive partial transcripts
    auto_flush=True,             # Auto-flush on silence
)
```

| Field | Type | Description |
|---|---|---|
| sample_rate | int | Audio sample rate in Hz (between 8000 and 48000). |
| chunk_size | int | Number of samples per chunk (at 16kHz, 8000 ≈ 500ms). |
| language | str | Target transcription language (english, hausa, igbo, yoruba). |
| max_retries | int | Maximum number of reconnection attempts after an unexpected disconnect. |
| retry_delay | float | Initial delay (in seconds) between reconnection attempts (exponential backoff). |
| connection_timeout | float | Timeout (in seconds) for establishing the WebSocket connection. |
| interim_results | bool | Whether to receive partial (non-final) transcripts as the user speaks. |
| auto_flush | bool | Whether the server should automatically flush on silence. |
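To make the timing and retry numbers concrete, here is a small worked example of the chunk math and the reconnection schedule implied by the defaults above. The backoff sketch assumes the delay simply doubles per attempt and is capped at max_retry_delay; the SDK's exact backoff curve may differ:

```python
# Chunk timing for PCM16 mono audio (2 bytes per sample)
sample_rate = 16000
chunk_size = 8000

chunk_duration_ms = chunk_size / sample_rate * 1000  # 8000 / 16000 * 1000 = 500.0 ms
bytes_per_chunk = chunk_size * 2                     # 16,000 bytes per chunk

# Reconnection delays, assuming the delay doubles per attempt (capped)
max_retries = 5
retry_delay = 1.0
max_retry_delay = 60.0

delays = [min(retry_delay * 2**attempt, max_retry_delay) for attempt in range(max_retries)]
print(delays)  # [1.0, 2.0, 4.0, 8.0, 16.0]
```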
Error handling for streaming
The streaming module raises rich exception types and exposes them to your handlers so you can respond to authentication issues, connection drops, or exhausted credits.
```python
import asyncio

from orbitalsai.streaming import (
    AsyncStreamingClient,
    StreamingEventHandlers,
)
from orbitalsai.streaming.exceptions import (
    ConnectionError,
    AuthenticationError,
    InsufficientCreditsError,
    ReconnectionFailedError,
    SessionClosedError,
)


class MyHandlers(StreamingEventHandlers):
    def on_error(self, error: Exception) -> None:
        if isinstance(error, AuthenticationError):
            print("Invalid API key – check your ORBITALSAI_API_KEY.")
        elif isinstance(error, InsufficientCreditsError):
            print("Credits exhausted – please top up your account.")
        elif isinstance(error, ReconnectionFailedError):
            print(f"Failed to reconnect after {error.attempts} attempts.")
        else:
            print(f"Streaming error: {error}")


async def main() -> None:
    try:
        async with AsyncStreamingClient(api_key="your_api_key_here") as client:
            await client.connect(MyHandlers())
            # ... send audio with client.send_audio(...)
            await client.flush()
    except ConnectionError as e:
        print(f"Connection failed: {e}")
    except SessionClosedError:
        print("Session was closed; stop sending audio.")


if __name__ == "__main__":
    asyncio.run(main())
```
JavaScript / TypeScript Streaming
The OrbitalsAI npm package provides StreamingClient for real-time transcription in Node.js and browsers. It connects over WebSocket and sends raw PCM16 mono audio.
Async streaming from raw PCM
Use StreamingClient to stream PCM16 mono little-endian audio and receive partial and final transcripts via event callbacks.
```typescript
import { StreamingClient } from "orbitalsai";

const client = new StreamingClient({
  apiKey: "your_api_key_here",
  language: "english",
  sampleRate: 16000,
});

// Handle incoming transcripts
client.on("partial", (data) => {
  console.log("Partial:", data.text);
});

client.on("final", (data) => {
  console.log("Final:", data.text, "Cost:", data.cost);
});

client.on("error", (err) => {
  console.error("Streaming error:", err);
});

await client.connect();

// Stream raw PCM16 mono data (e.g., from a file or audio buffer)
const pcmChunk = /* Uint8Array or ArrayBuffer of PCM16 */;
await client.sendAudio(pcmChunk);

// Flush remaining audio and receive final transcripts
await client.flush();
await client.disconnect();
```

PCM requirements

Audio must be raw PCM16 mono little-endian at the configured sample rate. Convert other formats to PCM16 first (for example with ffmpeg) if needed.

Stream from microphone (browser)
Capture microphone audio with the Web Audio API, convert it to PCM16, and stream it in real time. This example works in modern browsers. (It uses ScriptProcessorNode, which is deprecated but still widely supported; consider AudioWorklet for new production code.)
```typescript
import { StreamingClient } from "orbitalsai";

async function streamFromMicrophone() {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const audioContext = new AudioContext({ sampleRate: 16000 });
  const source = audioContext.createMediaStreamSource(stream);
  const processor = audioContext.createScriptProcessor(4096, 1, 1);

  const client = new StreamingClient({
    apiKey: "your_api_key_here",
    language: "english",
    sampleRate: 16000,
  });

  client.on("partial", (data) => console.log("Partial:", data.text));
  client.on("final", (data) => console.log("Final:", data.text));

  await client.connect();

  processor.onaudioprocess = (e) => {
    const float32 = e.inputBuffer.getChannelData(0);
    const pcm16 = new Int16Array(float32.length);
    for (let i = 0; i < float32.length; i++) {
      const s = Math.max(-1, Math.min(1, float32[i]));
      pcm16[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
    }
    client.sendAudio(pcm16.buffer);
  };

  source.connect(processor);
  processor.connect(audioContext.destination);

  // Stop after 30 seconds
  setTimeout(async () => {
    stream.getTracks().forEach((t) => t.stop());
    await client.flush();
    await client.disconnect();
  }, 30000);
}

streamFromMicrophone();
```

Browser permissions

The browser will prompt the user for microphone access. getUserMedia requires a secure context (HTTPS or localhost).
StreamingClient options
Pass configuration when creating a StreamingClient:
```typescript
const client = new StreamingClient({
  apiKey: "your_api_key_here",

  // Audio
  sampleRate: 16000, // 8000–48000 Hz (16kHz recommended)

  // Language
  language: "english", // english, hausa, igbo, yoruba

  // Connection
  baseUrl: "https://api.orbitalsai.com", // Optional, for custom endpoints
});
```

| Option | Type | Description |
|---|---|---|
| apiKey | string | Your OrbitalsAI API key (required). |
| sampleRate | number | Audio sample rate in Hz (default: 16000). |
| language | string | Target language (english, hausa, igbo, yoruba). |
| baseUrl | string | API base URL for custom endpoints. |
Error handling (JavaScript)
Listen for the error event and handle connection failures, authentication issues, or exhausted credits.
```typescript
import { StreamingClient, AuthenticationError } from "orbitalsai";

const client = new StreamingClient({
  apiKey: process.env.ORBITALS_API_KEY,
  language: "english",
});

client.on("error", (err) => {
  if (err instanceof AuthenticationError) {
    console.error("Invalid API key – check ORBITALS_API_KEY.");
  } else if (err.message?.includes("credits")) {
    console.error("Credits exhausted – please top up your account.");
  } else {
    console.error("Streaming error:", err);
  }
});

try {
  await client.connect();
  // ... send audio
  await client.flush();
} catch (err) {
  console.error("Connection failed:", err);
} finally {
  await client.disconnect();
}
```
Streaming use cases
Call Center Transcription
Real-time transcription of customer service calls in African languages for quality assurance, training, and compliance.
Live Event Captioning
Provide live captions for conferences, webinars, and broadcasts in real time, making content accessible to broader audiences.
Virtual Meeting Assistants
Create AI assistants that transcribe and summarize virtual meetings in real time, capturing action items and key decisions.
Voice Assistants
Build voice-controlled applications with low-latency speech recognition for African language speakers.