Streaming | aiOla

Stream real-time audio transcription with our SDKs. This guide covers how to implement live audio streaming for immediate speech-to-text conversion in your applications.

Prerequisites

Before you begin, make sure you have:

An aiOla API key (get one here)
Python 3.10+ (for Python SDK) or Node.js 18+ (for TypeScript SDK)
Microphone access (for live audio streaming)

Installation

$ pip install 'aiola[mic]'

Step 1: Set up authentication

For comprehensive authentication details, security considerations, and token management strategies, see our Authentication Guide.

Generate an access token, then create your client with it:

import os
from aiola import AiolaClient

# Generate access token
result = AiolaClient.grant_token(
    api_key=os.getenv('AIOLA_API_KEY') or 'YOUR_API_KEY'
)

# Create client using the access token
client = AiolaClient(
    access_token=result.access_token
)
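The snippet above falls back to the literal placeholder 'YOUR_API_KEY' when the environment variable is unset, which only surfaces later as an authentication failure. A minimal sketch of failing fast instead is shown here; `load_api_key` is a hypothetical helper and not part of the aiOla SDK.

```python
import os


def load_api_key(env=None, var_name="AIOLA_API_KEY"):
    """Return the API key from the environment, or raise if it is missing.

    Hypothetical helper for illustration; not part of the aiOla SDK.
    """
    env = os.environ if env is None else env
    key = (env.get(var_name) or "").strip()
    # Treat the documentation placeholder as "not configured".
    if not key or key == "YOUR_API_KEY":
        raise RuntimeError(
            f"Set the {var_name} environment variable before streaming"
        )
    return key
```

The result could then be passed as `api_key=load_api_key()` to `AiolaClient.grant_token`.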

Step 2: Initialize streaming connection

Create a streaming connection with event handlers:

from aiola import AiolaClient, MicrophoneStream
from aiola.types import LiveEvents

# Create streaming connection
connection = client.stt.stream(lang_code='en')

# Set up event handlers
@connection.on(LiveEvents.Transcript)
def on_transcript(data):
    print('Transcript:', data.get('transcript', data))

@connection.on(LiveEvents.Connect)
def on_connect():
    print('Connected to streaming service')

@connection.on(LiveEvents.Disconnect)
def on_disconnect():
    print('Disconnected from streaming service')

@connection.on(LiveEvents.Error)
def on_error(error):
    print('Streaming error:', error)
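Printing each event is enough for a demo, but most applications want to keep the text. The sketch below accumulates transcript segments in memory. It assumes payloads are dicts with a 'transcript' key, as the handlers above suggest; the exact payload shape is not specified in this guide, and `TranscriptCollector` is a hypothetical helper.

```python
class TranscriptCollector:
    """Accumulate transcript text from handler callbacks (hypothetical helper)."""

    def __init__(self):
        self.segments = []

    def handle(self, data):
        # Assumption: payloads are dicts carrying a 'transcript' key.
        # Anything that is not a dict is stringified as a fallback.
        raw = data.get("transcript") if isinstance(data, dict) else data
        text = str(raw).strip() if raw is not None else ""
        if text:
            self.segments.append(text)

    def text(self):
        """Return all collected segments joined by single spaces."""
        return " ".join(self.segments)
```

If `connection.on` behaves like an ordinary decorator factory, registering the collector might look like `connection.on(LiveEvents.Transcript)(collector.handle)`.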

Step 3: Start streaming with microphone

Start the streaming connection and pipe microphone audio:

import time

# Connect to the streaming service
connection.connect()

try:
    # Capture audio from microphone using the SDK's MicrophoneStream
    with MicrophoneStream(
        channels=1,
        samplerate=16000,
        blocksize=4096,
    ) as mic:
        print("Listening... Speak into your microphone (Ctrl+C to stop)")
        mic.stream_to(connection)

        # Keep the main thread alive
        while True:
            time.sleep(0.1)

except KeyboardInterrupt:
    print('Keyboard interrupt')
finally:
    # Always release the connection, even after an interrupt
    connection.disconnect()

Custom audio sources

To stream from a custom audio source instead of the microphone, read the audio yourself and send it over the connection in chunks:

import asyncio

async def stream_audio_file():
    # Connect to streaming service
    connection = client.stt.stream(lang_code='en')

    @connection.on(LiveEvents.Transcript)
    def on_transcript(data):
        print('Transcript:', data.get('transcript', data))

    connection.connect()

    # Stream audio file in chunks
    with open('audio_file.wav', 'rb') as audio_file:
        chunk_size = 4096
        while True:
            chunk = audio_file.read(chunk_size)
            if not chunk:
                break

            # Send audio chunk
            connection.send(chunk)

            # Small delay to simulate real-time streaming
            await asyncio.sleep(0.1)

    # Close connection
    connection.disconnect()

# Run the async function
asyncio.run(stream_audio_file())
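Reading a .wav file with plain `open()` also sends the file header as if it were audio. Whether the service tolerates that is not stated in this guide, so a cautious alternative is to read only the sample data with Python's standard `wave` module. The sketch below yields raw PCM chunks together with each chunk's playback duration, which can serve as the sleep interval; `iter_pcm_chunks` is a hypothetical helper, not an SDK function.

```python
import wave


def iter_pcm_chunks(path, frames_per_chunk=2048):
    """Yield (pcm_bytes, duration_seconds) for each chunk of a WAV file.

    Only sample data is returned; the WAV header is skipped by `wave`.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        bytes_per_frame = wav.getsampwidth() * wav.getnchannels()
        while True:
            frames = wav.readframes(frames_per_chunk)
            if not frames:
                break
            # Duration of this chunk when played back at the file's rate.
            yield frames, len(frames) / bytes_per_frame / rate
```

With 16-bit mono audio, 2048 frames per chunk gives the 4096-byte chunks used above. Each yielded chunk could be passed to `connection.send(...)`, followed by a sleep of the yielded duration.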

Advanced streaming options

Keyword detection

Enable keyword detection during streaming:

# Create connection with keyword detection
connection = client.stt.stream(
    lang_code='en',
    keywords={
        "postgres": "PostgreSQL",
        "k eight s": "Kubernetes"
    }
)

Multiple language support

Stream with different languages:

# Supported languages: en, de, fr, es, pr, zh, ja, it
connection = client.stt.stream(lang_code='es')  # Spanish

Error handling

Implement robust error handling for streaming:

try:
    connection = client.stt.stream(lang_code='en')

    @connection.on(LiveEvents.Error)
    def on_error(error):
        print(f"Streaming error: {error}")
        # Implement reconnection logic here

    @connection.on(LiveEvents.Disconnect)
    def on_disconnect():
        print("Connection lost. Attempting to reconnect...")
        # Implement reconnection logic

    connection.connect()

except Exception as e:
    print(f"Failed to initialize streaming: {e}")
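The handlers above leave the reconnection logic open. One common approach is to retry with exponential backoff, sketched below in SDK-independent form. It assumes the `connect` callable raises an exception on failure; this guide does not say whether `connection.connect()` raises or only reports through the Error event, so treat that as an assumption. `connect_with_backoff` is a hypothetical helper.

```python
import time


def connect_with_backoff(connect, max_attempts=5, base_delay=0.5,
                         max_delay=8.0, sleep=time.sleep):
    """Call `connect()` until it succeeds, doubling the wait after each failure.

    Returns the number of attempts used. Re-raises the last error if every
    attempt fails. `sleep` is injectable so the delays can be tested.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            connect()
            return attempt
        except Exception:
            if attempt == max_attempts:
                raise
            # Delays grow as 0.5, 1, 2, 4, ... seconds, capped at max_delay.
            sleep(min(base_delay * 2 ** (attempt - 1), max_delay))
```

A call such as `connect_with_backoff(connection.connect)` could replace the bare `connection.connect()`. Capping the delay and the attempt count keeps a persistent outage from retrying forever.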

Complete working example

Here’s a complete Python example that combines all steps:

import os
import time
from aiola import AiolaClient, MicrophoneStream
from aiola.types import LiveEvents

def live_streaming():
    connection = None  # Assigned once the stream exists, so cleanup is safe
    try:
        # Step 1: Generate access token
        result = AiolaClient.grant_token(
            api_key=os.getenv('AIOLA_API_KEY') or 'YOUR_API_KEY'
        )

        # Step 2: Create client using the access token
        client = AiolaClient(
            access_token=result.access_token
        )

        # Step 3: Start streaming
        connection = client.stt.stream(
            lang_code='en'
        )

        @connection.on(LiveEvents.Transcript)
        def on_transcript(data):
            print('Transcript:', data.get('transcript', data))

        @connection.on(LiveEvents.Connect)
        def on_connect():
            print('Connected to streaming service')

        @connection.on(LiveEvents.Disconnect)
        def on_disconnect():
            print('Disconnected from streaming service')

        @connection.on(LiveEvents.Error)
        def on_error(error):
            print('Streaming error:', error)

        connection.connect()

        # Capture audio from microphone using the SDK's MicrophoneStream
        with MicrophoneStream(
            channels=1,
            samplerate=16000,
            blocksize=4096,
        ) as mic:
            mic.stream_to(connection)

            # Keep the main thread alive
            while True:
                time.sleep(0.1)

    except KeyboardInterrupt:
        print('Keyboard interrupt')
    except Exception as error:
        print('Error:', error)
    finally:
        # Disconnect only if the connection was actually created
        if connection is not None:
            connection.disconnect()

if __name__ == "__main__":
    live_streaming()

Best practices

Audio Quality: Use 16kHz sample rate, mono channel for optimal results
Chunk Size: 4096 bytes is recommended for real-time performance
Error Handling: Always implement reconnection logic for production use
Resource Cleanup: Properly disconnect streaming connections when done
Audio Input: Handle audio input sources and permissions appropriately
Latency: Consider buffering strategies for smoother transcription
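To relate chunk size to latency, it helps to convert bytes into seconds of audio. The sketch below does that arithmetic, assuming uncompressed 16-bit PCM as the defaults; `chunk_duration_seconds` is a hypothetical helper.

```python
def chunk_duration_seconds(chunk_bytes, sample_rate=16000,
                           sample_width_bytes=2, channels=1):
    """Return how many seconds of audio a chunk of raw PCM bytes holds."""
    bytes_per_second = sample_rate * sample_width_bytes * channels
    return chunk_bytes / bytes_per_second


# 4096 bytes of 16 kHz, 16-bit, mono audio is 2048 samples.
print(chunk_duration_seconds(4096))  # 0.128
```

Under these assumptions a 4096-byte chunk carries 128 ms of audio, so sending one chunk every 0.1 s, as the custom audio source example does, runs slightly faster than real time.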

Supported audio formats

For streaming, the following formats work best:

PCM 16-bit (recommended)
WAV uncompressed
Raw audio at 16kHz sample rate
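Before streaming a file, it can be worth checking that it matches the recommended format. The sketch below uses Python's standard `wave` module to compare a WAV file against 16 kHz, 16-bit, mono; `wav_format_problems` is a hypothetical helper, and the expected values are simply the recommendations from this guide.

```python
import wave


def wav_format_problems(path, expected_rate=16000, expected_width=2,
                        expected_channels=1):
    """Return a list of human-readable mismatches; empty means the file fits."""
    problems = []
    # wave.open raises wave.Error for files it cannot parse as PCM WAV.
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != expected_rate:
            problems.append(
                f"sample rate is {wav.getframerate()} Hz, expected {expected_rate}"
            )
        if wav.getsampwidth() != expected_width:
            problems.append(
                f"sample width is {wav.getsampwidth() * 8}-bit, "
                f"expected {expected_width * 8}-bit"
            )
        if wav.getnchannels() != expected_channels:
            problems.append(
                f"{wav.getnchannels()} channels, expected {expected_channels}"
            )
    return problems
```

An empty list means the file already matches; otherwise the messages indicate what to resample or downmix before sending.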

Next steps

Now that you’ve implemented streaming transcription, you can:

Explore Speech to Text SDK for file-based transcription
Learn about Text to Speech for speech synthesis
Check out the SDK repositories for more examples:
- Python SDK
- TypeScript SDK

Browser Examples

For web applications, check out our complete browser microphone streaming example:

Browser Microphone Streaming - Full web app example showing real-time microphone transcription in the browser