Text to Speech - Streaming
Stream audio with our SDKs in real time as it is generated. This guide shows how to implement streaming text-to-speech synthesis for immediate playback and other low-latency applications.
Prerequisites
Before you begin, make sure you have:
- An aiOla API key (get one here)
- Python 3.10+ (for Python SDK) or Node.js 18+ (for TypeScript SDK)
Step 1: Set up authentication
First, generate an access token and create your client:
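The SDK calls that exchange the API key for an access token are not reproduced in this section, so the sketch below covers only the step that precedes them: reading the key from the environment instead of hard-coding it. The helper `load_api_key` and the variable name `AIOLA_API_KEY` are assumptions made for illustration and are not part of the SDK.

```python
import os


def load_api_key(env_var: str = "AIOLA_API_KEY") -> str:
    """Return the API key stored in env_var, or raise if it is missing.

    Both this helper and the default variable name are hypothetical.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return api_key
```

Failing early with a clear message keeps a missing key from surfacing later as an opaque authentication error.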
For comprehensive authentication details, security considerations, and token management strategies, see our Authentication Guide.
Step 2: Basic streaming synthesis
Here’s how to stream audio as it is generated so you can process it immediately:
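As an illustration of the consuming side, the sketch below writes audio chunks to disk as they arrive. It assumes the SDK's streaming call yields an iterable of `bytes` chunks. `fake_tts_stream` is a hypothetical stand-in for that call, and `save_stream` is an illustrative helper rather than an SDK function.

```python
from pathlib import Path
from typing import Iterable, Iterator


def fake_tts_stream(text: str, chunk_size: int = 4) -> Iterator[bytes]:
    """Hypothetical stand-in for an SDK streaming call: yields bytes chunks."""
    data = text.encode("utf-8")  # placeholder payload, not real audio
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


def save_stream(chunks: Iterable[bytes], path: str) -> int:
    """Write each chunk to path as soon as it arrives; return bytes written."""
    total = 0
    with Path(path).open("wb") as out:
        for chunk in chunks:
            out.write(chunk)  # no full-response buffer is kept in memory
            total += len(chunk)
    return total
```

Swapping `fake_tts_stream(...)` for the real streaming call should leave `save_stream` unchanged, provided that call yields raw byte chunks.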
Step 3: Async streaming (Python)
For asynchronous streaming operations:
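A minimal sketch of asynchronous consumption, assuming the async client exposes the stream as an async iterator of `bytes` chunks. `fake_async_tts_stream` is a hypothetical stand-in for that call, and `consume_stream` is an illustrative helper, not an SDK function.

```python
import asyncio
from typing import AsyncIterator, Callable


async def fake_async_tts_stream(text: str, chunk_size: int = 4) -> AsyncIterator[bytes]:
    """Hypothetical stand-in for an async SDK streaming call."""
    data = text.encode("utf-8")  # placeholder payload, not real audio
    for start in range(0, len(data), chunk_size):
        await asyncio.sleep(0)  # yield control, as a network read would
        yield data[start:start + chunk_size]


async def consume_stream(
    chunks: AsyncIterator[bytes],
    handle: Callable[[bytes], None],
) -> int:
    """Pass each chunk to handle as it arrives; return total bytes seen."""
    total = 0
    async for chunk in chunks:
        handle(chunk)
        total += len(chunk)
    return total
```

From synchronous code, the coroutine can be driven with `asyncio.run(consume_stream(fake_async_tts_stream("hello"), some_list.append))`. Inside an existing event loop, `await` it directly instead.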
Best practices
- Chunk Processing: Handle each chunk as soon as it arrives rather than waiting for the full response; this keeps latency low
- Buffer Management: Buffer a small amount of audio before starting playback so that brief network delays do not cause gaps
- Error Recovery: Catch network errors and retry failed streams
- Memory Usage: Write or play chunks incrementally instead of accumulating the entire audio in memory
- Audio Quality: Choose a sample rate and format that match your use case
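Two of the practices above, buffer management and error recovery, can be sketched as small generator wrappers around any iterable of `bytes` chunks. Both helpers are hypothetical and independent of the SDK. `ConnectionError` is used as a placeholder for whatever exception the real client raises on network failure.

```python
import time
from collections import deque
from typing import Callable, Iterable, Iterator


def prebuffer(chunks: Iterable[bytes], min_bytes: int) -> Iterator[bytes]:
    """Hold back output until min_bytes are buffered, then pass chunks through.

    Buffered chunks are flushed when the stream ends, even if the
    threshold was never reached.
    """
    pending: deque[bytes] = deque()
    buffered = 0
    source = iter(chunks)
    for chunk in source:
        pending.append(chunk)
        buffered += len(chunk)
        if buffered >= min_bytes:
            break
    while pending:
        yield pending.popleft()
    yield from source  # remaining chunks are forwarded without delay


def stream_with_retry(
    open_stream: Callable[[], Iterable[bytes]],
    max_attempts: int = 3,
    base_delay: float = 0.5,
) -> Iterator[bytes]:
    """Yield chunks from open_stream(), retrying failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        yielded_any = False
        try:
            for chunk in open_stream():
                yielded_any = True
                yield chunk
            return
        except ConnectionError:
            # Restarting after partial output would replay audio, so only
            # retry when the failure happened before the first chunk.
            if yielded_any or attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))
```

The two wrappers compose: `prebuffer(stream_with_retry(open_stream), min_bytes=...)` retries the connection first and then smooths playback start. A suitable `min_bytes` depends on the audio format and network conditions, so treat any value as something to tune.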
Next steps
Now that you’ve implemented streaming text-to-speech synthesis, you can:
- Explore Text to Speech SDK for file-based synthesis
- Learn about Speech to Text Streaming for real-time transcription
- Check out the SDK repositories for more examples