Text to Speech SDK
Generate natural-sounding speech from text with our SDKs. This guide walks you through the essential steps to get started with text-to-speech synthesis in your applications.
Prerequisites
Before you begin, make sure you have:
- An aiOla API key (get one here)
- Python 3.10+ (for Python SDK) or Node.js 18+ (for TypeScript SDK)
Step 1: Install the SDK
Step 2: Set up authentication
First, generate an access token using your API key.
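The token request itself goes through the SDK and is not reproduced here. As a minimal sketch of the step just before it, the helper below loads the API key from an environment variable so the key never appears in source code. The variable name `AIOLA_API_KEY`, the `load_api_key` function, and the `MissingApiKeyError` class are assumptions made for illustration, not part of the SDK.

```python
import os


class MissingApiKeyError(RuntimeError):
    """Raised when the API key is not configured in the environment."""


def load_api_key(env_var: str = "AIOLA_API_KEY") -> str:
    """Return the API key stored in env_var (a hypothetical variable name)."""
    # Read the key from the environment instead of hardcoding it in source.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise MissingApiKeyError(
            f"Set the {env_var} environment variable to your API key"
        )
    return key
```

Failing early with a clear message tends to be easier to debug than an authentication error returned later by the API.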
For detailed authentication information, security best practices, and advanced token management, see our Authentication Guide.
Step 3: Generate speech from text
In this step you convert text to speech and save the result as an audio file.
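The synthesis call depends on the SDK and is not reproduced here. As a minimal sketch of the save step only, the hypothetical `save_audio` helper below assumes the synthesis result is either a single `bytes` object or an iterable of byte chunks, which is an assumption about the return type rather than documented behavior.

```python
from pathlib import Path
from typing import Iterable, Union


def save_audio(audio: Union[bytes, Iterable[bytes]], path: str) -> int:
    """Write audio bytes, or an iterable of byte chunks, to path.

    Returns the total number of bytes written.
    """
    # Treat a single bytes object as a one-element sequence of chunks.
    chunks = [audio] if isinstance(audio, (bytes, bytearray)) else audio
    total = 0
    # The context manager closes the file even if a chunk fails to arrive.
    with Path(path).open("wb") as f:
        for chunk in chunks:
            total += f.write(chunk)
    return total
```

Choose a file extension that matches the audio format your synthesis call actually returns.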
Real-time streaming
For real-time audio streaming, see our dedicated Text to Speech Streaming Guide, which covers:
- Streaming audio generation for immediate playback
- Chunk-based processing for low latency
- Node.js implementations
Available voices
The following predefined voices are available: tara, zoe, zac, dan, jess, leo, mia, julia, and leah.
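One way to catch a mistyped voice name before making a request is to validate it locally against the list above. This is a minimal sketch: `AVAILABLE_VOICES` and `normalize_voice` are hypothetical names, and lowercasing the input assumes the API expects the lowercase spellings shown here.

```python
# Voice names copied from the list above.
AVAILABLE_VOICES = (
    "tara", "zoe", "zac", "dan", "jess", "leo", "mia", "julia", "leah",
)


def normalize_voice(name: str) -> str:
    """Return the lowercase voice name, or raise ValueError if it is unknown."""
    voice = name.strip().lower()
    if voice not in AVAILABLE_VOICES:
        options = ", ".join(AVAILABLE_VOICES)
        raise ValueError(f"Unknown voice {name!r}; choose one of: {options}")
    return voice
```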
Supported language
The SDK currently supports:
- English (en) - Primary language for text-to-speech synthesis
Error handling
Always handle errors from your API calls so that a failed request does not crash your application.
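The SDK's own exception types are not listed in this guide, so the sketch below uses Python's built-in `ConnectionError` and `TimeoutError` as stand-ins. It shows one common pattern, retrying transient failures with exponential backoff. The `call_with_retry` helper and its parameters are hypothetical, and you would replace `retry_on` with whatever exceptions your SDK version actually raises.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    retries: int = 3,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying up to `retries` times on the given exception types."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except retry_on:
            # Out of retries: let the last error propagate to the caller.
            if attempt == retries:
                raise
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
    raise ValueError("retries must be zero or greater")
```

Exceptions outside `retry_on`, such as a validation error, propagate immediately, since retrying them is unlikely to help.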
Async operations (Python)
The Python SDK can also be used for asynchronous operations.
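The async client calls are not reproduced here. As a minimal sketch of how several requests might be run concurrently with `asyncio`, the helper below takes any coroutine function that maps text to audio bytes. `synthesize_many` and the `synthesize` parameter are hypothetical stand-ins for your actual SDK call, and the default concurrency limit of 3 is an arbitrary assumption.

```python
import asyncio
from typing import Awaitable, Callable, List


async def synthesize_many(
    texts: List[str],
    synthesize: Callable[[str], Awaitable[bytes]],
    max_concurrency: int = 3,
) -> List[bytes]:
    """Run synthesize(text) for each text, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(text: str) -> bytes:
        # The semaphore caps how many requests are in flight at once.
        async with semaphore:
            return await synthesize(text)

    # gather returns results in the same order as the input texts.
    return await asyncio.gather(*(worker(t) for t in texts))
```

Capping concurrency is one simple way to stay within API rate limits while still overlapping network waits.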
Best practices
- Text Length: Keep text reasonably short for optimal performance
- Error Handling: Wrap API calls in try/except (Python) or try/catch (TypeScript) blocks
- Resource Management: Properly handle audio streams and file operations
- Audio Format Compatibility: Ensure audio format compatibility across different environments
- Rate Limiting: Be mindful of API rate limits for production applications
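To illustrate the text-length practice, long input can be split at sentence boundaries and synthesized one piece at a time. This is a minimal sketch: `split_text` is a hypothetical helper, and the 200-character default is an arbitrary assumption, not a documented API limit.

```python
import re
from typing import List


def split_text(text: str, max_chars: int = 200) -> List[str]:
    """Split text into chunks of whole sentences, each at most max_chars long.

    A single sentence longer than max_chars is kept whole, not cut mid-word.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            # The next sentence would overflow: close this chunk, start another.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own request and the resulting audio written out in order.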
Next steps
Now that you’ve successfully generated your first audio file, you can:
- Explore TTS Streaming for real-time speech synthesis
- Learn about Speech to Text SDK for converting speech back to text
- Check out the SDK repositories for more examples