Text to Speech SDK

Generate natural-sounding speech from text with our SDKs. This guide will walk you through the essential steps to get started with text-to-speech synthesis in your applications.

Prerequisites

Before you begin, make sure you have:

  • An aiOla API key (get one here)
  • Python 3.10+ (for Python SDK) or Node.js 18+ (for TypeScript SDK)

Step 1: Install the SDK

$ pip install aiola

Step 2: Set up authentication

For detailed authentication information, security best practices, and advanced token management, see our Authentication Guide.

Generate an access token from your API key, then use it to create the client:

from aiola import AiolaClient

# Generate access token
result = AiolaClient.grant_token(api_key='your-api-key')
access_token = result['accessToken']

# Create client
client = AiolaClient(access_token=access_token)
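
The snippet above hardcodes the API key for brevity. A common alternative is to read it from the environment so it stays out of source control. The following is a minimal sketch, not part of the SDK: the helper `load_api_key` and the variable name `AIOLA_API_KEY` are hypothetical choices made for this example.

```python
import os


def load_api_key(env=os.environ, var_name="AIOLA_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    AIOLA_API_KEY is a hypothetical variable name chosen for this sketch;
    the SDK does not require it.
    """
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable to your aiOla API key"
        )
    return key
```

With this in place, the token call would become `AiolaClient.grant_token(api_key=load_api_key())`.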

Step 3: Generate speech from text

Here’s how to convert text to speech and save it as an audio file:

# Generate speech from text
text = "Hello, this is a sample text to speech conversion using aiOla SDK."

audio = client.tts.synthesize(
    text=text,
    voice='tara',  # Choose from available voices
    language='en'
)

# Save the audio to a file
with open('output.wav', 'wb') as f:
    for chunk in audio:
        f.write(chunk)

print("Audio file generated successfully!")

Real-time streaming

For real-time audio streaming, see our dedicated Text to Speech Streaming Guide, which covers:

  • Streaming audio generation for immediate playback
  • Chunk-based processing for low latency
  • Node.js implementations

Available voices

The following predefined voices are available: tara, zoe, zac, dan, jess, leo, mia, julia, and leah.
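
If the voice name comes from user input or configuration, it can help to check it locally before making a request. The sketch below is an assumption-laden convenience, not an SDK feature: `AVAILABLE_VOICES` simply copies the list above (which may change over time), and `normalize_voice` is a hypothetical helper. Whether the API itself accepts mixed-case names is not stated here, so the helper lowercases the name before checking.

```python
# Copied from the list of predefined voices above; may go out of date.
AVAILABLE_VOICES = frozenset(
    {"tara", "zoe", "zac", "dan", "jess", "leo", "mia", "julia", "leah"}
)


def normalize_voice(name):
    """Return the lowercase voice name, or raise ValueError if it is unknown."""
    voice = name.strip().lower()
    if voice not in AVAILABLE_VOICES:
        options = ", ".join(sorted(AVAILABLE_VOICES))
        raise ValueError(f"Unknown voice {name!r}; expected one of: {options}")
    return voice
```

The returned value can then be passed as the `voice` argument of `client.tts.synthesize`.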

Supported language

The SDK currently supports:

  • English (en) - Primary language for text-to-speech synthesis

Error handling

Always implement proper error handling for your API calls:

try:
    audio = client.tts.synthesize(
        text="Sample text for speech synthesis.",
        voice='jess',
        language='en'
    )

    with open("output.wav", "wb") as f:
        for chunk in audio:
            f.write(chunk)

    print("Speech synthesis completed successfully!")

except Exception as e:
    print(f"Speech synthesis failed: {e}")

Async operations (Python)

For asynchronous operations in Python:

from aiola import AsyncAiolaClient
import asyncio

async def async_tts_example():
    # Generate access token
    result = AsyncAiolaClient.grant_token(api_key='your-api-key')
    access_token = result['accessToken']

    # Create async client
    async_client = AsyncAiolaClient(access_token=access_token)

    # Generate speech asynchronously
    audio = await async_client.tts.synthesize(
        text="This is async text to speech.",
        voice='jess',
        language='en'
    )

    # Save audio
    with open("async_output.wav", "wb") as f:
        async for chunk in audio:
            f.write(chunk)

# Run the async function
asyncio.run(async_tts_example())
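
The main benefit of the async client is running several requests at once. The sketch below shows one way to do that with a cap on concurrency, using only the standard library. It makes no SDK calls itself: `synthesize_many` is a hypothetical helper, and `synthesize_one` stands for a coroutine function you would write around `async_client.tts.synthesize`.

```python
import asyncio


async def synthesize_many(texts, synthesize_one, max_concurrent=3):
    """Run synthesize_one(text) for each text, at most max_concurrent at a time.

    synthesize_one is a caller-supplied coroutine function (hypothetical here).
    Results are returned in the same order as texts.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run(text):
        # The semaphore keeps the number of in-flight requests bounded.
        async with semaphore:
            return await synthesize_one(text)

    return await asyncio.gather(*(run(text) for text in texts))
```

The default of 3 concurrent requests is an arbitrary value for illustration, not a documented limit; choose one that fits your account's rate limits.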

Best practices

  1. Text Length: Keep text reasonably short for optimal performance
  2. Error Handling: Always wrap API calls in try/except (Python) or try/catch (TypeScript) blocks
  3. Resource Management: Properly handle audio streams and file operations
  4. Audio Format Compatibility: Ensure audio format compatibility across different environments
  5. Rate Limiting: Be mindful of API rate limits for production applications
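
For the first point, long input can be broken into shorter pieces before synthesis. The following is a rough sketch of one approach that splits at sentence boundaries. The helper `split_text` is hypothetical, and the 300-character default is an arbitrary value for illustration, not a documented limit of the API.

```python
import re


def split_text(text, max_chars=300):
    """Split text into pieces of at most max_chars, breaking at sentence ends where possible."""
    # Split after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pieces, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        # Hard-wrap a single sentence that is longer than the limit.
        while len(sentence) > max_chars:
            if current:
                pieces.append(current)
                current = ""
            pieces.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            pieces.append(current)
            current = sentence
    if current:
        pieces.append(current)
    return pieces
```

Each piece would then be passed to `client.tts.synthesize` as a separate request and saved to its own file.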

Next steps

Now that you’ve successfully generated your first audio file, you can: