Speech to Text SDK
Convert audio to text with our easy-to-use SDKs. This guide will walk you through the essential steps to get started with speech transcription in your applications.
Prerequisites
Before you begin, make sure you have:
- An aiOla API key (get one here)
- Python 3.10+ (for Python SDK) or Node.js 18+ (for TypeScript SDK)
Step 1: Install the SDK
Step 2: Set up authentication
First, you’ll need to generate an access token using your API key:
For detailed authentication information, security best practices, and advanced token management, see our Authentication Guide.
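Before an access token can be generated, the API key has to reach your code without being hard-coded. The following is a minimal sketch of one common approach, reading the key from an environment variable. The variable name `AIOLA_API_KEY` and the helper `load_api_key` are hypothetical names chosen for illustration, not part of the SDK; the token-generation call itself is not shown because its signature is not given here.

```python
import os
from typing import Mapping

# Hypothetical variable name, chosen for this example only.
API_KEY_ENV_VAR = "AIOLA_API_KEY"


def load_api_key(env: Mapping[str, str] = os.environ,
                 var: str = API_KEY_ENV_VAR) -> str:
    """Return the API key stored in `var`, with surrounding whitespace removed.

    Raises RuntimeError when the variable is unset or blank, so a missing
    key is reported before any network request is attempted.
    """
    key = env.get(var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var} environment variable to your API key")
    return key
```

The returned string would then be passed to whatever token-generation function the SDK provides.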
Step 3: Transcribe an audio file
Here’s how to transcribe an audio file:
Real-time streaming
For real-time streaming transcription, see our dedicated Speech to Text Streaming Guide, which covers:
- Live microphone streaming
- Event-based transcription handling
- Custom audio source streaming
- Keyword detection during streaming
- Connection management and error handling
Supported audio formats
The SDK supports the following audio formats:
- WAV (.wav)
- FLAC (.flac)
- AIFF (.aiff)
- M4A (.m4a)
- MP4 (.mp4)
- MOV (.mov)
- M4V (.m4v)
- AAC (.aac)
- MKV (.mkv)
- MP3 (.mp3)
- Opus (.opus)
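Checking the file extension locally, before uploading, avoids a round trip for files the service would reject. This is a minimal sketch, assuming that the extension alone is a good enough signal; `is_supported_audio` is a hypothetical helper, not an SDK function, and the extension set is copied from the list above.

```python
from pathlib import Path

# Extensions copied from the supported-formats list in this guide.
SUPPORTED_EXTENSIONS = frozenset({
    ".wav", ".flac", ".aiff", ".m4a", ".mp4", ".mov",
    ".m4v", ".aac", ".mkv", ".mp3", ".opus",
})


def is_supported_audio(path: str) -> bool:
    """Return True if the file's extension is in the supported list.

    The comparison is case-insensitive and looks only at the final suffix;
    it does not inspect the file contents.
    """
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, `is_supported_audio("meeting.WAV")` is true, while `is_supported_audio("notes.txt")` is false.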
File size limitations
- Maximum file size: 50 MB
- For larger files, consider using streaming transcription or splitting the audio into smaller chunks
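The size limit can also be checked before upload. The sketch below assumes that "50 MB" means 50,000,000 bytes (the more conservative reading; the guide does not say whether the limit is decimal or binary), and all function names are hypothetical. `chunks_needed` only estimates how many pieces a large file would need; actually splitting compressed audio has to be done with an audio tool at frame boundaries, not by cutting raw bytes.

```python
import math
import os

# Assumption: decimal megabytes. The stated limit is "50 MB".
MAX_UPLOAD_BYTES = 50 * 1000 * 1000


def fits_upload_limit(size_bytes: int, limit: int = MAX_UPLOAD_BYTES) -> bool:
    """Return True if a file of `size_bytes` is within the upload limit."""
    return 0 <= size_bytes <= limit


def chunks_needed(size_bytes: int, limit: int = MAX_UPLOAD_BYTES) -> int:
    """Estimate the minimum number of pieces so that each fits the limit."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return max(1, math.ceil(size_bytes / limit))


def file_fits_upload_limit(path: str) -> bool:
    """Check a file on disk against the upload limit."""
    return fits_upload_limit(os.path.getsize(path))
```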
Supported languages
The SDK supports the following languages:
- English (en)
- German (de)
- French (fr)
- Spanish (es)
- Portuguese (pr)
- Chinese (zh)
- Japanese (ja)
- Italian (it)
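A language code can be validated locally before it is sent with a request. This is a small sketch under the assumption that codes are matched exactly as listed above; `normalize_language` is a hypothetical helper, not an SDK function. The codes are copied verbatim from this guide, including `pr` for Portuguese.

```python
# Codes copied verbatim from the supported-languages list in this guide.
SUPPORTED_LANGUAGES = {
    "en": "English",
    "de": "German",
    "fr": "French",
    "es": "Spanish",
    "pr": "Portuguese",  # listed as "pr" in this guide
    "zh": "Chinese",
    "ja": "Japanese",
    "it": "Italian",
}


def normalize_language(code: str) -> str:
    """Return the code lowercased and stripped, or raise ValueError if unsupported."""
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        supported = ", ".join(sorted(SUPPORTED_LANGUAGES))
        raise ValueError(f"Unsupported language {code!r}; expected one of: {supported}")
    return normalized
```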
Error handling
Always implement proper error handling for your API calls:
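One generic pattern is to retry transient failures with exponential backoff and let everything else propagate. The sketch below is not tied to the SDK: the SDK's own exception classes are not listed in this guide, so the wrapper takes the exception types to retry as a parameter and defaults to Python's built-in `ConnectionError` and `TimeoutError`. `call_with_retries` is a hypothetical helper.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    operation: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying on the given exception types.

    The wait doubles after each failed attempt (base_delay, 2x, 4x, ...).
    Exceptions not listed in `retry_on` are raised immediately, and the
    last retryable exception is re-raised once attempts are exhausted.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

A transcription call would be wrapped in a zero-argument function or lambda and passed as `operation`; errors such as an invalid API key or an unsupported file should generally not be retried, which is why only the listed types trigger another attempt.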
Advanced options
Keyword detection
You can enable keyword detection for more accurate transcription:
Custom configuration
Enterprise users can configure custom endpoints:
Next steps
Now that you’ve successfully transcribed your first audio file, you can:
- Explore Real-time Streaming for live audio transcription
- Learn about Text to Speech SDK for converting text back to speech
- Check out the SDK repositories for more examples:
Browser examples
For web applications, check out our complete browser microphone streaming example:
- Browser Microphone Streaming - Full web app example showing real-time microphone transcription in the browser
