EasyVoice

Free text-to-speech powered by open source AI.

© 2026 EasyVoice. Powered by Kokoro-82M (Apache 2.0).

Built with ❤️ and open source AI.

Built by InfoDriven

Dubai, United Arab Emirates · Support@infodriven.ae · infodriven.ae

TTS API for Developers — Bearer Auth, OpenAI Shape, Flat Pricing

Most TTS API documentation is written for procurement teams: lots of pricing-page bullets, slim integration examples, and a 'contact sales' button where the code should be. EasyVoice's TTS API is built for developers who want to grab an API key, paste a curl into their terminal, and ship the feature before lunch. This guide is the integration-first walkthrough: the auth model, the request shape, the response handling, error codes, rate limits, and reference implementations in curl, JavaScript, Python, and Go. Free tier is 5,000 characters per day, no credit card. Pro is $9.99/mo unlimited. The full OpenAPI spec lives at /api-docs.

5,000 characters per day on the free tier, no credit card. Pro $9.99/mo unlimited. 46 voices, 8 languages.

Part of the Best TTS APIs in 2026 hub — compares EasyVoice, OpenAI tts-1, ElevenLabs, Google Cloud TTS, and Azure Speech.

Authentication — Bearer key, one secret, no rituals

Authentication is a single Bearer token in the Authorization header. Sign up at /signup, navigate to the API Keys section of your dashboard, click 'Create API key', and copy the value. It is shown only once, so store it in your env file or secrets manager immediately. There's no service-account JSON, no OAuth dance, no region selection, no tenant ID, and no environment-specific endpoints. Both free-tier and Pro keys hit the same URL, https://easyvoice.ae/api/tts/generate; the tier is enforced server-side per key.
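As a small illustration of the single-secret model, the sketch below reads the key from the EASYVOICE_API_KEY environment variable (the name the samples on this page use) and builds the two headers a request needs. The helper name auth_headers is hypothetical, and failing fast on a missing key is a design choice of this sketch, not an API requirement.

```python
import os

API_URL = "https://easyvoice.ae/api/tts/generate"  # same URL for free-tier and Pro keys


def auth_headers(env=None):
    """Build request headers from the EASYVOICE_API_KEY environment variable.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("EASYVOICE_API_KEY", "").strip()
    if not key:
        # Fail before any network call so a missing secret is obvious at startup.
        raise RuntimeError("EASYVOICE_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",  # 'Bearer', one space, then the full key
        "Content-Type": "application/json",
    }
```

Calling auth_headers() once at startup surfaces a missing or empty key immediately instead of as a 401 on the first request.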

Rotate keys at any time from the dashboard — old keys are revoked on rotation, so plan a rolling deploy if you have multiple services using the same key. We don't currently support scoped keys (e.g. read-only or volume-capped sub-keys) — every key has full access to the TTS generate endpoint and your account's tier. Scoped keys are on the roadmap; for now, if you need per-service isolation, create separate accounts (one per service) and tie them to the same billing entity.

Request shape — JSON in, audio bytes out

Send a POST to https://easyvoice.ae/api/tts/generate with Content-Type: application/json and a JSON body containing these fields:

  • voice (string, required): the voice ID, e.g. 'af_heart'
  • input (string, required): the text to synthesize; the alias 'text' is also accepted for OpenAI compatibility
  • response_format (string, optional): one of 'mp3', 'wav', 'opus'; defaults to 'mp3'
  • speed (float, optional): 0.25-4.0; defaults to 1.0

The response is raw audio bytes. Its Content-Type is audio/mpeg, audio/wav, or audio/opus depending on the format you requested. There's no JSON wrapper, no base64 encoding, no envelope; pipe the response body directly into a file or audio sink.
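The request fields described above can be validated locally before anything goes over the wire. The following is a minimal sketch, not part of any official client: build_payload and FORMAT_CONTENT_TYPES are hypothetical names, and the checks only mirror the constraints stated in this section.

```python
import json

# Response Content-Type for each supported response_format value.
FORMAT_CONTENT_TYPES = {"mp3": "audio/mpeg", "wav": "audio/wav", "opus": "audio/opus"}


def build_payload(voice, text, response_format="mp3", speed=1.0):
    """Check the documented constraints and return the JSON request body as bytes."""
    if not voice:
        raise ValueError("voice is required")
    if not text:
        raise ValueError("input must be non-empty")
    if response_format not in FORMAT_CONTENT_TYPES:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    body = {
        "voice": voice,
        "input": text,
        "response_format": response_format,
        "speed": speed,
    }
    return json.dumps(body).encode("utf-8")
```

Catching these cases client-side turns what would be an HTTP 400 round trip into an immediate ValueError. The server remains the authority on which voice IDs exist.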

Voice IDs follow the Kokoro convention: a two-character prefix (the first letter is the language or accent, the second is f for female or m for male), an underscore, and a friendly name. The prefixes are af = American female, am = American male, bf = British female, bm = British male, ef/em = Spanish, ff = French, if/im = Italian, jf/jm = Japanese, hf/hm = Hindi, pf/pm = Portuguese, and zf/zm = Chinese. The full ID list is at /voices, or available programmatically via the /api/voices/list endpoint, which returns JSON metadata for every voice. For OpenAI-stack migrators, the closest matches are alloy→af_alloy, echo→am_echo, fable→bm_fable, onyx→am_onyx, nova→af_nova, shimmer→af_jessica; see /tts-api/openai-alternative for the full migration guide.
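The naming convention and the OpenAI mapping above are mechanical enough to express as lookup tables. This sketch uses hypothetical helper names and decodes only the prefix; it does not check that a given ID actually exists in the catalog at /voices.

```python
# First prefix letter: language or accent. Second prefix letter: gender.
LANGUAGES = {
    "a": "American English", "b": "British English", "e": "Spanish",
    "f": "French", "i": "Italian", "j": "Japanese",
    "h": "Hindi", "p": "Portuguese", "z": "Chinese",
}
GENDERS = {"f": "female", "m": "male"}

# OpenAI voice names mapped to the closest EasyVoice IDs listed above.
OPENAI_TO_EASYVOICE = {
    "alloy": "af_alloy", "echo": "am_echo", "fable": "bm_fable",
    "onyx": "am_onyx", "nova": "af_nova", "shimmer": "af_jessica",
}


def parse_voice_id(voice_id):
    """Split an ID such as 'af_heart' into (language, gender, name)."""
    prefix, sep, name = voice_id.partition("_")
    if not sep or len(prefix) != 2 or not name:
        raise ValueError(f"not a voice ID: {voice_id!r}")
    lang, gender = prefix[0], prefix[1]
    if lang not in LANGUAGES or gender not in GENDERS:
        raise ValueError(f"unknown prefix in voice ID: {voice_id!r}")
    return LANGUAGES[lang], GENDERS[gender], name
```

For example, parse_voice_id(OPENAI_TO_EASYVOICE["fable"]) decodes the migrated voice as a British English male voice named fable.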

Response handling and streaming

The response opens with Transfer-Encoding: chunked, so audio bytes flow as the model generates them. Typical first-byte latency is 300-600ms warm; full generation for a 200-character sentence is 800ms-1.5s. Any standard HTTP library handles chunked streaming without configuration: fetch in Node 18+ exposes response.body as a ReadableStream, requests in Python supports stream=True, Go's http.Client streams by default. For non-streaming use cases (write the whole file then process), just read the full response body — same code path, no flag.
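To make the chunked-streaming path concrete, here is a minimal sketch that writes audio to disk as it arrives. It uses only the Python standard library (urllib.request) rather than requests so it runs without extra packages; copy_stream and stream_tts_to_file are hypothetical names, and the 8 KiB chunk size is an arbitrary choice.

```python
import json
import os
import urllib.request

API_URL = "https://easyvoice.ae/api/tts/generate"


def copy_stream(source, sink, chunk_size=8192):
    """Copy from a file-like `source` to `sink` chunk by chunk; return bytes copied."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty bytes means the stream has ended
            break
        sink.write(chunk)
        total += len(chunk)
    return total


def stream_tts_to_file(text, path, voice="af_heart"):
    """POST the text and write the audio to `path` while the response is still arriving."""
    payload = {"voice": voice, "input": text, "response_format": "mp3"}
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['EASYVOICE_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError on 4xx/5xx, so only a success reaches the copy loop.
    with urllib.request.urlopen(req, timeout=30) as res, open(path, "wb") as out:
        return copy_stream(res, out)
```

Swapping the file for any object with a write() method (a pipe to an audio player, a socket) reuses the same loop, which is the "same code path, no flag" property described above.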

For client-side browser apps that want to play audio as it streams (chatbot UIs, real-time read-aloud), the usual pattern is to pipe response.body into a MediaSource via SourceBuffer.appendBuffer(). Most modern browsers handle MP3 streaming natively, but check MediaSource.isTypeSupported('audio/mpeg') first, because support varies by browser. WAV streaming requires slightly more setup because WAV's header includes a total length that's unknown at stream start. If playback doesn't need to start before the download finishes, the simpler option is to read the whole response into a Blob and point an <audio> element's src at a blob URL; you accept a small latency penalty in exchange for code simplicity.

Error codes and rate limits

  • HTTP 200: success; the response body is audio bytes.
  • HTTP 400: malformed request body. Check that the voice ID is valid, input is non-empty, and response_format is one of the supported values.
  • HTTP 401: API key missing or invalid. Re-check the Authorization header format (Bearer prefix, single space, full key).
  • HTTP 429: rate limited. Either you've hit the free-tier daily 5K-char cap (resets at 00:00 UTC), or you're sending sustained traffic above ~1 QPS on free tier (Pro has no per-second throttle at indie-developer rates).
  • HTTP 5xx: server-side error; retry with exponential backoff.

Error responses carry a JSON body of the form {"error": "message"}.
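One way to keep these status codes from scattering across a codebase is to map each to a single action. The sketch below is illustrative only: the action labels and both function names are invented for this example, and the error-body shape is the one described in this section.

```python
import json


def classify_status(status):
    """Map an HTTP status code to the handling suggested in this section."""
    if status == 200:
        return "ok"
    if status == 400:
        return "fix_request"  # invalid voice, empty input, or bad response_format
    if status == 401:
        return "fix_auth"     # missing or malformed Authorization header
    if status == 429:
        return "slow_down"    # daily cap reached or sustained rate too high
    if 500 <= status <= 599:
        return "retry"        # retry with exponential backoff
    return "unexpected"


def error_message(body):
    """Extract the 'error' field from a JSON error body, falling back to the raw text."""
    try:
        data = json.loads(body)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        if isinstance(body, bytes):
            return body.decode("utf-8", errors="replace")
        return str(body)
    if isinstance(data, dict) and "error" in data:
        return str(data["error"])
    return str(data)
```

Keeping 429 separate from 5xx matters because retrying immediately cannot help once the daily cap is hit; that case only clears at the 00:00 UTC reset.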

Rate limits on free tier are designed for normal developer integration patterns, not stress tests. Bursts of 10-20 requests in a few seconds are fine; sustained >1 QPS gets throttled. Pro tier has no per-second throttle at any normal volume — if you're doing load tests beyond ~50 QPS sustained, contact us via the support link in the dashboard so we can give you guidance on burstable instance allocations. The daily 5K-char cap on free tier is enforced by character count of the input field, not by audio length or response size.
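Because the free-tier cap counts input characters and resets at 00:00 UTC, a client can keep its own tally and avoid sending requests that would be rejected. The class below is a sketch under one loud assumption: it counts characters with Python's len(), and the server's exact counting rule may differ. The class and method names are hypothetical.

```python
from datetime import datetime, timezone

FREE_DAILY_CHARS = 5000  # free-tier cap stated on this page


class DailyCharBudget:
    """Client-side tally of characters sent today (UTC), mirroring the free-tier cap."""

    def __init__(self, limit=FREE_DAILY_CHARS):
        self.limit = limit
        self.day = None
        self.used = 0

    def try_spend(self, text, now=None):
        """Reserve len(text) characters; return False if that would exceed the cap."""
        now = now or datetime.now(timezone.utc)
        today = now.astimezone(timezone.utc).date()
        if today != self.day:  # the cap resets at 00:00 UTC
            self.day, self.used = today, 0
        if self.used + len(text) > self.limit:
            return False
        self.used += len(text)
        return True
```

A caller would check budget.try_spend(text) before each request and queue or drop the text when it returns False. The tally lives in process memory, so several workers sharing one key would need a shared store instead.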

Reference implementations in 4 languages

The code samples below show the canonical request pattern in curl, JavaScript (Node 18+), Python (the third-party requests library), and Go (net/http). All four assume you've set EASYVOICE_API_KEY as an environment variable; rotate it via the dashboard if you ever paste it into a commit by mistake. For more complete reference clients (CI integration tests, retry logic, streaming playback), the /api-docs page links to a TypeScript and Python reference implementation on GitHub.

For frameworks: Next.js Route Handlers should call the API server-side rather than from the client, to avoid leaking the API key to the browser; proxy the request through your own /api/tts route. Express, FastAPI, Flask, and Rails patterns are the same: backend-only, key in environment variables, never in client-side code. If you're building a public-facing tool where the browser needs to trigger TTS calls directly, have your backend issue short-lived signed URLs for that proxy route rather than exposing the long-lived API key.
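The short-lived signed URL idea can be sketched with an HMAC over the path and an expiry timestamp. Everything here is an assumption for illustration: the signing secret is a separate backend-only value (not the EasyVoice key), the URL points at your own proxy route, and the query parameter names expires and sig are invented. EasyVoice itself is not described on this page as accepting signed URLs.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlsplit


def _signature(path, expires, secret):
    """HMAC-SHA256 over 'path:expires', hex-encoded. `secret` must be bytes."""
    return hmac.new(secret, f"{path}:{expires}".encode(), hashlib.sha256).hexdigest()


def sign_url(path, secret, ttl_seconds=60, now=None):
    """Return `path` with an expiry timestamp and signature appended as query parameters."""
    expires = int(now if now is not None else time.time()) + ttl_seconds
    query = urlencode({"expires": expires, "sig": _signature(path, expires, secret)})
    return f"{path}?{query}"


def verify_url(url, secret, now=None):
    """Check the signature and expiry of a URL produced by sign_url."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    try:
        expires = int(query["expires"][0])
        sig = query["sig"][0]
    except (KeyError, ValueError):
        return False
    expected = _signature(parts.path, expires, secret)
    current = now if now is not None else time.time()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(sig.encode(), expected.encode()) and current <= expires
```

The backend would hand sign_url("/api/tts", secret) to an authenticated browser session, and the proxy route would call verify_url before forwarding anything to EasyVoice with the real key.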

OpenAPI spec, SDKs, and tooling

The full OpenAPI 3.0 spec is published at /api-docs and machine-readable at /api-docs/openapi.json. Use it to generate clients in any language the openapi-generator family supports — TypeScript, Python, Go, Rust, Ruby, Java, C#, Swift, Kotlin, PHP — or to import into Postman, Insomnia, or Bruno for interactive exploration. The spec is the source of truth; if anything in this page contradicts the spec, the spec wins.
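Before generating a client, it can help to see which operations a spec defines. The sketch below walks the standard OpenAPI 3.0 layout (a paths object keyed by URL, then by HTTP method); list_operations is a hypothetical helper, and it takes an already-parsed dict so it makes no assumption about how you fetch the file.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operations(spec):
    """Return sorted (METHOD, path) pairs from a parsed OpenAPI 3.0 document."""
    ops = []
    for path, item in spec.get("paths", {}).items():
        for key in item:
            # Path items may also hold non-method keys such as 'parameters' or 'summary'.
            if key.lower() in HTTP_METHODS:
                ops.append((key.upper(), path))
    return sorted(ops)
```

Loading the document with json.load() from a downloaded copy of /api-docs/openapi.json and passing the result to list_operations gives a quick inventory to compare against this page.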

First-party SDKs are not currently published — we've intentionally stayed lean while the API surface is small enough that an SDK adds more dependency overhead than ergonomic benefit. The 5-line stdlib-HTTP pattern works the same in every language, and avoiding an SDK means there's nothing to version-pin or upgrade. If we ship SDKs in the future, they'll be thin wrappers around the same endpoints, generated from the OpenAPI spec, and entirely optional.

Code samples

Drop-in examples for the EasyVoice TTS API. Every request below assumes you've set EASYVOICE_API_KEY as an environment variable.

curl

Minimum-viable smoke test
curl -X POST https://easyvoice.ae/api/tts/generate \
  -H "Authorization: Bearer $EASYVOICE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "voice": "af_heart",
    "input": "Hello from EasyVoice.",
    "response_format": "mp3"
  }' \
  --output out.mp3

JavaScript / TypeScript

Node 18+ (fetch available globally); run as an ES module so top-level await works
import { writeFile } from "node:fs/promises";

// POSTs the text and resolves to a Buffer of MP3 bytes.
async function tts(text, voice = "af_heart") {
  const res = await fetch("https://easyvoice.ae/api/tts/generate", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.EASYVOICE_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ voice, input: text, response_format: "mp3" }),
  });
  if (!res.ok) {
    // Error bodies are JSON; fall back to the status text if parsing fails.
    const err = await res.json().catch(() => ({ error: res.statusText }));
    throw new Error(`TTS failed (${res.status}): ${err.error}`);
  }
  return Buffer.from(await res.arrayBuffer());
}

const audio = await tts("Hello world", "am_adam");
await writeFile("out.mp3", audio);

Python

requests library (third-party: pip install requests), retries on 429 and 5xx
import os, time, requests

def tts(text: str, voice: str = "af_heart") -> bytes:
    for attempt in range(3):
        res = requests.post(
            "https://easyvoice.ae/api/tts/generate",
            headers={"Authorization": f"Bearer {os.environ['EASYVOICE_API_KEY']}"},
            json={"voice": voice, "input": text, "response_format": "mp3"},
            timeout=30,
        )
        if res.status_code == 200:
            return res.content
        if res.status_code in (429, 500, 502, 503, 504):
            time.sleep(2 ** attempt)
            continue
        res.raise_for_status()
    raise RuntimeError("TTS retries exhausted")

with open("out.mp3", "wb") as f:
    f.write(tts("Hello world", "am_adam"))

Go

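net/http, buffers the full response and checks the status code
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
    "os"
)

// TTS posts text to the generate endpoint and returns the MP3 bytes.
func TTS(text, voice string) ([]byte, error) {
    body, err := json.Marshal(map[string]any{
        "voice": voice,
        "input": text,
        "response_format": "mp3",
    })
    if err != nil { return nil, err }
    req, err := http.NewRequest("POST",
        "https://easyvoice.ae/api/tts/generate",
        bytes.NewReader(body))
    if err != nil { return nil, err }
    req.Header.Set("Authorization", "Bearer "+os.Getenv("EASYVOICE_API_KEY"))
    req.Header.Set("Content-Type", "application/json")
    res, err := http.DefaultClient.Do(req)
    if err != nil { return nil, err }
    defer res.Body.Close()
    data, err := io.ReadAll(res.Body)
    if err != nil { return nil, err }
    // Non-200 responses carry a JSON error body, not audio.
    if res.StatusCode != http.StatusOK {
        return nil, fmt.Errorf("TTS failed (%d): %s", res.StatusCode, data)
    }
    return data, nil
}

func main() {
    audio, err := TTS("Hello world", "am_adam")
    if err == nil {
        err = os.WriteFile("out.mp3", audio, 0o644)
    }
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
}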

Voices to try with the API

Every voice below is callable via the same voice parameter.

  • Heart (Free): American English · af_heart
  • Adam (Free): American English · am_adam
  • Michael (Free): American English · am_michael

Frequently asked questions

How do I get an API key?

Sign up at /signup (no credit card required), then go to your dashboard → API Keys → Create API key. The value is shown once — store it in your env file or secrets manager immediately. Rotate from the same page anytime. Both free-tier and Pro keys hit the same endpoint URL; the tier is enforced server-side per key.

What's the request body schema?

JSON with up to four fields: voice (string, required, e.g. 'af_heart'; see /voices for the full catalog), input (string, required, the text to synthesize; the alias 'text' is accepted for OpenAI compatibility), response_format (optional, one of 'mp3', 'wav', 'opus', defaults to 'mp3'), and speed (optional float, 0.25-4.0, defaults to 1.0). The response is raw audio bytes: no JSON wrapper, no base64, just the binary you'd write to a file.

What error codes should I handle?

200 success (response body is audio bytes). 400 malformed request (check voice ID is in the catalog, input is non-empty). 401 missing or invalid Bearer token. 429 rate limited — either you hit the free-tier 5K chars/day cap (resets at 00:00 UTC) or sustained >1 QPS on free tier. 5xx retry with exponential backoff. Error responses are JSON {error: 'message'} — wrap your client to parse them.

Is there an SDK?

Not currently, and that's by design. The 5-line plain-HTTP pattern (fetch in JS, the requests library in Python, net/http in Go) works the same in every language, and avoiding an SDK means there's nothing to version-pin. The OpenAPI spec at /api-docs/openapi.json can generate clients in any openapi-generator-supported language if you prefer a typed client. If we ship first-party SDKs in the future, they'll be thin wrappers around the same endpoints.

Can I use the API from the browser?

Technically yes, but you'd be exposing your API key in client-side code — which means anyone with browser devtools can copy and abuse it. The right pattern is to proxy through your own backend route (Next.js Route Handler, Express endpoint, Cloudflare Worker, etc.) that holds the EasyVoice key and forwards the request. Browser → your-backend → EasyVoice. Your backend can also add user-level auth and per-user quotas on top of the EasyVoice tier.

What's the difference between free and Pro for API users?

Free: 5,000 characters per day, daily reset, no credit card, sustained rate limited at ~1 QPS, all 46 voices and 8 languages included. Pro: $9.99/mo flat unlimited, no daily cap, no per-second throttle at normal volumes, same voices, same endpoint, same API key (just upgraded server-side). There's no integration code change between tiers — you upgrade in the dashboard and the existing key keeps working with the higher cap.

Related TTS API guides

OpenAI TTS API Alternative — Drop-in Migration to Flat-Rate

Swap OpenAI tts-1 for EasyVoice in 5 lines. Same request shape, flat $9.99/mo unlimited vs $15/1M chars. Voice mapping for alloy/echo/onyx + free tier.

Low Latency TTS API — 300-600ms First-Byte on Kokoro-82M

Low-latency TTS API for real-time chatbots and IVR. EasyVoice Kokoro-82M ships first audio byte in 300-600ms warm. Streaming, cold-start numbers, benchmarks.

Comparing vendors? See EasyVoice vs ElevenLabs.

Start building with the EasyVoice TTS API

5,000 characters per day free, no credit card. Pro $9.99/mo unlimited. OpenAI-compatible request shape.

More TTS API guides

  • TTS API hub
  • OpenAI TTS API Alternative — Drop-in Migration to Flat-Rate
  • Free TTS API — 5,000 Characters Per Day, No Credit Card
  • Low Latency TTS API — 300-600ms First-Byte on Kokoro-82M
  • Best TTS API for Customer Support Chatbots and Voice Agents
  • Multilingual TTS API — 8 Languages, 46 Voices, Single Endpoint