POST /api/v1/client/txt2audio
Text to Speech
curl --request POST \
  --url https://api.modelbeam.srv1069417.hstgr.cloud/api/v1/client/txt2audio \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form 'text=<string>' \
  --form model=Kokoro \
  --form lang=en-us \
  --form speed=1 \
  --form format=mp3 \
  --form sample_rate=44100 \
  --form mode=custom_voice \
  --form 'voice=<string>' \
  --form ref_audio='@example-file' \
  --form 'ref_text=<string>' \
  --form 'instruct=<string>' \
  --form 'webhook_url=<string>'
{
  "data": {
    "request_id": "550e8400-e29b-41d4-a716-446655440000"
  }
}
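
For readers who are not using curl, the request above can be sketched in Python with only the standard library. This is a minimal illustration, not official client code: the helper names `encode_multipart` and `build_tts_request` are hypothetical, only the required text fields are sent, and file upload via ref_audio is left out.

```python
import uuid
import urllib.request

# Endpoint URL taken from the curl example above.
API_URL = "https://api.modelbeam.srv1069417.hstgr.cloud/api/v1/client/txt2audio"


def encode_multipart(fields):
    """Encode plain text fields as multipart/form-data (hypothetical helper).

    Returns the body bytes and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
    parts.append(f"--{boundary}--\r\n")
    body = "".join(parts).encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


def build_tts_request(token, text):
    """Build (but do not send) a POST request with the required fields."""
    fields = {
        "text": text,
        "model": "Kokoro",       # example values from the reference above
        "lang": "en-us",
        "speed": "1",
        "format": "mp3",
        "sample_rate": "44100",
    }
    body, content_type = encode_multipart(fields)
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": content_type,
        },
    )
```

Passing the result to `urllib.request.urlopen` would send it; building the request separately keeps the sketch easy to inspect before any network call is made.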

Authorizations

Authorization
string
header
required

API key obtained from the ModelBeam dashboard, sent as a Bearer token (Authorization: Bearer <token>)

Body

multipart/form-data
text
string
required

Text to synthesize

model
string
required

Model slug

Example:

"Kokoro"

lang
string
required

Language code

Example:

"en-us"

speed
number
required

Speech speed multiplier

Required range: 0.5 <= x <= 2
Example:

1

format
enum<string>
required

Output format

Available options: mp3, wav, flac
Example:

"mp3"

sample_rate
integer
required

Audio sample rate in Hz

Example:

44100

mode
enum<string>

Voice mode

Available options: custom_voice, voice_clone, voice_design

voice
string

Voice preset slug for custom_voice mode

ref_audio
file

Reference audio for voice cloning (3 to 10 seconds long, max 10 MB)

ref_text
string

Transcript of reference audio

instruct
string

Voice design instructions

webhook_url
string

HTTPS webhook URL

Maximum string length: 2048
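
The constraints listed for the body fields can be checked on the client before a request is sent. The sketch below is one possible way to do that; `validate_tts_fields` is a hypothetical helper, it assumes speed is given as a number, and it reads "HTTPS webhook URL" as requiring an `https://` prefix, which is an assumption rather than documented behavior. The server remains the authority on what is accepted.

```python
REQUIRED_FIELDS = ("text", "model", "lang", "speed", "format", "sample_rate")
ALLOWED_FORMATS = {"mp3", "wav", "flac"}
ALLOWED_MODES = {"custom_voice", "voice_clone", "voice_design"}
MAX_WEBHOOK_URL_LENGTH = 2048


def validate_tts_fields(fields):
    """Return a list of problems; an empty list means no problem was found."""
    errors = []

    # Every field marked "required" in the reference must be present.
    for name in REQUIRED_FIELDS:
        if name not in fields:
            errors.append(f"missing required field: {name}")

    # Documented range: 0.5 <= speed <= 2.
    speed = fields.get("speed")
    if speed is not None and not (0.5 <= float(speed) <= 2):
        errors.append("speed must be between 0.5 and 2")

    # Enumerated values for format and mode.
    fmt = fields.get("format")
    if fmt is not None and fmt not in ALLOWED_FORMATS:
        errors.append(f"unsupported format: {fmt}")
    mode = fields.get("mode")
    if mode is not None and mode not in ALLOWED_MODES:
        errors.append(f"unsupported mode: {mode}")

    # webhook_url: HTTPS (assumed to mean an https:// prefix), max 2048 chars.
    url = fields.get("webhook_url")
    if url is not None:
        if not url.startswith("https://"):
            errors.append("webhook_url must use HTTPS")
        if len(url) > MAX_WEBHOOK_URL_LENGTH:
            errors.append("webhook_url is longer than 2048 characters")

    return errors
```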

Response

TTS job created

data
object
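
Based on the sample response shown for this endpoint, the data object carries a request_id. A small sketch of reading it from the response body follows; `extract_request_id` is a hypothetical helper, and any fields beyond request_id are not assumed.

```python
import json


def extract_request_id(response_body):
    """Return data.request_id from a JSON response body string."""
    payload = json.loads(response_body)
    return payload["data"]["request_id"]


# Sample body copied from the example response for this endpoint.
sample = '{"data": {"request_id": "550e8400-e29b-41d4-a716-446655440000"}}'
request_id = extract_request_id(sample)
```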