MiniMax Speech 2.8 HD

Text-to-Speech • MiniMax

MiniMax Speech 2.8 HD focuses on studio-grade audio generation with emotion control, multilingual support (40+ languages), and voice cloning.

Model Info
Terms and License	link ↗
More information	link ↗
Pricing	View pricing in the Cloudflare dashboard ↗

Usage

TypeScript

const response = await env.AI.run(
  'minimax/speech-2.8-hd',
  {
    format: 'mp3',
    pitch: 0,
    speed: 1,
    text: 'Hello! Welcome to Cloudflare AI Gateway. Let me show you what we can do.',
    voice_id: 'English_expressive_narrator',
    volume: 1,
  },
)
console.log(response)

cURL

curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run \
  --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "model": "minimax/speech-2.8-hd",
  "input": {
    "format": "mp3",
    "pitch": 0,
    "speed": 1,
    "text": "Hello! Welcome to Cloudflare AI Gateway. Let me show you what we can do.",
    "voice_id": "English_expressive_narrator",
    "volume": 1
  }
}'

Output
Raw response

{
  "gatewayMetadata": {
    "keySource": "Unified"
  },
  "result": {
    "audio": "https://pub-04a6d208d361438ea01b797e6973bd19.r2.dev/catalog/minimax__speech-2.8-hd/simple-speech.mp3"
  },
  "state": "Completed"
}
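
The raw response above suggests the audio is returned as a URL under `result.audio`, alongside a `state` field. As a minimal sketch, assuming that shape holds for every successful call and that `"Completed"` is the only success state (neither is confirmed here), a small guard can extract the URL before you use it. The `SpeechResponse` type and `getAudioUrl` helper are hypothetical names introduced for illustration.

```typescript
// Response shape inferred from the sample output on this page; treat it as an assumption.
interface SpeechResponse {
  gatewayMetadata?: { keySource?: string };
  result?: { audio?: string };
  state?: string;
}

// Hypothetical helper: returns the audio URL, or throws if the call did not complete
// or the URL is missing.
function getAudioUrl(response: SpeechResponse): string {
  if (response.state !== "Completed") {
    throw new Error(
      `Speech generation not completed (state: ${response.state ?? "unknown"})`,
    );
  }
  const url = response.result?.audio;
  if (typeof url !== "string" || url.length === 0) {
    throw new Error("Response did not include result.audio");
  }
  return url;
}
```

In a Worker, the returned URL could then be passed to `fetch` to download the generated file.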

Examples

Custom Voice — Use a specific voice and adjust speed

TypeScript

const response = await env.AI.run(
  'minimax/speech-2.8-hd',
  {
    format: 'mp3',
    pitch: 0,
    speed: 0.9,
    text: 'The weather today is sunny with a high of 72 degrees. Perfect for a walk in the park.',
    voice_id: 'English_expressive_narrator',
    volume: 1,
  },
)
console.log(response)

cURL

curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run \
  --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "model": "minimax/speech-2.8-hd",
  "input": {
    "format": "mp3",
    "pitch": 0,
    "speed": 0.9,
    "text": "The weather today is sunny with a high of 72 degrees. Perfect for a walk in the park.",
    "voice_id": "English_expressive_narrator",
    "volume": 1
  }
}'

Output
Raw response

{
  "gatewayMetadata": {
    "keySource": "Unified"
  },
  "result": {
    "audio": "https://pub-04a6d208d361438ea01b797e6973bd19.r2.dev/catalog/minimax__speech-2.8-hd/custom-voice.mp3"
  },
  "state": "Completed"
}

With Emotion — Apply emotional tone to speech

TypeScript

const response = await env.AI.run(
  'minimax/speech-2.8-hd',
  {
    emotion: 'happy',
    format: 'mp3',
    pitch: 0,
    speed: 1,
    text: "Congratulations! You've just won the grand prize! This is absolutely incredible news!",
    voice_id: 'English_expressive_narrator',
    volume: 1,
  },
)
console.log(response)

cURL

curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run \
  --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "model": "minimax/speech-2.8-hd",
  "input": {
    "emotion": "happy",
    "format": "mp3",
    "pitch": 0,
    "speed": 1,
    "text": "Congratulations! You'\''ve just won the grand prize! This is absolutely incredible news!",
    "voice_id": "English_expressive_narrator",
    "volume": 1
  }
}'

Output
Raw response

{
  "gatewayMetadata": {
    "keySource": "Unified"
  },
  "result": {
    "audio": "https://pub-04a6d208d361438ea01b797e6973bd19.r2.dev/catalog/minimax__speech-2.8-hd/with-emotion.mp3"
  },
  "state": "Completed"
}

High Sample Rate — Studio quality at 44.1kHz sample rate

TypeScript

const response = await env.AI.run(
  'minimax/speech-2.8-hd',
  {
    format: 'mp3',
    pitch: 0,
    sample_rate: 44100,
    speed: 1,
    text: 'This recording is generated at studio quality sample rate for the highest possible audio fidelity.',
    voice_id: 'English_expressive_narrator',
    volume: 1,
  },
)
console.log(response)

cURL

curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run \
  --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "model": "minimax/speech-2.8-hd",
  "input": {
    "format": "mp3",
    "pitch": 0,
    "sample_rate": 44100,
    "speed": 1,
    "text": "This recording is generated at studio quality sample rate for the highest possible audio fidelity.",
    "voice_id": "English_expressive_narrator",
    "volume": 1
  }
}'

Output
Raw response

{
  "gatewayMetadata": {
    "keySource": "Unified"
  },
  "result": {
    "audio": "https://pub-04a6d208d361438ea01b797e6973bd19.r2.dev/catalog/minimax__speech-2.8-hd/high-sample-rate.mp3"
  },
  "state": "Completed"
}

emotion

string, enum: happy, sad, angry, fearful, disgusted, surprised, calm, fluent
Emotion control for synthesized speech.

format

string, required, default: mp3, enum: mp3, flac, wav
Output audio format.

pitch

integer, required, default: 0, minimum: -12, maximum: 12
Pitch adjustment (-12 to 12).

sample_rate

one of

speed

number, required, default: 1, minimum: 0.5, maximum: 2
Speech speed (0.5 to 2).

text

string, required, maxLength: 10000
The text to convert to speech. Maximum 10,000 characters.

voice_id

string, required, default: English_expressive_narrator
The voice ID to use for synthesis.

volume

number, required, default: 1, minimum: 0, maximum: 10
Speech volume (0 to 10).
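
The input parameters listed above carry enums and numeric ranges that can be checked locally before a request is sent. The following is a rough sketch of such a check, not part of any Cloudflare or MiniMax SDK: the `SpeechInput` type and `validateSpeechInput` function are hypothetical names, and the allowed values for `sample_rate` are not listed in this section, so that field is left unchecked.

```typescript
// Enum values copied from the parameter list on this page.
const EMOTIONS = [
  "happy", "sad", "angry", "fearful", "disgusted", "surprised", "calm", "fluent",
] as const;
const FORMATS = ["mp3", "flac", "wav"] as const;

// Hypothetical input type mirroring the documented parameters.
interface SpeechInput {
  text: string;
  voice_id: string;
  format: string;
  pitch: number;
  speed: number;
  volume: number;
  emotion?: string;
  sample_rate?: number; // allowed values are not listed here, so not validated
}

// Returns a list of human-readable problems; an empty list means the input
// satisfies the documented constraints.
function validateSpeechInput(input: SpeechInput): string[] {
  const errors: string[] = [];
  if (input.text.length > 10000) {
    errors.push("text exceeds 10,000 characters");
  }
  if (!(FORMATS as readonly string[]).includes(input.format)) {
    errors.push("format must be one of mp3, flac, wav");
  }
  if (!Number.isInteger(input.pitch) || input.pitch < -12 || input.pitch > 12) {
    errors.push("pitch must be an integer from -12 to 12");
  }
  if (!(input.speed >= 0.5 && input.speed <= 2)) {
    errors.push("speed must be between 0.5 and 2");
  }
  if (!(input.volume >= 0 && input.volume <= 10)) {
    errors.push("volume must be between 0 and 10");
  }
  if (
    input.emotion !== undefined &&
    !(EMOTIONS as readonly string[]).includes(input.emotion)
  ) {
    errors.push("emotion is not one of the documented values");
  }
  return errors;
}
```

Returning a list rather than throwing on the first problem lets a caller report every invalid field at once.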

audio

string
URL to the generated audio file.
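
Because `text` is capped at 10,000 characters, longer scripts need to be split across several requests. One possible approach, sketched below under the assumption that breaking at sentence boundaries is acceptable for your content, is to cut each chunk at the last sentence end that fits, falling back to the last space and then to a hard cut. `splitText` is a hypothetical helper, and it measures length in UTF-16 code units, which may not match how the service counts characters.

```typescript
const MAX_TEXT_LENGTH = 10000; // documented maxLength for `text`

// Hypothetical helper: splits text into chunks no longer than maxLength.
function splitText(text: string, maxLength: number = MAX_TEXT_LENGTH): string[] {
  if (maxLength < 1) {
    throw new Error("maxLength must be at least 1");
  }
  const chunks: string[] = [];
  let rest = text.trim();
  while (rest.length > maxLength) {
    const head = rest.slice(0, maxLength);
    // Prefer the last sentence-ending punctuation followed by a space.
    const sentenceEnd = Math.max(
      head.lastIndexOf(". "),
      head.lastIndexOf("! "),
      head.lastIndexOf("? "),
    );
    const space = head.lastIndexOf(" ");
    // Keep the punctuation in the chunk; otherwise break at a space, else hard cut.
    // A hard cut may split a surrogate pair in text with no spaces.
    const cut = sentenceEnd > 0 ? sentenceEnd + 1 : space > 0 ? space : maxLength;
    chunks.push(rest.slice(0, cut).trim());
    rest = rest.slice(cut).trim();
  }
  if (rest.length > 0) {
    chunks.push(rest);
  }
  return chunks;
}
```

Each chunk can then be sent as its own request with the same `voice_id` and settings, and the resulting audio files joined in order.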
