ElevenLabs has introduced Eleven v3 (alpha), its most expressive text-to-speech model

ElevenLabs bills it as its most expressive text-to-speech model to date.

It supports 70+ languages, a multi-voice mode, and, new in this release, audio tags that control intonation, emotion, and even pauses in speech.

The new architecture understands text and context better, producing natural, lifelike audio.

What Eleven v3 can do:

• Generate realistic dialog with multiple voices

• Convey emotional transitions

• React to context and shift tone mid-speech

The model is controlled through audio tags written inline in the text:

- Emotions: [sad], [angry], [happily]

- Delivery: [whispers], [shouts]

- Reactions: [laughs], [sighs], [clears throat]
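
To show what tagged input might look like, here is a minimal Python sketch that builds a script from the tags listed above and checks them before the text is sent anywhere. The helper names (tagged, find_tags, strip_tags) and the KNOWN_TAGS set are hypothetical and exist only for this illustration. The sketch assumes a tag is placed in square brackets directly before the text it affects, and it does not call any ElevenLabs API.

```python
import re

# Tags named in this post only; the complete set the model accepts
# is an open question here, so treat this list as an assumption.
KNOWN_TAGS = {
    "sad", "angry", "happily",            # emotions
    "whispers", "shouts",                 # delivery
    "laughs", "sighs", "clears throat",   # reactions
}

# Matches one bracketed tag such as "[sighs]" and captures its name.
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def tagged(tag: str, text: str) -> str:
    """Prefix text with an audio tag, rejecting tags not listed above."""
    if tag not in KNOWN_TAGS:
        raise ValueError(f"unknown audio tag: {tag!r}")
    return f"[{tag}] {text}"


def find_tags(script: str) -> list[str]:
    """Return every bracketed tag in the script, in order of appearance."""
    return TAG_PATTERN.findall(script)


def strip_tags(script: str) -> str:
    """Remove the tags and collapse whitespace, leaving only the spoken words."""
    return " ".join(TAG_PATTERN.sub(" ", script).split())


script = " ".join([
    tagged("whispers", "I think someone is in the house."),
    tagged("sighs", "Never mind, it was the cat."),
    tagged("laughs", "I really need more sleep."),
])

if __name__ == "__main__":
    print(script)
    print(find_tags(script))
    print(strip_tags(script))
```

Checking tags up front catches typos such as [whisper] before any audio is generated, and strip_tags gives a plain transcript of the same script.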

ElevenLabs says the public API will be rolled out very soon.

This is a preview version, so prompts may need some tuning, but the results are really impressive.