Korean
TTS Voices

Korean text-to-speech voices with even syllable timing

Providers: Telnyx, InWorld, MiniMax, Rime, Azure, AWS

Top 7 TTS voices for Korean

Name	Provider
Mimi - Show Stopper	telnyx
Cheerful Little Sister	minimax
Minho - Friendly Spirit	telnyx
Seoyeon	aws
Junho MAI-Voice-2	azure
Minji	inworld
Hyunwoo	inworld

Test Korean voices

[ VOICE AI PLATFORM ]

From text to talk.
Pick your path.

Call our TTS & STT endpoints directly, wire voice into LiveKit rooms with one plug-in, or spin up an AI assistant on a real phone number.

TTS & STT Endpoints

Production-grade streaming and batch TTS/STT. Low latency, 50+ languages, customizable voices, and SDKs for Node/Python/Browser.

›Streaming for live apps
›Multi-speaker diarization & punctuation
›SDKs, code samples, and latency benchmarks

TTS — CURL
$ curl -X POST \
  ".../v1/tts" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "voice": "alloy_female_v1",
    "language": "en-US",
    "format": "mp3",
    "text": "Hello, welcome..."
  }' \
  --output speech.mp3

Sends text to the TTS endpoint and saves the synthesized audio as an MP3 file.
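For Korean text, the same request can be assembled in Python. This is a minimal sketch, not an official SDK call: the helper name `build_tts_request` is hypothetical, the endpoint URL is left as a caller-supplied argument because the example above elides it, and the field names are simply copied from the curl body. It only builds the request, so nothing is sent until you pass it to `urllib.request.urlopen`.

```python
import json
import urllib.request


def build_tts_request(endpoint_url, api_key, text, voice, language, audio_format="mp3"):
    """Build (but do not send) a POST request mirroring the curl example.

    endpoint_url is supplied by the caller; the field names below are
    copied from the curl body and are otherwise an assumption.
    """
    payload = {
        "voice": voice,
        "language": language,
        "format": audio_format,
        "text": text,
    }
    # ensure_ascii=False keeps Hangul as real characters; the body is UTF-8 bytes.
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
```

A caller would pass its own endpoint URL and key, a voice ID such as the `alloy_female_v1` shown above, and a language tag such as `ko-KR` (an assumption here: the page does not state which tags the endpoint accepts), then write the response bytes to a file.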

View TTS docs →

LiveKit Plug-in

Plug our real-time speech pipeline into LiveKit rooms — transcribe live sessions, synthesize responses and stream audio back into the room.

›One-line install, example room demo
›WebRTC + server bridge patterns
›Works in browser & mobile

LIVEKIT — NODE.JS
import { Room } from "livekit-client";
import { TelnyxSpeechPlugin } from "@telnyx/livekit-plugin";

// URL and token stand for your LiveKit server URL and access token.
const room = new Room();
await room.connect(URL, token);

// Attach real-time TTS/STT to the connected room.
const plugin = new TelnyxSpeechPlugin({
  apiKey: process.env.TELNYX_API_KEY,
  voice: "alloy_female_v1",
});
plugin.attach(room);

Connects to a LiveKit room and attaches real-time TTS/STT — transcribes audio in, synthesizes audio out.

Try LiveKit demo →

AI-Assistants (Phone)

Deploy a phone-number based AI assistant in minutes — inbound/outbound calls, IVR, call recording, and DTMF support.

›Purchase & map a phone number
›Templates: Support Bot, Sales Assistant, Reminder Bot
›PSTN reliability & compliance tools

AI-ASSISTANT — CURL
$ curl -X POST \
  ".../v1/assistants" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Support Bot",
    "phone_number": "+18005551234",
    "voice": "alloy_female_v1",
    "system_prompt": "You are a helpful support agent.",
    "capabilities": ["inbound", "recording", "dtmf"]
  }'

Creates an AI assistant bound to a phone number with inbound call handling, recording, and DTMF support.
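Before sending a request like the one above, it can help to validate the payload locally. The sketch below is illustrative only: `build_assistant_payload` is a hypothetical helper, the field names are copied from the curl body, and the allowed capability set contains just the three values that appear in that example (the real API may accept more). The phone number check follows the E.164 shape: a leading `+` and at most 15 digits, not starting with 0.

```python
import json
import re

# E.164 shape: "+", a non-zero first digit, at most 15 digits in total.
E164_PATTERN = re.compile(r"\+[1-9][0-9]{1,14}")

# Assumption: only the capability names shown in the curl example.
KNOWN_CAPABILITIES = {"inbound", "recording", "dtmf"}


def build_assistant_payload(name, phone_number, voice, system_prompt, capabilities):
    """Return the JSON body as a string, or raise ValueError on bad input."""
    if not E164_PATTERN.fullmatch(phone_number):
        raise ValueError(f"not an E.164 number: {phone_number!r}")
    unknown = set(capabilities) - KNOWN_CAPABILITIES
    if unknown:
        raise ValueError(f"unknown capabilities: {sorted(unknown)}")
    return json.dumps(
        {
            "name": name,
            "phone_number": phone_number,
            "voice": voice,
            "system_prompt": system_prompt,
            "capabilities": list(capabilities),
        },
        ensure_ascii=False,
    )
```

Serializing with `json.dumps` also avoids the hand-written JSON pitfalls of a shell one-liner, such as a raw line break inside a string value.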

Create your assistant →

Language	Native name	TTS voices
Spanish	Español	294
French	Français	98
German	Deutsch	82
Indonesian	Bahasa Indonesia	31
Italian	Italiano	51
Japanese	日本語	85
Korean	한국어	171
Portuguese	Português	277
Russian	Русский	34
Chinese	中文	189

Korean phonology and prosody

Every syllable gets equal time

English is stress-timed: speakers compress unstressed syllables and stretch stressed ones, creating a bouncy strong-weak alternation. Korean is generally described as syllable-timed: each syllable receives roughly the same duration and energy, producing an even, staccato cadence in which nothing is swallowed or rushed. A TTS engine trained on English stress timing will impose prominence where Korean expects none, so its output sounds foreign immediately. Natural Korean synthesis requires inference tuned for syllable-level uniformity, running where the audio is processed rather than handed off across providers mid-stream.
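The even-timing idea can be made concrete with a small sketch. Precomposed Hangul syllables occupy the Unicode range U+AC00 to U+D7A3, so counting them is a simple range check. The per-syllable duration used here is a hypothetical parameter chosen for illustration, not a measured value or a setting of any TTS engine.

```python
HANGUL_FIRST = 0xAC00  # 가, first precomposed Hangul syllable
HANGUL_LAST = 0xD7A3   # 힣, last precomposed Hangul syllable


def count_hangul_syllables(text):
    """Count precomposed Hangul syllable blocks; other characters are ignored."""
    return sum(1 for ch in text if HANGUL_FIRST <= ord(ch) <= HANGUL_LAST)


def uniform_duration_ms(text, ms_per_syllable=180):
    """Rough length estimate if every syllable took the same time.

    ms_per_syllable is an illustrative assumption, not a measured rate.
    """
    return count_hangul_syllables(text) * ms_per_syllable
```

Under this simplification the estimate grows linearly with syllable count, which is the opposite of a stress-timed model where unstressed syllables would be shortened.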

Vowels that refuse to reduce

In English, unstressed vowels collapse toward [ə]: the second syllable of "sofa," the first of "about." Korean vowels stay stable regardless of position; there is no systematic centralization or weakening tied to prominence. Where English TTS learns to blur unstressed vowels as a core feature of naturalness, a Korean pipeline must do the opposite and maintain full vowel quality on every syllable. Getting this wrong produces output that sounds like an English accent imposed on Korean. Rendering vowels this consistently calls for models and audio processing co-located on the same infrastructure, not routed between separate speech and telephony systems.
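Because every Hangul syllable block encodes exactly one vowel, the vowel a pipeline must preserve can be read straight from the code point. The sketch below uses the standard Unicode arithmetic for precomposed syllables (19 initial consonants, 21 vowels, 28 final slots); the function name `vowels_of` is hypothetical and the output is only a text-level view, not an acoustic analysis.

```python
S_BASE = 0xAC00          # first precomposed Hangul syllable
S_LAST = 0xD7A3          # last precomposed Hangul syllable
V_COUNT, T_COUNT = 21, 28
N_COUNT = V_COUNT * T_COUNT  # 588 syllables per initial consonant

# The 21 medial vowels in Unicode order, as compatibility jamo for display.
MEDIAL_VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"


def vowels_of(text):
    """Return the medial vowel of each precomposed Hangul syllable in text."""
    result = []
    for ch in text:
        code = ord(ch)
        if S_BASE <= code <= S_LAST:
            # Drop the initial-consonant part, then divide out the final slot.
            vowel_index = ((code - S_BASE) % N_COUNT) // T_COUNT
            result.append(MEDIAL_VOWELS[vowel_index])
    return result
```

For 안녕하세요 this yields one full vowel per syllable, with no position treated as reducible.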

Pitch at the phrase, not the word

English intonation rides on lexical stress: pitch peaks land on stressed syllables, tying melody tightly to individual words. Korean intonation operates at the phrase level, using boundary tones and phrase-final pitch movements rather than word-internal prominence to signal questions, focus, and emotion. To an English ear, Korean can sound flat; to a Korean ear, it is precisely contoured. A voice AI system that maps English prosodic patterns onto Korean output misplaces every melodic cue. Reproducing phrase-level pitch contours calls for co-located inference, where synthesis and telephony share the same network and no inter-provider hops degrade the audio.
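As a toy illustration of pitch being assigned per phrase rather than per word, the sketch below splits text at sentence punctuation and attaches a single boundary-tone label to the end of each phrase. The labels loosely borrow the H% (rising) and L% (falling) notation from intonation labeling; the rule "question mark means H%, anything else means L%" is a deliberate oversimplification for illustration, and real Korean boundary tones are richer than this.

```python
import re


def label_phrases(text):
    """Split text into phrases and tag each with one phrase-final tone label.

    Heuristic only: "?" maps to "H%" (rising), everything else to "L%".
    """
    labelled = []
    # Each match is a run of non-terminal characters plus an optional . ? or !
    for phrase in re.findall(r"[^.?!]+[.?!]?", text):
        phrase = phrase.strip()
        if not phrase:
            continue
        tone = "H%" if phrase.endswith("?") else "L%"
        labelled.append((phrase, tone))
    return labelled
```

Note that no label is attached to individual words inside a phrase, which is the structural point of the paragraph above.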

