Text-to-speech (TTS) converts written text into spoken audio. You write the script, and the tool reads it back in a natural-sounding voice — saving you the time of booking a recording studio, hiring a voice actor, or recording yourself over and over. Common applications include video narration, podcasts, audiobooks, social media short-form video, and accessibility read-aloud.

The biggest change over the past two years is audio quality. Early TTS was immediately recognizable as robotic. Tools like ElevenLabs and Murf can now produce voice quality close to a real human, with the ability to clone specific voices and generate multi-character dialogue. Below is a rundown of the major tools, the difference between free and paid tiers, Traditional Chinese and Taiwan-accent quality, and what to watch out for with commercial licensing.


What Is Text-to-Speech (TTS)?

The core idea is simple: input text, select a voice, generate an audio file. The meaningful differences come down to three things:

  • Voice naturalness: New-generation models sound close to human; older ones lean robotic.
  • Language and accent: Even within Chinese, Mainland Mandarin, Taiwan Mandarin, and Cantonese sound very different.
  • Advanced features: Voice cloning (copying a specific voice), multi-character dialogue, emotion and tone control.

This space is still moving fast in 2026 — pricing, free-tier limits, supported languages, and commercial terms change regularly. Always treat current official information as the authority.


Tool Overview

Tool | Positioning | Free Plan | Paid Starting Price | Chinese / Traditional Chinese
ElevenLabs | International realistic voice, cloning, multi-voice | 10,000 credits/month (≈10 min) | Starter US$6/month | Mandarin/Chinese officially confirmed; Traditional Chinese to be tested
Murf AI | Commercial voiceover studio | Free trial (no download) | Creator ≈ US$19/month | Includes Chinese (Taiwanese)
Yating | Taiwan-accent TTS | Free trial available | To be verified | Specializes in Taiwan accent
VoAI | Most Taiwan-accent voice options | Free trial available | To be verified | Taiwan-accent focus; supports multi-voice dialogue
TTSMaker | Free, downloadable, commercially usable | Permanent free (≈20,000 chars/week) | Pro pricing TBD | Includes Traditional Chinese
PlayAI (Play.ht) | Multi-voice dialogue, API | Free trial | To be verified | Includes Taiwan Chinese

(ElevenLabs and Murf figures come from their official pages. Yating and VoAI do not list paid pricing publicly on their official pages, so those entries are marked “to be verified.” In May 2026, PlayAI’s official page showed signs of possible service discontinuation while still showing a product entry point, so its status is unconfirmed. Always verify with the official source.)
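
To get a rough feel for how far a free quota stretches, the arithmetic can be sketched as below. It uses only two figures from the overview above (ElevenLabs: 10,000 credits/month ≈ 10 minutes; TTSMaker: ≈20,000 characters/week). Treating one credit as one character is an assumption made for illustration, not an official rate, and the function names are hypothetical. Check the current official pricing pages before relying on any of these numbers.

```python
# Rough quota arithmetic using figures quoted in the tool overview.
# ASSUMPTION: 1 credit == 1 character. Real billing may differ by model and plan.

ELEVENLABS_FREE_CREDITS_PER_MONTH = 10_000  # from the overview
ELEVENLABS_FREE_MINUTES_PER_MONTH = 10      # "≈10 min" from the overview
TTSMAKER_FREE_CHARS_PER_WEEK = 20_000       # "≈20,000 chars/week" from the overview


def estimated_minutes(char_count: int) -> float:
    """Estimate audio length from the implied rate (about 1,000 credits per minute)."""
    credits_per_minute = (
        ELEVENLABS_FREE_CREDITS_PER_MONTH / ELEVENLABS_FREE_MINUTES_PER_MONTH
    )
    return char_count / credits_per_minute


def fits_quota(char_count: int, quota_chars: int) -> bool:
    """Return True if a script of char_count characters fits inside the quota."""
    return char_count <= quota_chars


script = "Welcome to this week's episode. " * 100  # stand-in for a real script
n = len(script)
print(f"{n} characters, roughly {estimated_minutes(n):.1f} minutes of audio")
print("Fits ElevenLabs free month:", fits_quota(n, ELEVENLABS_FREE_CREDITS_PER_MONTH))
print("Fits TTSMaker free week:", fits_quota(n, TTSMAKER_FREE_CHARS_PER_WEEK))
```

The estimate is only as good as the "≈10 min" figure it is derived from. Chinese text in particular may be counted and paced differently from English, so a real test run is still the better guide.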


“Free” Means Three Different Things

A lot of people see “free” and assume they can start using it right away, and this is where most of them hit a wall. At minimum, break “free” into three layers:

  1. Can I listen? Almost always yes. Free previewing is rarely blocked.
  2. Can I download? This is where things split. Murf’s free trial does not allow downloads; NaturalReader’s free plan lets you listen but not export to mp3.
  3. Can I use it commercially? The most overlooked question. Narakeet’s free plan explicitly prohibits commercial use; TTSMaker’s free plan explicitly permits commercial use and does not require attribution.

So before you say “I’ll use the free plan,” figure out whether you need listening, downloading, or commercial rights. If you need all three at no cost, TTSMaker is one of the very few tools whose official documentation says exactly that.
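
The three layers can be written down as a small lookup, which makes it easier to see which free plans are confirmed to cover what you need. This is a minimal sketch: the `FreePlan` class and `plans_meeting` function are hypothetical names, and the values are transcribed only from the statements in this section. Anything the section does not state is recorded as `None` and treated as unconfirmed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FreePlan:
    tool: str
    can_listen: Optional[bool]            # None = not stated in this article
    can_download: Optional[bool]
    can_use_commercially: Optional[bool]


# Values transcribed from the article; None marks anything it does not state.
FREE_PLANS = [
    FreePlan("Murf", None, False, None),
    FreePlan("NaturalReader", True, False, None),
    FreePlan("Narakeet", None, None, False),
    FreePlan("TTSMaker", True, True, True),
]


def plans_meeting(listen: bool = False, download: bool = False,
                  commercial: bool = False) -> List[str]:
    """Return tools whose free plan is confirmed to cover every requested layer."""
    matches = []
    for plan in FREE_PLANS:
        if listen and plan.can_listen is not True:
            continue
        if download and plan.can_download is not True:
            continue
        if commercial and plan.can_use_commercially is not True:
            continue
        matches.append(plan.tool)
    return matches


print(plans_meeting(download=True, commercial=True))  # ['TTSMaker']
```

Unconfirmed entries are deliberately excluded from the results, which matches the advice to verify against official terms rather than assume.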


Traditional Chinese and Taiwan-Accented Mandarin

The biggest trap in Chinese TTS: a tool listing “Chinese” does not mean it sounds natural in Taiwan Mandarin. Many “Chinese” voices are trained on Mainland Mandarin and sound odd when reading colloquial Taiwan-style text.

For Taiwan content, these are the options worth testing first:

  • Yating and VoAI: Taiwan-based services that explicitly target Taiwan-accented Mandarin. VoAI also claims the largest collection of Taiwan-accent AI voice actors.
  • Murf (Chinese Taiwanese), Narakeet (Taiwanese Mandarin), TTSMaker (Traditional Chinese): international tools that separately list a Taiwan Mandarin option.

The most reliable test is to run the same Taiwan-style colloquial script through each tool and listen. No spec sheet beats a real listen.


Podcasts and Multi-Voice Dialogue

For dual-host podcasts or videos with character dialogue, the key factor is multi-voice capability:

  • ElevenLabs v3: officially advertises dialogue optimization and multi-voice support.
  • VoAI Text-to-MP3 Pro: officially states support for multi-character dialogue voiceover with a large pool of voice options.
  • PlayAI (Play.ht): has a multi-role dialogue feature (PlayDialog), but official service status is unconfirmed — hold off on using it as your primary tool.

Same advice here: test all candidates with an identical two-person dialogue script and listen for how natural the character switching and tone handoff sounds.
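
To keep the test script identical across tools, it helps to store the dialogue once and split it into speaker turns before pasting each part into a tool. The sketch below assumes a simple "Speaker: line" text format, which is a convention chosen here for illustration rather than something any of the tools require. The function names are hypothetical.

```python
from typing import List, Tuple


def parse_dialogue(script: str) -> List[Tuple[str, str]]:
    """Split a 'Speaker: line' script into (speaker, text) turns.

    A line without a 'Speaker:' prefix is appended to the previous turn.
    Limitation: a continuation line that itself contains a colon would be
    read as a new speaker, so keep colons out of spoken text.
    """
    turns: List[Tuple[str, str]] = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue
        speaker, sep, text = line.partition(":")
        if sep and speaker.strip() and text.strip():
            turns.append((speaker.strip(), text.strip()))
        elif turns:
            prev_speaker, prev_text = turns[-1]
            turns[-1] = (prev_speaker, prev_text + " " + line)
        # A leading line with no speaker prefix is skipped.
    return turns


def count_switches(turns: List[Tuple[str, str]]) -> int:
    """Count how many times the speaker changes between consecutive turns."""
    return sum(1 for a, b in zip(turns, turns[1:]) if a[0] != b[0])


SCRIPT = """\
Host A: Welcome back to the show.
Host B: Thanks! Today we compare TTS tools.
Host A: Let's start with the free plans.
"""

turns = parse_dialogue(SCRIPT)
for speaker, text in turns:
    print(f"[{speaker}] {text}")
print("Speaker switches to listen for:", count_switches(turns))
```

Each switch is a point where the voice handoff can sound unnatural, so a test script with several switches gives more to compare.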


Pairing: Use NotebookLM to Turn Material into a Script First

Before you can voice anything, you need the words. If you only have scattered notes or source material and haven’t written a proper script yet, NotebookLM is a good first step. Upload your sources — PDFs, web pages, notes — and ask it to “generate a script” or “organize this into a narration draft.” It turns the material into readable text, which you then take into any TTS tool above.

NotebookLM is a source-to-text organizer, not a voiceover tool. It fits best as the “write the script first” step before TTS. For a deeper look, see the complete NotebookLM guide.

Basic Workflow

  1. Decide on your use case first: video narration, podcast, audio content — and whether you need commercial rights.
  2. Pick a tool and voice: if Taiwan accent matters, start with Yating or VoAI; for international-quality realistic voice, try ElevenLabs.
  3. Paste the text, adjust speed and tone, generate a preview.
  4. If the output isn’t right, switch voices or tweak the settings and regenerate.
  5. Before downloading, check: can the free plan export files? Is commercial use allowed? Is attribution required?

Commercial Licensing — Check Before You Publish

This is the part most people skip. Free plans routinely block commercial use, file downloads, or both. Always verify against the current official terms before you publish anything:

  • Commercial rights: ElevenLabs and Murf officially state commercial use is included in paid plans; TTSMaker explicitly allows it on the free plan; Narakeet’s free plan does not allow commercial use.
  • Voice cloning compliance: Cloning someone else’s voice touches portrait rights and personality rights. Always get explicit consent — never clone a celebrity or another person’s voice without permission.
  • Disclosure: For AI-generated audio content, it’s good practice to label it as AI-generated and follow each platform’s content policies.
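
The checks above can be gathered into a short pre-publish checklist. This is a sketch only: the `PublishCheck` fields and both function names are hypothetical, and you fill in the answers yourself from each tool's current official terms. The code does not look anything up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PublishCheck:
    commercial_project: bool       # will the audio be used commercially?
    plan_allows_commercial: bool   # per the tool's current official terms
    plan_allows_download: bool
    attribution_required: bool
    attribution_included: bool
    uses_cloned_voice: bool
    has_voice_owner_consent: bool
    labeled_as_ai: bool


def blocking_issues(c: PublishCheck) -> List[str]:
    """List problems that should be resolved before publishing."""
    issues = []
    if not c.plan_allows_download:
        issues.append("plan does not allow file export")
    if c.commercial_project and not c.plan_allows_commercial:
        issues.append("plan does not allow commercial use")
    if c.attribution_required and not c.attribution_included:
        issues.append("attribution is required but missing")
    if c.uses_cloned_voice and not c.has_voice_owner_consent:
        issues.append("cloned voice used without explicit consent")
    return issues


def recommendations(c: PublishCheck) -> List[str]:
    """List good-practice items that are not hard blockers."""
    if c.labeled_as_ai:
        return []
    return ["consider labeling the audio as AI-generated"]


check = PublishCheck(
    commercial_project=True,
    plan_allows_commercial=False,
    plan_allows_download=True,
    attribution_required=False,
    attribution_included=False,
    uses_cloned_voice=False,
    has_voice_owner_consent=False,
    labeled_as_ai=False,
)
print(blocking_issues(check))
print(recommendations(check))
```

Disclosure is kept separate from the blockers because the article frames it as good practice, while each platform's own policy may make it mandatory.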

How to Choose

  • Want international-quality voice, voice cloning, or multi-voice dialogue → ElevenLabs.
  • Need a stable commercial voiceover studio workflow → Murf.
  • Prioritize Taiwan-accented Mandarin and natural Traditional Chinese → start with Yating and VoAI.
  • Need free, downloadable, and commercially usable → TTSMaker.

Pricing, free-tier limits, and Taiwan availability change quickly across all of these. Check the official pages before committing, and run your own script through any tool you’re serious about.


— Penchan