How to Install and Run Coqui TTS on Mac for High-Quality Voice Synthesis

Coqui TTS is a powerful open-source text-to-speech system that supports many languages, including Spanish, German, Chinese, and Japanese. In this guide, we’ll walk through installing it on your Mac, resolving a compatibility issue with PyTorch, and generating natural-sounding Spanish speech step by step.


🗣️ Bonus Tip: Preview Built-in macOS Spanish Voices

Before diving into Coqui TTS, you might want to explore what’s already built into macOS.

👉 To list all available voices on macOS:

say -v "?"

You’ll see a list like:

...
Jorge        es_ES    # Spanish (Spain)
Juan         es_MX    # Spanish (Mexico)
Paulina      es_MX    # Spanish (Mexico)
Monica       es_ES    # Spanish (Spain)
...

To test one of these voices:

say -v "Jorge" "Hola, ¿cómo estás?"
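If you'd rather filter that list than scroll through it, here is a minimal Python sketch that picks out the voices for one language. It assumes each line looks like the sample above (name, locale, then a `#` comment). The helper name `voices_for_language` is made up for this example and is not part of macOS or Coqui TTS.

```python
import re
import shutil
import subprocess

# Matches lines such as "Jorge        es_ES    # Spanish (Spain)".
# The name is matched lazily so multi-word names keep their inner spaces.
VOICE_LINE = re.compile(r"^(?P<name>.+?)\s+(?P<locale>[a-z]{2,3}_[A-Za-z0-9]+)\s+#")


def voices_for_language(listing: str, language: str) -> list[tuple[str, str]]:
    """Return (name, locale) pairs whose locale starts with `language`."""
    matches = []
    for line in listing.splitlines():
        found = VOICE_LINE.match(line)
        if found and found.group("locale").split("_")[0] == language:
            matches.append((found.group("name"), found.group("locale")))
    return matches


if __name__ == "__main__":
    # `say` only exists on macOS, so skip quietly on other systems.
    if shutil.which("say"):
        listing = subprocess.run(
            ["say", "-v", "?"], capture_output=True, text=True, check=True
        ).stdout
        for name, locale in voices_for_language(listing, "es"):
            print(f"{name} ({locale})")
```

Lines that don't fit the assumed layout are simply skipped, so the exact output depends on which voices your macOS version ships.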

While macOS voices are good, they are not open-source and can’t be customized. That’s where Coqui TTS comes in.

✅ Prerequisites

  • An Apple Silicon Mac (M1, M2, M3, or M4) running macOS
  • Conda installed (via Miniconda or Homebrew)

🔧 Step 1: Create a Conda Environment

We’ll use Python 3.10 for compatibility.

conda create -n coqui-tts python=3.10 -y
conda activate coqui-tts
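Before installing anything, it can help to confirm that the activated environment is really the one you expect. The sketch below checks the Python version and whether the interpreter runs natively on Apple Silicon. It assumes a native build reports `arm64` from `platform.machine()` (an interpreter running under Rosetta would typically report `x86_64`). The function name `check_environment` is just an illustration.

```python
import platform
import sys


def check_environment(version=None, machine=None, expected=(3, 10)):
    """Return a list of human-readable problems; an empty list means all good."""
    version = tuple(version or sys.version_info[:2])
    machine = machine or platform.machine()
    problems = []
    if version[:2] != expected:
        problems.append(
            f"Python {version[0]}.{version[1]} is active, "
            f"expected {expected[0]}.{expected[1]}"
        )
    if machine != "arm64":
        problems.append(f"Interpreter architecture is {machine}, not arm64")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    print("Environment looks fine." if not issues else "\n".join(issues))
```

Run it with the `coqui-tts` environment active. If it reports a different Python version, the environment probably wasn't activated in this shell.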

📦 Step 2: Install Coqui TTS with Full Features

Install the TTS library along with development and notebook extras for full support:

pip install "TTS[all,dev,notebooks]"

⚠️ Step 3: Fix the PyTorch Compatibility Issue

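Starting with PyTorch 2.6, torch.load defaults to weights_only=True, which refuses to unpickle anything other than plain tensors and a short allowlist of types. Coqui TTS checkpoints such as Tacotron2 contain custom pickled objects, so loading them on PyTorch 2.6 or newer fails with a pickle.UnpicklingError. The simplest fix is to downgrade PyTorch to version 2.5.1: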

pip uninstall torch torchaudio -y
pip install torch==2.5.1 torchaudio==2.5.1
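To confirm the downgrade took effect, you can check the installed version from Python. This is a small sketch that reads the version from package metadata rather than importing torch, and flags anything at 2.6 or above. The helper name `blocks_legacy_checkpoints` is invented for this example.

```python
import re
from importlib import metadata


def blocks_legacy_checkpoints(torch_version: str) -> bool:
    """Return True if this PyTorch version is 2.6 or newer."""
    match = re.match(r"(\d+)\.(\d+)", torch_version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {torch_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (2, 6)


if __name__ == "__main__":
    try:
        installed = metadata.version("torch")
    except metadata.PackageNotFoundError:
        print("torch is not installed in this environment.")
    else:
        needs_fix = blocks_legacy_checkpoints(installed)
        status = "needs the downgrade" if needs_fix else "is fine"
        print(f"torch {installed} {status}")
```

Only the major and minor numbers are compared, so local build suffixes such as `+cpu` don't affect the result.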

📥 Step 4: Download and Use a Spanish TTS Model

We’ll use the pre-trained Spanish model tts_models/es/mai/tacotron2-DDC.

Run this command to synthesize Spanish audio:

tts --text "Hola, ¿cómo estás?" \
    --model_name tts_models/es/mai/tacotron2-DDC \
    --out_path hola.wav

✅ This will:

  • Download the Spanish TTS model
  • Download a compatible vocoder (MelGAN)
  • Generate hola.wav with natural-sounding Spanish

To play the audio:

afplay hola.wav
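If you want to sanity-check the file without listening to it, Python's standard `wave` module can report its basic properties. This sketch assumes the output is an ordinary PCM WAV file, and `wav_summary` is a name chosen for this example.

```python
import wave
from pathlib import Path


def wav_summary(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        return rate, wav.getnchannels(), wav.getnframes() / rate


if __name__ == "__main__":
    target = Path("hola.wav")
    if target.exists():
        rate, channels, seconds = wav_summary(target)
        print(f"{target}: {rate} Hz, {channels} channel(s), {seconds:.2f} s")
    else:
        print(f"{target} not found; run the tts command first.")
```

A duration close to zero, or an error when opening the file, usually means synthesis did not finish cleanly.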

🛠 Optional: Use Python Script Instead of CLI

Here’s a Python version if you prefer using code:

from TTS.api import TTS

tts = TTS(model_name="tts_models/es/mai/tacotron2-DDC")
tts.tts_to_file(text="Hola, ¿cómo estás?", file_path="output.wav")
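One practical advantage of the Python route is that the model is loaded once and can be reused for many sentences. Here is a rough sketch of a batch helper built on the same `tts_to_file` call shown above. The function name `synthesize_all`, the `tts_out` folder, and the `line_001.wav` naming scheme are all assumptions made for illustration.

```python
from pathlib import Path


def synthesize_all(tts, sentences, out_dir="tts_out"):
    """Write one WAV per sentence using an already-loaded TTS object."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for index, sentence in enumerate(sentences, start=1):
        file_path = out_dir / f"line_{index:03d}.wav"
        # Same call as in the single-sentence example above.
        tts.tts_to_file(text=sentence, file_path=str(file_path))
        written.append(file_path)
    return written


if __name__ == "__main__":
    try:
        from TTS.api import TTS
    except ImportError:
        print("Coqui TTS is not installed in this environment.")
    else:
        tts = TTS(model_name="tts_models/es/mai/tacotron2-DDC")
        for path in synthesize_all(tts, ["Buenos días.", "¿Dónde está la biblioteca?"]):
            print(f"Wrote {path}")
```

Because the helper only relies on `tts_to_file`, it should work with any model that accepts plain `text` and `file_path` arguments.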

🚀 Bonus: Try Higher Quality with Voice Cloning (xtts_v2)

For even better quality and multilingual voice cloning (optional):

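from TTS.api import TTS

# Load the multilingual XTTS v2 model (downloaded on first use).
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
tts.tts_to_file(
    text="Hola, ¿cómo estás?",
    speaker_wav="your-voice-sample.wav",  # Short reference recording of the voice to clone
    language="es",
    file_path="xtts_output.wav"
)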

✅ Final Thoughts

With this setup:

  • You’re running Coqui TTS locally on your Mac
  • You’re using high-quality Spanish voices
  • You’ve resolved the PyTorch compatibility issue by pinning version 2.5.1
  • You’re optionally set up for voice cloning and multilingual synthesis
