How to Set Up Hindi TTS on Mac Apple Silicon: Step-by-Step Guide

If you want to run Hindi Text-to-Speech (TTS) locally on an Apple Silicon Mac (M1, M2, M3, or M4) without a dedicated GPU, this guide walks you through setting up Fastspeech2_HS, a multilingual TTS system developed by IIT Madras.

✅ Prerequisites

Mac with Apple Silicon (M1, M2, M3 or M4)
Miniconda installed
Basic knowledge of terminal and Python

🧪 Step 1: Clone the Repository

git clone https://github.com/smtiitm/Fastspeech2_HS.git
cd Fastspeech2_HS

🧱 Step 2: Create the Conda Environment

Create a file named environment_m1.yml and paste the following content:

name: tts-hs-hifigan
channels:
  - conda-forge
  - apple
dependencies:
  - python=3.10
  - pip
  - numpy
  - scipy
  - matplotlib
  - pandas
  - librosa
  - tqdm
  - pyyaml
  - pytorch
  - torchaudio
  - torchvision

Now create the environment:

conda env create -f environment_m1.yml
conda activate tts-hs-hifigan
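
Before installing anything else, it can help to confirm that the activated environment really runs Python 3.10 as a native arm64 build rather than an x86_64 build under Rosetta. The snippet below is a minimal sketch; the function name environment_problems is made up for illustration and is not part of the repo.

```python
import platform
import sys


def environment_problems(version_info, machine):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    # The environment file above pins python=3.10.
    if tuple(version_info[:2]) != (3, 10):
        problems.append(
            f"expected Python 3.10, found {version_info[0]}.{version_info[1]}"
        )
    # Native Apple Silicon interpreters report 'arm64'; 'x86_64' suggests Rosetta.
    if machine != "arm64":
        problems.append(f"expected an arm64 interpreter, found {machine}")
    return problems


if __name__ == "__main__":
    issues = environment_problems(sys.version_info, platform.machine())
    print("Environment looks right." if not issues else "\n".join(issues))
```

Run it with the environment activated. If it reports x86_64, the Miniconda install itself is probably the Intel build.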

📦 Step 3: Install Python Dependencies via pip

Inside the activated environment, run:

pip install phonemizer g2p_en unidecode soundfile flask nltk jamo sentencepiece inflect numba h5py pydub resampy pyworld
pip install typeguard==2.13.3
pip install --upgrade scipy

🛠️ Step 4: Fix Compatibility Issues

Fix 1: `scipy.signal.kaiser` Import Issue

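Recent SciPy releases no longer export kaiser from scipy.signal, so the import in espnet2's pqmf.py fails after the upgrade in Step 3. Open that file inside your environment's site-packages. The path below is from the author's machine; replace /Users/zahir/miniconda3 with your own Miniconda location: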

vim /Users/zahir/miniconda3/envs/tts-hs-hifigan/lib/python3.10/site-packages/espnet2/gan_tts/melgan/pqmf.py

Replace:

from scipy.signal import kaiser

With the NumPy equivalent (importing kaiser from scipy.signal.windows also works):

from numpy import kaiser
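
If you would rather not hunt for the file by hand, the same edit can be scripted. This is a rough sketch, assuming espnet2 is installed in the active environment and that pqmf.py sits at gan_tts/melgan/pqmf.py inside it, as the path above suggests. The helper names and the --apply flag are my own, not part of espnet2 or the repo.

```python
import importlib.util
import sys
from pathlib import Path

OLD_IMPORT = "from scipy.signal import kaiser"
NEW_IMPORT = "from numpy import kaiser"


def patch_source(source):
    """Swap the SciPy import for the NumPy one; all other text is untouched."""
    return source.replace(OLD_IMPORT, NEW_IMPORT)


def find_pqmf():
    """Locate pqmf.py inside the installed espnet2 package, or return None."""
    spec = importlib.util.find_spec("espnet2")
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        candidate = Path(location) / "gan_tts" / "melgan" / "pqmf.py"
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    target = find_pqmf()
    if target is None:
        print("espnet2 or pqmf.py not found in this environment.")
    else:
        original = target.read_text(encoding="utf-8")
        patched = patch_source(original)
        changed = patched != original
        print(f"{target}: {'needs patch' if changed else 'already fine'}")
        # Dry run by default; write only when --apply is passed explicitly.
        if changed and "--apply" in sys.argv:
            target.write_text(patched, encoding="utf-8")
            print("Patched.")
```

Running the script without arguments only reports what it found; add --apply to actually rewrite the file.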

Fix 2: `wordparse()` Crash on Invalid Word

In text_preprocess_for_inference.py, before this line:

parsed_word = wordparse(word, 0, 0, 1)

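Add this guard at the same indentation as that line, so empty tokens and tokens that do not start with a letter are skipped instead of crashing the parser: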

if not word or not word[0].isalpha():
    continue
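
To see what the guard lets through, here is a small standalone sketch. It does not call wordparse(); it only mirrors the condition above on a hand-made token list, and both function names are hypothetical.

```python
def should_skip(word):
    """Mirror the guard: skip empty tokens and tokens not starting with a letter."""
    return not word or not word[0].isalpha()


def keep_parsable(words):
    """Return only the tokens that would still reach wordparse() after the guard."""
    return [w for w in words if not should_skip(w)]


if __name__ == "__main__":
    # "\u2014" is an em dash; "" and "123" are the other kinds of token dropped.
    tokens = ["नमस्ते", "", "\u2014", "123", "है"]
    print(keep_parsable(tokens))
```

Python's str.isalpha() treats Devanagari letters as alphabetic, so ordinary Hindi words pass, while digits and punctuation-only tokens are dropped.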

🧠 Step 5: Install Missing Modules

The repo also requires two modules that are not listed in environment.yml:

indic-num2words

pip install indic-num2words

indic-unified-parser

Since the GitHub link is broken, install it from PyPI:

pip install indic-unified-parser

📥 Step 6: Download the Model Files

By default, when you run:

git clone https://github.com/smtiitm/Fastspeech2_HS.git

you may not get the full set of model files, especially for Hindi (hindi/female/model, hindi/male/model). If the model folders are empty or missing, follow these steps:

✅ 1. Try a Git Pull to Update the Repo

Navigate into the project folder:

cd Fastspeech2_HS
git pull

This may fetch the missing model directories if they exist in the default branch.

🔄 2. Check Other Branches for Model Files

Sometimes, the model files are available in another branch of the repository. To check available branches:

git branch -r

Then switch to the desired branch:

git checkout <branch-name>
git pull

📌 Replace <branch-name> with the branch that contains the models. Names such as models or hindi are only illustrative guesses; use whatever git branch -r actually lists (or main if you are not already on it).
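
If you want to loop over the branches from a script, the output of git branch -r is easy to parse. The sketch below is only an illustration with a made-up helper name; it skips the symbolic HEAD pointer line that git prints alongside the real branches.

```python
import subprocess


def parse_remote_branches(output):
    """Turn `git branch -r` output into a list of names like 'origin/main'."""
    branches = []
    for line in output.splitlines():
        name = line.strip()
        # Skip blank lines and the 'origin/HEAD -> origin/main' pointer.
        if not name or "->" in name:
            continue
        branches.append(name)
    return branches


if __name__ == "__main__":
    try:
        result = subprocess.run(
            ["git", "branch", "-r"], capture_output=True, text=True, check=True
        )
        for branch in parse_remote_branches(result.stdout):
            print(branch)
    except (OSError, subprocess.CalledProcessError) as exc:
        print(f"Could not list branches: {exc}")
```

Run it from inside the Fastspeech2_HS folder.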

🔍 3. Check Model File Locations on GitHub

To manually verify model availability on GitHub.com:

Go to the repo: https://github.com/smtiitm/Fastspeech2_HS
Click the branch dropdown on the top left
Switch between branches and navigate to hindi/female/model/ and hindi/male/model/
If you see files like model.pth, config.yaml, and .npz files, that’s the correct branch to pull from.

✅ After downloading, the models should be located inside:

Fastspeech2_HS/hindi/female/model/

Fastspeech2_HS/hindi/male/model/
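
A quick way to confirm the folders are really populated is a short script like the one below. It is a sketch with hypothetical helper names. The Git LFS check rests on an assumption that this guide does not confirm, namely that the repo may store large model files through Git LFS; in that case a clone without LFS leaves tiny text stubs that begin with a fixed header line.

```python
from pathlib import Path

# First line of a Git LFS pointer file (assumption: the repo may use LFS).
LFS_MAGIC = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """True if the file is a small Git LFS pointer stub rather than real data."""
    with open(path, "rb") as handle:
        return handle.read(len(LFS_MAGIC)) == LFS_MAGIC


def check_model_dir(model_dir):
    """Return a short status string for one model directory."""
    model_dir = Path(model_dir)
    if not model_dir.is_dir():
        return "missing"
    files = [p for p in model_dir.rglob("*") if p.is_file()]
    if not files:
        return "empty"
    if any(is_lfs_pointer(p) for p in files):
        return "contains LFS pointer stubs"
    return "ok"


if __name__ == "__main__":
    # Run from inside the Fastspeech2_HS folder.
    for gender in ("female", "male"):
        folder = Path("hindi") / gender / "model"
        print(f"{folder}: {check_model_dir(folder)}")
```

Anything other than "ok" means the steps above did not bring down the real model files yet.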

You’re now ready to run inference!

✅ Step 7: Verify Installation

Run this to ensure all critical imports work:

python -c "import torch; import librosa; import phonemizer; print('✅ All good!')"
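
For a broader check, the sketch below looks up several of the modules used in this guide and reports which ones are missing. The module list is my own selection; espnet2 is included only because Fix 1 edits a file inside it.

```python
import importlib.util

# Top-level modules this guide relies on; 'espnet2' is inferred from Fix 1.
REQUIRED = [
    "torch", "torchaudio", "librosa", "phonemizer",
    "soundfile", "scipy", "espnet2",
]


def missing_modules(names):
    """Return the names that cannot be found, without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("All modules found." if not missing else "Missing: " + ", ".join(missing))
```

Note that this only checks that each package is installed. It does not execute the imports, so a broken import inside a package (like the kaiser issue from Fix 1) would still only show up at inference time.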

🔊 Step 8: Run Hindi TTS Inference

Here’s how to generate a female Hindi voice sample:

python inference.py \
  --sample_text "नमस्ते, यह एक महिला आवाज़ में हिंदी टेक्स्ट टू स्पीच डेमो है।" \
  --language hindi \
  --gender female \
  --alpha 1 \
  --output_file hindi_female_output.wav

Replace the --sample_text value with your own Hindi sentence.
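
If you plan to generate many clips, wrapping the command in a small Python helper saves retyping. This is a sketch that only reuses the flags shown above; build_command and synthesize are hypothetical names, not functions from the repo, and the script is assumed to run from the Fastspeech2_HS folder.

```python
import subprocess
import sys


def build_command(text, output_file, language="hindi", gender="female", alpha=1):
    """Assemble the inference.py call using the flags from the command above."""
    return [
        sys.executable, "inference.py",
        "--sample_text", text,
        "--language", language,
        "--gender", gender,
        "--alpha", str(alpha),
        "--output_file", output_file,
    ]


def synthesize(text, output_file, **options):
    """Run inference.py from the repo root; raises CalledProcessError on failure."""
    subprocess.run(build_command(text, output_file, **options), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, as a dry run.
    print(" ".join(build_command("नमस्ते", "hello.wav")))
```

Passing the arguments as a list avoids shell quoting problems with Devanagari text and punctuation.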

🧪 Sample Longer Sentence

python inference.py \
  --sample_text "एक गरीब किसान को रास्ते में सोने की थैली मिली, उसने लालच न करके मालिक को लौटा दी; ईमानदारी से प्रभावित होकर मालिक ने इनाम दिया, और गाँव वालों ने उसे सम्मानित किया — यह दिखाता है कि सच्चाई हमेशा जीतती है।" \
  --language hindi \
  --gender female \
  --alpha 1 \
  --output_file hindi_female_output.wav

⚠️ Note: The parser may fail on some special characters, such as the em dash in the sentence above. If you hit errors, remove such characters or replace them with a comma or a space.
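
One way to handle this up front is to clean the text before passing it to --sample_text. The sketch below swaps en and em dashes for a comma; that choice of replacement is my own assumption and the function name is hypothetical, so extend the pattern if other characters cause trouble.

```python
import re

# En dash (U+2013) and em dash (U+2014), with any surrounding whitespace.
DASHES = re.compile(r"\s*[\u2013\u2014]\s*")


def sanitize(text):
    """Replace dashes with a comma, then collapse repeated whitespace."""
    text = DASHES.sub(", ", text)
    return " ".join(text.split())


if __name__ == "__main__":
    print(sanitize("सम्मानित किया \u2014 यह दिखाता है"))
```

Commas already appear in the sample sentences above, so they are a reasonably safe substitute.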

🎉 Conclusion

You now have a working Hindi TTS system running locally on your Apple Silicon Mac (M1, M2, M3, or M4) using Fastspeech2_HS! The same setup extends to other supported languages such as Bengali, Tamil, and Marathi.