If you’re looking to run Hindi Text-to-Speech (TTS) locally on your Mac M1, M2, M3 or M4 without depending on a GPU, this guide walks you through setting up Fastspeech2_HS, a powerful multilingual TTS system developed by IIT Madras.
✅ Prerequisites
- Mac with Apple Silicon (M1, M2, M3 or M4)
- Miniconda installed
- Basic knowledge of terminal and Python
🧪 Step 1: Clone the Repository
git clone https://github.com/smtiitm/Fastspeech2_HS.git
cd Fastspeech2_HS
🧱 Step 2: Create the Conda Environment
Create a file named environment_m1.yml and paste the following content:
name: tts-hs-hifigan
channels:
- conda-forge
- apple
dependencies:
- python=3.10
- pip
- numpy
- scipy
- matplotlib
- pandas
- librosa
- tqdm
- pyyaml
- pytorch
- torchaudio
- torchvision
Now create the environment:
conda env create -f environment_m1.yml
conda activate tts-hs-hifigan
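Before going further, it can help to confirm that the activated environment runs a native arm64 Python rather than an x86_64 build under Rosetta. The following is a minimal sketch using only the standard library; the helper name `is_native_apple_silicon` is made up for illustration and is not part of the repo.

```python
import platform
import sys


def is_native_apple_silicon(system: str, machine: str) -> bool:
    """Return True for a native macOS arm64 interpreter."""
    return system == "Darwin" and machine == "arm64"


# The environment file above pins python=3.10, so that is the version to expect.
print("Python:", sys.version.split()[0])
print(
    "Native Apple Silicon:",
    is_native_apple_silicon(platform.system(), platform.machine()),
)
```

If the second line prints False on an Apple Silicon Mac, the interpreter is probably an Intel build, and recreating the environment with an arm64 Miniconda installer is worth trying.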
📦 Step 3: Install Python Dependencies via pip
Inside the activated environment, run:
pip install phonemizer g2p_en unidecode soundfile flask nltk jamo sentencepiece inflect numba h5py pydub resampy pyworld
pip install typeguard==2.13.3
pip install --upgrade scipy
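Since typeguard is pinned to a specific version while scipy is upgraded, a quick look at what actually got installed can save debugging time later. This is a small sketch based on the standard library's `importlib.metadata`; the helper name `installed_version` is an illustrative choice.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# typeguard should report 2.13.3 after the commands above; scipy is whatever
# version the upgrade resolved to.
for name in ("typeguard", "scipy"):
    print(name, installed_version(name))
```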
🛠️ Step 4: Fix Compatibility Issues
Fix 1: scipy.signal.kaiser Import Issue
Edit this file, adjusting the path to match your own username and Miniconda location:
vim /Users/zahir/miniconda3/envs/tts-hs-hifigan/lib/python3.10/site-packages/espnet2/gan_tts/melgan/pqmf.py
Replace:
from scipy.signal import kaiser
With:
from numpy import kaiser
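If you would rather not edit the file by hand, the same one-line replacement can be scripted. The sketch below assumes that pqmf.py sits under the active environment's site-packages, at the espnet2/gan_tts/melgan location shown above; the helper names are hypothetical and nothing is modified until you call `patch_file` yourself.

```python
import sysconfig
from pathlib import Path

OLD_IMPORT = "from scipy.signal import kaiser"
NEW_IMPORT = "from numpy import kaiser"


def patch_kaiser_import(source: str) -> str:
    """Swap the failing SciPy import for the NumPy one; other text is untouched."""
    return source.replace(OLD_IMPORT, NEW_IMPORT)


def default_pqmf_path() -> Path:
    """Guess where pqmf.py lives in the currently active environment."""
    site_packages = Path(sysconfig.get_paths()["purelib"])
    return site_packages / "espnet2" / "gan_tts" / "melgan" / "pqmf.py"


def patch_file(path: Path) -> bool:
    """Patch the file in place. Returns True only if something changed."""
    if not path.is_file():
        return False
    original = path.read_text(encoding="utf-8")
    patched = patch_kaiser_import(original)
    if patched == original:
        return False
    path.write_text(patched, encoding="utf-8")
    return True


print("pqmf.py expected at:", default_pqmf_path())
```

Calling `patch_file(default_pqmf_path())` once inside the activated environment applies the fix; a second call returns False because the old import is already gone.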
Fix 2: wordparse() Crash on Invalid Word
In text_preprocess_for_inference.py, before this line:
parsed_word = wordparse(word, 0, 0, 1)
Add this:
if not word or not word[0].isalpha():
    continue
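To see what this guard lets through, here is a standalone sketch that applies the same check to a list of tokens. The surrounding loop and the function name `words_safe_to_parse` are assumptions for illustration; the real loop in the repo may look different.

```python
def words_safe_to_parse(words):
    """Keep only the words that pass the guard from Fix 2."""
    kept = []
    for word in words:
        # Same check as the fix: skip empty strings and tokens that do not
        # start with a letter (digits, punctuation, dashes).
        if not word or not word[0].isalpha():
            continue
        kept.append(word)
    return kept


# "\u2014" is the em dash; it is skipped along with the empty string and digits.
print(words_safe_to_parse(["नमस्ते", "", "\u2014", "123", "डेमो"]))
```

Only the two Devanagari words survive, because Python's `str.isalpha()` treats Devanagari consonants and vowels as letters, while digits and punctuation fail the check.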
🧠 Step 5: Install Missing Modules
The repo requires the following modules not included in the environment.yml:
indic-num2words
pip install indic-num2words
indic-unified-parser
Since the GitHub link is broken, install it from PyPI:
pip install indic-unified-parser
📥 Step 6: Download the Model Files
By default, when you run:
git clone https://github.com/smtiitm/Fastspeech2_HS.git
you may not get the full set of model files, especially for Hindi (hindi/female/model, hindi/male/model). If the model folders are empty or missing, follow these steps:
✅ 1. Try a Git Pull to Update the Repo
Navigate into the project folder:
cd Fastspeech2_HS
git pull
This may fetch the missing model directories if they exist in the default branch.
🔄 2. Check Other Branches for Model Files
Sometimes, the model files are available in another branch of the repository. To check available branches:
git branch -r
Then switch to the desired branch:
git checkout <branch-name>
git pull
📌 Replace <branch-name> with the name of the branch that likely contains the models (e.g., models, hindi, or main if not already on it).
🔍 3. Check Model File Locations on GitHub
To manually verify model availability on GitHub.com:
- Go to the repo: https://github.com/smtiitm/Fastspeech2_HS
- Click the branch dropdown on the top left
- Switch to different branches and navigate to:
hindi/female/model/
hindi/male/model/
- If you see files like model.pth, config.yaml, and .npz files, that’s the correct branch to pull from.
✅ After downloading, the models should be located inside:
Fastspeech2_HS/hindi/female/model/
Fastspeech2_HS/hindi/male/model/
You’re now ready to run inference!
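A quick way to confirm that the model folders are really populated is to check them from Python. This is a rough sketch that assumes the directory layout listed above; the function name `missing_model_dirs` is hypothetical, and it only checks that each folder contains at least one file, not that the files are valid models.

```python
from pathlib import Path


def missing_model_dirs(repo_root, language="hindi", genders=("female", "male")):
    """Return model directories that are absent or contain no files."""
    missing = []
    for gender in genders:
        model_dir = Path(repo_root) / language / gender / "model"
        has_files = model_dir.is_dir() and any(
            p.is_file() for p in model_dir.iterdir()
        )
        if not has_files:
            missing.append(model_dir)
    return missing


# Run this from the directory that contains the Fastspeech2_HS checkout.
for path in missing_model_dirs("Fastspeech2_HS"):
    print("Missing or empty:", path)
```

If nothing is printed, both Hindi model folders contain files.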
✅ Step 7: Verify Installation
Run this to ensure all critical imports work:
python -c "import torch; import librosa; import phonemizer; print('✅ All good!')"
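The one-liner above stops at the first failing import. If you want a list of everything that is missing in one pass, a sketch along these lines may be more convenient; it uses `importlib.util.find_spec` from the standard library, and the helper name `missing_modules` is illustrative.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The three modules checked by the one-liner above.
missing = missing_modules(["torch", "librosa", "phonemizer"])
print("Missing:", ", ".join(missing) if missing else "none")
```

Note that this only checks that the modules can be located. It does not import them, so a package that is installed but broken would still need the one-liner to surface the error.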
🔊 Step 8: Run Hindi TTS Inference
Here’s how to generate a female Hindi voice sample:
python inference.py \
--sample_text "नमस्ते, यह एक महिला आवाज़ में हिंदी टेक्स्ट टू स्पीच डेमो है।" \
--language hindi \
--gender female \
--alpha 1 \
--output_file hindi_female_output.wav
You can replace the --sample_text with your own Hindi sentence.
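If you plan to generate several clips, wrapping the command in a small Python helper avoids retyping the flags. The sketch below reuses only the flags shown in the command above; the function names are hypothetical, and passing "male" to --gender is an assumption based on the hindi/male/model folder.

```python
import subprocess
import sys


def build_command(text, gender="female", output_file="output.wav", alpha=1):
    """Assemble the inference.py call using the flags from the command above."""
    return [
        sys.executable, "inference.py",
        "--sample_text", text,
        "--language", "hindi",
        "--gender", gender,
        "--alpha", str(alpha),
        "--output_file", output_file,
    ]


def synthesize(text, **kwargs):
    """Run inference.py from the repository root; raises if it exits non-zero."""
    subprocess.run(build_command(text, **kwargs), check=True)


# Printing the command first is a cheap way to review it before running.
print(build_command("नमस्ते", gender="male", output_file="hindi_male_output.wav"))
```

Call `synthesize("...", output_file="clip1.wav")` from inside the Fastspeech2_HS folder with the environment activated. Passing the text as a list element rather than a shell string also sidesteps quoting problems with Hindi punctuation.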
🧪 Sample Longer Sentence
python inference.py \
--sample_text "एक गरीब किसान को रास्ते में सोने की थैली मिली, उसने लालच न करके मालिक को लौटा दी; ईमानदारी से प्रभावित होकर मालिक ने इनाम दिया, और गाँव वालों ने उसे सम्मानित किया — यह दिखाता है कि सच्चाई हमेशा जीतती है।" \
--language hindi \
--gender female \
--alpha 1 \
--output_file hindi_female_output.wav
⚠️ Note: The parser may fail on some special characters, such as the em dash (—) used in the sentence above. If you encounter errors, remove or replace such characters.
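One way to handle this is to clean the text before passing it to inference.py. The following is a minimal sketch that replaces dashes with commas; the em dash is the character named in the note, while treating the en dash the same way is an assumption, and the function name `replace_dashes` is illustrative.

```python
import re

# \u2014 is the em dash from the note; \u2013 (en dash) is included as a guess.
_DASHES = re.compile(r"\s*[\u2013\u2014]\s*")


def replace_dashes(text: str) -> str:
    """Replace em/en dashes, and the spaces around them, with a comma."""
    return _DASHES.sub(", ", text).strip()


sample = "गाँव वालों ने उसे सम्मानित किया \u2014 यह दिखाता है कि सच्चाई हमेशा जीतती है।"
print(replace_dashes(sample))
```

The cleaned string can then be used as the --sample_text value. Other characters may need similar treatment, so extend the pattern as you run into them.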
🎉 Conclusion
You now have a working Hindi TTS system running on your Mac M1, M2, M3 or M4 using Fastspeech2_HS! You can easily extend this to other supported languages such as Bengali, Tamil, Marathi, etc.