Creating talking images or videos is a fascinating application of deep learning. One such tool that makes this possible is Wav2Lip, a highly accurate lip-sync model. This article will guide you through the process of installing and using Wav2Lip on an Ubuntu system without a GPU.
Step 1: Setting Up the Environment
First, we need to set up a Python environment using Conda. For a detailed guide on installing Conda on Ubuntu, refer to the step-by-step instructions on the site below:
Once installed, create a new Conda environment named ‘wav2lip’ with Python 3.6:
conda create -n wav2lip python=3.6
conda activate wav2lip
Step 2: Installing ffmpeg
Next, install ffmpeg, a software suite to handle multimedia data:
sudo apt-get install ffmpeg
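Before moving on, it can help to confirm that the command-line tools used in this guide are actually on your PATH. The snippet below is a minimal sketch of such a check; `have_cmd` is a hypothetical helper name introduced here for illustration, and the list of tools (ffmpeg, plus git for the next step) is an assumption you can adjust.

```shell
# have_cmd is a hypothetical helper: it succeeds if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report which of the tools needed in this guide are available.
for tool in ffmpeg git; do
    if have_cmd "$tool"; then
        echo "found: $tool"
    else
        echo "not found: $tool (install it before continuing)"
    fi
done
```

If a tool is reported as not found, install it with apt-get as shown above and run the check again.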
Step 3: Cloning the Wav2Lip Repository
Clone the Wav2Lip repository from GitHub:
git clone https://github.com/Rudrabha/Wav2Lip.git
cd Wav2Lip
Step 4: Modifying and Installing Requirements
Edit the requirements.txt file and remove opencv-contrib-python and opencv-python. Then, install OpenCV from the conda-forge channel:
conda install -c conda-forge opencv
After that, install the remaining packages from requirements.txt:
pip install -r requirements.txt
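If you prefer not to edit requirements.txt by hand, the removal of the two OpenCV entries described in this step can be scripted. The following is a rough sketch, assuming the entries appear at the start of their lines; `strip_opencv` is a hypothetical helper name, not part of Wav2Lip. Run it from inside the Wav2Lip directory before the pip install command.

```shell
# strip_opencv is a hypothetical helper, not part of Wav2Lip.
# It deletes the opencv-python and opencv-contrib-python lines from a
# requirements file and keeps the original as <file>.bak.
strip_opencv() {
    req_file="${1:-requirements.txt}"
    sed -i.bak -E '/^opencv-(contrib-)?python/d' "$req_file"
}

# Only touch the file if it exists in the current directory.
if [ -f requirements.txt ]; then
    strip_opencv requirements.txt
fi
```

The backup copy makes it easy to restore the original file if you later want to install OpenCV through pip instead.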
Step 5: Downloading Pre-Trained Models
Download the pre-trained face detection model and save it as face_detection/detection/sfd/s3fd.pth. You can download it from here.
Additionally, download the checkpoints for the Wav2Lip models. Here are the links to the models:
After downloading the checkpoints, place them in the checkpoints folder.
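A misplaced model file is an easy mistake to make at this point, so a quick check before running inference can save time. The sketch below assumes the two paths used in this guide (the face detection model from this step and the wav2lip_gan.pth checkpoint used in Step 6); `check_models` is a hypothetical helper name introduced for illustration.

```shell
# check_models is a hypothetical helper: it prints every expected model file
# that is missing under the given root directory and returns non-zero if any is.
check_models() {
    root="${1:-.}"
    missing=0
    for f in face_detection/detection/sfd/s3fd.pth checkpoints/wav2lip_gan.pth; do
        if [ ! -f "$root/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}

# Run from inside the Wav2Lip directory.
check_models . || echo "Download the missing files before running inference."
```

If you plan to use a different checkpoint, change the second path in the loop accordingly.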
Step 6: Generating the Talking Image
Finally, you can generate the talking image using the following command:
python inference.py --checkpoint_path checkpoints/wav2lip_gan.pth --face input/zahir2.jpeg --audio input/bazigar_part1.wav --outfile results/pad-90-100-90-0-resize720.mp4 --pads 90 100 90 0 --resize_factor 720
Replace input/zahir2.jpeg with the path to your image file and input/bazigar_part1.wav with the path to your audio file.
And that’s it! You’ve now created a talking image using Wav2Lip on Ubuntu without a GPU. Enjoy bringing your images to life!