
Breathing Life into Images: Creating Talking Images on Ubuntu Without a GPU Using Wav2Lip

Creating talking images or videos is a fascinating application of deep learning. One such tool that makes this possible is Wav2Lip, a highly accurate lip-sync model. This article will guide you through the process of installing and using Wav2Lip on an Ubuntu system without a GPU.

Step 1: Setting Up the Environment

First, we need to set up a Python environment using Conda. For a detailed guide on installing Conda on Ubuntu, refer to a step-by-step installation guide for your distribution; a quick sketch is shown below.
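
If you just need a quick path, Miniconda can be installed with something like the following (the installer URL and filename may change, so verify them on the official Miniconda download page):

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh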

Once installed, create a new Conda environment named ‘wav2lip’ with Python 3.6:

conda create -n wav2lip python=3.6
conda activate wav2lip
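
You can confirm that the environment is active and uses the expected interpreter:

python --version

This should report Python 3.6.x.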

Step 2: Installing ffmpeg

Next, install ffmpeg, a software suite to handle multimedia data:

sudo apt-get install ffmpeg
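
To verify the installation:

ffmpeg -version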

Step 3: Cloning the Wav2Lip Repository

Clone the Wav2Lip repository from GitHub:

git clone https://github.com/Rudrabha/Wav2Lip.git
cd Wav2Lip

Step 4: Modifying and Installing Requirements

Edit the requirements.txt file and remove the opencv-contrib-python and opencv-python entries, since OpenCV will be installed through Conda instead.
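
If you prefer to do this from the command line, the entries can be stripped with sed (assuming each package name starts its own line in requirements.txt):

sed -i '/^opencv-contrib-python/d; /^opencv-python/d' requirements.txt

Then install OpenCV from the conda-forge channel: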

conda install -c conda-forge opencv

After that, install the remaining packages from requirements.txt:

pip install -r requirements.txt

Step 5: Downloading Pre-Trained Models

Download the face detection pre-trained model and save it as face_detection/detection/sfd/s3fd.pth. The download link is provided in the Wav2Lip README.
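
Assuming you have that URL at hand, you can fetch the model and put it in place with wget; the placeholder below stands in for the actual download link:

mkdir -p face_detection/detection/sfd
wget -O face_detection/detection/sfd/s3fd.pth "<s3fd-model-url>"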

Additionally, download the checkpoints for the Wav2Lip models. Here are the links to the models:

After downloading the checkpoints, place them in the checkpoints folder.
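
For example, if the files landed in your Downloads folder (the filenames below are assumptions; use whatever you actually downloaded):

mkdir -p checkpoints
mv ~/Downloads/wav2lip_gan.pth ~/Downloads/wav2lip.pth checkpoints/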

Step 6: Generating the Talking Image

Finally, you can generate the talking image using the following command:

python inference.py --checkpoint_path checkpoints/wav2lip_gan.pth --face input/zahir2.jpeg --audio input/bazigar_part1.wav --outfile results/pad-90-100-90-0-resize720.mp4 --pads 90 100 90 0 --resize_factor 720

Replace input/zahir2.jpeg with the path to your image file and input/bazigar_part1.wav with the path to your audio file. The --pads values add padding (top, bottom, left, right, in pixels) around the detected face, and --resize_factor downscales the input resolution by that factor before processing; adjust both to suit your input.
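
If your audio is in another format such as MP3, you can convert it to WAV with ffmpeg before running inference (the filenames here are just examples):

ffmpeg -i input/narration.mp3 input/bazigar_part1.wav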

And that’s it! You’ve now created a talking image using Wav2Lip on Ubuntu without a GPU. Enjoy bringing your images to life!
