Transform Photos into Talking Avatars: Free, Open-Source EchoMimic on Ubuntu 24.04 LTS

Imagine bringing your favorite photos to life and making them talk! EchoMimic, a free and open-source tool, does just that: it uses a deep learning model to animate a still portrait so that it follows an audio track you supply. This guide walks you through installing and setting up EchoMimic on Ubuntu 24.04 LTS so you can create your own talking avatars.

Prerequisites

Before diving into the installation, ensure you have the following:

An Ubuntu 24.04 LTS system
Basic knowledge of terminal commands
Conda for managing Python environments (check this article for installation steps: Conda Installation on Ubuntu)
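
A quick way to confirm these prerequisites is a small check script. This is only a sketch: have_cmd is a hypothetical helper name, and it assumes conda has been initialized for the shell you run it in. Git is installed in step 2, so "not found" for git is fine at this point.

```shell
#!/usr/bin/env bash
# have_cmd is a hypothetical helper: succeeds if the command is available.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Print the OS release in a subshell so its variables do not leak out.
if [ -r /etc/os-release ]; then
    ( . /etc/os-release && echo "OS: ${PRETTY_NAME:-unknown}" )
fi

# Report where each required tool lives, or that it is missing.
for tool in conda git; do
    if have_cmd "$tool"; then
        echo "$tool: found ($(command -v "$tool"))"
    else
        echo "$tool: not found"
    fi
done
```

On a fresh Ubuntu 24.04 LTS system, the OS line should mention 24.04.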

Step-by-Step Installation

1. Install System Dependencies

Open your terminal, update your package list, and upgrade the installed packages:

sudo apt update
sudo apt upgrade

2. Install Git

Ensure Git is installed to clone the EchoMimic repository:

sudo apt install git

3. Clone the EchoMimic Repository

Clone the EchoMimic repository from GitHub:

git clone https://github.com/BadToBest/EchoMimic
cd EchoMimic

4. Set Up Python Environment with Conda

Create and activate a new conda environment:

conda create -n echomimic python=3.8
conda activate echomimic
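
Before installing packages, it is worth confirming that the new environment is actually active, since pip would otherwise install into whichever Python is current. A minimal sketch, relying on the CONDA_DEFAULT_ENV variable that conda sets on activation; in_conda_env is a hypothetical helper name:

```shell
# in_conda_env is a hypothetical helper: succeeds only if the named
# conda environment is the active one.
in_conda_env() {
    [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}

if in_conda_env echomimic; then
    # Should report a 3.8.x interpreter for the environment created above.
    python --version
else
    echo "Run 'conda activate echomimic' first (active: ${CONDA_DEFAULT_ENV:-none})"
fi
```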

5. Install Python Packages

Install the required Python packages using pip:

pip install -r requirements.txt

6. Install ffmpeg (if not already installed)

To install ffmpeg, use the following commands:

sudo apt update
sudo apt install ffmpeg

Verify the installation:

ffmpeg -version

Find the location of the ffmpeg binary with which ffmpeg (ffmpeg -version only prints version and build information), then set the FFMPEG_PATH environment variable to it:

export FFMPEG_PATH=/usr/bin/ffmpeg

In my case, which ffmpeg returned ‘/usr/bin/ffmpeg’.
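
The lookup and the export can also be combined so the path is not copied by hand. A minimal sketch, assuming ffmpeg is already on your PATH; command -v is the shell built-in counterpart of which, and ffmpeg_bin is just an illustrative variable name:

```shell
# Resolve the ffmpeg binary on PATH and export its location.
if ffmpeg_bin="$(command -v ffmpeg)"; then
    export FFMPEG_PATH="$ffmpeg_bin"
    echo "FFMPEG_PATH=$FFMPEG_PATH"
else
    echo "ffmpeg not found on PATH; install it first" >&2
fi
```

The export only lasts for the current shell session. To keep it across sessions, add the export line to your ~/.bashrc.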

7. Install Git LFS and Download Pretrained Weights

Install Git LFS and clone the pretrained weights.

Download and install Git LFS (the installer copies the binary into /usr/local/bin, so it needs sudo):

wget https://github.com/git-lfs/git-lfs/releases/download/v3.5.1/git-lfs-linux-amd64-v3.5.1.tar.gz
tar -zxvf git-lfs-linux-amd64-v3.5.1.tar.gz
sudo bash git-lfs-3.5.1/install.sh

Configure Git LFS and download models:

git lfs install
git clone https://huggingface.co/BadToBest/EchoMimic pretrained_weights

Clean up the installer files:

rm -r git-lfs-3.5.1 git-lfs-linux-amd64-v3.5.1.tar.gz
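
If Git LFS was not active during the clone, the weight files end up as tiny text pointers instead of the real models. A Git LFS pointer starts with the line "version https://git-lfs.github.com/spec/v1", so the check below looks for that prefix. This is a sketch: is_lfs_pointer is a hypothetical helper name, and it assumes you run it from the EchoMimic directory.

```shell
# is_lfs_pointer is a hypothetical helper: succeeds if the file is still
# a Git LFS pointer stub rather than the downloaded content.
is_lfs_pointer() {
    head -c 200 "$1" 2>/dev/null |
        grep -q '^version https://git-lfs.github.com/spec/v1'
}

# List any weight files that were not actually downloaded.
if [ -d pretrained_weights ]; then
    find pretrained_weights -path '*/.git' -prune -o -type f -print |
    while IFS= read -r f; do
        if is_lfs_pointer "$f"; then
            echo "still a pointer (not downloaded): $f"
        fi
    done
fi
```

If any files are listed, running git lfs pull inside pretrained_weights should fetch the real content.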

8. Run EchoMimic

Run the Gradio UI to start using EchoMimic, then open http://localhost:3000 in your browser once it has started:

python -u webgui.py --server_port=3000
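
If the UI fails to start, another process may already be listening on the chosen port. A minimal sketch of a pre-launch check, assuming bash (the /dev/tcp redirection is a bash feature, not a real device file); port_in_use and PORT are hypothetical names:

```shell
#!/usr/bin/env bash
# port_in_use is a hypothetical helper: succeeds if something accepts
# TCP connections on the given local port.
port_in_use() {
    (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

PORT=3000
if port_in_use "$PORT"; then
    echo "Port $PORT is busy; pass a different --server_port"
else
    echo "Port $PORT is free; run: python -u webgui.py --server_port=$PORT"
fi
```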

Using EchoMimic

With EchoMimic up and running, you can now create talking avatars by uploading images and providing audio files through the Gradio UI. Customize your configurations by editing the ./configs/prompts/animation.yaml file to tailor the outputs to your needs.

Conclusion

EchoMimic is a powerful tool for creating engaging, talking avatars from static images. By following this guide, you can easily set it up on Ubuntu 24.04 LTS and start exploring its capabilities.

If you’re looking to create 4K, long-duration talking videos, you might also want to check out my tutorial on installing Hallo 2, a more advanced tool designed for high-quality, extended animations.

Enjoy transforming your photos into dynamic, audio-driven animations!