{"id":185,"date":"2024-02-22T14:36:12","date_gmt":"2024-02-22T14:36:12","guid":{"rendered":"https:\/\/zahiralam.com\/blog\/?p=185"},"modified":"2024-10-17T13:57:19","modified_gmt":"2024-10-17T13:57:19","slug":"breathing-life-into-images-creating-talking-images-on-ubuntu-without-a-gpu-using-wav2lip","status":"publish","type":"post","link":"https:\/\/zahiralam.com\/blog\/breathing-life-into-images-creating-talking-images-on-ubuntu-without-a-gpu-using-wav2lip\/","title":{"rendered":"Breathing Life into Images: Creating Talking Images on Ubuntu Without a GPU Using Wav2Lip"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">Creating talking images or videos is a fascinating application of deep learning. One such tool that makes this possible is Wav2Lip, a highly accurate lip-sync model. This article will guide you through the process of installing and using Wav2Lip on an Ubuntu system without a GPU.\n\n\n\n<h2 class=\"wp-block-heading\">Step 1: Setting Up the Environment<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">First, we need to set up a Python environment using Conda. 
For a detailed guide on installing Conda on Ubuntu, see the step-by-step instructions in the post below:\n\n\n\n<figure class=\"wp-block-embed is-type-wp-embed is-provider-zahirs-blog wp-block-embed-zahirs-blog\"><div class=\"wp-block-embed__wrapper\">\n<blockquote class=\"wp-embedded-content\" data-secret=\"t4Zp7k5eqT\"><a href=\"https:\/\/zahiralam.com\/blog\/conda-installation-on-ubuntu-simplified-step-by-step-instructions-with-activation-and-deactivation\/\">Conda Installation on Ubuntu: Simplified Step-by-Step Instructions with Activation and Deactivation<\/a><\/blockquote><iframe loading=\"lazy\" class=\"wp-embedded-content\" sandbox=\"allow-scripts\" security=\"restricted\" style=\"position: absolute; clip: rect(1px, 1px, 1px, 1px);\" title=\"&#8220;Conda Installation on Ubuntu: Simplified Step-by-Step Instructions with Activation and Deactivation&#8221; &#8212; Zahirs Blog\" src=\"https:\/\/zahiralam.com\/blog\/conda-installation-on-ubuntu-simplified-step-by-step-instructions-with-activation-and-deactivation\/embed\/#?secret=vCZrK9d96o#?secret=t4Zp7k5eqT\" data-secret=\"t4Zp7k5eqT\" width=\"600\" height=\"338\" frameborder=\"0\" marginwidth=\"0\" marginheight=\"0\" scrolling=\"no\"><\/iframe>\n<\/div><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Once Conda is installed, create and activate a new environment named \u2018wav2lip\u2019 with Python 3.6:\n\n\n\n<pre class=\"wp-block-syntaxhighlighter-code\">conda create -n wav2lip python=3.6\nconda activate wav2lip<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Step 2: Installing ffmpeg<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Next, install ffmpeg, a software suite for handling multimedia data:\n\n\n\n<pre class=\"wp-block-syntaxhighlighter-code\">sudo apt-get install ffmpeg<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Step 3: Cloning the Wav2Lip Repository<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Clone the Wav2Lip repository from GitHub:\n\n\n\n<pre class=\"wp-block-syntaxhighlighter-code\">git clone 
https:\/\/github.com\/Rudrabha\/Wav2Lip.git\ncd Wav2Lip<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Step 4: Modifying and Installing Requirements<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Edit the <code>requirements.txt<\/code> file and remove the <code>opencv-contrib-python<\/code> and <code>opencv-python<\/code> entries. Then install OpenCV from the conda-forge channel:\n\n\n\n<pre class=\"wp-block-syntaxhighlighter-code\">conda install -c conda-forge opencv<\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">After that, install the remaining packages from <code>requirements.txt<\/code>:\n\n\n\n<pre class=\"wp-block-syntaxhighlighter-code\">pip install -r requirements.txt<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Step 5: Downloading Pre-Trained Models<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Download the face detection <a href=\"https:\/\/www.adrianbulat.com\/downloads\/python-fan\/s3fd-619a316812.pth\" target=\"_blank\" rel=\"noreferrer noopener\">pre-trained model<\/a> and save it as <code>face_detection\/detection\/sfd\/s3fd.pth<\/code>. The downloaded file is named <code>s3fd-619a316812.pth<\/code>, so rename it to <code>s3fd.pth<\/code> when you place it in that folder.\n\n\n\n<p class=\"wp-block-paragraph\">Additionally, download the checkpoints for the Wav2Lip models. 
Here are the links to the models:\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/iiitaphyd-my.sharepoint.com\/:u:\/g\/personal\/radrabha_m_research_iiit_ac_in\/Eb3LEzbfuKlJiR600lQWRxgBIY27JZg80f7V9jtMfbNDaQ?e=TBFBVW\" target=\"_blank\" rel=\"noreferrer noopener\">Wav2Lip<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/iiitaphyd-my.sharepoint.com\/:u:\/g\/personal\/radrabha_m_research_iiit_ac_in\/EdjI7bZlgApMqsVoEUUXpLsBxqXbn5z8VTmoxp55YNDcIA?e=n9ljGW\" target=\"_blank\" rel=\"noreferrer noopener\">Wav2Lip + GAN<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/iiitaphyd-my.sharepoint.com\/:u:\/g\/personal\/radrabha_m_research_iiit_ac_in\/EQRvmiZg-HRAjvI6zqN9eTEBP74KefynCwPWVmF57l-AYA?e=ZRPHKP\" target=\"_blank\" rel=\"noreferrer noopener\">Expert Discriminator<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/iiitaphyd-my.sharepoint.com\/:u:\/g\/personal\/radrabha_m_research_iiit_ac_in\/EQVqH88dTm1HjlK11eNba5gBbn15WMS0B0EZbDBttqrqkg?e=ic0ljo\" target=\"_blank\" rel=\"noreferrer noopener\">Visual Quality Discriminator<\/a><\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">After downloading the checkpoints, place them in the <strong>checkpoints<\/strong> folder.\n\n\n\n<h2 class=\"wp-block-heading\">Step 6: Generating the Talking Image<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Finally, generate the talking image with the following command, which uses the Wav2Lip + GAN checkpoint saved as <code>checkpoints\/wav2lip_gan.pth<\/code>:\n\n\n\n<pre class=\"wp-block-syntaxhighlighter-code\">python inference.py --checkpoint_path checkpoints\/wav2lip_gan.pth --face input\/zahir2.jpeg --audio input\/bazigar_part1.wav --outfile results\/pad-90-100-90-0-resize720.mp4 --pads 90 100 90 0 --resize_factor 720<\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Replace <code>input\/zahir2.jpeg<\/code> with the path to your image file and <code>input\/bazigar_part1.wav<\/code> with the path to your audio file. The <code>--pads<\/code> values set the padding around the detected face in the order top, bottom, left, right, and <code>--outfile<\/code> sets where the resulting video is written.\n\n\n\n<p class=\"wp-block-paragraph\">And that\u2019s it! You\u2019ve now created a talking image using Wav2Lip on Ubuntu without a GPU. 
Enjoy bringing your images to life!\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Conclusion<\/strong><\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Using&nbsp;<strong>Wav2Lip<\/strong>, you can easily create talking images on Ubuntu without the need for a GPU. This powerful tool allows you to bring images to life by syncing lip movements to any audio file, making it an accessible solution for anyone interested in animation and deep learning.\n\n\n\n<p class=\"wp-block-paragraph\">If you&#8217;re looking to push the boundaries further and create&nbsp;<strong>4K, long-duration talking videos<\/strong>&nbsp;with even more advanced capabilities, check out my guide on installing&nbsp;<strong>Hallo2<\/strong>&nbsp;<a href=\"https:\/\/zahiralam.com\/blog\/bring-images-to-life-install-hallo2-on-ubuntu-for-4k-long-duration-talking-videos\/\">here<\/a>. This tutorial will help you leverage the full potential of Hallo2 for creating high-quality, extended animations.\n","protected":false},"excerpt":{"rendered":"<p>Creating talking images or videos is a fascinating application of deep learning. 
One such tool that makes this possible is Wav2Lip, a highly accurate lip-sync [&#8230;]<\/p>\n","protected":false},"author":1,"featured_media":1130,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[24],"tags":[41,40,43,221],"class_list":["post-185","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-artificial-intelligence","tag-ai-ubuntu","tag-ai-without-gpu","tag-image-to-video","tag-talking-avatars"],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/zahiralam.com\/blog\/wp-json\/wp\/v2\/posts\/185","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/zahiralam.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/zahiralam.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/zahiralam.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/zahiralam.com\/blog\/wp-json\/wp\/v2\/comments?post=185"}],"version-history":[{"count":5,"href":"https:\/\/zahiralam.com\/blog\/wp-json\/wp\/v2\/posts\/185\/revisions"}],"predecessor-version":[{"id":1305,"href":"https:\/\/zahiralam.com\/blog\/wp-json\/wp\/v2\/posts\/185\/revisions\/1305"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/zahiralam.com\/blog\/wp-json\/wp\/v2\/media\/1130"}],"wp:attachment":[{"href":"https:\/\/zahiralam.com\/blog\/wp-json\/wp\/v2\/media?parent=185"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/zahiralam.com\/blog\/wp-json\/wp\/v2\/categories?post=185"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/zahiralam.com\/blog\/wp-json\/wp\/v2\/tags?post=185"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}