{"id":1608,"date":"2025-11-10T14:26:48","date_gmt":"2025-11-10T14:26:48","guid":{"rendered":"https:\/\/zahiralam.com\/blog\/?p=1608"},"modified":"2025-11-13T09:48:46","modified_gmt":"2025-11-13T09:48:46","slug":"installing-musetalk-on-runpod-full-step-by-step-guide-with-fixes","status":"publish","type":"post","link":"https:\/\/zahiralam.com\/blog\/installing-musetalk-on-runpod-full-step-by-step-guide-with-fixes\/","title":{"rendered":"MuseTalk for Real-Time Lip-Sync Avatars: Complete GPU Installation &amp; Fixes"},"content":{"rendered":"\n<p>MuseTalk is a real-time talking-face system that combines pose estimation, voice analysis, and lip-sync. Getting it running smoothly on a cloud GPU like&nbsp;<strong>RunPod (NVIDIA A5000)<\/strong>&nbsp;can take a few tweaks. Below is the exact process that worked for me, plus the fixes for the common snags I hit (PyTorch\/CUDA mismatch, missing&nbsp;<code>huggingface-cli<\/code>, FFmpeg path issues, and incomplete weight downloads).\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">\ud83d\udda5\ufe0f 1) RunPod Environment<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>GPU:<\/strong>&nbsp;NVIDIA&nbsp;<strong>A5000<\/strong><\/li>\n\n\n\n<li><strong>Template:<\/strong>&nbsp;<code>sub38-pod<\/code><\/li>\n\n\n\n<li><strong>Base Image:<\/strong>&nbsp;<code>runpod\/pytorch:2.0.1-py3.10-cuda11.8.0-devel<\/code><\/li>\n<\/ul>\n\n\n\n<p>Clone the repo:\n\n\n\n<div class=\"code-block-container\">\n                        <pre class=\"wp-block-code\"><code id=\"code-1\">git clone https:\/\/github.com\/TMElyralab\/MuseTalk.git\ncd MuseTalk<\/code><\/pre>\n                        <amp-iframe sandbox=\"allow-scripts\" width=\"94\" height=\"72\" frameborder=\"0\" \n                                    src=\"https:\/\/zahiralam.com\/blog\/wp-content\/plugins\/amp-copy-code-button\/copier.html#git%20clone%20https%3A%2F%2Fgithub.com%2FTMElyralab%2FMuseTalk.git%0Acd%20MuseTalk\">\n  
                          <button class=\"copy-button\" data-label=\"git clone https:\/\/github.com\/TMElyralab\/MuseTalk.git\ncd MuseTalk\"  placeholder disabled>Copy<\/button>\n                        <\/amp-iframe>\n                    <\/div>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">\ud83d\udce6 2) (If present) Activate the prebuilt virtualenv<\/h2>\n\n\n\n<p>On this image, there\u2019s a precreated environment at&nbsp;<code>\/muse_env<\/code>. Activate it before installing anything so tools land in the right place. Note that&nbsp;<code>python3 -m venv muse_env<\/code>&nbsp;uses a relative path, so it creates&nbsp;<code>muse_env<\/code>&nbsp;under the current directory; if the environment already exists at&nbsp;<code>\/muse_env<\/code>, skip that line and run&nbsp;<code>source \/muse_env\/bin\/activate<\/code>&nbsp;instead:\n\n\n\n<div class=\"code-block-container\">\n                        <pre class=\"wp-block-code\"><code id=\"code-2\">python3 -m venv muse_env\nsource muse_env\/bin\/activate\nwhich python\n# e.g. \/muse_env\/bin\/python<\/code><\/pre>\n                        <amp-iframe sandbox=\"allow-scripts\" width=\"94\" height=\"72\" frameborder=\"0\" \n                                    src=\"https:\/\/zahiralam.com\/blog\/wp-content\/plugins\/amp-copy-code-button\/copier.html#python3%20-m%20venv%20muse_env%0Asource%20muse_env%2Fbin%2Factivate%0Awhich%20python%0A%23%20e.g.%20%2Fmuse_env%2Fbin%2Fpython\">\n                            <button class=\"copy-button\" data-label=\"python3 -m venv muse_env\nsource muse_env\/bin\/activate\nwhich python\n# e.g. 
\/muse_env\/bin\/python\"  placeholder disabled>Copy<\/button>\n                        <\/amp-iframe>\n                    <\/div>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>If&nbsp;<code>\/muse_env<\/code>&nbsp;doesn\u2019t exist in your template, create your own venv instead and use that consistently.\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">\u2699\ufe0f 3) Install Python dependencies<\/h2>\n\n\n\n<div class=\"code-block-container\">\n                        <pre class=\"wp-block-code\"><code id=\"code-3\">pip install -r requirements.txt<\/code><\/pre>\n                        <amp-iframe sandbox=\"allow-scripts\" width=\"94\" height=\"72\" frameborder=\"0\" \n                                    src=\"https:\/\/zahiralam.com\/blog\/wp-content\/plugins\/amp-copy-code-button\/copier.html#pip%20install%20-r%20requirements.txt\">\n                            <button class=\"copy-button\" data-label=\"pip install -r requirements.txt\"  placeholder disabled>Copy<\/button>\n                        <\/amp-iframe>\n                    <\/div>\n\n\n\n<p>This generally succeeds, but later you must ensure PyTorch and CUDA really match.\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">\ud83e\udde0 4) Fix PyTorch \u2194 CUDA compatibility (cu118)<\/h2>\n\n\n\n<p>The base image ships with&nbsp;<strong>PyTorch 2.0.1 (CU118)<\/strong>. 
Verify:\n\n\n\n<div class=\"code-block-container\">\n                        <pre class=\"wp-block-code\"><code id=\"code-4\">python -c &quot;import torch; print(torch.__version__, torch.version.cuda)&quot;\n# Expect something like: 2.0.1 11.8<\/code><\/pre>\n                        <amp-iframe sandbox=\"allow-scripts\" width=\"94\" height=\"72\" frameborder=\"0\" \n                                    src=\"https:\/\/zahiralam.com\/blog\/wp-content\/plugins\/amp-copy-code-button\/copier.html#python%20-c%20%22import%20torch%3B%20print%28torch.__version__%2C%20torch.version.cuda%29%22%0A%23%20Expect%20something%20like%3A%202.0.1%2011.8\">\n                            <button class=\"copy-button\" data-label=\"python -c &quot;import torch; print(torch.__version__, torch.version.cuda)&quot;\n# Expect something like: 2.0.1 11.8\"  placeholder disabled>Copy<\/button>\n                        <\/amp-iframe>\n                    <\/div>\n\n\n\n<p>If it\u2019s not CUDA 11.8 (e.g., you accidentally pulled CU12 wheels), pin the correct builds:\n\n\n\n<div class=\"code-block-container\">\n                        <pre class=\"wp-block-code\"><code id=\"code-5\">pip uninstall -y torch torchvision torchaudio\npip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 torchaudio==2.0.2+cu118 \\\n  -f https:\/\/download.pytorch.org\/whl\/torch_stable.html<\/code><\/pre>\n                        <amp-iframe sandbox=\"allow-scripts\" width=\"94\" height=\"72\" frameborder=\"0\" \n                                    src=\"https:\/\/zahiralam.com\/blog\/wp-content\/plugins\/amp-copy-code-button\/copier.html#pip%20uninstall%20-y%20torch%20torchvision%20torchaudio%0Apip%20install%20torch%3D%3D2.0.1%2Bcu118%20torchvision%3D%3D0.15.2%2Bcu118%20torchaudio%3D%3D2.0.2%2Bcu118%20%5C%0A%20%20-f%20https%3A%2F%2Fdownload.pytorch.org%2Fwhl%2Ftorch_stable.html\">\n                            <button class=\"copy-button\" data-label=\"pip uninstall -y torch torchvision torchaudio\npip install 
torch==2.0.1+cu118 torchvision==0.15.2+cu118 torchaudio==2.0.2+cu118 \\\n  -f https:\/\/download.pytorch.org\/whl\/torch_stable.html\"  placeholder disabled>Copy<\/button>\n                        <\/amp-iframe>\n                    <\/div>\n\n\n\n<p>\u2705&nbsp;<strong>Result:<\/strong>&nbsp;PyTorch and CUDA align with the image (11.8).\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">\ud83e\udde9 5) Install OpenMMLab components (MMPose)<\/h2>\n\n\n\n<p>MuseTalk relies on MMPose. Use&nbsp;<code>mim<\/code>:\n\n\n\n<div class=\"code-block-container\">\n                        <pre class=\"wp-block-code\"><code id=\"code-6\">pip install --no-cache-dir -U openmim\nmim install mmengine\nmim install &quot;mmcv==2.0.1&quot;\nmim install &quot;mmdet==3.1.0&quot;\nmim install &quot;mmpose==1.1.0&quot;<\/code><\/pre>\n                        <amp-iframe sandbox=\"allow-scripts\" width=\"94\" height=\"72\" frameborder=\"0\" \n                                    src=\"https:\/\/zahiralam.com\/blog\/wp-content\/plugins\/amp-copy-code-button\/copier.html#pip%20install%20--no-cache-dir%20-U%20openmim%0Amim%20install%20mmengine%0Amim%20install%20%22mmcv%3D%3D2.0.1%22%0Amim%20install%20%22mmdet%3D%3D3.1.0%22%0Amim%20install%20%22mmpose%3D%3D1.1.0%22\">\n                            <button class=\"copy-button\" data-label=\"pip install --no-cache-dir -U openmim\nmim install mmengine\nmim install &quot;mmcv==2.0.1&quot;\nmim install &quot;mmdet==3.1.0&quot;\nmim install &quot;mmpose==1.1.0&quot;\"  placeholder disabled>Copy<\/button>\n                        <\/amp-iframe>\n                    <\/div>\n\n\n\n<p>\u2705&nbsp;<strong>Result:<\/strong>&nbsp;MMCV &amp; MMPose installed and compiled against your current stack.\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">\ud83c\udfac 6) Add FFmpeg (static build)<\/h2>\n\n\n\n<p>The base image usually lacks FFmpeg. 
Fetch a static build and expose it on&nbsp;<code>PATH<\/code>:\n\n\n\n<div class=\"code-block-container\">\n                        <pre class=\"wp-block-code\"><code id=\"code-7\">wget https:\/\/johnvansickle.com\/ffmpeg\/releases\/ffmpeg-release-amd64-static.tar.xz\ntar -xvf ffmpeg-release-amd64-static.tar.xz\nexport FFMPEG_PATH=&quot;$PWD\/ffmpeg-7.0.2-amd64-static&quot;\nexport PATH=&quot;$FFMPEG_PATH:$PATH&quot;\n\nffmpeg -version\n# should print version info<\/code><\/pre>\n                        <amp-iframe sandbox=\"allow-scripts\" width=\"94\" height=\"72\" frameborder=\"0\" \n                                    src=\"https:\/\/zahiralam.com\/blog\/wp-content\/plugins\/amp-copy-code-button\/copier.html#wget%20https%3A%2F%2Fjohnvansickle.com%2Fffmpeg%2Freleases%2Fffmpeg-release-amd64-static.tar.xz%0Atar%20-xvf%20ffmpeg-release-amd64-static.tar.xz%0Aexport%20FFMPEG_PATH%3D%22%24PWD%2Fffmpeg-7.0.2-amd64-static%22%0Aexport%20PATH%3D%22%24FFMPEG_PATH%3A%24PATH%22%0A%0Affmpeg%20-version%0A%23%20should%20print%20version%20info\">\n                            <button class=\"copy-button\" data-label=\"wget https:\/\/johnvansickle.com\/ffmpeg\/releases\/ffmpeg-release-amd64-static.tar.xz\ntar -xvf ffmpeg-release-amd64-static.tar.xz\nexport FFMPEG_PATH=&quot;$PWD\/ffmpeg-7.0.2-amd64-static&quot;\nexport PATH=&quot;$FFMPEG_PATH:$PATH&quot;\n\nffmpeg -version\n# should print version info\"  placeholder disabled>Copy<\/button>\n                        <\/amp-iframe>\n                    <\/div>\n\n\n\n<p><em>(If your extracted folder name differs, update&nbsp;<code>FFMPEG_PATH<\/code>&nbsp;accordingly.)<\/em>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">\ud83d\udcbe 7) Download the model weights<\/h2>\n\n\n\n<p>MuseTalk ships a helper script:\n\n\n\n<div class=\"code-block-container\">\n                        <pre class=\"wp-block-code\"><code id=\"code-8\">bash .\/download_weights.sh<\/code><\/pre>\n      
                  <amp-iframe sandbox=\"allow-scripts\" width=\"94\" height=\"72\" frameborder=\"0\" \n                                    src=\"https:\/\/zahiralam.com\/blog\/wp-content\/plugins\/amp-copy-code-button\/copier.html#bash%20.%2Fdownload_weights.sh\">\n                            <button class=\"copy-button\" data-label=\"bash .\/download_weights.sh\"  placeholder disabled>Copy<\/button>\n                        <\/amp-iframe>\n                    <\/div>\n\n\n\n<h3 class=\"wp-block-heading\">If you see:&nbsp;<code>huggingface-cli: command not found<\/code><\/h3>\n\n\n\n<p><strong>Root cause:<\/strong>&nbsp;Newer&nbsp;<code>huggingface_hub<\/code>&nbsp;(&gt;=1.0) removed the legacy CLI entry point.\n\n\n\n<p><strong>Fix: install a CLI-providing version and keep it pinned<\/strong>\n\n\n\n<div class=\"code-block-container\">\n                        <pre class=\"wp-block-code\"><code id=\"code-9\">pip install &quot;huggingface_hub[cli]==0.25.2&quot; --force-reinstall\nwhich huggingface-cli\n# e.g. \/muse_env\/bin\/huggingface-cli<\/code><\/pre>\n                        <amp-iframe sandbox=\"allow-scripts\" width=\"94\" height=\"72\" frameborder=\"0\" \n                                    src=\"https:\/\/zahiralam.com\/blog\/wp-content\/plugins\/amp-copy-code-button\/copier.html#pip%20install%20%22huggingface_hub%5Bcli%5D%3D%3D0.25.2%22%20--force-reinstall%0Awhich%20huggingface-cli%0A%23%20e.g.%20%2Fmuse_env%2Fbin%2Fhuggingface-cli\">\n                            <button class=\"copy-button\" data-label=\"pip install &quot;huggingface_hub[cli]==0.25.2&quot; --force-reinstall\nwhich huggingface-cli\n# e.g. 
\/muse_env\/bin\/huggingface-cli\"  placeholder disabled>Copy<\/button>\n                        <\/amp-iframe>\n                    <\/div>\n\n\n\n<p><strong>Prevent auto-upgrades inside the script<\/strong>: open&nbsp;<code>download_weights.sh<\/code>&nbsp;and comment out the self-update line:\n\n\n\n<div class=\"code-block-container\">\n                        <pre class=\"wp-block-code\"><code id=\"code-10\"># pip install -U &quot;huggingface_hub[cli]&quot;   # keep pinned CLI working<\/code><\/pre>\n                        <amp-iframe sandbox=\"allow-scripts\" width=\"94\" height=\"72\" frameborder=\"0\" \n                                    src=\"https:\/\/zahiralam.com\/blog\/wp-content\/plugins\/amp-copy-code-button\/copier.html#%23%20pip%20install%20-U%20%22huggingface_hub%5Bcli%5D%22%20%20%20%23%20keep%20pinned%20CLI%20working\">\n                            <button class=\"copy-button\" data-label=\"# pip install -U &quot;huggingface_hub[cli]&quot;   # keep pinned CLI working\"  placeholder disabled>Copy<\/button>\n                        <\/amp-iframe>\n                    <\/div>\n\n\n\n<p>Re-run:\n\n\n\n<div class=\"code-block-container\">\n                        <pre class=\"wp-block-code\"><code id=\"code-11\">bash .\/download_weights.sh<\/code><\/pre>\n                        <amp-iframe sandbox=\"allow-scripts\" width=\"94\" height=\"72\" frameborder=\"0\" \n                                    src=\"https:\/\/zahiralam.com\/blog\/wp-content\/plugins\/amp-copy-code-button\/copier.html#bash%20.%2Fdownload_weights.sh\">\n                            <button class=\"copy-button\" data-label=\"bash .\/download_weights.sh\"  placeholder disabled>Copy<\/button>\n                        <\/amp-iframe>\n                    <\/div>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">\ud83d\udd0d 8) Verify weights<\/h2>\n\n\n\n<div class=\"code-block-container\">\n                        <pre 
class=\"wp-block-code\"><code id=\"code-12\">du -h models\n# e.g. ~8.7G  models<\/code><\/pre>\n                        <amp-iframe sandbox=\"allow-scripts\" width=\"94\" height=\"72\" frameborder=\"0\" \n                                    src=\"https:\/\/zahiralam.com\/blog\/wp-content\/plugins\/amp-copy-code-button\/copier.html#du%20-h%20models%0A%23%20e.g.%20~8.7G%20%20models\">\n                            <button class=\"copy-button\" data-label=\"du -h models\n# e.g. ~8.7G  models\"  placeholder disabled>Copy<\/button>\n                        <\/amp-iframe>\n                    <\/div>\n\n\n\n<p>Typical artifacts include:\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>musetalk\/*<\/code><\/li>\n\n\n\n<li><code>musetalkV15\/unet.pth<\/code><\/li>\n\n\n\n<li><code>sd-vae\/diffusion_pytorch_model.bin<\/code><\/li>\n\n\n\n<li><code>whisper\/pytorch_model.bin<\/code><\/li>\n\n\n\n<li><code>dwpose\/dw-ll_ucoco_384.pth<\/code><\/li>\n\n\n\n<li><code>syncnet\/latentsync_syncnet.pt<\/code><\/li>\n<\/ul>\n\n\n\n<p>\u2705&nbsp;<strong>Result:<\/strong>&nbsp;All weights present (\u22488.7 GB total).\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">\u25b6\ufe0f 9) Run MuseTalk<\/h2>\n\n\n\n<div class=\"code-block-container\">\n                        <pre class=\"wp-block-code\"><code id=\"code-13\">python app.py\n<\/code><\/pre>\n                        <amp-iframe sandbox=\"allow-scripts\" width=\"94\" height=\"72\" frameborder=\"0\" \n                                    src=\"https:\/\/zahiralam.com\/blog\/wp-content\/plugins\/amp-copy-code-button\/copier.html#python%20app.py%0A\">\n                            <button class=\"copy-button\" data-label=\"python app.py\n\"  placeholder disabled>Copy<\/button>\n                        <\/amp-iframe>\n                    <\/div>\n\n\n\n<p>You can also run a one-shot inference:\n\n\n\n<div class=\"code-block-container\">\n                        <pre 
class=\"wp-block-code\"><code id=\"code-14\">bash inference.sh\n# or:\npython app.py --input_video sample.mp4 --input_audio sample.wav --output output.mp4<\/code><\/pre>\n                        <amp-iframe sandbox=\"allow-scripts\" width=\"94\" height=\"72\" frameborder=\"0\" \n                                    src=\"https:\/\/zahiralam.com\/blog\/wp-content\/plugins\/amp-copy-code-button\/copier.html#bash%20inference.sh%0A%23%20or%3A%0Apython%20app.py%20--input_video%20sample.mp4%20--input_audio%20sample.wav%20--output%20output.mp4\">\n                            <button class=\"copy-button\" data-label=\"bash inference.sh\n# or:\npython app.py --input_video sample.mp4 --input_audio sample.wav --output output.mp4\"  placeholder disabled>Copy<\/button>\n                        <\/amp-iframe>\n                    <\/div>\n\n\n\n<p>\u2705&nbsp;<strong>Result:<\/strong>&nbsp;Generates a realistic talking-face video from your inputs.\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\">\ud83e\udde9 10) Running the Real-Time Pipeline (MuseTalk v1.5)<\/h3>\n\n\n\n<p>After MuseTalk was successfully installed and verified, we moved from the one-shot demo (<code>python app.py<\/code>) to the&nbsp;<strong>real-time inference<\/strong>&nbsp;pipeline designed for faster lip-sync generation &#8211; suitable for live or interactive AI avatar applications.\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h4 class=\"wp-block-heading\">\ud83e\udde0 Step 1 \u2013 Configure your avatar<\/h4>\n\n\n\n<p>Open&nbsp;<code>configs\/inference\/realtime.yaml<\/code>&nbsp;and set up your avatar block:\n\n\n\n<div class=\"code-block-container\">\n                        <pre class=\"wp-block-code\"><code id=\"code-15\">avator_1:\n  preparation: True          # run once to build cached features\n  bbox_shift: 5\n  video_path: &quot;data\/video\/1-prisha.mp4&quot;   # or &quot;data\/video\/1-prisha.png&quot;\n  
audio_clips:\n    audio_0: &quot;data\/audio\/sumi.wav&quot;\n<\/code><\/pre>\n                        <amp-iframe sandbox=\"allow-scripts\" width=\"94\" height=\"72\" frameborder=\"0\" \n                                    src=\"https:\/\/zahiralam.com\/blog\/wp-content\/plugins\/amp-copy-code-button\/copier.html#avator_1%3A%0A%20%20preparation%3A%20True%20%20%20%20%20%20%20%20%20%20%23%20run%20once%20to%20build%20cached%20features%0A%20%20bbox_shift%3A%205%0A%20%20video_path%3A%20%22data%2Fvideo%2F1-prisha.mp4%22%20%20%20%23%20or%20%22data%2Fvideo%2F1-prisha.png%22%0A%20%20audio_clips%3A%0A%20%20%20%20audio_0%3A%20%22data%2Faudio%2Fsumi.wav%22%0A\">\n                            <button class=\"copy-button\" data-label=\"avator_1:\n  preparation: True          # run once to build cached features\n  bbox_shift: 5\n  video_path: &quot;data\/video\/1-prisha.mp4&quot;   # or &quot;data\/video\/1-prisha.png&quot;\n  audio_clips:\n    audio_0: &quot;data\/audio\/sumi.wav&quot;\n\"  placeholder disabled>Copy<\/button>\n                        <\/amp-iframe>\n                    <\/div>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>\ud83d\udd38&nbsp;<strong>Note:<\/strong><br>The&nbsp;<code>video_path<\/code>&nbsp;can point to either<br>\u2022 a short&nbsp;<strong>video file<\/strong>&nbsp;(<code>.mp4<\/code>,&nbsp;<code>.mov<\/code>, etc.), which captures natural head pose and lighting &#8211; often more stable results, or<br>\u2022 a single&nbsp;<strong>image<\/strong>&nbsp;(<code>.png<\/code>,&nbsp;<code>.jpg<\/code>), which is faster to prepare but produces a static-face style output.<br>Both work. 
MuseTalk automatically extracts frames and face regions based on what you provide.\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h4 class=\"wp-block-heading\">\u2699\ufe0f Step 2 \u2013 Prepare the avatar<\/h4>\n\n\n\n<p>Run the initial preparation and caching pass:\n\n\n\n<div class=\"code-block-container\">\n                        <pre class=\"wp-block-code\"><code id=\"code-16\">sh inference.sh v1.5 realtime<\/code><\/pre>\n                        <amp-iframe sandbox=\"allow-scripts\" width=\"94\" height=\"72\" frameborder=\"0\" \n                                    src=\"https:\/\/zahiralam.com\/blog\/wp-content\/plugins\/amp-copy-code-button\/copier.html#sh%20inference.sh%20v1.5%20realtime\">\n                            <button class=\"copy-button\" data-label=\"sh inference.sh v1.5 realtime\"  placeholder disabled>Copy<\/button>\n                        <\/amp-iframe>\n                    <\/div>\n\n\n\n<p>This performs one-time feature extraction and creates the cache folder:\n\n\n\n<div class=\"code-block-container\">\n                        <pre class=\"wp-block-code\"><code id=\"code-17\">\/results\/v15\/avatars\/avator_1\/\n\u251c\u2500\u2500 avator_info.json\n\u251c\u2500\u2500 latents.pt\n\u251c\u2500\u2500 coords.pkl\n\u2514\u2500\u2500 masks\/ ...<\/code><\/pre>\n                        <amp-iframe sandbox=\"allow-scripts\" width=\"94\" height=\"72\" frameborder=\"0\" \n                                    src=\"https:\/\/zahiralam.com\/blog\/wp-content\/plugins\/amp-copy-code-button\/copier.html#%2Fresults%2Fv15%2Favatars%2Favator_1%2F%0A%E2%94%9C%E2%94%80%E2%94%80%20avator_info.json%0A%E2%94%9C%E2%94%80%E2%94%80%20latents.pt%0A%E2%94%9C%E2%94%80%E2%94%80%20coords.pkl%0A%E2%94%94%E2%94%80%E2%94%80%20masks%2F%20...\">\n                            <button class=\"copy-button\" data-label=\"\/results\/v15\/avatars\/avator_1\/\n\u251c\u2500\u2500 avator_info.json\n\u251c\u2500\u2500 
latents.pt\n\u251c\u2500\u2500 coords.pkl\n\u2514\u2500\u2500 masks\/ ...\"  placeholder disabled>Copy<\/button>\n                        <\/amp-iframe>\n                    <\/div>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h4 class=\"wp-block-heading\">\u26a1 Step 3 \u2013 Switch to fast mode<\/h4>\n\n\n\n<p>Once the avatar has been prepared, set:\n\n\n\n<div class=\"code-block-container\">\n                        <pre class=\"wp-block-code\"><code id=\"code-18\">preparation: False<\/code><\/pre>\n                        <amp-iframe sandbox=\"allow-scripts\" width=\"94\" height=\"72\" frameborder=\"0\" \n                                    src=\"https:\/\/zahiralam.com\/blog\/wp-content\/plugins\/amp-copy-code-button\/copier.html#preparation%3A%20False\">\n                            <button class=\"copy-button\" data-label=\"preparation: False\"  placeholder disabled>Copy<\/button>\n                        <\/amp-iframe>\n                    <\/div>\n\n\n\n<p>Then run fast inference:\n\n\n\n<div class=\"code-block-container\">\n                        <pre class=\"wp-block-code\"><code id=\"code-19\">python -m scripts.realtime_inference --version v15 \\\n  --inference_config configs\/inference\/realtime.yaml \\\n  --skip_save_images<\/code><\/pre>\n                        <amp-iframe sandbox=\"allow-scripts\" width=\"94\" height=\"72\" frameborder=\"0\" \n                                    src=\"https:\/\/zahiralam.com\/blog\/wp-content\/plugins\/amp-copy-code-button\/copier.html#python%20-m%20scripts.realtime_inference%20--version%20v15%20%5C%0A%20%20--inference_config%20configs%2Finference%2Frealtime.yaml%20%5C%0A%20%20--skip_save_images\">\n                            <button class=\"copy-button\" data-label=\"python -m scripts.realtime_inference --version v15 \\\n  --inference_config configs\/inference\/realtime.yaml \\\n  --skip_save_images\"  placeholder disabled>Copy<\/button>\n                        <\/amp-iframe>\n   
                 <\/div>\n\n\n\n<p><em>(For&nbsp;<code>--version<\/code>, the valid value is&nbsp;<code>v15<\/code>, not&nbsp;<code>v1.5<\/code>;&nbsp;<code>v1.5<\/code>&nbsp;will throw an \u201cinvalid choice\u201d error.)<\/em>\n\n\n\n<p>This command reuses the cached avatar, skips per-frame PNG writes, and so measures inference speed on the GPU without the image-saving overhead.\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h4 class=\"wp-block-heading\">\ud83d\udcca Example timing results (RunPod A5000, 59-second audio)<\/h4>\n\n\n\n<figure class=\"wp-block-table\"><table><thead><tr><th>Mode<\/th><th>Command<\/th><th><code>preparation<\/code><\/th><th>Media Type<\/th><th>Total Time<\/th><\/tr><\/thead><tbody><tr><td>First run<\/td><td><code>sh inference.sh v1.5 realtime<\/code><\/td><td><code>True<\/code><\/td><td>image\/video<\/td><td>\u2248 90 s<\/td><\/tr><tr><td>Cached run<\/td><td><code>sh inference.sh v1.5 realtime<\/code><\/td><td><code>False<\/code><\/td><td>image\/video<\/td><td>\u2248 160 s (includes ffmpeg &amp; disk I\/O)<\/td><\/tr><tr><td>Fast mode<\/td><td><code>python -m scripts.realtime_inference --version v15 --skip_save_images<\/code><\/td><td><code>False<\/code><\/td><td>image\/video<\/td><td><strong>\u2248 49 s<\/strong>&nbsp;(faster than real time)<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h4 class=\"wp-block-heading\">\ud83c\udfac Outcome<\/h4>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>preparation: True<\/code>&nbsp;\u2192 builds and caches avatar geometry from image or video<\/li>\n\n\n\n<li><code>preparation: False<\/code>&nbsp;\u2192 reuses cached data for fast inference<\/li>\n\n\n\n<li><code>--skip_save_images<\/code>&nbsp;\u2192 disables PNG\/ffmpeg steps for pure model speed<\/li>\n<\/ul>\n\n\n\n<p>On a RunPod A5000, MuseTalk v1.5 in fast mode processed the 59-second clip in about 49 s, slightly faster than real time, using either a static image or a short video for&nbsp;<code>video_path<\/code>.\n\n\n\n<hr 
class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\">\ud83e\udde9 11) Monitor GPU Usage (Optional)<\/h3>\n\n\n\n<div class=\"code-block-container\">\n                        <pre class=\"wp-block-code\"><code id=\"code-20\">watch -n 1 nvidia-smi<\/code><\/pre>\n                        <amp-iframe sandbox=\"allow-scripts\" width=\"94\" height=\"72\" frameborder=\"0\" \n                                    src=\"https:\/\/zahiralam.com\/blog\/wp-content\/plugins\/amp-copy-code-button\/copier.html#watch%20-n%201%20nvidia-smi\">\n                            <button class=\"copy-button\" data-label=\"watch -n 1 nvidia-smi\"  placeholder disabled>Copy<\/button>\n                        <\/amp-iframe>\n                    <\/div>\n\n\n\n<p>\u2705 Refreshes GPU status every second, which is useful for verifying that MuseTalk is using the GPU and for monitoring VRAM.\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">\ud83e\udded Environment Summary (Working)<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table><thead><tr><th>Component<\/th><th>Version \/ Setting<\/th><\/tr><\/thead><tbody><tr><td><strong>GPU<\/strong><\/td><td><strong>NVIDIA A5000<\/strong><\/td><\/tr><tr><td>RunPod Template<\/td><td><code>sub38-pod<\/code><\/td><\/tr><tr><td>Base Image<\/td><td><code>runpod\/pytorch:2.0.1-py3.10-cuda11.8.0-devel<\/code><\/td><\/tr><tr><td>Python Env<\/td><td><strong><code>\/muse_env<\/code><\/strong>&nbsp;(activated via&nbsp;<code>source \/muse_env\/bin\/activate<\/code>)<\/td><\/tr><tr><td><code>which python<\/code><\/td><td><code>\/muse_env\/bin\/python<\/code>&nbsp;(after activation)<\/td><\/tr><tr><td>PyTorch<\/td><td><code>2.0.1<\/code>&nbsp;(CUDA&nbsp;<strong>11.8<\/strong>)<\/td><\/tr><tr><td>huggingface_hub<\/td><td><code>0.25.2<\/code>&nbsp;(<strong>CLI available<\/strong>)<\/td><\/tr><tr><td>FFmpeg<\/td><td>Static 
build&nbsp;<code>7.0.2<\/code>&nbsp;on&nbsp;<code>PATH<\/code><\/td><\/tr><tr><td>Total Model Size<\/td><td>~<strong>8.7 GB<\/strong><\/td><\/tr><tr><td>MuseTalk<\/td><td><strong>Working end-to-end \u2705<\/strong><\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">\u2705 Key Takeaways<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Match PyTorch\/CUDA<\/strong>&nbsp;with the base image (CU118 for this setup).<\/li>\n\n\n\n<li><strong>Pin&nbsp;<code>huggingface_hub==0.25.2<\/code><\/strong>&nbsp;to retain&nbsp;<code>huggingface-cli<\/code>.<\/li>\n\n\n\n<li><strong>Disable self-upgrades<\/strong>&nbsp;inside helper scripts that could break the CLI.<\/li>\n\n\n\n<li><strong>Install FFmpeg<\/strong>&nbsp;manually and export to&nbsp;<code>PATH<\/code>.<\/li>\n\n\n\n<li><strong>Activate the correct Python env<\/strong>&nbsp;(<code>\/muse_env<\/code>) so installs go where you expect.<\/li>\n\n\n\n<li><strong>Verify weights<\/strong>&nbsp;(~8.7 GB) before first run.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">\ud83e\uddef Quick Troubleshooting<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong><code>ModuleNotFoundError<\/code>&nbsp;(mmcv\/mmpose)<\/strong><br>Ensure you ran the&nbsp;<code>openmim<\/code>&nbsp;installs&nbsp;<strong>after<\/strong>&nbsp;activating your env and with the final PyTorch\/CUDA versions pinned.<\/li>\n\n\n\n<li><strong><code>ffmpeg: command not found<\/code><\/strong><br>Re-export&nbsp;<code>PATH<\/code>, since new terminals don\u2019t inherit it. Consider adding it to&nbsp;<code>~\/.bashrc<\/code>: <code>echo 'export PATH=\"'\"$FFMPEG_PATH\"':$PATH\"' &gt;&gt; ~\/.bashrc<\/code><\/li>\n\n\n\n<li><strong>Slow\/partial weight downloads<\/strong><br>Re-run&nbsp;<code>download_weights.sh<\/code>. 
For large files,&nbsp;<code>huggingface-cli download<\/code>&nbsp;supports resuming.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<p>With the above, MuseTalk runs reliably on&nbsp;<strong>RunPod A5000<\/strong>&nbsp;using the&nbsp;<strong>PyTorch 2.0.1 + CUDA 11.8<\/strong>&nbsp;image: no CUDA mismatches, a working Hugging Face CLI, FFmpeg on&nbsp;<code>PATH<\/code>, and all weights in place.\n","protected":false},"excerpt":{"rendered":"<p>MuseTalk is a real-time talking-face system that combines pose estimation, voice analysis, and lip-sync. Getting it running smoothly on a cloud GPU like&nbsp;RunPod (NVIDIA A5000)&nbsp;can [&#8230;]<\/p>\n","protected":false},"author":1,"featured_media":1646,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[24],"tags":[279,221,18],"class_list":["post-1608","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-artificial-intelligence","tag-live-avatar","tag-talking-avatars","tag-ubuntu"],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/zahiralam.com\/blog\/wp-json\/wp\/v2\/posts\/1608","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/zahiralam.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/zahiralam.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/zahiralam.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/zahiralam.com\/blog\/wp-json\/wp\/v2\/comments?post=1608"}],"version-history":[{"count":15,"href":"https:\/\/zahiralam.com\/blog\/wp-json\/wp\/v2\/posts\/1608\/revisions"}],"predecessor-version":[{"id":1645,"href":"https:\/\/zahiralam.com\/blog\/wp-json\/wp\/v2\/posts\/1608\/revisions\/1645"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/zahiralam.com\/blog\/wp-json\/wp\/v2\/media\/1646"}],"wp:attachment":[{"href":"https:\/\/zahiralam.com\/blog\/wp-json\
/wp\/v2\/media?parent=1608"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/zahiralam.com\/blog\/wp-json\/wp\/v2\/categories?post=1608"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/zahiralam.com\/blog\/wp-json\/wp\/v2\/tags?post=1608"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}