This workflow uses the Wan Fantasy Talking Model for lip syncing.
NOTE:
The lip sync sometimes drifts out of sync with this model. I hope Alibaba will release a fix, and I will update the workflow if one comes along.
In the meantime, please check out MultiTalk, which has no synchronization issues and works a lot better at the moment. See link below.
The resulting lip sync looks very natural.
Input: an audio file with a voice and a photo of someone's face (a close-up works better).
The workflow creates a video by animating the photo and syncing it to the voice.
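Since the video has to cover the whole voice clip, the number of frames to generate follows from the audio length and the frame rate. A minimal sketch of that arithmetic in Python; the function name is only illustrative, and the 25 fps in the example is a placeholder rather than the value this workflow uses, so substitute the fps set in your own copy:

```python
import math

def frames_for_audio(duration_seconds: float, fps: float) -> int:
    """Smallest whole number of frames that covers the full audio clip."""
    if duration_seconds < 0 or fps <= 0:
        raise ValueError("duration must be non-negative and fps must be positive")
    return math.ceil(duration_seconds * fps)

# Placeholder values: a 4-second voice clip at an assumed 25 fps.
print(frames_for_audio(4.0, 25))
```

Rounding up rather than down keeps the last fraction of a second of audio from being cut off.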
You may want to upscale the video with your favorite upscaler.
LIPSYNC using FantasyTalking model (Alibaba)
Wan video model
-----------------
https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/diffusion_models
FantasyTalking model
--------------------
https://huggingface.co/Kijai/WanVideo_comfy/tree/main
This workflow was tested with 24 GB of VRAM and 64 GB of RAM.
Generating 100 frames at 512x512 with 15 steps takes about 9 to 10 minutes.
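As a rough sanity check on the timing above, those figures work out to about 36 to 40 seconds per sampling step. A minimal sketch of the arithmetic in Python; the function name is only illustrative, and it assumes the run time is spread evenly across the steps, which ignores model loading and video encoding:

```python
def seconds_per_step(total_minutes: float, steps: int) -> float:
    """Average wall-clock seconds per sampling step, assuming an even spread."""
    if steps <= 0:
        raise ValueError("steps must be positive")
    return total_minutes * 60.0 / steps

# Figures reported above: 15 steps, 9 to 10 minutes for 100 frames at 512x512.
low = seconds_per_step(9, 15)
high = seconds_per_step(10, 15)
print(f"{low:.0f} to {high:.0f} seconds per step")
```

Expect different numbers on other hardware, resolutions, or frame counts.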
Description
There was an issue with video/audio syncing in version 1.0. Hopefully that is fixed now. I had to increase the fps, based on feedback from the model's author.