This workflow uses the Wan FantasyTalking model for lip syncing.
NOTE:
The lip sync sometimes drifts out of sync with this model. I hope Alibaba will release a fix, and I will update this workflow if one comes along.
In the meantime, please check MultiTalk, which has no synchronization issues and works a lot better at the moment. See the link below.
When it works, the lip sync looks very natural.
Input: an audio file with a voice and a photo of someone's face (a close-up works better).
The workflow creates a video by animating the photo and syncing it to the voice.
You may want to upscale the video with your favorite upscaler.
LIPSYNC using FantasyTalking model (Alibaba)
Wan video model
---------------
https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/diffusion_models
FantasyTalking model
--------------------
https://huggingface.co/Kijai/WanVideo_comfy/tree/main
This workflow was tested with 24GB VRAM and 64GB RAM.
Generating 100 frames at 512x512 with 15 steps takes about 9 to 10 minutes.
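To get a rough idea of how long a longer clip might take, the timing above can be scaled by frame count. This is only a back-of-the-envelope sketch: it assumes render time grows linearly with the number of frames on similar hardware, which may not hold in practice. The function names and the 25 fps value are placeholders for illustration, not settings taken from the workflow.

```python
import math
from typing import Tuple

# Benchmark quoted above: 100 frames at 512x512, 15 steps, about 9 to 10 minutes.
BENCH_FRAMES = 100
BENCH_MINUTES = (9.0, 10.0)

def frames_for_audio(duration_s: float, fps: float) -> int:
    """Frames needed to cover an audio clip. The fps value is an assumption."""
    return math.ceil(duration_s * fps)

def estimate_minutes(num_frames: int) -> Tuple[float, float]:
    """Scale the quoted benchmark linearly (assumed, not measured)."""
    scale = num_frames / BENCH_FRAMES
    return (BENCH_MINUTES[0] * scale, BENCH_MINUTES[1] * scale)

if __name__ == "__main__":
    # 25 fps is a placeholder; use whatever frame rate the workflow is set to.
    n = frames_for_audio(duration_s=8.0, fps=25.0)
    low, high = estimate_minutes(n)
    print(f"{n} frames, roughly {low:.0f} to {high:.0f} minutes")
```

Treat the result as an order-of-magnitude guide only. Resolution, step count, and available VRAM all change the real figure.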
Description
I cleaned up some things that were causing the video and audio to get out of sync. FantasyTalking is very sensitive to its configuration, and this one seems to produce good results, so please try to stick to the settings in this workflow. I also made it easier to continue a lip sync across several runs on low-VRAM systems.
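If you continue a long lip sync in several runs, it can help to plan the frame ranges up front. The sketch below only shows the arithmetic of splitting a clip into consecutive segments. It does not describe how the workflow's nodes implement continuation, and `plan_segments` and `frames_per_run` are hypothetical names used for illustration.

```python
from typing import List, Tuple

def plan_segments(total_frames: int, frames_per_run: int) -> List[Tuple[int, int]]:
    """Split total_frames into consecutive (start, end) ranges, end exclusive.

    frames_per_run is whatever your VRAM can handle in one pass (an assumption
    you have to find by testing; it is not a value from the workflow).
    """
    if total_frames < 0 or frames_per_run <= 0:
        raise ValueError("total_frames must be >= 0 and frames_per_run must be > 0")
    return [
        (start, min(start + frames_per_run, total_frames))
        for start in range(0, total_frames, frames_per_run)
    ]

if __name__ == "__main__":
    # Example: a 250-frame clip rendered 100 frames at a time.
    for start, end in plan_segments(250, 100):
        print(f"run frames {start} to {end - 1}")
```

Each range then corresponds to one run, with the matching slice of audio used as input for that run.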