WAN2.2 | LLM Enhancer | T2I -> I2V | Detailer/Upscale
Features:
Turn any Text-to-Image output into an Image-to-Video result.
Released version:
Z-Image (now also includes a QWEN video prompt enhancer)
Flux.Krea
QWEN
SDXL (Initial version)
In terms of speed vs quality, the latest Z-Image version is my personal favourite. The QWEN video prompter might need some additional fine-tuning.
Intended use:
Feed a fairly simple/short base prompt to an LLM (which generates an extensive prompt) to quickly preview multiple images from any Text-to-Image model (I usually generate 4 images).
Select the preferred image and feed it to a dual-pass upscaler, followed by a face detailer.
The final image is passed to the WAN 2.2 processing (dual pass, interpolation and a final upscale).
On an RTX 4070, the total runtime from start to finish is usually around 6 minutes.
(Note that the LLM prompt generation is fully optional and can be disabled with a switch.)
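The stage order above can be sketched as a simple Python pipeline. This is only an illustrative sketch of the flow, not actual ComfyUI node code; all class, stage, and function names here are made up for clarity, including the `use_llm_enhancer` flag standing in for the workflow's bypass switch.

```python
# Illustrative sketch of the workflow's stage order (names are hypothetical,
# not real ComfyUI node names).
from dataclasses import dataclass, field


@dataclass
class WorkflowSketch:
    use_llm_enhancer: bool = True  # the optional switch mentioned above
    stages: list = field(default_factory=list)

    def run(self, base_prompt: str) -> list:
        prompt = base_prompt
        if self.use_llm_enhancer:
            # LLM expands the short base prompt into an extensive one
            prompt = f"enhanced({prompt})"
            self.stages.append("llm_enhance")
        # 1) preview several T2I candidates and pick the preferred one
        self.stages.append("t2i_preview_x4")
        # 2) dual-pass upscale, then face detailing on the chosen image
        self.stages.extend(["upscale_pass_1", "upscale_pass_2", "face_detail"])
        # 3) WAN 2.2 image-to-video: dual pass, interpolation, final upscale
        self.stages.extend(
            ["wan_pass_1", "wan_pass_2", "interpolate", "final_upscale"]
        )
        return self.stages
```

Disabling the enhancer simply skips the first stage; everything downstream is unchanged, which matches how the bypass switch behaves in the workflow.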
Comments (9)
Is this adaptable to Qwen in any way? Thank you
Anything that 'can make an image' is suitable. I'll add a QWEN version soon.
I can't get this workflow (WAN22-ZIMAGE) to work. I can't install these missing custom nodes with Manager: Searge_LLM_Node & ReActorRestoreFace.
Even if I bypass them, it won't work. Any ideas?
The Searge LLM module comes from https://github.com/SeargeDP/ComfyUI_Searge_LLM (ComfyUI Manager should be able to resolve that), but you could also replace it with a newer prompt enhancer like TS QWen 3, or remove it entirely and rely on your own prompts. ReActorRestoreFace improves facial details, but you can also remove it and link the preceding node to the one after it.
The speed and quality of Z-Image are fantastic, but I have the feeling the compression is too high for WAN to get good video results.
In my experience these are quite a bit slower than stated on a 4090, taking 15-20 minutes and being a tad inconsistent in video results. I'm not sure if it's a comfyui update, if I wasn't able to get all the exact same models, or if it's some other issue. Do you have any ideas?
On my 4070 (Ti Super) the latest run took 16m 40s at 119 (WAN) steps; I would expect a 4090 to be faster for sure (running latest ComfyUI on stable).
@dutchit288 Putting each pass at 4 steps got me down to about 6m 40s, though prompt adherence is a little dicey. Any chance this could work with Hunyuan 1.5 as the video model? It seems to follow prompts better than WAN but is quite slow.
You have an error in the video prompt concatenator: it produces an empty prompt. Also, please replace the LoRA loader from Manager with rgthree's Power Lora Loader. The feature you are trying to use from the Manager loader doesn't work well here, since WAN LoRAs mostly don't need a trigger word; using a static prompt for the trigger word is much more convenient, and it offers a bit more privacy, because otherwise the names of all your LoRAs are sent to Civitai with every use.