[I FORGOT TO ADD A WHITE BG TO THE CAPTIONS SHOWCASING THE WORKFLOWS]
2 Steps High LoRA + 2 Steps Low LoRA / 4 Steps High LoRA + 4 Steps Low LoRA
2 Steps High + 2 Steps High LoRA + 2 Steps Low LoRA / 4 Steps High + 4 Steps Low
A simple workflow I've put together, heavily modified from the Smooth Workflow by Digital Pastel.
Because of that, it's very easy to pick up and modify.
Works on my 12GB RTX 3080 Ti with 32GB of RAM; at 6/9 steps it takes 200-400 seconds to generate a 5-second 16fps 480p video, upscaled to 960p 32fps.
I've found the quality/speed sweet spot to be 9 steps at 480p.
Because it uses 3 samplers, one without LoRAs and two with, you can generate videos with lots of motion and detail while keeping the generation time low.
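To make the split concrete, here's a hypothetical sketch of how a step budget could be partitioned across the three chained samplers; the function and stage labels are illustrative, not names from the workflow itself:

```python
# Hypothetical sketch of splitting the total step count across the three
# chained samplers (the actual node wiring lives in the workflow JSON).
def split_steps(total, no_lora=2, high_lora=2):
    """Return (stage label, start step, end step) for each sampler stage."""
    low_lora = total - no_lora - high_lora
    assert low_lora > 0, "total steps too low for a 3-sampler split"
    return [
        ("high noise, no LoRA", 0, no_lora),
        ("high noise + LIGHT2X High LoRA", no_lora, no_lora + high_lora),
        ("low noise + LIGHT2X Low LoRA", no_lora + high_lora, total),
    ]

# e.g. a 9-step budget split 3/3/3, as discussed in the comments below
for label, start, end in split_steps(9, no_lora=3, high_lora=3):
    print(f"{label}: steps {start}-{end}")
```

The first no-LoRA stage is what restores motion that the speed LoRAs tend to flatten; the remaining steps then run cheaply with the 4-step LoRAs.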
I'll try to put out a list of the models I use/recommend in the workflow with their respective links. For now, here's the list:
Wan2.2 I2V 14B High/Low Q4_K_M .gguf
LIGHT2X 4-Step LoRA High/Low .safetensors
UMT5 XXL FP8 E4M3FN Scaled .safetensors
Wan 2.1 VAE .safetensors
RealESRGAN x2 Plus .pth
SageAttention (recommended)
It can generate NSFW, but I won't test for it.
Comments (14)
Worth trying on an RTX 3090 with 24GB VRAM? I'll do it anyway and get back to you.
Please do let me know; it should work just fine.
You could also run the FP8 scaled version instead of the GGUF.
@LeonardoFritoAlt Definitely works with my rtx 3090. A bit slow—maybe taking about 7-10 minutes to generate—but the quality is well worth it. Pretty much same settings but w/ additional LoRAs.
Thanks for this!
I liked this workflow, it makes my dog's videos have more movement (no LoRAs in those videos), but for... more private things, sometimes it looks like a Charlie Chaplin movie. There are even LoRAs that lose effectiveness and quality with this workflow. For those things, I stick with the original from Digital Pastel.
Yeah, I haven't and won't test for NSFW. Also, default Wan is a bit weak for NSFW; that's why there are tons of LoRAs for all sorts of positions and movements. I suggest using NSFW Helper and Genital Helper for that.
I'd like to add a suggestion to save VRAM and prevent OOM by purging models & cache (optional; the overhead is negligible) in 3 locations:
After major steps:
1. Video Combine [1] > Purge
2. Upscale > Purge (optionally clear only the model instead of the cache to save time [7 seconds in my case])
3. Video Combine [2] > Purge
(1) You no longer need the Wan models; free up space for the upscale model.
(2) You no longer need the upscale model for frame interpolation; clearing everything before RIFE could increase efficiency.
(3) Clearing everything before a new generation.
(More testing needed; it's really just the upscaling causing OOM.)
RTX 3090 24GB
Generation time [16fps]: 400s-500s
Total: 800s-900s
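The three purge points suggested above can be sketched as a toy simulation; load() and purge() here are stand-ins for whatever model-loader and VRAM-cleanup nodes the workflow actually uses, not a real API:

```python
# Toy simulation of the suggested purge ordering: each model family is
# loaded only for its stage and purged at one of the three points above.
loaded = set()

def load(*models):
    loaded.update(models)

def purge(*models):
    loaded.difference_update(models)

load("wan_high", "wan_low")      # sampling with the Wan models
purge("wan_high", "wan_low")     # (1) after Video Combine [1]
load("upscale_model")            # upscaling pass
purge("upscale_model")           # (2) before RIFE frame interpolation
load("rife")                     # interpolation pass
purge("rife")                    # (3) after Video Combine [2]
print(loaded)                    # nothing left resident for the next run
```

The point of the ordering is that no two heavyweight models need to be resident at once, which is what matters on cards where the upscale step alone triggers OOM.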
Wow, wait a second, 800s for a 5s video is a lot for a 3090, especially since on my 3080 Ti I get 200-300s total. If you want to speed it up, check out the GGUF models instead of the full versions and use the included 4-step LoRAs; they massively speed things up and free RAM.
About your suggestion: yes, I'll make a low-VRAM option where it purges after those steps, but for 12GB+ users like me, purging after every step slows the generation process down because it has to unload, then load things back up, tons of times; I can go from 200s to 500s generation times just like that.
Saw your other comment: oh, that makes sense now lol. 10 min is not ideal for a 3090, but you're not using LoRAs, so yeah, that's about expected. BTW, if you're not using any LoRAs, you should stick to the 2-sampler method; the 3-sampler method is built around fixing the LoRA not giving any movement when used ;)
@LeonardoFritoAlt Ohh... I'll try to find time to play around with the sampler arrangement. I'm already using the GGUF models and the 4-step LoRAs. But I'm guessing I'll stick with the 3-sampler method to make scenes look more alive. Not entirely sure where I'm bottlenecking; I'll figure it out eventually, maybe.
Off-note: I found a commit that edits comfy_extras/nodes_upscale_model.py to use the GPU instead of the CPU: https://github.com/comfyanonymous/ComfyUI/commit/5cd75306378ab6e8d1760a017bd1ca369d950324
> "Fix GPU utilization in upscale model node by keeping tensors on GPU. Added output_device parameter to tiled_scale function to prevent unnecessary CPU transfers, resulting in 2x faster processing. Commented out model CPU offloading to maintain GPU acceleration throughout the pipeline."
i9-10900KF (3.70 GHz, 10-Core, 20-Thread)
ASUS ROG Strix GeForce RTX 3090-24 GB GDDR6X VRAM
64 GB DDR4 @ 3600 MHz
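The gist of that commit can be illustrated with a hedged sketch: if tile results default to a CPU output device, every tile pays a device copy, while keeping the output on the compute device skips them all. Function and parameter names mirror the commit message, not ComfyUI's actual code:

```python
# Illustration of the per-tile transfer cost the commit removes. The real
# fix passes output_device through to tiled_scale; this sketch just counts
# the copies that an output_device mismatch would incur.
def tiled_scale(tiles, model, output_device="cpu", compute_device="cuda"):
    transfers = 0
    out = []
    for tile in tiles:
        result = model(tile)                 # runs on compute_device
        if output_device != compute_device:
            transfers += 1                   # one device copy per tile
        out.append(result)
    return out, transfers

# Old default: 64 tiles -> 64 GPU-to-CPU copies. New behavior: none.
_, old = tiled_scale(range(64), lambda t: t * 2)
_, new = tiled_scale(range(64), lambda t: t * 2, output_device="cuda")
print(old, new)  # 64 0
```

That per-tile round trip is why the commit message reports roughly 2x faster upscaling.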
Thanks. I struggled to get this 3-sampler scenario to work, but now it's good. I removed the steps group because ComfyUI-LogicUtils wouldn't install, and removed the upscaling and VRAM-cleaning nodes; it works flawlessly at 0.9MP resolution. I have to try 3+2+2 steps; 3+3+3 is too slow even on a 3090.
Update: change your multiplier on the RIFE node to 2; 3 makes longer, slow-motion videos.
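The multiplier advice checks out with simple frame arithmetic, assuming a 16fps base clip combined at 32fps as in this workflow (the helper below is hypothetical, just for illustration):

```python
# Why a RIFE multiplier of 2 matches this workflow: the base clip is 16 fps
# and the final Video Combine outputs 32 fps, so 2x keeps real-time speed
# while 3x stretches the clip into slow motion.
def output_duration(seconds, base_fps, multiplier, output_fps):
    frames = seconds * base_fps * multiplier   # interpolated frame count
    return frames / output_fps

print(output_duration(5, 16, 2, 32))  # 5.0 -> plays in real time
print(output_duration(5, 16, 3, 32))  # 7.5 -> longer, slow-motion
```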
I'm curious now: how long is your 3090 taking? Because somehow I'm getting faster generations than 3090 users, and I'm on a 3080 Ti.
Also, install SageAttention: How To Install Sage Attention 2.2 On ComfyUI Portable And Desktop Version
@LeonardoFritoAlt Approx. 480s at 2-2-2 steps and 590s at 3-2-2 steps; it depends on which model I use, and fp8 scaled makes it a bit longer. That is without upscaling.
I already have Sage and Triton; that group of nodes makes my 3090 go brrrr at 360W, the 100% TDP limit, ignoring my undervolt settings.
You have all of these examples where you pair two videos. What are you comparing between the two? Which is which?
My bad, I forgot to add a white background to the captions, so they're invisible. They are:
2 Steps High LoRA + 2 Steps Low LoRA / 4 Steps High LoRA + 4 Steps Low LoRA
2 Steps High + 2 Steps High LoRA + 2 Steps Low LoRA / 4 Steps High + 4 Steps Low