This is the ComfyUI workflow I use daily on my RTX 5060 8GB. I'm sharing it because getting WAN 2.2 to produce consistent quality at this VRAM level took quite a bit of trial and error, and I figure some of that research might save others time. These settings work well for me — your results may vary depending on your GPU, drivers and system setup, but this is what I keep coming back to.
Also available as a 1-Click Pack — includes auto-installer, exclusive custom nodes, all models, and all dependencies ready to run. No downloads, no terminal, no debugging. Extract → Run → Generate. Available on my Ko-fi
| Free workflow | 1-Click Pack |
| --- | --- |
| ✓ Workflow JSON | ✓ Everything included |
| ✗ 7 standard nodes | ✓ Exclusive custom nodes |
| ✗ Manual node install | ✓ Auto-installed |
| ✗ Download 5 models | ✓ Auto-downloaded |
| ✗ Configure paths | ✓ Pre-configured |
| ✗ Debug compatibility | ✓ Tested & working |
| ✗ ~2hrs setup | ✓ 5 min extract & run |

What it does
Two-pass generation using the WAN 2.2 Remix V2.1 Q3_K_M GGUF models, followed by a post-process pipeline that upscales and smooths the output. The final video comes out at around 5 seconds, 2x the native resolution, with RIFE frame interpolation for smoother motion.
Honest about timing: on my RTX 5060 8GB with 32GB RAM the full pipeline takes around 10 minutes per video. The quality I get out of it is pretty decent for this class of hardware — good enough that I kept using it. If you want faster results and don't need the upscale or RIFE, skipping those brings generation time down to around 400 seconds.
What you need
System: 8GB VRAM minimum, 32GB RAM recommended (the model offloads to RAM between passes).
Models:
wan22RemixT2VI2V_i2vHighV21-Q3_K_M.gguf
wan22RemixT2VI2V_i2vLowV21-Q3_K_M.gguf
umt5-xxl-encoder-Q5_K_M.gguf
wan_2.1_vae.safetensors
4x-UltraSharp.pth
Custom nodes: ComfyUI-GGUF, KJNodes, ComfyUI-WanVideoWrapper, VideoHelperSuite, ComfyUI_Fill-Nodes, Comfy-WaveSpeed, rgthree-comfy
Custom nodes can be installed via ComfyUI Manager → Install Missing Nodes after loading the workflow.
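If you prefer the terminal over ComfyUI Manager, the same node packs can be cloned directly into `custom_nodes/`. The GitHub owners in the sketch below are my best recollection of where each pack lives, so verify the URLs before running anything:

```python
# Sketch: clone the workflow's custom node packs from a terminal instead of
# using ComfyUI Manager. The repo owners are my best guess from the pack
# names -- double-check each URL before cloning.
from pathlib import Path

CUSTOM_NODE_REPOS = [
    "https://github.com/city96/ComfyUI-GGUF",
    "https://github.com/kijai/ComfyUI-KJNodes",
    "https://github.com/kijai/ComfyUI-WanVideoWrapper",
    "https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite",
    "https://github.com/filliptm/ComfyUI_Fill-Nodes",
    "https://github.com/chengzeyi/Comfy-WaveSpeed",
    "https://github.com/rgthree/rgthree-comfy",
]

def clone_commands(custom_nodes_dir: str) -> list[list[str]]:
    """Build one `git clone` command per pack, targeting custom_nodes/."""
    base = Path(custom_nodes_dir)
    return [
        ["git", "clone", url, str(base / url.rsplit("/", 1)[-1])]
        for url in CUSTOM_NODE_REPOS
    ]

# Feed each command to subprocess.run(cmd, check=True) to actually clone.
```

After cloning, restart ComfyUI and confirm nothing still shows as missing.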
How to use
1. Drag the JSON into ComfyUI or use Load
2. In node 106, load your input image
3. In node 6, write your animation prompt
4. Queue and wait
Any image orientation works — the workflow auto-adapts to portrait, landscape or square. Very dark images or images with busy backgrounds tend to produce less consistent results.
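To make the auto-orientation concrete, here is roughly how bucket selection can work, sketched in Python. This is illustrative only, not the workflow's actual node logic; the square size is my own assumption, chosen to keep a pixel budget similar to 480×832:

```python
# Illustrative sketch of orientation-aware resolution bucketing. The 480x832
# base matches this workflow; the 624x624 square bucket is an assumption
# picked to stay near the same pixel count (480*832 ~= 624*624).
def pick_resolution(img_w: int, img_h: int) -> tuple[int, int]:
    if img_w > img_h:
        return (832, 480)   # landscape
    if img_h > img_w:
        return (480, 832)   # portrait
    return (624, 624)       # square
```

In the actual workflow this happens automatically, so there is nothing to configure.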
Why these settings
euler on HIGH, heun on LOW. This specific sampler pairing handles Q3_K_M quantization significantly better than other combinations I tested — most multi-step samplers accumulate precision errors that show as visible pixelation by step 4-5. Finding this took extensive testing across most available options.
CFG is split: 1.5 on HIGH, 1.0 on LOW. The split prevents shadow artifacts that appear with uniform CFG on Q3 models while maintaining facial structure consistency. More steps go to LOW (4) than HIGH (2) because of how WAN 2.2's architecture handles facial identity.
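For quick reference, here are the two-pass settings from above collected in one place. This is a plain summary dict, not ComfyUI API code:

```python
# Reference summary of the two sampling passes described above.
# These mirror the KSampler settings in the workflow JSON.
PASSES = {
    "HIGH": {
        "model": "wan22RemixT2VI2V_i2vHighV21-Q3_K_M.gguf",
        "sampler": "euler",
        "cfg": 1.5,
        "steps": 2,
    },
    "LOW": {
        "model": "wan22RemixT2VI2V_i2vLowV21-Q3_K_M.gguf",
        "sampler": "heun",
        "cfg": 1.0,
        "steps": 4,
    },
}

total_steps = sum(p["steps"] for p in PASSES.values())  # 6 steps end to end
```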
Post-process pipeline:
VAEDecode → FastUnsharpSharpen (0.4) → 4x-UltraSharp → ×0.5 scale → RIFE ×2 → H264 CRF17 @ 32fps

Net 2x upscale plus doubled frames. If you want even smoother output, change the RIFE multiplier in node 310 from 2 to 4 — no extra VRAM needed since RIFE runs on already-decoded frames.
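The net 2x claim is just arithmetic: the 4x model output times the ×0.5 scale gives 2x resolution, and RIFE ×2 doubles the frame count. A small sketch of that math, assuming WAN's usual 480×832 at 16 fps over ~5 s (81 frames); the exact interpolated frame count depends on the RIFE node:

```python
# Back-of-envelope output dimensions for the post-process chain above.
# Assumes the output fps is raised to match the RIFE multiplier; at a
# fixed fps a higher multiplier gives slow motion instead.
def postprocess_dims(w, h, frames, fps, upscale=4, downscale=0.5, rife_mult=2):
    out_w = int(w * upscale * downscale)   # 4x model, then x0.5 scale -> net 2x
    out_h = int(h * upscale * downscale)
    out_frames = frames * rife_mult        # approximate; RIFE inserts in-betweens
    out_fps = fps * rife_mult
    return out_w, out_h, out_frames, out_fps

print(postprocess_dims(480, 832, 81, 16))  # (960, 1664, 162, 32)
```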
NAG at scale 11, alpha 0.25, tau 2.5 — works through attention normalization so it doesn't trigger the shadow artifacts you'd get from pushing CFG higher. FBCache at threshold 0.12 for speed with minimal quality impact.
RTX 5000 series (Blackwell) — read this
SageAttention is patched on both models in auto mode. Don't change this to an explicit backend. On RTX 5060, 5070, 5080 and 5090 (sm_120 compute capability) the sageattn_qk_int8_pv_fp16_cuda backend crashes outright — those CUDA kernels haven't been compiled for Blackwell yet. auto detects the right implementation at runtime and works correctly.
If you're on an RTX 5000 series card, also use PyTorch cu130 (CUDA 13.0, compiled for sm_120) and launch ComfyUI with --disable-async-offload. The cu128 build underperforms on Blackwell and async offload has some instability on sm_120.
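The build choice can be expressed as a small rule. The cutoff at compute capability major 12 (sm_120 = Blackwell) is my assumption based on what worked here; with torch installed you would feed it `torch.cuda.get_device_capability(0)`:

```python
# Pick a PyTorch CUDA wheel variant from the device's compute capability.
# The >= 12 cutoff is an assumption: cu130 is simply what worked on my 5060,
# and cu128 is fine for pre-Blackwell cards.
def recommended_cuda_build(capability: tuple[int, int]) -> str:
    major, minor = capability
    if major >= 12:        # Blackwell (RTX 5000 series, sm_120)
        return "cu130"
    return "cu128"

# e.g. recommended_cuda_build(torch.cuda.get_device_capability(0))
print(recommended_cuda_build((12, 0)))  # cu130
```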
NVFP4 quantized WAN models are not available yet but when they are, Blackwell's FP4 tensor cores will handle them natively — expect a significant quality and speed uplift for this GPU family when that happens.
Variants
+ LightX2V — if you want faster generation at the cost of some quality, add the LightX2V LoRAs on top of this workflow:
HIGH: wan2.2_i2v_A14b_high_noise_lora_rank64_lightx2v_4step_1022.safetensors at strength 1.5
LOW: wan2.2_i2v_A14b_low_noise_lora_rank64_lightx2v_4step_1022.safetensors at strength 1.0
I find the base workflow without LoRAs gives better quality at the same step count, but LightX2V is useful when you want faster iteration.
+ SLG Face Lock — I've also written a custom node (SkipLayerGuidanceWAN) that adds zero-cost face lock on the LOW pass. With heun the difference is subtle since the corrector already stabilizes identity reasonably well, but it's there if you want it. I'll publish it as a separate post.
Want this workflow without the setup? The 1-Click Pack includes everything — auto-installer, models, nodes, optimized for 8GB. → Ko-fi
What didn't work
Noting these because they come up a lot and I spent time on all of them:
res_2m and res_3s: both pixelate from quantization error accumulation
uniform CFG 3.0: creates heavy shadow artifacts with Q3
MagCache: throws a division by zero with heun and the beta scheduler
karras scheduler: designed for noise-prediction architectures, doesn't fit WAN's flow matching
explicit SageAttention CUDA backend: crashes on Blackwell
Note on the negative prompt
Please don't remove the Chinese-language terms in the negative prompt. They're part of WAN's official negative conditioning and work at the model level, not just as text guidance. Removing them noticeably affects output quality.
v2.0 — 8 LoRA slots (rgthree), FastUnsharpSharpen, auto-orientation, fixed negative prompt + NAG conditioning, resolution corrected to 480×832
v1.0 — initial release
If something doesn't work for you or you get better results with different settings, drop it in the comments. Always curious what others are finding on different hardware.
Description
8 LoRA slots (4 HIGH + 4 LOW) using rgthree Lora Loader Stack
FastUnsharpSharpen added before upscale for noticeably sharper details
Auto-adapts to any image orientation (portrait, landscape, square) — no more forced cropping
Complete negative prompt with all Chinese + English terms for better quality
NAG conditioning properly configured
Resolution corrected to 480×832
New custom node required: rgthree-comfy