The compressed package contains two ComfyUI workflows:
1. Wan 2.1 T2V: wan2-t2v-upscale-v1.json
2. Wan 2.1 I2V: wan2-i2v-upscale-v1.json
Reference output:
On my RTX 4060 (8 GB VRAM) + 32 GB RAM, I2V: Prompt executed in 2807.42 seconds
On my RTX 5080 laptop (16 GB VRAM) + 32 GB RAM, I2V: Prompt executed in 1401.00 seconds
Requirements:
Models:
- wan2.1-t2v-14b-Q3_K_M.gguf (T2V) Put in: ComfyUI\models\unet
https://huggingface.co/city96/Wan2.1-T2V-14B-gguf/resolve/main/wan2.1-t2v-14b-Q3_K_M.gguf
- wan2.1-i2v-14b-480p-Q3_K_M.gguf (I2V) Put in: ComfyUI\models\unet
https://huggingface.co/city96/Wan2.1-I2V-14B-480P-gguf/resolve/main/wan2.1-i2v-14b-480p-Q3_K_M.gguf
- wan2.1_t2v_1.3B_fp16.safetensors (T2V model, used in the "v2v" workflow) Put in: ComfyUI\models\diffusion_models
- umt5-xxl-encoder-Q4_K_M.gguf (CLIP) Put in: ComfyUI\models\text_encoders
https://huggingface.co/city96/umt5-xxl-encoder-gguf/resolve/main/umt5-xxl-encoder-Q4_K_M.gguf
- umt5_xxl_fp8_e4m3fn_scaled.safetensors (CLIP; the GGUF encoder above can be used instead if you modify the "v2v" workflow) Put in: ComfyUI\models\text_encoders
- wan_2.1_vae.safetensors (VAE) Put in: ComfyUI\models\vae
- clip_vision_h.safetensors (CLIP VISION) Put in: ComfyUI\models\clip_vision
- RealESRGAN_x2plus.pth (Upscale Model) Put in: ComfyUI\models\upscale_models
https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.1/RealESRGAN_x2plus.pth
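These downloads can be scripted. Below is a minimal sketch, assuming a recent huggingface_hub package (pip install huggingface_hub); COMFYUI_ROOT is a placeholder you should point at your own install.

```python
# Minimal sketch: fetch the Wan 2.1 models listed above into a ComfyUI tree.
from pathlib import Path
from urllib.request import urlretrieve
from huggingface_hub import hf_hub_download

COMFYUI_ROOT = Path("ComfyUI")  # adjust to your installation

HF_FILES = [
    # (repo_id, filename, target subfolder)
    ("city96/Wan2.1-T2V-14B-gguf", "wan2.1-t2v-14b-Q3_K_M.gguf", "models/unet"),
    ("city96/Wan2.1-I2V-14B-480P-gguf", "wan2.1-i2v-14b-480p-Q3_K_M.gguf", "models/unet"),
    ("city96/umt5-xxl-encoder-gguf", "umt5-xxl-encoder-Q4_K_M.gguf", "models/text_encoders"),
]

for repo_id, filename, subdir in HF_FILES:
    target = COMFYUI_ROOT / subdir
    target.mkdir(parents=True, exist_ok=True)
    hf_hub_download(repo_id=repo_id, filename=filename, local_dir=str(target))

# The upscale model is hosted on GitHub, not Hugging Face.
esrgan_dir = COMFYUI_ROOT / "models" / "upscale_models"
esrgan_dir.mkdir(parents=True, exist_ok=True)
urlretrieve(
    "https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.1/RealESRGAN_x2plus.pth",
    str(esrgan_dir / "RealESRGAN_x2plus.pth"),
)
```

The remaining safetensors files (VAE, CLIP vision, fp8 text encoder) follow the same pattern once you know their repo IDs; no URLs are given for them above, so they are not included in the sketch.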
ComfyUI Nodes:
- rgthree-comfy
- ComfyUI-KJNodes
- ComfyUI-VideoHelperSuite
- ComfyUI-Frame-Interpolation
- Comfyui-Memory_Cleanup (Not required if you modify the workflow)
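If you prefer not to use ComfyUI-Manager, the node packs can be cloned directly. A minimal sketch follows; the GitHub owners are assumptions, so verify them (or simply install the packs by name through ComfyUI-Manager):

```python
# Minimal sketch: clone the custom node packs into ComfyUI/custom_nodes.
# Repository owners are assumptions -- verify before use.
import subprocess
from pathlib import Path

CUSTOM_NODES = Path("ComfyUI") / "custom_nodes"  # adjust to your installation
CUSTOM_NODES.mkdir(parents=True, exist_ok=True)

REPOS = [
    "https://github.com/rgthree/rgthree-comfy",
    "https://github.com/kijai/ComfyUI-KJNodes",
    "https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite",
    "https://github.com/Fannovel16/ComfyUI-Frame-Interpolation",
    # Comfyui-Memory_Cleanup: owner uncertain; search for it in ComfyUI-Manager.
]

for url in REPOS:
    subprocess.run(["git", "clone", url], cwd=CUSTOM_NODES, check=True)
```

Restart ComfyUI after cloning so the new nodes are registered.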
If you have higher-performance hardware, you can choose higher-precision quantized models (e.g. Q5 or Q8 instead of Q3).
Description
I completely refactored the workflow.
The compressed file contains 3 ComfyUI workflows for running:
1. wan2.2-i2v-gguf-v2.json
2. wan2.2-i2v-v2.json
3. wan2.2-t2v-gguf-v2.json
Reference output:
RTX 4060 (8 GB VRAM) + 32 GB RAM, I2V (GGUF Q5_K_M; 81×512×784): Prompt executed in 00:10:14
RTX 4070 (16 GB VRAM) + 32 GB RAM, I2V (Remix; 81×576×864): Prompt executed in 325.84 seconds
RTX 5080 laptop (16 GB VRAM) + 64 GB RAM, I2V (Remix; 81×576×960): Prompt executed in 377.07 seconds
Requirements:
Models:
Main Process:
- Wan 2.2 (GGUF), Put in: ComfyUI\models\unet; 16 GB VRAM can use Q8_0, 8 GB VRAM can use Q5_K_M.
  - Wan2.2-T2V
  - Wan2.2-I2V
- Wan 2.2 (safetensors), Put in: ComfyUI\models\diffusion_models
  - Remix 2.0 NSFW Version, I2V
- umt5-xxl-encoder-Q8_0.gguf (CLIP) Put in: ComfyUI\models\text_encoders
https://huggingface.co/city96/umt5-xxl-encoder-gguf/resolve/main/umt5-xxl-encoder-Q8_0.gguf
  Alternative encoder: NSFW-API/NSFW-Wan-UMT5-XXL
- wan_2.1_vae.safetensors (VAE) Put in: ComfyUI\models\vae
- Optional speed-up LoRAs, Put in: ComfyUI\models\loras\LightX2V
Page Link: https://huggingface.co/lightx2v/Wan2.2-Distill-Loras/tree/main
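A minimal download sketch for these LoRAs, assuming a recent huggingface_hub; it pulls the whole repository snapshot, since individual file names on that page may change:

```python
# Minimal sketch: fetch the LightX2V speed-up LoRAs into the folder named above.
from pathlib import Path
from huggingface_hub import snapshot_download

lora_dir = Path("ComfyUI") / "models" / "loras" / "LightX2V"  # adjust to your installation
lora_dir.mkdir(parents=True, exist_ok=True)
snapshot_download(repo_id="lightx2v/Wan2.2-Distill-Loras", local_dir=str(lora_dir))
```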
ComfyUI Nodes:
- rgthree-comfy
- ComfyUI-KJNodes
- ComfyUI-VideoHelperSuite
- ComfyUI-Frame-Interpolation
- Comfyui-Memory_Cleanup
- ComfyUI-GGUF
- comfyui-custom-scripts
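The clone sketch shown earlier covers most of these; the two packs new to this version can be appended to its REPOS list (owners again assumptions; verify or use ComfyUI-Manager):

```python
# Additional packs for the Wan 2.2 workflows; append to REPOS in the earlier sketch.
REPOS += [
    "https://github.com/city96/ComfyUI-GGUF",
    "https://github.com/pythongosssss/ComfyUI-Custom-Scripts",
]
```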
If you have higher-performance hardware, you can choose higher-precision quantized models.
FAQ
Q: I'm using the I2V workflow. After opening and configuring it, I found that the CLIP input on the LoRA loader wasn't linked, so I connected the CLIP output from the CLIP loader and it ran normally. However, the output video is clear for the first second and then gets increasingly blurry; by the second or third second it is completely unrecognizable.
A: The LoRA's CLIP input can be left unconnected; the LoRA still takes effect during sampling.
The blurriness is caused by too few sampling steps. The 6, 2 steps in the workflow are for the official original/GGUF models paired with the lightx2v speed-up LoRA, or for models with the light acceleration baked in (like Remix). The recommended accelerated step counts are 8, 4 (CFG: 1.0, 1.0); in my tests 6, 2 (CFG: 2.0, 1.0) also gives acceptable results, so that is what the workflow uses.
If you are not using an accelerated model (as in the t2v workflow, for example), you need at least 10, 4 (CFG: 3.0, 2.5), and should then adjust the steps and CFG based on the results.
So far the Remix model has produced good results in my tests; you can download it, or the Wan2.2-Lightx2v LoRA, from Hugging Face.
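To summarize the answer above as concrete presets (a sketch; reading the "6, 2" notation as the high-noise and low-noise sampling passes of the two-stage Wan 2.2 workflow is my assumption, and the exact node names depend on your graph):

```python
# Step/CFG presets summarized from the answer above; tune per your own results.
# Each tuple is (high-noise pass, low-noise pass) -- an assumed reading of "6, 2".
SAMPLER_PRESETS = {
    # Official/GGUF model + lightx2v LoRA, or acceleration baked in (e.g. Remix):
    "accelerated_recommended":      {"steps": (8, 4),  "cfg": (1.0, 1.0)},
    "accelerated_workflow_default": {"steps": (6, 2),  "cfg": (2.0, 1.0)},
    # No acceleration (e.g. the plain t2v workflow), minimum:
    "unaccelerated_minimum":        {"steps": (10, 4), "cfg": (3.0, 2.5)},
}
```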



