WAN 2.2 S2V 14B GGUF - CivArchive (CivitAI Archive)

WAN 2.2 S2V 14B GGUF - Clip Q8

NSFW

Wan-S2V is an AI video generation model that can transform static images and audio into high-quality videos.

WIP: The description is still being filled in with all the needed info and tools. Use with some caution 🤪

Note: S2V is very likely to produce a few "flashy", over-saturated frames at the start of the video. This seems to be a limitation of all Wan 2.2 S2V models right now.

Requirements:

Lite LoRA for 4/8-step operation (optional)
Main Model: Wan2.2-S2V-14B (GGUF), in ComfyUI/models/unet
Audio Encoder: wav2vec2_large_english, in ComfyUI/models/audio_encoders
Text Encoder: umt5-xxl, in ComfyUI/models/text_encoders
VAE: Wan2.1_VAE.safetensors, in ComfyUI/models/vae
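
As a quick sanity check of the folder layout listed above, the following is a minimal sketch that reports which of the required folders are missing or empty. The function name, the labels, and the "ComfyUI" root path are assumptions made for illustration; only the relative folder names come from the list. The optional Lite LoRA is left out because the page does not state a folder for it.

```python
from pathlib import Path

# Relative folders taken from the requirements list; labels are illustrative.
REQUIRED_FOLDERS = {
    "main model (GGUF)": "models/unet",
    "audio encoder": "models/audio_encoders",
    "text encoder": "models/text_encoders",
    "VAE": "models/vae",
}


def find_missing_components(comfy_root):
    """Return labels of required folders that are absent or contain no files."""
    root = Path(comfy_root)
    missing = []
    for label, rel in REQUIRED_FOLDERS.items():
        folder = root / rel
        has_file = folder.is_dir() and any(p.is_file() for p in folder.iterdir())
        if not has_file:
            missing.append(label)
    return missing


if __name__ == "__main__":
    # "ComfyUI" is an assumed install location; adjust it to your setup.
    for label in find_missing_components("ComfyUI"):
        print(f"missing: {label}")
```

This only checks that each folder holds at least one file; it does not verify that the file is the right model.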

Usage hints:

The audio file should be about the same length (in seconds) as the video.
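
To compare the two lengths before generating, here is a small sketch using only the Python standard library. It assumes the audio is a PCM WAV file (the `wave` module cannot read other formats) and that you know the frame count and frame rate of the video you plan to generate. The function names and the 0.5 second tolerance are illustrative choices, not values from this page.

```python
import wave


def wav_duration_seconds(path):
    """Length of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def length_mismatch_seconds(audio_seconds, num_frames, fps):
    """Positive: audio is longer than the video; negative: audio is shorter."""
    return audio_seconds - num_frames / fps


def roughly_same_length(audio_seconds, num_frames, fps, tolerance=0.5):
    # tolerance is an arbitrary illustrative threshold, in seconds.
    return abs(length_mismatch_seconds(audio_seconds, num_frames, fps)) <= tolerance
```

For example, `roughly_same_length(wav_duration_seconds("voice.wav"), num_frames, fps)` would tell you whether a hypothetical `voice.wav` needs trimming or padding first.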

👂🎶 👉 Hint: Click the sample for full-screen and play from the post with SOUND ON!

Sources:

Clip: https://huggingface.co/city96/umt5-xxl-encoder-gguf/

Model: https://huggingface.co/QuantStack/Wan2.2-S2V-14B-GGUF/

Lite LoRA: https://huggingface.co/calcuis/wan2-gguf/

YOU are responsible for your outputs, as always! If you make ToS-violating content and I become aware of it, I WILL report it.

Description

umt5-xxl-encoder-Q8_0


Comments (3)

Seeker360 · Aug 30, 2025 · 2 reactions

CivitAI

I was a bit confused by this at first as I assumed from the description that it was a checkpoint with the TE, VAE, AE etc all bundled into one, but I assume it isn't as I don't know of a GGUF checkpoint loader node?

It seems to perform similarly to using the GGUFs from QuantStack, but with the added bonus of not needing to load the Lightning LoRA separately. The addition of the FP8 Audio Encoder is greatly appreciated as I think the FP16 AE was causing very long generation times and pushing the VRAM to its limits...

Unfortunately, the combination of low-quant GGUF and the Lightning LoRA replicates the same issue as using the separate files: the lip syncing is blurry and inconsistent and there's next to no motion in the video. I managed to eke out a standard no-GGUF, no-Lightning render yesterday which almost toppled my GPU and took an age to generate. The lip syncing was decent and there was some natural motion that is missing here.

Not at all a criticism or problem with your model here itself, but a sobering reminder that there just doesn't seem to be any way to get extremely demanding models like S2V to work properly on lower VRAM systems, without compromising about 85% of the quality in the process :(

haidensd58757 · Aug 30, 2025

CivitAI

What's the difference between this and Image2video?

cocolevi · Aug 31, 2025

CivitAI

Is it possible to use this S2V model on a 3060 12GB?

Checkpoint

Wan Video 2.2 I2V-A14B

by darksidewalker


base model

Details

Downloads: 198

Platform: CivitAI

Platform Status: Deleted

Created: 8/30/2025

Updated: 7/2/2026

Deleted: 4/27/2026

Files

wan22S2V14BGGUF_clipQ8.gguf

Size: 5.63 GB

SHA256: 2521d4de0bf9e1cc6549866463ceae85e4ec3239bc6063f7488810be39033bbc
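
Since the file is now mainly available through mirrors, it may be worth checking a download against the SHA256 listed above for wan22S2V14BGGUF_clipQ8.gguf. The following is a minimal sketch using Python's standard `hashlib`; the function names and the chunk size are illustrative, and the local path you pass in is up to you.

```python
import hashlib

# Hash listed on this page for wan22S2V14BGGUF_clipQ8.gguf.
EXPECTED_SHA256 = "2521d4de0bf9e1cc6549866463ceae85e4ec3239bc6063f7488810be39033bbc"


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash the file in chunks so a multi-GB model is never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 equals the expected hex digest (case-insensitive)."""
    return sha256_of_file(path) == expected.lower()
```

A mismatch usually means an incomplete download or a different file published under the same name, so a mirror copy that fails the check should not be loaded.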

Mirrors

HuggingFace (14 mirrors)

umt5-xxl-encoder-Q8_0.gguf

wan22S2V14BGGUF_clipQ8.gguf

umt5-xxl-encoder-Q8_0.gguf

umt5-xxl-encoder-q8-0.gguf

umt5-xxl-encoder-Q8_0.gguf

CivitAI (1 mirror)

wan22S2V14BGGUF_clipQ8.gguf

ModelScope CN (1 mirror)

umt5-xxl-encoder-Q8_0.gguf

Available On (1 platform)

Same model published on other platforms. May have additional downloads or version variants.

SeaArt
