CivArchive
    Wan 2.2 5B - Latent Video Upscaler & Enhancer / Transform Low-Res Videos into HD Masterpieces — The Intelligent Way - v1.0



    Introduction: Beyond Traditional Upscaling

    Traditional AI upscalers like RealESRGAN are great for images, but they often struggle with videos. They can introduce artifacts, fail to add meaningful detail, and leave footage looking blurry and unconvincing.

    This workflow, "Wan 2.2 5B - Latent Video Upscaler," offers a paradigm shift. Instead of just guessing pixels, it uses the immense power of the Wan 2.2 5B Text-to-Video model to intelligently reinterpret and reconstruct your video in high definition. It doesn't just scale up; it dreams up the missing details, resulting in a cleaner, more detailed, and more coherent HD video than any conventional upscaler can achieve.

    TL;DR: Stop using image upscalers on video. Use a diffusion model to truly enhance and upscale your footage with intelligent detail.


    Key Features & Highlights

    • 🤖 Intelligent Enhancement: Leverages the Wan 2.2 5B model to add semantically correct details, textures, and coherence, far surpassing the capabilities of traditional upscalers.

    • ⚡ Fast & Efficient: Built on the lightweight 5B parameter model, this workflow performs latent upscaling and denoising significantly faster than generating from scratch.

    • 🎨 Quality Preservation: Applies a light touch (denoise=0.2) to enhance and upscale without altering the original motion or content of the video drastically.

    • 📈 2x Resolution Boost: Doubles the resolution of your input video directly in the latent space before decoding.

    • 🎬 Smooth Final Output: Includes an optional RIFE frame interpolation pass to double the frame rate (from 16fps to 32fps) for buttery-smooth motion in the final render.

    • 🔊 Audio Passthrough: Automatically carries over the original audio track from your source video to the final enhanced output.
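
    The audio passthrough above happens inside the workflow, but the same result can be had manually: a stream-copy mux with ffmpeg re-attaches the source audio without re-encoding. A minimal sketch (file names are placeholders; ffmpeg must be on your PATH):

    ```python
    # Sketch: re-attach the original audio to an enhanced video with ffmpeg.
    # File names are placeholders; requires ffmpeg on PATH.
    import subprocess

    def build_mux_cmd(enhanced_video: str, source_video: str, out_path: str) -> list[str]:
        """Build an ffmpeg command that copies video from the enhanced file
        and audio from the original source, without re-encoding either."""
        return [
            "ffmpeg", "-y",
            "-i", enhanced_video,   # input 0: upscaled video (video stream)
            "-i", source_video,     # input 1: original clip (audio stream)
            "-map", "0:v:0",        # take the video from input 0
            "-map", "1:a:0?",       # take the audio from input 1 (optional if absent)
            "-c", "copy",           # stream copy, no quality loss
            out_path,
        ]

    cmd = build_mux_cmd("upscaled.mp4", "source.mp4", "upscaled_with_audio.mp4")
    # subprocess.run(cmd, check=True)  # uncomment to actually run ffmpeg
    ```

    The `?` on `1:a:0?` makes the audio map optional, so the command still succeeds on silent source clips.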


    Workflow Overview & Strategy

    This workflow is a sophisticated video processing chain:

    1. Input: Load your low-resolution source video using VHS_LoadVideo.

    2. Initial Upscale: The video is immediately 2x upscaled using a Lanczos filter to get to the target size. This provides a better starting point for the model.

    3. Latent Processing: The upscaled frames are encoded into the latent space.

    4. Intelligent Enhancement: The core of the workflow. The Wan 2.2 5B model, guided by quality-positive and detail-negative prompts, gently denoises (denoise=0.2) the latents over just 8 steps with UniPC. This step is where the "magic" happens—the model fills in plausible, high-quality details.

    5. Decoding: The enhanced latents are decoded back into a high-resolution image sequence.

    6. Final Output:

      • Option A: Save the upscaled video directly at 16fps.

      • Option B (Recommended): Pass the sequence through RIFE VFI to interpolate frames to 32fps, creating a final video that is both high-resolution and super smooth.
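
    The six steps above can be sketched in miniature. This is a toy, pure-Python stand-in (pixel replication instead of Lanczos, a smoothing blend instead of the actual 8-step UniPC diffusion pass), meant only to show the shape of the chain: upscale first, then lightly refine with a low denoise value so the result stays close to the input:

    ```python
    # Conceptual sketch of the latent upscale-and-refine chain (pure Python,
    # toy 2D "latents"; the real workflow does this with ComfyUI nodes).

    def upscale_2x(latent):
        """Double width and height by pixel replication (stand-in for Lanczos)."""
        out = []
        for row in latent:
            wide = [v for v in row for _ in (0, 1)]  # duplicate each column
            out.append(wide)
            out.append(list(wide))                   # duplicate each row
        return out

    def refine(latent, denoise=0.2):
        """Toy stand-in for the 8-step UniPC pass: with a low denoise value,
        the result stays close to the input (here: a mild smoothing blend)."""
        h, w = len(latent), len(latent[0])
        out = [row[:] for row in latent]
        for y in range(h):
            for x in range(w):
                nb = [latent[ny][nx]
                      for ny in (y - 1, y, y + 1) for nx in (x - 1, x, x + 1)
                      if 0 <= ny < h and 0 <= nx < w]
                smoothed = sum(nb) / len(nb)
                out[y][x] = (1 - denoise) * latent[y][x] + denoise * smoothed
        return out

    frames = [[0.0, 1.0], [1.0, 0.0]]  # a tiny 2x2 "latent frame"
    hd = refine(upscale_2x(frames))    # steps 2-4 of the chain in miniature
    print(len(hd), len(hd[0]))         # -> 4 4 (2x in each dimension)
    ```

    At denoise=0, `refine` returns its input unchanged; as denoise grows, the output drifts further from the source, which mirrors how the real KSampler setting behaves.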


    Technical Details & Requirements

    🧰 Models Required:

    • Base Model: (GGUF Format)

      • Wan2.2-TI2V-5B-Q8_0.gguf

      • Source: Likely from HuggingFace or other model repositories.

    • LoRA:

      • Wan2_2_5B_FastWanFullAttn_lora_rank_128_bf16.safetensors (Applied at strength 0.5)

    • VAE:

      • Wan2.2_VAE.safetensors

    • Text Encoder: (GGUF, loaded via the CLIP loader)

      • umt5-xxl-encoder-q4_k_m.gguf

    • Interpolation Model:

      • rife47.pth (For RIFE VFI node)

    ⚙️ Recommended Hardware:

    • A GPU with a good amount of VRAM (e.g., 12GB+) is recommended for comfortable operation, especially when processing longer videos.

    🔌 Custom Nodes:
    This workflow uses:

    • comfyui-videohelpersuite (VHS) - For video loading/combining

    • comfyui-frame-interpolation - For RIFE VFI

    • comfyui-gguf / gguf - For model loading

    • comfyui-easy-use - For memory management

    • comfyui-kjnodes - For performance patches (Sage Attention)


    Usage Instructions

    1. Load the JSON: Import the provided .json file into your ComfyUI.

    2. Load the Models: Ensure all required models are in their correct folders. Check the paths in the LoaderGGUF, VAELoader, and LoraLoaderModelOnly nodes.

    3. Select Your Video: In the VHS_LoadVideo node, click the video icon to select your low-resolution input video.

    4. Queue Prompt: Run the workflow!

    5. Retrieve Output: Find your two enhanced videos in the output directory:

      • .../Wan 2.2 5B Upscales/Denoise 0.2_xxxxx.mp4 (16fps)

      • .../Wan 2.2 5B Upscales/Denoise 0.2_32fps_xxxxx.mp4 (32fps - Smoother)


    Tips & Tricks

    • Denoise Strength: The denoise parameter in the KSampler (default 0.2) is key.

      • ~0.1-0.3: Best for upscaling/enhancement. Preserves the original content while improving quality.

      • >0.5: Will start to significantly alter the content and style, moving towards a new generation based on your video.
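
    Why does a low denoise value preserve content? In the usual img2img convention (ComfyUI's KSampler behaves roughly like this), the denoise fraction decides how far down the noise schedule sampling starts: at denoise=0.2 the source latent is only lightly noised before being re-denoised, so most of the original signal survives. A simplified sketch with a linear schedule (real schedulers are nonlinear, and `sigma_max` here is an illustrative value):

    ```python
    # Simplified: how a denoise fraction trims the sampling schedule.
    # Linear sigmas for clarity; real schedulers (Karras, etc.) differ.

    def schedule_for_denoise(steps: int, denoise: float, sigma_max: float = 14.6):
        """Return the noise levels actually traversed. With denoise < 1.0 the
        sampler enters the full schedule partway down, so the source latent is
        only lightly perturbed before being re-denoised."""
        total = steps if denoise >= 1.0 else round(steps / denoise)
        full = [sigma_max * (1 - i / total) for i in range(total + 1)]
        return full[-(steps + 1):]  # only the last `steps` intervals are run

    light = schedule_for_denoise(steps=8, denoise=0.2)  # starts at ~0.2 * sigma_max
    heavy = schedule_for_denoise(steps=8, denoise=0.8)  # starts much higher
    print(light[0], heavy[0])
    ```

    The starting noise level scales with the denoise value, which is exactly why ~0.1-0.3 enhances while >0.5 starts to regenerate.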

    • Source Quality: This workflow excels at breathing new life into low-quality, pixelated, or noisy source videos from older generators.

    • Prompt Engineering: The positive prompt (high detail, high quality...) is generic to encourage enhancement. For stylistic changes, you can modify this prompt (e.g., "cinematic, film grain, photorealism").


    Conclusion: The Future of Video Upscaling

    This workflow demonstrates a powerful new application for diffusion models: not just as generators, but as intelligent enhancement tools. By leveraging the knowledge within the Wan 2.2 model, we can upscale videos with a level of coherence and detail that traditional methods simply cannot match. It’s faster than full generation and smarter than simple scaling.

    Upload your low-res clips and witness the intelligent upscaling revolution.


    Credit: Crafted by the ComfyUI community. Special thanks to the creators of the Wan 2.2 models and the FastWanFullAttn LoRA.

    Comments (20)

    blobby99 · Aug 28, 2025 · 2 reactions

    This method is very important, and many of us have been using it for ages as the only decent local temporal solution. I prefer the better low-noise 14B Wan2.2 model - the 5B model has no speed advantage if you know how to launch Comfy properly so the model uses RAM not VRAM.

    AND, it is not true that one cannot use image upscaling successfully for some videos - SD ultimate upscale workflows allow 1080P videos to become 4K without hitting VRAM restrictions by processing the video as a series of images, that can later be recombined from a folder. The lack of a temporal element strangely doesn't matter.

    parallelepipedon · Sep 5, 2025

    Speed? This WF sucks up every available byte of memory on my box. Sadly, I've only got 64 GB RAM and a 4090. But I was all the way down to Q3_K umt5 and wan2.2 i2v low Q2 GGUF and it still maxed both kinds of RAM then OOMed. Yet for regular i2v I'm running the fp16 t5 and Wan2.2 i2v fp16, plus an accelerator LoRA and drone LoRA... and I tacked an upscale on the end, but just ESRGAN and FILM VFI. No OOMs.

    parallelepipedon · Sep 5, 2025

    Now I'm no stranger to SD ultimate upscale (just new to Comfy). Hell I'll take Forge-style Hires Fix. Just 4x-UltraSharp is enough to give me enough extra fine details. Latent ends up changing the image too much. I guess separating frames and upscaling each individually is a last resort I might have to take. The original images were made in Flux and Wan 2.2 is just taking a wild guess, but doing a surprisingly good job overall. It's just missing the fine details. Overlookable at 1408 x 800... but not at twice that. Maybe you're right... it's time to bite the bullet. You've frame numbers for a "temporal element". ;-> That's all I've ever had doing 3D animation for many years. I'll only render straight to an mp4 for a quickie test.

    I'm new to ComfyUI and so far I haven't found any video scaling models that don't destroy my 16GB of VRAM. Do you have a workflow that works for scaling?

    7765564 · Aug 28, 2025 · 3 reactions

    It would be nice if you added links to the necessary models in the description.

    Robopsycho · Aug 29, 2025 · 1 reaction

    it keeps getting stuck at vae decode stage - res is 1216 w 480h (trying for an ultrawide look) and I'm feeding it 16 fps clips with 102 frames per clip. I'm also using all your model files, text encoder and vae files, so idk -- I also have a 4090, lots of ram and cpu so - kinda stuck as to how to make it work. BTW your motion enhancements worked great for me - very cool. But now I'm trying to upscale them and yeah , no dice.

    zardozai (Author) · Aug 29, 2025

    Did you try using the VAE tile node (encode/decode)?

    For low VRAM, it should work since you are using an RTX 4090, which only has 24 GB of VRAM, if I am correct.
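
    For context, tiled decode trades a little seam-handling work for a much lower peak memory: the frame is decoded one tile at a time instead of all at once. A toy pure-Python sketch of the idea (ComfyUI's VAE Decode (Tiled) node does the real thing, including overlap blending to hide seams, which is omitted here):

    ```python
    # Sketch of why tiled VAE decode lowers peak memory: the frame is decoded
    # tile by tile, so only one tile's activations live in VRAM at a time.
    # Toy version on a 2D list; real tiled decode also blends tile overlaps.

    def decode_tiled(latent, tile=512, decode=lambda t: t):
        h, w = len(latent), len(latent[0])
        out = [[None] * w for _ in range(h)]
        for y0 in range(0, h, tile):
            for x0 in range(0, w, tile):
                tile_rows = [row[x0:x0 + tile] for row in latent[y0:y0 + tile]]
                decoded = decode(tile_rows)          # peak memory ~ one tile
                for dy, row in enumerate(decoded):
                    out[y0 + dy][x0:x0 + len(row)] = row
        return out

    frame = [[x + y for x in range(4)] for y in range(4)]
    assert decode_tiled(frame, tile=2) == frame      # identity decode round-trips
    ```

    Smaller tiles mean a lower memory peak but more tile boundaries to blend; 512 is a common middle ground.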

    Robopsycho · Aug 29, 2025

    @zardozai ok, yeah gpt actually suggested that I do tiling also -- so i should run tiling then -- ok I'll try it -- yes a 4090.

    Robopsycho · Aug 29, 2025

    @zardozai ok, I'm gonna use a tiled vae decode at 512 just to see if it works. I'll let you know how it goes.

    Robopsycho · Aug 29, 2025

    @zardozai I got it to work -- batch vae decode worked well at 512, and I was feeding in 24 fps clips before, that's probably why it was failing. I thought the clips were 16 fps, but they were 24 fps. Now that I have the clip feeds coming in at 16 fps it's working.

    zardozai (Author) · Aug 29, 2025 · 1 reaction

    @Robopsycho To feed in at 24 frames per second, simply adjust the output settings accordingly.

    Robopsycho · Sep 2, 2025

    any way to make the upscale less saturated? I notice that it turned a warm light pretty orange. I tried putting denoise to 1.0 - we'll see what happens. TY

    zardozai (Author) · Sep 2, 2025

    @Robopsycho ComfyUI Node: Color Match

    Give it a try.

    Robopsycho · Sep 3, 2025 · 1 reaction

    @zardozai so it worked - made the light bright -- but not too saturated - Thanks :)

    killkrazed · Sep 3, 2025 · 1 reaction

    Not sure what is going wrong here, but trying to use the workflow unedited I am for some reason getting "torch.linalg.solve: The solver failed because the input matrix is singular."

    I suspect that uni_pc as the sampler is the problem, what other samplers would you say would work best?

    I tried changing to SA_Solver and Beta but then I got "contracted dimensions need to match, but first has size 4 in dim 1 and second has size 0 in dim 0"

    This is when using all the default values of the workflow.

    Tried swapping ComfyUI to launching with Quad attention instead of Sage Attention, and I don't get any errors, but the workflow finishes far faster than expected and the result is just pure noise.

    zardozai (Author) · Sep 8, 2025

    I have removed the LoRA that can cause this issue in the latest version.

    parallelepipedon · Sep 5, 2025

    On a 4090 with 64G system RAM, I couldn't get this to work with even the smallest quants. It maxed out both kinds of RAM completely. I was admittedly not trying to upscale tiny videos. Low Res in my world is 16:9-ish at anywhere from 1280 to 1536 wide. I really WANT this to work though. And preferably with the low-noise side of 14B. Guess I need to look into BlockSwap, etc.

    zardozai (Author) · Sep 5, 2025

    Do not touch anything; re-download the workflow in its original state and run it again.

    blobby99 · Sep 8, 2025 · 5 reactions

    There is ZERO chance of temporal upscaling at decent speed on a 4090 if your input and output have so large a resolution that you exhaust latent storage in your VRAM. ZERO! Temporal processing requires access to the entire latent data set. What you are doing is catastrophic thrashing of memory from VRAM to system RAM, and more likely to an SSD swap file - absolutely catastrophic.

    Too many people here know nothing about computer science, data flow, memory systems, and memory management. Worse, they think because they overspent on a 4090, or 5090, everything must be possible.

    What you can do is use a non-temporal upscale method. Recently, surprisingly, it was discovered that SD ultimate upscale, with a fixed seed, is good for doubling the linear rez. Any amount of VRAM is good for a pretty high-rez input video, providing you use a workflow that treats each frame as an image, and saves the upscaled images one by one to a folder. Then you can create a new upscaled video from that folder.
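
    That folder round-trip is easy to script. A sketch with placeholder paths (ffmpeg must be on PATH; the per-frame upscale step in the middle is whatever image upscaler you prefer):

    ```python
    # Sketch of the frames-in-a-folder approach: split, upscale per image,
    # recombine. The upscaler itself is a placeholder; plug in SD Ultimate
    # Upscale, ESRGAN, or anything that maps one image file to another.
    import subprocess

    def extract_cmd(video: str, frames_dir: str, fps: int = 16) -> list[str]:
        # Dump every frame as a numbered PNG.
        return ["ffmpeg", "-i", video, "-vf", f"fps={fps}",
                f"{frames_dir}/frame_%05d.png"]

    def rebuild_cmd(frames_dir: str, out_video: str, fps: int = 16) -> list[str]:
        # Reassemble the (now upscaled) frames into a video.
        return ["ffmpeg", "-framerate", str(fps),
                "-i", f"{frames_dir}/frame_%05d.png",
                "-c:v", "libx264", "-pix_fmt", "yuv420p", out_video]

    split = extract_cmd("lowres.mp4", "frames")
    join = rebuild_cmd("frames_upscaled", "hires.mp4")
    # subprocess.run(split, check=True)  # then upscale each PNG, then:
    # subprocess.run(join, check=True)
    ```

    With a fixed seed in the per-frame upscaler, as described above, frame-to-frame flicker stays surprisingly low even without a temporal model.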

    Workflows
    Wan Video 2.2 TI2V-5B

    Details

    Downloads
    559
    Platform
    CivitAI
    Platform Status
    Available
    Created
    8/28/2025
    Updated
    5/13/2026
    Deleted
    -

    Files

    wan225BLatentVideoUpscalerEnhancer_v10.zip