    Wan2.2 5B Fun Control - Fast Video ControlNet - v1.0
    NSFW

    Workflow Overview

    This is a sophisticated ComfyUI workflow designed for high-quality, controllable video generation using the powerful Wan2.2 5B Fun model. It leverages ControlNet (via Canny edge detection) to transform a driving motion video and a starting reference image into a stunning, coherent animated sequence. Perfect for creating dynamic character animations with consistent style and precise motion transfer.

    Core Concept: Use a "control video" (e.g., a person dancing) to guide the motion, and a "reference image" (e.g., a character design) to define the style and subject. The workflow intelligently merges them into a new, AI-generated video.


    Key Features & Highlights

    • 🚀 State-of-the-Art Model: Utilizes the Wan2.2-Fun-5B-Control-Q8_0.gguf quantized model, balancing high output quality against manageable hardware requirements.

    • 🎨 Precision Control: Implements a Canny edge ControlNet. The workflow extracts edge maps from your input video so the generated animation closely follows the original motion (see the sketch after this list).

    • ⚡ Optimized for Speed: Integrates the FastWan LoRA (Wan2_2_5B_FastWanFullAttn), enabling high-quality results in just 8 sampling steps.

    • 🧠 Efficient Text Encoding: Uses a separate, quantized umt5-xxl-encoder model (loaded through ComfyUI's CLIP loader) for prompt encoding, reducing VRAM load on your GPU.

    • 🔧 Complete Pipeline: Everything from model loading and video preprocessing through conditioning and sampling to final video encoding is included in one seamless, organized graph.

    • 📁 Ready-to-Use: Pre-configured with optimal settings, including a detailed positive/negative prompt. Just load your own image and video to start creating.
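
    For reference, here is a minimal sketch of what the Canny preprocessing stage does to each frame, written with OpenCV instead of the ComfyUI node; the file name and threshold values are placeholders:

      import cv2

      # Read the control video frame by frame and extract Canny edge maps,
      # the same operation the workflow's Canny node applies per frame.
      cap = cv2.VideoCapture("my_dance_video.mp4")  # placeholder clip
      edge_frames = []
      while True:
          ok, frame = cap.read()
          if not ok:
              break
          gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
          # The two thresholds correspond to the low/high knobs on the Canny node.
          edge_frames.append(cv2.Canny(gray, 100, 200))
      cap.release()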


    Workflow Structure

    The workflow is neatly grouped into logical sections for easy understanding and customization:

    1. Step 1 - Load models: Loads the main Wan2.2 5B model, its VAE, the CLIP text encoder, and the FastWan LoRA.

    2. Step 2 - Start_image: Loads your initial reference image. This defines the character and style for the first frame.

    3. Step 3 - Control video and video preprocessing: Loads your motion video and processes it through the Canny node to extract edge maps.

    4. Step 4 - Prompt: Where you input your positive and negative prompts to guide the generation.

    5. Step 5 - Video size & length: The Wan22FunControlToVideo node packages everything, setting the output video dimensions and length based on the control video.

    6. Sampling & Decoding: The KSampler runs for 8 steps with the UniPC sampler, and the VAE decodes the latents into the final frames (a concrete sketch of these sampler settings follows this list).

    7. Video Output: The VHS_VideoCombine node encodes the image sequence into an MP4 video file.
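
    To make the sampling settings concrete, here is how the KSampler node might look in ComfyUI's exported API-format JSON, written as a Python dict. The node IDs, seed, CFG, and scheduler here are illustrative assumptions; only the 8 steps and the UniPC sampler come from this workflow:

      # Sketch of the sampler node in API format; upstream node IDs are made up.
      ksampler = {
          "class_type": "KSampler",
          "inputs": {
              "model": ["lora_loader", 0],         # model with FastWan LoRA applied
              "positive": ["positive_prompt", 0],
              "negative": ["negative_prompt", 0],
              "latent_image": ["fun_control", 0],  # latents from Wan22FunControlToVideo
              "seed": 0,                           # illustrative
              "steps": 8,                          # FastWan LoRA makes 8 steps sufficient
              "cfg": 1.0,                          # illustrative; adjust to taste
              "sampler_name": "uni_pc",
              "scheduler": "simple",               # illustrative
              "denoise": 1.0,
          },
      }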


    How to Use This Workflow

    1. Download & Install:

      • Ensure you have ComfyUI Manager installed so missing custom nodes are easy to add.

      • Required Custom Nodes: ComfyUI-VideoHelperSuite, ComfyUI-GGUF (for loading the .gguf models).

      • Download the .json file from this post.

    2. Load the Models:

      • Main Model: Place Wan2.2-Fun-5B-Control-Q8_0.gguf in your ComfyUI/models/unet/ folder, where ComfyUI-GGUF's Unet Loader looks by default.

      • CLIP Model: Place umt5-xxl-encoder-q4_k_m.gguf in your ComfyUI/models/clip/ folder for the GGUF CLIP loader.

      • VAE: The workflow points to Wan2.2_VAE.safetensors. Ensure it's in your models/vae/ folder.

      • LoRA: Place Wan2_2_5B_FastWanFullAttn_lora_rank_128_bf16.safetensors in your models/loras/ folder. Adjust the path in the LoraLoader node if yours is in a subfolder (e.g., wan_loras/). The resulting layout is sketched below.
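
    Assuming a stock ComfyUI install, the four files should end up laid out roughly like this:

      ComfyUI/
        models/
          unet/
            Wan2.2-Fun-5B-Control-Q8_0.gguf
          clip/
            umt5-xxl-encoder-q4_k_m.gguf
          vae/
            Wan2.2_VAE.safetensors
          loras/
            Wan2_2_5B_FastWanFullAttn_lora_rank_128_bf16.safetensors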

    3. Load Your Assets:

      • Reference Image: In the LoadImage node, change the image name to your own file (e.g., my_character.png).

      • Control Video: In the LoadVideo node, change the video name to your own motion clip (e.g., my_dance_video.mp4).

    4. Customize Your Prompt:

      • Edit the text in the Positive Prompt node to describe your desired character and scene (an illustrative example follows this list).

      • The provided negative prompt is already comprehensive, but you can modify it as needed.
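
    A purely illustrative example of the kind of positive prompt that fits this setup (not the one shipped with the workflow):

      a young woman in a flowing red dress dancing on a rooftop at sunset,
      anime style, detailed face, soft cinematic lighting, smooth motion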

    5. Run the Workflow:

      • Queue the prompt in ComfyUI. The final video will be saved to your ComfyUI/output/video/ folder.
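
    You can also queue it headlessly through ComfyUI's built-in HTTP API; a minimal sketch, assuming the server runs on the default port and the graph was exported in API format (the file name is a placeholder):

      import json
      import urllib.request

      # Load the API-format export of the workflow and queue it, which is
      # equivalent to pressing "Queue Prompt" in the ComfyUI interface.
      with open("wan225BFunControl_api.json") as f:  # placeholder file name
          graph = json.load(f)

      req = urllib.request.Request(
          "http://127.0.0.1:8188/prompt",
          data=json.dumps({"prompt": graph}).encode("utf-8"),
          headers={"Content-Type": "application/json"},
      )
      print(urllib.request.urlopen(req).read().decode())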


    Tips for Best Results

    • Control Video: Use a video with clear, strong motion and good contrast so the Canny detector has edges to latch onto. Silhouettes or clips with a plain background work especially well.

    • Reference Image: The first frame of your output will closely match this image. Use a high-quality image of your character in a pose similar to the first frame of your control video.

    • Length: The length in Wan22FunControlToVideo is set to 121 to match the bundled control video. If your video has a different frame count, update this value to match (a helper for picking a valid length follows these tips).

    • Experiment: Try adjusting the LoRA strength (e.g., between 0.4 - 0.7) or the Canny thresholds to fine-tune the balance between motion fidelity and creative freedom.
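
    If you are unsure what to set length to, here is a small helper. It assumes the usual Wan convention that frame counts take the form 4k + 1 (121 = 4 x 30 + 1); that constraint is an assumption worth checking against the node's tooltip:

      # Pick a valid Wan frame count no longer than the control clip, assuming
      # the length must be of the form 4k + 1 (e.g. 81, 121).
      def wan_length(num_frames: int) -> int:
          return max(5, ((num_frames - 1) // 4) * 4 + 1)

      print(wan_length(121))  # -> 121 (already valid)
      print(wan_length(130))  # -> 129 (rounded down to the nearest 4k + 1)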


    Required Models (Download Links)

    1. Wan2.2-Fun-5B-Control-Q8_0.gguf: https://huggingface.co/QuantStack/Wan2.2-Fun-5B-Control-GGUF

    2. umt5-xxl-encoder-q4_k_m.gguf: https://huggingface.co/city96/umt5-xxl-encoder-gguf/tree/main

    3. Wan2.2_VAE.safetensors: https://huggingface.co/QuantStack/Wan2.2-Fun-5B-InP-GGUF/tree/main/vae

    4. Wan2_2_5B_FastWanFullAttn_lora_rank_128_bf16.safetensors: https://huggingface.co/Kijai/WanVideo_comfy/blob/main/FastWan/Wan2_2_5B_FastWanFullAttn_lora_rank_128_bf16.safetensors
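
    If you prefer scripting the downloads, huggingface_hub can fetch them; a sketch for the main model (the other files work the same way, and local_dir assumes the layout described above):

      from huggingface_hub import hf_hub_download

      # Download the quantized main model straight into the ComfyUI model folder.
      hf_hub_download(
          repo_id="QuantStack/Wan2.2-Fun-5B-Control-GGUF",
          filename="Wan2.2-Fun-5B-Control-Q8_0.gguf",  # name taken from this post
          local_dir="ComfyUI/models/unet",
      )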


    Conclusion

    This workflow demonstrates the powerful synergy between the Wan2.2 model, ControlNet, and efficient LoRAs. It abstracts away the complexity, providing you with a robust, one-click solution for creating amazing AI-powered animations. Enjoy creating!

    If you use this workflow, please share your results! I'd love to see what you create.


    Workflows
    Wan Video 2.2 TI2V-5B

    Details

    Downloads: 135
    Platform: CivitAI
    Platform Status: Available
    Created: 8/30/2025
    Updated: 5/13/2026
    Deleted: -

    Files

    wan225BFunControlFast_v10.zip
