Unleash the power of pure text-to-video generation! This "MotionForge" ComfyUI workflow is your all-in-one solution for creating dynamic, high-quality videos directly from your imagination. No starting image is needed—just a powerful prompt.
This streamlined pipeline leverages the best of the Wan2.2 ecosystem:
Text-to-Video Generation: Harnesses the massive Wan2.2-T2V-A14B models for robust initial video creation from your text descriptions.
Lightning-Fast Motion: Integrates the revolutionary LightX2V 4-Step LoRAs, drastically reducing the number of steps needed for smooth, coherent motion.
Style Fusion: Optionally applies a FLUX style LoRA to add unique aesthetic flair to your generations.
HD Latent Upscaling: Refines and enlarges the video using the efficient Wan2.2-Fun-5B-InP model, enhanced by the FastWan LoRA for quick, high-quality results.
Cinematic Finish: Delivers a final, buttery-smooth 32FPS output, upscaled and ready for display.
Go from a simple idea to a stunning animated video in one seamless process.
✨ Features & Highlights
True Text-to-Video: Generate videos from text prompts alone—no input image required. Perfect for bringing entirely new concepts to life.
Ultra-Efficient 4-Step Generation: The included LightX2V LoRAs are a game-changer, producing high-quality motion in a fraction of the usual steps.
Style Customization: Built-in integration for a FLUX style LoRA, allowing you to easily tweak the artistic output of your videos.
Two-Pass Quality Pipeline: Uses both a High-Noise and Low-Noise model path for optimal detail and motion clarity.
HD Upscaling & Refinement: The dedicated 5B upscaler node cleans up and enlarges your video for a professional finish.
Optimized Performance: Includes `cleanGpuUsed` nodes to help manage VRAM throughout the complex generation process.
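For context, a VRAM-cleanup pass between heavy stages boils down to something like the sketch below. This is an illustration of the idea only, not `cleanGpuUsed`'s actual implementation.

```python
# Rough idea of what a GPU-cleanup node does between stages -- a sketch,
# not cleanGpuUsed's actual code.
import gc
import torch

def clean_gpu() -> None:
    gc.collect()                    # drop unreferenced Python objects first
    if torch.cuda.is_available():
        torch.cuda.empty_cache()    # release cached CUDA allocations
        torch.cuda.ipc_collect()    # reclaim memory from dead IPC handles
```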
📦 Required Models (Please Download First!)
For this workflow to function, you must download the following models and place them in the appropriate subfolders of your ComfyUI `models` directory.
1. Core Wan2.2 T2V GGUF Models:
Wan2.2-T2V-A14B-HighNoise-Q8_0.gguf
Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf
Wan2.2-Fun-5B-InP-Q8_0.gguf (for upscaling)
Source: https://huggingface.co/QuantStack (check for T2V-specific GGUF files)
2. Motion & Style LoRAs (for A14B T2V):
lightx2v_I2V_14B_480p_cfg_step_distill_rank128_bf16.safetensors (the key to 4-step generation!)
Wan2.2-Lightning_T2V-v1.1-A14B-4steps-lora_LOW_fp16.safetensors
Source: (Typically found alongside other Wan2.2 LoRAs on Hugging Face)
aidmaMJ6.1-FLUX-v0.5.safetensors (optional style LoRA)
Source: (Search Civitai or Hugging Face for FLUX LoRAs)
3. Upscaler LoRA (for 5B):
Wan2_2_5B_FastWanFullAttn_lora_rank_128_bf16.safetensors (drastically reduces the steps required for upscaling!)
4. VAE & Upscaler:
Wan2_1_VAE_fp32.safetensors (for initial generation)
Wan2.2_VAE.safetensors (for the upscaler sub-graph)
RealESRGAN_x2plus.pth (standard upscaling model)
Source: https://huggingface.co/dtarnow/UPscaler/tree/main (or any standard model repository)
5. CLIP Encoder:
umt5-xxl-encoder-Q8_0.gguf (typically bundled with the Wan GGUF downloads)
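If you prefer scripting the downloads, here is a minimal sketch using the `huggingface_hub` package. The repo IDs below are assumptions (verify the exact repos and filenames on the QuantStack page), and the target subfolders follow the usual ComfyUI layout: GGUF diffusion models in `models/unet`, LoRAs in `models/loras`, VAEs in `models/vae`, the text encoder in `models/clip`, and RealESRGAN in `models/upscale_models`.

```python
# Minimal download sketch with huggingface_hub. Repo IDs are placeholders --
# confirm the exact repos/filenames on Hugging Face before running.
from huggingface_hub import hf_hub_download

MODELS_DIR = "/path/to/ComfyUI/models"  # adjust to your install

downloads = [
    # (repo_id, filename, target subfolder under models/)
    ("QuantStack/Wan2.2-T2V-A14B-GGUF", "Wan2.2-T2V-A14B-HighNoise-Q8_0.gguf", "unet"),
    ("QuantStack/Wan2.2-T2V-A14B-GGUF", "Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf", "unet"),
    ("QuantStack/Wan2.2-Fun-5B-InP-GGUF", "Wan2.2-Fun-5B-InP-Q8_0.gguf", "unet"),
    ("dtarnow/UPscaler", "RealESRGAN_x2plus.pth", "upscale_models"),
]

for repo_id, filename, subfolder in downloads:
    hf_hub_download(repo_id=repo_id, filename=filename,
                    local_dir=f"{MODELS_DIR}/{subfolder}")
    print(f"fetched {filename} -> models/{subfolder}/")
```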
⚙️ Installation & Usage
Download the Workflow: Download the provided `.json` file from this Civitai page.
Download All Models: Ensure you have all the models listed above downloaded to the correct folders.
Load in ComfyUI: Open ComfyUI, drag the `.json` file into the window, and the workflow will load.
Check Loaders: The workflow uses ComfyUI-GGUF and ComfyUI-VideoHelperSuite (VHS). Please ensure you have these custom nodes installed.
Craft Your Prompt: This is a Text-to-Video workflow, so leave the `start_image` input on the `WanImageToVideo` node disconnected. Modify the positive and negative prompts in the CLIP Text Encode nodes. The provided example creates a fun "cat surfing selfie" video.
Set Your Video Size: Adjust the `width` and `height` in the `WanImageToVideo` node (default is 400x544; see the sanity-check sketch after these steps).
Queue Prompt! You're ready to go. The workflow will handle the rest, from T2V generation to upscaling and interpolation.
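Before changing the resolution, note that Wan-family models generally expect width and height divisible by 16, and frame counts of the form 4k + 1. A quick sanity check under those assumptions (these are common Wan conventions, not values read from this workflow):

```python
# Sanity-check video settings against common Wan conventions -- assumed
# constraints, not values extracted from this workflow.
def check_video_settings(width: int, height: int, frames: int) -> None:
    assert width % 16 == 0 and height % 16 == 0, "use multiples of 16"
    assert frames % 4 == 1, "use a frame count of the form 4k + 1 (e.g. 81)"

check_video_settings(400, 544, 81)  # the workflow's default 400x544
```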
Pro Tip: The workflow uses a two-stage KSampler. The first stage (4 steps) creates the motion, and the second stage (4 steps) refines it. You can adjust the `cfg` and `steps` in these samplers to fine-tune your results.
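Here is a rough picture of how such a two-stage split typically looks, expressed as KSamplerAdvanced-style settings. The values are illustrative assumptions (especially `cfg` and the total step count), not numbers read from the workflow, so check the actual sampler nodes before changing anything.

```python
# Illustrative two-stage sampler split -- assumed values, not the workflow's.
high_noise_stage = {   # stage 1: the high-noise model builds the motion
    "steps": 8,        # shared schedule length across both stages
    "start_at_step": 0,
    "end_at_step": 4,
    "cfg": 1.0,        # step-distilled LoRAs usually run near cfg 1.0
    "add_noise": "enable",
    "return_with_leftover_noise": "enable",  # hand the latent off mid-schedule
}
low_noise_stage = {    # stage 2: the low-noise model refines detail
    "steps": 8,
    "start_at_step": 4,
    "end_at_step": 8,
    "cfg": 1.0,
    "add_noise": "disable",  # the latent already carries leftover noise
    "return_with_leftover_noise": "disable",
}
```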
Conclusion
The "MotionForge" workflow demystifies high-quality text-to-video generation. By combining the latest specialized models and LoRAs, it offers a powerful yet surprisingly efficient path from a text prompt to a polished video. It's perfect for creators who want to explore the limitless possibilities of AI-driven animation without any initial imagery.
We can't wait to see what you create! Share your results, like, and follow for more powerful workflows.