Description:

This workflow generates video from a text prompt (text-to-video, T2V) with WAN models.

You will find a step-by-step guide to using this workflow here: link

My other workflows for WAN: link

Resources you need:

For base version
T2V Model: fp16, fp8
In models/diffusion_models

For GGUF version
T2V Quant Model: Q8, Q5, Q3
In models/diffusion_models

Common files:
CLIP: umt5_xxl_fp8_e4m3fn_scaled.safetensors
in models/clip

VAE: wan_2.1_vae.safetensors
in models/vae

Speed LoRA: lightx2v_T2V_14B_cfg_step_distill_v2_lora_rank64_bf16.safetensors
in models/loras

Any upscale model
in models/upscale_models
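
To check that the files listed above are in place before loading the workflow, a small script along these lines can help. It is only a sketch: the function name `missing_resources` is hypothetical, the install root you pass in is your own, and because the T2V model file names are not spelled out above, the script only checks that models/diffusion_models and models/upscale_models contain at least one file.

```python
from pathlib import Path

# File names are copied from the resource list above; the folders are taken
# relative to the install root (an assumption about your setup).
REQUIRED_FILES = {
    "models/clip": "umt5_xxl_fp8_e4m3fn_scaled.safetensors",
    "models/vae": "wan_2.1_vae.safetensors",
    "models/loras": "lightx2v_T2V_14B_cfg_step_distill_v2_lora_rank64_bf16.safetensors",
}

# Folders that must hold at least one model file (exact names are not listed).
NON_EMPTY_DIRS = ["models/diffusion_models", "models/upscale_models"]


def missing_resources(install_root):
    """Return a list of human-readable entries for everything that is missing."""
    root = Path(install_root)
    missing = []
    for folder, name in REQUIRED_FILES.items():
        if not (root / folder / name).is_file():
            missing.append(f"{folder}/{name}")
    for folder in NON_EMPTY_DIRS:
        directory = root / folder
        has_file = directory.is_dir() and any(p.is_file() for p in directory.iterdir())
        if not has_file:
            missing.append(f"{folder}/ (no model file found)")
    return missing
```

Calling `missing_resources("path/to/your/install")` returns an empty list when everything is present, otherwise one entry per missing file or empty folder.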

Description

Interface adjustment:

Backend:

Reduced the number of custom nodes from 12 to 8.
Improved the automatic prompt function: 8 words such as "image" or "drawing" are now replaced with "video" to avoid generating static videos.
Added a CLIP loader to the GGUF version.
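
The word replacement in the automatic prompt function can be pictured roughly as follows. This is an illustrative sketch, not the workflow's actual node logic: the name `videoize_prompt` is hypothetical, and only the two words quoted above are included because the other six are not listed here.

```python
import re

# Only "image" and "drawing" are named in the changelog; the workflow's full
# list has 8 words, which are not reproduced here.
STATIC_WORDS = ["image", "drawing"]

# Whole-word, case-insensitive match, so "imagery" or "images" are left alone.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(word) for word in STATIC_WORDS) + r")\b",
    re.IGNORECASE,
)


def videoize_prompt(prompt: str) -> str:
    """Replace still-picture words with "video" in a generated prompt."""
    return _PATTERN.sub("video", prompt)
```

For example, `videoize_prompt("A drawing of a cat running")` gives `"A video of a cat running"`.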

New "Post-production" menu.

Rolled back to the native upscaler.

New model optimisations:

Temporal attention, to improve spatiotemporal prediction.
RifleXRoPE, which reduces bugs on videos longer than 5s. This allows you to increase the maximum video length from 5s to 8s.
MagCache.
Blockswap.
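
To translate the 5s and 8s limits into a frame count for the length setting, the arithmetic can be sketched as below. It assumes that WAN 2.1 renders at 16 fps and expects frame counts of the form 4n + 1; neither value is stated in this description, and the helper name `frames_for_seconds` is hypothetical.

```python
def frames_for_seconds(seconds: float, fps: int = 16) -> int:
    """Nearest frame count of the form 4n + 1 for a target duration.

    Assumes 16 fps output and the 4n + 1 frame-count constraint
    (assumptions, not taken from the description above).
    """
    blocks = round(seconds * fps / 4)  # number of 4-frame blocks
    return blocks * 4 + 1


print(frames_for_seconds(5))  # 81
print(frames_for_seconds(8))  # 129
```

Under these assumptions, the previous 5s limit corresponds to 81 frames and the new 8s limit to 129 frames.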