Wan Video
Note: There are other Wan Video files hosted on Civitai - these may be duplicates, but this model card is primarily to host the files used by Wan Video in the Civitai Generator.
These files are the ComfyUI Repack - the original files can be found in Diffusers/multi-part safetensors format here.
Wan2.2 is a major upgrade to our visual generative models. It is now open-sourced and offers more powerful capabilities, better performance, and superior visual quality. With Wan2.2, we have focused on incorporating the following technical innovations:
👍 MoE Architecture: Wan2.2 introduces a Mixture-of-Experts (MoE) architecture into video diffusion models. By separating the denoising process across timesteps with specialized, powerful expert models, it enlarges the overall model capacity while maintaining the same computational cost.
💪🏻 Data Scaling: Compared to Wan2.1, Wan2.2 is trained on significantly more data, with +65.6% more images and +83.2% more videos. This expansion notably enhances the model's generalization across multiple dimensions such as motion, semantics, and aesthetics, achieving top performance among all open-sourced and closed-sourced models.
🎬 Cinematic Aesthetics: Wan2.2 incorporates specially curated aesthetic data with fine-grained labels for lighting, composition, and color. This allows for more precise and controllable cinematic style generation, facilitating the creation of videos with customizable aesthetic preferences.
🚀 Efficient High-Definition Hybrid TI2V: Wan2.2 open-sources a 5B model built with our advanced Wan2.2-VAE that achieves a compression ratio of 16×16×4. This model supports both text-to-video and image-to-video generation at 720P resolution with 24fps and can also run on consumer-grade graphics cards like the 4090. It is one of the fastest 720P@24fps models currently available, capable of serving both the industrial and academic sectors simultaneously.
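To illustrate the MoE point above, here is a minimal sketch of routing each denoising step to one of two experts based on the timestep. It is not the Wan2.2 implementation: the two toy experts, the `boundary` value, and the scalar "latent" are all hypothetical placeholders chosen only to show why the per-step cost stays the same (exactly one expert runs per step) while total capacity grows.

```python
from typing import Callable, Dict, List, Tuple

# An "expert" here is any function that takes a latent and a timestep and
# returns a less noisy latent. Real experts would be full diffusion networks.
Expert = Callable[[float, int], float]


def high_noise_expert(latent: float, t: int) -> float:
    # Hypothetical stand-in for the expert handling early, very noisy steps.
    return latent * 0.5


def low_noise_expert(latent: float, t: int) -> float:
    # Hypothetical stand-in for the expert refining detail in late steps.
    return latent * 0.9


def pick_expert(t: int, boundary: int) -> str:
    # Timesteps at or above the boundary are treated as "high noise".
    return "high_noise" if t >= boundary else "low_noise"


def denoise(latent: float, timesteps: List[int], boundary: int) -> Tuple[float, List[str]]:
    experts: Dict[str, Expert] = {
        "high_noise": high_noise_expert,
        "low_noise": low_noise_expert,
    }
    used: List[str] = []
    for t in timesteps:
        name = pick_expert(t, boundary)
        # Only the selected expert is evaluated for this step.
        latent = experts[name](latent, t)
        used.append(name)
    return latent, used


if __name__ == "__main__":
    # boundary=500 is an arbitrary illustrative value, not a Wan2.2 setting.
    final, used = denoise(1.0, [900, 600, 300, 100], boundary=500)
    print(used)
```

In this toy run the first two steps go to the high-noise expert and the last two to the low-noise expert, so four steps cost four expert calls regardless of how many experts exist.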
Wan2.2-T2V-A14B
The T2V-A14B model supports generating 5s videos at both 480P and 720P resolutions. Built with a Mixture-of-Experts (MoE) architecture, it delivers outstanding video generation quality. On our new benchmark Wan-Bench 2.0, the model surpasses leading commercial models across most key evaluation dimensions.
Wan2.2-I2V-A14B
The I2V-A14B model, designed for image-to-video generation, supports both 480P and 720P resolutions. Built with a Mixture-of-Experts (MoE) architecture, it achieves more stable video synthesis with reduced unrealistic camera movements and offers enhanced support for diverse stylized scenes.
Wan2.2-TI2V-5B
The TI2V-5B model is built with the advanced Wan2.2-VAE that achieves a compression ratio of 16×16×4. This model supports both text-to-video and image-to-video generation at 720P resolution with 24fps and can run on a single consumer-grade GPU such as the 4090. It is one of the fastest 720P@24fps models available, meeting the needs of both industrial applications and academic research.
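As a rough worked example of what a 16×16×4 compression ratio means, the sketch below computes the latent grid for a 1280×720 clip. The 121-frame length and the "first frame kept separately" convention (`causal_first_frame`) are assumptions made for illustration, not details confirmed by this page.

```python
def latent_shape(width: int, height: int, frames: int,
                 spatial: int = 16, temporal: int = 4,
                 causal_first_frame: bool = True) -> tuple:
    """Return (latent_frames, latent_height, latent_width).

    spatial and temporal are the compression factors quoted above (16x16x4).
    causal_first_frame is an assumed convention in which the first frame is
    encoded on its own and the remaining frames are grouped by `temporal`.
    """
    lat_w = width // spatial
    lat_h = height // spatial
    if causal_first_frame:
        lat_t = (frames - 1) // temporal + 1
    else:
        lat_t = frames // temporal
    return lat_t, lat_h, lat_w


if __name__ == "__main__":
    # 720P frame, 121 frames (about 5 seconds at 24 fps; illustrative only).
    print(latent_shape(1280, 720, 121))  # (31, 45, 80)
```

Under these assumptions each 1280×720 frame shrinks to an 80×45 grid, and 121 frames shrink to 31 latent frames, which is why the model can fit on a single consumer GPU.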
GitHub: https://github.com/Wan-Video/Wan2.2
Original HuggingFace Repo: https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/tree/main/split_files/diffusion_models
Description
Wan 2.2 14B for Text-to-Video on-site Generation
Comments (42)
Do you HAVE to use sage attention with wan 2.2?
Optional, I ran it using xformers
No, but it's twice as fast as normal generation.
Akalabeth how tf do you get sage attention working on windows? ive destroyed a few comfyui portable's trying to get it to work
cleonthethird me too, but then I found this guide. thank me later :))
Akalabeth You fucking rock!
cleonthethird Glad to help!
{Q&A} why are you afraid to make a real furry model/lora?...
What?
theally I meant why didn't you make a furry model or a lora because maybe there are people who like to do such things (and there is no tier list of the best furry models yet?)
Zannark the whole site is infested with that degen crap, get some help bro
grainted furry is not fetish man
grainted google or ai ask
@Zannark what is yiff then?
@EthrealSkull Yiff is almost the same as furry but I guarantee you there are also cute yiffs (like Protobean).
xD What is this question? Other people make loras.
@7Zack I mean theally creating official furry/dragon support for the model
@Zannark Why would they do that? They made a base model. Others can then fine-tune it.
@7Zack In other words: do you know any good furry/scalia mods for nsfw animation???
@Zannark What I would do:
1. Make the image with another model such as Pony or Illustrious:
- https://civitai.com/models/1281214/oops-all-toons-illustrious-v10?modelVersionId=1445455
- https://civitai.com/models/257749/pony-diffusion-v6-xl?modelVersionId=290640
2. Use the furry image you made to do image-to-video via Wan 2.2. Describe what you want it to do.
If NSFW, then I'd suggest using a Wan LoRA such as:
- https://civitai.com/models/1501763/furry-nsfw-wan-21-14b-img2vid-and-nsfw-motion-in-general?modelVersionId=1791128
- https://civitai.com/models/1307155/wan-22-experimental-wan-general-nsfw-model?modelVersionId=2073605
In the end your results would be something like this:
https://civitai.com/posts/19453230
https://civitai.com/posts/21133602
@7Zack furry lora is only for 2.1 not for new
@Zannark As far as I know, 2.1 loras do still have an effect on 2.2 Wan. There is a 2.2 Wan furry lora, but for 5b : https://civitai.com/models/1827161/furry-nsfw-wan-22-5b
@7Zack 5b?
@Zannark It's a smaller version of Wan 2.2
@Zannark Even if it was, why does that matter? Everyone has their thing, you enjoy yours. So long as you aren't hurting anyone, WHO GAF what someone else likes?
So many small minded people here - not you, the people who complain about such things. Can only imagine what THEY must be trying to hide lol. XD
Wan2.2 S2V 14B is missing!
what are the s2v models?
@Rickets newer model with larger training set
Is a 5B FP8 version planned?
Hello, I’ve been using your 5B T2I→I2V generation model, and I really enjoy your work. I appreciate the effort you’ve put into training it. After testing, I noticed that the sexual actions could be improved. At the moment, the penetration often looks almost static, without a clear sense of thrusting or continuous motion. As a player, this makes the scenes feel less immersive. It would be great if future updates could add more dynamic motion to enhance realism and the overall experience.
I didn't care for the 5B model. Use the larger models and reduce the resolution and maybe try some of the speed loras. You'll get better quality that way. You can always upscale and interpolate frames afterwards for a better appearance.
thanks for your work
Is it possible that Loras are simply ignored? At least character Loras that work with other base models have no effect here (14Bt2v)
I did quite a few tests to see the biases. It turns out that doing only "person" in English results in a woman nearly 100% of the time, and doing only "人" in Chinese results in a man nearly 100% of the time. It seems that switching up the language could open new possibilities if you find yourself in a rut.
I've heard people mention that this model may follow Chinese language prompts better than English. If I'm not careful about a prompt with a woman, 9 times out of 10 she may be Asian.
Hi! All of my renders using this model are very saturated, even when I prompt "natural colors" and put "oversaturated colors" in the negative. My renders are good quality, but the colors lack realism. Any advice?
Thanks :)
You can always up the guidance scale so the model adheres to your prompts better. You need to be more specific in your prompts (even weighting them) and be very particular about the style you want. (artistic styles will be more vibrant and saturated). Here's more info:
Quality Control Breakdown:
Color Balance Achieved By:
Positive: natural color grading, ethereal magical atmosphere
Negative: (overexposed, underexposed, harsh shadows:1.5), (simple flowers, dull colors:1.6)
Exposure Correction:
Positive: perfect exposure, soft twilight glow
Negative: (overexposed, underexposed, flat lighting:1.5)
Saturation Control:
Positive: iridescent holographic glowing foliage, vibrant yet calm
Negative: (dull colors, no color:1.4) - ensures vibrant but not oversaturated
Sharpness Enhancement:
Positive: sharp focus, ultra-detailed, 8K
Negative: (blurry, out of focus, soft focus:1.4)
Noise Reduction:
Positive: clean image
Negative: (noisy, grainy, film grain, compression artifacts:1.3)
Another pro tip:
Camera & Lens Specific Prompts (Adds Realism):
shot on Phase One IQ4 150MP (ultimate sharpness)
shot on Fujifilm X-T5 with Classic Negative film simulation (color)
shot on Sony A7R V with G Master lens (detail)
shot on Hasselblad 907X (color & sharpness)
shot on Leica M11 with Summicron lens (character)
love this ! keep at it <3
For some reason it makes only noise instead of video in I2V mode. It used to work fine just a month ago.
I've noticed this locally on my PC as well. I have never been able to get it to work. Only 2.1 seems to work.
Seems they are disabling certain functionalities, updating can mean models lost. Its free... guess it wont be soon.
review