[Edit:
Version v5.0 works with latest comfyui (v0.15.0).
If you have any problems, please refer to the FAQ at the bottom of the page or have a look in the comments.
Many thanks to everyone who tested this workflow. Thank you very much for the many inquiries and, of course, for all the knowledge and experience you have contributed here 👍🙂
Special thanks to:
@SeoulSeeker for the "Dead Simple MMAudio" workflow, which is the basis of the audio part here,
@taek75799 for the really well-working enhanced models,
@Bakazaya for pointing out the color issue in version v3.0 and running lots of tests,
@bluntfeather for sharing the latest experiences with installing Comfyui-Easy-Install,
@nitrovtx for remaining persistent in matters of quality and running a lot of tests,
@Icey64 for providing the link to "Comfyui-Easy-Install",
@boinobin730 for asking for a First to Last Frame option, running pre-tests and responding fast as hell 🙂 and
@SnowShoes311 thank you so much again for all your buzzing 😋]
Features:
Optimized Wan 2.2 workflow, runs perfectly on an RTX 3060 GPU with 12 GB VRAM and 32 GB RAM,
"Text to Video", "Image to Video" and "First/Last Frame 2 Video" generation in one workflow, all with easy audio generation,
easy installation/model downloading, all necessary sources are specified,
easy to use workflow, clearly structured, all necessary steps are explained,
easy switches for mode selection,
easy prompt selection for fast prompt creation/testing,
easy switching between "standard" and "enhanced" models,
very fast and smooth high quality outputs up to approx. 1440 x 960 at 60 fps,
2x fast upscaler,
4x fast framerate multiplier,
MMAudio Sampler (generates sound according to the video action),
Triton and Sage Attention option,
A 5-second high quality video generation takes about 10 - 15 minutes (see below).
Tested generation times:
As a rough guide for an RTX 3060 GPU, generating a 5-second high quality 1440 x 960 60 fps video with 6 steps takes:
t2v: around 10 - 12 minutes,
i2v: around 15 minutes.
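The target numbers follow from the two post-processing stages listed under Features. Here is a minimal sketch of the arithmetic, assuming a 720 x 480 base render (the test resolution mentioned in the comments) and a base frame rate of 15 fps. The 15 fps value is only an assumption chosen so that the 4x multiplier lands on 60 fps, so the real base rate of the workflow may differ. The function name is illustrative only.

```python
def final_output(base_width, base_height, base_fps, upscale=2, frame_multiplier=4):
    """Return (width, height, fps) after the 2x upscaler and 4x frame rate multiplier."""
    return base_width * upscale, base_height * upscale, base_fps * frame_multiplier


# Assumed base values: 720 x 480 render, 15 fps before frame interpolation.
width, height, fps = final_output(720, 480, 15)
print(f"{width} x {height} at {fps} fps")
```

Lowering the base resolution for quick tests scales the final output the same way.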
Comfyui-Easy-Install with Triton + SageAttention:
This workflow should work with any recent comfyui version newer than v0.6.0 (Desktop, Embedded, Windows/Linux).
However, comfyui is developing rapidly, and it often happens that some of the custom nodes used are not updated quickly enough or not updated at all. Manual workarounds are sometimes necessary. Furthermore, care must be taken to ensure that there are no conflicts with other nodes.
If you're having difficulties with your existing comfyui system, or if you want to run video generation on a separate (parallel) comfyui system like I do, I would recommend the following installer: https://github.com/Tavris1/ComfyUI-Easy-Install.
Complete installation of comfyui, including the manager and some preconfigured custom nodes, is just one click - really 🙂
Installation of Triton + SageAttention is just a second click - really 🙂 And since it's so easy now, I would definitely recommend it for video generation.
Because it is an embedded version, you can install it in parallel with your existing comfyui version without the risk of ruining your working system.
After installation just configure the "extra_model_paths.yaml" file to use your existing models.
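As a rough sketch of what such a configuration could look like, the snippet below builds the text for "extra_model_paths.yaml". The key names (base_path, diffusion_models, loras, vae, upscale_models) follow the example file shipped with comfyui as far as I know, so please compare them with the "extra_model_paths.yaml.example" in your own installation. The path D:/ComfyUI and the function name are purely hypothetical.

```python
def build_extra_model_paths(base_path):
    """Build YAML text that points a second comfyui install at existing model folders."""
    lines = [
        "comfyui:",
        f"    base_path: {base_path}",  # root folder of the existing installation
        "    diffusion_models: models/diffusion_models/",
        "    loras: models/loras/",
        "    vae: models/vae/",
        "    upscale_models: models/upscale_models/",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical location of the existing comfyui installation.
print(build_extra_model_paths("D:/ComfyUI"))
```

Copy the printed text into the yaml file of the new installation and restart comfyui so the model lists are rescanned.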
After a fresh installation of Comfyui-Easy-Install you might have some issues too, but there are known workarounds - please see the FAQ below.
For testing/understanding/experimenting/changing the workflow:
Click "Toggle Link Visibility" to see the links.
Click the Subgraph symbols to open the Subgraphs.
for quick testing you may lower the settings for steps, clip length and video resolution,
be really careful when modifying Groups or Subgroups (even Title or Color) because they are essential for switching,
feel free to try and test other models. Just give me a hint if you find models which deliver better results and fit the 12 GB VRAM limit.
And as usual: Have Fun 🙂🙂
Short Conclusion:
This workflow is based on elements of a variety of already published workflows. My "job" was only to put things together, optimize them for a small machine and create a simple and hopefully user-friendly, or even "beginner"-friendly, workflow.
I'm not an "expert" - just a user who wants to get it running on "available" hardware.
There are many things I don't really understand. If you find mistakes or better solutions please give me a hint.
And I really hope that even "beginners" have a chance to go the first steps...
Frequently Asked Questions (FAQ):
For a quick and better overview I will try to merge all known issues here - step by step (please be patient). If your issue is not listed here, please have a look in the comments first. Most issues have already been discussed.
Comfyui Nodes 2.0:
Turn off Nodes 2.0 in comfyui (use the comfyui menu). Currently not all custom nodes are supported.
Comfyui crashes after generation during VAE decode, upscaling or frame rate multiplying (Rife VFI) without any error report:
This is a RAM problem (not VRAM). Increase your swap file (min. 64 to 128 GB) or set it to automatic management on a fast drive with at least 100 GB free space.
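A quick way to check the free-space part of this advice is sketched below. It only checks free disk space, not the swap file settings themselves, and the helper names are hypothetical. The 100 GB threshold is taken from the text above.

```python
import shutil

REQUIRED_FREE_GB = 100  # from the FAQ entry: at least 100 GB free on the swap drive


def enough_free_space(free_bytes, required_gb=REQUIRED_FREE_GB):
    """Return True if the given number of free bytes meets the requirement."""
    return free_bytes >= required_gb * 1024**3


def check_drive(path):
    """Check the drive that contains `path` using the standard library."""
    usage = shutil.disk_usage(path)
    return enough_free_space(usage.free)


# "." means the drive of the current folder; pass the drive holding your swap file instead.
print("enough free space:", check_drive("."))
```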
JW Nodes (JWFloatToInteger, JWIntergerDiv, JWImageResizeByLongerSide), soundfile missing:
For the workaround look here and here:
python -m pip install soundfile
Fresh Comfyui-Easy-Install installation (missing soundfile and PyTorch v2.9.0 issue with SageAttention on Windows):
For full conversation look here.
Open cmd in python_embedded folder:
python -m pip install soundfile
python -m pip uninstall -y torch torchvision torchaudio
python -m pip install torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0 --index-url https://download.pytorch.org/whl/cu126
Slider Nodes - how can I modify the "default" values:
Right click the slider node, choose Properties and set the values you like 🙂🙃
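For the fresh-install entry further up, the pinned PyTorch versions can be verified afterwards with a small script run by the python in the python_embedded folder. This is only a sketch. The expected versions are copied from the pip command in that entry, and the function names are made up for illustration.

```python
from importlib import metadata

# Versions taken from the pip command in the fresh-install FAQ entry.
EXPECTED = {"torch": "2.8.0", "torchvision": "0.23.0", "torchaudio": "2.8.0"}


def matches_pin(installed, expected):
    """Compare versions, ignoring a local build suffix such as '+cu126'."""
    return installed.split("+")[0] == expected


def report():
    """Print the installed version of each pinned package and whether it matches."""
    for name, expected in EXPECTED.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            print(f"{name}: not installed")
            continue
        status = "ok" if matches_pin(installed, expected) else f"expected {expected}"
        print(f"{name}: {installed} ({status})")


report()
```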
Description
Switched back to the Wan 2.1 Lightning LoRA as the "standard" setting, because this seems to work well in most use cases and does not tend so strongly towards "slow motion" effects.
Changed KSampler to "euler + simple" as used by the official workflow.
Fixed some minor "bugs".
Comments (68)
I gave this a shot tonight to see how it worked. I keep getting an error regarding a node - can't load image based on number.
Can't find it anywhere in custom nodes. Suggestions?
[edit: ok, I see. You can't install the node. Search for mikey-nodes in the comfyui manager].
That's the option for random batch loading of images. Turn it off or just enter your local image path if you'd like to use it 😉
edit: swap file issue; increased from 2->32GB
This workflow crashes (the process closes, without any relevant logs) while running the upscaler step. After i bypassed it, the workflow works perfectly. (rtx 4070 super 12gb vram). I ran it without sage attention with default settings, I2V with first and last frame.
[edit: right test resolution is: 720 x 480]
I don't know your GPU, but it should run without problems. Have you selected the exact same upscaler? Any other heavy tasks in the background? At least 32 GB RAM?
Just to be sure - please do a test run with T2V: 5 seconds, 6 steps, 720 x 780 resolution and report back.
I notice people suffering for OOM - out of memory - errors, or general crash of browsers. I was one of those poor souls. Even with 12GB VRAM and 32RAM. Then I noticed, for God's sake (and please spread the word!)
1 ) INCREASE your SWAP file (windows or linux)! Not 8gb, but, say, 30GB or more (here I set Minimum 30gb, Maximum 30gb)! If you have multiple SSDs, you can put it on the fastest one. Some say parallelism is better (15GB SSD1, 15GB SSD2) but I guess there is no difference because swap files are accessed sequentially.
2 ) If the workflow crashes in VAE DECODE, use a VAE DECODE TILED node, with 256 tiles. If OOM, set 128. If OOM, then BYPASS vae decode, BYPASS video combine and everything later. And then create a new node (SAVE LATENT) and connect to it instead of vae decode. It will save in the \latents folder. Then you COPY it to the \input folder, and use another workflow with: LOAD LATENT, then video combine, upsample and the others you bypassed. Then continue your work and be happy.
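The fallback order in point 2 can be written down as a small sketch. In the workflow these are node swaps done by hand, not code, so everything here (the exception class, the function names and the fake decoder) is a hypothetical stand-in that only illustrates the order of the attempts.

```python
class OutOfMemory(Exception):
    """Stand-in for an out-of-memory failure during decoding."""


def decode_with_fallback(decode_tiled, save_latent, latent, tile_sizes=(256, 128)):
    """Try tiled decoding with shrinking tiles, then fall back to saving the latent."""
    for tile in tile_sizes:
        try:
            return "decoded", decode_tiled(latent, tile)
        except OutOfMemory:
            continue  # try the next, smaller tile size
    # Nothing fit into memory: keep the latent so it can be decoded in another workflow.
    return "saved_latent", save_latent(latent)


def fake_decode(latent, tile):
    """Pretend decoder that only fits into memory with tiles of 128 or smaller."""
    if tile > 128:
        raise OutOfMemory()
    return f"video from {latent} with tile {tile}"


def fake_save(latent):
    """Pretend SAVE LATENT step that returns the stored file name."""
    return f"latents/{latent}.latent"


print(decode_with_fallback(fake_decode, fake_save, "clip01"))
```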
@schsch First of all, my above comment contained a typing error: the right test resolution is 720 x 480 of course - sorry. If you used the wrong resolution, could you please test it again?
Ok, configuring the right swap file size is generally essential if you run into RAM errors. Using a tiled vae decode is possible, but with the above max. resolution it should work without it.
@schsch this solved my issue; I had a 2GB swap file and I increased it to 32GB; works perfectly now, thanks!
@pikapikarr529 I can't really see much on your small screenshot. But if there are no sliders then something is definitely wrong on your side: browser refresh? node refresh? all nodes installed without errors? etc.
@arkinson I have the same issue. If I reload the nodes the sliders reappear but disappear within a second.
@ajreyz1384 @pikapikarr529 Please look here: "Sliders fail with new update · Issue #38 · Smirnov75/ComfyUI-mxToolkit". This is an older issue, but in my opinion you might have a comparable problem: a conflicting node. You could try to find the node, open a new github issue, or just replace the slider nodes with similar input nodes, for example.
Fixed by disabling the mixlab node.
@pikapikarr529 Thank you, that fixed it.
Awesome work! thanks for your service :-)
Hi - thank you so much 👍
Insane workflow. Finally I'm able to generate videos after wasting many hours on bad tutorials.
I do not understand what the 4 pos prompt text areas are for. I tried to use text2 and got TypeError: 'NoneType' object is not iterable
Hi - thank you so much. They are just for your comfort 🙂
For example: use No 1 for generating, No 2 for creating a new prompt in the meantime and No 3 to "save" your best one for quick use. Of course, always choose a prompt with some text in it, otherwise you will get the 'NoneType' object error.
So I was just troubleshooting the same error in Comfy, and your post here made a light-bulb go off over my head.
Arkinson, it would seem text to video only works with enable text 01.
enable text 02-04 shoots out the TypeError 'NoneType' error. Something to look at for the next iteration.
[edit: you are right, see my next comment]
@thegrotfarmer353 Workflow version v2.1: activate text node 02 for example and type in a prompt. This will work with T2V and I2V. If you get the "TypeError 'NoneType'" this simply means there is a missing input somewhere. The most common "user" errors are: more than one text node activated, no prompt typed in, wrong models used, no model selected, connections accidentally deleted, etc. Please check twice.
@thegrotfarmer353, @pinoka you are right. In version 2.1 I accidentally set a wrong prompt connection, which means that for T2V only text 01 works. I never realised that 🙄 Thank you for your hints 👍 I will publish a fixed version soon.
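The error message itself is plain Python: it appears whenever something tries to loop over a missing (None) input. The sketch below reproduces the message and shows a hypothetical check for the four prompt slots. It is not the code of any node in the workflow, and the function name is made up.

```python
def pick_prompt(prompts, enabled):
    """Return the single enabled, non-empty prompt or raise a readable error."""
    active = [text for text, is_on in zip(prompts, enabled) if is_on]
    if len(active) != 1:
        raise ValueError(f"exactly one text node must be enabled, found {len(active)}")
    text = active[0]
    if text is None or not text.strip():
        raise ValueError("the enabled text node is empty")
    return text


# A missing input behaves like None, and looping over None gives the message from the comments.
try:
    for _ in None:
        pass
except TypeError as error:
    print(error)  # 'NoneType' object is not iterable

print(pick_prompt(["a cat walks through snow", None, None, None], [True, False, False, False]))
```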
I will soon get an RTX 5060 Ti 16GB but wanted to test your workflow on my old card with only 6GB VRAM.
It's working.
Time: 35+ min
Generation Steps: 4
Without Upscaling or Triton.
Hi - thank you for your feedback. I would never have thought of even trying that with 6 GB VRAM 🙄 And thank you so much for your buzzing 😋
@arkinson I have to say thank you for this great workflow. So take the Buzz and a big thank you 👋🏼
@Cookie_Collector Good luck with the new "toy" 😉
Hi, I'm new to ComfyUI. I'm getting a blurry screen instead of a normal final render. Do I need to use the high and low pass lightx2v loras along with the other loras I want to use? I downloaded everything, followed the instructions, then filled the prompt, set the weight to 3 for the high pass and 1.5 for the low pass, but I'm still getting a blurry image. Just a blurry nothing. Res is 480x480. I have RTX 3060Ti 8gb
Hi - welcome to comfyui 🙂 Video generation generally is pretty advanced. If you are really a "bloody comfyui beginner" I would advise you to start very simple (look for the simplest workflows you can find) and learn step by step.
Blurry outputs: I would guess you use wrong Models/Loras/weights. Please do a simple test: use my version v2.1 and start a simple T2V generation without any additional Loras. That should work out of the box.
@arkinson thanks man. So what worked for me - for I2V I chose the default Loras and Q2 high and low models + additional nsfw high and low Loras, all weights are 1.0, on all Loras, even on the default ones. And now it works. So I should choose 4 Loras for nsfw generation, am I right? 2 default ones on the main node that is above the 02 Batch Image description node and 2 additional ones for the 04 Additional Loras node. A 720x480 3 sec render at 6 steps took 4 minutes on my GPU
@wemite I'm glad you got it running 🙂 Yes I know, in the beginning it is all a little bit confusing.
You must always distinguish between Wan 2.1 and 2.2 Loras. Even though we are generating with Wan 2.2 models, you can use Wan 2.1 Loras too. The "trick" with Wan 2.1 Loras is to use higher Lora weights (see my description in the workflow "04 Additional Loras").
The "default" lightx2v Loras are Wan 2.1 Loras and are necessary for the fast 4-6 step generation. You always "have to use" them. I would recommend my "default" weight settings: 3.0 High / 1.5 Low.
In the Power Lora Loader node "04 Additional Loras" you can add all the Loras you like. For the weights just see the above mentioned description in the workflow.
Please keep in mind, these are just the "best/optimal" settings for most users to get started. If you want something more "confusing" to play with, you might have a look here: https://civitai.com/models/1852904/wan-22-workflow-optimized-for-rtx-3060-12-gb-vram-gpu?dialog=commentThread&commentId=955554
Btw. it is really amazing you got it running with 8 GB VRAM and the Q2 models 👍
@arkinson thank you so much. It really helps <3. Before, I worked in 3D animation for 3 years. I certainly had to work with nodes a lot, and I also worked a little with A1111 back in 2022. But the node system in AI + ComfyUI made me think a lot :D
@wemite I believe the most annoying thing is that there are now thousands of nodes and almost everything is poorly documented or not documented at all 🙄
@arkinson btw it's working perfectly with Q4_K_M now. And faster lol, idk how. It's rendering with perfect details for 480p. With Q2 it was very blurry, not even close to what I have now.
@wemite Thank you for the information. Yes, there seem to be significant differences between the models. To be honest, I don't understand the whole nomenclature of the models 🙄 Unfortunately it is mostly all trial and error.....
480p - So you can generate 5 seconds at 480 x 640 with 8GB VRAM? Sounds good 🙂
@DerDaAgropesca Hi - thank you so much for buzzing 🙂
I tried several other workflows, but I got tired of generating videos because I encountered node errors or it took over 30 minutes to generate a 5-second video.
When I tried v2.1, by simply adding a few nodes, a model, and Lora, it ran smoothly and I was able to generate videos of satisfactory quality in a short amount of time.
I'm using the desktop version so I haven't been able to try Sage Attention, but even so, using i2v on an RTX4060Ti with 16GB VRAM and 32GB RAM, I was able to generate an 8-second video in about 15 minutes, and a 5-second video in 8 minutes.
Thank you so much for sharing your great workflow.
Hi - thank you so much for your feedback 🙂 I had exactly the same troubles you described a few months ago - endless testing of mostly undocumented workflows with lots of errors and mostly made for 24 GB VRAM. So, after getting the basics running on 12 GB VRAM, the idea was born to publish a "mostly" organised workflow with clearly mentioned specifics 🙂 And seriously, I would never have thought that it would get more than 7k downloads 🙄 As I can see from the comments, it seems useful for a lot of users, even if some still have problems getting the "basics" running.... It is very interesting - and I have learned a lot myself over the last weeks 🙂
Yes - Sage Attention is not as important as often portrayed. But if you like experimenting I can really advise you installing the comfyui embedded version parallel to your desktop version. It is a perfect testing environment with no need to touch anything on your working desktop version.
And thank you for buzzing 😋
I spent all day trying multiple workflows for my specs (3080ti 12GB) and this is by far the "simplest" one to get going, so nice work. Are there any recommended "general" LoRa aside from basics to use? Any favorites?
Hi - thank you so much and I'm glad it's useful for you 🙂 I mostly use character Loras or concept Loras myself, for example to force special actions, because especially for I2V it is sometimes hard to get the right movements without Loras. Except for the already included lightx2v Loras for speed I did not use any "general" Loras. But maybe others could give some recommendations, e.g. for "styles", "details" etc.
Hi team. I'm having trouble getting Comfyui to save the metadata, so that when you upload to civitai it automatically recognizes the resources you use. Despite having the save_metadata on true in the Video Combine Final video node, it is still not auto adding resources.
What is your secret? Saving as a different file type? Dark sorcery?
@thegrotfarmer353 Hi - as far as I know there is currently no way to save a video with metadata that civitai can read, like the "save image with metadata" node does for images. If you set save_metadata to true in the "video combine" node, it does save the metadata (including your workflow) - but it is not readable for civitai. So the dark sorcery is to manually add your data in civitai 😉 I know, that is really annoying 😕 There may be some "tricks" to get it working. If someone finds a simple solution, please let me know.
@arkinson Well, thanks for getting back to me. I'll start furiously clicking.
Hey everyone, I downloaded all the files from the links inside the workflow, but when I run it, it silently crashes when it reaches the CLIP loader node. What should I do?
@aigeneViolinist Any error messages? Your system is up to date? Right Swap File size? Just look here: https://civitai.com/models/1852904/wan-22-workflow-optimized-for-rtx-3060-12-gb-vram-gpu?dialog=commentThread&commentId=957940
Hi, thanks a lot for this workflow.
Can I use this Lora with your workflow and how would it work?
https://civitai.com/models/1811313?modelVersionId=2190476
?
Thanks a lot!
@tsfsdfdsf32323 Hi - thank you. You can try/use any Wan Lora. Just look at my description in the workflow - see "04 Additional Loras".
Any tips on getting the camera/PoV/viewer to move?
T2V or I2V???
@arkinson Sorry; I2V
@darklordofnoobs432 I entered "camera pans and zooms in" in my text entry and it worked ok for me
@darklordofnoobs432 OK, T2V is mostly very easy. For I2V it is sometimes really hard to get the desired motion. You can try "stronger" prompting or increase the weights of your prompt as usual, for example: "(pan camera from left to right:1.1) and (zoom in:1.2) slowly". But mostly it depends on your start frame image. Trying another image is often the best/quickest solution.
And just to be sure: use my standard setting with the old Wan2.1 lightx2v Lora, because some newer Wan 2.2 lightx2v Loras produce extreme slow motion or nearly "no" motion.
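If you test many weight combinations, the "(text:weight)" notation from the example above can be generated instead of typed by hand. This is just a small string helper with a made-up name. It does not call comfyui in any way.

```python
def weighted(text, weight):
    """Wrap a prompt fragment in the (text:weight) notation used in the example."""
    return f"({text}:{weight})"


prompt = f"{weighted('pan camera from left to right', 1.1)} and {weighted('zoom in', 1.2)} slowly"
print(prompt)  # (pan camera from left to right:1.1) and (zoom in:1.2) slowly
```

Paste the printed line into the active prompt text node and raise the weights in small steps.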
This is great! 10 second HQ video in 420 seconds. Thank you for this.
@Softfish23 Thank you so much. I'm glad you like it 🙂 It seems you use a very fast GPU 🙄 because 7 minutes generation time is about 3 - 4 times faster than mine.
This is by far the best workflow for us 12GB VRAM poorfags. Thank you!!!
@revel72349621 Thank you so much - and happy generating 🙂
I commented on this 2 months ago and it still the best workflow for wan 2.2
@ThaiAI0 Thank you so much again - I'm glad this is helpful 🙂 And thank you for buzzing 😋
The best workflow, can you also add faceswap for a more consistent face?
Hi - I have no experience with video face swapping. So what are your expectations?
- swapping a face in an existing video?
- generating T2V with a specific face?
- enhancing face consistency in I2V generation? etc.
Which tools do you already use?
@arkinson enhance face consistency in I2V generation. I know there is a comfyui module (comfyui-reactor-node) but it got banned. You can find it again but with an nsfw filter, which can be disabled
@fzdxgfchgvjhkjl Mmh - this seems to be a "normal" face swapper for images. My question was more about your experience with video generation 🙄
Especially in I2V I have nearly no problems with face consistency myself - unless you try to generate very long clips. But in this case you have a lot of additional problems too, like looping etc. A good workaround is to use the first to last frame option.
I'm suddenly getting crashes when the blockswap occurs on the low ksampler. Any ideas what could cause that?
@Applefees Any error message? RAM/VRAM issues? Changed/updated system/installed other nodes??? etc.
@arkinson Me too. The console says:
Warning: Ran out of memory when regular VAE decoding, retrying with tiled VAE decoding.
C:\Comfy\ComfyUI_windows_portable>pause
Cannot thank you enough - thought I'd never get i2v with my 3060. Now I'm getting good results using as low as 256x256 start images. Fkg amazing, thank you!!
Thank you so much and happy generating 🙂
Very good, working great!
But I'm having trouble with Image2Video.
I verified that only the following nodes were enabled:
- Enable 02 Images to Video
- Enable 01 Images to Video
- Enable 01 Single Image
1 text prompt:
"Dancing"
When I generate the video:
- it gets the image size
- it generates a video only from the text prompt with the image aspect ratio.
any idea on how to fix it?
@Cheguevara Mmh - hard to guess what your mistake is, but your prompt does not seem to be descriptive. Please run a simple test: as a start image use an image of a man in a blue jacket, for example, and create a simple descriptive prompt like: "the man in the blue jacket is dancing". This should work out of the box.
@arkinson in reality I was using the Unet t2v instead of i2v... anyways thanks for the help and great work!!!!!!!!!