What does this workflow do?
This workflow takes your input image, crops/resizes it if needed to the ideal Cosmos render size, then automatically creates an appropriate prompt for Cosmos to work its magic! The result will be a hopefully-amazing video that your family can cherish for generations.
This process depends on both Florence (for automatically describing the image) and an LLM (for creating a video prompt from that description); a rough sketch of the two stages follows the steps below.
Further instructions and links are included in the workflow.
Operation is extremely simple after the initial setup (model loading/LLM configuration):
1. Load an input image.
2. Queue prompt. Really, that's it. Every other setting should be good to go.
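For the curious, here is a minimal Python sketch of what the two captioning stages do conceptually. It is not the workflow's actual node code: the Florence-2 model ID, task token, Groq model name, and prompt template are all illustrative stand-ins.

```python
# Conceptual sketch of the caption -> video-prompt pipeline (NOT the actual
# ComfyUI nodes). Model IDs and the prompt template are illustrative.
import os

import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stage 1: Florence-2 describes the input image.
processor = AutoProcessor.from_pretrained(
    "microsoft/Florence-2-large", trust_remote_code=True
)
florence = AutoModelForCausalLM.from_pretrained(
    "microsoft/Florence-2-large", trust_remote_code=True
).to(device)

image = Image.open("input.png").convert("RGB")
task = "<MORE_DETAILED_CAPTION>"  # Florence-2 task token for a long caption
inputs = processor(text=task, images=image, return_tensors="pt").to(device)
ids = florence.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=512,
)
caption = processor.batch_decode(ids, skip_special_tokens=True)[0]

# Stage 2: an LLM turns the caption into a motion-focused video prompt.
# Groq exposes an OpenAI-compatible API, so the standard client works.
from openai import OpenAI

llm = OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key=os.environ["GROQ_API_KEY"],  # see the Plush/Groq notes below
)
resp = llm.chat.completions.create(
    model="llama-3.3-70b-versatile",  # illustrative Groq model name
    messages=[
        {"role": "system", "content": "Rewrite image descriptions as cinematic, motion-focused video prompts."},
        {"role": "user", "content": caption},
    ],
)
video_prompt = resp.choices[0].message.content
print(video_prompt)
```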
Have fun, and I look forward to seeing your creations!
Setting expectations: this model is HEAVY. Using the 7B model with the included (optional) optimizations, a 121-frame video at 1280x704 takes me about 15 minutes and 20GB of VRAM on a 4090.
Description
Reduced jitter/wiggles/glitches in videos thanks to tips from the Banodoco Discord:
Added a max_sigma setting
Changed the sampler to Euler Ancestral (Euler A) with the beta scheduler
Improved notes throughout to help people get started and find help for common issues.
Comments
Could you provide a link to the AI service/model you use for the automatic prompts, please?
The node pack for Comfy: https://github.com/glibsonoran/Plush-for-ComfyUI. You must follow the directions on that page to set the environment variables BEFORE starting Comfy in order for Groq to work.
Groq offers free cloud inference for these models with an API key; you can create an account and generate a key at https://console.groq.com/keys.
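A quick sanity check that the key is actually visible to the process that will launch Comfy (GROQ_API_KEY here is a stand-in; use the exact variable name the Plush README specifies):

```python
# Run this from the same shell/venv you launch ComfyUI from.
# "GROQ_API_KEY" is a stand-in; use the exact name from the Plush README.
import os
import sys

if not os.environ.get("GROQ_API_KEY"):
    sys.exit("Key env var not set - set it BEFORE starting Comfy, then relaunch.")
print("Key found; Plush should be able to reach Groq.")
```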
You gotta meet my girl Florence. She really kicks the llama's ass.
I'm getting over 400 s/it for 121 frames at 1280x704 with 16GB VRAM, without any optimizations, on Windows. That's over an hour to generate. I heard Cosmos is supposedly faster than Hunyuan. My setup must be borked?
Yes, that sounds extremely slow. Ensure your torch, CUDA, drivers, etc. are up to date. I can't provide individual support, but to update torch you would activate your virtual environment (venv) if you're using one, then use the install command here: https://pytorch.org/get-started/locally/
@EnragedAntelope I checked my torch version and it says 2.5.0+cpu. When I tried updating torch, it said "Requirement already satisfied" for everything. I'm currently on CUDA 12.4 and Python 3.12.7. My graphics drivers are up to date as well. Not sure what else it could be. I'll try reinstalling my drivers. Thanks for the help tho
@bhopping That +cpu suffix means your torch build has no CUDA support, which is also why pip reports everything as already satisfied. Try uninstalling and reinstalling to make sure you get at least 2.5.1+cu124.
If it's running on CPU, that's definitely a big part of the speed issue.
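A quick way to confirm which build you actually have (run it in the same venv Comfy uses):

```python
import torch

print(torch.__version__)          # a "+cpu" suffix means the build has no CUDA support
print(torch.cuda.is_available())  # must be True, or everything falls back to CPU
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```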
@EnragedAntelope thanks I'll go ahead and try that
@EnragedAntelope the generation speed seems to be the same. I'm not too worried about it right now. I'll do some more digging tomorrow, ty : )
I am getting 40.46 s/it on an ASUS OC RTX 4090; it takes me about 26 minutes to run this workflow. Also, I am not using Sage, not sure if that adds time compared to others. I am showing 22.5GB of VRAM in use.
@yajukun Seems slow. I also run at about that amount of VRAM but complete in about 15 minutes or less (with Sage, which should save a decent amount of time).
@EnragedAntelope Ahh...Ok, I guess I better tinker with Sage and get it installed. Thanks.
So happy to see that Predator's planet has better dental than Earth. Good for her. Not sure how far that dagger will get her, but chompers on point.
This doesn't seem to work out of the box, even after you install the nodes. It says Flash Attention isn't working. How do I install Flash Attention 2, and where do I download it?
I think I included a note in the workflow?
You can just switch the Florence node from Flash Attention to SDPA. The Florence step will take very slightly longer, but it's really not bad; the sketch below shows the idea.
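For context, switching that dropdown roughly corresponds to the attention backend passed when the model is loaded. A transformers-level sketch, not the Florence node's actual code:

```python
# Illustrative only - the Florence node handles this for you via its dropdown.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Florence-2-large",
    trust_remote_code=True,
    attn_implementation="sdpa",  # PyTorch's built-in attention; works everywhere.
    # "flash_attention_2" is faster but requires the flash-attn package.
)
```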
Thanks, this works great except I had to bypass Sage. Are we stuck with landscape or does it support portrait resolutions?
Yes, you can do portrait, but it only supports side lengths from 704 to 1280 pixels, so you get a somewhat odd aspect ratio compared to what you're used to. It can work, though; a rough sketch of the fitting logic is below. I have an update for this workflow that helps automate that, just seeing if I can improve it any further before posting.
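If you want to precompute a portrait size yourself, the idea is just to scale the image so both sides land in that range. The multiple-of-16 snap below is my assumption, not a confirmed Cosmos requirement; check what the workflow's resize node actually enforces.

```python
# Fit an image into Cosmos's supported range: both sides in [704, 1280].
# The multiple-of-16 snap is an assumption, not a confirmed Cosmos requirement.
def fit_cosmos(width: int, height: int, lo: int = 704, hi: int = 1280, snap: int = 16):
    scale = max(lo / min(width, height), 1.0)    # grow until the short side reaches lo
    scale = min(scale, hi / max(width, height))  # but keep the long side within hi
    w = round(width * scale / snap) * snap
    h = round(height * scale / snap) * snap
    return max(lo, min(hi, w)), max(lo, min(hi, h))

print(fit_cosmos(832, 1216))  # typical portrait input -> (832, 1216), already in range
print(fit_cosmos(500, 900))   # small portrait input  -> (704, 1264)
```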
Not sure what kind of voodoo black magic you guys are using to get Sage working on Windows but it ain't working out for me. I think I need to just reinstall Comfy from scratch if I want the speed increase. Ugh.
Sage can be a real pain. But a nice speedup! Definitely try to follow the guides floating around.
BTW, thanks for your posts, great stuff!
@EnragedAntelope I'll give it a shot this weekend when I have more free time to start from scratch. Thanks.
@EnragedAntelope Ok, after a lot of pain and suffering I think I have my new install working with Sage. Still testing, but now it's showing 18.x s/it. Is that about right? It looks like it's going down as it runs; now it's showing 17.8 s/it at 35%. Seems stuck at 17.81 s/it, won't go lower.
@EnragedAntelope So after about 5 runs, it looks like my generation times have decreased from 26+ minutes to slightly under 12 minutes with Triton/Sage. Thanks bro, you were right, it really was worth the hassle.
@yajukun Really happy you got it working! Sage will help with other workflows as well (you can add a Kijai SageAttention model patcher after loading any Flux/SDXL/etc. model to speed those up too - hot tip :) ).
That is a massive speed increase, really great to see.
I got it working with the installation walkthrough in this post: https://old.reddit.com/r/StableDiffusion/comments/1h7hunp/how_to_run_hunyuanvideo_on_a_single_24gb_vram_card/
@klapperklaus YES! That's the same post I followed; I did up to step 4g. Thanks!