Hunyuan Video
Files marked Kijai are only for use with Kijai's nodes. You do not need them for Comfy Native.
Full Guide to picking the correct file above
Workflow for 8GB Card users
Uncensored llama will work with COMFY Native
Using the Kijai-marked models on COMFY Native will cause rainbow or black output.
I do not recommend the FP8 VAE unless you are trying to fit all models into the GPU; see the guide for 4090 full-GPU launch commands.
Technical details regarding "Uncensored"
The model used for Hunyuan was based on the llava-llama-3 8-billion-parameter LLM. The Intel vision-tuned model was used to refine the tokenized model, restoring over 5 million values.
Comments (118)
I love the effort you put into all this, but what exactly does this do? Video2Video?
It is a multi-use model with text2img, txt2video, and video2video. It might have text-to-3D as well.
Hey, so could you help me understand what this means. I already have a running workflow using hunyuanvideo wrapper.
I use the previously existing FP8.
I have a gtx1080ti (11gb vram)
And now I have this "scaled" instead but I don't notice any difference.
Any ideas what this does in practice ?
according to the Tencent site, it can save ~10gb of vram (https://github.com/Tencent/HunyuanVideo/commit/5ab5edef9ac734966d5358e793470900ba7db189). Now, that's assuming you configure it appropriately, which i personally haven't figured out how to do yet ;-) i think we'll find out more soon
The stored size is still a factor when it comes to loading the model. Right now they do not have dynamic block swapping, but that may come in the future. For those with an Ada-generation card, FP8 with FP8 attention could be faster than BF16, but for most that is not the case, assuming the model can fit into VRAM.
No idea what's wrong with the fp8 on my end. I can't make it work; it only outputs video noise and gives this error: unet unexpected: ['double_blocks.0.img_attn_proj', 'double_blocks.0.img_attn_qkv', 'double_blocks.0.txt_attn_proj', 'double_blocks.0.txt_attn_qkv', 'double_blocks.0.img_mod.linear', 'double_blocks.0.img_mlp.fc1', 'double_blocks.0.img_mlp.fc2', 'double_blocks.0.txt_mod.linear', 'double_blocks.0.txt_mlp.fc1', 'double_blocks.0.txt_mlp.fc2', 'double_blocks.1.img_attn_proj', 'double_blocks.1.img_attn_qkv', 'double_blocks.1.txt_attn_proj', 'double_blocks.1.txt_attn_qkv', 'double_blocks.1.img_mo (about a full page of it, just a sample here). Anyway, bf16 works (very slowly, but heh).
Did you make sure scaled FP8 was selected? Is Comfy updated?
I updated it internally, I'm not sure about fp8 scaled though, where does this appear ? Maybe I need to download a new COMFY from the site (I use portable version, so i'm not sure the update is absolute latest when I do)
@NoArtifact I'm not sure if https://github.com/kijai/ComfyUI-HunyuanVideoWrapper works with portable
@Felldude Yeah, I think my ComfyUI version may be the problem. I'll check with another version later, thanks for the info
edit : And I'm lucky, the latest portable version is only 7 hours ago fresh ;)
@Felldude you keep mentioning this but you never specify which node you're talking about, I mean.. I see that I have a llava_llama_3_FP8_scaled.safetensors selected, but other than that I have no idea what you mean.
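A quick way to see what a downloaded file actually contains, independent of any node, is to read its safetensors header. The sketch below relies only on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header) and counts tensors per dtype. The function names are hypothetical, and what a "scaled" FP8 file should look like (FP8 weights, possibly with extra scale tensors) is an assumption, not something confirmed in this thread.

```python
import json
import struct
import sys
from collections import Counter


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without loading tensor data."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as a little-endian unsigned 64-bit int.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def summarize(path):
    """Count tensors per dtype and return the sorted tensor names."""
    header = read_safetensors_header(path)
    dtypes = Counter(entry["dtype"] for entry in header.values())
    return dtypes, sorted(header)


if __name__ == "__main__" and len(sys.argv) > 1:
    dtypes, names = summarize(sys.argv[1])
    # FP8 tensors are typically reported as F8_E4M3 or F8_E5M2; BF16 as BF16.
    print("dtype counts:", dict(dtypes))
    for name in names[:10]:
        print(name)
```

Running it with the path to the model file as the only argument prints the dtype counts and the first few tensor names, which can be compared against the names in an "unet unexpected" message.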
any chance you could explain why the need for such a massive llm for this?
There are smaller LLMs, ones based on Llava llama 3b just like the one they provide, but much smaller. I tried to look at the wrapper to see if I could force it to use a smaller LLM, but yeah, over my head.
They use a formula, but put simply, the parameter count of the LLM needs to roughly match that of the model.
Is there any way to use the fp8 version with the Comfy native node (Load Diffusion Model)?
Only when comfy adds the architecture definition to the main git
I honestly cannot tell a difference in quality between the FP8 and BF16. Both put out really good video.
It seems like you can get more motion "jitter" with FP8, at least in my experience.
@tylerburden100 Possibly, after more testing I have noticed what appears to be... I guess just less general movement and detail. Minor, minor details, but less.
I can't get the fp8 model to work with ComfyUI native. Is this version specifically for Kijai's nodes?
You need to set FP8 Scaled and right now I think only Kijai's nodes work
ok.. nobody is describing what is happening. I am getting nothing but noise on my output, is that the same as you?
I'm just getting noise. Do you have a workflow example?
It uses Kijai nodes
Unfortunately, I am getting really bad quality whatever I do :( Nodes and models are good and intact. The output looks like it took a shower with a blur brush. :D
I have the same with the 25GB model, going to try the FP8 now
@FrenzyX fp8 is the same for me. :(
Ended up downloading the models for the Kijai wrapper nodes, those seem to work well for me.
@FrenzyX Aren't they embedded in ComfyUI, or did you download them from GitHub again?
@Reelai I think his checkpoints are packed differently. I tried a mixed approach first, but ended up using all the checkpoints provided for the nodes from GitHub. Also, for me the encoder didn't download automatically, so I manually downloaded that as well. That might have been caused by me not having git lfs in advance, which I do now. Anyway, it took some tinkering and problem-solving, but I am up and running now.
increase sampling steps count
@FrenzyX I found out what the real problem is. If you have 3 seconds of video (no more than 3 seconds), details will be okay with denoise at 1.0. However, if you have more than 3 seconds of video, you need to lower denoise gradually.
@Reelai thanks, that's good to know, might try it out again in the future, as it might enable some workflows that I can't get to work atm
We're gonna get img2vid locally before gta6
Hi,
I used the 'hunyuan video fp8' vae and the model I downloaded, but when I try running the example workflow from the Hunyuan Video Wrapper in my custom nodes for ComfyUI (hyvideo_t2v_example_01.json), the output is static... where is my mistake? :(
Did you ever manage to fix this? I am having the same issue, and only with this model
@Unhing3d Yeah, the problem was with the clip. I was using bf16 in the configs for the fp8 model. I downloaded the other model for the right config (fp16 or bf16, I forgot) and it worked.
Can you upload this to a platform that runs in the browser? I have nowhere near the GPU needed :(
I have no clue whether it is hosted in the cloud. Unless block loading or a Q4 quant is supported, I don't have the card to run it either.
Is there an extension for webUI Forge that will let me run this without ComfyUI?
Not yet. If you've been using Forge for a while, you can learn ComfyUI relatively easily. The layout is different, but you'll recognize things and get used to it. Highly recommend.
@bhopping I've used Comfy in the past, and other node based systems like Blender's and UE5's. the thing i didn't get about it was trying other peoples workflows, many nodes weren't available for some reason. but i might as well try it again for this
@Hamsome_Skidword yeah that def makes it more confusing. Make sure you got custom node manager so it can detect what nodes to install when using other ppls workflows
@bhopping well... my 6gb card wasn't enough, guess i'll wait for either a lower cost model, or until i can buy a super computer
@Hamsome_Skidword Hmm, I found a Reddit thread where someone says they did it with only 6gb: https://www.reddit.com/r/StableDiffusion/comments/1ho2elu/all_in_one_custom_workflow_vid2vid_and_txt2vid/ I can't confirm that it works, but he shares his workflow and has a YouTube video on it in the thread.
When I started, I got a lot of vram errors. Turning the res down, vae decode tile and length helps with that. Hope this works out for you.
@Hamsome_Skidword Also which hunyuan video model were you using?
@bhopping i was using this model, but maybe that was the problem? maybe i should use the FP8 model from GitHub. and thanks for the help
@Hamsome_Skidword Np. The bf16 model is pretty resource heavy for the very slight quality improvement (if any), and there's even a fastvideo version of FP8 on github/huggingface which, I believe, lets you get away with 7 steps.
@bhopping i'll definitely have to check those out
do I need this if I am running a RTX 6000 Ada? I have plenty of VRAM so far. Will this be faster? Also, I am using ComfyUI portable.
is 4gb gpu good enough?
no,4gb is too small
I believe you need a lot. Something like 24GB +
@nogo 16GB is possible with FP8. 8GB should be possible with Q4, and if the model doesn't break it might work in 4GB with Q2.
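Those figures line up with simple size arithmetic. As a rough sketch, assuming the 25.6 GB BF16 file quoted elsewhere in this thread is almost entirely 2-byte weights, the parameter count and the weight footprint at lower precisions can be estimated as below. This covers the diffusion model weights only; the text encoder, VAE, activations, and the per-block scale data that real Q4/Q2 formats carry are all ignored, so actual VRAM needs will be higher.

```python
BF16_FILE_GB = 25.6        # BF16 checkpoint size quoted in this thread
BYTES_PER_PARAM_BF16 = 2   # BF16 stores each weight in 16 bits

# Estimated parameter count in billions (assumes the file is nearly all weights).
params_billion = BF16_FILE_GB / BYTES_PER_PARAM_BF16


def weights_gb(bits_per_param, params_b=params_billion):
    """Approximate weight size in GB at the given bits per parameter."""
    return params_b * bits_per_param / 8


for name, bits in [("BF16", 16), ("FP8", 8), ("Q4", 4), ("Q2", 2)]:
    print(f"{name}: ~{weights_gb(bits):.1f} GB of weights")
```

The FP8 estimate matches the 12.8 GB FP8 file size mentioned in the thread, and the Q4 and Q2 estimates leave some headroom on 8GB and 4GB cards respectively, before accounting for everything else that has to fit.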
cheapest nvidia 16GB card I can find is about 650€+
not a cheap endeavour
@Felldude ca. how long does the average low res text to video generation take for a short clip?
@nogo If you're trying to run on CPU, something like 30 hours
@Felldude I mean how long if you have a capable GPU
@nogo I have not seen generation times for a 4090. I would guess an average of 15 seconds per iteration.
Are there any tips to prevent the video from rendering in slo-mo? I always get the output at 24fps, but the actual motion of the characters is often in slow motion, sometimes not.
good job!!! is it done with a workflow?
You: Alright everyone, lets all have some fun with a video gen model.
Us: Seeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeexxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
New to trying out Hunyuan Video, can someone explain exactly where I need to put this file or if there is something else I need also?
If you're using ComfyUI, save it in the checkpoints folder
@daily_insight After that, do we need to select it from "CogVideoX Model Loader, PyramidFlow Model Loader or on LTX Load Checkpoint"?
Can it run on Forge
I'd like to know, too. Any good articles on what it is?
yeah..how can we run video on forge?
Don't believe so... only ComfyUI atm
When I try using this, the output ends up being rainbow pixels that looks like a static image on a TV. I am using the correct VAE. I am also using clip l and llava_llama3_fp8_scaled
yeah, me either
Same. Trying some other settings and will report back if I get it to work.
No dice. Tried changing every setting and still just getting static.
Are you using the Comfy Native Nodes? I experienced what you're describing with this Model Card's FP8 file when using the native node workflow from Comfy's site/blog. I switched to the BF16 (25.6 GB) file from HuggingFace and was then able to generate actual video. FYI, though, I have a GPU w/ 20GB VRAM myself, and have no way of knowing whether you might face lower VRAM limits. If you do have lower VRAM, you can indicate one of the FP8 representations in the drop-down in the native node loader, and then I think it does a 'cast' to that lower precision. YMMV. Good luck!
Same here.
Same, looks like I can only generate with the full bf16 model.
Since I first found that BF16 fixes the issue of generating just static with Comfy's official workflow and only native nodes, I found some similar troubleshooting discussion in a comment on a different workflow posted here: https://civitai.com/models/1081086/comfyui-hunyuan-text-to-video-using-loras . The issue (and a 'static'-filled generated video example) were posted there by a user 'supersuika'. This other workflow's creator noted that the one supersuika had used (I think the official Comfy org one) contained the 'Flux Guidance' node and the sd3 node, which his workflow does not, and I take it his does not succumb to that problem when using FP8 (I haven't tried his yet myself). So you might either fiddle with removing those nodes or try his workflow instead. I haven't turned my attention back to HYV yet, so I can't say. Hope this helps people conclusively figure out why this combination bombs out like it does.
Oh, I should add that I don't know what side effects might result (in functionality or quality of outcome) from omitting either or both of those nodes. I assume they're there for some reason. I just haven't gotten around to experimenting yet. I forgot to point out the possibility of something else breaking when I made my last post.
Me too. Why does this happen with FP8?
same here, Why is FP8 like this? It seems that only the BF16 model can be used
Yeh me too, any ideas what is going on?
Same here, can't get it working
@Jhaik did you find a fix ?
@Baskets521 If it is any consolation, I get rainbow output too when trying to use the COMFY native nodes. The Kijai nodes work for me, but they don't have CPU offloading, so I go from 6-15 seconds per iteration to 60-90 on the CPU
I have the same issue. You're better off using the FP8 model from Kijai if you want to be able to use the native Comfy nodes: https://huggingface.co/Kijai/HunyuanVideo_comfy/tree/main
Just get: hunyuan_video_720_cfgdistill_fp8_e4m3fn.safetensors
@DomDomTomTom Yes that one works, but what's the difference? Is it worse than this other fp8 model?
@sensdiff I serialized in the timestamps provided by Tencent for the vision tree, but COMFY prunes those. I still don't know why it appears to be fine for some and not for others
@Felldude What?
@sensdiff It's the ignored-block message you see when using COMFY native. Same thing with the VAE: ComfyUI native trims a considerable number of blocks from all models, including the VAE. The Kijai nodes do not
@Felldude It works when using it with the kijai nodes, just make sure to select "fp8_scaled" as the quantization option.
Same problem. What is the fix?
Hi. Would it work on a 12gb nvidia geforce rtx 3060? And which of these models could work? Thanks for the help.
The fp8 model should work; people have gotten away with just 8gb of vram. There are workflows suited for exactly 12gb here: https://civitai.com/models/1048302/hunyuanvideo-12gb-vram-workflow
There could be better ones out there tho
I run it on that GPU.
@scooter_de That's great. What workflow did you get it working with? And is there anything to keep in mind for configuration?
@darkd I used this workflow https://civitai.com/models/1079810?modelVersionId=1212334
@scooter_de Hello, thanks for your help, but I have a problem generating the video: the video is just noise. Did that happen to you, and were you able to solve it?
@darkd I'm trying to publish my workflow. I reduced what I had found here to the bare minimum. That way a user could start with it and extend from there. I found many examples here too complicated if one only wants to try the basic functionality.
I just posted my workflow here: https://civitai.green/user/scooter_de/models?section=published
@scooter_de Thanks, I'll check it
Guys, the model load gets stuck at 35% with "model_type FLOW" showing in cmd. Anyone know how to fix it?
I'm new here and I have questions about using it in ComfyUI
I downloaded the full version, but it doesn't seem to run on my 4080 super with 2x8gb ram. Any tips?
Full> hunyuan_video_vae_bf16 (400mb) + llava_llama3_fp8_scaled (8.8gb) + hunyuan_video_t2v_720p_bf16 (25gb), is that right? https://comfyanonymous.github.io/ComfyUI_examples/hunyuan_video
Now I downloaded the Full Model fp8 version (12.8gb) but my program still can't run.
The Llava Llama TE version available here (Full Model fp16) weighs 16gb, while llava_llama3_fp8_scaled weighs only 8GB. Which one should I use?
The VAE version available here is also twice the size (Full Model fp32, 940.27 MB)...
I believe my problem is RAM. I'm trying to use the workflow available for 12GB VRAM but I'm not able to configure it correctly
Can you help me choose the correct model, text encoder, and VAE?
Will I be able to run with only 16gb RAM + 16gb VRAM?
You might have a system RAM issue. If you have a solid-state or NVMe drive, try setting the virtual memory to 50GB
it works but took me 5673.56 seconds haha a little long buuut it works
This speed has no value anymore
Can you show me a screenshot of the workflow? I keep getting static rainbows.
You're doing something wrong.
Has anyone tried it in Forge? Does it work at all?
Was wondering that too. Forge is my UI of choice.
@Dzban Same, I just can't use any other. Sucks that the devs kind of abandoned the project.
If you're familiar with web1111/forge, I highly recommend trying out comfy. You'll recognize a lot of things, but ofc the UI is different. It'll click in no more than a day. I was too impatient for it to come to forge, so I switched, and I also noticed quality improvements in my images
@bhopping Sounds interesting. What kind of quality improvements do you get with comfy? Isn't it just another UI for Stable Diffusion? I'm pretty sure my PC wouldn't handle comfy, that's why I'm using Forge.
@chrisss1 I noticed the faces at lower resolutions looked a lot clearer, so I didn't have to waste time upscaling. It could've just been bc I was using karras instead of uniform for the scheduler, but I'm pretty sure it's bc the optimization comes at a cost of slight quality? Other than that it's pretty much the same tho, I still like using forge every now and then
@bhopping Yeah, it's probably due to samplers. I am getting pretty great results in Forge, and Forge also has many options to improve quality at a cost of performance.
Using auto1111. Thinking of installing comfy, but I don't want to break the auto1111 setup in case I want to use it after installing comfy. Do you think it is possible to run both?
@Norrb ComfyUI is a whole separate installation and won't affect your web1111. You can technically have them both open, but they're two different programs. Also, you can configure ComfyUI to share models, loras, etc. with your web1111 so you don't clog up your space. Give comfy a shot, the learning pays off
@chrisss1 Yeah, once Hunyuan's open-source plan is all done, I'd assume forge would finally get Hunyuan support bc of how popular it is
@bhopping Hopefully!
@bhopping Thanks for your answer! I'll give it a try then.

