Hunyuan Video (Safetensors) - New Uncensored Llama - Comfy-VAE-FP32

NSFW

Hunyuan Video

Kijai-marked files are only for use with Kijai nodes. You do not need them for Comfy Native.

Full Guide to picking the correct file above
Workflow for 8GB Card users
Uncensored llama will work with COMFY Native

Using the Kijai-marked models on COMFY native will cause rainbow or black output.

I do not recommend the FP8 VAE unless you are trying to fit all models into the GPU; see the guide for 4090 full-GPU launch commands.

Technical details regarding "Uncensored"

The model used for Hunyuan was based on the llava-llama-3 8-billion-parameter LLM. The Intel vision-tuned model was used to refine the tokenized model, restoring over 5 million values.

Comments (36)

BROKLESNAR Feb 1, 2025

Hello, is it possible to run this on the free version of Google Colab?

Phraxas Feb 1, 2025

Is there a page that explains which models/workflows are good for various GPUs? I have an RTX 4080 16gb VRAM and everywhere either talks about 8, 12, or 24 gb.. so I'm having a hard time figuring out where my card fits in. Do I use FP16 or FP8? Obviously I can give it a go but if I have one wrong variable set or a Lora is particularly expensive and I don't know about it, I won't really know if it's wrong because the card is not fit for the task or if I'm the problem.

harmonicdiffusion Feb 1, 2025· 19 reactions

I suggest you give it a go. No one is going to do your personal testing for you

Phraxas Feb 1, 2025· 9 reactions

@harmonicdiffusion I think you misunderstand. I am not asking anyone to do personal testing for me. I am asking the one posting it to explain what it is designed for. If there are other users who have a similar build, they can say what their results were from their own experiences inferencing for themselves.

Engineers post heights of bridges and tunnels before trucks go through. Every ride indicates what height you must be to ride. It's not unfathomable that a model creator could give a quick note of what works with what settings.

funscripter627 Feb 2, 2025· 4 reactions

@Phraxas It's really hard to say because it also depends on whether you have sageattention or bitsandbytes installed, for example, and there is stuff like offloading to the CPU. I don't know how it works, but when I load the CLIP model without bitsandbytes quantization all my regular RAM gets taken up, and when I run it with quantization it only takes around 10GB of my 32GB RAM.

I'd say go for the FP16 and watch your resources carefully. Things get slow if all your dedicated VRAM is used and it starts using your shared VRAM. The trick is to stay at around 90-95% VRAM usage. FP8 takes up less VRAM than FP16.
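
As a rough illustration of the 90-95% rule of thumb above, the following sketch computes the used fraction of dedicated VRAM from a free/total reading. The function names and the sample reading are hypothetical; with PyTorch installed, `torch.cuda.mem_get_info()` returns a `(free, total)` pair in bytes that could be passed in instead.

```python
GIB = 1024 ** 3


def vram_usage_fraction(free_bytes: int, total_bytes: int) -> float:
    """Return the fraction of dedicated VRAM currently in use (0.0 to 1.0)."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return (total_bytes - free_bytes) / total_bytes


def headroom_status(free_bytes: int, total_bytes: int, target: float = 0.95) -> str:
    """Classify usage against an upper target (0.95 comes from the comment above)."""
    used = vram_usage_fraction(free_bytes, total_bytes)
    if used <= target:
        return "ok"
    return "over target, likely spilling into shared memory"


# Hypothetical reading: a 16 GiB card with 1.5 GiB still free.
free, total = int(1.5 * GIB), 16 * GIB
print(f"{vram_usage_fraction(free, total):.1%} used: {headroom_status(free, total)}")
```

The threshold is only the commenter's rule of thumb, not a documented limit; the point is to leave a little headroom so the driver does not fall back to shared memory.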

Felldude

Author

Feb 2, 2025· 5 reactions

Video size is a huge factor. I can render a 360x220 video at a decent speed (a few minutes) on an 8GB card, but if I try native resolutions it would take hours or days.
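
To put the resolution point in numbers, here is a minimal sketch comparing pixels per frame at 360x220 with 1280x720. Treating 1280x720 as the "native" resolution is an assumption based on the 720p figures mentioned elsewhere in the thread, and the helper names are hypothetical.

```python
def pixels_per_frame(width: int, height: int) -> int:
    """Number of pixels in a single frame."""
    return width * height


def relative_pixel_load(width: int, height: int, ref_width: int, ref_height: int) -> float:
    """How many times more pixels per frame (width x height) has than the reference."""
    return pixels_per_frame(width, height) / pixels_per_frame(ref_width, ref_height)


small = (360, 220)
assumed_native = (1280, 720)  # assumption: 720p taken as the native resolution

ratio = relative_pixel_load(*assumed_native, *small)
print(f"{assumed_native[0]}x{assumed_native[1]} has {ratio:.1f}x the pixels of "
      f"{small[0]}x{small[1]} per frame")
```

Pixel count is only a lower bound on the slowdown: attention cost in transformer-based video models generally grows faster than linearly with the number of tokens, and a larger job is more likely to spill out of VRAM.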

LG89 Feb 2, 2025· 6 reactions

I'm using a 3070Ti 8GB laptop GPU and 64GB of RAM, running it with native nodes in ComfyUI. I start ComfyUI with the --lowvram and --disable-smart-memory arguments to help with offloading to RAM.

I can load the full models (bf16 diffusion model, fp16 llava-llama-3, fp32 VAE); my RAM usage is about 75% during sampling. I just had to tweak the tile_size and temporal_size in the VAE Decode (Tiled) node to avoid OOM.

Since I've installed sage attention 2.0.1 in my python environment (I'm using a custom node to apply it only in the sampler, but you can use --use-sage-attention arg to enable it globally), it takes about 10-12 minutes for 20 steps (~34s/it) at 640x480 121 frames (~5s video). Without sage attention, it's about 20 minutes.

You can use the workflow from the ComfyUI example page: https://comfyanonymous.github.io/ComfyUI_examples/hunyuan_video/
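
The timing figures quoted above can be sanity-checked with a little arithmetic. This sketch covers sampling time only (model loading and VAE decode are excluded), and the 24 fps playback rate is an assumption inferred from "121 frames (~5s video)"; the function names are hypothetical.

```python
def sampling_minutes(steps: int, seconds_per_iteration: float) -> float:
    """Sampling time in minutes for a given step count and per-step cost."""
    return steps * seconds_per_iteration / 60


def clip_seconds(frames: int, fps: float) -> float:
    """Playback length of a clip in seconds."""
    return frames / fps


# 20 steps at ~34 s/it, as reported in the comment.
print(f"sampling: {sampling_minutes(20, 34):.1f} min")
# 121 frames at an assumed 24 fps.
print(f"clip length: {clip_seconds(121, 24):.2f} s")
```

Both results agree with the quoted 10-12 minutes and roughly 5 seconds of video.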

Phraxas Feb 2, 2025· 1 reaction

@LG89 Ok, so it sounds like FP16 models should be entirely feasible for 16gb VRAM.. but no matter what it seems like it'll be leaking over into the RAM which means slowish times. 10-12 minutes isn't so bad. I'm sure I'll look back on that statement a few years from now and laugh but it works for now.

LG89 Feb 2, 2025

@Phraxas 16GB of VRAM is still pretty low, especially with models like Hunyuan: the bf16 model is 25GB, and if you add 15.7GB for llava-llama-3 fp16 and ~1GB for the fp32 VAE, we are talking about ~40GB of memory, so it's normal for ComfyUI to offload to RAM (I guess even if you had a 4090).
At least with your VRAM you could reach 10+ second videos at 640x480 :)
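
The memory budget described above can be written out as a small sketch. The sizes are the ones quoted in the comment; the helper names and the list of VRAM sizes are hypothetical. It treats the three files as if they were all held at once and ignores activations, so it is a rough estimate rather than a measurement.

```python
# Approximate file sizes in GB, as quoted in the comment above.
MODEL_SIZES_GB = {
    "diffusion model (bf16)": 25.0,
    "llava-llama-3 (fp16)": 15.7,
    "VAE (fp32)": 1.0,  # quoted as "~1GB"
}


def total_size_gb(sizes: dict) -> float:
    """Sum of all model sizes in GB."""
    return sum(sizes.values())


def offload_needed_gb(sizes: dict, vram_gb: float) -> float:
    """GB that cannot fit in VRAM if every model were resident at once."""
    return max(0.0, total_size_gb(sizes) - vram_gb)


total = total_size_gb(MODEL_SIZES_GB)
print(f"total weights: {total:.1f} GB")
for vram_gb in (8, 16, 24):
    spill = offload_needed_gb(MODEL_SIZES_GB, vram_gb)
    print(f"{vram_gb} GB card: about {spill:.1f} GB would have to live in system RAM")
```

Even a 24GB card comes up short under this estimate, which matches the remark that offloading is expected on a 4090 as well.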

Felldude

Author

Feb 3, 2025· 1 reaction

@Phraxas Make sure your system virtual memory is set high. Mine is at 120GB on the NVMe drive, or I wouldn't be able to load LLMs even with 32GB of system RAM; some models take 100GB+ when loaded.

Phraxas Feb 3, 2025

@Felldude Ah, yeah, good point. This is precisely the kind of thing I could have easily not set and not known why it wasn't working.

funscripter627 Feb 3, 2025

@Felldude That's why bitsandbytes is so good imo. It reduces RAM usage of the CLIP model by a huge amount for me.

bitzupa Feb 4, 2025

So, any luck? I also have a 4080 and I can only get it to work with very low-res videos like 340x240; it takes about 30-40 minutes to generate a 1s clip in 720p.

Can't figure out how I'm supposed to get it to work. Even the "fast" models take ages; in fact, the fast models take 2-3h for a 1s clip.

Is something wrong with my setup, or do I just not understand how to configure it?

By the way, Illustrious, Flux etc. work normally, so the hardware is not faulty.

Felldude

Author

Feb 4, 2025

@bitzupa Do you have the latest version of CUDA installed? Many people update Comfy and not CUDA.

Phraxas Feb 4, 2025

@bitzupa I'm busy with another task that is taking time, but once I'm done with that I will be rewarding myself with play time with Hunyuan.

bitzupa Feb 4, 2025· 2 reactions

@Felldude Yeah, I did a clean reinstall of everything when I was switching from a1111 to reforge a few days ago.

I read about it a little, and many people suggested using the basic default Hunyuan workflow, and magically it works; even base Hunyuan works with no issues at decent res.

When I try to use custom workflows and other models, it all works fine until I go above 640x480. What's funnier, setting 480x640 pretty much doubles generation speed, god knows why :/ but if I try 720p it takes forever or just hangs at some point. I guess it's bleeding too much into RAM, but then again I use a 12GB workflow on a 16GB GPU and still can't get it to fit.

Testing GGUF now to see if I can fit it all in VRAM and confirm whether it's simply due to running out of VRAM.

EDIT:

Yeah, so it was overflowing to RAM even on 12GB workflows. I don't understand how it's supposed to work on 12GB while it just barely fits into 16GB, and even that seems to be random: sometimes I can get it to fit along with a LoRA, sometimes not at all while using the same model/LoRA. I don't understand it. Anyway, as long as it loads the model completely into VRAM it all works great.

landlord123 Feb 14, 2025

@bitzupa Any update? What's the best model for 16GB? Thanks!!!

bitzupa Feb 17, 2025· 1 reaction

@landlord123 Well, I use hunyuan_video_FastVideo_720_fp8_e4m3fn and the basic workflow for fast, and as long as it fits in VRAM it works great. GGUF models are not worth it; they do take less space in VRAM but the quality is terrible.

bartoszaugul Feb 1, 2025

What kind of update is it, only changing pt into safetensors? Also, Kijai's fp8 version works with native nodes.

Felldude

Author

Feb 2, 2025

Kijai's nodes will load the full vision model and blocks that Comfy native will not

LetTheBassDrop Feb 1, 2025· 4 reactions

Never in a million years would I expect a video model to be easier to use than an image model. I've never been able to successfully get FLUX working; even with premade workflows there is always some error. But right out of the gate this works, and it follows prompts so well. How they got it to work like that is amazing.

firemanbrakeneck Feb 2, 2025

Same. Wasn't it due to the distillation?

Felldude

Author

Feb 2, 2025

@firemanbrakeneck It has a striking similarity to flux

firemanbrakeneck Feb 2, 2025

@Felldude Beg pardon, I don't follow - how so? You mean structure wise or in output?

Catz Feb 2, 2025· 1 reaction

@firemanbrakeneck The way prompting works in plain language, compared to chunks of small descriptions between commas.

Felldude

Author

Feb 3, 2025· 1 reaction

@firemanbrakeneck The block structure is similar

PetePablo Feb 4, 2025

How? Can you link the guide you used, please? Xx

firemanbrakeneck Feb 4, 2025

@Felldude Hmm, I think I see it, down to some of the layer counts.

Do you think that if the sequence length were turned down to 1, hunyuan might be cross compatible with flux loras (at least not throw errors if not in terms of content)? Or even if not?

At the very least, it does seem to function okay as an image generator with that setting.

Felldude

Author

Feb 4, 2025

@firemanbrakeneck It's possible, though the block names might not map one-to-one.

firemanbrakeneck Feb 4, 2025· 1 reaction

@Felldude At least renaming is dead simple, comfy does it on the fly, I wrote a small script to convert a few loras and they seemed to work correctly; diffusers' qkv separation for instance is more of a hassle to handle, still not really sure how to do that.

If that's all it takes, and it actually works, even for just some of the existing loras, it'd be really something.
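
The kind of renaming script mentioned above might look roughly like the sketch below. It is not the commenter's script: the prefixes, key names, and function name are hypothetical placeholders, and the real mapping between Flux and Hunyuan LoRA keys is not given in this thread. It works on a plain dictionary; in practice the tensors could be read and written with `safetensors.torch.load_file` and `save_file`. The fused-versus-separate qkv handling the comment describes as harder is not covered.

```python
def rename_lora_keys(state_dict: dict, prefix_map: dict) -> dict:
    """Return a copy of state_dict with matching key prefixes replaced.

    Keys that match no prefix are copied through unchanged.
    """
    renamed = {}
    for key, value in state_dict.items():
        new_key = key
        for old_prefix, new_prefix in prefix_map.items():
            if key.startswith(old_prefix):
                new_key = new_prefix + key[len(old_prefix):]
                break
        if new_key in renamed:
            raise KeyError(f"rename collision on {new_key!r}")
        renamed[new_key] = value
    return renamed


# Hypothetical prefixes, for illustration only.
HYPOTHETICAL_PREFIX_MAP = {
    "source_model.double_blocks.": "target_model.double_blocks.",
    "source_model.single_blocks.": "target_model.single_blocks.",
}

# Strings stand in for tensors so the sketch runs without extra dependencies.
example = {
    "source_model.double_blocks.0.attn.lora_A.weight": "tensor-a",
    "source_model.single_blocks.3.mlp.lora_B.weight": "tensor-b",
    "unrelated.key": "tensor-c",
}

converted = rename_lora_keys(example, HYPOTHETICAL_PREFIX_MAP)
for key in converted:
    print(key)
```

A rename like this only makes the keys line up; whether the weights behave sensibly in the other model still has to be tested.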

Felldude

Author

Feb 4, 2025· 1 reaction

@firemanbrakeneck I think it would work for image, not so sure about video - One of the things I was going to look at was pruning all the video blocks to try to make an image-only lite model out of Hunyuan

firemanbrakeneck Feb 5, 2025

@Felldude That's clever. Would it work out of the box by setting the model type to flux, or are there other incompatible parts besides the video blocks? Or is that the sort of thing requiring more careful examination?

Felldude

Author

Feb 5, 2025· 1 reaction

@firemanbrakeneck I have a feeling it would mess up the model to remove all but the single and double blocks/MLP and timestep but you never know

finalfight Feb 7, 2025

I disagree wholeheartedly with this. Getting Hunyuan to work has been terrible, and the documentation from people is genuinely all over the fucking place. For flux, once I figured out the whole Clip_L and T5 shit, everything else fell in line.

Kiefstorm Feb 11, 2025

@finalfight agreed

firemanbrakeneck Feb 16, 2025· 1 reaction

@Felldude I guess somebody gave it a shot.

https://civitai.com/models/1258103/hunyuan-flux-sim-schnell-unchained

Checkpoint

Hunyuan Video

by Felldude

hunyuan

tencent

base model