Hunyuan Video
Kijai-marked files are only for use with Kijai nodes. You do not need them for Comfy Native.
Full Guide to picking the correct file above
Workflow for 8GB Card users
Uncensored llama will work with COMFY Native
Using the Kijai-marked models on Comfy Native will cause rainbow or black output.
I do not recommend the FP8 VAE unless you are trying to fit all models into GPU memory; see the guide for 4090 full-GPU launch commands.
Technical details regarding "Uncensored"
The model used for Hunyuan was based on the llava-llama-3 8-billion-parameter LLM. The Intel vision-tuned model was used to refine the tokenized model, restoring over 5 million values.
Comments (36)
guys does anyone know which model goes with what model in order to get uncensored results on image2video?
all hunyuan models are nsfw... start with an nsfw image...
@tedbiv aren't these all txt2video though? Not img2video
@nogo i2v models can be downloaded from here https://huggingface.co/city96/HunyuanVideo-I2V-gguf/tree/7ab7da2ba52e699e66592d7ba1bf44eb91e30c67 but all hunyuan models are uncensored.
@nogo here's a video with one of my loras. the nudity is not from my lora, the body rotation is. this is base fp8 hunyuan plus my lora. https://civitai.com/images/67958761
@nogo other version/types are here https://huggingface.co/Kijai/HunyuanVideo_comfy/tree/main
@tedbiv cheers, I've been wondering where to find the image2video model
I like this but I'll have to wait for the feeling in my hand to return before I can try it out on my machine.
Many Thanks.
If I knew what I was doing I'd be a danger to society, but luckily I'm only a menace.
Just remember: Don't be a menace to South Central while drinking your juice in the hood.
how does this work? when i try something in stable diffusion it gives me just a grey blank image. and i dont have a text to video option in this program. sorry if it sounds dumb but i dont know very much about coding and all that
You need comfy UI
How is Comfy-Diffusion-FP8 different from hunyuan_video_720_cfgdistill_fp8_e4m3fn.safetensors?
both are diffusion models I believe.
Not necessarily using this model, but is it normal for an RTX 3060 12GB to take ABSURDLY LONG to render a video, even in t2v? Like 10+ minutes, and 40+ on i2v.
If you're having to block swap due to running out of VRAM, it will go from 5-15 seconds per iteration to minutes or more. Try a smaller video size.
@Felldude Thanks for the answer, but can you explain a little more? I'm still kinda confused
@davifpinto Reduce the latent size to 272x480
@Felldude I suppose I do an upscale later, right? I was using 512x512 or 768x768. Do you know any good workflows?
@davifpinto Square is not recommended. You can likely find a few articles that discuss this and include workflows.
I have 64GB RAM with the same GPU and it filled to 100% when rendering videos. But with less RAM, videos are not even created.
That’s not enough vram… I have enough trouble with 24gb.
At those timings you’re running out of VRAM and using regular RAM which is just not reasonable.
@az420 12GB is enough you can even do it at 8GB, the biggest factor is video length and size.
@az420 Not enough vram? Jesus
@davifpinto I stand by my opinion, 24gb only goes so far... I'm not even considering a 5090 because 32gb doesn't seem like it'll materially change anything either. We need a 48GB or more consumer card to really hit the next level.
But yeah you can generate short low res stuff with lower VRAM.
Gotta make dinky stuff then supe it up or it's OOM central. @az420 I'm hoping AMD can pull a miracle out of their ass in the form of a 64GB card and outdo CUDA in one fell swoop, fingers crossed
@admiral_underpants indeed! they need to do SOMETHING anyway... they've been playing catch up for way too long. It can't be hard for them to just put more VRAM into it... i mean, even 'slower' might be acceptable if we can actually fit things in VRAM.
I'm using a 3060 too, and ComfyUI Portable. It was taking 90 minutes from start to finish to do an i2v, with most of the time spent loading models. I moved Comfy to a dedicated M.2 SSD (had a spare slot on my board), and went from 16GB to 32GB RAM (still with 12GB VRAM). It now takes 15 mins max to do a 4 second clip. Not sure if this helps, but it might.
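To illustrate why the replies above keep pointing at video length and size, here is a rough sketch of how the amount of work grows with resolution and frame count. It assumes the commonly cited HunyuanVideo compression ratios (8x spatial and 4x temporal in the VAE, then 2x2 spatial patches in the transformer); those ratios, the 49-frame clip length, and the helper names are assumptions for illustration, not values taken from this page or from ComfyUI.

```python
# Rough sketch: how many latent tokens the diffusion model has to process
# for a given video size. All ratios below are assumptions (see lead-in).

def latent_shape(width, height, frames, spatial=8, temporal=4):
    """Return (latent_frames, latent_height, latent_width) after VAE compression."""
    latent_frames = (frames - 1) // temporal + 1
    return latent_frames, height // spatial, width // spatial


def token_count(width, height, frames, patch=2):
    """Number of tokens after splitting each latent frame into patch x patch tiles."""
    t, h, w = latent_shape(width, height, frames)
    return t * (h // patch) * (w // patch)


FRAMES = 49  # hypothetical clip length, roughly 2 seconds at 24 fps

baseline = token_count(272, 480, FRAMES)
for width, height in [(272, 480), (512, 512), (768, 768)]:
    tokens = token_count(width, height, FRAMES)
    print(f"{width}x{height}: {tokens} tokens, {tokens / baseline:.1f}x the 272x480 load")
```

Under these assumptions 768x768 carries several times the token load of 272x480, and self-attention cost grows faster than linearly in the token count, which fits the advice to render small and upscale afterwards.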
There are so many options here, is there any way to find recommended files vs gfx VRAM / system RAM? With the sheer number of massive files and bewildering options, I have a headache just trying to figure out what's best. I have a 12GB 3060 and 32GB RAM. Can anyone give me a hint on which file to use?!
Diffusion Comfy FP8 is what I would use. For the VAE and CLIP you could use BF16. For the TE (text encoder), the linked scaled version is recommended.
thanks for the info!
Unsure what to use haha I have a i9 10900k cpu . 64gb system ram and a RTX 4080 gpu . Sorry for the noob question haha
@djacid Even the 24GB 4090 does not leave enough room with the BF16 UNET model to avoid block swapping via the CPU. I personally would use the FP8 diffusion (UNET) model with the FP32 VAE and an FP32 CLIP and TE. (Just make sure the high VRAM flag is not set, or it will try to use the GPU for the CLIP.)
@Felldude thank you so much for the super fast response. I will try this out later on today
@djacid Yep, I wrote a guide and linked it
@Felldude thank you so much, yea i was unsure where to get everything from haha, I'm still new to video gen so was a little unsure if i got the right files and if they were in the right place,
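For anyone wondering why the FP8 diffusion model keeps getting recommended over BF16 in the replies above, a back-of-the-envelope estimate of the weight size helps. This is only a sketch: the roughly 13 billion parameter count is the commonly cited figure for HunyuanVideo and is an assumption here, and the estimate covers the weights alone, not activations, the VAE, or the text encoders.

```python
# Rough sketch: memory needed just to hold the diffusion model weights.
# PARAMS is an assumed figure (see lead-in), not a value from this page.

BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp16": 2, "fp8": 1}


def weight_gib(num_params, dtype):
    """Size of the weights in GiB for a given storage precision."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


PARAMS = 13_000_000_000  # assumed parameter count of the diffusion model

for dtype in ("bf16", "fp8"):
    print(f"{dtype}: about {weight_gib(PARAMS, dtype):.1f} GiB of weights")
```

On these numbers the BF16 weights alone already sit at about the capacity of a 24GB card, while FP8 halves that and leaves headroom for the actual video latents.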