Source (gguf): https://huggingface.co/city96/HiDream-I1-Full-gguf/tree/main from city96
Source (fp8): https://huggingface.co/calcuis/hidream-gguf/tree/main from calcuis
The VAE and text encoders can be downloaded from Comfy-Org here!
This model can be used with the https://github.com/city96/ComfyUI-GGUF node!
💪Train your own model: https://runpod.io?ref=gased9mt
🍺 Join my discord: https://discord.com/invite/pAz4Bt3rqb
Comments
Has anyone gotten an abliterated version of llama working with the quad clip loader? Does it even make a difference?
KeyError: 'conv_in.weight'
I'm getting this when trying to load the Q8 GGUF via the UNET loader. What am I doing wrong?
Memory Allocate Error, 4070ti
Same GPU: "It won't fit in, Daddy!"
@ZweiBelle Which one are you running? I have the RTX 4070 and it runs the Q8.
https://civitai.com/posts/15645418 Getting bad results with Q2_K. Not exactly sure what's causing this...
Which model is best for an Apple MacBook Pro M3 Max, 48GB RAM?
In theory you should be able to run the F16 model. Full vs Dev is only a speed vs quality difference; they both have the same VRAM requirements. I switch between both. If you want scenes with lots of detail, Full works best, but if you have a singular focus in mind, like a single object or 2D sprites, then I think Dev is better.
Yes, it has more detail, but the photorealistic images all look pretty blurry and soft like Flux-Schnell. A lot of fine-tuning will probably be necessary in the future.
I am seeing some issues and have some concerns about this upload. I am running the 4-bit from here:
https://github.com/hykilpikonna/HiDream-I1-nf4.git
and its quality is top tier for a 4-bit model. That one runs directly from a Python script, not from ComfyUI. It could be my settings, but I don't think that's the case. I will do some more testing and comparison, but I am starting to question this upload.
@Ada321 it really changes the game. I use a detail amount of 0.15 (start 0.10, end 0.80).
why is f16 only like 32mb?
GB
That would be nice.
will this work with a 3090?
Yes, I believe you should be able to get the q8 GGUF working on a 3090.
@rusty2930 If you have the VRAM, it works; I got the f16 version working on mine (24GB VRAM).
@MysticMindAi how did you manage to do that?
@RalFinger idk but it works. The thing is, I use the fp8_e5m2 option in the Load Diffusion Model node; otherwise it takes 30% longer to generate. Still works either way.
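For anyone wondering what that option actually trades away: fp8_e5m2 keeps 5 exponent bits and only 2 mantissa bits, so weight memory halves versus fp16 at the cost of precision. A minimal sketch of the cast, assuming PyTorch 2.1+ (which exposes torch.float8_e5m2); this illustrates the format, not ComfyUI's internals:

```python
import torch

# fp8_e5m2: 5 exponent bits (wide range), 2 mantissa bits (coarse precision).
w = torch.randn(4096, 4096, dtype=torch.float16)

w8 = w.to(torch.float8_e5m2)    # the low-precision copy the loader would keep
back = w8.to(torch.float16)     # upcast again to compute with it

print(f"fp16: {w.numel() * w.element_size() / 2**20:.0f} MiB")    # 32 MiB
print(f"fp8:  {w8.numel() * w8.element_size() / 2**20:.0f} MiB")  # 16 MiB
print(f"max round-trip error: {(w - back).abs().max().item():.3f}")
```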
All I'm waiting for now is a WaveSpeed and/or TeaCache update for support. For 50 steps, it's nearly 3 minutes per gen. At 30, it's about 1:30, and that would work for illustrations, drawings and the like, but not so much for realism.
for me it runs on an RTX 4060 with 8GB VRAM, 64GB RAM. It takes time, but it works without OOM
even the Full model-FP16.gguf works on my Asus RTX 3090 (using 23.6 GB VRAM) with fully loaded CLIP text encoders + the model itself
@simartem07 oh nice! How long for each generation, and do you see a good quality difference?
@cutetodeath78409597 i am not sure if it's the best setup: Python 3.13.2 with PyTorch 2.6.0+cu124, using Flash Attention, VRAM state set to LOW_VRAM, device NVIDIA GeForce RTX 3090. Using hidream_full_f16_gguf at 50 steps, each step takes between 4.50 and 5.60 s/it, mostly ~250 sec total for a single image (1440 x 1440 px resolution)
@simartem07 My setup is pretty much the same. The only thing I'm not utilizing is low VRAM, and it takes several seconds longer, ~260. What scheduler and sampler do you use? I swear when I make full shots of subjects I get this low-res look when you zoom in. That's using uni_pc/simple.
@simartem07 Hmmm, so when I bumped the resolution up to 1440x1440 my inference shot up to 8.65 s/it. :O I wonder if my settings are off now, besides the low VRAM. It took approx 7 minutes (~441 seconds). xD
@MysticMindAi you are right, I double-checked the inference values and my previous post had a typo: 4.50 to 5.60 s/it is for 1024x1024, and it takes up to 8-14 s/it for 1440x1440. Sorry for that :-)
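For what it's worth, those measurements are roughly consistent with per-step cost scaling with pixel count; a quick back-of-the-envelope check:

```python
# Per-step cost scales roughly with pixel count (attention can be superlinear).
ratio = (1440 / 1024) ** 2                               # ~1.98x more pixels
print(f"expected at 1440x1440: {4.5 * ratio:.1f} s/it")  # ~8.9, near the 8.65 reported
print(f"50 steps at 8.65 s/it: {50 * 8.65:.0f} s")       # ~433 s vs the ~441 s observed
```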
To be honest I haven't settled on a sampler/scheduler pair yet, because I'm still figuring out which works best for which generation style (realistic, illustration, etc.), but I mostly use dpmpp_2m/beta. It's been a pain to run XYZ-plot tests in my ComfyUI env, but if I figure out an easy way, it would be helpful to iterate same-seed generations across multiple sampler/scheduler values, because the same one doesn't always work best. I agree with you that some generations show blurry and noisy outputs.
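One low-effort way to run that kind of same-seed sweep without an XYZ-plot node is to drive ComfyUI's HTTP API from a script. A rough sketch, assuming a local server on the default port 8188 and a workflow exported via "Save (API Format)"; the hidream_api.json filename is a placeholder:

```python
import copy, itertools, json, urllib.request

# Queue the same seed across sampler/scheduler pairs via ComfyUI's HTTP API.
# Assumes a workflow saved with "Save (API Format)"; the filename is a placeholder.
with open("hidream_api.json") as f:
    base = json.load(f)

# Find the (single) KSampler node in the exported graph.
ksampler_id = next(i for i, n in base.items() if n["class_type"] == "KSampler")

samplers = ["dpmpp_2m", "uni_pc", "euler"]
schedulers = ["beta", "simple", "karras"]

for sampler, scheduler in itertools.product(samplers, schedulers):
    wf = copy.deepcopy(base)
    wf[ksampler_id]["inputs"].update(
        {"seed": 42, "sampler_name": sampler, "scheduler": scheduler}
    )
    req = urllib.request.Request(
        "http://127.0.0.1:8188/prompt",
        data=json.dumps({"prompt": wf}).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)  # one queued generation per combination
```

You could also patch the SaveImage node's filename_prefix per combination so the resulting grid is easy to sort afterwards.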
thanks y'all for the first-hand insight (3090 user, 5090 pre-order)
i'll fiddle around with HiDream once I get that MFBFGPU, hopefully.
@MysticMindAi full workflow?
@JustPara would you like my workflow?
@MysticMindAi yes please.
@MysticMindAi Sorry for answering late. But yes, still possible?
@JustPara no worries. Give me a bit.
@JustPara sent link in chat
@MysticMindAi Thanks^^
Transferring my Wan video knowledge about the file types, as this seems to be the same for HiDream:
FP16 > Q8 - Q6 > FP8 > Q5 - Q2. In that order, the higher the quality, the more VRAM you will need. Q versions are meant for lower-VRAM GPUs. Q6 is equivalent to FP8, but Q6 is a bit lower quality. So if you can't run FP8/Q6, use a lower Q version for your VRAM.
Basically, the lower you go, the lower the precision at which the weights are stored, but you save VRAM for higher-resolution images. The term is quantization.
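To make that concrete, here is a toy per-tensor int8 version of the idea; real GGUF Q-levels (Q2_K through Q8_0) use per-block scales and more elaborate schemes, but the precision-for-memory trade is the same:

```python
import torch

# Toy per-tensor int8 quantization: store 1 byte per weight plus one scale,
# reconstruct approximate weights at load time.
w = torch.randn(1024)

scale = w.abs().max() / 127
q = torch.clamp((w / scale).round(), -127, 127).to(torch.int8)  # stored form
w_hat = q.float() * scale                                       # dequantized

print(f"storage: {w.element_size()}B -> {q.element_size()}B per weight")
print(f"mean abs error: {(w - w_hat).abs().mean().item():.5f}")
```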
Thank you for sharing your experience!
Appreciate the knowledge share, but unless someone has done this quantization wrong, Q8 should provide quality similar to FP16 and superior to FP8. I don't know about video models, but that's how the Flux models performed, which is what HiDream is based on, so we should expect the same to hold true here.
@KingLord Thanks for the heads-up! I've just looked into it further and you're right: FP8 is at the same level as Q6. I'll update the above order.
Would I be able to run this on an old RTX 3060 with 12GB VRAM?
yes
@RalFinger 30 days have passed since I clicked queue, lmao... but really, how long for, say, 20 steps at 1080?
@mystifying I don't know
@RalFinger cool cool, had to check
It works on a 3060, but you can expect a full 40 minutes for 4-5 seconds of video. Video quality is excellent
Are you going to be adding the Hi-Dream FP8 Dev Models?
hey J1B, wasn't planning on uploading them on Civit, too much work
@RalFinger Can you switch the type of this to checkpoint? I found that images posted here aren’t searchable as hidream when the type is workflows. Thanks.
If anyone is looking for the 28 Step Dev model I have uploaded it here: https://civitai.com/models/1515789/hi-dream-dev
The fp16 version of Dev (32GB) is still uploading right now.
@floopers966 thank you for noticing, it must have been changed when I uploaded the workflow
@J1B I linked that model on the model page here
Recommended steps and sampler for the GGUF model?
read the repo
Works on RTX 5070 Ti.
Get HiDream Working:
(Note: When using GGUF models the GGUF node needs to be updated to support HiDream)
HiDream Advanced ComfyUI Workflow
If you are getting an image, yet it is blurry or completely bad, check your sampler and scheduler combination.
Sampler — Scheduler combination testing (Civitai)
Sampler — Scheduler combination testing (Reddit).
Iteration speeds are here.
good comparison, thank you!
@simartem07 Thanks. You're welcome.
HiDream can't do anthro, it seems. At least nothing past ears and tail. It's an either/or: either a human with ears and tail, or a full wolf, etc. Gonna need a LoRA.