The Perfect Trio: Best Quantization Options
1. Best Small Model: Q2_K
- Ultra-fast inference speed
- Tiny size: 8x smaller than the original
- Perfect for low-resource devices
- Ideal when speed matters more than perfect accuracy
2. Best All-Rounder: Q4_K_M
- Perfect balance of size vs. quality
- Strong reasoning capabilities
- Community favorite for daily use
- Go-to choice for most applications!
3. Premium Quality: Q8
- Nearly identical to the original model
- Preserves complex reasoning abilities
- Superior creative generation
- Best when quality is non-negotiable
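To get a feel for the size tradeoff between these quant types, here is a rough back-of-the-envelope estimate. The bits-per-weight figures are approximations based on typical llama.cpp-style quantization overhead, and the ~12B parameter count is an assumption for a Flux-class model — treat the results as ballpark numbers, not exact file sizes.

```python
# Rough on-disk size estimate for GGUF quantization types.
# Bits-per-weight values are approximate (assumed, not official specs).

PARAMS = 12e9  # assumed parameter count for a Flux-class model (~12B)

BITS_PER_WEIGHT = {
    "F16":    16.0,  # unquantized half precision, for reference
    "Q8_0":    8.5,  # near-lossless
    "Q4_K_M":  4.8,  # balanced
    "Q2_K":    2.6,  # smallest
}

def size_gb(quant: str, params: float = PARAMS) -> float:
    """Approximate file size in GB for a given quant type."""
    return params * BITS_PER_WEIGHT[quant] / 8 / 1e9

for quant in BITS_PER_WEIGHT:
    print(f"{quant:7s} ~{size_gb(quant):5.1f} GB")
```

This is why Q2_K fits on low-resource devices while Q8 stays close to the original model's quality: the quant type directly scales both the download size and the memory footprint.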
Complete Installation Guide

Setup Structure

ComfyUI/
└── models/
    ├── diffusion_models/
    │   └── (basic) SPEED_Q8.gguf
    ├── text_encoders/
    │   ├── (basic) clip_l.safetensors
    │   ├── (option1) t5xxl_fp16.safetensors
    │   ├── (option2) t5xxl_fp8_e4m3fn.safetensors
    │   └── (option3) t5xxl_fp8_e4m3fn_scaled.safetensors
    └── vae/
        └── ae.safetensors

Essential Components
This merged model offers a balanced solution for AI-driven image generation, emphasizing both speed and quality. Whether you're processing single images or large batches, it delivers high-quality visuals efficiently.
Text Encoders - The Brain Behind Natural Language Understanding
Note: you only need to choose ONE of the T5XXL options, based on your hardware capabilities.
T5XXL options (choose only one):
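As a rule of thumb, the fp16 encoder gives the best quality but needs the most VRAM, while the fp8 variants trade a little fidelity for a much smaller footprint. A hypothetical picker might look like this — the GB thresholds are rough assumptions, not official requirements, so tune them for your own GPU:

```python
# Hypothetical helper: pick one T5XXL variant based on available VRAM.
# Thresholds are assumed rules of thumb, not official requirements.
def pick_t5xxl(vram_gb: float) -> str:
    if vram_gb >= 16:
        return "t5xxl_fp16.safetensors"           # full precision, best quality
    if vram_gb >= 10:
        return "t5xxl_fp8_e4m3fn.safetensors"     # fp8, lighter
    return "t5xxl_fp8_e4m3fn_scaled.safetensors"  # smallest option
```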
VAE - The Visual Artist

Pro Workflows
- Flux Advance: advanced optimization
- Simple Workflow: quick setup - just download & install missing nodes
- T5 GGUF: for T5-based models
Special Thanks
Big thanks to city96 for pioneering the GGUF journey!

Developer Information
This workflow guide was created by Abdallah Al-Swaiti.
For additional tools and updates, check out the OllamaGemini Node: GitHub Repository
Description
Sometimes useful for phones.
FAQ
Comments (28)
Maybe I still think the quality of Flux1-dev is better. Can you provide a Q4_K_M of Flux1-dev?
Sure, why not. I'll be happy if you make 10 images for that model... I'm downloading the big one now so I don't lose any!
Yes!
@AbdallahAlswa80 Wow, thank you so much
I'm confused. I have the seed randomized, but each image has almost exactly the same composition, almost as if I'm using img2img, even though I'm on txt2img. What am I doing wrong?
Do you mean Q2_K? More compression means less seed variety. Try setting seeds manually, and use a big difference when you try another seed.
...Is this the same as city96's, or is it a merge with dev? How many steps, then?
8-4 steps, upscaler 2-4. I made a simple workflow for that >>> see the description.
Is this the same as city96's, or is it a merge with dev? Q2, Q4 - how many steps, then?
No, this is the merge, which gives results close to dev in 4-6 steps.
city96 hasn't made the K_M editions, but honestly, without his/her help I couldn't have done these quantizations.
@AbdallahAlswa80 Sure bro, thanks to him, and to you as well.
But I have that black-image problem and need to know the reason. I'll stick with GGUF for sure, whether Q4 or Q8.
@amazingbeauty What do you use?
@AbdallahAlswa80 I'll use the Q8 merge, as it's a bit more accurate/better and still runs like Schnell at 4-5 steps; my RAM will handle it. But only if it runs well on CPU... I'm still getting black image output using the GGUF UNet. I'd appreciate any help.
@AbdallahAlswa80 One more thing - I'll share my experience with this. With the initial models, even the 15 GB fp8 ones, loading was ultra slow, filled the whole RAM like crazy, and also filled the page/swap file. But the GGUF I tried loads very fast, like any SDXL model I was using, and uses far less RAM. That's why I said I'll pick the Q8 version, once it works on my CPU and stops giving just black images. (I still haven't tried your model, but I tried 3 other GGUFs and they all still output black images for me.)
Q4_K_M <-- best version of Q4 ;) and Q8 <-- nice
Thanks
But Schnell is worse than dev at understanding prompts, and in quality, unfortunately...
I'm confused by your naming conventions (and inherently by Flux model types, now that they're exploding) - are these ALL GGUF? You mention a balanced "q4_1", but that's not listed anymore..?
Caveat: I use the Draw Things GUI, so GGUF isn't compatible yet. I'd love a smaller (balanced) Dev+Schnell model, but I'm unsure of what's happening here. Cheers.
These models are all in GGUF format. Quantizing to every GGUF type is pointless, because we're looking for the best: for Q4 the best is K_M, the closest to fp16 is Q8, and the best small quant is Q2_K. I made all of these quantizations myself - they weren't out there before!
I'm not quite sure yet what I think about this series of models. It's an odd tradeoff for me - not nearly as fast as Schnell but only slightly better quality, imo. I found the Q2_K not useful, as it needed at least 12 steps to generate usable images.
The Q4_K_M was much better and generated pretty good images in about 8 steps. To be honest, though, the quality is just not as good as some other fine-tuned models that have come out recently. Also, I didn't find any discernible VRAM savings with Q2_K, but maybe that was just something weird with my machine.
To be honest, the base models are dev, Schnell, and merges of them, and all Civitai Flux models are one of those. It's very difficult to fine-tune these models in general, because they were already trained on very high-quality data, so succeeding at fine-tuning them is hard. In any case, I quantized the merged model into the best quantization types (see the descriptions). To reach the best result, use Q8 with T5-Q8. Q2_K is made for phones, and if you decide to use it, I think you should change the CFG to something other than one. For quality, the equation goes like this: the more complex the prompt, the bigger the model you need for the UNet and T5.
Use this workflow: https://civitai.com/models/658101?modelVersionId=751231 - I didn't see your setup, but I got perfect images with k2-M in just 4 steps. Please share your results at each step count.
Also, please mention the models that gave better results.
I'm not a model creator so I have no idea what goes into doing this successfully. I'm merely running tests on the models that you wonderful people create and then looking at the results. As I said, I found the Q2K not good for my uses but, like you said, it is intended for much less powerful devices. The Q4 I did have some success with after playing with the guidance and different samplers. I just found that it didn't give me any quality or speed benefits over other models like ArtFusion or FluxUnchained SchnFU.
Sorry, I've already deleted the model so I can't try that workflow but please don't let my mild criticisms deter you in any way. Also, I am mainly focused on photorealism so it might be that your models work better for others who are focusing on different art and photo styles.
@Grumblebutt Try Q8 and then make the comparison. Anyway, those models are merges - could you drag the model into ComfyUI and check? The origin was here: https://civitai.com/models/629858
@AbdallahAlswa80 I did try the Q8 this morning and it seems to work well, so that version is definitely the one to use out of all of them. As for the other models I mentioned, I'm not sure how to check what they're based on, so I'll leave that up to you if you want to check. Also, I tried your workflow but am missing some nodes that aren't getting installed when I check in the manager. Not sure why, and I don't currently have the time to debug it, but I might try again sometime.
I saw your workflow. Give this a try: https://civitai.com/models/658101/flux-advance?modelVersionId=751231
Do you do any Schnell only optimizations?
What's the question?



