Goddess Project
--Formerly Uncensored Females--
Standalone Checkpoint - Goddess works in FORGE only
DO NOT LOAD a separate VAE, TE, or CLIP unless using GGUF
This version is a mixed-precision Flux Dev model, with limited UNET changes to allow for feminine anatomy.
Run this model in automatic FP16 LoRA mode, NOT NF4.
The UNET is full-precision BF16, with mixed precision (NF4) on the TE blocks.
This model fits on a 24GB card and, as such, can be run in GPU-only mode.
High-speed BF16 with slightly lower prompt accuracy compared to the 33GB full model.
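For anyone outside Forge who wants to reproduce the same precision split (BF16 transformer with an NF4 text encoder), here is a hedged diffusers sketch; the repo ID and subfolder names below are the standard Flux Dev ones, used as stand-ins for this checkpoint's files:

```python
import torch
from transformers import T5EncoderModel, BitsAndBytesConfig
from diffusers import FluxPipeline

# NF4 quantization for the T5 text encoder (the "TE blocks" above).
nf4 = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Assumption: the standard Flux Dev repository layout; substitute this
# checkpoint's own files where applicable.
t5 = T5EncoderModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="text_encoder_2",
    quantization_config=nf4,
    torch_dtype=torch.bfloat16,
)

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    text_encoder_2=t5,
    torch_dtype=torch.bfloat16,  # full-precision BF16 transformer (UNET)
)
pipe.enable_model_cpu_offload()  # park idle components in system RAM
```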
Links (For GGUF ONLY)
Updated CLIP - Standard CLIP-L - FP8 CLIP-L -- ** Version Comparison **
Per the Apache 2.0 license, FLAN is attributed to Google.
This model was trained on many individuals with known ages and 2257 forms, and it has also been merged to try to ensure that no known individuals can be reproduced. However, FLUX seems to learn faces even from less than 10% of the data rather than merging them into a new face.
Comments (9)
May I ask, what software do you use to make Flux models? I don't want to be limited to just making LoRAs!
How is it even possible to use a 15GB model on a 3050 8GB? Some weird lies in the description.
A few months ago those would have been "lies", as you say - you need to look up CPU offloading and block management.
I can run Flux.1-Dev FP32 (22GB) on my RTX 3070 8GB without any problems. It's just slow (3-4 minutes to generate one image).
Most of what you are told about AI comes from enthusiasts with NO real understanding of coding, maths, or computer architecture. With an LLM, your bottleneck is memory bandwidth, and you want the language model in VRAM. But with image generation from a BIG diffusion model, you won't even get ONE iteration per second, so the cost of constantly moving the model from system RAM to VRAM in chunks, per iteration, is not a bottleneck. Your SYSTEM RAM runs at around 50GB/s; a 16-lane PCI Express 5.0 link can match that, and even PCIe 4.0 x16 at 32GB/s is still fast enough to transfer the entire diffusion model every second.
Do NOT think you already understand how all this works - educate yourself!
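To make the block-streaming idea concrete, here is a minimal PyTorch sketch of per-block CPU offloading; the layer sizes and count are invented stand-ins for a real diffusion transformer:

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-in for a big diffusion model: a stack of large blocks kept in
# system RAM. Real offloaders use pinned memory and async copies.
blocks = nn.ModuleList(nn.Linear(4096, 4096) for _ in range(8)).to("cpu")
x = torch.randn(1, 4096, device=device)

with torch.no_grad():
    for block in blocks:
        block.to(device)   # stream this block's weights over PCIe
        x = block(x)       # run it while it is resident in VRAM
        block.to("cpu")    # evict it to make room for the next block

# At PCIe 4.0 x16 (~32GB/s), streaming a 22GB model takes well under a
# second per pass, which is small next to a multi-second sampling step.
print(x.shape)
```

Frameworks do the same thing more efficiently; diffusers, for example, ships enable_model_cpu_offload and enable_sequential_cpu_offload, which is how an 8GB card can run a 15GB+ checkpoint.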
@blobby99 Most of the code is written by AI and is very unoptimized, but with billions of lines of code it's no more practical to optimize it all than it is to hand-caption that many images.
The CLIP and now the T5 don't need to stay loaded in memory; it usually only takes seconds to load them in and out. Having the entire diffusion model in VRAM might save you seconds or minutes, depending on how many times you change the prompt.
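As a toy illustration of that point, here is a hedged sketch of caching prompt embeddings so the text encoder only has to be resident while the prompt actually changes; the tiny embedding layer is a stand-in for the real CLIP-L/T5:

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-in text encoder; in a real pipeline this is CLIP-L or T5.
encoder = nn.Embedding(49408, 768).to("cpu")
cache = {}

def embed(prompt_ids: tuple) -> torch.Tensor:
    """Encode once per unique prompt; keep the encoder off the GPU otherwise."""
    if prompt_ids not in cache:
        encoder.to(device)   # seconds to page in over PCIe
        with torch.no_grad():
            ids = torch.tensor(prompt_ids, device=device)
            cache[prompt_ids] = encoder(ids).cpu()
        encoder.to("cpu")    # free the VRAM for the diffusion blocks
    return cache[prompt_ids]

emb = embed((101, 202, 303))   # hypothetical token ids
print(emb.shape)               # reused for every step of the same prompt
```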
Just use the GGUF and you'll be surprised.
But really, good question.