CivArchive
    SapianF - Nude Men & Women for Flux (Now De-Distilled!) - v1.0-FP8
    NSFW

    The First Flux Model Allowing Nude Men & Women to Co-Exist!

    Trained locally on a 3090 using the SD3 branch of Kohya's sd-scripts!

    Whilst definitely still a proof of concept compared to something like Pony, it (often) does what it was designed to do quite well!


    V2.5 Update

    This version is a merge of training runs done on Flux De-Distill and Flux Dev2Pro, both of which seek to remove distillation from Flux Dev. The models were merged with a ratio of 0.7:0.3 Dev2Pro:De-Distill. The dataset is unchanged from version 2, which is why this is v2.5 rather than v3.
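The 0.7:0.3 merge described above can be pictured as a per-weight linear blend. The following is a minimal sketch under stated assumptions, not the author's actual merge workflow: plain Python lists stand in for tensors, and the names merge_weights, dev2pro, and de_distill are hypothetical.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def merge_weights(model_a: Weights, model_b: Weights, ratio_a: float = 0.7) -> Weights:
    """Blend two checkpoints key by key: ratio_a * A + (1 - ratio_a) * B."""
    if not 0.0 <= ratio_a <= 1.0:
        raise ValueError("ratio_a must be between 0 and 1")
    if model_a.keys() != model_b.keys():
        raise ValueError("both checkpoints must contain the same weight names")
    ratio_b = 1.0 - ratio_a
    merged: Weights = {}
    for name, values_a in model_a.items():
        values_b = model_b[name]
        if len(values_a) != len(values_b):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [ratio_a * a + ratio_b * b for a, b in zip(values_a, values_b)]
    return merged


# Toy stand-ins for the two training runs (hypothetical values).
dev2pro = {"block.weight": [1.0, 2.0]}
de_distill = {"block.weight": [0.0, 4.0]}
merged = merge_weights(dev2pro, de_distill, ratio_a=0.7)
```

With real checkpoints the same arithmetic would be applied to each tensor rather than to lists of floats.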

    The result is FAR greater image quality and generally better prompt adherence at the cost of increased generation times. Example images were generated using a modified version of the DynamicThresholdingFull Node to allow more precise Threshold Percentile values. This can be done by adding an extra 0 to step on line 11 of dynthres_comfyui.py.
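To see why a finer step matters, here is a small sketch of how a numeric input that snaps to its step size limits the values you can enter. It assumes the original step is 0.01 and becomes 0.001 after the edit (the description only says "an extra 0"), and that the UI rounds values to the step; the function name snap_to_step and the value 0.998 are hypothetical.

```python
from decimal import Decimal


def snap_to_step(value: str, step: str) -> Decimal:
    """Round value to the precision of step.

    Only valid for power-of-ten steps such as "0.01" or "0.001",
    because quantize() rounds to the exponent of its argument.
    """
    return Decimal(value).quantize(Decimal(step))


coarse = snap_to_step("0.998", "0.01")   # assumed original step
fine = snap_to_step("0.998", "0.001")    # assumed step after adding a 0
print(coarse, fine)
```

With the coarser step the requested percentile is rounded away, while the finer step keeps it intact.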

    The initial Q8 GGUF release is nearly indistinguishable in quality from FP16 and should run on most hardware. It requires the ComfyUI GGUF custom node. The FP16 version will be uploaded towards the end of the week; the FP8 version may or may not be uploaded depending on demand.

    Notes:

    • Whilst the model can somewhat work with the default Flux guidance of 3.5 and a CFG of 1, it is highly advised to remove the Flux Guidance node entirely and set CFG to something above 1

    • Takes anywhere from 2-3 times as long to generate an image compared to previous versions, but the drastic increase in quality makes this worth it IMO

    • It is recommended to use a step count between 40 and 60

    • Due to using the same dataset as v2 it still does have some issues carried over:

      • Female genitalia seems undertrained

      • No NSFW poses

      • Pubic and body hair often appear even if you specify to avoid including them

      • Still has trouble distinguishing between circumcised and uncircumcised

      • Erect/Flaccid is sometimes not interpreted properly with more complex prompts or when generating an image with multiple male characters

    It is my hope these issues will be remedied with version 3.


    V2 Update

    Introducing Better Prompt Adherence and Anatomy!

    This version was trained on the original SapianF model with an expanded dataset (3x as large) with more aggressive masking and prompting, along with a lower learning rate (22e-6 vs 25e-6). The result is a greater understanding of concepts like erect vs flaccid, dense pubes vs shaved pubes, and an overall improvement to genital anatomy, especially when it comes to male characters!

    The dataset for males now contains 175 images, and the female dataset now consists of 75 images, both with a larger variety of poses, angles, and concepts.

    Images were masked more aggressively with lower non-masked values, forcing the model to focus on the genitalia specifically, and captioning for these new images is far more focused on the subject.

    The learning rate was also decreased so that the model could be trained for a longer period of time and better learn the concepts described.

    The model was trained for 6 epochs, with epoch 4 producing the most consistent results. The blocks of this model were merged with the original model by hand with the goal of keeping elements consistent whilst transferring over the concepts which were trained.

    Notes:

    • Whilst a big improvement overall, it's still not perfect. Depending on the prompt and seed, elements can still produce suboptimal results, though this is much rarer

      • It is likely that the only way to fully fix this issue is with even lower learning rates, longer training times, a further expanded dataset, and a higher batch size during training, most of which aren't really going to be possible on my hardware for the time being

    • It is recommended to run the model with the improved CLIP-L model and LongCLIP model released by zer0int1 a couple days ago. The ComfyUI node for LongCLIP can be found here


    About

    There are plenty of Flux checkpoints out there now that allow for both nude men and women to be generated, with one caveat...

    These models are trained only on members of a single sex, meaning that if a model is trained on nude men, any attempt to generate nude women will result in male genitalia being added unprompted. Similarly, attempting to generate nude men on models trained on nude women will result in female genitalia being added to nude men unprompted.

    So I set out with what I thought would be a simple task: train either a LoRA or Checkpoint to generate both nude men and nude women.

    LoRA training was quickly ruled out due to consistently suboptimal outputs, but after much testing, full checkpoint training has clearly yielded better results!

    Training/Dataset

    The dataset consisted of 45 images of nude men, 30 images of nude women, 15 images of nude men and women together in an image (tasteful), and 50 regularization images generated with my regularization workflow. Images were primarily front and side facing, and consisted mostly of standing and sitting poses from a variety of angles. The dataset was resized to 1024, 768 and 512 for multi-resolution training. For masked training, I manually drew a white mask over the genital areas and set the rest of the mask to 30%.
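The masking scheme above (white over the focus area, 30% everywhere else) can be sketched as a grayscale grid. This is only an illustration under stated assumptions: the author drew masks by hand, whereas this sketch uses a rectangle, treats 30% as 30% of full 8-bit brightness, and the name build_mask is hypothetical.

```python
from typing import List, Tuple

WHITE = 255                        # full weight for the focus area
BACKGROUND = 255 * 30 // 100       # 30% of full brightness, as an 8-bit value


def build_mask(width: int, height: int, box: Tuple[int, int, int, int]) -> List[List[int]]:
    """Return a height x width grid: WHITE inside box, BACKGROUND elsewhere.

    box is (left, top, right, bottom) with right and bottom exclusive.
    """
    left, top, right, bottom = box
    return [
        [WHITE if left <= x < right and top <= y < bottom else BACKGROUND
         for x in range(width)]
        for y in range(height)
    ]


# Tiny 4x3 example with a 2x1 focus region.
mask = build_mask(4, 3, (1, 1, 3, 2))
for row in mask:
    print(row)
```

A grid like this could be written out as a grayscale image with any imaging library; the lower background value is what keeps the rest of the picture contributing a little to training instead of being ignored entirely.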

    Of the non-regularization images, 60% were captioned using Joy Caption with modifications as needed, and 40% were captioned by hand using natural language descriptions.

    In my testing, female anatomy seems to train substantially faster than male anatomy, so the image repeats for the men and the men & women datasets were double those of the women and regularization datasets.

    The learning rate was 25e-6, and training ran for approximately 7,500 steps. Took 10 hours to train on my 3090! Training was completed on both the regular Dev model and a Dev model that had a female-focused NSFW LoRA applied prior to training. Both models were merged together in ComfyUI afterwards.
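As a rough illustration of what doubling the repeats does to the balance of the dataset, the sketch below counts how many samples each subset contributes per epoch. The image counts come from the description above; the base repeat value of 1 is hypothetical (the actual repeat numbers are not stated), and the count ignores the multi-resolution copies and however the trainer schedules regularization images.

```python
IMAGE_COUNTS = {
    "men": 45,
    "women": 30,
    "men_and_women": 15,
    "regularization": 50,
}

# Subsets that, per the description, used double the repeats.
DOUBLED = {"men", "men_and_women"}


def samples_per_epoch(base_repeats: int) -> dict:
    """Return images * repeats for each subset, doubling repeats for DOUBLED."""
    if base_repeats < 1:
        raise ValueError("base_repeats must be at least 1")
    return {
        name: count * base_repeats * (2 if name in DOUBLED else 1)
        for name, count in IMAGE_COUNTS.items()
    }


per_subset = samples_per_epoch(base_repeats=1)  # hypothetical base value
total = sum(per_subset.values())
for name, samples in per_subset.items():
    print(f"{name}: {samples} samples ({samples / total:.0%} of an epoch)")
```

Doubling the repeats is a cheap way to rebalance subsets without collecting more images.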

    Considerations

    • As said earlier, this is still a proof of concept. NSFW is very difficult to train with Flux, and with the limited dataset I have, it can often require some seed searching

    • The model as of now has only a rudimentary understanding of sub-concepts like pubic hair, erect/flaccid, and circumcised/uncircumcised, and thus results are often of questionable quality. You can often still force the model to work with these sub-concepts by playing with guidance settings and searching through different seeds, but more focused training will likely be required

    • If you are only looking to generate images of a particular sex, models trained on that specific sex will usually produce better quality images

    Description

    FP8 version of the initial release!

    Have thus far had trouble with my attempts at uploading the FP16 version of this model to Civit. Will try to upload it as soon as possible so that it can be used for your own LoRA and Finetune creations!

    Comments (11)

    LatentDream
    Aug 27, 2024· 2 reactions
    CivitAI

    thanks for sharing.. i think you should provide more examples to convince users download 11gb. i know.. initial steps of vanilla models moving to nsfw is hard and a lot of users prefer to wait till something really worth .. thats why i'm saying this.

    Also, you mentioned a LoRA, is the LoRA available?

    TheGreatOne321
    Author
    Aug 27, 2024· 3 reactions

    Depends on what you're planning on using it for. If you're planning on generating images that contain both nude men and nude women, I'm fairly certain that my model is the only one that can provide quality images in that department, finetune or otherwise. It also functions as a great base model for further developments of future finetunes and LoRAs. As I said in the description though, if you're only planning on generating nude men or women exclusively, then you likely will get better results out of other finetunes and LoRAs uploaded on here.

    Never uploaded any of the LoRAs and don't plan to as I could never get an acceptable quality out of it. The 4th example image on here showcases the difference in quality between the best LoRA I could train on this dataset and the finetune, using the same seed, prompt and other generation parameters.

    TrueToLife_Fauxto
    Aug 27, 2024· 4 reactions
    CivitAI

    Thanks for the explanation of your approach to fine tuning. I’m rather surprised that the dataset was so small, though, with decent results. I imagine with a larger, more diverse dataset the results will be much better still.

    TheGreatOne321
    Author
    Aug 27, 2024· 2 reactions

    Yes, this would almost certainly be the case. That being said, quality tends to be more important than quantity, and dataset curating and captioning is the aspect of model creation that I dislike the most lol.

    Definitely would be open to assistance on that front for future iterations, and have no problem with people building upon the model on their own if they prefer

    ammor
    Aug 27, 2024· 4 reactions
    CivitAI

    Thanks for sharing it ! I have two questions about this finetune :

    1 - For this version, did you train the text encoder ?

    2 - when creating a finetune with kohya I get a 23gb file. How do you "extract" the FP8 version ? Is there a script to do it ?

    Thanks again.

    TheGreatOne321
    Author
    Aug 27, 2024

    I haven't trained either text encoder, though I am planning on training the CLIP-L as soon as Kohya allows me to do so.

    Load up the model in ComfyUI using the load diffusers node, switch it from "default" to "e4m3fn", then run that model into a save diffusers node.
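For a sense of what the FP8 conversion does to file size, here is a back-of-the-envelope sketch. It assumes the 23 GB file mentioned in the question stores weights at 16 bits each and that every weight is converted to 8 bits; it ignores file metadata and any layers kept at higher precision. The function name estimate_size_gb is hypothetical.

```python
def estimate_size_gb(source_size_gb: float, source_bits: int, target_bits: int) -> float:
    """Scale a checkpoint size by the ratio of bits per weight."""
    if source_bits <= 0 or target_bits <= 0:
        raise ValueError("bit widths must be positive")
    return source_size_gb * target_bits / source_bits


fp8_size = estimate_size_gb(23, source_bits=16, target_bits=8)
print(f"Estimated FP8 size: {fp8_size:.1f} GB")
```

The estimate lands close to the roughly 11 GB download size mentioned earlier in the comments.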

    12734
    Aug 27, 2024· 6 reactions
    CivitAI

    this is incredible as info and more of a complete guide than any other post I've seen. Can you please post your actual kohya config file? or a link to a pastebin.

    TheGreatOne321
    Author
    Aug 27, 2024· 4 reactions

    Sure thing! Here's the PowerShell command I use to start training:

    https://pastebin.com/bSLJKzBM

    And this is how I have the dataset file configured:

    https://pastebin.com/WtGKUwxF

    I'll also get on uploading the script I used to quickly draw the masked images. It's basic, but it works quite well.

    12734
    Aug 27, 2024

    @TheGreatOne321 thanks a lot. very handy for reference. I've found that there's always a tiny value thats easy to miss and I'm not about to risk 10h of training on a mistake.

    I've found onetrainer has a good masking tool, might end up using that to mask and kohya to train.

    TheGreatOne321
    Author
    Aug 27, 2024· 3 reactions

    @I_dont_want_karma_ one thing to be aware of with OneTrainer, at least as of the last time I used it, is that its mask tool doesn't support grayscale, meaning it'll only produce pure black and pure white masks, which isn't always ideal. It'll also rename the mask files with the suffix "-masklabel", which doesn't work natively with Kohya since Kohya requires the mask and training image to have the same name.

    This is the custom script I used to quickly make my masks:

    https://pastebin.com/4C6SGvZH

    Controls are visible in the Window it opens. Just be sure to change the parameters at the bottom of the script for your specific use case, and install the dependencies with the following command:

    pip install pygame numpy pillow
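If you do end up with masks carrying the "-masklabel" suffix mentioned above, a small script can copy them to matching names. This is a minimal sketch, not part of the linked script: the folder layout (masks stored next to the images, output into a separate mask folder) and the function name copy_masks_without_suffix are assumptions, and it does not convert file formats if the mask extension differs from the image extension.

```python
import shutil
from pathlib import Path
from typing import List

SUFFIX = "-masklabel"


def copy_masks_without_suffix(image_dir, mask_dir) -> List[Path]:
    """Copy every '<name>-masklabel.<ext>' in image_dir to mask_dir as '<name>.<ext>'."""
    image_dir = Path(image_dir)
    mask_dir = Path(mask_dir)
    mask_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for source in sorted(image_dir.iterdir()):
        if source.is_file() and source.stem.endswith(SUFFIX):
            # Drop the suffix from the stem but keep the original extension.
            target = mask_dir / (source.stem[: -len(SUFFIX)] + source.suffix)
            shutil.copy2(source, target)
            copied.append(target)
    return copied
```

Calling copy_masks_without_suffix("dataset/images", "dataset/masks") (both paths hypothetical) leaves the originals untouched and returns the list of files it wrote. The masks will still be pure black and white, so the grayscale limitation described above remains.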

    WhatTheGuy
    Aug 27, 2024· 8 reactions
    CivitAI

    wow, I think it's the first time I can generate a man and a woman side by side without mixing up body parts all the time! Big step forward =D !

    Checkpoint
    Flux.1 D

    Details

    Downloads: 1,322
    Platform: CivitAI
    Platform Status: Available
    Created: 8/27/2024
    Updated: 5/13/2026
    Deleted: -

    Files

    sapianfNudeMenWomenForFlux_v10FP8.safetensors