Physiogen is my attempt at fine-tuning SDXL for NSFW use.
Remember:
Preview images are intentionally unedited so that they show the model's actual output.
Ensure you are using an SDXL LoRA. SD 1.5 LoRAs will not work.
Heavily weight your tokens if you aren't seeing desired effects. For example: (breasts:1.8), (large breasts:2.0). You will especially need this if you are trying to apply a style.
I'm finding that SDXL takes prompts quite a bit more literally. For example, if you are trying to generate a half body headshot, specifying pubic hair or ass/hips may cause the wrong style of photo to be generated.
You may use this in model merges; I just ask for credit.
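The token-weighting tip above can be scripted if you build prompts programmatically. The following is only a minimal sketch: `weight` and `build_prompt` are hypothetical helper names, not part of any UI or library, and it assumes your front end accepts the `(token:weight)` syntax shown in the example above.

```python
from typing import Dict, Optional


def weight(token: str, strength: float) -> str:
    """Wrap a prompt token in (token:strength) emphasis syntax."""
    if strength <= 0:
        raise ValueError("strength must be positive")
    # One decimal place matches the style used on this page, e.g. (token:1.8).
    return f"({token}:{strength:.1f})"


def build_prompt(tokens: Dict[str, Optional[float]]) -> str:
    """Join tokens with commas; a strength of None leaves the token unweighted."""
    parts = [t if s is None else weight(t, s) for t, s in tokens.items()]
    return ", ".join(parts)


# Hypothetical example tokens, chosen only for illustration.
print(build_prompt({"photo of a woman": None, "freckles": 1.8, "smile": 2.0}))
```

Keeping the weights in one dictionary makes it easy to raise or lower a single token between generations without retyping the whole prompt.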
Please read the release notes over there. 👉👉👉
Or up there 👆👆👆 if you're on a mobile device.
If you would like to chat, please find me on the ✨ Civitai Discord ✨
Description
Please consider this model an alpha or test model. Let me know in the comments what you think. I'm still working on refining front/back nudes, so there is no hardcore support yet.
Notes:
Improvement on 0.1 - more careful selection of training images
Works best with front nudes - back nudes are still getting there.
No hardcore support
New captioning prefix (see below)
I've refined the captioning a bit more. I selected my favorite pictures from my original ~400 image dataset and trained this version on 181 images (if I recall correctly). I also resized a number of photos, since they were far too large.
After picking my favorite photos, I prefixed each caption with the following format. A ? means the preceding part is optional. Captions with full body never use the headshot modifier. The headshot token may still affect generation, so if you prompt for full body, make sure headshot is not present in the prompt.
[half body|three quarter body|full body]? headshot? photo of a [man|woman] ...
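For anyone scripting their own captions or prompts, here is a rough sketch of that prefix format in Python. The function name `caption_prefix` and its parameters are made up for illustration; this is not the script used to caption the training set, just one way to encode the rules described above (optional framing, optional headshot, and no headshot with full body).

```python
from typing import Optional

# Allowed values taken from the prefix format on this page.
FRAMINGS = ("half body", "three quarter body", "full body")
SUBJECTS = ("man", "woman")


def caption_prefix(subject: str,
                   framing: Optional[str] = None,
                   headshot: bool = False) -> str:
    """Build '[framing]? headshot? photo of a [man|woman]'."""
    if subject not in SUBJECTS:
        raise ValueError(f"subject must be one of {SUBJECTS}")
    if framing is not None and framing not in FRAMINGS:
        raise ValueError(f"framing must be one of {FRAMINGS}")
    # Full body captions never carry the headshot modifier.
    if framing == "full body" and headshot:
        raise ValueError("'full body' cannot be combined with 'headshot'")

    parts = []
    if framing is not None:
        parts.append(framing)
    if headshot:
        parts.append("headshot")
    parts.append(f"photo of a {subject}")
    return " ".join(parts)


print(caption_prefix("woman", framing="half body", headshot=True))
print(caption_prefix("man", framing="full body"))
```

The rest of the caption (the comma-separated tags) would simply be appended after the prefix.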
Some of the captions used on training images:
(analog photo by Rineke Dijkstra:1.1), half body headshot photo of a young woman, latina, tan skin, long hair, breasts, looking at viewer, smile, black hair, nipples, upper body, large breasts, teeth, grin, lips, head tilt
(analog photo by Rineke Dijkstra:1.1), photo of a middle aged woman, milf, breasts, looking at viewer, short hair, brown hair, thighhighs, navel, brown eyes, medium breasts, underwear, nipples, standing, panties, nude, pussy, lips, black panties, pubic hair, makeup, undressing, panty pull, mature female
three quarter body headshot photo of a young woman, long hair, breasts, looking at viewer, thin waist, smile, black hair, thighhighs, brown eyes, jewelry, medium breasts, underwear, nipples, standing, collarbone, panties, earrings, white panties, white thighhighs, lips, topless, underwear only, garter belt, door
headshot photo of a middle aged woman, looking at viewer, smile, blonde hair, brown eyes, jewelry, teeth, necklace, grin, blurry, lips, depth of field, blurry background
three quarter body headshot photo of a young woman, looking at viewer, smile, brown hair, black hair, navel, brown eyes, swimsuit, bikini, small breasts, teeth, grin, lips, black bikini, side-tie bikini bottom, blue background, curly hair, big hair, afro
half body headshot photo of a woman, breasts, looking at viewer, smile, blue eyes, blonde hair, simple background, white background, medium breasts, nipples, upper body, nude, lips, freckles
half body headshot photo of a woman, breasts, looking at viewer, smile, short hair, blue eyes, blonde hair, simple background, white background, jewelry, nipples, upper body, nude, earrings, small breasts, teeth, mole, grin, freckles, body freckles
Comments (11)
Only women apparently?
For now, yes. I'm still learning the ropes with fine tuning. Expect nude men in an upcoming version.
@MachineMinded Roger that, just checking lmao
@gaydiffusion I'm working on rebuilding my dataset and recaptioning everything again. So, keep your eyes out in the next few days. For my own curiosity, is there any "type" of man you want to include? How would you prefer to prompt for nude males? Thanks for the question!
@MachineMinded To my dismay, most prefer uh gay slang like 'twink', 'bear', 'daddy' etc. I recommend avoiding those terms and building up a vocabulary to describe their physique and characteristics instead, be literal. As well, don't favor superhero physiques over others, variety is best. Some flexibility on the penis is also favored i.e. uncircumcised vs circumcised, flaccid vs erect. And the tight end, so to speak. And for my curiosity, are you just doing booru tagging? XL seems very powerful with natural language captions, so I'd love to hear about the experiments that got you to this point. You can also find me on the Unstable Discord, just give a holler in #men-only. I'm currently rebuilding and sorting my own data sets as well, and exploring LLM captioning.
@gaydiffusion So last week I started off with a small set of women NSFW images, and just tagged them with whatever came out of WD14/booru. I left them completely unedited but I also trained the text encoder. There have been recommendations from kohya to only train UNET, but I have only done that on LoRAs. I'm torn between going down the natural language path and booru. I feel like in my head, booru makes more sense because it's more or less a list of things I want to see in the image. But, I really would prefer to do what the community wants - and if that means using natural language/sentences instead of tags, then I can switch to that. Anywho, between using booru and training the text encoder, things have been working pretty well. I am going to rework all of my captions though, to follow a format like
[half body|three quarter body|full body]? headshot photo of a [naked]? [man|woman], [hair style] [hair color] hair, [facial expression], [upper body description, could be breast size or chest description in general], [small|medium|large] nipples, [shaved|trimmed|hairy] [pussy|penis], [arms/hands action description], [clothing description], [background description]
I need to run a test with training only UNET and further test out the natural language capabilities. I'm disappointed with BLIP/BLIP-2 for NSFW images. EDIT: I should have mentioned that I did notice how well SDXL responds to natural prompting, instead of tagging... so I should probably explore that a little more in-depth.
@MachineMinded Interesting, yeah at the launch QA they recommended not touching the text encoder as it's right on the cusp of frying as is, lots of training done on it already. And I understand your struggle re tagging, I'm learning to err more on the side of what's best for the model tho and coax people into changing their prompting style instead. There are options other than BLIP now (which can be fine-tuned btw), check out: https://replicate.com/nelsonjchen/minigpt-4_vicuna-13b
can we use lora models with it?
You should be able to. Other users have reported issues using LoRAs. Which LoRA are you trying to use? Is it available on civitai?
@MachineMinded any person Lora I tried 2x but didn't work so how should i search?
@NewbieK Make sure you are using an SDXL LoRA


















