V2: Updated dataset and added Wan 2.2
With V2 of the LoRA I significantly increased the dataset from approximately 100 images to around 300. I also added a significant number of images with a vagina visible to help improve accuracy there, and I made sure to either crop or edit out any watermarks present on the dataset images.
Unfortunately, the LoRA still struggles to generate good-looking vaginas in both Qwen and Wan, but it has improved from V1, and the general realism of the LoRA seems to have gotten better for Qwen with the larger dataset.
For Wan 2.2 the LoRA is incredibly realistic and is great at creating the style and body type. However, it can generate disfigured images with bad anatomy at times. It also has a tendency to add watermarks in the corners of images, and I am not sure why this is happening. I will work on trying to remove that for V3.
For V3 I will go through and caption all of the images in the dataset to hopefully be able to fix the watermark issue as well as the sometimes poor anatomy on Wan 2.2. I will also be removing the trigger word for V3.
Trigger word: nud3
V1: Released for Qwen Image
This is my first attempt at training a LoRA. I trained it on a bunch of random nude photos that I thought could make a good LoRA. For some reason it does struggle with generating a vagina, but the breasts are typically pretty good. It also leans toward Caucasian women with larger breasts.
I used it at weights between 0.5 and 1.0 and it seemed to work most of the time. You'll probably have to experiment a bit as I'm not quite sure what a lot of the different settings do. (I'm very new to all of this.)
Trigger word is "nud3"
Open to feedback :)
Edit: If you want better looking girls without the added nudity, you can still use the LoRA at a strength of 1 but just don't use the trigger word and it will tone itself down a lot while still keeping some of the aesthetic.
Comments (15)
Great job. How does one train a Qwen lora on Civitai? Is there a guide? The list of models available for training doesn't seem to include Qwen.
Thank you! I'm thinking of trying again with a more refined/better dataset soon so we'll see what happens!
As for the process of training, I trained this on my own computer using AI Toolkit (it took about 12 hours to train). I'm not familiar with how LoRA training works on CivitAI (and I don't think I have enough Buzz for that haha).
Wow! Looks great! I know CivitAI hasn't rolled out Qwen LoRA training yet. Did you train this on another site or on your own computer? What is your hardware configuration (processor, RAM, GPU)? I am curious.
I'm currently training a LoRA on Qwen-Image-Edit (which is more demanding than the non-edit model because it's trained with 2 images) and it's only consuming 11.3 GB of VRAM. With a 12 GB NVIDIA GPU, 3000 series or newer, you can train on this model using Musubi Tuner.
- First install it and read everything. Musubi has an instruction page dedicated to each model it supports.
- Run the VAE cache and Text_Encoder cache scripts first because this way you won't need to load those during the actual training.
- On the actual training command be sure to use the following options:
--fp8_base --fp8_scaled --fp8_vl --sdpa --mixed_precision bf16 --gradient_checkpointing --blocks_to_swap 45
(You should change the value of 'blocks_to_swap' depending on how much VRAM you have - check the Musubi Qwen's page)
PS: for 'blocks_to_swap' to work I had to run the 'accelerate config' command and follow these instructions. Then I also had to do the following: NVIDIA Control Panel > Manage 3D Settings > CUDA - System Fallback Policy > Prefer No System Fallback. You may want to revert the changed value in the NVIDIA Control Panel back to default after training.
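The options listed above can be collected in a small helper so that only 'blocks_to_swap' changes from one machine to another. This is just a minimal sketch: the function name build_training_flags is made up for illustration, the flags are only the ones quoted in this comment, and the right 'blocks_to_swap' value for a given card should come from Musubi's Qwen page rather than from this snippet.

```python
def build_training_flags(blocks_to_swap=45):
    """Return the memory-saving options quoted above as an argument list.

    Hypothetical helper: only the flag names come from the comment above.
    """
    if blocks_to_swap < 0:
        raise ValueError("blocks_to_swap must not be negative")
    return [
        "--fp8_base",
        "--fp8_scaled",
        "--fp8_vl",
        "--sdpa",
        "--mixed_precision", "bf16",
        "--gradient_checkpointing",
        "--blocks_to_swap", str(blocks_to_swap),
    ]


# Print the flags as one line, ready to append to the training command.
print(" ".join(build_training_flags(45)))
```

The printed line still has to be appended to whatever training command the Musubi instruction page for the model gives.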
Thank you! I trained it on my own personal computer using AI Toolkit. I have an AMD Ryzen 7 3800X, 48 GB of RAM, and an RTX 3090 (24 GB VRAM).
Using AI Toolkit, I basically just followed the tutorial on the Ostris AI YouTube channel. The settings were 3000 steps, the weighted timestep type, and 3-bit quantization with ARA. I unloaded the text encoders and used a single trigger word.
Overall the whole process took about 12 hours to train, and it used about 18-20 GB of VRAM.
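For anyone comparing speeds, the numbers above (about 12 hours for 3000 steps) work out to a rough per-step time. This is only back-of-the-envelope arithmetic on those two approximate figures, and the helper name seconds_per_step is made up for illustration.

```python
def seconds_per_step(total_hours, steps):
    """Average wall-clock seconds per training step."""
    return total_hours * 3600 / steps


# About 12 hours for 3000 steps, as reported above.
print(round(seconds_per_step(12, 3000), 1))  # prints 14.4
```

Since the 12 hours is an estimate, treat the result as a ballpark rather than a benchmark.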
@penchopenchov AI-Toolkit recently added support for text_encoder cache as well, so you can set it to create a cache like Musubi does, which is neat and saves some VRAM. The problem is that it's still very poorly optimized for low-VRAM cards, since for some reason the creator just doesn't seem to care about adding block swap support, which has been requested for several months. With AI-Toolkit you are training Qwen at very low quants even with a 24 GB card, and though the 3-bit ARA helps, it's likely not better than just training at FP8. The overall advantage over block swapping in this case is speed (up to 2x faster), but does that matter when it hurts the output quality? I guess for experimental purposes training on AI-Toolkit would be better in this case.
@Jilhbril very interesting, I don't know much about all of this so I'll have to look into it more and do some more research, thanks!
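@penchopenchov AI Toolkit doesn't have block swapping, so doesn't training a Qwen LoRA on 24 GB of VRAM run out of memory? I also have a 24 GB 3090. I tried several times and it ran out of memory every time, so I gave up and switched to a different trainer.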
@pter surprisingly enough it worked really well, though I doubt I could fit the full version without the 3bit ARA haha
Does this also work in Qwen Image or Qwen Image Edit?
I've only tested it with Qwen Image. I honestly don't know how LoRAs work with Qwen Image Edit. You're welcome to try and let me know if you'd like though!
All loras are trained for specific models only.
@aidude464 I see, I think it might work on a version of Qwen Image Edit if the underlying framework is the same, but I'm not sure as I'm not familiar enough with this :) thanks for your input!
I'm trying it right now on the newest Qwen Image.
If you only put nud3 it doesn't make it nude, but at least it doesn't fuck the image up like many other LoRAs do :)
yeah, as of right now, nud3 is more of a trigger word to activate the "style" of the LoRA. I am currently working on a V2 of this LoRA for QWEN with an improved/increased dataset. I should be able to upload those files soon. I'm glad to hear you're enjoying the LoRA!




