This one vector embedding will help you generate pictures of anime girls in tight yoga pants, viewed from behind.
These images were generated with Anything 4.5. Anime models with danbooru-style tags are the recommended use case for this embedding, as it doesn't bring out the "puss" in other types of models. You should still get well-figured women in tight yoga pants, though.
Check out the images for the prompting information.
Description
This embedding was merged from two custom embeddings that were trained on five images each: one on five 512x512 images of real live women (they exist!) in the desired pose, and another on five 512x768 images of anime girls in the same pose, which were created with the help of the first embedding. They were then merged to make this far superior embedding.
The first embedding was trained on SD 1.4, and the second on Anything 3.0. Merging was done with the Embedding Inspector extension for Automatic1111 Webui.
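For the curious, merging two textual inversion embeddings essentially boils down to combining their vectors, for example with a weighted average. Here's a minimal sketch of that idea using NumPy and toy vectors; the 50/50 weight and the simple averaging are assumptions for illustration, and the actual Embedding Inspector extension offers several combination modes:

```python
import numpy as np

def merge_embeddings(vec_a, vec_b, weight=0.5):
    """Blend two embedding vectors with a weighted average.

    weight=0.5 gives an equal 50/50 mix (an assumption; Embedding
    Inspector exposes other merge strategies as well)."""
    return weight * np.asarray(vec_a) + (1.0 - weight) * np.asarray(vec_b)

# Toy 4-dim vectors standing in for real 768-dim SD embedding vectors
photo_style = np.array([1.0, 0.0, 2.0, -1.0])
anime_style = np.array([0.0, 2.0, 2.0, 1.0])

merged = merge_embeddings(photo_style, anime_style)
print(merged.tolist())  # [0.5, 1.0, 2.0, 0.0]
```

A multi-vector embedding is just a stack of such vectors, so the same averaging applies row by row when both embeddings have the same vector count.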
Comments (9)
Hi. I am very new to embeddings and textual inversion.
Do you use a Colab notebook or do you just train them inside the AUTOMATIC WEBUI train section?
Any link to a good tutorial that you recommend? I'm a bit lost; the few TI tests I did didn't give me good results.
Thanks!
I trained it inside the Automatic Webui train section on my home computer.
https://www.youtube.com/watch?v=dNOpWt-epdQ is, I think, a good video. The author covers a lot of areas. I'd recommend that you follow his tutorial to make your first successful embedding. I don't agree 100% with his methodology (e.g. turning off cross-attention during training), but I still think his tutorial is the best that I've seen so far.
Something to keep in mind is to keep everything as simple as possible, and to only change one thing at a time when you're trying to tweak your settings for better results. The original paper on Textual Inversion used a learning rate of 0.005 and 1-3 vectors on 3-5 pictures per subject/style. They found one vector for a single word with 3-5 pictures for 5000 steps was the most effective. No captions, no variable learning rate. https://arxiv.org/pdf/2208.01618.pdf is the paper.
Just keep it small.
Can't get it to obtain the pose at all, even describing it on top of how you use the embedding. What model are you using to preview? Anything?
Anything v4.5. Make sure to set clip skip to 2. Check out the "i" on the pictures to see the prompts that were used for each one. Seeds are included.
What settings do you train your embeddings on?
I haven't been able to train pose embeddings anywhere near as accurate as these.
5000 steps, 0.005 learning rate, subject.txt, 5 pictures, batch: 1, latent sampling: once. The first embedding was trained on SD 1.4 using pictures that clearly showed what I wanted the embedding to reproduce. Those pictures were real-life photos, not anime. This first embedding did not reproduce what it was trained on (it usually produced a deep-fried yoga butt), except, strangely, on anime models, where it had a strong influence in moving the pictures towards exactly what I wanted, without any deep-frying. After using that embedding to produce 5 anime versions that were very clearly what I wanted, I trained a second embedding on those images on Anything 3.0 at 5000 steps, 0.004 learning rate, subject.txt, 5 pictures, batch: 1, latent sampling: once. After testing and comparing the two embeddings, with the anime one being superior for anime models, I decided to merge the two using Embedding Inspector, ran more tests, and the merged embedding was better yet.
It's important to mention that I went in wanting to make an embedding for realistic pictures, and wasn't able to achieve that.
@seekerofthethicc What prompts did you use for the training images? I mean, how did you describe those images?
@VKTralala I didn't use captions for the training images. The phrases used for training were what was in subject.txt, which is the standard set of normalization phrases, stuff like "photo of a [name]", "a rendition of a [name]", and so on.
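In other words, during training the webui expands each template line from subject.txt by substituting the embedding's trigger token for the [name] placeholder. A rough sketch of that expansion, using only the two phrases quoted above (the real template file contains many more lines):

```python
# Two of the subject templates quoted above; the real subject.txt
# shipped with the Automatic1111 webui contains many more lines.
templates = [
    "photo of a [name]",
    "a rendition of a [name]",
]

def training_prompts(embedding_name, templates):
    """Substitute the embedding's trigger token into each template,
    the way the webui builds prompts during textual inversion training."""
    return [t.replace("[name]", embedding_name) for t in templates]

for prompt in training_prompts("yoga-pose", templates):
    print(prompt)
# photo of a yoga-pose
# a rendition of a yoga-pose
```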
Is it possible to get a look like your pictures with real people?