Articles are now live for everyone! No need to download the guide, it's live > Here <
The guide will still be updated here so you can get it in image format.
Beginner's guide!
I'm a beginner myself and don't see myself as a teacher. This is just to show how I make my embeddings, so you can learn to do them the same way.
Once you've learned the basics, you can continue to learn more from other places.
I've chosen to make the guide in pictures and text, since it's easier to just copy what I do.
tl;dr: You should be able to skip all the text and just look at the pictures.
Just a reminder!
You should never download anything without checking it for viruses, even if it's said to be safe!
I can recommend using: https://www.virustotal.com/gui/home/upload
The images are in a .rar file. Use the program below if you can't open it.
Use it as you wish. No need to ask permission; share it or do whatever you like.
Long live pirates! To hell with the greedy ones. Have fun! :D
Comments
After preparing the dataset (images + .txt captions) and all the Train tab settings, I clicked the "Train Embedding" button, but it stopped within seconds and showed "RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.
Applying xformers cross attention optimization."
How can I continue?
Googled it and it looks like a bug. Did you use "Use deepbooru for caption"? If so, that's my bad; I've never tried it and only added it because I heard about it from someone else. Stick to "Use BLIP for caption".
@Alyila Found the reason: CUDA out of memory. 4 GB of VRAM isn't enough for training, I guess.
@dajusha Oh.. no, 4 GB is way too low! Sorry.
It says the zip file is corrupt (i.e. defective).
Good and clear tutorial! It's similar to the method I've used for a few TIs. I set batch size to 3 and gradient accumulation to 2 for training. Also, I notice you don't really adjust the descriptions in the .txt files after BLIP captioning? Sometimes there's weird shit in there, and I think it also helps to separate some things with commas and add more info about the background (e.g. blurry or black background) depending on the picture.
Thanks for stopping by! :)
When I was starting out I tried changing the text and blurring the background of the images, and I didn't get quite the same results.
And this is a beginner's guide, so I don't feel it's that important when you're new. If you change too much or do something wrong there, it often gets worse.
If you just take your time and find good pictures from the beginning, it shouldn't be a major problem.
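To make the caption editing above concrete, here is a made-up example (your BLIP output will differ):

Before (raw BLIP): a woman in a white shirt is standing in front of a wall with a clock on it
After (cleaned up): a woman in a white shirt, standing, plain wall, blurry background

The idea is just commas between separate concepts plus a short note about the background; whether it actually helps depends on your dataset, so test both.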
Great guide!
Seems like a small mistake in either the image or the text. For step 9 in the train section, the image shows "Drop out tags when creating prompts" as 0.1, but the text says to set it to 0.9, so I'm not sure which is correct.
Should the Prompt Template on the training page be subject_filewords.txt instead of style_filewords.txt if you're doing a person?
Update after reading more:
select style_filewords.txt if you are training an artistic style
or
select subject_filewords.txt if you are training an object, person or animal.
It works well either way, but the right way is to choose subject if you're going to train a person. I'll add it to the guide. Thanks!
@Alyila Yeah, I wonder how much of a difference it makes... you could even try making a custom one. I think it's just a file filled with some prompts, so you could even do a mix of both.
@NebulaT13 You can find the files here: stable-diffusion-webui\textual_inversion_templates
So you can just open them and see the difference. I just made a new model using subject and I'm not sure I can see any difference at the moment.
A mix doesn't sound good if you check the files; it's probably better to pick a focus based on what you want to use it for. I'm sure there are custom ones you can use too.
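Each template is just one prompt per line, where [name] gets replaced with your embedding's name and [filewords] with the caption from the matching .txt file. From memory the files look roughly like this (check your own copy, the exact lines may differ):

subject_filewords.txt
a photo of a [name], [filewords]
a cropped photo of the [name], [filewords]

style_filewords.txt
a painting, art by [name], [filewords]
a cropped painting, art by [name], [filewords]

A custom template is just another .txt file with prompt lines like these saved in that same folder, which you can then select in the prompt template setting on the Train tab.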
I have 6 GB of VRAM, can I try? Or is it risky?
Feel free to try, but sadly I don't think it will be enough. You could also try A1111 on Colab, but you have to pay for it then. =/
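If you want to experiment anyway, A1111 has launch flags that lower VRAM use; for example, in webui-user.bat on Windows:

set COMMANDLINE_ARGS=--medvram --xformers

These flags mainly help with image generation, and I haven't verified whether they are enough to get embedding training running on 6 GB, so treat them as something to try rather than a guarantee.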
Why are you using style_filewords.txt in your settings instead of subject_filewords.txt, which is more effective for training a subject?
I'm a beginner myself; this was already discussed with someone else here, and it will be changed in the next update.
Here are the correct ones:
select style_filewords.txt if you are training an artistic style
or
select subject_filewords.txt if you are training an object, person or animal.
Is there a downside of using textual inversion/embedding instead of LoRA? This guide makes it look like textual inversions are much easier to make, and the file size is small too.
You have to decide that for yourself. It depends entirely on how they are trained; I've seen LoRAs that are both worse and better. I've made all my models as in the guide and it's super easy for anyone to do, so that's what I'd recommend for beginners. But on paper, LoRAs should be better. Give it a try! :) Please show your results later if you make something.
Good luck!
@Alyila Can you elaborate on why LoRAs should be better on paper? Can they hold more information or something (because of the file size difference)?
@FaeFlan LoRA stands for Low-Rank Adaptation. Instead of teaching the model a new word, a LoRA trains a pair of small low-rank matrices that get added on top of the model's existing weights (mostly in the attention layers), so it can actually change how the model draws things. That's also why a LoRA file is tens of megabytes and can hold more information about a subject or style.
A textual inversion embedding, on the other hand, only learns a new token vector for the text encoder; the model's weights are never touched. That makes it tiny (kilobytes) and easy to train and share, but it can only steer the model towards things the base model can already draw.
Both approaches have their strengths depending on the use case: a LoRA can capture details and styles the base model doesn't really know, while an embedding is smaller, simpler, and works well when the base model already roughly knows the concept.
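A bare-bones numpy sketch of the low-rank idea, just to show the shapes involved (illustrative only, not how any particular trainer implements it):

import numpy as np

d = 768                      # example hidden size (SD 1.5 text embeddings are 768-dimensional)
W = np.random.randn(d, d)    # frozen base weight of one layer

r = 8                        # LoRA rank, much smaller than d
A = np.random.randn(r, d) * 0.01
B = np.zeros((d, r))         # starts at zero so the model is unchanged at first
alpha = 8

# Effective weight at inference: frozen base plus the learned low-rank update.
W_effective = W + (alpha / r) * (B @ A)

# A textual inversion embedding is just one (or a few) vectors of size d added
# to the text encoder's vocabulary; no layer weight like W is ever modified.
embedding = np.random.randn(1, d)

print(W_effective.shape, embedding.shape)

So per layer a LoRA learns two small matrices (and it does this for many layers), while an embedding is only a handful of vectors in total, which lines up with the file size difference.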
@Alyila I did a quick test earlier, but on a cartoon character and with NovelAI as the base. It didn't turn out very well, but I'll try again, maybe with a real(istic) person. I think the tagger just didn't recognize the subject very well. I get that Stable Diffusion 1.5 is a good base for real people, but for cartoon and anime characters, would you recommend Stable Diffusion as well, or is it also OK to use NovelAI as the base? (For LoRAs I think most people train on NovelAI if they're doing anime; I'd assume it's the same for textual inversion.)
@FaeFlan Thanks for the update! At the moment I can't help you much more, since I've never tried cartoon/anime. I'll do that for my next training so I can help you and others here who want to learn that too. If there's a big difference between them, I'll make a separate guide for cartoon/anime style.
@FaeFlan Check the result I posted here. I used the same method as in the guide, just changed to style_filewords.txt. It won't be quite right, but you can still get OK results.
The biggest difference is which checkpoint you use, but you can also try training with a different base and see what works best for the particular style you're looking for.
@Alyila I just saw it, it looks pretty good. Thanks for trying!
@Alyila I tried out your guide again with a real person. It turned out pretty well, but I'm curious about three things now:
1) Why did you pick 1500 steps as optimal instead of more? I usually train my LoRAs for 6000 steps, but the best results come at around 3000 steps.
2) I used 72 images instead of the ~30 you suggest in the guide; should I change the number of steps or keep it at 1500?
3) The model I picked actually has pretty chubby cheeks, but the images I generate have quite angular faces. Should I train for more steps or change something in the prompt to get the chubby cheeks back?
@FaeFlan
1: This is a beginner's guide; feel free to test what works best for you and your project. 1500 works fine most of the time, and if you train too much you can overtrain it and get bad results too. It saves time as well ;)
2: Having more isn't bad. If you have good pictures you don't need that many, and it's harder to find more photos. You can test your way here too; the number isn't set in stone, just use 15 or more.
3: Your model is almost always trained fine; it's usually just a checkpoint that doesn't suit it. It all depends on which checkpoint you use. ChilloutMix will almost always give a sharper face. Try analogMadness v4.0, or any other. Here is a model I made that is a bit chubbier in the face: https://civitai.com/models/74739/awondrr-sg
I try prompts like chubby, bbw etc., so you can use the prompt too, but you won't get exactly the same size as the model. I think it also has to do with the checkpoint and what models they used for training.
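A rough way to reason about steps vs. dataset size (simple arithmetic, not a rule from the guide): with batch size 1, passes over the dataset ≈ steps / number of images. So 1500 steps over 30 images is about 50 passes, while 1500 steps over 72 images is only about 21, which is one argument for raising the steps a bit with a bigger set, or just watching the preview images and stopping when they look right.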
@Alyila Aah, I see. I'll try with analogMadness 4.0 later as well. Another thing that popped into my mind: the automatic cropper crops to the focal point (usually the face), but what if there's a certain feature you want to include, e.g. a specific tattoo on the left leg? Should I then manually crop and resize everything so that the dataset includes both the face and the leg tattoo?
@FaeFlan
You can be lazy and only use the automatic crop to the focal point, but I always edit my photos to get what I want in frame. It trains on 512x512, so just crop to a square that contains what you want in it.
AI is not good at tattoos in my experience, so focusing on the face is better, or a mix of both.
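If you'd rather script the manual cropping, a minimal sketch with Pillow could look like this (the paths and crop box are made-up examples; pick the box per image):

from PIL import Image

src = "dataset/raw/photo_01.jpg"      # hypothetical input photo
dst = "dataset/train/photo_01.png"    # hypothetical output for the training set

img = Image.open(src)

# Square region you actually want in frame: (left, upper, right, lower).
# This 900x900 box is just an example; adjust it for each photo.
box = (200, 150, 1100, 1050)
cropped = img.crop(box)

# The guide trains at 512x512, so resize the square crop down to that.
cropped = cropped.resize((512, 512), Image.LANCZOS)
cropped.save(dst)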
How long does it take to train your TI?
Btw love your SG TIs.
Thanks! :)
Well, it depends on what card you use: ~35 min on an RTX 3060 12 GB.
It takes more time to gather all the pictures and edit them, and since I upload my models I also need to render a lot of pictures for the showcase, which takes time too. So the total is around 2 hours for me.
Why does my trained TI not work in models other than v1-5-pruned-emaonly?
With my trained TI in v1-5-pruned-emaonly I managed to get fairly decent results, but with other models, like Deliberate or DreamShaper, it looks like a completely different character.
You only train your TI using v1-5-pruned-emaonly; when you render with your model, try different checkpoints to see what works best for it.
@Alyila Thanks for your reply, but if I use models other than v1-5-pruned-emaonly during training, it gives random weird results. I tried a couple of models and they were all like this.
@Mitsunari
I don't recommend changing the base model for training.
You can try 2.1: https://huggingface.co/stabilityai/stable-diffusion-2-1/tree/main
I haven't tried it myself. From what I've read it's more for architecture, interior design and other landscape scenes; if you want people, 1.5 is better.