Jack Sparrow - Realistic + Anime - LoRA
Making models can be expensive. Do you like what I do? Consider supporting me on Patreon or feel free to buy me a coffee ☕
Consider leaving a ❤️ to receive future updates.
This is a successful experiment I made to test my theory on how to make an anime character from a photorealistic dataset. The answer was simply to use NeverEnding Dream (NED). NED was made basically to bridge the gap between photorealistic models and anime ones, so the theory was to take a dataset of movie screencaps of Jack Sparrow (about 80), tag them with booru-style tags (using a combination of WD Tagger and manual tagging), and train a LoRA on NED.
If you then run inference on NED, it gives perfectly photorealistic results, but if you use it on AnyLoRA or anime models in general, it gives you an anime version of Jack Sparrow with little to no style transfer.
You can probably use this technique to turn any real person into an anime counterpart and vice versa (see: my Ganyu LoRA).
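The tagging step described above can be sketched in a few lines of Python. This is a purely illustrative helper, not the actual script used; the tag names and the merge order (trigger word first, duplicates dropped) are assumptions about how one might combine WD Tagger output with manual corrections:

```python
# Hypothetical sketch: merge auto-generated booru tags (e.g. from WD Tagger)
# with manually added ones into a single caption, trigger word first.

def build_caption(trigger, auto_tags, manual_tags):
    """Merge tag lists into one comma-separated caption, dropping duplicates."""
    seen = set()
    ordered = []
    for tag in [trigger] + manual_tags + auto_tags:
        tag = tag.strip().lower()
        if tag and tag not in seen:
            seen.add(tag)
            ordered.append(tag)
    return ", ".join(ordered)

caption = build_caption(
    "jack sparrow",
    auto_tags=["1boy", "beard", "dreadlocks", "hat", "realistic"],
    manual_tags=["bandana", "realistic"],
)
print(caption)
# jack sparrow, bandana, realistic, 1boy, beard, dreadlocks, hat
```

The resulting string would be written to a `.txt` caption file next to each training image, following the usual kohya-style dataset layout.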
You can generally use it at weight 1, but tune it based on the model. The trigger word is jack sparrow.
As with most of my LoRAs, this one can be customized a lot. In the examples I provided a blonde anime girl version of Jack Sparrow. Examples 6 and 7 of course use my One Piece style LoRA. Third image by bloodsplash09.
How to use LoRAs in auto1111:
- Update the webui (use git pull like here, or redownload it)
- Copy the file to stable-diffusion-webui/models/lora
- Select your LoRA like in this video
- Make sure to change the weight according to the instructions (by default it's :1)
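Once the file is in place, a LoRA is activated by adding its tag to the prompt. A hypothetical example (the filename JackSparrow and the 0.8 weight are illustrative; use whatever name you saved the file under, and tune the weight per model):

```text
masterpiece, best quality, jack sparrow, 1boy, anime style <lora:JackSparrow:0.8>
```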
Comments (27)
For tagging images, is it better to add what you want to train, or to caption everything except it? (Like for a character with red eyes, would I add the red_eyes tag or would I just describe the background?)
As far as I have understood it from several good trainers by now, you caption what you want the model to be able to change. You don't tag what you want to always be present.
Basically the more you tag, the more the model can change, but the more you also have to input to get the result you want.
If you want it to create a specific character whose eye colour and clothing shouldn't change, then you only tag their eyes and clothing in images where they differ from the character's normal design.
So for a character design that consistently has a pink hoodie and blue eyes, you would only tag things like "swimsuit", "school outfit", "red eyes", etc. in images where they deviate, if the point is to be able to easily generate that specific character.
You basically want the model to burn in and "overfit" on the aspects that you want to always be present, then be flexible on everything else.
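The rule described above can be sketched in a few lines of Python. This is purely illustrative (the trait names are hypothetical, and real tagging happens in caption files, not lists): tags for traits that should always be present are removed from the caption so they get "burned in", while everything else stays promptable.

```python
# Illustrative sketch of the tagging rule: drop tags for fixed traits,
# keep tags for anything the model should be able to change.

FIXED_TRAITS = {"blue eyes", "pink hoodie"}  # hypothetical character design

def prune_caption(tags, fixed=FIXED_TRAITS):
    """Keep only tags the model should be able to vary at inference time."""
    return [t for t in tags if t not in fixed]

# Image with the usual outfit: fixed traits stay untagged (burned in).
print(prune_caption(["1girl", "blue eyes", "pink hoodie", "smile"]))
# Image with a deviation from the design: tag the deviation explicitly.
print(prune_caption(["1girl", "red eyes", "swimsuit", "beach"]))
```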
@lifania Thank you, that is very helpful
Very much what @lifania said. I tend to tag everything, so that you basically get a set of tools with a LoRA. Like you make a Ganyu LoRA and she has "blue hair", but it's her specific shade of blue. And this also allows the user to change the hair color.
That's got to be the best anime pirate I've ever seen
Your LoRAs are impressive, how you can turn them photorealistic or anime. Thanks for sharing
@VegaProxy I've explained this in detail in the model description. I even said which model you need to use for training. And no, ChilloutMix will not work, as it's not compatible with anime models.
@rickmashups read the description :)
@Lykon It worked very well for me, thanks. Now I'm gonna try the opposite: an anime character to a realistic model
Can you explain what you mean by inference?
"If you then make the inference on NED it gives a perfectly photorealistic model, but if you use it on Anything4.5 or anime models in general it gives you an anime version of Jack Sparrow with little to no style transfer." I know how to train a LoRA, but I don't understand how to do this inference.
inference is when you generate images.
So you're saying I just have to train on NED and make images with Anything and it will come out anime? Because I tried that and it didn't work. Can you share the settings you used on this one?
@jangolemestre222 It's not the settings, it's the data.
@Lykon What do you mean, did you use anime-style images in the data? I can make a LoRA of a realistic person, but when trying to make it anime it just doesn't work. Maybe it's the LoRA settings you are using, that's what I'm thinking.
@jangolemestre222 Not really, you just have to tag photorealistic images as such to avoid the model overfitting on the style.
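As a hedged illustration of that suggestion (this is a hypothetical helper, not an actual tool): prefix the captions of the photorealistic part of the dataset with style tags, so the style itself stays promptable instead of being absorbed into the character.

```python
# Illustrative sketch: prepend style tags to a caption so the model
# associates the photorealistic look with those tags, not the character.

STYLE_TAGS = ["photorealistic", "realistic"]

def tag_style(caption):
    """Return the caption with style tags prepended, avoiding duplicates."""
    existing = [t.strip() for t in caption.split(",")]
    return ", ".join(STYLE_TAGS + [t for t in existing if t not in STYLE_TAGS])

print(tag_style("jack sparrow, 1boy, beard, hat"))
# photorealistic, realistic, jack sparrow, 1boy, beard, hat
```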
Did you leave in 'realistic', 'photo_(medium)' etc. tags?
And a random lora question: Would sketches/B&W lineart 'hurt' a lora that you want in color as long as they're tagged appropriately? And would it get anything?
Like if you want a certain piece of clothing that was only found in the linearts/sketches would it be able to render it in other styles?
that depends on many other things. It's impossible to evaluate in a bubble.
This LoRA is indeed amazing. However, I tried to train a LoRA based on NED to obtain a transformation from photorealistic to anime style, and it didn't work that well. It seems that my LoRA has absorbed some style, so images made with it have their own painting style, which I don't want.
I wonder whether you could kindly share some settings for training this LoRA? I have made a realistic version of a LoRA for the same movie character based on SD 1.5, and it works fine on NED. But when I use the same settings to train on NED, things get weird. First, when used on an anime-style base model, the LoRAs trained for only 2 or 4 epochs work better, but they are a bit undertrained, losing some characteristics of the character. With LoRAs trained for more epochs, I get something photorealistic rather than anime style. Second, on an anime-style base model it gives me something photorealistic, while on the realistic base model it gives me something anime-style (which is not good at all). These do, in a sense, obtain a transformation from photorealistic to anime style, but it is not what I'm looking for.
Specifically, I have a well-tagged dataset of around 80 images, set them to repeat 5 times, set the network dimension to 16, and trained up to 16 epochs. Do you think I should lower the learning rate for a slower bake? (I can't run many experiments since I only have a GTX 1050; it's too slow.)
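For reference, the total step count implied by those settings works out as follows (batch size 1 is assumed here, since it wasn't stated):

```python
# Back-of-the-envelope step count for: 80 images, 5 repeats, 16 epochs.

images, repeats, epochs, batch_size = 80, 5, 16, 1
steps_per_epoch = images * repeats // batch_size
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)  # 400 steps/epoch, 6400 total
```

6400 steps on a dim-16 LoRA is on the heavy side for a single character, which is consistent with the overbaking symptoms described above.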
Have you tried adjusting the LoRA weight during image generation? Training on NED produces stronger results compared to SD 1.5, so you'd want to use it around 0.6.
I don't think the LR matters too much. Have you tried adding "photorealistic" to tags of your dataset of realistic images? This will reduce the style learning.
Also, are you using tags or captions? You talked about SD 1.5, so I guess you're probably using captions instead of booru tags, which are the ones you should use on NED and anime models.
If those suggestions don't work, try adding 1% fanarts to the dataset.
Let me know.
@Lykon Thank you for the reply. You're right about the weight. How could I forget to test different weights! But it does require a much lower LoRA weight compared with the realistic one. For me, the LoRA trained for 16 epochs must be set under 0.7 to obtain the anime style. I will conduct more tests to find the best settings.
I did add "photorealistic" and "realistic" to all my training images. And they are tagged with both captions and booru tags, most of which were done manually. However, I find that adding anime-style images should be a great idea, letting the model absorb the differences between realistic and anime styles. Instead of using fanart, which may involve copyright issues, I will try adding some generated anime-style images. I guess that will help.
I might do these later and share the results here. Thanks for sharing, and great work again, cheers!
@Atomaton You should avoid using captions and booru tags together. That will never work. I suggest you stick to booru tags, as NED is probably NAI-based. Keep me updated :D
Hi man! I've been trying to follow your suggestion and was able to generate acceptable results with the right prompting in txt2img alone. Sometimes training details I recognize leak into the resulting images.
I'd like to ask for tips specific to your method on dataset preparation/tagging. My goal is to make my custom LoRA learn only the head and keep it flexible for any costume. Thank you!
If you get original data leaking through, it means you might have used too few different examples, or too many examples similar to each other.
To get flexibility you should tag most of the things you want to be able to change.
Very useful and helpful, thank you for sharing. May I ask about regularization/class images? Should I use class images produced by NED too (which are just digital art of Asian-looking people), even if my training images are of real people? Or how should I make my class images?
Uhm, this guide is pretty old. I'd use AbsoluteReality instead of NED nowadays.
NED was one of the most realistic models at the time that was also compatible with anime models. Nowadays we have AbsoluteReality, which is even more compatible with anime models while being more realistic.
@Lykon That's a nice model, thank you, I'll try it. If I need to train ABC man, should I first create 1000 reg images using AbsoluteReality with the simple prompt "man, realistic"? Or do you have any suggestions for preparing reg images?
@lekhang I wouldn't prompt only "man, realistic". I'd prompt "photo of ..." and then give a description of the image, likely with booru tags if you aim for it to be used with anime models.