Your beloved tail~ Ready for a full NAI3 experience? (Actually, even better.)
Full-scale finetune of Pony Diffusion 6 on a dataset of 1.8M anime pictures:
Unmatched (in open source) knowledge that is missing in the original Pony and other models
8k+ artists styles (wildcard), few general styles
Thousands of characters simply by prompt
Full color palette, full brightness range (example 1, example 2), great base aesthetics
No annoying watermarks like everywhere else
Unique angles, foreshortenings, full-body wide shots or extreme close-ups without any issues, pretty backgrounds as an added bonus
From cutest and lovely things to deepest and darkest fantasies
Best performance with tail concepts for your fox/cat/dog/dragon/... waifus/husbandos
Well, this finetune has had enough training to make a base anime model. Despite that, the existing knowledge (for anime) has not gone away, it has only become better. A careful approach, especially to TE training, and a lot of high-quality natural-text captions (about 600k, mainly made with Claude 3 Opus/Claude 3.5 Sonnet) significantly improve prompt control and understanding. "Feels like a new base, not pony (c)".
And yes, unlike the majority of PD derivatives, which are just a reskin or a lobotomized merge, not a single LoRA was merged. You can add your own tweakers if needed, merge in the difference from another favourite checkpoint, or whatever; it works as a good Pony-compatible base.
v0.5.0 Changelog
A new training from PD-base with a large dataset using some new approaches with pretraining, main train, refining
Lots of new data
After some black magic in training, you can now get completely black or completely white pictures without breaking compatibility with existing tools, loras, etc. Actually a very interesting experience (example)
Better and more stable base styles, less "burning" for artists
Fixes, improvements, ...
(Dataset cut-off: beginning of July; requests made after it are pending and not forgotten)
Features and prompting:
Well, first of all: the TE knows a lot. It will try to make whatever you prompt, without ignoring parts of it like you may be used to. No guardrails, no safeguards, no lobotomy. Shit in, shit out.
Schizo-prompts from mixes, where you have to boost tag weights and add extra tags to get at least some response (something like (sunny day, rainbow, ethereal hair, transparent skin, huge breasts:1.9)), will not work. You will get something insane, creepy or unexpected.
At the same time, if you just copy tags from a booru picture without the manipulations mentioned above, or describe it normally with a combination of tags and natural text, most likely it will be great across a very wide range. Stick to original booru tags to get the best results. Deepest and darkest fantasies may require some rolling; popular things are very stable.
Basic:
Same as for all SDXL: ~1 megapixel for txt2img, any AR with resolution a multiple of 64 (1024x1024, 1152x, 1216x832,...). Euler_a and CFG 4..9 (6-7 is best). Highres fix: any GAN/DAT, x1.5-1.6, denoise 0.5; upscale works best with a single tile resolution of no more than 3 mpx. Highres fix and further upscaling will significantly improve quality, details, eyes, hands, feet, etc.
Set "Emphasis: No norm" in the settings of your generation tool if you are getting strange blobs or distortion.
If LCM/PCM accelerators are applied, use Euler/Euler a samplers; DDIM gives a lot of mess and abominations.
Clip Skip 1 unless you are using loras that have problems with it.
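The resolution rule above (about 1 megapixel, sides a multiple of 64, highres fix at x1.5-1.6, single tile under about 3 mpx) can be sketched as a small helper. This is only an illustration: the function names, the 8% area tolerance, the 512..2048 side limits and the rounding to a multiple of 8 are my assumptions, not something the model requires.

```python
def sdxl_resolutions(target_pixels=1024 * 1024, step=64, tolerance=0.08,
                     min_side=512, max_side=2048):
    """List (width, height) pairs with sides that are multiples of `step`
    and an area within `tolerance` (a fraction) of `target_pixels`."""
    sizes = []
    for width in range(min_side, max_side + 1, step):
        for height in range(min_side, max_side + 1, step):
            if abs(width * height - target_pixels) <= tolerance * target_pixels:
                sizes.append((width, height))
    return sizes


def hires_target(width, height, scale=1.5, step=8):
    """Scale a base resolution for highres fix, rounding each side
    down to a multiple of `step` (assumed here, tools may differ)."""
    return (int(width * scale) // step * step,
            int(height * scale) // step * step)


def fits_single_tile(width, height, limit=3_000_000):
    """Check the 'no more than 3 mpx per tile' rule of thumb."""
    return width * height <= limit


if __name__ == "__main__":
    for size in sdxl_resolutions():
        print(size)
    w, h = hires_target(1216, 832)      # (1824, 1248)
    print(w, h, fits_single_tile(w, h))
```

With these assumptions 1024x1024 and 1216x832 both pass the filter, and 1216x832 at x1.5 stays under the 3 mpx tile limit.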
Quality classification:
Only 4 quality tags:
masterpiece, best quality for positive
low quality, worst quality for negative.
Avoid using score_x, source_x, ... etc like in original pony.
In most cases they just make things worse: they add noise and mess, break bodies and fingers, change styles and bring back the urine yellow-green filter.
Originally, that was definitely not the best implementation of quality tagging: it included some training flaws and required tons of tokens. It became clear that it's better to introduce new tags instead of fixing the original ones. At this point they only bring back old triggers without serious improvements.
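If you are porting prompts written for the original Pony, the advice above boils down to dropping the score_x / source_x tags and using the two positive quality tags instead. A minimal sketch of that cleanup follows; the function name and the regular expression are hypothetical, and the pattern only covers the common `score_9`, `score_8_up`, `source_anime` style of tags.

```python
import re

# Original Pony quality/source tags that this model card advises against.
PONY_TAG = re.compile(r"^(score_\d+(_up)?|source_\w+)$")

# The positive quality tags recommended for this model.
QUALITY_TAGS = ["masterpiece", "best quality"]


def convert_pony_prompt(prompt):
    """Drop score_x/source_x tags and prepend the new quality tags."""
    tags = [t.strip() for t in prompt.split(",") if t.strip()]
    kept = [t for t in tags
            if not PONY_TAG.match(t) and t not in QUALITY_TAGS]
    return ", ".join(QUALITY_TAGS + kept)


if __name__ == "__main__":
    old = "score_9, score_8_up, source_anime, 1girl, fox tail"
    print(convert_pony_prompt(old))
    # masterpiece, best quality, 1girl, fox tail
```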
Negative prompt:
(worst quality, low quality:1.1), error, bad hands, watermark, distorted
Correct according to your preferences.
Do not put tags like greyscale, monochrome, yellow background in the negative. You will just get burned images; there is no need to fix washed-out colors or a "yellow filter" here like you may be used to. 3d in negatives is also a bad choice in most cases.
To improve backgrounds, add to the negative:
simple background, blurry background, abstract background
but do not forget to remove it if you are prompting for something with a simple background.
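The negative prompt rules above are easy to forget when switching between scenes, so here is a small sketch that assembles the negative from the positive. The function name is hypothetical, and the check for "simple" in the positive tags is just my reading of the advice, not an exact rule.

```python
BASE_NEGATIVE = "(worst quality, low quality:1.1), error, bad hands, watermark, distorted"
BACKGROUND_NEGATIVE = "simple background, blurry background, abstract background"


def build_negative(positive, improve_background=True):
    """Return a negative prompt; skip the background tags when the
    positive prompt itself asks for something simple."""
    positive_tags = [t.strip().lower() for t in positive.split(",")]
    wants_simple = any("simple" in t for t in positive_tags)
    if improve_background and not wants_simple:
        return BASE_NEGATIVE + ", " + BACKGROUND_NEGATIVE
    return BASE_NEGATIVE


if __name__ == "__main__":
    print(build_negative("1girl, fox tail, outdoors, forest"))
    print(build_negative("1girl, fox tail, simple background"))
```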
Artist styles:
Used with "by "; multiple artists give very interesting results and can be controlled with prompt weights.
by ARTISTNAME1, [by ARTISTNAME2, (by ARTISTNAME3:0.8),...]
or/and
[by ARTISTNAME1|by ARTISTNAME2|by ARTISTNAME3|...]
Works best at the very beginning of the prompt. Can be used as a wildcard (beware, there is a flaw in the sd-dynamic-prompts extension that sometimes wrecks results when used with a batch size of more than 1). For the majority, highres fix/upscale improves quality a lot.
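The two artist patterns above can be generated from a list instead of typed by hand. The sketch below assumes the usual web UI conventions, `(tag:weight)` for weights and `[a|b]` for alternation; the function names are hypothetical, and the placeholder artist names are the ones from the examples above.

```python
def artist_mix(artists):
    """Build 'by A, (by B:0.8)' from (name, weight) pairs.
    A weight of 1.0 is written without parentheses."""
    parts = []
    for name, weight in artists:
        tag = f"by {name}"
        parts.append(tag if weight == 1.0 else f"({tag}:{weight:g})")
    return ", ".join(parts)


def artist_alternation(names):
    """Build '[by A|by B|by C]' so the styles alternate between steps."""
    return "[" + "|".join(f"by {name}" for name in names) + "]"


if __name__ == "__main__":
    styles = artist_mix([("ARTISTNAME1", 1.0), ("ARTISTNAME3", 0.8)])
    # Artist tags go first, as recommended above.
    print(styles + ", masterpiece, best quality, 1girl, fox tail")
    print(artist_alternation(["ARTISTNAME1", "ARTISTNAME2"]))
```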
General styles:
2.5d, bold line, smooth shading, flat colors, minimalistic, cgi, digital painting, ink style, oil style, pastel style
can be used in combinations (with artists too), with weights, both in positive and negative prompts.
Characters:
Use the full name tag, the same as on boorus, with proper formatting, like "karin_(blue_archive)" -> "karin \(blue_archive\)". Use costume (skin) tags for better reproduction, like "karin \(bunny\) \(blue_archive\)". This extension might be very useful.
Most characters are known by name, but it is better to also prompt their main features, like:
karin \(blue_archive\), karin \(bunny\) \(blue_archive\), dark-skinned female, purple halo, ponytail, yellow eyes, playboy bunny, fishnet pantyhose, gloves
Natural text:
Use it in combination with booru tags, works great. Use only natural text after typing styles and quality tags. Use just booru tags and forget about it, it's all up to you.
And yes, it's still based on pony, so it will be worse at IRL concepts, references or some complex expressions compared to other checkpoints based on vanilla SDXL. Check out Tofu, my new model that can manage such things.
Lots of Tail/Ears-related concepts:
tail censor, holding own tail, hugging own tail, holding another's tail, tail grab, tail raised, tail down, ears down, hand on own ear, tail around own leg, tail around penis, tail through clothes, tail under clothes, lifted by tail, tail biting,...
(booru meaning, not e621) and many others with natural text. Some reproduce perfectly, some require rolling. Unfortunately, in 0.5.0 some may work worse, but others look better. Also, it now performs better with all kinds of tails, not only fluffy kemonomimi.
Brightness/contrast:
You can just prompt what you want with tags or natural text and it should work, like dark night, dusk, bright sun, etc. Black/white background works, but often it gives not exactly 0,0,0 or 255,255,255 like it should. Part of this is related to prompts: just check what pictures are tagged with it. Using phrases like (cute girl in front of completely black background) fixes it. Anyway, you shouldn't meet any issues in general use; it works just like NAI3, often even better.
Known issues
Well, unfortunately there are:
Some artist styles don't work as they should.
(The reason for this is not entirely clear, because in another model with the same dataset they work fine. Probably it is something related to conflicts with PD 1-token hashes or problems with the original TE. It can be fixed in the future anyway; please report if you find artists that don't have a decent effect.)
Some concepts require more training (few tail-related, some rare like "dogeza" or memes)
Watermarks sometimes can be found. Mostly it is related to pony-base, but some may be from dataset
Ciloranko is actually opossum LMAO (error in one of the cherry-picked datasets)
To be discovered, still WIP
Requests for artists/characters in future models are open. If you find an artist/character/concept that performs weakly or inaccurately, or has a strong watermark, please report it and I will add it explicitly. Follow for new versions.
License:
Pony viral, check the original. Feel free to use it in your merges, finetunes, etc., just please leave a link.
Future plans:
Well, a new dataset 2.5 times bigger with better balancing and classification is ready, but any mistake or flaw will cost A LOT. Fixes for the current version may come quite soon, but before the next big training I'm going to collect more feedback and test some new things. If you have advice or would like to share your experience, tools or methods for training, you are very welcome.
I'm thinking about adding some furries to the dataset. It may be beneficial for anatomy, poses and concepts, but it is not that easy because of the different tagging system and... wide aesthetic range. If you have ideas on how to deal with it, suggestions for good-looking/interesting furry artists, or can share your datasets, please PM.
Training with natural-text captions (in combination with booru tags) looks very promising even for SDXL, and new large models come with it out of the box. Current local VLMs do not have decent performance: COG and Idefics3 are nice but strongly SFW, JoyCaption hallucinates and is almost uncontrollable by prompt, Llava is just dumb, others have similar problems. As for commercial ones: Claude is extremely expensive, Gemini has strong censorship, GPT-4o is quite stupid for such a task.
So there is a chance that someday you will see a multimodal LLM finetuned with sfw/nsfw anime pictures from the dataset; it should help a lot. Oh yes, here is a preliminary version and showcase.
Flux: promising, very smart, GPU-heavy and brainwashed even for boobs. I've performed some training where "uncensoring" and a little knowledge of anime concepts were achieved, but it doesn't look good enough. Write if you are interested in it. The main issues here are the training tools (actively developing, hopefully proper full T5 training will arrive soon) and about 5-7 times more GPU time required, so it's probably better to wait for a while.
For any suggestions or requests, join the Discord server
Thanks:
Artists who wish to remain anonymous - for sharing private works; Soviet Cat - GPU sponsoring; Sv1. - llm access, captioning, code; K. - training code; Bakariso - datasets, testing, advice, insights; NeuroSenko - donations, testing, code; T.,[] - datasets, testing, advice; dga, Fi., ello - donations; other fellow brothers that helped. Love you so much ❤️.
And of course everyone who gave feedback and made requests, it's really valuable.
Donations
AI is my hobby, I'm wasting money on it and not begging for donations. If you want to support - share my models, leave feedback, make a cute picture with kemonomimi-girl. And of course, support original artists.
However, your money will accelerate further training and research
(Just keep in mind that I can waste it on alcohol or cosplay girls)
BTC: bc1qwv83ggq8rvv07uk6dv4njs0j3yygj3aax4wg6c
ETH/USDT(e): 0x04C8a749F49aE8a56CB84cF0C99CD9E92eDB17db
if you can offer gpu-time (a100+) - PM.
Comments (76)
I disagree
You met a case where the previous version works better? Can you please share it?
Thank you very much for the update, the model has really become better, there are not enough retro styles, mainly bluethebone gives out “retro artstyle” on this prompts, but this is not retro at all, rather a parody of retro) If you accept requests for an update in the next version:
by urushihara satoshi
by danmakuman
by kitazume hiroyuki
by kawarajima kou
by kotobuki tsukasa
by hirano toshihiro
biggus
This is probably the best model I've ever seen, and I've tried a lot of them. I’ve never really been into a ton of different styles because that’s super subjective and hard to judge. What always interested me the most is how accurately the model follows the prompt, how complex the compositions can be, and its understanding of concepts. This model knows a huge number of danbooru tags and actually understands them, creating decent results. I often notice issues with anatomy, like extra fingers, hands, or legs popping up, but that usually happens with more complex scenes. In simpler ones, it’s great. Some NSFW concepts need extra training since they’re pretty unstable, but the fact that it recognized the schizo rare tags I provided is already impressive.
I have a question though: I tried generating images with characters from Fairy Tail (for example), found them in the character list, but they don’t really look like themselves. The hair color and style are right, but the face shape is way off. Is this just bad luck with these characters, maybe a small dataset, or is it supposed to be like that?
Thank you! Yes, unfortunately some concepts are still unstable and require additional training. As for characters, it has the same origin. Fairy Tail wasn't added as a series, only through pictures that came in via other tags, and there are simply few of them in the dataset. Since you pointed it out, I will add it.
Also, some characters become stable and recognizable with just 50(!) images in the dataset (not only on pony), while some still have wrong outfits even with 1000+ (like hata no kokoro). IDK what threshold should be set; currently it's quite low.
Same for artists: there are some that are actually nice and not in the list, and some listed that are not good. No way to find them without feedback with such numbers.
@Minthybasis Thank you for the detailed response. While reading your message, I noticed that Civitai hardly promotes your model. It's just sad. I found this model myself when the latest version was 0.4 at the very bottom of the site...
After conducting additional stress tests (at least that's what I call them xd), I identified some unstable concepts. I'm not asking or insisting on fixing them, just leaving feedback in case it might be useful :3. I don't think I have the moral right, as someone benefiting from your already wonderful work, to request anything for free.
deep penetration - I saw that you provided an example with this tag in the gallery. I tested it and found that it only works with the cowgirl position and its reverse variation. In other positions, I couldn't get a good result. it just doesn't do anything. I was once surprised why almost no model understands this tag... For now, I'll have to keep using Lora for this.
guided penetration - I was very surprised that the model "out of the box" understands this concept and the results are very good, but the fingers are almost always broken. I couldn't fix it even with x2 hires fix with 0.5 denoising strength.
anilingus - In the original Pony model, this tag was already not doing well; the model reacted to it, but getting something good was almost impossible. In this one, I managed to generate good images, but there are still quite often mutants and concept violations.
But I couldn't find anything else. There were some other issues, but they were resolved with more detailed prompts, so everything is great. No matter how hard I tried to find something that doesn't work well or that the model doesn't know, I just couldn't. Only the tag "fisting" probably isn't recognized by the model. The model is simply amazing!
@GromForever Well, I update it rarely, when something really new comes, while other derivatives are updated like every week, so they get bumped more often. And probably there is something wrong with Civ algorithms, moderation, etc., because AFAIK 0.4.0 wasn't even shown unless you are logged in and have NSFW settings, despite not being totally nsfw oriented. Just a funny example: this picture was removed from the showcase and put on review. Race queen Sussoro is probably understandable, but this... That is sad, hope it will change.
Thank you for pointing this out, I need to review the dataset for these concepts and investigate whether it is just not trained enough or there is a lack of decent pictures. Yes, it should be improved since it's not that hard. Maybe one day I'll decide to make a base model, and it would not be nice to wreck it due to an insufficient dataset or something else.
@Minthybasis Thank you so much. Regarding the promotion of the model on the site, I can only say that it is at least unfair. Updates are rare, but the amount of work put into each one and the time and money spent are just immeasurable.
I have some issues with opening links that lead to Catbox. I even tried using a VPN, but it was unsuccessful. As far as I understand, I'm not the only one with such problems, so I can't view the example you attached. When I also open the page with artist styles examples, no images load there either. sorry
@GromForever Yes, often it is not loading from first visit, needs to refresh page few times. I'll also upload it somewhere else like Mega soon.
@Minthybasis Hello again. While browsing the new lores and models on the site, I came across this concept: https://civitai.com/models/396401?modelVersionId=653711 (on Danbooru, the tags are: tail masturbation and tail insertion). I tried it with your model, but it doesn't seem to recognize these tags. From what I've seen in the description and judging by the model's name, you've paid special attention to concepts related to tails. Is it possible to add this concept as well? I noticed the number of images with this concept and realized that there might be issues with the dataset size, but I'm not an expert in this matter.
@GromForever Wow, already want it! Well, it might take some work, but it's definitely worth it. I can’t promise that this will be right in the next version/models, because of continuos training cycle, but in the next one after it for sure.
@Minthybasis Great, thank you very much, I can't wait to play around with it.
@Minthybasis I found an issue, or at least I consider it an issue. I can't get an image where the character is facing the viewer at a 0-degree angle. For this, there is a tag "straight-on" on Danbooru, but no matter how much I try, it doesn't work as needed. As a result, I get images like "from above" or "from side", sometimes "from below". It works fine in the original pony model. Maybe there is a different tag used here that I don't know about?
@GromForever Like this? https://mega.nz/file/cCphjQLS#AtOFct8Zqoax62iW8H_u7SOtklfpprzTmtNjHO2Uefo Anyway could you please share some of pictures where you got such a behaviour?
@Minthybasis Here is a link to a folder with images: Google Drive Link. I haven't figured out how to display the prompt above the images, so I'll write it here: positive: "best quality, masterpiece, 1girl, green hair, long hair, blue eyes, seductive smile, straight-on, standing, kimono, hair ornament, hair flower, hands on own hips, Japanese festival, outdoors". negative: "low quality, worst quality, blurry background, abstract background, simple background".
grid.png - these are 6 images with this prompt in your model. Only the last and the fourth, counting from left to right, turned out more or less fine. The rest look either like side views or from below.
I tried adding the tag "upper body". The result became a bit better, but it still feels like it's trying to create a view from below or a three-quarter view.
I've also added 2 images as examples from another model showing how it should be (in my opinion).
@GromForever Yes, was able to reproduce, definitely in such case it tends to make strange angles and rotate character. The effect varies with different prompted backgrounds and negative, aspect ratio or even tags order.
If appears, currently it can be partially solved by adding "facing viewer" to positive prompt to avoid character rotation, and "dutch angle" to negative to reduce angle shifts, like here. Also, as you mentioned, specifying pose/frame also improves it. Consider using tag weights if possible or also add "from below" in negative if it spams it.
Part of this effect may stem from image selection for post-training. To avoid typical generic 1girl_standing_looking_at_viewer balance was shifted to more diverse images. But, apparently, approach needs to be improved, thanks for pointing.
@Minthybasis Thank you, your advice increases the chances of getting the desired image! Do you have any plans to add an anime screencap general style? It would be very useful for images with anime characters. I wouldn't be asking if LoRas with this style worked correctly with this model, but based on my small tests, the image quality drops, as if it becomes a bit blurry.
@GromForever Hm, nice idea for general style. Currently models are training with old set, but probably will add it after with classifiers update.
@Minthybasis Thank you very much. I might have become a bit annoying with my persistence. Based on the comments on this model, this is currently the largest thread of all. You are very responsive and communicative, and I might have started asking for too much. If that's the case, I apologize. From now on, I will only write about serious issues with the models. I look forward to new versions or new models. Thanks again.
@GromForever Not at all, feedback is important because me and fellows may have biased alignment and it's easy to miss some things that will cause problems in future.
Just keep in mind that implementing might take time and not all of them will be in upcoming version bcs it's already training.
Do you have showcases of 3500+ styles here? I'd like to easily choose a style I like. Also, thank you for the update. Wish you a happy day every day :)
Here it is https://rentry.co/rravdiok
could you upload the wildcard txts somewhere else? catbox seems to be down a lot lately.
You can find it in "model training data" section on the right.
this is one of the BEST pony model I'd ever seen!
thanks for your work!
Thats pretty epic model, thank you! one of the best!
the best model i've ever used, hope it'll only get better!
Amazing finetune, good job!
I have several requests, hope it's not too much:)
1) The model seems to struggle at creating oekaki-stylized images, can this be improved in future?
2) If possible, can you please add support for characters from project_moon universe to the next version?
3) And also these artists from danbooru with distinct styles, could be useful for merging styles:
parororo
gashi-gashi
mike inel
j.k.
kankan33333
namako daibakuhatsu
ebiblue
bacun
bigdead
buzzlyears
poch4n
garouma
jam-orbital
saltyxodium
demimond23
razalor
danmakuman
terufuu
barleyshake
polyle
hxd
aoki ume
fujitaka nasu
Can't promise it for sure but will try. Characters and artists will be added.
@Minthybasis Sorry for the inconvenience - I've tested more styles and made a list of some weak/missing ones https://files.catbox.moe/ei4drl.csv, would be wonderful to have them in a new version
@pinkpone Thanks, I'll add them to the new dataset. Some of them will be in upcoming version, the rest in next one after it.
Thanks a lot, really appreciate your work for us!
Please upload this model to Tensor.Art, it is the best site to use SD for free.
Every time you upload a new version, also do it on Tensor.Art.
Is this model just as good with furry art as Pony Diffusion? Will you also upload non-anime related art to training or will the aesthetic focus on that?
I would like the model to be much more diverse than just anime because the fine-tuning looks very good.
Sure.
Well, furry is untested. The Pony base is strong and the knowledge is present, but there were less than 3% furry images in the dataset for the 0.4.5 version.
Maybe later I'll investigate, because with the right approach adding furries should be beneficial, but I need to be careful due to the different tagging system.
"I would like the model to be much more diverse than just anime" ... easy solution then, train your own model.. i see a lot of work has gone into training this model and it's very good. maybe it's just the way you write but that really sounds like you think your entitled to specific features of this model to be a certain way
Sorry, after some conversations decided not to upload there because of PDXL license, it prohibits the use of models on third-party resources with paid generation services. The checkpoint already goes against ideals of the base model author, so I don’t want to take risks.
However, I wouldn't care if someone else decided to upload a copy there.
As always, excellent, superb job! Can't wait to try it more on my main PC (stupid 300kb/s download :S)
extremely underrated model fr
Probably one of the best models with detailed imagery being output. With the correct prompting and multiple art styles used, you can definitely create unique pieces of art.
Been using it since 0.2 and watermarks are very few and far between, but could still use some additional tweaking.
Hand generation is quite good now as well, it would be nice if more natural language is able to be prompted too, rather than using Booru tags only.
Huge dataset being used and works amazing through ComfyUI. Happy to provide detailed feedback if I have time and if Minthyb would also like to take it on board :)
Thank you! Will be very glad to read suggestions and any criticism, do not hesitate. As for natural language - working on new dataset now, wish more complex compositions would be available without extra efforts.
Cause of better datasets, well organize, well tagging. I think this one is under rate and better than many finetune models.
I can't use Clip skip 1 on comfyui but it is fine with 2. Do anyone know how to use skip 1 on comfy?
There is one issue with this finetune that I have found, mixing mature_female with twintails only gives low_twintails, no matter if I put low_twintails on the negative prompt.
Hm, looks like there is a bias for this kind of hairstyle, and likely it doesn't come from pony. Thank you, will investigate.
@Minthybasis No problem, works fine without said tag, you're doing an awesome job taking your time to answer all my findings, thank you!
@Minthybasis Further testing shows that adding "mature female" to negative prompt, and "low twintails" to positive prompt has the opposite effect, no "low twintails".
@blackfuture82729 Could you please upload a picture? Because I can't reproduce this, (not low) twintails with "mature female" is an issue, but with loli or without any extra tag any twintails seems to be fine.
@Minthybasis Sorry, it seems I can't get it to reproduce it now, so I may have had a weird mix on negs. On the other hand, "high-low_skirt" or "showgirl_skirt" does not work, I had a random image pop with them, but explicit prompts seem to not work.
@blackfuture82729 High-low skirt is not so popular, for 0.4.5 it was too few of them in dataset, stock pony doesn't recognize it as well. By a lucky coincidence in dataset of currently trained version it presents explicitly, so expected to be. Showgirl skirt is unstable, same here.
@Minthybasis You're the best! I understand that there are lots of limitations, mostly thanks to bad tagging from WD/VLMs, that stems from the lack of proper tagging of base images. You're doing an awesome job, keep it up!
@Minthybasis I hope you don't hate me lmao. Tagging for braided_ponytail, low-braided_long_hair works fine unless you add mature_female to the negs. Here are two comparison images (slightly nsfw):
https://ibb.co/v3LXCvW
@blackfuture82729 Absolutely no, I appreciate your feedback. Thanks to the previous ones, more attention to improve hairstyles and costumes in dataset for current training have been paid. Early tests shows that new version doesn't show such biases like in 0.4.5, hope it will be fine.
@Minthybasis You're the best!
model doesn't seem to produce good results for feet. https://drive.google.com/file/d/12Z_XC2osa6xiQAbQxfjdNi9yoFqOz1x4/view?usp=sharing
LMAO, feet level - sd3.
Well, feet can be far from perfect and need improvement, that's obvious, but it's not that bad. I'd recommend removing some negatives and sticking to booru standard tags; at least it prevents such worst cases. Using artists (especially ones who draw them), having a more detailed prompt and the presence of some related tags can also improve results. Here are examples https://mega.nz/file/MSQDQIKY#McTyOBgTQWX7SbMzBfG3Qy-_PyJpfGjS1up4KQY48wM https://mega.nz/file/kOJWzQ7Z#V3qPutV-jopt2YbTYOQqqlRuiMnzs6dyHCYCUqk9tTo https://mega.nz/file/VLB3nAyJ#j2i9LaQae2dGgBYGTTE-GRngXu_5wbbkGVd2ZebWfbE maybe it can help a bit.
Fantastic model, I love the base style that this has and its ability to recall characters is amazing. I'm very impressed and would love to see how you'll improve it in the future, so here's a list of what I've found it struggles with.
Artists it does not recognise: darklux [does summon a watermark though], sgk, sarukaiwolf, youkan [seems to be a character?], ogura_anko
Artists it can improve on: toggy_keiichi, nashidrop, krekkov, kusaka_souji, nanohana_(november.), hu_dako, joylewds, mkonstantinov, haku89
Artists with watermarks: blushyspicy, donburi_(donburikazoku),
Some more:
Artists it doesn't recognise: boris (noborhys), mosbles_(mosbles7), matsu-sensei, sakimichan, splashbrush, shinda292, bokcutter, zumi_(zumidraws)
Artists it can improve on: choujiroo, neocoill [watermark], rob_ishi, tsukishiro_saika [I kind of like how unhinged this is though], hara_(harayutaka), cutesexyrobutts, ibuo, awesomeerix [watermark], fellatrix, wamudraws
Artists with watermarks: roresu, asura_(asurauser)
do you have an example of images in the dataset that were tagged with each quality level? its hard to tell what to negative because i dont know what they look like
All pictures have been estimated with system of classifiers, then according to scores quality tags were added. "Masterpiece" - the best, "best quality" - very good. Detailed, accurate and high quality images of any kind, from simple characters to complex scenes with darkest fantasies. "low quality" and "worst quality" for very bad, like amateurs drawing, rough sketches, messy and ugly. Hope this grid can explain, prompt is simple and negative is empty.
@Minthybasis thank you. i was curious because some hentai artists that i like have drawings that i was unsure if it is classified as low quality because of sketch (mdf_an or hews as an example)
@low_channel_1503 Oh yes, some artists had a high share of low ranked pictures. In such case they have been handled manually to avoid problems, so there shouldn't be any from quality tags. However, it's impossible to keep track of everything, if you finding issues with some styles - please report.
Will do, thanks. Looking forward to the next version and good luck
some artists i'd like to see/improve:
enokido_(reido1177)
mitsudoue
shinjiro (focus on latest colored artstyle within last 1-2 years. there is also a doujin released by them recently, not sure if you will be able to add that to dataset)
takeda hiromitsu
marushin_(denwa0214)
toma_(toma50)
quasarcake
ulrich (tagaragakuin)
thanks in advance
@Minthybasis Do you train on tags like rating:general?
@low_channel_1503 No, I haven't found any positive effect from this other than wasting extra tokens. However, tested them just briefly, if you have other experience and opinions about it I'd love to hear it.
@Minthybasis I just wanted to make sure. Some models like arti use it but i dont think it matters much
@Minthybasis do you have a discord server where we can see progress or get updates?
@low_channel_1503 Oh, I'm too busy/lazy to make it, but will try soon. About progress - new release in few days.
@Minthybasis looking forward to it
Bless you for making such a wonderful model! I've used NAIv3 before and this is definitely the closest to it I've seen. When I saw an output of this model I was convinced it was NAI3 until I saw the name of this model. I only hope you'll expand the dataset of artists, there are a few missing. If you could please add fishine. I really love his style.
I have seen some SDXL models that are capable of generating simple text phrases. They are, of course, far from the level of SD3 or similar models, but still. I tried it on this model, but the result was nothing, it doesn't even perceive a single word. How difficult is it to train a model to write a single word on a sign? I also tried to generate an image where a girl’s t-shirt should have a specified print, but the model also ignores this and creates a completely random print.
Taking this opportunity, I want to ask, what stage is the training of the next version/model currently at? :3
Well, it's difficult and easy at the same time.
To make a pony-based model write even simple words with some stability, training with relevant and properly captioned pictures is needed. That's quite a task, starting from picking decent (anime) pictures with only small words and phrases and then using a VLM for captioning. Sorting pictures is the main pain in the ass here. Some effect may appear from the part of the dataset captioned with natural text, but most likely it will be barely noticeable.
However, if a model other than pony is used, all you need is to not fuck up TE training. It will just work out of the box, like you can see in Animagine or other anime models. Here is an example (sfw/questionable/nsfw) of how it works on an early epoch of my checkpoint trained from sdxl-base without any extra effort for it. (I will release it after/before the new version of 4th tail, depending on progress.)
As for current state - like 50-60%, it already looks very nice and knows some characters/styles, but not stable yet and struggles with rare concepts.
@Minthybasis Thank you for the explanation. It turns out that the author of Pony not only broke compatibility with other SDXL models but also the text component. Screenshots of your model look promising. At the very least, I don't see similar stylistic features in Pony models. Well, I'm not sure if it's style, I don't know what it's called, but all Pony models, despite having different styles, still have something that reveals their base model. If everything works out, there will be a strong SDXL anime competitor for Pony and Animagine.
Among the Pony derivatives, it's the best anime model styles- and characters-wise, nothing comes close. Fantastic job.