"We are like dwarfs sitting on the shoulders of giants. We see more, and things that are more distant, than they did, not because our sight is superior or because we are taller than they, but because they raise us up, and by their great stature add to ours." - John of Salisbury
Credit to CivitAI for providing this space for us, and to the community that seeks ever greater levels of perfection.
____________________
V4 Notes:
Starting with 'Tuna v3 as a base, I used Merged Block Weighting (MBW) and an incredible amount of trial and error. My apologies to the settings nerds out there, but I couldn't explain exactly how I did this if I tried. I used the SuperMerger extension for A1111, and I recommend it for learning how MBW works without eating up a ton of storage space.
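For anyone new to MBW, here is a minimal sketch of the general idea, not the actual recipe used for this version (which, as noted, was not recorded). It assumes two toy "models" stored as dictionaries of small lists, a hypothetical key format of `<block>.<param>`, and made-up per-block ratios. The only point is that each block gets its own blend ratio instead of one ratio for the whole model.

```python
def merge_block_weighted(model_a, model_b, block_alphas, default_alpha=0.5):
    """Blend two models with a separate ratio per block.

    alpha = 0.0 keeps model A for that block, alpha = 1.0 takes model B.
    Key format "<block>.<param>" is a hypothetical naming used for this sketch.
    """
    merged = {}
    for key, a_values in model_a.items():
        block = key.split(".")[0]
        alpha = block_alphas.get(block, default_alpha)
        b_values = model_b[key]
        merged[key] = [(1 - alpha) * a + alpha * b for a, b in zip(a_values, b_values)]
    return merged


# Toy data: two blocks, two weights each (illustrative values only).
toy_a = {"IN00.weight": [1.0, 1.0], "OUT11.weight": [1.0, 1.0]}
toy_b = {"IN00.weight": [0.0, 0.0], "OUT11.weight": [0.0, 0.0]}
toy_alphas = {"IN00": 0.0, "OUT11": 0.25}

toy_merged = merge_block_weighted(toy_a, toy_b, toy_alphas)
```

In this toy run the `IN00` block stays identical to model A, while `OUT11` moves a quarter of the way toward model B. Real tools such as SuperMerger work on full checkpoints and many more blocks.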
Shout-out to the creators of UmiAI and their team's model mixer for the coaching and data support (already linked in previous notes). All models used in the mix have already been linked and some updated versions have been included, or are private models used with explicit permission.
So, what changes? Compositions are a bit more wild and also more responsive to prompting in general. I recommend a mix of short phrases and short keyword lists. Don't go overboard, especially with negative terms. Hands are a little improved. Eyes and facial details are a bit more styled and less anime-face. Clothing and similar internal sections, shadows and lighting especially, are more detailed. Its working vocabulary has been significantly expanded.
My bias goals have been firmly achieved with minimal loss to its stylizing flexibility. Left on its own (meaning without specific prompting), it will default to light-skinned women, some nudity, and a stronger blend of outlining that takes on the best of both 'west and east' anime and cartoons.
____________________
V3 Notes:
v3: Ah, my muse, at last I have found you. Variable skin tones, impressive backgrounds and detailing, much more reliable hands, much cleaner genitals. Even more responsiveness to prompting, and it accepts about the same amount of influence from Loras as v2.
v2: More detailing added, improved hands and keyword responsiveness, but less influenced by Loras, and still kinda prone to naked white women.
fp16 (really just v1): Simple, clean style, not a lot of details, can be very influenced by Loras etc. Excellent for basic lined illustrations and can still handle more '2.5D-ish' styling. Not so good with hands, prone to naked white women.
____________________
This model is a merge of 8 different 'cartoon' style models at varying ratios to provide a much more varied 'western animation' stylizing, while still gaining a lot of responsiveness to prompts and concepts. Use any style of prompts, including danbooru tags, malformed sentence fragments, poetry, go nuts. A wild variety of concepts are recognized, keep it as simple or as complicated as you like. Because that's what I like. And women... I do like the women.
There is no VAE baked in. I recommend the standard Stable Diffusion VAEs, Clear VAE, or my own Anime VAE.
This model is very capable of soft-core NSFW content, but may struggle with hard-core concepts. Use Loras as needed.
All images were generated using only prompting, via an advanced wildcard extension called Umi AI. No additional extensions or post processing techniques were used.
If you are concerned about 'apparent age' issues, I highly recommend the following models:
Squeezer - Experimental
(A single LoRA with which you use positive strengths to age down, and negative strengths to age up, which impacts details and body type more than overall composition)
Age Slider
(A set of Textual Inversions that influence apparent age up and down by 3 levels, including negative embeddings for additional effect. They can affect the composition of your gens in chaotic ways, even with the recommended emphasis values.)
____________________
The following models were used in this merge:
(18.75%) UmiAI's Cartoon_Final_v2 (unpublished; see Mythology and Babes by DutchAlex and Macross v2)
(18.75%) Toonify v2
(12.5%) 桥洞底下盖小被,逢人就说对对对
(12.5%) Kittenchow
(12.5%) Mistoon Amethyst
(12.5%) Donated private model and used with permission
(6.25%) 23511-1546-幻色石
(6.25%) TypeB
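As a quick sanity check on the percentages listed above, the sketch below confirms they add up to 100% and shows what a plain weighted sum with those shares would look like on a single toy value. This is an assumption for illustration only: the actual merge may have been done in several pairwise steps or with per-block weights, and the short dictionary keys are just labels.

```python
from fractions import Fraction

# Shares copied from the list above, in percent.
SHARES = {
    "Cartoon_Final_v2": Fraction("18.75"),
    "Toonify v2": Fraction("18.75"),
    "桥洞底下盖小被,逢人就说对对对": Fraction("12.5"),
    "Kittenchow": Fraction("12.5"),
    "Mistoon Amethyst": Fraction("12.5"),
    "Donated private model": Fraction("12.5"),
    "23511-1546-幻色石": Fraction("6.25"),
    "TypeB": Fraction("6.25"),
}


def weighted_sum(values: dict[str, float], shares: dict[str, Fraction]) -> float:
    """Combine one value per model using the given shares as weights."""
    total = sum(shares.values())
    blended = sum(shares[name] * Fraction(values[name]) for name in shares)
    return float(blended / total)


total_percent = sum(SHARES.values())

# Toy example: only the first model contributes a non-zero value.
toy_values = {name: 0.0 for name in SHARES}
toy_values["Cartoon_Final_v2"] = 16.0
toy_result = weighted_sum(toy_values, SHARES)
```

Every share is a multiple of 6.25% (one sixteenth), so the list is at least consistent with a chain of simple-ratio merges, though that is a guess and not something stated in these notes.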
____________________
Please run your models through the Model Toolkit extension for A1111. It can fix CLIP corruption and prune models to fp32 or fp16.
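To give a feel for what the precision choice means, here is a small standard-library sketch. It is not Model Toolkit's code and it ignores the other things pruning does (such as dropping unneeded keys); it only shows how a single weight changes when stored in half precision and how storage scales with bytes per parameter. The parameter count is a hypothetical round number.

```python
import struct


def to_fp16_and_back(value: float) -> float:
    """Round-trip a number through IEEE 754 half precision (fp16)."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def weights_size_gb(param_count: int, bytes_per_param: int) -> float:
    """Approximate storage for the weights alone, in gigabytes."""
    return param_count * bytes_per_param / 1e9


HYPOTHETICAL_PARAMS = 1_000_000_000  # illustrative, not this model's real size

size_fp32 = weights_size_gb(HYPOTHETICAL_PARAMS, 4)  # fp32 uses 4 bytes per weight
size_fp16 = weights_size_gb(HYPOTHETICAL_PARAMS, 2)  # fp16 uses 2 bytes per weight
rounded = to_fp16_and_back(0.1)  # close to 0.1, but not exactly equal
```

Half precision halves the storage at the cost of a small rounding error per weight, which is usually acceptable for image generation.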
____________________
DISCLAIMER:
Like all checkpoints since the originals released by Stable Diffusion, this model is responsive to age-related keywords. It is also capable of producing NSFW content. What you do with this model is your choice. I suggest using negative prompting as needed to avoid making questionable images. I am marking this model as intended for mature audiences because of this.
____________________
Also available at Tensor.Art: https://tensor.art/models/612849265988992344
____________________
v2 Notes:
This is a complete re-work adding more top-shelf models. No additional model modifiers like LoRAs were used; this is prompting only. I chose to make grid previews with randomizing prompt structures to showcase the raw capabilities of this model. It is biased to light-skinned women and nudity. It is responsive to all styles of prompting: short, long, keyword lists, sentence fragments, go nuts. You should specify what you want, especially with regard to NSFW conditions.
____________________
- Using UmiAI's wildcard system, you can call for strings of text in both the main prompt and the negative prompt. Items appearing between pairs of asterisks ** will be placed in the negative prompt. The prompts below, followed by the 'quality tags' I used for each style type, are how the example grids were made.
<[rngfem]>: "SFW, 1Girl, Adult, ({fat|slutty|cute|muscular} <[rng_intl]>:<[W3.*]>) woman, <[rng_smol]>, (<[rng_hair_multi]>:<[W3.*]>), wearing (<[rng_colors]>:<[W3.*]>) (<[fem_outfit]> outfit:<[W3.*]>), <[rng_gem]> jewelry, <[rng_metal]> accents, glowing {iris|pupils}, <[rng_colors_ext]> eyes, (<[rng_emote]> expression:<[W3.*]>), <[qt_face]>, <[rng_dgrw]>, <[qt_25D]>, <[18+]><[negs_logos]><[negs_body]><[negs_qual]>**naked, nude, **"
<[rngxfem]>: "NSFW, 1Girl, Adult, [naked|nude] ({fat|slutty|cute|muscular} <[rng_intl]>:1.2) woman, <[BEWBS]>, <[rng_hair]>, wearing see-through <[rng_colors]> <[fem_outfit]> outfit, <[rng_gem]> jewelry, <[rng_metal]> accents, glowing {iris|pupils}, <[rng_colors]> eyes, <[rng_emote]> expression, <[qt_face]>, <[qt_nsfw]>, <[rng_dg]>, <[18+]>"
flat: "flat colors, cel shading, hard shadows, outlines, vector art**realism, photorealistic, hyperrealism, professional photography, uhd, dslr, hdr, ultra high-definition, digital single-lens reflex, high dynamic range, 8k, 3D render**"
ani: "depth of field, bokeh, god rays, vivid colors, cinematic hard lighting, smooth shadows"
25d: "subsurface scattering, ray traced, depth of field, bokeh, god rays, vivid colors, cinematic hard lighting, realistic shadows, detailed textures**flat colors, cel shading, hard shadows, vector art, 2D, sketch, background without depth**"
photo: "cinematic lighting, depth of field, bokeh, realism, photorealistic, hyperrealism, professional photography, uhd, dslr, hdr**flat colors, cel shading, hard shadows, outlines, vector art, background without depth, 3D render**"
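The double-asterisk convention used in the prompts above can be illustrated with a short sketch. This is not UmiAI's implementation, just a plain regular-expression approximation of the behavior described: text between a pair of `**` goes to the negative prompt, everything else stays in the main prompt. The function name `split_prompt` is made up for this example, and wildcard tags like `<[...]>` are not expanded.

```python
import re

# Non-greedy match of anything between a pair of double asterisks.
NEGATIVE_SPAN = re.compile(r"\*\*(.*?)\*\*", re.DOTALL)


def split_prompt(text: str) -> tuple[str, str]:
    """Return (main_prompt, negative_prompt) for a string using ** markers."""
    negatives = [span.strip(" ,") for span in NEGATIVE_SPAN.findall(text)]
    positive = NEGATIVE_SPAN.sub("", text).strip(" ,")
    negative = ", ".join(span for span in negatives if span)
    return positive, negative


# Shortened version of the 'flat' quality tags from the list above.
flat_positive, flat_negative = split_prompt(
    "flat colors, cel shading, vector art**realism, photorealistic, 3D render**"
)
```

Here `flat_positive` holds the three style tags and `flat_negative` holds the three terms that were wrapped in asterisks.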
____________________
CFG Scale 20-30 setting is enabled by the following extension: Stable Diffusion Dynamic Thresholding (CFG Scale Fix)
____________________
The following models were used in this merge:
(25%) Unpublished "Cartoon2-Final" from the author of Macross V2
(25%) Kittenchow
(25%) 桥洞底下盖小被,逢人就说对对对
(25%) Toonify
____________________
vFP16 Notes: This is my first and likely only attempt at merging models. I have no idea what I'm doing and just ran through some tutorials and hit buttons. This model is a merge of 4 different 'cartoon' style models to give a blend of more 'western' stylizing while gaining a lot of responsiveness to prompts. Seriously, no need for a prompting guide, it responds to all styles, including anime tags and 'natural language' sentence fragments. This model works well with most LoRAs and negative embeddings. It still struggles a bit with hands and the occasional extra limb, but otherwise outputs are solid.
Description
This will likely be the final version. I am super happy with the results, and I think you'll like it too.
Comments (12)
Holy Crap! 😅 I'm so sorry to the 50+ folks or so that downloaded the initial v3-inpainting version. I messed up on the file names and uploaded the wrong one. I have now uploaded the proper version as well as updated the sample images.
Whenever you use this model, you might want to disable "restore faces", as otherwise it will try to restore the generated faces into a more realistic style and thus completely mess them up, especially at lower resolutions.
Took me a loooong time and quite a bit of testing to realize what was happening. I was about to give up on using the model. lol
Also, ShadowStar, you may want to add this warning of mine into the description, where it will be more visible.
Restore Faces is meant for photorealistic models anyway, I wouldn't recommend it for any anime or toon models.
Any plans to create a SDXL model/lora?
Yes, but I'm still waiting for the dust to settle. There's a lot of debate going on about the best training methods and workflows for SDXL and I barely know what I'm doing. Thankfully I've got some knowledgeable friends in the scene that are helping me learn.
@IShadowStar Thanks so much for your contribution. Let's hope to have news from you then. God speed
Awesome model, I love it!
I just want to say thank you so much! I've been struggling for ages to recreate an old 70's cartoon character until I tried your checkpoint. For the first time ever, I can do it.
Idk if I'm doing something wrong, but the v3 inpaint version loves to make the skin gray, for some reason
turns out the inpainting model is worse at inpaint than the base model...
So yeah, I wasn't ever very happy with it and just removed it. Frankly, the latest toys work just fine with normal checkpoints.
Thank you very much for this wonderful model 🤩