CivArchive

    What is this?

    I would describe it like so: an abnormally versatile SD 1.5 model with extensive custom training done exclusively at 1024px and higher (thanks to "bucketing"). It is built up in a clean, additive, iterative fashion on an ongoing basis using CivitAI's handy online Lora trainer. It can do everything from pretty landscapes to hardcore booru-tag based NSFW in pretty much any style. It is not specifically an anime, realistic, or semirealistic checkpoint, but rather whichever of those you want it to be at any given time. All showcase images are direct generations made without any detailing or upscaling whatsoever (i.e. you should basically treat this like an XL model when using it), and include full metadata.

    How do I use it?

    You can use either natural language or booru tags (with spaces, not underscores). I tend to use both simultaneously: mostly coherent sentences in which many of the words and phrases are specific tags that actually exist. See the showcase gallery for a variety of examples. In terms of resolution, in my opinion there is no point in ever going lower than 768x768 with this model (as 100% of my training is done at 1024px without downscaling or cropping anything).
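
    A minimal sketch of the tag formatting described above, not part of the model or any official tooling. The helper names `normalize_tag` and `build_prompt` are hypothetical. It assumes a front end where bare parentheses act as emphasis syntax (as in AUTOMATIC1111-style UIs), which is why parentheses are escaped in the same way as the photo \(medium\) tag mentioned elsewhere on this page.

```python
import re


def normalize_tag(tag: str) -> str:
    """Turn a raw booru tag into the spaced form used in prompts.

    Underscores become spaces, and parentheses that are not already
    escaped get a backslash (assumption: the UI treats bare
    parentheses as emphasis markers).
    """
    text = tag.strip().replace("_", " ")
    return re.sub(r"(?<!\\)([()])", r"\\\1", text)


def build_prompt(sentence: str, tags: list) -> str:
    """Join a natural-language sentence and normalized tags with commas."""
    parts = [sentence.strip()] + [normalize_tag(t) for t in tags]
    return ", ".join(p for p in parts if p)


if __name__ == "__main__":
    print(build_prompt(
        "a lighthouse on a cliff at dusk",
        ["detailed_background", "photo_(medium)"],
    ))
```

    Tags that are already escaped pass through unchanged, so the helper can be applied to a mixed list without double-escaping.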

    Personally, I do not ever generate lower than 1024x768 or 768x1024 with this, and more often actually do 1216x832 and 832x1216 when it comes to non-square-format images. For square format I personally stick to 1024x1024. Again, you can download my showcase images at their original resolution with full metadata to get a better idea of what this thing can do, as it is also trained on some less common "exotic" aspect ratios / resolutions too.
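
    As a rough illustration of the resolution advice above, the sketch below picks whichever of the listed sizes is closest to a desired aspect ratio. The list contains only the sizes named in this description; the function name `pick_resolution` and the log-ratio distance are assumptions made for the example, not anything the model requires.

```python
import math

# Sizes named in the description above (width, height).
RECOMMENDED_SIZES = [
    (1024, 1024),
    (1216, 832),
    (832, 1216),
    (1024, 768),
    (768, 1024),
]


def pick_resolution(target_ratio: float) -> tuple:
    """Return the listed (width, height) closest to target_ratio (width / height).

    Ratios are compared in log space so that 2:1 and 1:2 count as
    equally far from square.
    """
    if target_ratio <= 0:
        raise ValueError("target_ratio must be positive")
    return min(
        RECOMMENDED_SIZES,
        key=lambda wh: abs(math.log(wh[0] / wh[1]) - math.log(target_ratio)),
    )


if __name__ == "__main__":
    print(pick_resolution(16 / 9))
    print(pick_resolution(2 / 3))
```

    Other "exotic" sizes from the showcase metadata could be appended to the list in the same way.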

    Also note that if you're prompting for 2D-style images, this model DOES recognize a large selection of "by whoever" artist tags (some stronger than others), so if there's one you have in mind just try it.

    Tip: generally speaking, SDE samplers provide better results with this model if you're going for realism. I personally am a big fan of DPM++ 3M SDE GPU Exponential, at around 4.0 - 4.5 CFG. For anything less realistic, however, you may also want to simply try Euler Ancestral (or very occasionally DPM++ 2M Karras) at around CFG 7.0.
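
    The tip above can be written down as a small lookup, sketched here under stated assumptions: the `SamplerPreset` class and the `preset_for` function are hypothetical, and the sampler names are copied as written in this description, so they may need adjusting to match the labels in a particular UI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplerPreset:
    sampler: str
    cfg_range: tuple          # (low, high) CFG scale suggested in the tip
    alternates: tuple = ()    # samplers mentioned as occasional alternatives

    def midpoint_cfg(self) -> float:
        """Middle of the suggested CFG range, as a starting value."""
        low, high = self.cfg_range
        return (low + high) / 2


# Values taken from the tip above; names are as written there.
REALISM = SamplerPreset("DPM++ 3M SDE GPU Exponential", (4.0, 4.5))
STYLIZED = SamplerPreset("Euler Ancestral", (7.0, 7.0), ("DPM++ 2M Karras",))


def preset_for(realistic: bool) -> SamplerPreset:
    """Pick the starting preset for a realistic or a less realistic prompt."""
    return REALISM if realistic else STYLIZED


if __name__ == "__main__":
    preset = preset_for(realistic=True)
    print(preset.sampler, preset.midpoint_cfg())
```

    These are starting points only; the description says "around" for both CFG values, so nearby settings are worth trying.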

    Do masterpiece, best quality, high quality, worst quality, and so on exist in this model?

    Yes, but their impact on the image is much smaller if your overall prompt is for realism or semirealism. They have the most noticeable impact on 2D-style images. However, detailed background and simple background DO both have the impact you'd expect on all types of images, generally speaking.

    V7.0 Eta Details:

    Better realism, and I think prompt adherence should be the best it's ever been. Really happy with this version. VAE baked in as always.

    V6.5 Zeta Plus Details:

    It's not quite what Zootvision V7 Eta is intended to be, yet. But it makes some nice, perhaps subtle, improvements. This time I tried to put a bit more stress on the actual depth of the model in the showcase gallery images. VAE is baked in as always.

    V6.0 Zeta Details:

    Improved basically everything TBH. Did all the stuff I talked about in the comments, and a bunch more. Made some pretty weird showcase gens just to kinda show off a bit more of what this thing can actually do, lol. VAE is baked in as always. Also don't forget that this model does in fact know a very large number of "by whoever" Booru-format artist tags. It's not only the specific ones you've seen me mention before!

    V5.0 Epsilon Details:

    Trained for an additional 10,000 steps on top of V4.0 Delta, on a variety of subjects (photorealism, NSFW, and anime have all been at least somewhat refined). This version also introduces an Ideogram style dataset, which can be triggered by using 'by ideogram' in any prompt. See the showcase gallery for some examples. I think this is a pretty solid improvement over Delta, hope you enjoy it! VAE is baked in as always.

    V4.0 Delta Details:

    Two additional datasets merged in (one for further enhancement of photographic images of people and places, one for some experimental "tricky prompt" rich captioning stuff), both trained on V3.0 Gamma for a combined total of 9040 steps. VAE is baked in as always. All data in the new photographic dataset was tagged with photo \(medium\) in order to build on top of the model's existing understanding of that tag. This is definitely the best version yet, hope you enjoy it!

    V3.0 Gamma Details:

    1000-image "aesthetic" dataset (trained for 10,000 steps on V2.0 Beta) merged in. This dataset can be optionally strengthened by using the phrase very aesthetic anywhere in your prompt. This version has a VAE already baked in, as always.

    V2.0 Beta Details:

    Merged with 1000-image "NSFW Enhancer" dataset (trained for 10,000 steps on V1.0 Alpha). All images were at least 1024px on at least one side, up to a maximum of 1216 (for XL-style 832x1216 portrait / 1216x832 landscape images, of which there were a fair number).

    V1.0 Alpha Details:

    My (incomplete) attempt at a truly general-purpose high-resolution-focused SD 1.5 model, in the sense of anything from pretty landscapes to hardcore booru-tag based NSFW porn.

    Uploading to CivitAI in its current state basically for the sole purpose of using their Lora trainer for a few more 1000-image datasets I need to get trained and merged into this thing. Feel free to try it out regardless if you like (it knows many characters, see e.g. Jinx in the showcase), but expect relatively different results from later versions and the final version.

    General (always relevant) details:

    DO NOT blindly assume that Clip Skip 2 is always "correct" with this model. It is not really traditionally NAI-derived at all. If you've found a particular seed that you mostly like but that isn't quite "there" for a given prompt, I'd recommend just trying both Clip Skip 1 and Clip Skip 2, as in my testing both give good results under different circumstances.
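
    One way to act on this advice is to render the same seed and prompt once per Clip Skip value and compare the results side by side. The sketch below only builds the two settings dictionaries; the function name `clip_skip_variants` and the `"clip_skip"` key are hypothetical placeholders, and passing the settings to an actual generation backend is left out because that depends on the UI or library in use.

```python
from typing import Dict, List, Sequence


def clip_skip_variants(
    base_settings: Dict[str, object],
    values: Sequence[int] = (1, 2),
) -> List[Dict[str, object]]:
    """Return one copy of base_settings per Clip Skip value.

    Everything else (seed, prompt, sampler, ...) is kept identical so
    that Clip Skip is the only difference between the renders.
    """
    return [{**base_settings, "clip_skip": value} for value in values]


if __name__ == "__main__":
    base = {"seed": 1234, "prompt": "a lighthouse on a cliff at dusk"}
    for settings in clip_skip_variants(base):
        print(settings)
```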

    Comments (3)

    Kitten123
    Jun 7, 2024
    CivitAI

    What type of NSFW does it contain?

    ZootAllures9111
    Author
    Jun 7, 2024

    You should be able to prompt for just about any concept associated with an actual Booru tag. At the moment it's definitely easier to get coherent 2D images of some things than semireal / real ones, but improving the model in that regard is something I already started working on with the "NSFW Enhancer" dataset merged into V2.

    ZootAllures9111
    Author
    Jun 8, 2024 · 5 reactions
    CivitAI

    Up next for V5: Ideogram dataset! It will have its own "by ideogram" artist tag appended at the end of the original caption for each image, to keep things compartmentalized nicely. I may also do some further photographic and NSFW tuning.

    Checkpoint
    SD 1.5

    Details

    Downloads: 230
    Platform: CivitAI
    Platform Status: Available
    Created: 6/7/2024
    Updated: 5/12/2026
    Deleted: -

    Files

    zootvisionEta_v40Delta.safetensors
