CivArchive

    Please Read Description

    NatViS (Natural Vision) is a photorealistic full-parameter fine-tune of SDXL that uses Natural Language prompting to generate high-quality SFW/NSFW images. Trained on 1M+ image-caption pairs from a dataset that has been expanded and refined for over a year.

    v3.0 is being rebuilt from the ground up to expand the knowledge domain and improve text-image alignment across various prompting styles.

    Current v3.0 Status: Data Procurement

    As of right now I can only work on the update in my spare time so there's no planned release date.

    Please message me on Ko-Fi (below) to give feedback and suggestions. Email and public discord will be up soon!


    Buy me a coffee ❤

    https://ko-fi.com/ndimensional

    I’ve never been a fan of e-begging; however, SDXL fine-tunes at this scale are becoming expensive to train. So I will begrudgingly ask: if you like what I do and would like to support my models, consider donating on Ko-Fi 💗
    I will begin posting updates, answering questions, taking feedback, and releasing early access (NOT EXCLUSIVE) models to supporters.

    All donations will be used to fund the creation of new Stable Diffusion fine-tunes and open-source AI tools.


    Changelog

    ============

    11-24-24 NatViS v2.7 Hyper 4step and link for 4step Lightning (🤗)

    • Uploaded 4step Hyper variant of NatViS v2.7. See About this version for more info.

    • Lightning: 4step Lightning variant of v2.7 can be found HERE for the time being. 8step Lightning will be uploaded within a day of writing.

    • Note: Sample images are limited because of time constraints.

    ============

    11-21-24 NatViS v2.7 Hyper 8step

    • Released 8step Hyper variant of NatViS v2.7 with consistent CFG. See About this version for more info.

    11-18-24 NatViS v2.7

    • Due to time constraints, pre-release changelog can be viewed HERE for the time being.

    • Note: I was bored of generating the same sample images over and over again and decided to spice things up with some new prompts. Prompts from previous versions will work with v2.7. When I have time, I'll upload a separate gallery for images generated with the old prompts.

    ============

    10-26-24 NatViS v2.5 Lightning 4step (Not Recommended!):

    • Uploaded 4step Lightning version of NatViS 2.5

    • ONLY USE IF NEEDED

    ============

    10-25-24 NatViS v2.5 Lightning 8step

    • Released 8step Lightning version of NatViS v2.5. Read About this version

      • Note: Unlike my previous 8step lightning releases, this version is a simple merge with the SDXL Lightning LoRA. I did this due to requests for low CFG.

        • Sample images may not be the best representation of the model as a result of me not fully understanding the quirks of Lightning.

      • I will be releasing the FULL CFG 8step lightning version as well, since it appears to preserve more of the fine-grained features from the fine-tune.

    ============

    10-23-24 NatViS v2.5

    What's New?

    • Uploaded NatViS v2.5

      • Updates to text-encoder(s) to reintroduce tag/booru-style prompting capabilities that were broken in v2.0

      • Subset of data included from new (improved) dataset, specifically image-caption pairs with short n' punchy captions.

        • Info on new dataset (for future models/update): Includes more variation of caption styles and all automation is manually verified by a human (i.e., me).

      • Introduced more analog photography and classic cinematic film image data to further the push for more authentic realism.

    What's Next?

    • General: Review SD3.5 license to see if it's worth touching.

      • It's not terrible. Will start research into the model's architecture for fine-tuning/LoRA.

    • General: Release Anti-Pony Alpha model (Anime, Digital Illustrations).

      • In advance, it's not nearly as robust as Pony. This is a test to see if there's enough interest in the idea to pursue crowd funding for training.

      • Trained with character knowledge and quality in mind, novel booru+ tagging system & natural language prompting, multiple styles/mediums, artist knowledge, no silly quality ranking tags, SDXL compatible (i.e., not overfit and broken)

      • More info will come out soon.

    • NatViS: Release of Lightning variants for NatViS v2.5.

      • Done more effectively this time.

    • NatViS: Finally getting around to creating and releasing a PDF guide.

    • NatViS: Continue fine-tuning of v3.0.

    ============

    10-2-24 NatViS v2.0 Lightning 4step

    • Uploaded 4step lightning model for v2.0

    ============

    10-1-24 NatViS v2.0 Lightning 8step

    • Uploaded 8step lightning models for v2.0

    ============

    9-25-24 NatViS v2.0

    What's New?

    • Prompting: This update focuses primarily on the text-encoders. Natural language prompting capabilities have been improved to follow less-strict formats and rely less on specific tokens.

    • Ethnicity and Demonym: Increased accuracy of phenotypes for various ethnicities and demonyms. This is not limited to body structure; it also covers clothing, hair, landscapes, etc. See here for small examples.

    • Camera EXIF: Inclusion of promptable camera EXIF data for popular modern and analog cameras. Includes camera name, focal length, f-stop, ISO, shutter speed, and lens type, as well as attachments such as ND filters and polarizers.

    • Analog: Improvements to analog and vintage photograph generations.

    • Lighting and shadow: Prompt how light (or lack thereof) interacts with objects/subjects in the scene, among other general lighting-related modifiers. More info soon.

    • Skin Textures: Small improvements to the detail of skin textures with fewer or no explicit tokens related to skin detail.

    • Implementation of Pseudo Instruction: This will require a more lengthy write-up.

    • Better male anatomy.

    • Lesbians.

    What's Next?

    • Lightning models will be released within the coming days.

    • Full PDF guide and documentation within the next week.

    • Info on v3.0 within the next month.

    8/4/24 NatViS v1.0 Lightning 4step

    • Uploaded 4step lightning version of v1.0 (See About this version for more info).

    ============

    8/3/24 NatViS v1.0 Lightning 8step

    • Uploaded 8step lightning version of v1.0 (See About this version for more info)

    ============

    8/2/24 NatViS v1.0

    • Initial Release


    Usage Tips

    Note: These are simply recommendations, feel free to experiment.

    Prompting

    NatViS leverages SDXL’s bigG text-encoder to allow for Natural Language prompting.

    What is Natural Language Prompting?
    Since the release of Stable Diffusion v1.4, people have become accustomed to comma-delimited lists of visually descriptive tags/phrases. This was a necessity for early Stable Diffusion models due to the architecture and choice of text-encoder. With SDXL’s dual text-encoder/tokenizer architecture, we are able to write more naturally descriptive prompts.

    Simply describe the image you want to generate, just as you would describe the image to a person.

    For example:
    Comma-delimited list: a woman, standing, outdoors, sun beams, dappled light, apple tree, wearing denim jeans, flannel shirt, brown hair, long hair, looking at viewer, highest quality, atmospheric, 35mm, masterpiece

    Natural Language: A masterpiece, 35mm-style photo of a woman with long brown hair, standing outdoors in dappled sunlight beneath an apple tree. She wears denim jeans and a flannel shirt, gazing directly at the viewer with an atmospheric quality.

    Note: This is just an example to highlight how to write a natural language prompt. For better examples, see the sample images.

    Will NatViS Understand Everything I tell it?
    Absolutely not.
    Due to various limitations in both the architecture and the amount of data I’m able to fine-tune on as one person, there will be instances where the model simply will not generate what you want. Often, you can experiment with different wording, change the placement of tokens (i.e., move a sentence or individual token closer to the start or end of the prompt), remove potentially conflicting tokens, etc. There really is no definitive solution I can offer, as it varies from prompt to prompt. Unfortunately, there will be times when no solution/workaround is successful.

    Can I still use Tags?
    Short answer: Yes
    SDXL’s dual text-encoder/tokenizer architecture can process tokens/sequences with both encoders in parallel, meaning you don’t have to use natural language prompting.


    Note: Since the training data was captioned purely with Natural Language descriptions, not all the common descriptive tags people are familiar with will be understood by the model, especially Booru-style tags.

    I found a hybrid system works well, as seen in many of the sample images.


    For example:
    Say you tried your natural language prompt but want to make the results a bit more cinematic. Instead of modifying the entire prompt, you can simply append cinematic lighting, harmonious, film still, etc. to the end of your prompt.
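
    The hybrid approach can be sketched as a small helper that appends style tags to a natural language prompt. This is only an illustration: `append_style_tags` is a hypothetical name, not part of NatViS or any UI, and the tags are the ones mentioned in the example.

```python
def append_style_tags(prompt: str, tags: list[str]) -> str:
    """Append comma-separated style tags to a natural language prompt.

    Hypothetical helper for illustration only. Tags whose text already
    appears in the prompt are skipped so they are not repeated.
    """
    base = prompt.strip().rstrip(",")
    lowered = base.lower()
    extra = [t.strip() for t in tags if t.strip() and t.strip().lower() not in lowered]
    if not extra:
        return base
    # After a finished sentence a space is enough; otherwise continue the tag list.
    sep = " " if base.endswith((".", "!", "?")) else ", "
    return base + sep + ", ".join(extra)


prompt = (
    "A 35mm-style photo of a woman with long brown hair, standing outdoors "
    "in dappled sunlight beneath an apple tree."
)
hybrid = append_style_tags(prompt, ["cinematic lighting", "harmonious", "film still"])
print(hybrid)
```

    Keeping the sentence intact and only touching the tail makes it easy to compare results with and without the extra tags on the same seed.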

    Quality Tags/Classifiers? (score_up_x)
    Blasphemy.
    You can use quality rank/classifiers if you want. But they were not part of the training data.

    Negative Prompt
    Similar to other SDXL models. Use tags separated with commas and keep it short. Add/Remove tokens from the negative prompt as needed.

    Generation Parameters

    CFG:

    • Recommended: 5-7

    • 7+ to enforce a specific style/medium

    Sampler/Sampling Steps:
    This can be quite subjective, so I will just share what I typically use instead of giving direct recommendations.

    • Sampler - DPM++ 2M SDE

    • Scheduler - Karras

    • Steps - 55
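
    For anyone scripting generations, the CFG, sampler, scheduler, and step values above could be collected into a request body like the sketch below. The field names are an assumption: they follow the AUTOMATIC1111 web UI `/sdapi/v1/txt2img` API as I understand it, and the `scheduler` field in particular may not exist on older builds, so check the API docs of your own install. `build_txt2img_payload` is a hypothetical helper, and the 832x1216 default is taken from the comments rather than from an official recommendation. The code only builds and prints the body; it does not send anything.

```python
import json


def build_txt2img_payload(prompt, negative_prompt="", cfg_scale=6.0, steps=55,
                          width=832, height=1216, seed=-1):
    """Build a txt2img request body from the settings listed on this page.

    Hypothetical helper; field names are assumed to match the
    AUTOMATIC1111 web UI API and should be verified locally.
    """
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "sampler_name": "DPM++ 2M SDE",  # sampler the author typically uses
        "scheduler": "Karras",           # assumed field name; may be absent on older builds
        "steps": steps,                  # 55 is the author's usual value
        "cfg_scale": cfg_scale,          # 5-7 recommended, 7+ to enforce a style/medium
        "width": width,
        "height": height,
        "seed": seed,                    # -1 is assumed to mean "random seed"
    }


payload = build_txt2img_payload(
    "A 35mm-style photo of a woman standing beneath an apple tree.",
    negative_prompt="blurry, lowres",
)
print(json.dumps(payload, indent=2))
```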

    ADetailer: (Extension)
    Link
    Again, subjective so I’ll just share my settings.

    • Model - mediapipe_face_full (use mediapipe for photorealism)

    • Confidence - 0.45

    • Everything else is default.

    CFG Rescale: (Extension)
    Link
    I forgot that I had this installed, and I’m not quite sure whether it was enforcing zero terminal SNR on the noise schedule. Since the parameter was null, it shouldn’t have been.

    • Phi - 0


    Important

    If you struggle to replicate the sample images, even with the exact seed and parameters, it’s likely because of the noise scheduler. I had enabled the fix for this in Webui, but have since reinstalled Webui and forgot to re-enable it. This only applies to V1 of NatViS.


    Training Info

    TO-DO
    This will take a while to write up. So in the meantime:
    TL;DR: 1M+ images, processed/cleaned via a personal Dataset Toolkit I’m developing, captioned via a Multimodal Large Language Model (MLLM) with unified feature space (part of the Dataset Toolkit, not GPT). Training Data, Configs, and Custom Scripts will be made available and open-sourced when the final version is released. The Dataset Toolkit has no announced release date.


    Check out my other models

    SDXL Checkpoints: https://civarchive.com/collections/966964

    SDXL LoRAs: https://civarchive.com/collections/966969

    40K Series: https://civarchive.com/collections/956187

    SD1.5 Checkpoints: https://civarchive.com/collections/966974

    SD1.5 LoRAs: https://civarchive.com/collections/966972


    Run On TensorArt (v1)


    🤗Huggingface Repo

    🤗Huggingface Repo - Lightning

    🤗Huggingface Repo - Hyper

    Description

    4step Lightning Version of v1.0 For Fast Inference.

    Recommended Parameters:

    • CFG: 1-2

    • Steps: 4-6
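
    Because these ranges differ so much from the base model's settings, a quick sanity check before generating can catch values carried over by mistake. The sketch below is a hypothetical helper (`out_of_range` and `LIGHTNING_4STEP` are made-up names) that only encodes the two ranges listed above.

```python
# Recommended ranges for the 4step Lightning version, copied from this page.
LIGHTNING_4STEP = {"cfg": (1.0, 2.0), "steps": (4, 6)}


def out_of_range(cfg: float, steps: int, preset: dict = LIGHTNING_4STEP) -> list[str]:
    """Return a list of warnings for settings outside the recommended ranges."""
    problems = []
    low, high = preset["cfg"]
    if not low <= cfg <= high:
        problems.append(f"CFG {cfg} is outside the recommended {low}-{high}")
    low, high = preset["steps"]
    if not low <= steps <= high:
        problems.append(f"{steps} steps is outside the recommended {low}-{high}")
    return problems


# Base-model settings used with the Lightning checkpoint by mistake.
print(out_of_range(cfg=6.0, steps=55))
# Settings inside both recommended ranges.
print(out_of_range(cfg=1.5, steps=4))
```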

    FAQ

    Comments (40)

    fantaseed
    Aug 4, 2024
    CivitAI

    @ndimensional

    Is it correct to say that in terms of realism NatViS is inferior to Clarity XL (which focuses purely on photorealism)?

    ndimensional
    Author
    Aug 4, 2024· 4 reactions

    It's hard to say. Both models focus on photorealism, just in different ways.
    Clarity XL is often more cinematic, with more dramatic lighting and vivid colors, by default.
    NatViS is also capable of this, but with more work (via prompt), as it defaults to a more amateur photography aesthetic.

    NatViS shares small parts of the Clarity dataset, though with different captions.


    Since you mentioned it, Clarity XL will be getting an update this month 😉

    fantaseed
    Aug 4, 2024· 1 reaction

    @ndimensional 

    Thank you for letting me know.

    I thought of NatViS as the standard(natural) and interpreted [Clarity XL] ≒ [NatViS makeup].

    PineAmbassador
    Aug 4, 2024· 7 reactions
    CivitAI

    I've never seen any finetune like this. It's next level.

    amazingbeauty
    Aug 4, 2024
    CivitAI

    In yes or no, please: will your 8step model work better than the regular v1 with the 4step, 1 CFG Lightning LoRA added?

    ndimensional
    Author
    Aug 4, 2024· 1 reaction

    I haven't tried using any of the lightning LoRAs on the non-distilled v1 model.
    Technically though, yes. The 8step model should outperform using a 4step LoRA on the v1 model in terms of quality.

    amazingbeauty
    Aug 4, 2024
    CivitAI

    Will using the 4step, 1 CFG Lightning LoRA with your v1 model decrease the quality of the text encoder or the overall quality? Will it be a major decrease or just a little?

    eznorb
    Aug 4, 2024· 8 reactions
    CivitAI

    cant wait to see the next version!

    deGENERATIVE_SQUAD
    Aug 4, 2024
    CivitAI

    Could you please publish a script for mixing LORA with the model? There are more interesting options for mixing, such as PCM.

    TrueToLife_Fauxto
    Aug 5, 2024
    CivitAI

    Could you perhaps recommend an upscaler and denoising combo that you've tried that works for you?

    vanillah
    Aug 5, 2024· 11 reactions
    CivitAI

    I've been playing around with this some more and gents, I think this is better than ANY established SDXL NSFW photo model out there.

    Prompt understanding is definitely a level above. That indescribable aesthetic once you prompt for it.

    I might as well just delete half of the models and loras I currently have, coz this just made them redundant.

    Emilyperversions
    Aug 5, 2024· 8 reactions
    CivitAI

    By far the most realistic skin texture. I have no idea how you succeeded where others failed. Some came close but not as realistic as this. Congratulations!

    xG00N3Rx
    Aug 8, 2024· 1 reaction

    I was getting bored of SDXL, but this checkpoint alone blew my mind and got me back into making gens... amazing!!!!

    kunde2
    Aug 13, 2024

    @ablestable69420 Absolutely, this is uniquely great!

    amazingbeauty
    Aug 6, 2024
    CivitAI

    Is there any loss from using your model with 4step LoRAs, other than the usual slight loss in quality? Any loss in prompt understanding?

    ndimensional
    Author
    Aug 7, 2024· 1 reaction

    Yes, there's some overall loss with the 4step version. Nothing major, but there is loss. The degree seems to depend on the complexity of the prompt. When I have the time, I'm going to look into updating the 4step version using a different method.

    amazingbeauty
    Aug 8, 2024

    @ndimensional So would your regular v1.0 model, with the 4step LoRA added manually in Comfy, be the same as your 4step model at this moment?

    ndimensional
    Author
    Aug 8, 2024· 1 reaction

    @amazingbeauty I haven't tried it personally. But on the discord thread linked in the model's description, you can find a Comfy workflow for doing just that. Doing it manually looks like it could be better*. I can't say for certain though.

    garifetdinov604
    Aug 8, 2024
    CivitAI

    Would love to see this model merged with the Boomer Art Model (BAM!).

    ndimensional
    Author
    Aug 9, 2024· 1 reaction

    Interesting idea. I've been toying around with the idea of creating a ProjectAIO merge for SDXL. Something to add a buffer between model updates. ProjectAIO being a model I made for SD1.5, where I merged all my SD1.5 fine-tunes/merges into one model. It was a silly idea that created some interesting results.
    btw, Boomer Art Model v3 is in the data cleansing phase atm and should start tuning within the next week 😊

    garifetdinov604
    Aug 11, 2024

    @ndimensional Great news!

    _1_
    Aug 13, 2024

    Merging takes only a minute, why not do it yourself? Or can't you run it locally?

    txtsword
    Aug 10, 2024
    CivitAI

    This is an incredible model. The photorealism is a huge upgrade from Pony Realism. Sometimes it seems overfit on poses and it can be hard to prompt a very specific scene, but the quality is incredibly high. If nothing else this can be a good model to switch to for adetailing faces or just to img2img with low denoising to bring more realism to your scene.

    12734
    Aug 11, 2024· 7 reactions
    CivitAI

    This is a wonderful model. Very good at cinematic shots.

    I think the model is poorly named and hard to find. It should be a lot more popular than it is.

    Even as a user of it I need to go back to my bookmarks to find it again. maybe it's just me not paying attention...

    MisterMr160
    Aug 12, 2024· 8 reactions
    CivitAI

    In terms of realism this is the best model I can find by a large margin. Please don't stop this project. This is incredible. Maybe a name change would be helpful and just keep updating it to keep it top of line.

    StreamofStars
    Aug 13, 2024
    CivitAI

    A very nice and versatile model with a unique hidden strength when it comes to realism.

    7evenSDXL
    Aug 13, 2024· 1 reaction
    CivitAI

    awesome model! what image size do you recommend? for now I have only tried in 832x1216, I get good results :)

    vanillah
    Aug 14, 2024· 2 reactions

    Here you go:

    Recommended Generation Dimensions:

    1344x768 (16:9) — Cinematic Film Stills

    1536x640 (21:9) — Ultrawide Cinematic Film Stills

    1152x896 (4:3) — Fullscreen

    1216x832 (3:2) — Mobile landscape

    1024x1024 (1:1) — Square

    1024x704 (16:11)

    768x1344 (9:16) — Tall (Instagram stories / snapchat)

    896x1152 (3:4)

    832x1216 (2:3) — Mobile Portrait

    704x1024 (11:16)

    7evenSDXL
    Aug 14, 2024

    @vanillah thank you :D

    codadmin
    Aug 14, 2024· 3 reactions
    CivitAI

    Holy cow, this model delivers like none other! Absolutely terrific at generating different ethnicities, situations. Great at lighting and textures.

    gl4mdiva
    Aug 17, 2024· 6 reactions
    CivitAI

    Right now by far the best model for creating fuzzy and fur stuff - realistic style. I have tested it for some days now....and as you see in my gallery pics below, the results are stunning. The prompting follows a natural structure which at first is "not what you have been used to".....but is superior to what you know. The prompt recognition is far above average. I cannot post NSFW pictures here for personal reasons...but "great" is the description of what you get -> nothing less! Thanks for sharing it. It unlocks a huge amount of new options. Guys go for it, like it, post gallery pictures, support the creator.

    anubhavk198
    Aug 18, 2024· 1 reaction
    CivitAI

    BEst model EvEr

    LbhMan
    Aug 19, 2024
    CivitAI

    Thanks for this checkpoint. Good quality with few steps.

    fox23vang226
    Aug 23, 2024· 9 reactions
    CivitAI

    This model is being slept on/underrated. I dislike Pony a lot so I'm always looking for the best realistic NSFW model and as of 8-23-24 this model is literally the best NSFW model available on CivitAI; #1. It even gets hands/fingers almost at the accuracy level of Flux.1.

    fansterai
    Aug 26, 2024· 11 reactions
    CivitAI

    Amazing work! Looking forward to v2!

    Corbe
    Aug 29, 2024· 4 reactions
    CivitAI

    Excellent model, I prefer using DPM++ 2M SDE Exponential at cfg 4-6 and around 20-30 steps

    huangximing
    Aug 31, 2024
    CivitAI

    this is the sense of light i used,mechanism,action and other best models,very good works,thanks,

    hungdongfb
    Sep 8, 2024· 12 reactions
    CivitAI

    My new favorite model.

    anubhavk198
    Sep 17, 2024· 3 reactions
    CivitAI

    Any plans for an update?

    ffjggrtbjibv
    Sep 18, 2024· 3 reactions
    CivitAI

    This model is amazing. It has Porn capabilities that rival BigASP but with better quality and much better Prompt adherence. I can't wait for V2