CivArchive

    This is my experiment in creating an SD1.5 merge-style model for i2i. I am satisfied with the current state, but I will update it if I feel anything is needed, or I might share ways to enhance this model rather than updating it directly.

    ■Since the model now has both anime and real versions, the detailed explanations have been moved to each model’s tab.

    Both are merged models created by selecting multiple high-quality models with minimal artifacts.

    ■1024px_model

    ●Thanks to merging with NAI V2 and DoRA fine-tuning, the 1024px model has largely solved the problems associated with the 512px model. It features high resolution and much better tag adherence. As a result, the drawbacks typically pointed out regarding SD1.5 largely do not apply to this specific model.

    Personally, I see it as a game-changer that significantly extends the limits of SD1.5. I strongly suggest giving it a try.

    ●That being said, even at 1024px, fine details like the eyes are often insufficient. I highly recommend upscaling further using i2i.

    ●kohya_deep_shrink is also effective for t2i, so it might be a good idea to try using it.

    Doing so can sometimes reduce the breakdown of backgrounds and fingers, leading to more stable results.

    ■512px_model

    With the three models—asian, real, and anime—now available, it could be fun to adjust their mix to find your ideal style.

    ●asian 0.5 + real 0.5 might yield a more mixed, half-and-half look.

    ●asian 0.5 + anime 0.5 might produce a cute, 2.5D-style appearance.

    Feel free to experiment with different ratios.
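
    The ratios above describe a plain weighted-sum merge. Here is a minimal sketch of that idea, assuming each checkpoint is a dictionary of weights with identical keys. The toy lists stand in for real tensors, and the names weighted_merge, asian, and anime are hypothetical; a real merge would load the .safetensors files with a merge tool or library instead.

```python
from typing import Dict, List

# Toy stand-in for a checkpoint: parameter name -> flat list of weights.
StateDict = Dict[str, List[float]]


def weighted_merge(models: Dict[str, StateDict], ratios: Dict[str, float]) -> StateDict:
    """Blend checkpoints key by key; the ratios are expected to sum to 1."""
    total = sum(ratios.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError(f"ratios sum to {total}, expected 1.0")

    names = list(ratios)
    reference = models[names[0]]
    merged: StateDict = {}
    for key, ref_weights in reference.items():
        # Weighted sum of the same parameter across all selected models.
        merged[key] = [
            sum(ratios[name] * models[name][key][i] for name in names)
            for i in range(len(ref_weights))
        ]
    return merged


# Hypothetical toy weights, not taken from the actual models.
asian = {"unet.w": [1.0, 2.0]}
anime = {"unet.w": [3.0, 6.0]}

blend = weighted_merge(
    {"asian": asian, "anime": anime},
    {"asian": 0.5, "anime": 0.5},
)
print(blend)
```

    Changing the ratios to, say, 0.7 and 0.3 shifts the result toward the first model, which is the knob being suggested here.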

    ■Since this is just a merge, it shares a common SD1.5 limitation where NSFW tags may not be fully understood or followed.

    I have decided to manage the concept-enhancing LoRA separately.

    https://civarchive.com/models/1253884/sd15loralab

    Of course, it can be used on its own, but it is designed for i2i processing of images from the models below.

    https://civarchive.com/models/505948/pixart-sigma-1024px512px-animetune

    ■Depending on the situation, this extension may also improve colors and contrast.

    https://github.com/Haoming02/sd-webui-diffusion-cg

    https://github.com/Haoming02/comfyui-diffusion-cg

    ■Using external tools for level adjustment is also a good option.

    Reducing gamma slightly while enhancing whites can improve contrast even further.

    Using these should help achieve color rendering closer to that of SDXL.
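
    As a rough illustration of the level adjustment described above, here is a minimal sketch for a single channel value in the range 0 to 1. It assumes the common levels convention where output = input ** (1 / gamma), so a gamma below 1.0 darkens midtones. The function name adjust_levels and the default values 0.9 and 0.95 are hypothetical, not settings recommended by the author.

```python
def adjust_levels(value: float, gamma: float = 0.9, white_point: float = 0.95) -> float:
    """Apply a simple levels adjustment to one channel value in [0, 1].

    white_point: inputs at or above this value are pushed to pure white.
    gamma: midtone control (assumed convention: out = x ** (1 / gamma)),
           so values below 1.0 darken the midtones.
    """
    if not 0.0 < white_point <= 1.0:
        raise ValueError("white_point must be in (0, 1]")
    if gamma <= 0.0:
        raise ValueError("gamma must be positive")

    # Stretch the input so that white_point maps to 1.0, then clamp.
    stretched = min(max(value / white_point, 0.0), 1.0)
    # Apply the midtone curve.
    return stretched ** (1.0 / gamma)


# Midtones get darker while near-white values are lifted to full white.
for sample in (0.0, 0.25, 0.5, 0.95, 1.0):
    print(sample, round(adjust_levels(sample), 3))
```

    Lowering the white point brightens highlights and lowering gamma deepens midtones, which together increase contrast.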

    ■Surprisingly, generating at 768px or 1024px sometimes works fine. If you want more stability, merging with Sotemix could help. However, since most LoRAs are trained at 512px, high resolutions can break the output, so it is safer to use highres.fix or kohya_deep_shrink when using LoRAs.

    Personally, I prefer i2i upscaling over highres.fix, as it tends to produce fewer artifacts.
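
    For a two-pass workflow (a 512px-class first pass followed by highres.fix or i2i upscaling), the sizes can be planned as in the sketch below. It assumes the first pass keeps the short side at 512px and snaps dimensions to multiples of 8, matching the 8x downsampling of the SD1.5 VAE. The function name plan_two_pass and its parameters are hypothetical.

```python
from typing import Tuple


def plan_two_pass(
    target_w: int,
    target_h: int,
    base_short_side: int = 512,
    multiple: int = 8,
) -> Tuple[int, int, float]:
    """Return (base_width, base_height, upscale_factor) for a two-pass run."""
    # Factor needed to go from the first pass to the target size.
    scale = min(target_w, target_h) / base_short_side

    def snap(value: float) -> int:
        # Round to the nearest multiple so the latent size stays an integer.
        return max(multiple, int(round(value / multiple)) * multiple)

    return snap(target_w / scale), snap(target_h / scale), scale


base_w, base_h, factor = plan_two_pass(1024, 1536)
print(base_w, base_h, factor)  # 512 768 2.0
```

    The first pass stays near the resolution most LoRAs were trained at, and only the second pass works at the full target size.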

    ■Please feel free to ask if you have any questions!

    Questions in Japanese are welcome too, so feel free to reach out!

    Comments (4)

    wktra
    Nov 8, 2025 · 1 reaction
    CivitAI

    As for the 1024px merged model, can I simply finetune it as a normal SD1.5 model in OneTrainer or kohya? I ask because I tried to finetune the NAI model and couldn't get it to work.

    hjhf
    Author
    Nov 8, 2025

    Apologies for the long message, but I wanted to share my detailed thoughts. I hope they are helpful to you in some way.

    I believe you can train a merged model without issues if you are using LoRA or DoRA.
    Personally, I think that if you want to maintain resolutions like 1024x1536, it might be better to train at 1280px, or perhaps use multi-resolution training (such as 1024px and 1280px).

    Training only at 1024px might become unstable.

    Although they aren't specifically for a merged model, here are links to pages where I uploaded DoRA models for NAI V1 and NAI V2.

    https://civitai.com/models/1253884?modelVersionId=1874253

    https://civitai.com/models/1253884?modelVersionId=2133885

    These pages also include the OneTrainer settings I used, which you might find useful as a reference. I feel there were probably better settings, so please don't trust them blindly, but they should serve as a useful benchmark.

    Also, many merge bases are in fp16, which often forces us to save in fp16, but a full finetune is probably still possible if you want to do one.

    If I personally wanted to further enhance a merged model, I think a better approach would be to:

    1. Create a U-Net-only full finetune, or a LoRA or DoRA, from a base model like NAI V1 or NAI V2.

    2. Then, extract the differences (diff) from that training.

    3. Finally, add (merge) that diff to the merged model.

    I feel this method can strengthen the model without destroying its original, intended style.
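
    The three steps above amount to an "add difference" merge: result = merged + (tuned - base). Here is a minimal sketch, assuming checkpoints are dictionaries of weights and using toy lists in place of real tensors. The function name add_difference and the strength parameter are hypothetical, and keys missing from the base or tuned model (for example, the text encoder after a U-Net-only finetune) are left unchanged.

```python
from typing import Dict, List

# Toy stand-in for a checkpoint: parameter name -> flat list of weights.
StateDict = Dict[str, List[float]]


def add_difference(
    merged: StateDict,
    base: StateDict,
    tuned: StateDict,
    strength: float = 1.0,
) -> StateDict:
    """Add what training changed (tuned - base) onto the merged model."""
    result: StateDict = {}
    for key, weights in merged.items():
        if key in base and key in tuned:
            # Steps 2 and 3: extract the diff and add it to the merged weights.
            result[key] = [
                m + strength * (t - b)
                for m, b, t in zip(weights, base[key], tuned[key])
            ]
        else:
            # Parameters that were not trained are copied as they are.
            result[key] = list(weights)
    return result


# Hypothetical toy weights, not taken from any real model.
merged = {"unet.w": [1.0, 2.0], "te.w": [5.0]}
base = {"unet.w": [0.5, 0.5]}      # e.g. the base model before training
tuned = {"unet.w": [0.75, 0.25]}   # the same model after training

out = add_difference(merged, base, tuned)
print(out)
```

    A strength below 1.0 applies only part of the diff, which may help if the full diff shifts the merged model's style too far.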

    Recently, I've come to believe that LoRA or DoRA is a less destructive and generally better approach. I actually succeeded in achieving higher resolution capabilities through long-duration training. However, even so, I sometimes wonder if a full finetuning followed by difference extraction would have been better for addressing fundamental issues like resolution improvement.

    However, my merged model is itself a merge of NAI V1 and V2, so it is hard to say whether simply training on NAI V2 and applying that diff to the merge would work.

    The merge ratio is about 0.6 (NAI V1) and 0.4 (NAI V2), so it's really hard to say...

    Training NAI V2 at 1024px and extracting the diff could be a good option, or perhaps finetuning the merged model itself for a long duration until you achieve your ideal results might also be good.

    Based on my personal experience, I believe that directly training a merged model might require careful consideration.

    You can probably train it to be faithful to your dataset, but it might be very difficult to preserve the original "appeal" of the merged model.

    (Well, perhaps you could maintain it by using your favorite images generated by that model as regularization images. Or, if you successfully use specific concepts or characters that you were lucky enough to generate with that merged model as training material, you might be able to maintain its style.)

    My feeling is that the style of a merged model is like a forced "facade". I suspect its pre-baked style gets overwritten almost instantly by the dataset, and it reverts to the inherent "raw" or "noisy" characteristics of the base SD1.5.

    The nuance here is that it reverts to the model's original, fundamental state.

    For example, if you have a dataset of 50,000 images, the results will revert to that "raw" state early in the training. But after training it for, say, 100 epochs, it will eventually conform to the tendencies of that dataset. It's akin to "re-baking" the model. (The 100-epoch example is just to illustrate the nuance of training until it fully converges on the dataset's characteristics; this implies that long-duration training might be necessary.) Once the ideal style or concept is achieved, that's the end.

    However, if you're just training a character with about 50 images (which is only a few hundred steps), this probably isn't something to worry much about.
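
    To put the two cases above in perspective, here is a rough step-count calculation. The batch sizes, the epoch count for the small case, and the function name total_steps are assumptions for illustration only; just the 50-image, 50,000-image, and 100-epoch figures come from the text.

```python
import math


def total_steps(num_images: int, epochs: int, batch_size: int = 1, repeats: int = 1) -> int:
    """Optimizer steps for a run: steps per epoch times the number of epochs."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


# Small character dataset (batch size and epochs are assumed values).
small = total_steps(50, epochs=10, batch_size=2)
# Large dataset trained for 100 epochs (batch size is an assumed value).
large = total_steps(50_000, epochs=100, batch_size=4)

print(small)  # 250
print(large)  # 1250000
```

    Under these assumed settings the large run is several thousand times longer, which is why style drift matters far more there than in a short character training.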

    Base models like NAI V1/V2 or the SD1.5 base are less biased and represent the "original form." You can train them without fearing that the "paper-mache" structure will collapse. I believe this provides more peace of mind, even if the compatibility with your target merged model decreases slightly. This is why I start most of my training from these base models.

    (That said, for realistic-style merged models, they are often very far removed from the SD1.5 base. Basing them on NAI is also a bit tricky due to the stylistic differences. In such cases, I might train on both NAI and the merged model and then select the better result.)

    Alternatively, the U-Net of a merged model already "knows" many concepts. Therefore, it might be better to only train the Text Encoder or embeddings. This approach might allow you to achieve results for specific concepts, characters, or styles without breaking the original style of the merged model.

    wktra
    Nov 14, 2025 · 1 reaction

    @hjhf I haven't had time to reply because I've been busy digesting and turning your wonderful advice into some sort of finetuning strategy.

    I am very much touched and humbled by your wisdom. It is very much appreciated and welcomed. I can't thank you enough for your advice. 😭

    hjhf
    Author
    Nov 15, 2025

    You're very welcome! I am also always thinking about training methods that can improve the model while being as non-destructive as possible, but I haven't found a great solution yet. What I shared is mostly based on my own rules of thumb, and there isn't much I can say with certainty yet. So please feel free to experiment and try various things!

    Checkpoint
    SD 1.5
    by hjhf

    Details

    Downloads
    295
    Platform
    CivitAI
    Platform Status
    Available
    Created
    11/6/2025
    Updated
    5/22/2026
    Deleted
    -

    Files

    sd15ModelLab_anime1024pxV10.safetensors

    sd15ModelLab_anime1024pxV10_trainingData.zip

    sd15ModelLab_anime1024pxV10.zip
