CivArchive

    This is a series of Swin2SR upscale models that I trained on high-resolution images that I generated, with the goal of enhancing skin textures instead of smoothing them out, especially in photorealistic and digital art styles. I have tested them in ComfyUI, and they should be compatible with Auto1111 and other tools that support Swin2SR.

    https://github.com/mv-lab/swin2sr

    Versions

    Three models are available. All of the models are available in both .safetensors and .pth formats.

    • custom x2

      • trained from scratch for 25,000 steps with batch size 16 on images that I generated

    • custom x4

      • trained from scratch for 28,000 steps with batch size 16 on images that I generated

      • not fine-tuned from the x2 model

    • DIV2K + custom x2

      • trained from scratch for 10,000 steps on the DIV2K dataset from the SwinIR repository

      • trained for an additional 40,000 steps on images that I generated

    The x2 models can be applied 2 times (x4) with minimal loss of quality and can be applied 3 times (x8) with some visible blurriness. The x4 model can be applied 2 times (x16) with noticeable blurriness.
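The effective scale factors quoted above follow from multiplying the model scale once per pass. A minimal sketch of that arithmetic; the function names and the 512x768 example size are illustrative and not part of any tool:

```python
def chained_scale(model_scale: int, passes: int) -> int:
    """Effective scale factor after running the same upscaler `passes` times."""
    if model_scale < 1 or passes < 0:
        raise ValueError("model_scale must be >= 1 and passes must be >= 0")
    return model_scale ** passes


def output_size(width: int, height: int, model_scale: int, passes: int) -> tuple[int, int]:
    """Output dimensions after chaining the upscaler `passes` times."""
    factor = chained_scale(model_scale, passes)
    return width * factor, height * factor


if __name__ == "__main__":
    # (model scale, passes) pairs mentioned in the text above.
    for scale, passes in [(2, 2), (2, 3), (4, 2)]:
        total = chained_scale(scale, passes)
        # 512x768 is a hypothetical starting size used only for illustration.
        print(f"x{scale} applied {passes} times -> x{total}, "
              f"512x768 becomes {output_size(512, 768, scale, passes)}")
```

Pixel count grows with the square of the scale factor, so an x8 chain produces 64 times as many pixels as the input.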

    Quality

    The PSNR of these models compares well with the corresponding scores for the BSRGAN, SwinIR, and Swin2SR models released on their respective GitHub pages. The best upscale model that I tested is the SwinIR x2 model trained on images from Lexica, https://openmodeldb.info/models/2x-LexicaSwinIR, which still scores higher than my models. However, my models produce fewer artifacts around corners in the test pattern. I hope to improve on these models in the future and will be experimenting with a patch size of 64 as well.

    A PSNR of 45 dB is roughly equivalent to saving a JPEG with 90% quality: https://en.wikipedia.org/wiki/Peak_signal-to-noise_ratio

    As I understand it, if you took an original image and saved one copy as a JPEG with 90% quality, then resized a second copy to 50% size and upscaled it using the custom x2 model, the two results should show about the same loss of quality.
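For reference, PSNR is computed from the mean squared error between a reference image and a candidate: 10 * log10(MAX^2 / MSE). The sketch below is a minimal version over flat sequences of 8-bit pixel values. It is not the test script used for the scores above, and details such as color handling or border cropping are not stated in the description, so they are left out here.

```python
import math


def psnr(reference, candidate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(candidate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def rmse_for_psnr(target_db, max_value=255.0):
    """Root-mean-square pixel error that corresponds to a given PSNR."""
    return max_value / (10 ** (target_db / 20.0))


if __name__ == "__main__":
    original = [100, 100, 100, 100]
    degraded = [101, 99, 101, 99]  # every pixel is off by one level
    print(f"PSNR: {psnr(original, degraded):.2f} dB")
    print(f"RMSE at 45 dB: {rmse_for_psnr(45):.3f} levels out of 255")
```

By this definition, 45 dB corresponds to an RMS error of roughly 1.4 levels on a 0-255 scale.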

    Most of the tests were run with a tile size of 256, except for BSRGAN. The test script for BSRGAN does not support tiling and ran out of memory during the x4 testing due to the size of the images. Real-ESRGAN does not provide a test script, but I will include it if I can find one.
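Tiling keeps memory use bounded by running the model on fixed-size crops instead of the whole image. A minimal sketch of how tile boxes of size 256 might be laid out; the `tile_boxes` helper and the overlap of 32 pixels are assumptions for illustration, and the actual test scripts may split and blend tiles differently:

```python
def tile_boxes(width, height, tile=256, overlap=32):
    """Return (left, top, right, bottom) boxes that cover the whole image.

    Neighbouring tiles share `overlap` pixels, and the last tile in each
    row and column is shifted back so that it ends at the image border.
    """
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("tile must be positive and 0 <= overlap < tile")
    stride = tile - overlap

    def starts(length):
        if length <= tile:
            return [0]  # a single tile already covers this axis
        positions = list(range(0, length - tile, stride))
        positions.append(length - tile)  # final tile flush with the border
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


if __name__ == "__main__":
    boxes = tile_boxes(600, 480)
    print(len(boxes), "tiles")
    for box in boxes:
        print(box)
```

Each box would be upscaled on its own and pasted back at the scaled coordinates. The overlap gives room to blend or crop the seams.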

    Comparison

    Test pattern is from Wikimedia: https://commons.wikimedia.org/wiki/File:Philips_PM5544.svg

    Custom x2:

    DIV2K + Custom x2:

    Lexica x2:

    Training

    All of these models use the Swin2SR architecture with a patch size of 48. They are trained on the same dataset of about 520 high-resolution images, generated by me using Flux.1 Dev and a hires workflow in ComfyUI. Low-resolution images were created by downscaling with bicubic interpolation.

    The custom models were trained on a RunPod pod with 2x A40 GPUs with 96GB of memory in total, using a batch size of 16. The DIV2K + custom model was trained on an A6000 with 48GB of memory, using a batch size of 8.
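To put the step counts in context, the sketch below multiplies steps by batch size to estimate how many training samples each run saw, and divides by the dataset size for a rough number of passes over the data. It assumes that the quoted batch sizes are totals across GPUs, that each sample is one crop from one image, and that the dataset has exactly 520 images; none of these is confirmed by the description.

```python
def samples_seen(steps: int, batch_size: int) -> int:
    """Total training samples processed over a run."""
    return steps * batch_size


def approx_epochs(steps: int, batch_size: int, dataset_size: int) -> float:
    """Rough number of passes over the dataset (one sample per image per pass)."""
    return samples_seen(steps, batch_size) / dataset_size


if __name__ == "__main__":
    DATASET_SIZE = 520  # "about 520" in the description; treated as exact here
    runs = {
        "custom x2": (25_000, 16),
        "custom x4": (28_000, 16),
        "DIV2K + custom x2 (custom phase only)": (40_000, 8),
    }
    for name, (steps, batch) in runs.items():
        total = samples_seen(steps, batch)
        epochs = approx_epochs(steps, batch, DATASET_SIZE)
        print(f"{name}: {total:,} samples, about {epochs:.0f} passes")
```

Under these assumptions the custom x2 run saw 400,000 samples and the custom phase of the DIV2K + custom run saw 320,000, so the longer step count of the latter does not mean more data was processed.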


    Created by Thalis AI: dark fantasy, cosmic horror, and the spaces between.

    ☕ Support on Ko-fi: every tip keeps the radium gems glowing.

    🌙 Join on Patreon: early access, prompt packs, and the deeper corridors.

    🔮 Full catalog: 100+ LoRAs for Flux, Illustrious, and more.

    Comments (5)

    arkinson
    Jun 12, 2025
    CivitAI

    Great work! 👍 I did some tests with your custom x2/x4 versions. In comparison to my "standard" upscalers (UltraSharp/NMKD/ESRGAN/....) I am really impressed. Getting much more details and I really like that "natural" "photorealistic" look. I got just some minor problems upscaling small blurry faces - but I am pretty sure: no upscaler likes that 😉

    ThalisAI
    Author
    Jun 13, 2025

    Thank you. This was trained on AI generated images, but I tried to include a lot of hires skin. For the small, blurry faces, the best method that I have found is a face detailer with a low threshold. You may get some other shapes, but if they are very blurry, even a low strength detailer can often find a face somewhere in that blob.

    arkinson
    Jun 13, 2025

    @ThalisAI Yes of course - a face detailer mostly works fine (I used it often with Pony or SDXL). In my case I tested your upscaler with some really mean images - just for "science" 😉 Generally I did not use a face detailer with Flux. 99% of my generations work fine with a single path KSampler and a final 3x upscale. And here your upscaler works really well. The quality of the upscaled skin parts and faces is amazing, also the overall impression. I use your upscaler as a "default" now 😜😜

    ThalisAI
    Author
    Jun 13, 2025 · 1 reaction

    @arkinson Very cool, I am glad it is working well. I also do not often use a detailer with Flux, but I find that a little bit of hires (15-20 steps, 0.2 strength) can make it a little bit more crisp and add some nice details.

    CoffeeMage
    Apr 28, 2026
    CivitAI

    Simple and fast with no rough edges.

    Upscaler

    Details

    Downloads: 3,751
    Platform: CivitAI
    Platform Status: Available
    Created: 3/14/2025
    Updated: 5/13/2026
    Deleted: -

    Files

    swin2srUpscalerX2And_customX2Safetensor.safetensors