    HDR VAE (Anima - QWEN Image) - BF16
    NSFW

    Qwen Image VAE

    • Full FP32 Training of Decoder

    • Works in ComfyUI

    Feel free to suggest on-site support to Civitai staff. I don't think they have an agreement in place like the one with FLUX.


    Overview

    This model is a fine-tuned variant of the base Qwen Image VAE, modified to emphasize high-frequency detail preservation and expanded color representation, following an HDR-style reconstruction objective.

    The evaluation compares the base and HDR-tuned models using perceptual, structural, distributional, and photometric metrics over identical input data.


    Evaluation Summary

    Perceptual Fidelity (LPIPS)

    • Base: 0.0177

    • HDR: 0.0786

    The HDR model's LPIPS distance is roughly 4.4 times that of the base model (0.0786 vs. 0.0177). Under deep-feature similarity metrics it reconstructs the input less faithfully, consistent with a shift toward detail-enhancing reconstruction behavior.


    Structural Energy (Gradient Magnitude)

    • Ground Truth: 404.02 (both models)

    • Base Reconstruction: 313.46

    • HDR Reconstruction: 687.97

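    The base model shows low-pass behavior: its reconstructions retain about 78% of the ground-truth gradient energy (313.46 vs. 404.02). In contrast, the HDR model amplifies high-frequency content, reaching about 170% of the ground-truth energy (687.97 vs. 404.02) and so exceeding the structural energy of the original inputs.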
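    The page does not say how structural energy was computed. As a minimal sketch, assuming it is a mean gradient magnitude from finite differences, the comparison could look like the code below. The helper names `gradient_energy` and `box_blur3` are hypothetical, and the blur and unsharp-mask images are synthetic stand-ins for a smoothing and a detail-amplifying decoder, not outputs of either VAE.

```python
import numpy as np

def gradient_energy(img: np.ndarray) -> float:
    """Mean finite-difference gradient magnitude of an (H, W) or (H, W, C) image.

    Assumption: the page's "structural energy" is something of this kind;
    the exact operator and normalization are not stated in the source.
    """
    gray = img.astype(np.float64)
    if gray.ndim == 3:
        gray = gray.mean(axis=2)  # collapse channels into one plane
    gy, gx = np.gradient(gray)    # differences along rows, then columns
    return float(np.hypot(gx, gy).mean())

def box_blur3(img: np.ndarray) -> np.ndarray:
    """3x3 box blur with wrap-around borders (stand-in for a low-pass decoder)."""
    acc = np.zeros_like(img, dtype=np.float64)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            acc += np.roll(img, shift=(dy, dx), axis=(0, 1))
    return acc / 9.0

rng = np.random.default_rng(0)
truth = rng.random((64, 64))        # synthetic "ground truth"
smooth = box_blur3(truth)           # loses high-frequency content
boosted = truth + (truth - smooth)  # unsharp mask: amplifies it

e_truth = gradient_energy(truth)
e_smooth = gradient_energy(smooth)
e_boosted = gradient_energy(boosted)
print(f"smooth={e_smooth:.4f} truth={e_truth:.4f} boosted={e_boosted:.4f}")
```

    A reconstruction whose energy falls below the ground truth behaves like the blurred image here, and one whose energy exceeds it behaves like the unsharp-masked image.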


    Color Distribution Support

    • Ground Truth: 33150.61 (both models)

    • Base Reconstruction: 35004.49

    • HDR Reconstruction: 40133.37

    The HDR model's color support is about 21% larger than the ground truth (40133.37 vs. 33150.61), compared with about 6% for the base model (35004.49), indicating increased chromatic dispersion and reduced quantization collapse.
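    The definition of color support is not given on the page. One plausible reading, assumed here, is the number of distinct RGB triplets in an 8-bit image, averaged over the evaluation set. The sketch below counts distinct colors for a single image; `color_support` is a hypothetical helper, and the posterized image is a synthetic example of quantization collapse.

```python
import numpy as np

def color_support(img_u8: np.ndarray) -> int:
    """Number of distinct RGB triplets in an (H, W, 3) uint8 image.

    Assumption: this is only one possible meaning of "color support";
    the source does not define the metric.
    """
    pixels = img_u8.reshape(-1, 3)
    return int(np.unique(pixels, axis=0).shape[0])

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

# Posterize to 8 levels per channel: at most 8**3 = 512 distinct colors remain.
posterized = (image // 32) * 32

support_full = color_support(image)
support_posterized = color_support(posterized)
print(support_full, support_posterized)
```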


    Photometric Stability

    Brightness Bias

    • Base: 0.000351

    • HDR: 0.0000098

    Contrast Gain

    • Base: 0.9984

    • HDR: 0.99999

    Both models preserve global photometric consistency, with the HDR variant showing near-perfect affine stability.
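    Brightness bias and contrast gain read naturally as the intercept and slope of an affine fit, reconstruction ≈ gain × input + bias, although the page does not state the fitting procedure. A minimal sketch under that assumption, using a least-squares line fit over all pixel values; `photometric_fit` is a hypothetical name, and the gain and bias below are synthetic test values, not the figures reported above.

```python
import numpy as np

def photometric_fit(truth: np.ndarray, recon: np.ndarray) -> tuple[float, float]:
    """Least-squares fit of recon ≈ gain * truth + bias over all pixel values.

    Assumption: "contrast gain" and "brightness bias" correspond to the
    slope and intercept of such a fit; the source does not confirm this.
    """
    gain, bias = np.polyfit(
        truth.ravel().astype(np.float64),
        recon.ravel().astype(np.float64),
        deg=1,
    )
    return float(gain), float(bias)

rng = np.random.default_rng(0)
truth = rng.random((64, 64, 3))

# Synthetic reconstruction with a known affine distortion.
true_gain, true_bias = 0.95, 0.02
recon = true_gain * truth + true_bias

gain, bias = photometric_fit(truth, recon)
print(f"gain={gain:.6f} bias={bias:.6f}")
```

    A gain close to 1 together with a bias close to 0 means the reconstruction keeps the global brightness and contrast of the input, which is what both models show.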


    Channel Drift

    • Red Shift:

      • Base: +0.0116

      • HDR: +0.0104

    • Green Shift:

      • Base: -0.0606

      • HDR: -0.1856

    • Blue Shift:

      • Base: +0.0187

      • HDR: +0.0219

    The HDR model's negative green-channel bias is roughly three times that of the base model (-0.1856 vs. -0.0606), while its red and blue shifts stay close to the base model's.
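    Channel drift is sketched below as the per-channel mean of reconstruction minus input. This is an assumption: the page gives neither the formula nor the value scale (0 to 1 or 0 to 255). `channel_drift` is a hypothetical helper and the offsets are synthetic.

```python
import numpy as np

def channel_drift(truth: np.ndarray, recon: np.ndarray) -> dict[str, float]:
    """Mean signed difference (recon - truth) for each RGB channel.

    Assumption: the reported shifts are mean per-channel differences;
    the source does not state the formula or the pixel scale.
    """
    diff = recon.astype(np.float64) - truth.astype(np.float64)
    r, g, b = diff.reshape(-1, 3).mean(axis=0)
    return {"red": float(r), "green": float(g), "blue": float(b)}

rng = np.random.default_rng(0)
truth = rng.random((32, 32, 3))

# Synthetic reconstruction with a known constant offset per channel.
offsets = np.array([0.01, -0.05, 0.02])
recon = truth + offsets

drift = channel_drift(truth, recon)
print(drift)
```

    A negative green value, as reported for both models, would mean reconstructions carry slightly less green on average than their inputs.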


    Interpretation

    The base Qwen VAE behaves as a contractive perceptual projection operator, prioritizing smooth reconstructions and suppression of high-frequency components.

    The HDR-tuned variant transitions into a detail-amplifying reconstruction operator, characterized by:

    • Increased high-frequency energy

    • Expanded color manifold coverage

    • Higher perceptual divergence under LPIPS

    • Preserved global photometric invariance

    This represents a functional shift from a smoothing autoencoder regime toward a high-frequency-preserving (HDR-like) reconstruction regime.

    VAE
    Qwen

    Details

    Downloads: 141
    Platform: CivitAI
    Platform Status: Available
    Created: 6/21/2026
    Updated: 6/27/2026
    Deleted: -

    Files

    hdrVAEAnimaQWENImage_bf16.safetensors
