This is an experimental attempt to convert a standard SDXL model to a V-prediction one with a Zero Terminal SNR schedule and fixed colors, using a relatively small, diverse dataset of hand-picked images.
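For context, V-prediction reparameterizes the model's training target as v = √ᾱ·ε − √(1−ᾱ)·x₀ instead of the noise ε itself. This is not code from this checkpoint's training run, just a minimal numpy sketch of the parameterization with illustrative names:

```python
import numpy as np

def to_v(x0, eps, alpha_bar):
    """v-prediction target: v = sqrt(alpha_bar)*eps - sqrt(1-alpha_bar)*x0."""
    return np.sqrt(alpha_bar) * eps - np.sqrt(1.0 - alpha_bar) * x0

def from_v(x_t, v, alpha_bar):
    """Recover x0 and eps from the noisy sample x_t and a predicted v."""
    x0 = np.sqrt(alpha_bar) * x_t - np.sqrt(1.0 - alpha_bar) * v
    eps = np.sqrt(1.0 - alpha_bar) * x_t + np.sqrt(alpha_bar) * v
    return x0, eps

# Forward process: x_t = sqrt(alpha_bar)*x0 + sqrt(1-alpha_bar)*eps
rng = np.random.default_rng(0)
x0, eps = rng.normal(size=4), rng.normal(size=4)
a = 0.7
x_t = np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps
v = to_v(x0, eps, a)
x0_rec, eps_rec = from_v(x_t, v, a)
assert np.allclose(x0_rec, x0) and np.allclose(eps_rec, eps)
```

The point of the conversion: a model trained this way predicts v, so the sampler must also interpret its output as v, which is why the sampling type must be switched as described below.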
V6
You must set the sampling type to V-prediction and apply the Zero Terminal SNR patch; otherwise you will get noise.
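The Zero Terminal SNR patch follows the rescale from "Common Diffusion Noise Schedules and Sample Steps are Flawed" (Lin et al., 2023): √ᾱ is shifted and stretched so that ᾱ at the final timestep is exactly zero, i.e. the last step is pure noise. A hedged numpy sketch of that rescale (the SD-style scaled-linear betas below are an assumption for the demo):

```python
import numpy as np

def rescale_zero_terminal_snr(betas):
    """Rescale a beta schedule so the terminal SNR is exactly zero (Lin et al. 2023)."""
    alphas_bar_sqrt = np.sqrt(np.cumprod(1.0 - betas))

    # Shift so the last value becomes 0, then scale so the first is unchanged.
    a0, aT = float(alphas_bar_sqrt[0]), float(alphas_bar_sqrt[-1])
    alphas_bar_sqrt = (alphas_bar_sqrt - aT) * a0 / (a0 - aT)

    # Convert the rescaled alpha_bar back to betas.
    alphas_bar = alphas_bar_sqrt ** 2
    alphas = np.concatenate([alphas_bar[:1], alphas_bar[1:] / alphas_bar[:-1]])
    return 1.0 - alphas

# SD-style scaled-linear schedule, 1000 steps (illustrative values)
betas = np.linspace(0.00085 ** 0.5, 0.012 ** 0.5, 1000) ** 2
new_betas = rescale_zero_terminal_snr(betas)
abar = np.cumprod(1.0 - new_betas)
assert abs(abar[-1]) < 1e-12  # terminal alpha_bar is (numerically) zero
```

If you run inference through diffusers instead of a UI patch, the same rescale is exposed as `rescale_betas_zero_snr=True` on schedulers such as `DDIMScheduler`, combined with `prediction_type="v_prediction"`.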
This checkpoint was continued from ArtiWaifu using a modified version of https://github.com/kohya-ss/sd-scripts on ~1750 images with the text encoder frozen. Compared to 5.x, switching to an actual finetune fixed the weird, uneven loss scaling across layers, which allowed the model to learn to produce the full color range.
Unlike V5 and 5.1, V6 is a "normal" checkpoint, not a lora.
The flaws are:
Doesn't work that great when merged into Neta 2.
Some concepts and poses appear to be overfit.
Some styles, poses and other knowledge isn't transferred well (if at all).
V5 & 5.1
To use it, you must apply it as a lora on top of another ArtiWaifu-based checkpoint such as ArtiWaifu or Neta Art 2.0, set the sampling type to V-prediction, and apply the Zero Terminal SNR patch; otherwise you will get noise.
This checkpoint was continued from ArtiWaifu using https://github.com/KohakuBlueleaf/LyCORIS and a modified version of https://github.com/kohya-ss/sd-scripts on 1066 images.
You can use this sample workflow for ComfyUI: https://files.catbox.moe/cd3ao0.png
Someone was confused about whether this is a lora or not; it certainly should not be called a "LoRA" in the sense of https://arxiv.org/abs/2106.09685. As to what exactly it is, you should ask kblueleaf.
The flaws are:
The full color range isn't learned because the OUT5 layer fries during training. I blame the way LyCORIS trains individual layers.
It's very unstable and fails to work properly outside of the 1024px base resolution.
Some styles, poses and other knowledge isn't transferred well (if at all).
Description
- usual 0-1 timesteps
- longer training




