17-01-2025 START
Ok, so the newest Astigmatism positive, +0.6, is here. It's really, really good, but as with all things, I recommend blending it with 0.5 to attenuate overfitting and truly get the best results possible. I'll look at a LoRA merge later and see if I can make an easy package with an "optimal" Astigmatism at this stage.
Hope you all enjoy. I'm working on a really large negative for 0.6, but I need more Buzz, so it will take a little while to train. Rest assured, it is on the way, and I think it will be quite a big jump.
17-01-2025 END
---
Post 0.5b, I recommend just playing with 0.5b and/or -0.5b.
When using the negative, be sure to crank CFG up to start, as this is the main advantage it affords you.
It also, in small amounts, can increase creativity, but broadly, +0.5b is the powerhouse, despite having a much smaller dataset.
Below is what I wrote previously for anything pre-0.5b:
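For anyone wondering what "cranking CFG" actually does mechanically: classifier-free guidance extrapolates from the unconditional prediction toward the conditional one. A toy numpy sketch (variable names are my own, just for illustration, not from any particular library):

```python
import numpy as np

def cfg_combine(uncond, cond, scale):
    """Classifier-free guidance: start from the unconditional prediction
    and push along the (cond - uncond) direction by `scale`."""
    return uncond + scale * (cond - uncond)

# Toy stand-ins for the model's two noise predictions.
uncond = np.array([0.1, 0.2])
cond = np.array([0.5, 0.1])

# At scale 1.0 you recover the conditional prediction exactly;
# higher scales amplify whatever the prompt signal is, including
# any overfit signal, which is why high CFG normally breaks images.
print(cfg_combine(uncond, cond, 1.0))
print(cfg_combine(uncond, cond, 7.5))
```

The point is that a model which still behaves at a high scale is one whose prompt signal stays coherent when amplified.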
---------------------
I recommend the following mix for anyone starting out (I will release some sort of mixed LoRA sometime in the next week that will require less VRAM than loading four LoRAs, lol):
Astigmatism +0.5
Astigmatism -0.5
Astigmatism +0.4b
Astigmatism -0.2
The +'s at 0.33 each
The -'s at -0.33 each
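Mechanically, stacking LoRAs like this just means summing their weight deltas with the given scales before (or during) inference. A toy numpy sketch of that arithmetic, with made-up shapes and random stand-ins for the actual LoRA deltas:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for each LoRA's low-rank weight delta (B @ A).
deltas = {
    "astig_pos_0.5":  rng.normal(size=(4, 4)),
    "astig_neg_0.5":  rng.normal(size=(4, 4)),
    "astig_pos_0.4b": rng.normal(size=(4, 4)),
    "astig_neg_0.2":  rng.normal(size=(4, 4)),
}

# The recommended mix: positives at 0.33, negatives at -0.33.
weights = {
    "astig_pos_0.5":  0.33,
    "astig_neg_0.5": -0.33,
    "astig_pos_0.4b": 0.33,
    "astig_neg_0.2": -0.33,
}

base_W = rng.normal(size=(4, 4))  # one base-model weight matrix (toy)

# Effective weight = base + weighted sum of every LoRA's delta.
effective_W = base_W + sum(w * deltas[name] for name, w in weights.items())
```

In a real UI (Comfy, A1111, etc.) the per-LoRA strength slider is exactly the `w` in this sum.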
This has to do with overfitting in the training process, and errors on my part. Rather than address these errors directly, which I cannot do with limited resources (it would require many, many iterations of the LoRAs, which I cannot afford, in order to test and find the optimal setups), using blends mitigates overfitting and generally improves performance, as you can see from the plethora of merged checkpoints on Civitai, including ones which simply merge newer versions of a model into the older version.
Basically, older versions may "understand" something better than a newer version, and vice versa, but as long as your versions are MOSTLY improved, the merge process will over time lead to the model becoming a better generalizer, and these LoRAs, which directly target the generalization and capabilities of the model, are no exception.
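The "merge newer into older" trick is just a linear interpolation of the two checkpoints' tensors. A minimal sketch, assuming both checkpoints share the same architecture (names and values are toy examples):

```python
import numpy as np

def merge(old, new, alpha=0.5):
    """Linear checkpoint merge: alpha=0 keeps the old weights,
    alpha=1 takes the new ones, 0.5 is an even blend."""
    return (1 - alpha) * old + alpha * new

# Toy stand-ins for one tensor from each checkpoint.
old_W = np.array([1.0, 0.0, 2.0])
new_W = np.array([0.0, 1.0, 2.0])

print(merge(old_W, new_W))  # even blend of the two versions
```

Where both versions agree, the merge keeps that behavior; where they disagree, it averages, which tends to smooth out each version's idiosyncratic overfit.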
Love yall, and this community.
If anyone wants to collaborate on further training who has the resources, please contact me. I have had a great deal of success in improving prompt adherence and I suspect this can be massively grown with a solid community effort.
Carefully examine the weights used to know how to mess with this LoRA. Think of it like adjusting the focus on a lens that you are looking through. Every prompt and checkpoint combination will have different needs, but ultimately, most of them can be dialed in such that adherence begins to work within a certain range where it wasn't working previously.
I suppose I will have to do a video on the "why" behind this soon, as my ADHD and time constraints make writing it up the way I want to beyond my current capacity. But a video I probably can do, although it will be... chaotic.
This model is based on work I did on my "Unsettling" LoRA. It uses some of the images generated there, along with subsequent images made using the LoRA progeny of those, as well as the techniques I experimented with.
Basically, the goal of this LoRA is to "semantically shift" SDXL such that terms that have a set meaning are entirely changed in an internally consistent manner. I used a technique to do this partially in the Unsettling LoRA, although it was overtrained, and I became intrigued by the idea that "good" prompts remain "good," albeit on a different axis, even if the internal understanding of them "shifts" within a given model. In other words: a unique and interesting prompt can create unique and interesting images across multiple new themes if you play with the brain of the model in a directed way.
How did I do this?
I found areas of overtraining within SDXL and targeted them (the Mona Lisa, the Pillars of Creation, etc.), redirecting them to new images. As I suspected, this had ripple effects on the way the entire model perceives the concepts connected to the modified images, and these effects are quite substantial.
UPDATE
Since this started, the purpose of this LoRA has changed substantially: it is now basically about improving SDXL's overall prompt adherence and win rate, using very small training datasets that target the areas of overfitting in the model and teach it to generalize them.
A side effect of this is that it is a lot easier to produce images at arbitrary resolutions.
Description
Improved parameters, same underlying training data
Comments (16)
this is some good shit, well done!
why is this lora 1.5 gig?
because it has a lot of parameters
It doesn't really affect inference time, since most platforms just roll it into the model pre-inference, and then, as long as the weights remain the same, it doesn't have to recalculate.
could you explain what this lora does? i saw some results that brought me here but i've no idea about it
@BlueWalking it basically inverts overfitting, although it's a bit more complex, since it actually isn't an ablation. In some sense, it proactively ablates potential overfitting in future fine-tuning by identifying likely candidates where increases in weight will further harm generalization; we then invert this LoRA and proactively lower those weights. This is why it's particularly strong even on distant fine-tunes of SDXL.
I don't exactly know WHAT it does, but it sure does something that makes my images look more... visceral.. more palpable in many cases.
Yeah, it's hard to express without technical stuff, but basically it's a LoRA intended to directly help the model generalize by countering overfitting. It's in some sense a weird form of ablation, although this technique (using an inverted LoRA) is fundamentally different and has different downstream impacts. But basically, you can throw this on even an Illustrious+XL mix, for example, and see results, particularly when paired with the positive.
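To make the "inverted LoRA" idea concrete: applying a LoRA adds a low-rank delta to the base weights, and inverting it just means applying that same delta with a negative scale. A toy numpy sketch (shapes, rank, and scale are all made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(8, 8))   # a base-model weight matrix (toy)
B = rng.normal(size=(8, 2))   # low-rank LoRA factors, rank 2
A = rng.normal(size=(2, 8))

scale = 0.5
W_pos = W + scale * (B @ A)   # normal LoRA application
W_neg = W - scale * (B @ A)   # "inverted" LoRA: same delta, negated
```

So a negative strength pushes the model away from whatever the LoRA learned, which is why it transfers to fine-tunes: the delta lives in weight space, not in any one checkpoint's data.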
I don't understand. Could you post a screenshot of your LoRA setup in Comfy or something? I don't get what I'm supposed to do with these. You talk of mixtures, but I don't know if I'm supposed to run any as negative LoRAs, make the numbers even out, or what. Your examples seem to run them all at 1 strength, but I don't know if that's telling me legit info.
the astig with the "-" is the negative one, and the one without is the positive. I will probably include a full checkpoint with the next one; the LoRAs are just convenient for people who want on-site generation. I'll try to remember to comment here when I have that up. I have to finish the last touches on a Flux LoRA/checkpoint and then I'll be back on this SDXL line.
@sirrece Okay, so if it has a negative, I should run the Lora at negative strength, and vice versa for positive? Thanks for the clarification!
It seems you can push CFG scale absurdly high with this lora and still get usable results and prompt adherence, but what benefit does access to high CFG scales give you practically? I'm curious to know your reasoning.
It's not intrinsically useful, but rather an indicator of generalization. When you push CFG, broadly what you are going to see is that the stronger "signals" in your neural net are going to scale, i.e., if you have overfit anywhere, you'll be more likely to see it.
This is often why people associate CFG with prompt adherence, although in truth it's kind of a side effect, and it's less important to go high than it is to go "right." Having a lower CFG often works better with the LoRAs, actually.
In any case, it's the generalization uplift that's the benefit here, which in practical terms means better adherence and a reduced rate of artifacting.
Incidentally, it also means pushing high-res native generation works better (and I never thought to test it, but probably lower-than-typical step counts too).
That being said, the LoRAs themselves have several areas of overfitting, which is why I prefer to keep them separated, so you can tune them if you hit a valley.
@sirrece I've used Astigmatism -0.5b for almost every prompt I've executed since I downloaded it. It makes generation times slower, but the images are almost always better.