17-01-2025 START
Ok, so the newest Astigmatism positive, +0.6, is here. It's really, really good, but as with all things, I recommend blending it with 0.5 to attenuate overfitting and truly get the best results possible. I'll look at a LoRA merge later and see if I can make an easy package with an "optimal" Astigmatism at this stage.
Hope you all enjoy. I'm working on a really large negative for 0.6, but I need more Buzz, so it will take a little while to train. Rest assured, it is on the way, and I think it will be quite a big jump.
17-01-2025 END
---
Post 0.5b, I recommend just playing with 0.5b and/or -0.5b.
When using the negative, be sure to crank CFG up to start, as this is the main advantage it affords you.
It also, in small amounts, can increase creativity, but broadly, +0.5b is the powerhouse, despite having a much smaller dataset.
Below is what I wrote previously for anything pre-0.5b:
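For anyone wondering what "cranking CFG" actually does mechanically: classifier-free guidance extrapolates from the unconditional prediction toward the conditional one. A toy numpy sketch (variable names are my own, just for illustration, not from any particular library):

```python
import numpy as np

def cfg_combine(uncond, cond, scale):
    """Classifier-free guidance: start from the unconditional prediction
    and push along the (cond - uncond) direction by `scale`."""
    return uncond + scale * (cond - uncond)

# Toy stand-ins for the model's two noise predictions.
uncond = np.array([0.1, 0.2])
cond = np.array([0.5, 0.1])

# At scale 1.0 you recover the conditional prediction exactly;
# higher scales amplify whatever the prompt signal is, including
# any overfit signal, which is why high CFG normally breaks images.
print(cfg_combine(uncond, cond, 1.0))
print(cfg_combine(uncond, cond, 7.5))
```

The point is that a model which still behaves at a high scale is one whose prompt signal stays coherent when amplified.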
---------------------
I recommend the following mix for anyone starting out (I will release some sort of mixed LoRA sometime in the next week that will require less VRAM than loading four LoRAs, lol):
Astigmatism +0.5
Astigmatism -0.5
Astigmatism +0.4b
Astigmatism -0.2
The +'s at 0.33 each
The -'s at -0.33 each
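Mechanically, stacking LoRAs like this just means summing their weight deltas with the given scales before (or during) inference. A toy numpy sketch of that arithmetic, with made-up shapes and random stand-ins for the actual LoRA deltas:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for each LoRA's low-rank weight delta (B @ A).
deltas = {
    "astig_pos_0.5":  rng.normal(size=(4, 4)),
    "astig_neg_0.5":  rng.normal(size=(4, 4)),
    "astig_pos_0.4b": rng.normal(size=(4, 4)),
    "astig_neg_0.2":  rng.normal(size=(4, 4)),
}

# The recommended mix: positives at 0.33, negatives at -0.33.
weights = {
    "astig_pos_0.5":  0.33,
    "astig_neg_0.5": -0.33,
    "astig_pos_0.4b": 0.33,
    "astig_neg_0.2": -0.33,
}

base_W = rng.normal(size=(4, 4))  # one base-model weight matrix (toy)

# Effective weight = base + weighted sum of every LoRA's delta.
effective_W = base_W + sum(w * deltas[name] for name, w in weights.items())
```

In a real UI (Comfy, A1111, etc.) the per-LoRA strength slider is exactly the `w` in this sum.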
This has to do with overfitting in the training process, and errors on my part. Rather than address these errors directly, which I cannot do with limited resources (it would require many, many iterations of the LoRAs, which I cannot afford, in order to test and find the optimal setups), using blends mitigates overfitting and generally improves performance, as you can see from the plethora of merged checkpoints on Civitai, including ones which simply merge newer versions of a model into the older version.
Basically, older versions may "understand" something better than a newer version, and vice versa, but as long as your versions are MOSTLY improved, the merge process will over time lead to the model becoming a better generalizer, and these LoRAs, which directly target the generalization and capabilities of the model, are no exception.
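The "merge newer into older" trick is just a linear interpolation of the two checkpoints' tensors. A minimal sketch, assuming both checkpoints share the same architecture (names and values are toy examples):

```python
import numpy as np

def merge(old, new, alpha=0.5):
    """Linear checkpoint merge: alpha=0 keeps the old weights,
    alpha=1 takes the new ones, 0.5 is an even blend."""
    return (1 - alpha) * old + alpha * new

# Toy stand-ins for one tensor from each checkpoint.
old_W = np.array([1.0, 0.0, 2.0])
new_W = np.array([0.0, 1.0, 2.0])

print(merge(old_W, new_W))  # even blend of the two versions
```

Where both versions agree, the merge keeps that behavior; where they disagree, it averages, which tends to smooth out each version's idiosyncratic overfit.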
Love yall, and this community.
If anyone wants to collaborate on further training who has the resources, please contact me. I have had a great deal of success in improving prompt adherence and I suspect this can be massively grown with a solid community effort.
Carefully examine the weights used to know how to mess with this LoRA. Think of it like adjusting the focus on a lens that you are looking through. Every prompt and checkpoint combination will have different needs, but ultimately, most of them can be dialed in such that adherence begins to work within a certain range where it wasn't working previously.
I suppose I will have to do a video on the "why" behind this soon, as my ADHD and time constraints make writing it up the way I want to beyond my current capacity. But a video I probably can do, although it will be... chaotic.
This model is based on work I did on my "Unsettling" LoRA. It uses some of the images generated there, along with subsequent images made using the LoRA progeny of those, as well as the techniques I experimented with.
Basically, the goal of this LoRA is to "semantically shift" SDXL such that terms that have a set meaning are entirely changed in an internally consistent manner. I used a technique to do this partially in the Unsettling LoRA, although it was overtrained, and I became intrigued by the idea that "good" prompts remain "good," albeit on a different axis, even if the internal understanding of them "shifts" within a given model. In other words: a unique and interesting prompt can create unique and interesting images across multiple new themes if you play with the brain of the model in a directed way.
How did I do this?
I found areas of overtraining within SDXL and targeted them (the Mona Lisa, the Pillars of Creation, etc.), redirecting them to new images. As I suspected, this had ripple effects on the way the entire model perceives the concepts connected to the modified images, and these effects are quite substantial.
UPDATE
Since this started, the purpose of this LoRA has changed substantially: it is now basically about improving SDXL's overall prompt adherence and win rate, using very small training datasets that target the areas of overfitting in the model and teach it to generalize them.
A side effect of this is that it is a lot easier to produce images at arbitrary resolutions.
Description
Improved parameters, same underlying training data
Comments (16)
this is some good shit, well done!
why is this lora 1.5 gig?
because it has a lot of parameters
It doesn't really affect inference time, since most platforms just roll it into the model pre-inference, and then, as long as the weights remain the same, it doesn't have to recalculate.
could you explain what this lora does? i saw some results that brought me here but i've no idea about it
@BlueWalking it basically inverts overfitting, although it's a bit more complex, since it actually isn't an ablation. In some sense, it proactively ablates potential overfitting in future fine-tuning by identifying likely candidates where increases in weight will further harm generalization; we then invert this LoRA and proactively lower those weights. This is why it's particularly strong even on distant fine-tunes of SDXL.
I don't exactly know WHAT it does, but it sure does something that makes my images look more... visceral.. more palpable in many cases.
Yeah, it's hard to express without technical stuff, but basically it's a LoRA intended to directly help the model generalize by countering overfitting. It's in some sense a weird form of ablation, although this technique (using an inverted LoRA) is fundamentally different and has different downstream impacts. But basically, you can throw this on even an Illustrious+XL mix, for example, and see results, particularly when paired with the positive.
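To make the "inverted LoRA" idea concrete: applying a LoRA adds a low-rank delta to the base weights, and inverting it just means applying that same delta with a negative scale. A toy numpy sketch (shapes, rank, and scale are all made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(8, 8))   # a base-model weight matrix (toy)
B = rng.normal(size=(8, 2))   # low-rank LoRA factors, rank 2
A = rng.normal(size=(2, 8))

scale = 0.5
W_pos = W + scale * (B @ A)   # normal LoRA application
W_neg = W - scale * (B @ A)   # "inverted" LoRA: same delta, negated
```

So a negative strength pushes the model away from whatever the LoRA learned, which is why it transfers to fine-tunes: the delta lives in weight space, not in any one checkpoint's data.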
I don't understand. Could you post a screenshot of your LoRA setup in Comfy or something? I don't get what I'm supposed to do with these. You talk of mixtures, but I don't know if I'm supposed to run any as negative LoRAs, make the numbers even out, or what. Your examples seem to run them all at 1 strength, but I don't know if that's telling me legit info.
the astig with the "-" is the negative one, and the one without is the positive. I will probably include a full checkpoint with the next one; the LoRAs are just convenient for people who want on-site generation. I'll try to remember to comment here when I have that up. I have to finish the last touches on a Flux LoRA/checkpoint and then I'll be back on this SDXL line.
@sirrece Okay, so if it has a negative, I should run the Lora at negative strength, and vice versa for positive? Thanks for the clarification!
It seems you can push CFG scale absurdly high with this lora and still get usable results and prompt adherence, but what benefit does access to high CFG scales give you practically? I'm curious to know your reasoning.
It's not intrinsically useful, but rather an indicator of generalization. When you push CFG, broadly what you are going to see is that the stronger "signals" in your neural net are going to scale, i.e., if you have overfit anywhere, you'll be more likely to see it.
This is often why people associate CFG with prompt adherence, although in truth it's kind of a side effect, and it's less important to go high than it is to go "right." Having a lower CFG often works better with the LoRAs, actually.
In any case, it's the generalization uplift that's the benefit here, which in practical terms means better adherence and a reduced rate of artifacting.
Incidentally, it also means pushing high-res native generation works better (and I never thought to test it, but probably lower-than-typical step counts too).
That being said, the LoRAs themselves have several areas of overfitting, which is why I prefer to keep them separated, so you can tune them if you hit a valley.
@sirrece I've used Astigmatism -0.5b for almost every prompt I've executed since I downloaded it. It makes generation times slower, but the images are almost always better.