This was an experiment with LoKr for Flux. My best friend in the entire world, bghira, implemented this really cool optimizer called AdamW ScheduleFree that's kinda like Prodigy but doesn't suck for Flux. Using it was like the opposite of Lion, which is to say delightful, not super aggressive, and very tasty. So far it's pretty good at style transfer but not so much at introducing concepts or likeness. It's an adaptive-learning-rate optischeduler (very real word) and I like it a lot.
bghira also provided the film stills and captions for me to train with. very epic
Comments (10)
Fantastic results! Definitely interested in learning more about adamw_schedulefree and how you went about implementing it, since it seems like it's helped a ton with training.
Don't know if I'll be able to implement it on my end just yet, since I'm using Kohya's SD-Scripts for local finetuning, but it could definitely still be worth learning about!
it's just AdamW that I've added Kahan summation to, plus a built-in learning rate adjustment based on the bias correction factor, similar to Facebook's implementation.
a lot of people don't know how to set the optimiser's learning rate warmup, and this one just sort of handles it for you. you don't get learning rate decay; instead, you get decoupled weight decay and very stable training over many tens of thousands of steps. it makes things so stable that you have to adjust weight_decay and beta1/beta2 to get it to move more quickly, like you might get from Lion. rough sketch of what I mean below.
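a minimal, illustrative sketch of those two tricks (not the actual SimpleTuner or Facebook code; the class name and buffer names are made up): plain AdamW, a Kahan-compensated weight update, and the step scaled by the bias-correction factors so it effectively warms up on its own.

```python
import torch

class KahanAdamW(torch.optim.Optimizer):
    """Hypothetical sketch, not the SimpleTuner code: AdamW with a
    Kahan-compensated weight update, where the step size is scaled by
    the bias-correction factors so the effective LR warms itself up."""

    def __init__(self, params, lr=1e-3, betas=(0.9, 0.999),
                 eps=1e-8, weight_decay=1e-2):
        defaults = dict(lr=lr, betas=betas, eps=eps, weight_decay=weight_decay)
        super().__init__(params, defaults)

    @torch.no_grad()
    def step(self, closure=None):
        loss = None if closure is None else closure()
        for group in self.param_groups:
            beta1, beta2 = group["betas"]
            for p in group["params"]:
                if p.grad is None:
                    continue
                state = self.state[p]
                if not state:
                    state["step"] = 0
                    state["exp_avg"] = torch.zeros_like(p)
                    state["exp_avg_sq"] = torch.zeros_like(p)
                    # Kahan buffer: carries the low-order bits that get
                    # rounded away when a tiny update hits a big weight.
                    state["comp"] = torch.zeros_like(p)
                state["step"] += 1
                t = state["step"]
                exp_avg, exp_avg_sq = state["exp_avg"], state["exp_avg_sq"]

                # standard Adam moment updates
                exp_avg.mul_(beta1).add_(p.grad, alpha=1 - beta1)
                exp_avg_sq.mul_(beta2).addcmul_(p.grad, p.grad, value=1 - beta2)

                # bias-correction factors double as a built-in warmup:
                # early in training they shrink the step, no scheduler needed
                bc1 = 1 - beta1 ** t
                bc2 = 1 - beta2 ** t
                step_size = group["lr"] * (bc2 ** 0.5) / bc1

                # decoupled weight decay (the "W" in AdamW)
                p.mul_(1 - group["lr"] * group["weight_decay"])

                update = exp_avg / (exp_avg_sq.sqrt() + group["eps"])
                update.mul_(-step_size)

                # Kahan summation: apply the update plus the carried error,
                # then stash whatever floating point just dropped
                comp = state["comp"]
                comp.add_(update)
                new_p = p + comp        # lossy fp add
                comp.add_(p - new_p)    # recover the lost bits for next step
                p.copy_(new_p)
        return loss
```

it drops in like any torch optimizer, e.g. `opt = KahanAdamW(model.parameters(), lr=1e-3)`. the Kahan part matters most for bf16 weights, where a small update can fall below the weight's rounding threshold and silently vanish; the rest is textbook AdamW.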
@ptx0 Thanks for your contributions! Would you happen to have a link to any sort of repo or documentation you've written up for this technique? Would be interested in taking a look at it
Goddamn is this a gorgeous model!
What kind of video card did you train this on?
A100 SXM on Vast. I think I spent less than $2 for this run
did you build your own trainer code? Which one did you use?
I use SimpleTuner almost exclusively
@markury Thanks for sharing. This looks really good
ScheduleFree is from a Facebook researcher, not bghira lol
Oh yeah, sorry, I forgot that researchers take the time to implement their findings into every downstream optimization task. As a reward for your excellent detective work, you can take a look at the SimpleTuner repo and report back with the person responsible for implementing the ScheduleFree code (hint). Was it that Facebook researcher? Or perhaps you would have an easier time making some more lewd furry art. You seem to be better at that. Anyway, have a great day correcting people; maybe just make sure you're right first :)