Trigger words: ANIMSTY, SHW_GLBY.
Note: I used the wrong trigger on the samples. I recommend using the ones above instead.
I completely retrained from the ground up using 493 newly sourced clips from Golden Boy. I used my 90s anime LoRA as a base.
https://huggingface.co/comfyuiman/anime90s_ltx/tree/main/anime90s-step00053000-state
I resumed from the 53K savestate in musubi and trained an additional 14K steps. I saved a checkpoint every 500 steps but only checked the results roughly every 5K steps. I think this checkpoint has a good balance between movement and detail. I'll also upload the 72K-step version, which has less detail and more movement, if you want to compare them.
You do not need to mention the style (i.e. anime) in the prompt; it should trigger almost every time now. Mentioning such style keywords will drift the motion towards 3D, so I don't recommend it. As always, I recommend using my example workflow, set to 50fps to reduce blur.
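As a rough illustration of the schedule described above, here is a minimal sketch that lists the checkpoint steps produced by resuming at 53K and saving every 500 steps for 14K more. The function and constant names are made up for this example and are not musubi options.

```python
# Hypothetical illustration of the checkpoint schedule described above.
RESUME_STEP = 53_000       # savestate the run was resumed from
ADDITIONAL_STEPS = 14_000  # extra steps trained after resuming
SAVE_EVERY = 500           # checkpoint interval
REVIEW_EVERY = 5_000       # approximate interval at which results were inspected


def checkpoint_steps(resume_step, additional_steps, save_every):
    """Return the global step number of every checkpoint saved after resuming."""
    final_step = resume_step + additional_steps
    return list(range(resume_step + save_every, final_step + 1, save_every))


saved = checkpoint_steps(RESUME_STEP, ADDITIONAL_STEPS, SAVE_EVERY)
# Checkpoints that fall on a review boundary, counted from the resume point.
reviewed = [s for s in saved if (s - RESUME_STEP) % REVIEW_EVERY == 0]

print(len(saved), saved[0], saved[-1])
print(reviewed)
```

Only a couple of the saved checkpoints land on a review boundary, which is why keeping the intermediate 500-step saves is useful when picking the best balance afterwards.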
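The prompting advice above could be sketched as a small helper that prefixes the trigger words and drops explicit style keywords. This is only an illustration: `build_prompt` and `STYLE_KEYWORDS` are hypothetical names, using both trigger words together is an assumption, and only "anime" comes from the description itself.

```python
import re

TRIGGER_WORDS = ["ANIMSTY", "SHW_GLBY"]  # from the model description
# Assumed list of style keywords to avoid; only "anime" is named in the description.
STYLE_KEYWORDS = ["anime"]


def build_prompt(scene_description):
    """Prefix the trigger words and drop explicit style keywords from the prompt."""
    cleaned = scene_description
    for word in STYLE_KEYWORDS:
        # Remove the keyword as a whole word, ignoring case.
        cleaned = re.sub(rf"\b{re.escape(word)}\b", "", cleaned, flags=re.IGNORECASE)
    # Collapse the gaps left behind and trim stray separators.
    cleaned = re.sub(r"\s{2,}", " ", cleaned).strip(" ,")
    return ", ".join(TRIGGER_WORDS + [cleaned])


print(build_prompt("anime man riding a bicycle through a city street"))
```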
Try this lora with the OMNI NFT lora at strength 1 on both passes and you get very nice results too:
https://huggingface.co/Kijai/LTX2.3_comfy/blob/main/loras/LTX-2.3-OmniNFT-RL-Lora_bf16.safetensors
Overall I'm very happy with the result. And training anime on LTX is a lot easier now than it used to be.
Comments (16)
This was more for learning, but I put it out anyway. It may be overtrained; I was not sure which is better, 4.5k steps or 2.5k steps.
Hey. I don't train or have any idea of how it works, but may I ask a question? You said it looks good at 2500 steps but the voices are high pitched, and that this was fixed at 4500. Isn't there a way to split the training efficiently between video and audio? Like after X steps it only trains audio? Thank you 🙌
@GlowingGuardianGirl You can pause the training and turn off audio training. It may be better to do it the other way around: train without audio, then enable audio training for the final X steps. But I think the high-pitched audio would be fixed by having audio normalization on from the start, or maybe by using character tags so the captioned audio doesn't mesh together (just a theory). FYI, after I corrected this setting, the high-pitched voice went away after 500 steps, though I did see it come back somewhat later, just not as bad.
@tazmannner379 Thank you for answering 🙌 So the option exists but it's not adequate. And is split training possible? Audio only/Style only and merge both?
@GlowingGuardianGirl I don't know about audio-only training. My experience so far is that splitting training runs is less stable (doing video and then image training versus just image + video together).
@tazmannner379 Yeah, no worries, I was just wondering if there was a more efficient / time-effective way because of what you described for the training. Thank you for taking the time to answer, cheers 🙌
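The idea raised in this thread, training without audio and enabling it only for the final steps, could be sketched as a simple step-based switch. This is a hypothetical illustration and not an option of any particular trainer; in practice it would mean pausing the run and resuming with the audio setting changed, as described above. The 500-step audio phase is an assumed value.

```python
def audio_training_enabled(step, total_steps, audio_final_steps):
    """Return True when `step` (0-based) falls inside the final audio-enabled phase."""
    if not 0 <= audio_final_steps <= total_steps:
        raise ValueError("audio_final_steps must be between 0 and total_steps")
    return step >= total_steps - audio_final_steps


TOTAL_STEPS = 4500       # step count mentioned in the thread
AUDIO_FINAL_STEPS = 500  # assumed length of the audio phase, for illustration

# First step at which audio training would be switched on.
first_audio_step = next(
    s for s in range(TOTAL_STEPS)
    if audio_training_enabled(s, TOTAL_STEPS, AUDIO_FINAL_STEPS)
)
print(first_audio_step)
```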
Fantastic idea! Goldenboy has, in my opinion, arguably the best retro anime aesthetic. Thanks for this, can't wait to see where you take this next for LTX 2!
😍 👍🏻
I found that 1920x1200 generations fix a lot of the issues I was having, including the sound issues.
After testing, I think the 7k version is fine at higher resolution, so I posted that too.
thx :O)
"I had to set AI toolkit to train on 16fps"
What do you mean by this? Which setting is this?
AI Toolkit will default to something like 24 fps. tbh I don't recommend AI Toolkit anymore; it's trash for LTX. Also, all the advice in this LoRA is very dated. It was my very first attempt at an LTX LoRA.
Waiting for support on LTX 2.3
Working on this hopefully soon. I have the dataset pretty much ready actually. I restarted from scratch
@tazmannner379 Thx, your models are awesome!