Hello!
While experimenting, I made a merge of Cornerstone Labs' Turbo LoRAs and Polyx's Cosmos-Predict2.5-2B DMD2 LoRA.
Because one is rank32 and one is rank64, I used binary masking with various routing maps. The version section describes what each one is. I think DMD2 Output Projection is the best for my taste.
Works across various styles.
Please post your images and feedback, I would like to hear if this helped you.
How to use:
I suggest between 6 and 12 steps. If you lower the weight below 0.8 or so, you might want extra steps (trial and error).
er_sde and euler are good samplers.
For schedulers, try beta57, beta, sgm uniform, and bong tangent, but simple and ddim uniform also work.
Description
For this version, I compared turbo 0.1, turbo 0.2, and the dmd2 distill to their own averages, as a ratio, and took the layers that were furthest from each lora's average. This was to help compensate for the rank32 vs rank64 difference.
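A rough sketch of how an average-difference-ratio mask like this could be computed is below. It is only one reading of the description above, not the actual script: the function name, the layer keys, and the use of a single magnitude per layer (for example the norm of each layer's weights) are all assumptions made for illustration.

```python
from statistics import mean

def avg_diff_ratio_routing(layer_norms):
    """Pick, per layer, the LoRA that deviates most from its own average.

    layer_norms: {lora_name: {layer_key: magnitude}} (hypothetical input;
    the magnitude could be the norm of that layer's weights).
    Returns {layer_key: lora_name}.
    """
    # Divide each layer's magnitude by that LoRA's own mean, so a rank64
    # LoRA with larger raw values is compared on a relative scale.
    ratios = {}
    for name, norms in layer_norms.items():
        avg = mean(norms.values())
        ratios[name] = {key: value / avg for key, value in norms.items()}

    # Only layers present in every LoRA can be compared.
    shared = set.intersection(*(set(r) for r in ratios.values()))

    # A ratio of 1.0 means "exactly average", so distance from 1.0 is the score.
    routing = {}
    for key in sorted(shared):
        routing[key] = max(ratios, key=lambda n: abs(ratios[n][key] - 1.0))
    return routing

# Made-up magnitudes for three layers, purely for illustration.
layer_norms = {
    "turbo_0.1": {"a": 1.0, "b": 1.0, "c": 4.0},
    "turbo_0.2": {"a": 2.0, "b": 0.4, "c": 3.6},
    "dmd2": {"a": 10.0, "b": 4.0, "c": 4.0},
}
routing = avg_diff_ratio_routing(layer_norms)
print(routing)  # {'a': 'dmd2', 'b': 'turbo_0.2', 'c': 'turbo_0.1'}
```

Normalising by each LoRA's own average is what keeps the larger raw magnitudes of one LoRA from winning every layer automatically.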
Comments (9)
Hey gang, how do I choose which image is the cover image?
Usually it's the most recent version, and whichever image is first in the order of your image post. After that it's random luck whether civit displays it properly or not.
@Bacterial_Inflammation Oh, that's weird. Thank you. I guess I'd have to delete all the images to change the cover then?
@kai If you click the 3 dots at the top where your file names are ("DMD (expressive)"), it should say Manage Images. There you should be able to rearrange the photos. It depends on what you posted at the start when you uploaded it. If your first post didn't have the cover picture, just manage the images, upload it, and then rearrange.
@Bacterial_Inflammation Thank you! I'll try it! Very helpful!
Not sure why civit won't let me mark images with all three LoRAs as resources when it's a comparison between them (I enabled advanced mode), but shrug emoji.
Regarding the rank difference, you could redim either LoRA using SVD to avoid the 32/64 dim mix. Also, did you drop the x_embedder.proj.1 keys from the cosmos2.5 LoRA? The shape is wrong for anima, which is based on cosmos2 (2048x72 vs 2048x68).
Yes, I did not route the x_embedder, as routing it caused a (harmless) error in comfyui. It gets skipped if it's included; I tested both ways. I was able to assign the t_embedder to dmd2, though.
The way I did the merges was by building a routing map as a CSV and then having a script build the new LoRA from the pieces of the others. I also cast everything to fp16.
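For anyone curious what a routing-map merge like that can look like, here is a minimal sketch. It is not my actual script: the CSV columns (`prefix`, `source`), the key names, and the helper names are hypothetical, and plain strings stand in for tensors so it runs without any ML libraries. In a real script the state dicts would be loaded from the LoRA files and `cast` would convert each tensor to fp16.

```python
import csv
import io

def load_routing(csv_text):
    """Parse rows of 'prefix,source' into a {module_prefix: lora_name} dict."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["prefix"]: row["source"] for row in reader}

def build_merged(routing, loras, cast=lambda tensor: tensor):
    """Copy every key under each routed prefix from the LoRA it is routed to.

    loras: {lora_name: {key: tensor}}. Each module takes both its up and
    down weights from one source, so a rank32 and a rank64 module can sit
    side by side in the merged file without any shape conflict.
    """
    merged = {}
    for prefix, source in routing.items():
        for key, tensor in loras[source].items():
            # The trailing dot stops "blocks.1" from matching "blocks.10".
            if key.startswith(prefix + "."):
                merged[key] = cast(tensor)
    return merged

# Hypothetical routing map and key names, with strings in place of tensors.
routing_csv = (
    "prefix,source\n"
    "blocks.0.attn.output_proj,dmd2\n"
    "blocks.0.mlp,turbo_0.2\n"
)
loras = {
    "dmd2": {
        "blocks.0.attn.output_proj.lora_down.weight": "dmd2 down (rank64)",
        "blocks.0.attn.output_proj.lora_up.weight": "dmd2 up (rank64)",
        "blocks.0.mlp.lora_down.weight": "dmd2 mlp down",
    },
    "turbo_0.2": {
        "blocks.0.attn.output_proj.lora_down.weight": "turbo down",
        "blocks.0.mlp.lora_down.weight": "turbo mlp down (rank32)",
        "blocks.0.mlp.lora_up.weight": "turbo mlp up (rank32)",
    },
}
merged = build_merged(load_routing(routing_csv), loras)
for key in sorted(merged):
    print(key, "<-", merged[key])
```

Any module whose prefix is missing from the CSV is simply left out of the merged LoRA, which is also how a key can be skipped on purpose.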
I may experiment with redim. It's something I considered, but I was pretty satisfied with the result I got without it. Because the functions are separated out by LoRA, I believe the ranks are not fighting each other, so dmd2 at rank 64 doesn't have the outsized impact it would have with a normal binary mask.
I'm not certain, but I believe keeping it mixed rank for the output projection routing has more information than otherwise.
For the LLM adapter blocks, technically I assigned turbo 0.1 to those, but as there is no change between turbo 0.1 and 0.2 for the LLM adapters, that's basically just semantics.
I tried about 6 different techniques for choosing which layers to route, and the three I shared here had the best results. Even doing the 3 way avg diff ratio mask, dmd2 won something like 55% of the layers (likely because of the rank64).
Redimming the dmd2 to rank32 and doing a new avg diff ratio would be an avenue for exploration.
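As a sketch of what that redim step involves (not something I have run on these LoRAs): multiply the up and down matrices into the full weight update, take a truncated SVD, and split the kept singular values between the two new matrices. The function name and matrix sizes are made up, NumPy is assumed, and the `alpha / rank` scaling is an assumption based on the common LoRA convention.

```python
import numpy as np

def resize_lora_pair(up, down, new_rank, alpha=None):
    """Re-express one LoRA module (up @ down) at a lower rank via SVD.

    up: (out, rank), down: (rank, in). Assumes the effective update is
    up @ down * (alpha / rank), with scale 1.0 when no alpha is stored.
    Returns (new_up, new_down, new_alpha).
    """
    rank = down.shape[0]
    scale = 1.0 if alpha is None else alpha / rank
    delta = (up @ down) * scale  # full-size weight update, (out, in)

    # Truncated SVD gives the best rank-limited approximation of delta.
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    u, s, vt = u[:, :new_rank], s[:new_rank], vt[:new_rank, :]

    # Split each singular value evenly between the two factors.
    root = np.sqrt(s)
    new_up = u * root              # (out, new_rank)
    new_down = root[:, None] * vt  # (new_rank, in)

    # The scale is baked in, so alpha == new_rank keeps alpha / rank at 1.
    return new_up, new_down, float(new_rank)

# Random stand-in for one rank64 module, with made-up sizes.
rng = np.random.default_rng(0)
up = rng.normal(size=(128, 64))
down = rng.normal(size=(64, 96))
new_up, new_down, new_alpha = resize_lora_pair(up, down, new_rank=32)

full = up @ down
rel_err = np.linalg.norm(full - new_up @ new_down) / np.linalg.norm(full)
print(new_up.shape, new_down.shape, new_alpha, f"relative error {rel_err:.3f}")
```

Random matrices like these lose a lot when halved in rank. How much a trained LoRA would lose depends on how quickly its singular values fall off, so the error would need to be checked per layer.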
I think binary masking may yield better results than slerping because of how sparse latent space is, which is one of the main reasons I chose it as my method.
I don't mind sharing any of my scripts if you would want them, but they're 2/3 vibe-coded and then fixed up a bit.
Thank you for your thoughtful comment.
@kai Thank you for this full follow-up and description 💜 For my own version I went for a standard weighted sum after a slerp merge of 0.1/0.2, but for other LoRAs I did try some mix and match of "blocks" to see what I could achieve. That's a great way of doing so. For redim/SVD, you can check my (bad) code here: https://huggingface.co/n-Arno/ANIMA_extract/blob/main/merge.py
