A LoRA that aims to produce realistic generations involving furries.
Trained on images generated with Chroma, using CivitAI's on-site training. No regularization or anything.
Based on Illustrious; NoobAI is sort of supported.
Recommended for use with NovaFurry.
More versions are hopefully coming soon. The CivitAI generator version is biased because the training bias was never countered. A version is planned that keeps image content mostly unchanged and alters only the style.
Description
Rank 64, bias not countered, no proper regularization. Don't expect much.
Comments
I always approve of realistic furries! But why use it with NovaFurry? It was never intended to be realistic (did you train on NovaFurry?). If you want to max out realism, it would have been more beneficial to train on NovaAnimal instead, imo.
He said it was trained on Chroma, but using it with NovaAnimal or a more realistic model will probably yield less flat results.
Nice work! Just a tip: I have found that a high network rank of 512 for Wan 1.3b Fun gives a massive boost in quality, to the point that it gets close to Wan 14b. You can see examples on the new LoRA I posted. I ran many, many tests to confirm this: 128, then 256, then 512, and each jump in network rank brought a massive quality boost. Make sure you use the Automagic optimizer; it automatically adjusts the learning rate and does a very good job at it. From my extensive testing, learning rate is very, very important, and so is batch size.
Yeah, the higher the rank, the more precisely it can adjust the weights. I wish full-diff "LoRA" were available in diffusion-pipe (ComfyUI supports it; it's effectively a hypernetwork with every single change included, while LoRAs compress the changes).
It probably wouldn't be a good idea to go beyond rank 1024, for example; anything higher and you're using nearly as much memory as full-diff fine-tuning, with extra computation and lower precision.
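For anyone wondering why rank caps precision here: a LoRA stores the weight change as a low-rank product B @ A instead of the full delta, so the rank directly limits how detailed the learned change can be. A minimal PyTorch-style sketch of the idea (my own illustration, not diffusion-pipe's actual code; class and argument names are hypothetical):

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    # Frozen base layer plus a trainable low-rank update.
    # delta_W = B @ A has rank <= r, so a small r compresses the change,
    # while a "full diff" would store delta_W directly, unconstrained.
    def __init__(self, base: nn.Linear, r: int, alpha: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # only A and B are trained
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: update starts as a no-op
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

Once r reaches min(in_features, out_features), the product B @ A can represent any delta at all, so the LoRA stops compressing anything; that is the regime the rank-1024 discussion below runs into.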
@mylo1337 It's been tempting to experiment and see how 1024 compares to 512; at that point it's pretty much a "fine-tune" of the 1.3b, since the LoRA parameters will total at least 1B. I'm curious how many motion nuances it can learn from the dataset with that many trainable parameters paired with the Automagic optimizer. Larger batch sizes also seem best if you have the VRAM; a global batch size of 16 gives the best results on the current dataset. I wrote out a small LoRA training "quality guide" and thought I'd share my findings on what has worked well for me (a rough parameter-count check follows the guide).
Wan Fun 1.3b InP v1.1 high-quality LoRA training workflow
High-capacity LoRA rank: 256-512
Minimum rank is 128, and even that is often too low
Avoid ALL ambiguity in dataset captions; write descriptively and cover the actions, motion, scene composition, character descriptions, background, environment, and anatomy
Use the Automagic optimizer so the learning rate is adjusted optimally throughout the training run
Wan Fun 1.3b InP v1.1 is currently the best small model for i2v training
Use high batch sizes, 12-16 effective batch size (EBS), maybe even higher
You do not need to train at high resolution; I noticed no decrease in quality going from 480p to 256p, only quicker training and less VRAM usage
Ensure there are no low-quality or stuttering videos, as that will show up in the generations
Loop videos to an average length of 25-30 seconds
Include perfect loops of the action/motion the LoRA is intended for
Use the multiple_overlapping video clip mode
Make the perfect loops 2-4 minutes long
Aim for around ~5 of those perfect loops if your dataset is 40+ videos
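To sanity-check the "at least 1B" LoRA-parameter estimate above, here is a rough back-of-envelope calculator. The shapes are assumptions on my part (Wan 1.3b is commonly described as 30 blocks with hidden dim 1536 and FFN dim 8960, with LoRA applied to the eight attention projections and both feed-forward matrices per block); actual counts depend on which modules your config targets:

# Rough LoRA parameter count for Wan 1.3b at a given rank.
# Assumed shapes: hidden dim 1536, FFN dim 8960, 30 blocks,
# LoRA on self-attn q/k/v/o, cross-attn q/k/v/o, and both FFN matrices.
DIM, FFN, BLOCKS = 1536, 8960, 30

def lora_params(rank: int) -> int:
    attn = 8 * 2 * rank * DIM        # 8 square projections per block
    ffn = 2 * rank * (DIM + FFN)     # up and down projections
    return BLOCKS * (attn + ffn)

for r in (128, 256, 512, 1024):
    print(f"rank {r:>4}: {lora_params(r) / 1e9:.2f}B trainable params")
# Under these assumptions, rank 1024 lands around 1.4B,
# i.e. more trainable parameters than the 1.3B base model.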
@basedbase Oh yeah, I didn't account for the Wan 1.3b block dimensions. Don't go over rank 512: at rank 1024, your LoRA will probably be larger than the base model itself (and the additional parameters are going to be completely wasted).
A regular Wan block's attention weight is 1536*1536 = 2,359,296 parameters.
A rank-1024 LoRA's replacement for that attention weight is 1024*1536*2 = 3,145,728 parameters.
Wan also has feed-forward layers with shape (8960, 1536); even there you still lose a ton of potential while using nearly the same number of weights:
A regular feed-forward weight is 8960*1536 = 13,762,560 parameters.
A rank-1024 LoRA's weights for that feed-forward are (8960*1024) + (1536*1024) = 10,747,904 parameters.
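The break-even point falls out of these counts directly (a quick derivation from the numbers above, not from the thread): a rank-r LoRA on a d_out x d_in weight stores r*(d_in + d_out) parameters versus d_in*d_out for the full matrix, so it only saves memory while r < (d_in*d_out)/(d_in + d_out). For the 1536x1536 attention weights that means r < 768; for the 8960x1536 feed-forward weights, r < ~1311. Rank 1024 is past break-even on attention but not quite on the feed-forward, which is exactly what the two comparisons above show.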
512 is already too much, though at that point it's at least somewhat better than lower ranks. Going to 1024 is 100% not worth it, since it costs more than just training a full diff.
@mylo1337 In my experience, rank 512 was massively better in quality than rank 256.
@mylo1337 I will eventually try 1024 and report back on the findings. I have 48GB of VRAM to train with, so I'm not reliant on RunPod and can train locally; no issues on costs! If diffusion-pipe ever enables full fine-tuning for Wan 1.3b, I will definitely try that out on a large dataset.
@basedbase Just train the whole model at that point. It's more precise and costs less memory. With a real rank-1024 LoRA you'll have more parameters than the base model and still not full expressiveness, so full fine-tuning is both cheaper and more accurate.
I wouldn't even recommend trying to make that LoRA. Sure, it might possibly be better than 512, but it will be worse than a full fine-tune and larger. There's no reason to do it.