Zendaya Maree Stoermer Coleman is an American actress and singer. She was born on September 1, 1996, in Oakland, California.
Fuel my GPU's caffeine addiction! (Ko-fi)
https://ko-fi.com/cyberaimania
Trained locally on an RTX 4090 with diffusion-pipe, on top of the Wan 2.1 T2V-14B model, using 49 publicly accessible images.
Trigger: zendj111
Example prompts:
--
(realistic, candid shot), medium shot, zendj111 waiting at a bus stop, leaning against the shelter, looking down at her phone then briefly up and sighing softly, wearing jeans and a t-shirt, slightly overcast day, normal street background, subtle body movement.
--
(vlog style phone footage, kitchen counter), close-up on zendj111's face as she takes a bite of food she just prepared, ((eyes widen slightly in reaction to the taste)), maybe a small nod or shake of the head, looking at the camera sharing her reaction, slightly messy kitchen background, natural window lighting mixed with kitchen lights.
--
(video call footage, webcam perspective), close-up, zendj111 sees the camera activate, gives a natural, slightly hesitant wave and a warm smile, says "Hi" (no audio needed, just mouth movement), typical webcam lighting from monitor.
This is my second LoRA, so I would REALLY appreciate any comments with suggestions, feedback, advice, or even negative opinions.
Check out my other LoRAs: Claire Forlani
https://civarchive.com/models/1465482/claire-forlani-or-wan21-t2v-14b
Enjoy!
Comments (9)
Cool, what resolution, steps, sampler, etc. did you use for your samples?
Could you make a LoRA of Hayley Atwell as well? This one works pretty well.
It will be ready tomorrow afternoon.
720 x 480, 30 steps, Euler sampler.
I really appreciate that you are posting your training datasets with your models!
I really like your LoRA. I've just started learning about LoRA training, and you sharing your training data has been a great help. Would you mind sharing the steps you took to train it? I noticed you set the learning rate (LR) between 4e-5 and 3e-5 with a dataset that has a repeat rate of 12. Did you use this value from the beginning to the end, or did you use different values initially? Any information you could share would be greatly appreciated. Thank you very much.
I'm happy to share a bit about the process for this one. You're right about the learning rate (LR) – for the main training run, I aimed for 4e-5 (which is 0.00004).
Regarding your question about whether it was constant:
Main Training Phase: Yes, once the training got going properly, that 4e-5 LR was used consistently right through to the end. I didn't manually change it mid-way.
Initial Warmup: There's a standard setting called warmup_steps (I used 100 for this). During those very first 100 steps, the learning rate gradually increases from a very low value up to the target 4e-5. This helps the model start learning more gently before hitting the full speed LR. So, technically it does change right at the beginning, but that's an automatic part of the process.
Backup LR: The mention of 3e-5 was more of a fallback plan. Sometimes, if the training becomes unstable (you might see errors like loss=nan in the logs), lowering the LR slightly (like to 3e-5) can help stabilize it. Thankfully, in this case, 4e-5 seemed to work okay from the start (after the warmup).
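To make the warmup behaviour above concrete, here is a minimal Python sketch of how the learning rate could evolve over the first steps. The target of 4e-5 and the 100 warmup steps come from the reply; the linear shape of the ramp, the starting value of target/100, and the helper name `lr_at_step` are assumptions made for illustration and are not taken from the training tool itself.

```python
TARGET_LR = 4e-5     # target learning rate quoted in the reply
WARMUP_STEPS = 100   # warmup_steps quoted in the reply


def lr_at_step(step: int,
               target_lr: float = TARGET_LR,
               warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at a 1-indexed optimizer step.

    Assumption: the ramp is linear and starts at target_lr / warmup_steps.
    After the warmup the rate stays constant at target_lr.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    if step >= warmup_steps:
        return target_lr
    return target_lr * step / warmup_steps


if __name__ == "__main__":
    # Print a few sample points along the schedule.
    for step in (1, 25, 50, 100, 5000):
        print(f"step {step:>5}: lr = {lr_at_step(step):.2e}")
```

Under these assumptions the rate is about half the target at step 50 and reaches 4e-5 at step 100, where it stays for the rest of the run. Falling back to 3e-5 would only mean passing a different `target_lr`.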
About the num_repeats = 12: This setting works together with the number of images in the dataset (I had 120 images) and the total number of epochs (I used 20). It basically controls how many times each image is "seen" by the model within one epoch. I adjusted this number to aim for a total number of training steps (around 14,000-15,000) that usually gives good results for this kind of LoRA within a reasonable time frame (around 10 hours on my RTX 4090). So, 12 repeats * 120 images = 1440 examples processed per epoch.
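The arithmetic behind that step target can be sketched in a few lines of Python. The image count, repeats, and epochs are the figures from the reply, and gradient_accumulation_steps = 2 is the value mentioned in the settings; the batch size of 1 per GPU, the single GPU, and the function name `training_steps` are assumptions for illustration, as is counting one "step" as one optimizer update.

```python
def training_steps(num_images: int,
                   num_repeats: int,
                   epochs: int,
                   grad_accum_steps: int,
                   micro_batch_size: int = 1,  # assumption: not stated in the reply
                   num_gpus: int = 1):         # assumption: single RTX 4090
    """Return (examples_per_epoch, steps_per_epoch, total_steps).

    One step is counted as one optimizer update, i.e. after
    micro_batch_size * grad_accum_steps * num_gpus examples.
    """
    examples_per_epoch = num_images * num_repeats
    examples_per_step = micro_batch_size * grad_accum_steps * num_gpus
    steps_per_epoch = examples_per_epoch // examples_per_step
    return examples_per_epoch, steps_per_epoch, steps_per_epoch * epochs


if __name__ == "__main__":
    # Figures from the reply: 120 images, 12 repeats, 20 epochs, accumulation of 2.
    print(training_steps(120, 12, 20, grad_accum_steps=2))  # (1440, 720, 14400)
    # Raising accumulation to 4 halves the number of optimizer updates.
    print(training_steps(120, 12, 20, grad_accum_steps=4))  # (1440, 360, 7200)
```

With those assumptions, 1440 examples per epoch over 20 epochs gives 28,800 examples, or 14,400 optimizer steps at an accumulation of 2, which falls inside the 14,000-15,000 range mentioned above.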
Just a few other key settings/steps I used for context:
Base Model: Wan2.1 T2V 14B (using the FP8 version of the main transformer weights, which saves a lot of VRAM).
Tool: The training was done using the Diffusion-Pipe framework.
Resolution: Trained at 512x512.
Rank: LoRA Rank 32, like you mentioned.
Optimizer: AdamW8bitKahan (great for saving memory).
Key Optimizations: activation_checkpointing = 'unsloth' is crucial for saving VRAM on the 14B model. I also managed to run this specific one with blocks_to_swap = 0 (no transformer blocks offloaded from VRAM to system RAM), which makes it faster, together with gradient_accumulation_steps = 2. If you run into memory issues, setting blocks_to_swap = 5 and maybe increasing gradient_accumulation_steps to 4 is a safer starting point.
Hope this gives you a clearer picture! It often takes a bit of experimenting to find what works best for your specific data and hardware. Don't hesitate to tweak things and see what results you get.
Good luck with your LoRA training!
@CyberAImania That was a very clear and detailed explanation, I'll give it a shot and try to adapt it. Thank you very much!
Thank you for sharing training data!