~Magic Wand for HunyuanVideo (and now with experimental MagicWan2.1)~
Heya! I've updated the Wan version with my much better training settings that I've developed! It should be /much/ better now and I improved the captions to support "wide shot/medium shot/medium close up/close up" like my other models do. From my tests the overall quality, motion and everything is improved significantly. If you have any issues with blurred faces or face out of frame please add "blurred face" or "face out of frame" to your negative! Enjoy! Training was exactly like my WanNipplePlay model: https://civarchive.com/models/1590451/both-hands-sensual-nipple-play-selfother?modelVersionId=1799759
Experimental MagicWan: Latest version "MagicWan" for Wan2.1 is up! It's experimental and my first attempt at training a LoRA for Wan. I used the same settings and dataset I'd used for 1.5 Hunyuan but lower res. By 3600 steps it was COOKED, but epoch 56 around 2000 steps seems usable though it can definitely be improved! Feedback is welcome and I will definitely be attempting to improve it. Oh and please let me know in the comments if there is any interest in a version for the 1.3B model, I don't use it personally but I'd be happy to train a version for it if it's desired!
MagicWand 1.5: New version is up! Featuring much better quality overall, it was trained on a 25% larger dataset at 8e-5 for 3600 steps. It can prompt for corded/cordless (specify either before the color but note that non-white corded models are very limited in dataset and in porn in general so ymmv there) and a bit better for the motion of the vibrator(can append either "vigorously rubbing it up and down" or "grinding herself against it/she is grinding herself against it" to the trigger, results may be seed dependent for grinding especially). For instance:
"a nude young woman is reclining on a bed while her girlfriend uses a cordless white MgcWndVb to stimulate her vagina, vigorously rubbing it up and down. She has messy blonde hair and large breasts. Her girlfriend has red hair and is wearing black lingerie. Their faces are close together and they are looking directly into each other's eyes as they share this intense moment of intimacy"
It should also handle prompts for couples better. I noticed that "her vagina is wet" doesn't work as well as it did in 1.0 though and instead... kinda makes her squirt a bit sometimes? More testing is needed there.
MagicWand 1.0:
Good vibrations are coming! Hey all, this is my second LoRA for HunyuanVideo and it turned out /really/ well in my opinion. It produces videos of women either using a magic wand style vibrator on themselves or having one used on them by another man/woman. It was trained at Rank 16 with blurred faces(captioned "blurred out face") so it should be highly compatible with your character LoRA. It learned the motion well enough you can often see the woman's vagina undulating with the vibrations of the wand!
Captions/Prompting:
"a nude woman is reclining on a kitchen chair with her legs spread while a man standing behind her uses a purple MgcWndVb to stimulate her vagina, rubbing it up and down. She has dark brown hair, small breasts, and her vagina is wet. He has gray hair and is wearing a dark gray shirt. Behind them is a large bed with a black and white painting hanging above it and on the left appears to be a sliding glass door and another painting"
"a nude woman with dark hair is using a black MgcWndVb to stimulate her vagina. She arches her back and bends forward as she begins having an orgasm. A tattoo is visible on her hip and her bellybutton is pierced."
The trigger should be "using a {black/white/purple/pink} MgcWndVb to stimulate her vagina." Several of the data points also showed the woman wet with arousal and were tagged "Her vagina is wet" so that's promptable too, as is the motion of the wand to a degree. It should be capable of creating solo female, female/female, and male/female scenes. The dataset was quite varied and included various positions, ages of women, ethnicities, aspect ratios etc so try it out I hoped to make it versatile!
Training notes:
I've learned that high resolutions are not needed for training Hunyuan (Sauce: https://civarchive.com/articles/11942/training-a-lora-the-right-way), at least for motion LoRAs. This LoRA was trained on 15 videos that were preprocessed to six seconds long at 24 fps. A simple Python script ran YOLOv8x-face detection and then applied a heavy Gaussian blur to the detected faces to keep the LoRA face-agnostic (captioned accordingly). The clips were then VAE encoded to a combination of 424x240@129f and 640x360@41f and trained in about 12 hours on my 4070 Ti Super with Musubi Tuner. LR was 1.2e-4 with a LoraPlus multiplier of 4 for 2400 steps, using the CAME optimizer and a constant-with-warmup scheduler with 100 warmup steps.
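The clip and bucket numbers above can be sanity-checked with a bit of arithmetic. This is a minimal sketch, not the author's preprocessing script: the helper names are hypothetical, and the "4n+1 frames, 4x temporal compression" rule is an assumption about how the HunyuanVideo VAE handles time, not something stated in these notes.

```python
# Sketch only: helper names are hypothetical, not taken from Musubi Tuner.
# Assumption: the video VAE compresses time 4x and expects 4n+1 frames.

CLIP_SECONDS = 6
FPS = 24

# (width, height, frames) buckets quoted in the training notes
BUCKETS = [(424, 240, 129), (640, 360, 41)]


def source_frames(seconds: int, fps: int) -> int:
    """Frames available in one preprocessed clip."""
    return seconds * fps


def fits_vae(frames: int) -> bool:
    """True if the frame count has the assumed 4n+1 form."""
    return frames >= 1 and (frames - 1) % 4 == 0


def latent_frames(frames: int) -> int:
    """Latent frame count under the assumed 4x temporal compression."""
    return (frames - 1) // 4 + 1


def pixel_volume(width: int, height: int, frames: int) -> int:
    """Total pixels per sample, a rough proxy for activation memory."""
    return width * height * frames


if __name__ == "__main__":
    available = source_frames(CLIP_SECONDS, FPS)
    for w, h, f in BUCKETS:
        # bucket, 4n+1 check, fits in the clip, latent frames, pixel volume
        print(w, h, f, fits_vae(f), f <= available,
              latent_frames(f), pixel_volume(w, h, f))
```

Both buckets fit inside a six second clip at 24 fps, and the higher resolution bucket uses far fewer frames, presumably to keep the per-sample pixel volume (and so VRAM use) in the same ballpark as the long low-res bucket.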
Final Notes:
Ever used one of these things yourself or had one used on you? If not, you should try! It will break your mind in the very best way 😉
Oh also, I've got a few utility scripts I use when creating video datasets, for things like chunking up videos, normalizing frame rate, blurring faces, etc. Nothing special, don't expect too much; they are just simple CLI affairs, but if you are like me and use a largely CLI workflow, they could be helpful: https://github.com/Sarania/videoprocessingscripts They've been written and tested on Linux but should work on Windows too. They depend on Python 3, ffmpeg, opencv, and ultralytics. Feel free to use them or not; I just thought they might be useful, especially the YOLO blurring one!
Description
Date: 2025-03-02T07:59:49
Title: MagicWand
Resolution: 1280x720
Architecture: hunyuan-video/lora
Network Dim/Rank: 16.0
Alpha: 1.0
dtype: BF16
Module: networks.lora : {'loraplus_lr_ratio': '4'}
Learning Rate (LR): 0.00012
Optimizer: came_pytorch.CAME.CAME(weight_decay=0.01,eps=(1e-30, 1e-16),betas=(0.9, 0.999, 0.9999))
Scheduler: constant_with_warmup
Warmup steps: 100
Epoch: 40
Batches per epoch: 60
Gradient accumulation steps: 1
Timestep sampling: Shift
Discrete flow shift: 7.0
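To make the metadata easier to read, here is a rough sketch of what those numbers imply for the run. It only mirrors the values listed above and is not Musubi Tuner's actual code: the function names are made up for illustration, the linear warmup ramp is an assumption about how constant_with_warmup behaves, and the "B matrices get ratio x LR" line reflects my understanding of LoRA+.

```python
# Sketch only: mirrors the metadata values, not Musubi Tuner's internals.
# Function names are hypothetical.

BASE_LR = 1.2e-4        # Learning Rate (LR): 0.00012
WARMUP_STEPS = 100      # Warmup steps: 100
EPOCHS = 40             # Epoch: 40
BATCHES_PER_EPOCH = 60  # Batches per epoch: 60
GRAD_ACCUM = 1          # Gradient accumulation steps: 1
RANK = 16               # Network Dim/Rank: 16.0
ALPHA = 1.0             # Alpha: 1.0
LORAPLUS_RATIO = 4      # loraplus_lr_ratio: '4'


def total_steps() -> int:
    """Optimizer steps for the whole run."""
    return EPOCHS * BATCHES_PER_EPOCH // GRAD_ACCUM


def lr_at(step: int) -> float:
    """Assumed constant-with-warmup shape: linear ramp, then flat."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    return BASE_LR


def lora_b_lr(step: int) -> float:
    """Assumed LoRA+ behavior: B matrices train at ratio x the base LR."""
    return lr_at(step) * LORAPLUS_RATIO


def lora_scale() -> float:
    """Standard LoRA scaling (alpha / rank) on the low-rank update."""
    return ALPHA / RANK


if __name__ == "__main__":
    print(total_steps(), lr_at(50), lr_at(2399), lora_b_lr(2399), lora_scale())
```

40 epochs of 60 batches gives the 2400 steps quoted in the training notes, and an alpha of 1 at rank 16 means the LoRA update is scaled down quite a bit, which is one reason a relatively high learning rate can be workable.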
Comments (15)
works pretty well! Good job!
with that said, I'm not so sure resolution doesn't matter, I'm getting lower quality video from this one compared to others... like more grainy...
resolution matters, if he's inputting 640 and you're running at 720, it would make sense.
I'm basing that on my own data and the data of others: https://civitai.com/articles/11942/training-a-lora-the-right-way (in fact they recommend maxing out at 256x256) For instance this model of mine was trained at a combination of 480x272, 848x480, 1280x720 but is not objectively more detailed: https://civitai.com/models/1152160
All that said, I can include some higher resolution buckets in my revision and see if that helps what you are seeing, but to go above 848x480 requires me to cut them really short(16GB VRAM)
Edit: Also, many if not most of the motion LoRA on Civit were trained at similar or lower res as this one, especially those trained with diffusion-pipe because it doesn't support as much offloading/advanced memory stuff and video activation memory is huge. LoRA trained on images don't have this problem though. I use Musubi to get the max I can out of 16GB (I at least have 64GB sysram) but there still are limits unfortunately
@blyss from my own testing this LoRa looks no different than others, maybe he's just getting unlucky with bad resolutions
could be workflows... one thing that's a tiny bit annoying about HV is that I feel like some LORA work way better in some workflows and others in others. One relatively new one does great at low resolutions and upscales stunningly. Another one just zooms in crazy close at the same low resolutions, so upscaling is useless because the contents of the frame are just 'off'.
What workflow are you using or should I just drag/drop one of your vids?
Or... if I may... Does this:
https://civitai.com/images/61143146
not look washed out and even like more blurry than this
https://civitai.com/images/60955122
Not cherry picked... just two poster vids from two different LORAs
anyway! LOVE the wand concept, not trying to be a d!ck
@az420 No you're fine, I don't mind the feedback or discussion! I do agree that one looks a bit washed out but I think that's because of the workflow I used in creating the demo ones. I basically did a "HiresFix" where I generated at 960x544 or equivalent, then used 4x-NMKD-Siax-200k to upscale, then lanczos down to 1280x720, then vid2vid for 40% denoise with the same prompt/LoRA/etc, then a final hires step to finish at 1920x1080. Usually this works really well for me to create the best quality I can achieve, but occasionally the ESRGAN based upscaling steps cause some color grading/washed out issues and I think that's what happened here. If I look at the 544x960 original, her skin color is healthier and less washed out.
As far as workflow, it's not present in my samples because TBH my workflow has a lot of personal autistic flair and stuff I'm not willing to share lol, also it's based on a custom version of HunyuanVideoWrapper with a few small extra features. But it's just your basic HVW workflow, nothing crazy. I do 50 steps usually with a flow shift of about 9 and guidance between 8 and 10.
I've noticed that LoRA I've trained on only images and hence at a higher resolution do tend to produce a bit of a sharper output, though. So for my next run of this LoRA, I'm gonna include some high resolution images along with the lower res videos and see if we can boost the quality a bit without killing the training time or using too much VRAM.
@blyss that workflow sounds complicated! :)
I'm liking this one quite a bit: https://civitai.com/models/1184655/a-flexible-hunyuan-workflow?modelVersionId=1352771
fast first iteration and great quality on the upscale (most of the time)
@az420 I think I made it sound more complicated than it is lol. The actual workflow I built has group sections: T2V, V2V, ESRGAN based upscale, and MMAudio (definitely check it out if you haven't; it takes video and a text prompt as input and produces synchronized audio as output!). My process is kind of manual because I like to refine each step optimally!
Meanwhile I trained another iteration of this last night, gonna see how it turned out this morning!
Anyone who hasn't used a wand needs to buy one.
Anyone who has used one should learn how to tie one:
Some may consider this a basic survival skill
Yes! So yummy 🥰
Cool! Thanks for the training info!
I've tried using Musubi with 12GB VRAM, but have not had concepts based on images come out at all. I suspect it is a captioning issue. Tried blurring faces but got nowhere. Any more advice?
Without knowing more about your settings and dataset, it's hard to say. I would suggest posting here: https://github.com/kohya-ss/musubi-tuner/issues with your training command and a note on what you're trying to achieve (a character, a style, or something else). I'm active over there a lot too, so maybe I can help you there!