This one is a bit hit or miss and I honestly don't know if it's worth uploading, but since nothing else like this seems to exist yet, I guess it's better than nothing. The success rate seems to be around 50%; the failures are mostly extra limbs or fingers in places that don't make sense.
Depending on the prompt, I've had success increasing the lora weight, or decreasing it, and I can't really figure out any rhyme or reason behind it. If you're having trouble with your prompt, try the lora at strength 1.0, 0.9, or possibly even lower.
If anyone figures out settings that seem to work more consistently, let me know.
Here's the important part of the prompt:
A man is {standing|kneeling} in front of her positioned between her legs thrusting his penis back and forth in her vagina
I've included a workflow with a dynamic prompt that I've had success with in the "Training Images" download. That's giving me a reasonable success rate, depending on how much you're willing to overlook slight flaws.
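For anyone unfamiliar with the `{a|b}` dynamic prompt syntax: each group in braces is replaced by one of its options per generation. Here is a minimal Python sketch of that idea, assuming simple non-nested groups. The function name `expand_variants` is made up for illustration and is not part of any ComfyUI node. Real dynamic prompt nodes normally pick one option at random for each run, while this sketch lists every combination so you can see what the template can produce.

```python
import itertools
import re

# Matches one {option|option|...} group. Nested groups are not handled.
GROUP = re.compile(r"\{([^{}]*)\}")


def expand_variants(template: str) -> list[str]:
    """Return every prompt produced by choosing one option per {a|b} group."""
    # With one capture group, re.split alternates: literal, group, literal, ...
    parts = GROUP.split(template)
    literals = parts[0::2]
    choices = [group.split("|") for group in parts[1::2]]

    prompts = []
    for combo in itertools.product(*choices):
        pieces = [literals[0]]
        for chosen, literal in zip(combo, literals[1:]):
            pieces.append(chosen)
            pieces.append(literal)
        prompts.append("".join(pieces))
    return prompts


if __name__ == "__main__":
    for prompt in expand_variants("A man is {standing|kneeling} in front of her"):
        print(prompt)
```

A template without any braces comes back unchanged as a single-item list.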
Comments (50)
Awww man I had such HIGH HOPES!!!! I can't get one good render. I tried different lora strengths, models, prompts, model sampling, flux guidance... and nothing....=( I hope you get to fine tune this. I'm really excited to see this work!
Same here trying my best to make it work, hope it gets tuned
No luck here either. I feel like some detail is missing from the prompt to get it to work. I tried about 30 renders with various settings and got 0 results that were even remotely close.
Guys, try double blocks. It works perfectly here, even with a character lora at 1.9 strength. It's a pretty solid lora.
Hmm, for the people that can't get it to work at all, are you using the prompts I included for each image as a starting point? Just wondering if some people aren't aware I put the prompts in there.
@dtwr434 Yes. In the workflow I am using, I tried switching from the main lora to double blocks, and I was still getting terrible results. I am now trying a more detailed prompt, which seems to help; the provided one is, I think, far too vague for the AI to understand. In fact, I've gotten similar results in the past using no lora at all with this model. Now I am getting an okay result about 25% of the time, but even in that 25% the positioning of limbs is still an issue: often an arm or a leg is missing, or a leg branches off into two legs at the knee. Overall I'd say I only get an acceptable result maybe 10% of the time, and sadly even then the face is sometimes not clear, or some other issue keeps it from being perfect. If I switched to a video-to-video workflow, chose the best outcomes, and then worked on those, I'd likely get good results eventually, but it's a lot of work. It's not a reliable lora, unless it's just a matter of figuring out a better prompt, which I suspect it is.
works for me most times with example prompts, native nodes with 16bit checkpoint
I uploaded a workflow with dynamic prompts as "Training Images". For people that aren't able to get it to work, I'm curious how that works for you. Try downloading it and dragging the json in to Comfy.
I just generated 10 videos using this workflow, and I would say only 2 of the videos it produced were truly bad. 3 were quite good. The rest were good with slight flaws. So, I'm kind of wondering if the difference in success rate people are reporting is partly due to how much weirdness people are willing to overlook. Many of them look great at a quick glance, but will have a weird looking foot or something in the background if you look really closely.
So, depending on how willing you are to overlook odd details, I'd say the success rate ended up being anywhere from 30% - 80% for me in the test batch I did.
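To make the 30% - 80% range concrete, here is a small Python tally of that 10-video batch: 2 bad, 3 quite good, and the remaining 5 good with slight flaws. The label names and the `success_rate` helper are just for illustration.

```python
from collections import Counter

# Ratings from the 10-video test batch described above.
ratings = ["bad"] * 2 + ["good"] * 3 + ["slight flaws"] * 5


def success_rate(ratings, accepted):
    """Fraction of clips whose rating is in the accepted set."""
    counts = Counter(ratings)
    return sum(counts[label] for label in accepted) / len(ratings)


# Strict: only clean clips count. Lenient: slight flaws are overlooked.
strict = success_rate(ratings, {"good"})
lenient = success_rate(ratings, {"good", "slight flaws"})
print(f"strict: {strict:.0%}, lenient: {lenient:.0%}")
# prints "strict: 30%, lenient: 80%"
```

Which set of ratings you count as a success is the whole difference between the two numbers.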
If my workflow still isn't working well at all, then I'm not sure what's going on. Other people are starting to upload stuff to the gallery, so I'd try some of those as well.
@dtwr434 yes- I'm using the 'base' prompt you've suggested, I am using double block lora loader.
@soundgate Give the workflow I uploaded a try. I use single and double blocks unless I'm also using a character lora, and then I only filter out single blocks for the character lora.
@dtwr434 nothing loads in the workflow
@Cyberai99 Sorry, not sure what you mean by nothing loads. It's a ComfyUI workflow, so you should be able to just drag and drop the .json file into your browser window to populate the workflow. If nodes are missing, you can use the comfy manager to install the missing nodes. Otherwise, I don't know how to help.
@dtwr434 Okay.... I imported your workflow and it seems to work VASTLY different than mine. I have a bunch of different nodes and shit! I think that is what is my issue. I guess it doesn't work for my setup.
@soundgate Oh, interesting. Well I'm glad it sounds like there's at least an explanation.
@dtwr434 I tried it again on my workflow, and the strength is VERY touchy... I was able to get a couple of good renders, and then started using the good renders as a reference for the latent with a denoise at about .7 to .8....
I am also using "modelsamplingSD3" and "FLUX GUIDANCE"... I'm thinking its a mix of those + the strength of your lora that's the issue.... I haven't found a good balance yet though...
why don't you leave the embedding?
Partly because everyone has their own preferred workflows at this point, so I figure the prompt is mostly what people need to know. I included them for each image.
The other reason is the metadata baked into videos stores file paths from my personal computer that I'd rather not share with random people on the internet.
I just added a workflow under the "Training Images" download if you need something to get you started.
hey thanks!
Works great. It's flexible, it can do futa + female and with some prompt experimenting it seems the motion from this lora can be used for other poses too
works great for me! I'll try to post a workflow for this.
I feel like it prefers lower resolutions... 480x320 worked pretty well. Higher resolutions 'seem' to create additional limbs/people
Best Hunyuan LoRA to date. Amazing work.
good lora- very versatile. fast movements, so i dropped fps to 12 on VHS node. Cranked flowshift up to 12 and got better results. oh, and it sometimes randomly switches to doggystyle, which isn't a bad thing.
flowshift?
This is great! Thank you!
I posted a couple of clips to the gallery which should have embedded workflows. Not my workflow, will update comment if I can remember whose it is to provide attribution.
I've been able to generate clips very consistently with this workflow and your lora!
Amazing LORA, all sorts SEX scenes coming soon.
wow very cool! works great and flexible. Actually for me it works better without using double blocks.
Do you mean passing single blocks only or do you mean just not filtering?
@playnproto266 not filtering
@WhatTheGuy I don't understand the double_blocks thing, most loras need all the blocks or they don't work correctly. Sure it seems "ok" for character loras, but anything like this just doesn't move right. Unless they are trained on specific blocks, surely all are required?
This is the main issue for me at the moment with Hunyuan, still can't get multiple loras to give a good quality output, you need to sacrifice something somewhere and it shouldn't be like that
@azeli sometimes when combining multiple loras the images become greyed out and blurry. Then double blocks helps. Of course it would be best not to need to use double blocks. But at the moment double blocks helps for this kind of problem ;)
@WhatTheGuy I don't see a solution being discussed anywhere though, 100s of loras being created but it may be that everyone is creating them incorrectly
@WhatTheGuy what does double blocks mean? I've seen it mentioned a few times but don't currently understand it and haven't been able to find it being described anywhere. Forgive my ignorance lol
@Pity_the_Foo The model is trained on single and double blocks. If you use just one Lora, you don't need to care about it. If you use multiple character Loras, you only use the single blocks and leave out the double blocks, because the character data is mostly stored in the single blocks and the double blocks just mess up the other loras. But in motion Loras the motion is stored in the double blocks, so filtering them away means losing motion data.
So basically you load motion loras completely (single and double blocks), and for character loras (or stuff that isn't trained on motion) you only use single blocks. In ComfyUI there is the node 'Hunyuan Video LoRa Loader' for that, where you can choose 'all', 'single blocks' and 'double blocks'. Hope this helps =)
@WhatTheGuy it does fill in some of the gaps in my knowledge but I'm still unclear what a single or double block actually is lol. Appreciate your time btw.
@Pity_the_Foo ahh, just realizing I swapped single and double blocks in my previous comment >_< So using double blocks for character loras is right, and using both for motion loras! Sorry. Just opened an old Hunyuan workflow and saw I messed up my comment ^^'
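As a rough picture of what that 'all' / 'single blocks' / 'double blocks' choice does, here is a Python sketch that filters a LoRA's weights by block type. It follows the corrected rule: double blocks only for character loras, everything for motion loras. The key names in the example are assumptions about how such a file might be laid out, and `filter_lora_keys` is a made-up helper, not the actual loader node's code.

```python
def filter_lora_keys(state_dict, mode="all"):
    """Keep only the LoRA tensors that belong to the chosen block type.

    mode is one of "all", "single_blocks", "double_blocks".
    """
    if mode == "all":
        return dict(state_dict)
    if mode not in ("single_blocks", "double_blocks"):
        raise ValueError(f"unknown mode: {mode!r}")
    # Assumption: the block type appears as a substring of each key.
    return {key: value for key, value in state_dict.items() if mode in key}


if __name__ == "__main__":
    # Hypothetical key names; plain strings stand in for the weight tensors.
    lora = {
        "transformer.double_blocks.0.img_attn_qkv.lora_A.weight": "tensor",
        "transformer.double_blocks.0.img_attn_qkv.lora_B.weight": "tensor",
        "transformer.single_blocks.0.linear1.lora_A.weight": "tensor",
        "transformer.single_blocks.0.linear1.lora_B.weight": "tensor",
    }
    character = filter_lora_keys(lora, "double_blocks")  # character lora
    motion = filter_lora_keys(lora, "all")               # motion lora
    print(len(character), len(motion))
```

Dropped keys simply never get applied to the model, which is why filtering a motion lora can remove the motion it was trained on.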
@WhatTheGuy Yeah I still don't get it lol... I'm just finally getting into ComfyUI although I have dabbled before but the spaghetti mess of connections is daunting enough much less all the terminology so I stuck to Auto1111 and Forge as gradio is much more straightforward.
How can a single or double block be visualized? What are they? Are they nodes, settings in nodes, two lora nodes with the same settings to "double" them up? Sorry for my ignorance but as I said, looking online for answers just lands me to comments where people who already know what single and double blocks means are discussing it and not explaining it.
@Pity_the_Foo The Hunyuan Lora is built from (I think it was) 20 single blocks and 20 double blocks. Just think of them as sticky notes with some values written on them. Single blocks have one value written on them and double blocks have 2 values written on them. A single block injects its value into the model in one place. A double block injects its values in 2 places in the model. By experimenting we found out that character loras combine much better if you use only their double blocks when stacking several of them, because the single blocks hold mostly unnecessary data which just messes up the image if you add multiple of them. And then we found out that for motion loras the single blocks are more important, so we use both blocks.
Then you have this node https://github.com/facok/ComfyUI-HunyuanVideoMultiLora where you can just select if you want to use double blocks, single blocks, or both of your lora. Just 3 options in the node.
There is also a node where all 20 single blocks and double blocks are listed in a big list where you can switch them on and off individually to experiment where exactly in the lora the important data is stored. but that's a bit overkill =D
And here is the explanation ChatGPT gave me:
In the context of Hunyuan Video Lora, the terms "single blocks" and "double blocks" typically refer to LoRA injection points—where and how the LoRA (Low-Rank Adaptation) modules are inserted into the base model architecture. Here's what those terms generally mean:
🔹 Single Blocks
A "single block" LoRA means the adaptation layer is inserted into only one location per transformer block (e.g., just into the query/key/value or feedforward part of the transformer).
It’s lightweight, with fewer parameters.
Used when minimal changes are needed or when you want to keep the model efficient.
🔹 Double Blocks
A "double block" LoRA inserts the adaptation into two components within each transformer block—typically both the attention layer and the MLP (feedforward) layer.
This offers more expressive power, allowing the LoRA to influence more parts of the model's internal reasoning.
It’s heavier in terms of parameter count and compute, but can yield better performance or finer control.
📽 In Hunyuan Video Context
In a video model like Hunyuan, which deals with temporal and spatial representations:
Single-block LoRA might adapt either spatial or temporal features.
Double-block LoRA might adapt both, improving generation quality across time and space dimensions.
If you're training or fine-tuning a LoRA for Hunyuan Video and need to choose between the two:
Go single block for lighter, faster training or exploratory experiments.
Go double block when you want stronger control or better quality outputs, especially for complex video prompts or styles.
@WhatTheGuy Thank you for the in-depth explanation. I appreciate your time. ✌
Curious if you're able to give any details on how you trained this. Did you use musubi tuner or something else? Was it all images or videos too? It's very well done, thanks for posting this!
I used the same process as all of my other loras. I'm using diffusion-pipe and trained on videos only for this one.
This is amazing - thank you!
Would love to see you make a vaginal cowgirl lora one day!
Great lora!
@dtwr434 Would be awesome if v2 was more consistent, as you say 50% or less ends up a bit wonky but it has such good potential I think you should go for it again!
Yep, but I've never once seen a masked face in the output. Must be something to do with your prompt. You can probably just ask for a specific type of facial expression and it will go away.
@dtwr434 sorry you are right I used masked as a descriptive word!
Keep up the good work, as you say a little wonky but excited for a v2 if you can ever be bothered.
I feel like all we're missing now is a good fisting lora... wink wink nudge nudge ;)
now i can go animate all my flux and pony images... generate img2vid at 12fps, interpolate at 24fps lets me get ~10 secs of video at 720x480.
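The frame math behind generating at 12 fps and interpolating to 24 fps can be sketched in Python like this. The 121-frame count is an assumed example chosen to land near 10 seconds, not a number from the comment, and the sketch assumes the interpolator inserts new frames between existing neighbours while keeping the first and last frame. Actual interpolation nodes may count frames differently.

```python
def clip_seconds(frames: int, fps: float) -> float:
    """Playback length of a clip in seconds."""
    return frames / fps


def interpolated_frames(frames: int, factor: int = 2) -> int:
    """Frame count after inserting (factor - 1) frames between each pair."""
    return (frames - 1) * factor + 1


if __name__ == "__main__":
    generated = 121  # assumed example frame count
    doubled = interpolated_frames(generated, factor=2)
    print(f"generated: {generated} frames, {clip_seconds(generated, 12):.2f} s at 12 fps")
    print(f"interpolated: {doubled} frames, {clip_seconds(doubled, 24):.2f} s at 24 fps")
```

The clip length stays about the same; only half as many frames have to be generated for it.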
works amazing with framepack
this is nice! 🙂