1.1 update: Extended the dataset and improved the captioning based on things I've learned from making other LoRAs. Since faces are so critical to this concept, I'm not able to make a non-face-altering version like some of my others, but hopefully the increased variety in the dataset improves this at least a little over the previous version.
The "training images" download contains a simple workflow with a large dynamic prompt that seems to work pretty well.
This was trained on four different angles. Here is how to trigger them:
A woman is lying on her stomach between the legs of the viewer and performing oral sex on a man. Her head moves up and down as she sucks the penis.An overhead view of a woman kneeling between the legs of the viewer and performing oral sex on a man. She moves her head back and forth as she sucks the penis.A woman is leaning over a man positioned in between the legs of the viewer and performing oral sex on a man. Her head moves up and down as she sucks the penis.A woman is kneeling in between the legs of the viewer and performing oral sex on a man. Her head moves up and down as she sucks the penis.Description
Comments (19)
this works amazing. thanks for workflow as well
DAMN! Great job bro :)!!!
Awesome!
Always the same face unfortunately
turn down the strength to around 0.7 and/or combine it with another lora
twerk lora would be crazy
you know there is one, right?
https://civitai.com/models/1092179/twerk-dance
Should train using Sladkislivki as a reference.
How long were your video clips?
The videos themselves are 3 seconds (72 frames), but I specified 50 frames in the config file.
How many steps, frames, and fps are you setting your videos to? They're so much smoother than what I'm producing with this lora.
@crazymanjj I usually do 22 FPS, 30 steps, 81 or 101 frames. You could do 24 fps, but for some reason I just like the results of 22 better.
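For anyone comparing settings, here is a quick sketch of how those frame counts and frame rates translate into clip length. The helper name `clip_seconds` is made up for illustration and is not part of any workflow or tool mentioned here; it only divides frames by fps.

```python
def clip_seconds(frames: int, fps: float) -> float:
    """Playback length in seconds for a clip of `frames` frames shown at `fps`."""
    return frames / fps


# Settings mentioned in this thread: 81 or 101 frames at 22 or 24 fps.
for fps in (22, 24):
    for frames in (81, 101):
        print(f"{frames} frames @ {fps} fps = {clip_seconds(frames, fps):.2f} s")
```

The same frame count plays back slightly longer at 22 fps than at 24 fps, which may be part of why the two settings feel different.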
Mine gets hung up loading the text encoder....?! It downloads and the folder in \LLM is 14.9gb, but still errors out.
How did you train the LoRA, with Diffusion-Pipe or something else?
Hey! What was the resolution of the dataset for training? Also with this config, how much vram was utilized when training? Appreciate it!
480p actual video size, but specifying 244 as the resolution in the config file. I'm not exactly sure about VRAM, probably somewhere around 17 or 18 GB. The number of frames also makes a difference for VRAM usage, and I'm specifying 50 for this.
@dtwr434 I see. Does increasing the number of videos in the dataset increase VRAM usage? I'm curious why you did not use a larger dataset.
@aperire0402947 No, it just increases the amount of time it takes to train, and I'm not sure what benefit you really get. Very few videos are needed to train a movement. It's possible more videos would increase variety of movement or facial variety, but I'm not even sure about that. I feel like larger datasets just end up having the same issues anyway. But more experimentation is needed to know for sure, and it just takes so long to train each thing, so I mostly am just doing what works for now.
I keep getting this error using your workflow. 'None type object is uncallable'