Trained for image2video generation, not tested on text2vid
v1.1 UPDATE
Version 1.1 is a significant improvement over v1.0. It produces the doggy style motion consistently. The LoRA was trained on vertical videos, and it has been reported that wide-aspect videos may have some issues. Please leave some feedback if you have issues with it.
Note: I noticed there can be some image artifacts that appear, usually on the ass. I think this is due to low resolution videos in my data set. I looked into improving the quality with AI upscaling, but ran into a lot of issues that ended up making the videos worse, so this will do for now.
Trigger word is POVdog. "A POV video showing a man having sex doggy style sex with a woman." and "Ass movement and bounce is emphasized" can help too.
I used aipinups69's https://civarchive.com/models/1358184/wan-21-penis-cock-dick-lora-t2v-i2v?modelVersionId=1534254 at strength 0.5 in my testing again and it worked great.
v1 -
This is my first attempt at training anything and is still a work in progress. It seems to work okay - some feedback on how to improve is always appreciated. I trained on 480p 14B, I am not sure if it works for the 720p version or not.
Trigger word is: POVdog. The phrase "A POV video showing man having sex doggy style sex with a woman." was included in most of the training data as well.
In my testing I found this works better with aipinups69's https://civarchive.com/models/1358184/wan-21-penis-cock-dick-lora-t2v-i2v?modelVersionId=1534254 at low strength (0.4 - 0.5) to keep the man bits not looking too deformed.
I need to take a closer look at my training process and data set and will hopefully improve this in the coming week. Training took about 10 hours on two 4090s, which seems a lot longer than other people have needed to train their models, so I definitely have some inefficiencies.
Let me know what y'all think!
Comments (35)
Hi, I need some clues about the training. Did you have the roughly 85 GB of Wan 2.1 I2V model files on your machine, or did you use a smaller single-file model? I see that you have to copy the Wan files and there are about 85 GB of them, which seems impossible to train on any card...
Yes, the I2V-14B-480P model is about 82.2 GB, but both of my training runs have fit into 48gb of vram without issue. To be honest I am not 100% sure why or how that works. I think it is because the training is done at bfloat16 with some transformer stuff at fp8. My training data set videos also are not full 480p/720p videos. For v1.1's data set there were 13 training videos all 3 seconds long and 240x426 and it took up about 32GB of vram during training.
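A rough sketch of the arithmetic behind that guess, assuming exactly 14 billion parameters and counting only the transformer weights. Activations, gradients, optimizer state, the text encoder, and the VAE are all ignored, so this is a lower bound and not a prediction of actual VRAM use. The helper name `weight_gb` is hypothetical and is not part of diffusion-pipe.

```python
# Back-of-the-envelope memory needed just to hold model weights
# at different numeric precisions. Assumption: 14e9 parameters.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float8": 1}


def weight_gb(num_params: float, dtype: str) -> float:
    """Return the weight footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


PARAMS_14B = 14e9  # assumed parameter count for a "14B" model

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_gb(PARAMS_14B, dtype):.0f} GB")
```

Under these assumptions the weights alone drop from about 56 GB at float32 to about 28 GB at bfloat16 and about 14 GB at float8, which is at least consistent with a run fitting in 48 GB once lower precision is used.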
@lazerblazer okay, I guess the rest goes into system RAM then? If so, I can't run it. I only have 32 GB of RAM and 24 GB of VRAM :)
@NoArtifact When training on a single gpu I believe you can enable block swap, which would put some of the model into system memory and reduce the amount of vram required. You can also do the full training run in 'float8', which also reduces the vram required significantly. Other people have posted they successfully trained on a single 4090, so I believe it is possible, I just don't have any experience with it or know the best settings for it.
@lazerblazer It's what they said on Diffusion-pipe github yes, but wasn't it for the T2V model not the I2V ?
@NoArtifact I am using diffusion-pipe and have been doing i2v only. I am not sure if there is a difference or not. I wouldn't think there would be more vram needed to train one vs the other since they are both 480p 14B models, but again not sure.
@lazerblazer Okay, anyway last time I tried it was on the little 1.3b and got crashes error logs for days :) so I'm far from testing, but it's good to know it may be possible, i'll give it a try at some point
gj, could you check wide aspect ratio? It works kinda clunky for me with weird motions. Did you train on any 16:9 horizontal videos or just 3:2 vertical ones?
All the videos in the v1.1 training data were vertical. I appreciate the feedback on it not working well on wide aspect ratios, I didn't realize that would make a difference. I will add some wide videos to my data set for the next training run.
thank you, that would be great! By the way, you said you tried to upscale your training dataset, but it made it worse, did you use latest topaz? I have one and could help with that if you want, at least i could try to do it better :-)
@NeuroFunkeR I haven't tried Topaz - but I saw it was only for Windows/Mac, I am on a Linux machine so it won't work for me. I'm not sure if I feel comfortable sharing the .mp4's I used for training - I don't know if there could be any legal/copyright issues using them for training, so I will just keep the sources unknown. I did see DaVinci Resolve Studio 19 is available for linux and has super scale which sounds like my best upscaling option, but it's $300. Seems kinda dumb to buy it just for better AI ass jiggles lol. We'll see.
@NeuroFunkeR There are probably thousands of HD porn videos released daily. Why would you use crappy videos "upscaled" with AI. Topaz doesn't interpret pixels like a generative AI. If you have shit in your source, you'll have enhanced shit in your upscale. But not something clean by magic.
FYI I created 16:9's without any trouble. Only tried a few so far but no issues
@Kiefstorm Awesome, that is good to know!
@SD_AI_2025 well, that question should be to OP not me :-)
@NeuroFunkeR Haha yes it should have been. And to answer the question for anyone else stumbling upon this, I looked into upscaling since I had already spent several hours on putting the data set together and didn't want to start over if there was an easy solution to improve the video's compression artifacts. Either way I think it turned out fine as is and I now know the impact lower quality videos will have on training output.
Seems to work fine with 720p. It's been working well, thanks
Glad it's working well and good to know that 720p works with it, thanks for the feedback!
Sorry to ask this, but could you please retrain with different keywords? The fact that I have to type "POVdoggy" or "doggy style" seems to sometimes make the woman turn into an actual dog who turns around and starts licking at the viewer. Or even if not that drastic, a lot of dog-type licking and slightly glitchy mouth action going on. Maybe something like "POVd4wg" and "sex from behind?"
Well that is certainly interesting, I haven't encountered that before. What is the cfg level you are generating at?
In a future training run it would be possible to change the trigger word, but I think the next version I will be doing a fine-tune on top of the existing LoRA, so it is kinda baked into the model for at least a little while. If it is a common issue I will make it a priority to change the trigger word.
Not sure if it matters but the trigger word is "POVdog" not POVdoggy. After the "POVdog" you don't really need to reference anything with dog or doggy again. You can use phrases like "The woman is on all fours and the man is thrusting into her from behind" or "woman is lying face down on a couch, her ass up in the air, the man's penis is penetrating the woman's pussy repeatedly.".
CFG 6. Also I mistyped, I meant POVdog and that's what's in my prompt. I have tried using it without POVdog and prompts similar to what you wrote, as well as using the LoRA at 0.8 strength, and that seems to help.
Nice work! I've been trying to do a similar lora but getting too much weirdness with the dick lol (they kinda... melt?!). Also seems like I need to train for a long time, probably because it's learning more new concepts than something like the bouncing boobs lora. Are you training with any images or just vids? I was going to try adding images next to see if that helps.
Thanks! Yeah sometimes the dick gets all melty and stretchy, it's very unsettling haha. I found that using a penis LoRA at a low setting (0.5 works well) along with the sex position LoRA at full strength will really help prevent the dicks from getting all weird. I don't think the base Wan model knows what a penis is or what its physics should be, so it needs some help.
For the training of v1.0 I had 2 images in the data set, but due to other training mistakes I made it didn't turn out great. In v1.1 I didn't include any images at all and that turned out better, but I think that was mainly due to data set improvements on the videos. So at the end of the day I still don't know enough to know if images have a significant impact when training a video LoRA or not. I would be curious if anyone else has experience with this?
When the pussy so good your dick melts haha. Yeah I tried the penis LoRA like you suggested and it does help. I know dtwr is including images in his missionary lora, and it turned out pretty good.
Good to know that others are using images successfully. I am putting together a data set for doggy style with the woman facing the camera, I'll probably throw some images in there as well to see how it turns out. I need to do some research on it - I am curious if there is a good ratio for the number of videos vs images. I still have quite a bit to learn and don't really understand what makes an optimal data set yet.
@lazerblazer yeah I don't think any of us do! In case it's helpful, dtwr's missionary lora was 11 vids (244 resolution, 32 frames), 7 images (at 800 resolution), trained at 1:1 ratio.
How to work if there are 3 girls in the image? I have all 3 moving at the same time, but I only need the central one)
It was trained on just one girl in a video, so I am guessing anything that looks like a woman's ass is going to get the movement. You can try prompting different movements for the other women, or try lowering the strength perhaps.
@lazerblazer I tried different ways until it worked
Thanks
masking maybe ?
I didn't see any hunyuan for this, which is crazy .. please if you can train it too oO
@lazerblazer please update Base Model of your lora from Wan Video to Wan Video 14B i2v 480p so your lora can appear in new wan generator
Just now seeing this, it is updated now.
Worked straight out of the gate. Very simple, easy. Great results. Well done!
great model, any plans for wan 22?
Details
Files
doggyPOV_v1_1.safetensors