About the merge versions
The merge versions combine both versions of the LoRA into a single, larger LoRA, with differing strengths.
I originally tuned the v1 ×0.4, v2 ×1.0 merge for nvfp4 LTX 2.3; however, after further testing, I wouldn't recommend using this version at full strength with fp8. (PS: if you're having desaturation issues like in the examples on v2 and the first merge, try using the fp8 or a GGUF version instead.)
The other version, v1 ×0.5, v2 ×0.7, was a sweet spot I found when running the fp8 model. It adds a good amount of dynamic motion and doesn't have too many issues. It also seems to be better at prompt following, but it may be a bit less automatic.
Technical explanation: All LoRA weights were scaled, then concatenated along the rank dimension, and the alphas were multiplied by 2 (no extra scaling math is needed since both LoRAs have the same rank). This merges the two LoRAs into a single, larger LoRA that has the same effect as using the two LoRAs together. Unlike merging via weight merging and extraction, this method is lossless and should produce nearly identical results to using the two models together.
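The merge described above can be sketched in a few lines of NumPy (a minimal illustration with hypothetical names, not the actual merge script). Each LoRA contributes delta_W = (alpha / rank) * B @ A; baking the strengths into the down matrices and concatenating along the rank dimension reproduces the weighted sum exactly.

```python
import numpy as np

def merge_loras(A1, B1, alpha1, s1, A2, B2, alpha2, s2):
    """Losslessly merge two equal-rank LoRAs into one wider LoRA.

    A: (rank, d_in) down matrix, B: (d_out, rank) up matrix.
    Strengths s1/s2 are baked into the down matrices, then the
    pair is concatenated along the rank dimension. With equal
    ranks and alphas, doubling the alpha keeps the per-LoRA
    (alpha / rank) scale factor unchanged.
    """
    r = A1.shape[0]
    assert A2.shape[0] == r and alpha1 == alpha2, "equal rank/alpha assumed"
    A = np.concatenate([s1 * A1, s2 * A2], axis=0)  # (2r, d_in)
    B = np.concatenate([B1, B2], axis=1)            # (d_out, 2r)
    return A, B, 2 * alpha1                         # new rank is 2r

# sanity check: merged delta equals the weighted sum of the originals
rng = np.random.default_rng(0)
r, d_in, d_out, alpha = 4, 8, 8, 4.0
A1, B1 = rng.normal(size=(r, d_in)), rng.normal(size=(d_out, r))
A2, B2 = rng.normal(size=(r, d_in)), rng.normal(size=(d_out, r))
s1, s2 = 0.5, 0.7

A, B, new_alpha = merge_loras(A1, B1, alpha, s1, A2, B2, alpha, s2)
merged = (new_alpha / A.shape[0]) * B @ A
separate = s1 * (alpha / r) * B1 @ A1 + s2 * (alpha / r) * B2 @ A2
assert np.allclose(merged, separate)
```

Because each LoRA's contribution scales as alpha/rank, and both the rank and alpha double, each original LoRA keeps exactly the scale it had on its own (times its strength), which is why this kind of merge is lossless.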
About the LTX 2.3 version
The new version has been retrained from scratch: I trained for significantly more steps with a lower learning rate, and the dataset was captioned by my nsfwvision v3 model, with some additional information provided for some of the videos. The captioner was given the video clips at 1 fps, so if you want to indicate a timestamp in your prompt, use either "one second into the video" or "on the second frame"; describing events in order should work decently well.
The model is far from perfect. I wouldn't say it's always better than the previous version, though it's maybe a bit better at prompt understanding. Also, using the distil lora at low steps usually gets you less motion, but I still need to figure out a better workflow to fix some of the noise issues when using CFG. Motion tends to be significantly better with CFG than without, so I'd recommend using it if you've got a good workflow.
None of the outputs during training had the issues I had when running the model without distil, so it's probably a user issue.
About T2V
T2V still isn't great. It may be better than it was in the previous version, but you need to give extremely detailed prompts: describe the framing, the camera's movement or lack thereof, and the locations of the characters (including POV if it's a POV shot). It's very finicky, so I2V will pretty much always be easier.
Original info
A multi-purpose lora for NSFW content, primarily intended for anthro (furry) characters, but as usual it'll likely work with regular human characters as well. A successor to the Wan furry loras.
This lora should be capable of producing both furry and non-furry NSFW content with audio.
The showcase vids are all image to video. At least 1280x720 is recommended for high-quality results; lower resolutions (like 640x360) will still work, but may be lower quality. Showcase vids are mostly 640x360, but some are 1280x720. Black bars are due to resizing: my input images were in 2:3 or 3:2 aspect, which would stretch when going to 16:9, so I used padding instead.
Examples are generated with nvfp4 dev model with distil lora, using my uncalibrated nvfp4 text encoder.
Supported styles
Supports 2d, 3d and realistic styles for image to video. Text to video is largely untested but likely not going to be great.
Keywords
Keywords such as "anthro", "furry", and "anthropomorphic" can be used to specify the character style.
(Written before training finished)
Not good for T2V, use I2V
I2V:
I2V is capable of various poses, perspectives and actions. Characters can still talk (but I do not recommend making a character attempt to talk during oral; for moaning during oral, prompt for "muffled moaning").
Foley:
LTX 2 can be used to create foley audio, meaning audio added to an existing video, and this lora will work very well for that. [Workflow]
Text encoder info
The idea that abliterated gemma will produce better results than standard gemma as a text encoder is a myth. Abliterated models are lobotomized to forget about refusals, but this also kills other knowledge about banned concepts in the process. Do not use abliterated gemma unless you don't care about the quality of your outputs.
Additionally, since ltx 2 isn't truly censored, the information it picks up from the text encoder ignores censorship info, therefore the outputs will be perfectly fine and retain all knowledge.
Don't believe me? Prompt Gemma to say "fuck" or other vulgar words; it will refuse. Now ask LTX 2 to make a character say "fuck": this works perfectly fine, because LTX 2 still has all the information it needs to use your prompt. In short, don't use abliterated Gemma, or any finetunes that aren't made for LTX 2, with LTX 2.
Lora info
This lora was trained on a dataset of >200 videos with varied content (2d, 3d and, for human content, irl), mixing anthro and human videos. Captions were generated by an LLM from a still frame, then corrected and slightly expanded. Most videos in the dataset include sound.
The lora is rank 64, affecting the full attention + feed-forward parts of the network, and was trained with sound enabled.
The videos were preprocessed into various aspect-ratio buckets at every matching increment of 25 frames. The videos are up to 20 seconds long, and training was done with various framerates: if a video had a framerate greater than 25 fps, it was lowered to 25 fps; if the framerate was lower than 25 fps, it was kept.
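As a rough sketch, the frame-rate and length handling described above might look like this (a hypothetical helper, not the actual preprocessing code):

```python
def bucket_video(fps: float, n_frames: int,
                 target_fps: float = 25.0,
                 bucket_step: int = 25,
                 max_seconds: float = 20.0):
    """Mimic the described preprocessing: fps above 25 is lowered
    to 25, lower fps is kept; clips are capped at 20 seconds and
    truncated to the nearest multiple of 25 frames."""
    if fps > target_fps:
        # drop frames to resample down to the target framerate
        n_frames = int(n_frames * target_fps / fps)
        fps = target_fps
    n_frames = min(n_frames, int(max_seconds * fps))    # 20 s cap
    n_frames = (n_frames // bucket_step) * bucket_step  # 25-frame buckets
    return fps, n_frames

print(bucket_video(30.0, 300))  # 30 fps clip resampled to 25 fps -> (25.0, 250)
print(bucket_video(16.0, 200))  # 16 fps clip kept at 16 fps      -> (16.0, 200)
```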
Trained using the official ltx 2 trainer.
Want to support future training?
If you want to support me financially to train more models, feel free to send me a code for runpod credit. I don't have any other donation options available right now.
Description
Initial release, i2v capable, not good for t2v
FAQ
Comments (97)
Text to video doesn't work at all.
Correct, that's why I do not recommend text to video for the current version. Maybe it'll work better in a later checkpoint, otherwise, only use for image to video.
Training text-to-video loras for such varied concepts usually requires very large datasets and a lot of training; in most cases it's better to train the whole model instead of a lora, as that can capture more detail. No promises that text to video will work with this lora in the future either, but for now image to video works pretty well.
@mylo1337 Well, thank you for trying. Can't wait for a general nsfw lora to come out though.
do i need to stick directly and completely to your prompt, or can i change things up a bit?
You can change your prompt. I used the same prompt in most examples because prompt encoding is very slow for me since it's running on CPU, so I just wrote a generic prompt that would work. But you can prompt for other things as well.
@mylo1337 thank you.
https://civitai.com/models/2086389/uncanny-photorealism-chroma
Specifically converted to nvfp4, but I haven't uploaded that model. The nvfp4 I made is hit or miss sometimes; the actual model is really good.
@mylo1337 +1 for chroma, AIO versions are good, too
@happylittleteapot Pretty excited for Zeta Chroma (https://huggingface.co/lodestones/Zeta-Chroma) tbh. I haven't really seen a lot of details about it aside from that it's based on Z-Image; I think the architecture is a bit different though, since I'm unable to load it in Comfy. It's still early in training anyway, I think, but I'm excited to see where that goes.
Hello there, great work! Do you have any intentions of enhancing it for T2V?
(CivitAI put my reply to a different comment here?)
Not yet, I think I'll need to make a good captioner before I can get to that point, 200 videos is pretty good for image to video, but I feel like I still need more for text to video to really work.
@mylo1337 It's strange. I don't know what's required because I don't make loras but I swear I've seen people train models for Wan with 30 videos
@Gradasho yeah most loras focus on a single subject, so that can be trained pretty easily. With this lora however there are a lot of concepts involved. The design of loras also makes them more optimal for a single subject.
With a multi purpose lora you need multiple videos of each concept, it will take a lot of training for the model to understand what you want. Technically finetuning would work better here, but that has significantly higher hardware requirements.
But both the small dataset (relative to usual finetuning) with imperfect captions and the fact that it's a lora are largely holding it back and making t2v hard.
There are some tricks to train with the memory usage of a lora but adjusting the full model, but I've only seen them used in LLM trainer tools so far.
Looks great, but having some trouble. Made sure all files are in the right folders and have the right names, but Comfyui keeps doing the infinite "Reconnecting" on LTXVAVTextEncoderLoader. I would love to use your models, but I've never used anything other than Hunyuan Video. Any advice?
Are you on hdd? LTX needs a big page file so hdd load times are unbearably long
@vesola3327205 Nah, nvme. I've had this "Reconnecting" issue before. Basically just a crash. Thanx for the advice tho.
@N3RDGURL Then it could be that the page size is not enough. I had to set mine manually on my ssd; letting Windows do it was also giving me a crash on TE load (my entire Comfy crashed and I had to restart).
@vesola3327205 That sounds promising. My paging file is currently 32442 MB. How much would you recommend?
@N3RDGURL I think it depends on your ram. I had 32gb, so I set mine to around 70gb before it worked. Tried 50 at first and it was still crashing, so it needs a lot.
@vesola3327205 Worth a try. I'll have to make some space, but I'll give it a try and get back to you.
@vesola3327205 You're a lifesaver! I looked everywhere and no one else even mentioned the paging file! Is there a button I can press or something that will reward you for your help?
@N3RDGURL You could tip them buzz on their profile if you have any.
I have 48gb ram and use a 96 to 192gb max pagefile. I actually got a second nvme drive just so I could increase my pagefile without relying too much on my main drive (which already doesn't have much storage left)
@mylo1337 A good idea, but this new blue buzz stuff wants me to pay real money to tip. By the way, now that I've tried it, I really like your model. This is leagues beyond Hunyuan Video!
@N3RDGURL No need for anything, glad to help. I went through the same thing; it was frustrating af figuring this out, since these days it feels like everybody has hundreds of gigs of ram in their system and doesn't see errors like these lol
it does the motions, but the video quality is really low. like, wavy and morphy, and sometimes blurry in spots.
what checkpoint do you use?
I use the nvfp4 ltx 2 dev with the distil lora for 8 steps 1 cfg with euler a + beta
Also my gens are raw gens, no refiner/latent upscale pass. It's possible that latent upscale has issues.
It's got some weirdness but is still pretty good. Try modifying the prompt to be very descriptive, since this model likes that. I think ltx2 will be able to make some good shit whenever we get all the enhancement and specialized loras like wan has. Although it still has some body-part bending/morphing (penis), the animations using a 2d image look like they are properly animated in 2d, not 3d like wan does.
@mylo1337 latent upscale destroys quality. 0.50 strength is why, 1.0 should be the default. LTX team probably set it so low so more people can generate at reasonable speeds.
@ylvlylvuyv374212 Oh yeah I've noticed, I've also seen some people say their latent upscale refined results had better lipsync quality, but I got the same quality lipsync in a single full res gen without upscale used.
Nice. Be great to have dedicated to Judy Hopps with her voice too.
This is the way to go. Instead of making 1 brazillion LoRAs for basic NSFW concepts, make a single one properly trained on all of them evenly. Obviously this one still has many flaws, but as a proof of concept in the early stages of LTX-2 LoRA expansion, it's extraordinary nonetheless! Thank you and please keep it up!
It really is great
thanks for sharing info about abliterated models. So you're saying you used the default Gemma during dataset preparation and during the training?
During latent preprocessing I used the regular Gemma 3 12b qat 4 model that ltx recommends, yeah. All the generations are also with nvfp4 Gemma from my huggingface repo (also not abliterated). In the end, abliterated Gemma knows less than the original, and since ltx 2 ignores the censorship (it doesn't sample like an llm but instead takes all processed tokens), the info is still there, uncensored.
@mylo1337 ok thanks, I had issues with my penis Lora but i used abliterated. I'll try again on the default
Rename for I2V plz
Alright, a future version could possibly support text to video, but for now I'll include i2v in the title for clarity
Decent lora -- it works a little like Hunyuan did for me at the beginning, or even AnimateDiff. Not terrible, but I could only really get the sliding effect a few times.
LTX has made me go back to WAN for so many things. LTX has the lip-sync on lock, but everything else... might not happen
It seems to work really well with human couples when combined with:
https://civitai.com/models/1445226/stomach-bulge?modelVersionId=2595456
https://civitai.com/models/2306310/ltx-2-improved-female-nudity?modelVersionId=2594914
It can do pretty much any position: cowgirl, reverse cowgirl, prone bone pov, low-angle prone bone from behind, vaginal, anal. Yeah, I'm quite pleased with the results I've been getting. This is using standard gemma, and I think LTX-2 is just way better than wan 2.2 when it comes to actually prompting the motion and positions, and it has sound, which makes it all the better.
Cool, what strengths have you tried with those combined?
@Light6969 I just set them all to 1.0, which seemed to work best, and just describe the scene and what happens, using terms like vagina, anus, buttocks and breasts. Avoid slang terms like pussy, dick or cock, because the model probably does not understand those terms. It struggles with the penis; it sometimes gets mangled, but the penis lora can help with that. However, the penis lora sometimes likes to create a penis in strange places, so maybe lower its strength.
@rogerstone442 Thanks! Yeah, I'll experiment with that; good idea with the wording. Also, yeah, the penis lora can make the action become more still; they just sort of hold the penis.
@Light6969 The problem I see with LTX-2 is that you have to have the characters already in the act, at least that's the case with the current loras we have.
@rogerstone442 Yeah, at the moment I'm using it as an extra model, not the main model, so I can use wan for certain videos, zimage or flux klein for image editing, and ltx-2 for talking and certain things. There are positives and negatives, so I'm gonna try to make the most out of the positives.
@Light6969 Yeah, that is what I'm doing. I gen the start frames with Wan, then I load that video into the LTX-2 model with certain loras enabled. It will use those frames to carry on the motion and add the audio.
So I don't like furry at all, but what you're saying is this improves sex for non-furry, right?
@ForeverNecessary737716 Yep, it's pretty universal. The dataset was primarily furry, with a smaller portion of human; for i2v this won't matter, it's not going to turn a human character into a furry or anything like that. For t2v it still knows what humans look like (although it's not good for t2v in general in its current state, at least not for sex, on its own).
can you make a cumshot lora this lora is great
First off, the lora is really good, but it still has some issues. Well, at least I have them, so to name a few:
1. It works great with the fp4/fp8/dev models, but when I tried to make it with distilled models I got nightmare fuel even after multiple generations.
2. As far as I can see, the lora was trained with videos that are 25fps, but LTX works natively at 24fps. Changing the fps from 25 to 24 can give some bad results, and combining it with other loras makes it a bit difficult. (Please correct me if I'm wrong, I'm still new with LTX.)
3. Works great when you gen it with 8 steps. If I try more like 20-30 I get horrible results and I have no idea why.
4. Works great with euler_ancestral sampler, if I switch it to any other like res_2s I get horrible results.
All in all, great work and I hope you'll make more stuff. Also, if someone can explain/correct any of the points I made I would be happy to listen.
1. All the example videos are generated using dev fp4 with distill lora at 8 steps in a single pass (2 step latent upscaler often doesn't look good with this lora, and it's not really faster that way anyway)
2. The lora was trained on videos at various frame rates; any fps over 25 was lowered. 25 fps is a supported fps for ltx 2; 24, 16 and 50 are some other values which work well, but 25 is the fps used in the training config template, for example. Because of mixed-fps training it should be less restrictive in framerates; I haven't had it break at the fps options I named earlier. Training exclusively on a single framerate could lead the model to forget how to handle other framerates (although the risk is lower with a lora).
I'm currently working on some captioning models, if that goes well I can make a much larger dataset with higher quality captions, maybe text to video will work then... I'll pre-test my captioner to caption a dataset for a smaller image model, maybe flux 2 Klein. If the anatomy stays somewhat stable then using it to caption videos should also work.
@mylo1337 i've been using a wf with a second pass, is that why my videos seem deepfried and the skin goes really ugly?
@mylo1337 Thanks for explaining nr. 2, appreciate it. As for point 1, I wasn't talking about the latent upscale but the distilled versions of the LTX checkpoint. For instance, I've noticed that the Prone Bone lora works best with the distilled models, while yours produces better results with fp4/fp8/dev, so when I try to combine the 2 loras for any reason, the end result, even though okay in terms of animation, produces some strange results like the aforementioned deep-fried skin, weird jiggling, etc. I'll do some more testing to see if it's purely model related or if there is something else that affects the end result.
@crombobular If you are using the distill model, try the 2nd pass with the distill lora at a negative (-0.4 to -0.6), it seems to help with the skin a bit. If you are on the dev model lower the distill lora to 0.6 on first and 0.4 on 2nd. Their distillation really fries the skin
@vesola3327205 yeah! i just took the wf from the examples and it looks so much better. insane
@vesola3327205 Thanks, will try it.
First things first, thank you for providing a nsfw concept lora for LTX-2 too.
I was thinking about making my own, but was surprised that someone already made one.
I finished optimising the i2v basic workflow and the results are rather...meh. Since your wan 2.1 lora was S-tier, I am 100% certain that this is just LTX-2.
Perhaps the best use for LTX-2 for now is the video-to-video workflow to add sounds. It's infinitely better than MMAudio.
WOW, THIS IS AWESOME!
Tell us if you’re planning to make one with creampies and cumshots inside, or in mouth etc etc!
please add T2V support
I would if I could. I'm working on captioning models first, currently on the 5th experiment. Teaching a vision llm nsfw concepts works, but I still have to find the right configuration and tricks to not bias it too much or have it forget stuff.
Additionally to a larger, more high quality dataset, t2v training will take a lot longer and a lot more resources, I'd like to make a t2v version some day but I think right now that's way out of my budget.
I've got various experiments I plan on doing though; they should give me insight into what I can do for faster training, what works and what doesn't. But t2v for ltx 2 is low priority now because it doesn't seem to be in my budget right now; when I manage decent t2v on smaller models, I'll start scaling up.
So I had some issues with the LORA even with the provided WF and text encoder, where the face warped badly and everything blurred badly. I did the most random combination possible and fixed it.
I use the ltx2 distilled model, threw on the distilled LORA and set the LORA to -0.4. For the audio and graphical VAE, I linked it up to a checkpoint node with the distilled model as that model has the updated fixed VAEs. This fixed 90% of the graphical issues, though occasionally the audio loses sync.
Hope this helps.
Hmm, okay I believe I've seen people recommend using the distil lora at 0.6 strength before, and -0.4 starting at distilled should have a similar effect to that.
The vae issues themselves were only in the distil model initially on release: the dev model had the release vae, but the distil model still had the beta vae. I never ran into that myself though, since I used dev with the distil lora.
@mylo1337 Oh I see that makes sense. Great LORA btw.
If I have any critiques, it's that there is occasionally the jello dick issue and male thrusting can be improved, but overall I am very happy. Thank you.
I also tried using the distilled lora at -0.4 strength on the distilled model and it does make generations less glitchy with fewer weird face shit idk why it works though. Did fixed seed tests. Still great lora mylo.
@mylo1337 I have tried to run dev with distil lora and it wont fit on my 5090. How much VRAM do you have?
@kunde2 16gb, comfy's memory management makes it work. (I also have 48gb ram and 200gb swap just in case), I run the nvfp4 version btw. (Also I made an unofficial nvfp4 version of ltx2.3, which I'll make an updated lora for at some point, haven't checked if there's an official nvfp4 yet though)
Fellatio, boobjob work very well and pretty much anything "from behind"- is fantastic.
Another S-tier Lora from mylo.
Is there any way to target specific voice tones? Sometimes i want her to have a deep voice and other times a high tone, like in anime. Is that possible?
I'm not sure about targeting voices; I didn't caption voices super specifically. You can try giving it a description, but you'll have to rely on the base model's ability to understand what a voice should sound like from a description. Accents and stuff work pretty well though, iirc.
This right here is hands down the best nsfw LoRA for LTX2, not just for furry, but for pretty much anything. It works great in i2v, and although it doesn't function well in t2v, that's not really a problem.
There's a few issues with it, namely thrusting motions not working sometimes depending on the angle, and faces being distorted, especially on human gens. Also often with penetration, if the penetrating object isn't a penis from a POV angle, it will mostly fail to make the object enter the cavity: There will be a thrusting motion, but absolutely no depth to the penetration. This is most noticeable with tentacles, things which are "disembodied", or "large insertions".
I hope you'll continue to train this one, and potentially add more stuff like extra angles and different types of penetration, and hopefully fix the "no penetration depth" issue.
But still, this is an incredible LoRA. Amazing work :)
what types of oral positions/actions are best for it? a
@Carnal_Creations Side view seems to work best for oral.
workflow?
I'm really curious why every workflow I steal has this LoRA in it. What's so special about it ? What does it do ? Thanks !
Well if they're nsfw workflows, this lora basically handles nsfw motions and also tries to improve sound. Unlike the more common type of nsfw lora which adds a single thing, this lora tries to add general understanding of penetration, moans, and some other related concepts.
@mylo1337 Bro is there any other kind of workflow ? Thanks !
@girlswithafros For sfw stuff you can use the default workflows, like the comfy official ones, the ones made by ltx, or others from people on civitai
@mylo1337 @mylo1337 What I mean was NSFW is all I do LOL.
@girlswithafros then you'll probably want to use this lora lol
@mylo1337 unrelated but is there are a v2/update planned to help with t2v?
@crombobular Yeah, but first I have some other stuff to do. I'm planning on/have started writing my own training software, because I keep running into issues, missing features, or just poor implementations. When I get that into a functional state, I'll release it and use it to train an updated lora. I've already got a plan for the training, but you probably won't see it very soon since I still have a lot of work to do. I might release an experiment I did for flux 2 Klein 9b, where I got pretty nice results using muon; it's convinced me that it is possible to create a t2v-capable lora on a low budget, I just need to use an optimizer like muon. It worked very well with flux, so I expect similar results with ltx 2.
Used your gemma uncalibrated text encoder and when I run my prompt the actor says "I'm an ethical AI assistant and cannot .... " then kinda breaks. Any ideas?
The model is the original censored model. It's good for prompt encoding (it keeps the original meanings), but not enhancing (it'll refuse when used for text generation). You could either use a different model for the enhancing, like abliterated, but still use the original model for prompt encoding. Or, although I would not recommend it since there will be some knowledge loss, you could use the abliterated Gemma for both.
Personally I'd either go with no enhancement at all, or otherwise using a different model for the enhancing, but nvfp4 regular gemma for prompt encoding.
Ok so I've tried using this in a ton of variations and settings and they all look like absolute trash. Is it CivitAI's generator being useless, or am i missing something? Maybe i need a good ComfyUI workflow, but i can't get this to work properly in CivitAI at all. Anybody got tips or a good workflow for NSFW? Thanks
Text to video is not expected to work. But if it's got issues on image to video (like being worse than no lora) there's probably an issue with civitai's generator.
Also, I am considering training a new version made for ltx 2.3. The current version does apply to ltx 2.3, since the transformer architecture is largely unchanged, but the features' meanings have shifted with the additional training, so I should probably retrain. (Not that civitai's generator supports 2.3 yet anyway.)
@mylo1337 I must be doing something wrong. Either that, or a lot of the posted generations under LTX on civitai are not being honest about their gens, or they're heavily refined post-gen. I've tried LTX2 on other providers now, as well as on a 24gb gpu in the cloud within comfyUI. In the latter case, I was able to play around with strengths, loras, shifts, steps, etc., and 9/10 times what came out was just really subpar. Nowhere even close to the quality or consistency of, say, wan 2.1. The faces were warping, the audio was glitchy, the physics sucked, and it had terrible nsfw comprehension even with the loras and the heretic text encoder.
I just don't get why i can't get anything useful out of it. Maybe it's just a crap model. idk
It looks not bad, but on real humans the faces and animation are badly distorted.
@mylo1337 I'm looking forward to a newer version that doesn't distort the image like the current version does.
@mylo1337 please do, that would be amazing
It's not the fault of the lora. I haven't used the online generator, but Comfy is also sub-optimal. The best local option right now is Wan2GP, which mimics the pipeline of the official desktop app and gives the best results.
@kunde2 Oh i had an awful experience with that PoS. Very limited UI, constant crashes, outputs were trash. I ditched it very quickly and went back to comfy.
Really good stuff!
will there be a 2.3 version?
Yes, I'm currently still collecting data since I want to use a fresh dataset.
The visuals I get with this thing are fine. So it’s direly needed.
Unfortunately, the moment I include it the people in the scene tend to narrate the stage instructions.
Doesn’t help to say, “no narration,” “no dialog.”
What do I do?
Make sure you're using regular Gemma and not an altered version such as abliterated. There might also be some words that the model misinterprets as speech; hopefully that won't happen in the next version for ltx 2.3.
An LTX 2.3 version would be amazing! Hopefully it has all the basic NSFW elements (blowjob, handjob, cowgirl and missionary - anal-vaginal).
It's been over 3 days in training right now; I'll probably end training tomorrow. T2v is better than in the previous version, although still far from perfect. I'm having some issues generally with 2.3, but they're unrelated to the lora; if the release is later than expected, it's probably because I'm still finding the right settings.
Details
Files
ltx2_nsfw+furry_lora_step_15000.safetensors