CivArchive
    LTX 2/2.3 [I2V] NSFW (+furry) - Multi purpose sex lora - v2.0 LTX2.3 step 36000
    NSFW

    About the merge versions

    The merge versions combine both versions of the LoRA (v1 and v2) into a single, larger LoRA, with a different strength applied to each.

    I originally tuned the v1 x0.4, v2 x1.0 merge for nvfp4 LTX 2.3. After further testing, however, I wouldn't recommend using this version at full strength with fp8. (PS: if you're having issues with desaturation like in the examples on v2 and the first merge, try using the fp8 or a GGUF version instead.)

    The other version, v1 x0.5, v2 x0.7, was a sweet spot I found when running the fp8 model. It adds a good amount of dynamic motion and doesn't have too many issues. It also seems to be better at prompt following, but it may be a bit less automatic.

    Technical explanation: all LoRA weights were scaled by their strengths, then concatenated along the rank dimension, and the alphas were multiplied by 2 (no extra scaling math is needed since both LoRAs have the same rank, so doubling alpha keeps the alpha/rank ratio unchanged). This allows the 2 LoRAs to be merged into a single, larger LoRA that has the same effect as using the 2 LoRAs together. Unlike merging by merge-and-extract, this method is lossless and should produce nearly identical results to using the 2 models together.
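
    The rank-concatenation merge described above can be sketched with tiny matrices. This is a minimal, dependency-free illustration and not the author's actual script: the function names are hypothetical, it assumes both LoRAs share the same rank and alpha, and it assumes the strength is baked into the down matrix (the description does not say which matrix was scaled).

```python
import random

def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def lora_delta(down, up, alpha, strength=1.0):
    """Weight update of one LoRA: strength * (alpha / rank) * up @ down."""
    rank = len(down)  # down is (rank x d_in), up is (d_out x rank)
    scale = strength * alpha / rank
    return [[scale * v for v in row] for row in matmul(up, down)]

def merge_by_rank_concat(lora_a, lora_b, strength_a, strength_b):
    """Merge two same-rank, same-alpha LoRAs by stacking them along the rank axis."""
    (down_a, up_a, alpha_a), (down_b, up_b, alpha_b) = lora_a, lora_b
    assert len(down_a) == len(down_b) and alpha_a == alpha_b
    # Assumption: bake each strength into the down matrix (scaling up works too).
    down = ([[strength_a * v for v in row] for row in down_a]
            + [[strength_b * v for v in row] for row in down_b])
    # Concatenate the up matrices column-wise so every row spans both ranks.
    up = [row_a + row_b for row_a, row_b in zip(up_a, up_b)]
    # The rank doubled, so alpha doubles too and alpha / rank stays the same.
    return down, up, 2 * alpha_a

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
rank, d_in, d_out, alpha = 4, 6, 5, 4.0  # toy sizes, not the real LoRA's
v1 = (random_matrix(rank, d_in, rng), random_matrix(d_out, rank, rng), alpha)
v2 = (random_matrix(rank, d_in, rng), random_matrix(d_out, rank, rng), alpha)

merged_down, merged_up, merged_alpha = merge_by_rank_concat(v1, v2, 0.5, 0.7)
merged = lora_delta(merged_down, merged_up, merged_alpha)

# Reference: both LoRAs applied separately at strengths 0.5 and 0.7.
delta_a = lora_delta(*v1, strength=0.5)
delta_b = lora_delta(*v2, strength=0.7)
max_error = max(
    abs(m - (a + b))
    for row_m, row_a, row_b in zip(merged, delta_a, delta_b)
    for m, a, b in zip(row_m, row_a, row_b)
)
print("largest difference between merged and stacked update:", max_error)
```

    The only difference between the merged update and the two stacked updates is floating-point rounding, which is why the method is described as lossless.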

    About the LTX 2.3 version

    The new version has been retrained from scratch. I trained for significantly more steps with a lower learning rate. The dataset was captioned by my nsfwvision v3 model, which was given some additional information for some of the videos. The captioner was given the video clips at 1 fps, so if you want to indicate a timestamp in your prompt, write either "one second into the video" or "on the second frame". Describing events in order should also work decently well.
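
    Since the captioner saw the clips at 1 fps, a timestamp and a frame number are two ways of naming the same moment: frame 1 is the start of the clip and frame 2 is one second in. A small sketch of that mapping follows. The helper names are hypothetical, and the numeral wording ("1 second", "frame 2") is an assumption, since the captioner's exact phrasing is only known from the two examples above.

```python
def frame_at_one_fps(seconds: float) -> int:
    """1-based index of the 1 fps frame covering a timestamp (frame 1 is t = 0 s)."""
    if seconds < 0:
        raise ValueError("timestamp must be non-negative")
    return int(seconds) + 1

def timestamp_phrases(seconds: int) -> tuple:
    """Return the two equivalent ways to point at a moment in the prompt."""
    unit = "second" if seconds == 1 else "seconds"
    return (
        f"{seconds} {unit} into the video",
        f"on frame {frame_at_one_fps(seconds)}",
    )

for t in (1, 3):
    print(timestamp_phrases(t))
```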

    The model is far from perfect. I wouldn't say it's always better than the previous version, though it may be a bit better at prompt understanding. Also, using the distil LoRA at low steps usually gets you less motion. I still need to figure out a better workflow to fix some of the noise issues when using CFG, but the motion tends to be significantly better with CFG than without, so I would recommend using it if you've got a good workflow.

    None of the outputs during training had the issues I had when running the model without distil, so the cause is probably on the user (workflow) side rather than in the LoRA itself.

    About T2V

    T2V still isn't great. It may be better than it was in the previous version, but you need to give extremely detailed prompts: describe the framing, the camera's movement or lack thereof, and the location of the characters (including POV if it's a POV shot). It's very finicky, so I2V will pretty much always be easier.

    Original info

    A multi-purpose LoRA for NSFW content, primarily intended for anthro (furry) characters, though as usual it will likely work with regular human characters as well. It is a successor to the Wan furry LoRAs.

    This lora should be capable of producing both furry and non-furry NSFW content with audio.

    The showcase vids are all image to video. At least 1280x720 is recommended for high quality results. Lower resolutions (like 640x360) will still work, but may give lower quality. Showcase vids are mostly 640x360, but some are 1280x720. The black bars are due to resizing: my input images were in 2:3 or 3:2 aspect, and going to 16:9 would have stretched them, so I used padding instead.
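
    The padding approach mentioned above comes down to a little arithmetic: scale the image until it fits inside the 16:9 frame, then fill the leftover space with bars. A minimal sketch follows. The function name and the centered placement are assumptions, and the example input sizes are made up.

```python
def fit_with_padding(src_w, src_h, dst_w=1280, dst_h=720):
    """Scale an image to fit inside the target frame without stretching,
    then report the bar sizes needed to center it."""
    scale = min(dst_w / src_w, dst_h / src_h)  # the tighter axis decides the scale
    new_w = round(src_w * scale)
    new_h = round(src_h * scale)
    pad_x = dst_w - new_w
    pad_y = dst_h - new_h
    return {
        "size": (new_w, new_h),
        "left": pad_x // 2,
        "right": pad_x - pad_x // 2,
        "top": pad_y // 2,
        "bottom": pad_y - pad_y // 2,
    }

# A 2:3 portrait image and a 3:2 landscape image placed into 1280x720.
print(fit_with_padding(1000, 1500))
print(fit_with_padding(1500, 1000))
```

    Both aspect ratios end up with bars on the left and right only, because in each case the height is the limiting dimension for a 16:9 target.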

    Examples are generated with nvfp4 dev model with distil lora, using my uncalibrated nvfp4 text encoder.

    Supported styles

    Supports 2d, 3d and realistic styles for image to video. Text to video is largely untested but likely not going to be great.

    Keywords

    Keywords such as "anthro", "furry", and "anthropomorphic" can be used to specify

    (Written before training finished)

    Not good for T2V, use I2V

    I2V:

    I2V is capable of various poses, perspectives and actions. The characters can still talk, but I do not recommend making a character attempt to talk during oral. For moaning during oral, prompt for "muffled moaning".

    Foley:

    LTX 2 can be used to create foley audio, meaning audio added to an existing video, and this LoRA works very well for this. [Workflow]

    Text encoder info

    The idea that abliterated gemma will produce better results than standard gemma as a text encoder is a myth. Abliterated models are lobotomized to forget about refusals, but this also kills other knowledge about banned concepts in the process. Do not use abliterated gemma unless you don't care about the quality of your outputs.

    Additionally, since LTX 2 itself isn't truly censored, what it picks up from the text encoder is unaffected by Gemma's refusal behavior, so the outputs will be perfectly fine and retain all knowledge.

    Don't believe me? Prompt Gemma to say "fuck" or other vulgar words, and it will refuse. Now ask LTX 2 to make a character say "fuck", and it will work perfectly fine, because LTX 2 still has all the information it needs to use your prompt. In short, with LTX 2, don't use abliterated Gemma or any finetunes that aren't made for LTX 2.

    Lora info

    This LoRA was trained on a dataset of more than 200 videos with varied content (2D, 3D and, for human content, IRL footage), mixing anthro and human videos. The captions were generated by an LLM from a still frame, then corrected and slightly expanded. Most videos in the dataset include sound.

    The LoRA is rank 64, affects the full attention and feed-forward parts of the network, and was trained with sound enabled.

    The videos were preprocessed to various aspect ratio buckets at every matching increment of 25 frames. The videos are up to 20 seconds long, and training was done at various framerates: if a video had a framerate greater than 25 fps, it was lowered to 25 fps; if it had a framerate lower than 25 fps, it was kept.
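
    The framerate rule above is simple to express in code. The sketch below is not the official trainer's preprocessing: the function names are hypothetical, and `frame_bucket` reflects only one possible reading of "every matching increment of 25 frames" (rounding a clip down to a multiple of 25 frames), which the description does not confirm.

```python
MAX_TRAINING_FPS = 25  # from the description: faster clips were lowered to 25 fps

def training_fps(source_fps: float) -> float:
    """Clips above 25 fps are lowered to 25 fps; slower clips keep their rate."""
    return min(source_fps, MAX_TRAINING_FPS)

def resampled_frame_count(duration_s: float, source_fps: float) -> int:
    """Number of whole frames a clip has once the framerate rule is applied."""
    return int(duration_s * training_fps(source_fps))

def frame_bucket(frame_count: int, step: int = 25) -> int:
    """Assumed bucketing: round down to the nearest multiple of `step` frames."""
    return (frame_count // step) * step

# A 4 s clip at 30 fps and a 5.5 s clip at 12 fps.
for duration, fps in ((4.0, 30), (5.5, 12)):
    frames = resampled_frame_count(duration, fps)
    print(fps, "->", training_fps(fps), "fps,", frames, "frames, bucket", frame_bucket(frames))
```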

    Trained using the official ltx 2 trainer.

    Want to support future training?

    If you want to support me financially to train more models, feel free to send me a code for runpod credit. I don't have any other donation options available right now.

    Description

    Retrained from scratch for ltx 2.3, 5x lower learning rate, 2 gradient accumulation

    Comments (21)

    mycore
    Mar 15, 2026 · 7 reactions
    CivitAI

    finally new update, and LTX 2.3,

    9832676
    Mar 15, 2026

    can you let me know how my caption tool compares to yours? i find that in most cases i can take the caption and copy it back into LTX-2.3 and get the same/very similar video back
    might be the key to text to video loras?

    mylo1337
    Author
    Mar 15, 2026 · 1 reaction

    I think the key to text-to-video LoRAs is probably to just have a prompt de-enhancer that simplifies prompts. The simplified prompts can then be added as alternative captions in the dataset, so the model would know what it's expected to guess. LTX 2 has very little NSFW knowledge by default though, much less than Wan or Hunyuan.

    redlittlerabbit
    Mar 15, 2026 · 1 reaction

    I'm glad you're contributing. I'm just wondering why LTX2.3 has a good furry lora before a sex one hahaha

    mylo1337
    Author
    Mar 15, 2026 · 5 reactions

    I'd assume it's because training is very pricey: this LoRA trained for almost 4 days and cost around 150 dollars. And spending 150 dollars on something you aren't sure will work is pretty risky. I had some plans to train with the Muon optimizer and all, but I haven't finished writing the trainer, and afaik Muon isn't really supported in any trainers that support LTX 2/2.3.

    pocketpie
    Mar 15, 2026

    @mylo1337 wow I thought LTX was cheaper to train on than wan, like $5-30 for a rental. what service did you use?

    MisticRain69
    Mar 15, 2026

    @mylo1337 Oh shit yea. My lora was at least $40 and the runpod troubleshooting the first run was 2hr. Not to mention you need a rtx pro 6000 to not OOM.

    mylo1337
    Author
    Mar 15, 2026 · 2 reactions

    @pocketpie I rented an RTX Pro 6000 for almost 4 days. If the LoRA was for a basic concept it would be cheaper, since you wouldn't need as many steps, but LTX 2 LoRAs are expensive to train because it's a 22B model, and GPUs with under 80 GB of VRAM have caused issues for me before. With Wan, around 24 GB is often enough. (And with Wan 2.2 you can do 2 separate training runs, one for each model.)

    MilitAI
    Mar 16, 2026 · 1 reaction

    check penile praxis

    redlittlerabbit
    Mar 16, 2026

    @MilitAI Checked. Sex is always a bit slow and awkward looking. Thanks though

    JaimyB
    Mar 15, 2026 · 1 reaction

    There is something I always wonder when a lora is posted: what does the dataset look like lol. Great job on this one, 36000 steps man, hope you've got the RTX 6000 Pro at home :D

    MisticRain69
    Mar 15, 2026

    These NSFW general loras are the only ones ive tried that actually generalize well. The other ones if you make like an anime woman will give them super long eyelashes and uncanny eyelids. Prob cause its trained specifically for realistic gens but I like mylos since it generalizes to realistic and animated.

    JaimyB
    Mar 15, 2026

    @MisticRain69 I imagine the prompt had to be very important at so high steps, Human can easily finish with a tail in the ass :-p

    crombobular
    Mar 16, 2026 · 3 reactions

    having a hard time getting characters to talk with the 2.3 version. either text appears or they say nothing. using the distill model

    mylo1337
    Author
    Mar 16, 2026

    For me I usually put something like 'the woman says "text goes here"'. That works pretty much every time for me.

    crombobular
    Mar 16, 2026

    @mylo1337 yeah, that's pretty much what i write too. that aside, the lora works really well and is fun to use

    crombobular
    Mar 16, 2026

    @mylo1337 it's possible that ltx2.3 requires even more prompting than ltx2. i'll enhance the prompts and see how it turns out. with the ltx2 version i could just copy paste the simple prompts from the examples and then add a "the woman says:" etc, but that is fully not working with 2.3

    mylo1337
    Author
    Mar 16, 2026

    @crombobular When I used the prompt enhancer the "enhanced" prompt completely got rid of things like speech. I thought it was the gen at first but when I checked the enhanced prompt it had none of the speech I put in the original prompt. So if you're using a prompt enhancer, it might be removing your speech text.

    crombobular
    Mar 16, 2026

    @mylo1337 nono, i'm not using that node. unrelated but i can't find a model that doesn't just refuse fully just for mentioning anything sexual so i gave up bothering with it. it either refuses or returns the same prompt, so i just use a completely different llm with the ltx2 sysprompt (minus the safety bullshit)

    crombobular
    Mar 16, 2026

    alright yeah it was a prompt skill issue. for anyone having the same issue, do not blindly copy the prompts from the examples. a simple "she moves up and down" suffices completely and lets you prompt for dialogue

    crombobular
    Mar 16, 2026

    @mylo1337 i'm using the new merge version and it seems good. i get amazing results even with just "the woman moves up and down" and nothing else

    LORA
    LTXV 2.3

    Details

    Downloads: 3,045
    Platform: CivitAI
    Platform Status: Available
    Created: 3/15/2026
    Updated: 5/4/2026
    Deleted: -