Self Foot Worship/Smelling - CivArchive (CivitAI Archive)

I've heard a lot of people are into this kind of thing ;) Also, this is my first upload here, so I might have messed things up somewhere on the model page.

Anyway.
Trained this on the side while testing a new video captioning pipeline. I picked a niche fetish concept on purpose, since the popular territory is already saturated and I needed something the base model would actually struggle with to stress-test the captions. It turned out to push past two things Wan 2.2 I2V genuinely fumbles: bringing body parts cleanly into frame from off-screen (looking at you, detached, floating body parts), and anatomically correct complex knee and leg articulation in motion. Facial consistency and expression coherence during the motion were a secondary training focus; you can actually see her enjoying the act ;)

Trained on the high-noise expert only, at 81 frames, using a hand-picked, high-quality dataset, so motion stability is the priority. Works pretty well up to 121 frames.

Please make sure to use the VBVR LoRA with your high-noise model. Check the training dataset, which includes the captions, to get ideas for different prompts.

Example prompt:

S3LFSN1FF1, the woman is smelling the sole and toes of her foot. She is lifting her leg high, gripping the arch firmly with both hands while pressing the toes against her nose. Her head is tilting back as she is inhaling deeply into the skin of her toes, eyes closing in pure enjoyment and sexual arousal. She has short toenails. She wiggles her toes as she smells her foot.

(Important keywords are highlighted)

I highly suggest captioning the input image and prepending that caption to your prompt, to give the model a better understanding of the overall scene and the position of the limbs. For example, you can use Qwen3-VL-4B-Instruct-Abliterated with this system prompt:

You are a precise visual captioner for AI video generation. Describe what is visible in this image.

RULES:

- Describe physical state, body position, and visible body parts.
- Always state the shot framing at the start: close-up of face, medium shot, full body, etc.
- Always state which body parts are visible and which are out of frame.
- If feet are visible, describe their position relative to the body and face.
- Use present tense, active voice.
- Keep caption between 30-60 words.
- No emotions, narrative, or intent. No aesthetic commentary.

OUTPUT FORMAT:
One plain paragraph. No bullet points. No headers. No quotation marks.

EXAMPLE:
"Close-up of a woman's face from chin to forehead, lit softly from the left. Her eyes are open and lips are slightly parted. No feet or hands visible in frame. Her hair falls to one side."

EXAMPLE 2:
"Medium shot of a woman seated on a bed, framed from waist up. Her right leg is bent and lifted, with her bare foot raised near her face. The sole faces the camera, toes curled slightly. Her left arm supports the leg at the ankle. Face is visible in profile."
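
Once you have a caption, prepending it is just string assembly. Here is a minimal sketch of that step, assuming the caption simply goes in front of the regular prompt (which already starts with the trigger word). The helper name `build_prompt` is hypothetical and not part of any workflow or library; it only tidies whitespace and strips stray quotation marks the captioner may emit.

```python
def build_prompt(image_caption: str, action_prompt: str) -> str:
    """Prepend an image caption to the LoRA prompt (hypothetical helper)."""
    # Collapse line breaks and repeated spaces into single spaces.
    caption = " ".join(image_caption.split())
    # Captioners sometimes wrap the output in double quotes; drop them.
    caption = caption.strip('"').strip()
    # Make sure the caption ends as a sentence before the prompt follows.
    if caption and caption[-1] not in ".!?":
        caption += "."
    action = " ".join(action_prompt.split())
    # With an empty caption, this falls back to the plain prompt.
    return f"{caption} {action}".strip()


if __name__ == "__main__":
    caption = '"Medium shot of a woman seated on a bed, framed from waist up."'
    prompt = "S3LFSN1FF1, the woman is lifting her leg high."
    print(build_prompt(caption, prompt))
```

Whether the trigger word works better before or after the caption is not something the sketch settles; it just follows the "prepend" order described above.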

I have a lot of LoRA ideas in the pipeline but training them fast on the cloud isn't exactly cheap (especially not for me, considering where I am living). If you decide to chip in, genuinely appreciated. If not, no hard feelings, enjoy the LoRA either way.

Files

S3LFSN1FF1_high_only.safetensors

Mirrors

S3LFSN1FF1_Captions.zip

Mirrors
