CivArchive

    Work in progress. This is the initial drop of 'Vivi Luvva'. This will be another series of loras similar to 'That Bitch!' and 'Ima Luvva'. It's currently at the text-embed stage; other loras will be added as they are trained/created. This is mainly a character preview. I hope you like her... she has long dark brown hair, pale porcelain skin, large brown eyes, long eyelashes, slight rouge, slim hips, long legs, and perky medium-small breasts; she is 23 years old, "vivi luvva" from Korea.

    I'll upload loras for sdxl, flux, hunyuan video, and wan2.1 video. Right now I'm just creating the training dataset.

    Notes:

    sdxl-base version: strengths 0.8-0.9; any lower loses the Asian eyes.

    wan2.2 lora tested on the wan2.2_t2v_*_noise_14B_fp8_scaled models with the Wan2.2-Lightning_T2V-A14B-4steps-lora_*_fp16 loras and the wan2.2-t2v-rapid-aio model, lora strength set to 1.0.

    hunyuan lora tested with hunyuan t2v, and it is framepack-lora ready.

    Description

    Same dataset as wan2.2.


    Comments (17)

    thejsn · Aug 6, 2025 · 1 reaction

    Thanks so much, such a great character (Wan). Hope you can do the low-noise (wan2.2) lora, and I'm looking forward to the training set. Thanks so much for sharing!

    According to the official implementation, the timestep boundary for T2V is 0.875, so when training the low model use min=0, max=875, and for the high model use min=875, max=1000. Since the I2V boundary is 0.9, the threshold is 900 for both models. I know this feature was just added to a new branch of musubi-tuner.
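    In diffusion-pipe terms (the trainer used for the configs below), the same split is written as fractional min_t/max_t rather than integer timesteps; a minimal sketch of the mapping:

    # T2V boundary 0.875, as fractions of the full 0-1000 timestep range
    # low-noise model:
    min_t = 0
    max_t = 0.875
    # high-noise model:
    min_t = 0.875
    max_t = 1.0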

    tedbiv (Author) · Aug 6, 2025

    thejsn · Aug 6, 2025

    tedbiv Oops, I meant the high lora. Thanks. I'm still experimenting, but I feel like the motion and composition are better with a trained low/high pair.

    tedbiv (Author) · Aug 6, 2025

    Added training data.

    tedbiv (Author) · Aug 6, 2025

    thejsn Not sure if I'm gonna do a high lora. I used the low lora for both the high and low noise models and it seems to work fine.

    thejsn · Aug 6, 2025

    tedbiv Thanks, may I ask your learning rate and other parameters? Are you using shift or sigmoid, and what network rank/alpha? I'm still struggling to get a perfect likeness with wan 2.2 training. Thanks again, looking forward to your next creation!

    tedbiv (Author) · Aug 6, 2025

    thejsn I'm training on diffusion-pipe. I think I can copy the contents of my .toml files into a DM.

    tedbiv (Author) · Aug 6, 2025 · 1 reaction

    thejsn wan2.2-low.toml:

    # This configuration should allow you to train Wan 14b t2v on 512x512x81 sized videos
    # (or varying aspect ratios of the same size), with 24GB VRAM.

    # change this
    output_dir = '/home/tedbiv/diffusion-pipe/training-data/output/wan2.2'
    # and this
    dataset = '/home/tedbiv/diffusion-pipe/training-data/dataset.toml'

    # training settings
    epochs = 1000
    micro_batch_size_per_gpu = 1
    pipeline_stages = 1
    gradient_accumulation_steps = 1
    gradient_clipping = 1
    warmup_steps = 10

    # eval settings
    eval_every_n_epochs = 1
    eval_before_first_step = true
    eval_micro_batch_size_per_gpu = 1
    eval_gradient_accumulation_steps = 1

    # misc settings
    save_every_n_epochs = 5
    checkpoint_every_n_epochs = 20
    checkpoint_every_n_minutes = 120
    activation_checkpointing = 'unsloth'
    partition_method = 'parameters'
    save_dtype = 'bfloat16'
    caching_batch_size = 1
    steps_per_print = 1
    video_clip_mode = 'single_beginning'
    # blocks_to_swap = 32

    [model]
    type = 'wan'
    ckpt_path = '/home/tedbiv/diffusion-pipe/Wan2.2-T2V-A14B'
    transformer_path = '/home/tedbiv/diffusion-pipe/Wan2.2-T2V-A14B/low_noise_model'
    dtype = 'bfloat16'
    transformer_dtype = 'float8'
    min_t = 0
    max_t = 0.875
    timestep_sample_method = 'logit_normal'

    [adapter]
    type = 'lora'
    rank = 32
    dtype = 'bfloat16'

    [optimizer]
    type = 'AdamW8bitKahan'
    # was 2e-5
    lr = 5e-5
    betas = [0.9, 0.99]
    weight_decay = 0.01
    stabilize = false
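    (A high-noise counterpart isn't posted in this thread; going by thejsn's boundary note above, a sketch of the only [model] lines that should change. The high_noise_model folder name follows the Wan2.2-T2V-A14B repo layout.)

    [model]
    # hypothetical high-noise variant, not tedbiv's posted config
    transformer_path = '/home/tedbiv/diffusion-pipe/Wan2.2-T2V-A14B/high_noise_model'
    min_t = 0.875
    max_t = 1.0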

    tedbiv (Author) · Aug 6, 2025

    thejsn dataset.toml:

    # Resolutions to train on, given as the side length of a square image. You can have multiple sizes here.
    # !!!WARNING!!!: this might work differently to how you think it does. Images are first grouped to aspect ratio
    # buckets, then each image is resized to ALL of the areas specified by the resolutions list. This is a way to do
    # multi-resolution training, i.e. training on multiple total pixel areas at once. Your dataset is effectively
    # duplicated as many times as the length of this list.
    # If you just want to use predetermined (width, height, frames) size buckets, see the example cosmos_dataset.toml
    # file for how you can do that.
    resolutions = [640]

    # You can give resolutions as (width, height) pairs also. This doesn't do anything different, it's just
    # another way of specifying the area(s) (i.e. total number of pixels) you want to train on.
    # resolutions = [[1280, 720]]

    # Enable aspect ratio bucketing. For the different AR buckets, the final size will be such that
    # the areas match the resolutions you configured above.
    enable_ar_bucket = true

    # The aspect ratio and frame bucket settings may be specified for each [[directory]] entry as well.
    # Directory-level settings will override top-level settings.

    # Min and max aspect ratios, given as width/height ratio.
    min_ar = 0.5
    max_ar = 2.0

    # Total number of aspect ratio buckets, evenly spaced (in log space) between min_ar and max_ar.
    num_ar_buckets = 7

    # Can manually specify ar_buckets instead of using the range-style config above.
    # Each entry can be width/height ratio, or (width, height) pair. But you can't mix them, because of TOML.
    # ar_buckets = [[512, 512], [448, 576]]
    # ar_buckets = [1.0, 1.5]

    # For video training, you need to configure frame buckets (similar to aspect ratio buckets). There will always
    # be a frame bucket of 1 for images. Videos will be assigned to the longest frame bucket possible, such that the
    # video is still greater than or equal to the frame bucket length.
    # But videos are never assigned to the image frame bucket (1); if the video is very short it would just be dropped.
    frame_buckets = [1, 65]
    # If you have >24GB VRAM, or multiple GPUs and use pipeline parallelism, or lower the spatial resolution, you
    # could maybe train with longer frame buckets
    # frame_buckets = [1, 33, 65, 97]

    [[directory]]
    # Path to directory of images/videos, and corresponding caption files. The caption files should match the media
    # file name, but with a .txt extension.
    # A missing caption file will log a warning, but then just train using an empty caption.
    # path = '/home/anon/data/images/grayscale'
    path = '/home/tedbiv/diffusion-pipe/training-data/images'

    # You can do masked training, where the mask indicates which parts of the image to train on. The masking is done
    # in the loss function. The mask directory should have mask images with the same names (ignoring the extension)
    # as the training images. E.g. training image 1.jpg could have mask image 1.jpg, 1.png, etc. If a training image
    # doesn't have a corresponding mask, a warning is printed but training proceeds with no mask for that image. In
    # the mask, white means train on this, black means mask it out. Values in between black and white become a weight
    # between 0 and 1, i.e. you can use a suitable value of grey for mask weight of 0.5. In actuality, only the R
    # channel is extracted and converted to the mask weight.
    # The mask_path can point to any directory containing mask images.
    # mask_path = '/home/anon/data/images/grayscale/masks'

    # How many repeats for 1 epoch. The dataset will act like it is duplicated this many times.
    # The semantics of this are the same as sd-scripts: num_repeats=1 means one epoch is a single pass over all
    # examples (no duplication).
    num_repeats = 1

    # Example of overriding some settings, and using ar_buckets to directly specify ARs.
    # ar_buckets = [[448, 576]]
    # resolutions = [[448, 576]]
    # frame_buckets = [1]

    # You can list multiple directories.
    # [[directory]]
    # path = '/home/anon/data/images/something_else'
    # num_repeats = 5
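    (For reference, the images directory above just pairs each media file with a same-named .txt caption, per the comments in the config; these file names are hypothetical:)

    # /home/tedbiv/diffusion-pipe/training-data/images/
    #   0001.png
    #   0001.txt     # caption, e.g. "v1v1l2vv4, long dark brown hair, ..."
    #   clip01.mp4   # needs >= 65 frames, or frame_buckets = [1, 65] drops it
    #   clip01.txt   # caption for the video clip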

    tedbiv (Author) · Aug 6, 2025 · 1 reaction

    I'm currently training 'grinder' on wan2.2. It's my first motion lora. How it turns out will determine whether I need a high-noise version.

    thejsn · Aug 7, 2025

    tedbiv I think you will definitely get better results on a motion lora with the high noise. High noise is the composition and motion; low is the detail.

    tedbiv (Author) · Aug 7, 2025

    thejsn Maybe... I just finished training my grinder lora on low noise. While I can see the grinding motion, it's not great. I may try training it on high noise next. I just wish they would get a workflow to generate decent videos out to at least 10 secs. Gonna move back over to framepack for a while.

    tedbiv (Author) · Aug 7, 2025 · 1 reaction

    It did train quickly; it only took about 2-3 hours.

    thejsn · Aug 7, 2025

    tedbiv I'm doing 8 seconds pretty consistently with wan 2.2; 129 frames is my target. High noise runs 10 of 20 steps at CFG 3.5 (no fast lora); the low noise pass runs 6 of 12 steps (with fast lora) at CFG 1. This yields great results. I don't recommend using the fast lora on the high pass, as that handles composition and motion. Although I think a new version of lightning was just released.
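    (Read as a two-pass schedule; the 16 fps rate is Wan's default, and "6 of 12" is taken here as the final six steps. Both are assumptions, and the sampler wiring is omitted.)

    # thejsn's recipe, summarized
    # target length: 129 frames ~= 8 s at 16 fps
    # pass 1, high-noise model: steps 0-10 of 20, CFG 3.5, no fast/lightning lora
    # pass 2, low-noise model:  steps 6-12 of 12, CFG 1.0, fast/lightning lora enabled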

    tedbiv (Author) · Aug 7, 2025

    thejsn Thx, I'll try that. I just finished training the high-noise version of grinder.

    thejsn · Aug 10, 2025

    tedbiv (Author) · Aug 10, 2025

    thejsn Nice write-up.

    LORA · Hunyuan Video

    Details

    Downloads: 246
    Platform: CivitAI
    Platform Status: Available
    Created: 8/6/2025
    Updated: 5/4/2026
    Deleted: -
    Trigger Words: v1v1l2vv4

    Files

    v1v1l2vv4-hunyuan-lora-v1.0.safetensors
