Created to complement arching orgasm scenes.
Motion includes arching, spasms, and facial expressions.
The LoRA was trained on a dataset of 20 animated crops.
Captions are a mix of manually written captions, captions generated by Ovis2-16B, and modified versions.
Sample videos were generated with Wan2_1-I2V-14B-480P_fp8_e4m3fn.
I2V reference images may work better if they are animation-style images.
Generation prompts may work better if they are written in more detail.
Sample videos were generated at intensity 1.0; how intensity relates to the amount of motion is unknown.
It is unclear how well this works with T2V or InP models, or with realistic scenes.
Description
#Command line
accelerate launch --num_cpu_threads_per_process 1 --mixed_precision bf16 wan_train_network.py --task i2v-14B --dit "wan2.1_i2v_480p_14B_fp16.safetensors" --dataset_config "example.toml" --sdpa --mixed_precision fp16 --fp8_base --fp8_scaled --optimizer_type adamw8bit --learning_rate 2e-4 --gradient_checkpointing --max_data_loader_n_workers 2 --persistent_data_loader_workers --network_module networks.lora_wan --network_dim 32 --network_alpha 1 --timestep_sampling shift --discrete_flow_shift 3.0 --max_train_epochs 120 --save_every_n_epochs 20 --seed 27 --output_dir "example\output" --output_name example --blocks_to_swap 33 --split_attn --metadata_title example --logging_dir "example\logs" --log_with tensorboard
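A few quantities can be derived from the command above. The Python sketch below is a rough reading only. It assumes each of the 20 clips yields exactly one training item per epoch, that batch_size = 1 from the dataset config applies, and that the LoRA scaling factor is network_alpha / network_dim, as is conventional in kohya-style LoRA implementations. None of this is confirmed by the description, and the helper names are hypothetical.

```python
import math


def lora_scale(network_alpha: float, network_dim: int) -> float:
    # Assumed convention: LoRA output is multiplied by alpha / dim.
    return network_alpha / network_dim


def total_steps(num_items: int, batch_size: int, epochs: int) -> int:
    # Assumes one training item per clip per epoch (no repeats).
    steps_per_epoch = math.ceil(num_items / batch_size)
    return steps_per_epoch * epochs


def save_epochs(max_epochs: int, every: int) -> list[int]:
    # Epochs at which --save_every_n_epochs would trigger a save.
    return list(range(every, max_epochs + 1, every))


if __name__ == "__main__":
    print("scale:", lora_scale(1, 32))          # --network_alpha 1, --network_dim 32
    print("steps:", total_steps(20, 1, 120))    # 20 clips, batch 1, 120 epochs
    print("saves:", save_epochs(120, 20))       # --save_every_n_epochs 20
```

With network_alpha 1 and network_dim 32 the assumed scale is small (1/32), which may be one reason the learning rate is set fairly high at 2e-4.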
#Dataset
[general]
caption_extension = ".txt"
batch_size = 1
enable_bucket = true
bucket_no_upscale = true
[[datasets]]
resolution = [384, 208]
video_directory = "example-work"
cache_directory = "example-cache"
frame_extraction = "full"
max_frames = 81
source_fps = 16.0
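The dataset values above can be sanity-checked with a little arithmetic. The Python sketch below copies the values by hand into a dict (it does not read the TOML file) and assumes that Wan-style video models expect frame counts of the form 4n+1 and that clips are trained at 16 fps. The function names are hypothetical.

```python
# Values copied by hand from the dataset config above.
dataset = {"resolution": (384, 208), "max_frames": 81, "source_fps": 16.0}


def clip_seconds(frames: int, fps: float) -> float:
    # Length of a clip of `frames` frames played at `fps`.
    return frames / fps


def is_4n_plus_1(frames: int) -> bool:
    # Assumed constraint: frame count is 1 plus a multiple of 4.
    return frames >= 1 and (frames - 1) % 4 == 0


def aspect_ratio(width: int, height: int) -> float:
    return width / height


if __name__ == "__main__":
    w, h = dataset["resolution"]
    print("seconds:", clip_seconds(dataset["max_frames"], dataset["source_fps"]))
    print("4n+1:", is_4n_plus_1(dataset["max_frames"]))
    print("aspect:", round(aspect_ratio(w, h), 3))
```

Under these assumptions, 81 frames at 16 fps is a little over 5 seconds per clip, and 384x208 is close to a 1.85:1 widescreen frame.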