CivArchive
    SDXL LoRA train(8GB) and Checkpoint finetune(16GB) - v3.5
    NSFW

    Credit:

    This script package is from bdsqlsz.

    It uses kohya-ss/sd-scripts (github.com) as its core.

    The original script is from Akegarasu/lora-scripts: LoRA training scripts using kohya-ss's trainer, for diffusion models (github.com).

    How to use:

    0、(Windows) Give PowerShell unrestricted script access so the venv can work:

    • Open an administrator PowerShell window

    • Type Set-ExecutionPolicy Unrestricted and answer A

    • Close the admin PowerShell window
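
    For reference, a minimal PowerShell sketch of this step (the command is from the bullets above; the comments are annotation):

        # Run in an administrator PowerShell window; this lets local scripts
        # (including the venv activation script) execute.
        Set-ExecutionPolicy Unrestricted
        # Answer "A" (Yes to All) at the prompt, then close the admin window.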

    1、Unzip this anywhere you want (recommended: alongside another training program that already has a venv).

    If you are updating, just rerun install-cn-qinglong.ps1 and then enter N.

    2、Run install-cn-qinglong.ps1 on Windows (on Linux, just use the command line).

    It will automatically install the environment (if you already have a venv, just unzip over it).

    3、Put your datasets in the /input dir.

    Formatting:
    ./input/2_lora/(dataset images)
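
    A hedged sketch of setting this layout up (the folder name 2_lora means 2 repeats of a concept named "lora"; the source path is a placeholder):

        # The subfolder name is <repeats>_<name>; "2_lora" repeats each image
        # twice per epoch under the concept name "lora".
        New-Item -ItemType Directory -Path ./input/2_lora -Force
        # Copy your training images (and optional .txt captions) into it;
        # the source path below is a placeholder.
        Copy-Item C:\path\to\your\images\* ./input/2_lora/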

    4、Run tagger.ps1 to tag them automatically (on Linux, use tagger.sh).

    5、Edit your datasets if you want.

    6、Edit the train script (train_8Glora or train_16G_DB; on Linux, use train_tacanime).

    You only need to change:

    1. train_mode (lora, db, sdxl_lora, sdxl_db, cn3l)

    2. pretrained_model

    3. train_data_dir

      formatted as ./input/

    4. resolution (1024*1024 for SDXL, 576*576 for SD1.5)

    5. max and min bucket reso (SDXL: 640~1536, SD1.5: 256~1024)

    6. optimizer (SDXL: adaFactor for batch size 1, PagedAdamW8bit for batch size 4)

    Nothing else needs changing unless you know what you are doing; a sketch of these settings follows.
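
    A hedged sketch of those variables in the PowerShell style the scripts use (the exact variable names in your copy of train_8Glora.ps1 may differ; all values are illustrative):

        # Illustrative values only; check your train script for the exact variable names.
        $train_mode = "sdxl_lora"             # lora | db | sdxl_lora | sdxl_db | cn3l
        $pretrained_model = "./Stable-diffusion/your_model.safetensors"  # placeholder path
        $train_data_dir = "./input"           # datasets live in ./input/<repeats>_<name>/
        $resolution = "1024,1024"             # SDXL: 1024*1024; SD1.5: 576*576
        $min_bucket_reso = 640                # SDXL: 640~1536; SD1.5: 256~1024
        $max_bucket_reso = 1536
        $optimizer_type = "adaFactor"         # batch size 1; "PagedAdamW8bit" for batch size 4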

    If I can help you, I will be very happy~

    Support 青龍聖者@bdsqlsz on Ko-fi! ❤️ ko-fi.com/bdsqlsz

    Description

    Fixed the bitsandbytes 8bit problem.

    To update, run install-cn-qinglong.ps1 and then enter N.


    Comments (16)

    sajeas · Jul 31, 2023

    It doesn't see the images.
    I put them in
    C:\Lora\kohya_ss\input
    In the train_8Glora file, the setting is:
    $train_data_dir = "./input"
    What is the right way to set the folder paths?

    bdsqlsz
    Author
    Jul 31, 2023

    You should make a subdirectory there and put the images in it,
    named Number_name (the name can be anything).
    train_data_dir=./input/
    ./input/2_lora/your-dataset-files (such as png and txt)
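
    For example, a sketch using the paths from this thread (assuming the images currently sit directly in the input folder):

        # Create the <repeats>_<name> subfolder and move the dataset into it.
        New-Item -ItemType Directory -Path C:\Lora\kohya_ss\input\2_lora -Force
        Move-Item C:\Lora\kohya_ss\input\*.png C:\Lora\kohya_ss\input\2_lora\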

    sajeas · Jul 31, 2023

    It worked, but now it shows this error:

    [Dataset 0]

    loading image sizes.

    100%|████████████████████████████████████████████████████████████████████████████████| 21/21 [00:00<00:00, 2632.17it/s]

    make buckets

    number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む)

    bucket 0: resolution (1024, 1024), count: 2100

    mean ar error (without repeats): 0.0

    clip_skip will be unexpected / SDXL学習ではclip_skipは動作しません

    Warning: SDXL has been trained with noise_offset=0.0357 / SDXLはnoise_offset=0.0357で学習されています

    noise_offset is set to 0.0375 / noise_offsetが0.0375に設定されました

    preparing accelerator

    loading model for process 0/1

    load StableDiffusion checkpoint: ./Stable-diffusion/dreamshaperXL10_alpha2Xl10.safetensors

    building U-Net

    loading U-Net from checkpoint

    U-Net: <All keys matched successfully>

    building text encoders

    loading text encoders from checkpoint

    text encoder 1: <All keys matched successfully>

    text encoder 2: <All keys matched successfully>

    building VAE

    loading VAE from checkpoint

    VAE: <All keys matched successfully>

    Enable xformers for U-Net

    import network module: networks.lora

    [Dataset 0]

    caching latents.

    checking cache validity...

    100%|█████████████████████████████████████████████████████████████████████████████████| 21/21 [00:00<00:00, 346.91it/s]

    caching latents...

    0it [00:00, ?it/s]

    move vae and unet to cpu to save memory

    [Dataset 0]

    caching text encoder outputs.

    checking cache existence...

    100%|██████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:00<?, ?it/s]

    caching text encoder outputs...

    100%|██████████████████████████████████████████████████████████████████████████████████| 21/21 [00:02<00:00, 7.99it/s]

    move vae and unet back to original device

    create LoRA network. base dim (rank): 32, alpha: 16.0

    neuron dropout: p=None, rank dropout: p=None, module dropout: p=None

    create LoRA for Text Encoder 1:

    create LoRA for Text Encoder 2:

    create LoRA for Text Encoder: 264 modules.

    create LoRA for U-Net: 722 modules.

    enable LoRA for U-Net

    prepare optimizer, data loader etc.

    ╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮

    │ C:\Lora\kohya_ss\sd-scripts\sdxl_train_network.py:174 in <module> │

    │ │

    │ 171 │ args = train_util.read_config_from_file(args, parser) │

    │ 172 │ │

    │ 173 │ trainer = SdxlNetworkTrainer() │

    │ ❱ 174 │ trainer.train(args) │

    │ 175 │

    │ │

    │ C:\Lora\kohya_ss\sd-scripts\train_network.py:325 in train │

    │ │

    │ 322 │ │ │ ) │

    │ 323 │ │ │ trainable_params = network.prepare_optimizer_params(args.text_encoder_lr, ar │

    │ 324 │ │ │

    │ ❱ 325 │ │ optimizer_name, optimizer_args, optimizer = train_util.get_optimizer(args, train │

    │ 326 │ │ │

    │ 327 │ │ # dataloaderを準備する │

    │ 328 │ │ # DataLoaderのプロセス数:0はメインプロセスになる │

    │ │

    │ C:\Lora\kohya_ss\sd-scripts\library\train_util.py:3222 in get_optimizer │

    │ │

    │ 3219 │ │

    │ 3220 │ elif optimizer_type.endswith("8bit".lower()): │

    │ 3221 │ │ try: │

    │ ❱ 3222 │ │ │ import bitsandbytes as bnb │

    │ 3223 │ │ except ImportError: │

    │ 3224 │ │ │ raise ImportError("No bitsandbytes / bitsandbytesがインストールされていない │

    │ 3225 │

    │ │

    │ c:\Lora\kohya_ss\venv\lib\site-packages\bitsandbytes\__init__.py:6 in <module> │

    │ │

    │ 3 # This source code is licensed under the MIT license found in the │

    │ 4 # LICENSE file in the root directory of this source tree. │

    │ 5 │

    │ ❱ 6 from . import cuda_setup, utils, research │

    │ 7 from .autograd._functions import ( │

    │ 8 │ MatmulLtState, │

    │ 9 │ bmm_cublas, │

    │ │

    │ c:\Lora\kohya_ss\venv\lib\site-packages\bitsandbytes\research\__init__.py:1 in <module> │

    │ │

    │ ❱ 1 from . import nn │

    │ 2 from .autograd._functions import ( │

    │ 3 │ switchback_bnb, │

    │ 4 │ matmul_fp8_global, │

    │ │

    │ c:\Lora\kohya_ss\venv\lib\site-packages\bitsandbytes\research\nn\__init__.py:1 in <module> │

    │ │

    │ ❱ 1 from .modules import LinearFP8Mixed, LinearFP8Global │

    │ 2 │

    │ │

    │ c:\Lora\kohya_ss\venv\lib\site-packages\bitsandbytes\research\nn\modules.py:8 in <module> │

    │ │

    │ 5 from torch import Tensor, device, dtype, nn │

    │ 6 │

    │ 7 import bitsandbytes as bnb │

    │ ❱ 8 from bitsandbytes.optim import GlobalOptimManager │

    │ 9 from bitsandbytes.utils import OutlierTracer, find_outlier_dims │

    │ 10 │

    │ 11 T = TypeVar("T", bound="torch.nn.Module") │

    │ │

    │ c:\Lora\kohya_ss\venv\lib\site-packages\bitsandbytes\optim\__init__.py:6 in <module> │

    │ │

    │ 3 # This source code is licensed under the MIT license found in the │

    │ 4 # LICENSE file in the root directory of this source tree. │

    │ 5 │

    │ ❱ 6 from bitsandbytes.cextension import COMPILED_WITH_CUDA │

    │ 7 │

    │ 8 from .adagrad import Adagrad, Adagrad8bit, Adagrad32bit │

    │ 9 from .adam import Adam, Adam8bit, Adam32bit, PagedAdam, PagedAdam8bit, PagedAdam32bit │

    │ │

    │ c:\Lora\kohya_ss\venv\lib\site-packages\bitsandbytes\cextension.py:13 in <module> │

    │ │

    │ 10 │

    │ 11 setup = CUDASetup.get_instance() │

    │ 12 if setup.initialized != True: │

    │ ❱ 13 │ setup.run_cuda_setup() │

    │ 14 │

    │ 15 lib = setup.lib │

    │ 16 try: │

    │ │

    │ c:\Lora\kohya_ss\venv\lib\site-packages\bitsandbytes\cuda_setup\main.py:126 in run_cuda_setup │

    │ │

    │ 123 │ │ self.initialized = True │

    │ 124 │ │ self.cuda_setup_log = [] │

    │ 125 │ │ │

    │ ❱ 126 │ │ binary_name, cudart_path, cc, cuda_version_string = evaluate_cuda_setup() │

    │ 127 │ │ self.cudart_path = cudart_path │

    │ 128 │ │ self.cuda_available = torch.cuda.is_available() │

    │ 129 │ │ self.cc = cc │

    ╰──────────────────────────────────────────────────────────────────────────────────────────────────╯

    ValueError: too many values to unpack (expected 4)

    ╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮

    │ D:\Python\lib\runpy.py:196 in _run_module_as_main │

    │ │

    │ 193 │ main_globals = sys.modules["__main__"].__dict__ │

    │ 194 │ if alter_argv: │

    │ 195 │ │ sys.argv[0] = mod_spec.origin │

    │ ❱ 196 │ return _run_code(code, main_globals, None, │

    │ 197 │ │ │ │ │ "__main__", mod_spec) │

    │ 198 │

    │ 199 def run_module(mod_name, init_globals=None, │

    │ │

    │ D:\Python\lib\runpy.py:86 in _run_code │

    │ │

    │ 83 │ │ │ │ │ loader = loader, │

    │ 84 │ │ │ │ │ package = pkg_name, │

    │ 85 │ │ │ │ │ spec = mod_spec) │

    │ ❱ 86 │ exec(code, run_globals) │

    │ 87 │ return run_globals │

    │ 88 │

    │ 89 def _run_module_code(code, init_globals=None, │

    │ │

    │ in <module>:7 │

    │ │

    │ 4 from accelerate.commands.accelerate_cli import main │

    │ 5 if __name__ == '__main__': │

    │ 6 │ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0]) │

    │ ❱ 7 │ sys.exit(main()) │

    │ 8 │

    │ │

    │ c:\Lora\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py:45 in main │

    │ │

    │ 42 │ │ exit(1) │

    │ 43 │ │

    │ 44 │ # Run │

    │ ❱ 45 │ args.func(args) │

    │ 46 │

    │ 47 │

    │ 48 if __name__ == "__main__": │

    │ │

    │ c:\Lora\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py:918 in launch_command │

    │ │

    │ 915 │ elif defaults is not None and defaults.compute_environment == ComputeEnvironment.AMA │

    │ 916 │ │ sagemaker_launcher(defaults, args) │

    │ 917 │ else: │

    │ ❱ 918 │ │ simple_launcher(args) │

    │ 919 │

    │ 920 │

    │ 921 def main(): │

    │ │

    │ c:\Lora\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py:580 in simple_launcher │

    │ │

    │ 577 │ process.wait() │

    │ 578 │ if process.returncode != 0: │

    │ 579 │ │ if not args.quiet: │

    │ ❱ 580 │ │ │ raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) │

    │ 581 │ │ else: │

    │ 582 │ │ │ sys.exit(1) │

    │ 583 │

    ╰──────────────────────────────────────────────────────────────────────────────────────────────────╯

    CalledProcessError: Command '['c:\\Lora\\kohya_ss\\venv\\Scripts\\python.exe', './sd-scripts/sdxl_train_network.py',

    '--pretrained_model_name_or_path=./Stable-diffusion/dreamshaperXL10_alpha2Xl10.safetensors', '--output_dir=./output',

    '--logging_dir=./logs', '--resolution=1024,1024', '--max_train_epochs=40', '--learning_rate=2e-4',

    '--lr_scheduler=constant_with_warmup', '--output_name=sdxl_8glora', '--train_batch_size=1', '--save_every_n_epochs=10',

    '--save_precision=bf16', '--seed=1026', '--max_token_length=225', '--caption_extension=.txt',

    '--save_model_as=safetensors', '--vae_batch_size=4', '--xformers', '--cache_text_encoder_outputs',

    '--cache_text_encoder_outputs_to_disk', '--bucket_reso_steps=32', '--train_data_dir=./input/', '--clip_skip=2',

    '--network_dim=32', '--network_alpha=16', "--training_comment=this LoRA model created from bdsqlsz by bdsqlsz'script",

    '--persistent_data_loader_workers', '--cache_latents', '--cache_latents_to_disk', '--gradient_checkpointing',

    '--noise_offset=0.0375', '--adaptive_noise_scale=0.0375', '--optimizer_type=PagedAdamW8bit', '--optimizer_args',

    'weight_decay=0.01', '--network_train_unet_only', '--unet_lr=1e-5', '--text_encoder_lr=2e-5', '--keep_tokens=1',

    '--min_snr_gamma=5', '--lr_scheduler_num_cycles=1', '--enable_bucket', '--min_bucket_reso=640',

    '--max_bucket_reso=1536', '--full_bf16', '--mixed_precision=bf16', '--network_module=networks.lora',

    '--lr_warmup_steps=50']' returned non-zero exit status 1.

    Train finished

    bdsqlsz
    Author
    Jul 31, 2023

    @sajeas Sorry, I found that this changed in bnb 0.41.0. You can use another optimizer, such as adaFactor, instead of the 8bit ones. I will fix it soon.
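
    Until then, a hedged sketch of that workaround in the train script ($optimizer_type here mirrors the --optimizer_type flag visible in the launch command above):

        # Workaround for bnb 0.41.0: avoid the bitsandbytes 8bit optimizers.
        $optimizer_type = "adaFactor"   # instead of "PagedAdamW8bit"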

    bdsqlsz
    Author
    Jul 31, 2023

    I updated to V3.5; it should work now.

    sajeas · Jul 31, 2023

    @bdsqlsz Thank you, it seems to be working.

    Cecily_cc · Aug 3, 2023

    Is sdxl_db the train_mode to use in the 16G train script for the Checkpoint finetune (16GB)?

    bdsqlsz
    Author
    Aug 3, 2023

    Yes, it is for DB checkpoint training.

    Cecily_cc · Aug 3, 2023

    @bdsqlsz Thank you so much, I have tested it with SDXL LoRA and DB using the same parameters and datasets (high steps). Running kohya on my computer had problems, and fixing them was a hassle that would have meant expanding my disk... Thanks to 青龙大佬's script! love u.

    sajeas · Aug 3, 2023

    How do I set the learning rate to something like 0,0004?
    My PC can handle only 1 epoch, and with warmup, rates like 2e-5 or 4e-7 overtrain the LoRA.
    Do I need to change $lr_scheduler to constant?
    What would be the best setting for 1 epoch?

    bdsqlsz
    Author
    Aug 3, 2023

    The notation Xe-Y means X × 10^(-Y),

    so 0,0004 = 4e-4.

    For overfitting, you need to change
    $network_drop to 0.1~0.3,

    or set a smaller repeats number in the dir name, such as

    2_lora
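
    A sketch of those two changes (assuming $network_drop is the dropout variable in the script, as named above; the 4_lora folder is hypothetical):

        # Reduce overfitting: add network dropout and/or lower the per-image repeats.
        $network_drop = 0.1                    # try 0.1~0.3
        # Or rename the dataset folder so each image repeats fewer times per epoch
        # (the 4_lora folder here is hypothetical):
        Rename-Item ./input/4_lora 2_lora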

    sajeas · Aug 3, 2023

    @bdsqlsz Thank you

    Hex4 · Aug 6, 2023

    If you only want 5 epochs, what should you change? Have you tried your settings with 5 epochs? The results are still not very similar.

    bdsqlsz
    Author
    Aug 6, 2023

    Fewer epochs need a higher LR.

    CyberDickLang · Aug 8, 2023

    In sdxl_db mode, optimizers other than adaFactor throw an error:

    prepare optimizer, data loader etc.

    Traceback (most recent call last):

    File "E:\TRAINING\lora-scripts\sd-scripts\sdxl_train.py", line 648, in <module>

    train(args)

    File "E:\TRAINING\lora-scripts\sd-scripts\sdxl_train.py", line 251, in train

    _, _, optimizer = train_util.get_optimizer(args, trainable_params=params_to_optimize)

    File "E:\TRAINING\lora-scripts\sd-scripts\library\train_util.py", line 3222, in get_optimizer

    import bitsandbytes as bnb

    File "E:\TRAINING\lora-scripts\venv\lib\site-packages\bitsandbytes\__init__.py", line 6, in <module>

    from . import cuda_setup, utils, research

    File "E:\TRAINING\lora-scripts\venv\lib\site-packages\bitsandbytes\research\__init__.py", line 1, in <module>

    from . import nn

    File "E:\TRAINING\lora-scripts\venv\lib\site-packages\bitsandbytes\research\nn\__init__.py", line 1, in <module>

    from .modules import LinearFP8Mixed, LinearFP8Global

    File "E:\TRAINING\lora-scripts\venv\lib\site-packages\bitsandbytes\research\nn\modules.py", line 8, in <module>

    from bitsandbytes.optim import GlobalOptimManager

    File "E:\TRAINING\lora-scripts\venv\lib\site-packages\bitsandbytes\optim\__init__.py", line 6, in <module>

    from bitsandbytes.cextension import COMPILED_WITH_CUDA

    File "E:\TRAINING\lora-scripts\venv\lib\site-packages\bitsandbytes\cextension.py", line 13, in <module>

    setup.run_cuda_setup()

    File "E:\TRAINING\lora-scripts\venv\lib\site-packages\bitsandbytes\cuda_setup\main.py", line 126, in run_cuda_setup

    binary_name, cudart_path, cc, cuda_version_string = evaluate_cuda_setup()

    ValueError: too many values to unpack (expected 4)

    Traceback (most recent call last):

    File "E:\Python\Python310\lib\runpy.py", line 196, in runmodule_as_main

    return _run_code(code, main_globals, None,

    File "E:\Python\Python310\lib\runpy.py", line 86, in runcode

    exec(code, run_globals)

    File "E:\TRAINING\lora-scripts\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>

    File "E:\TRAINING\lora-scripts\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main

    args.func(args)

    File "E:\TRAINING\lora-scripts\venv\lib\site-packages\accelerate\commands\launch.py", line 918, in launch_command

    simple_launcher(args)

    File "E:\TRAINING\lora-scripts\venv\lib\site-packages\accelerate\commands\launch.py", line 580, in simple_launcher

    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)

    subprocess.CalledProcessError: Command '['E:\\TRAINING\\lora-scripts\\venv\\Scripts\\python.exe', './sd-scripts/sdxl_train.py', '--pretrained_model_name_or_path=./sd-models/stableDiffusionXLAnime_v10.safetensors', '--output_dir=./output', '--logging_dir=./logs', '--resolution=1024,1024', '--max_train_epochs=10', '--learning_rate=5e-5', '--lr_scheduler=cosine_with_restarts', '--output_name=darkPizzaXLv21anime', '--train_batch_size=1', '--save_every_n_epochs=1', '--save_precision=bf16', '--seed=1026', '--max_token_length=225', '--caption_extension=.txt', '--save_model_as=safetensors', '--vae_batch_size=4', '--xformers', '--cache_text_encoder_outputs', '--cache_text_encoder_outputs_to_disk', '--bucket_reso_steps=32', '--train_data_dir=./train/darkPizza/trainxlv2', '--clip_skip=2', '--persistent_data_loader_workers', '--cache_latents', '--cache_latents_to_disk', '--noise_offset=0.1', '--adaptive_noise_scale=0.0375', '--optimizer_type=PagedAdamW8bit', '--optimizer_args', 'weight_decay=0.01', '--keep_tokens=1', '--min_snr_gamma=5', '--lr_scheduler_num_cycles=1', '--enable_bucket', '--min_bucket_reso=640', '--max_bucket_reso=1536', '--full_bf16', '--mixed_precision=bf16', '--lr_warmup_steps=50']' returned non-zero exit status 1.

    Train finished

    bdsqlsz
    Author
    Aug 8, 2023

    bnb will release a new version soon.

    Other
    SDXL 1.0

    Details

    Downloads
    341
    Platform
    CivitAI
    Platform Status
    Available
    Created
    7/31/2023
    Updated
    5/13/2026
    Deleted
    -

    Files

    sdxlLoraTrain8GBAnd_v35.zip

    Mirrors

    CivitAI (1 mirror)

    Available On (1 platform)

    The same model published on other platforms; it may have additional downloads or version variants.