Credit:
This script package is from bdsqlsz.
It uses kohya-ss/sd-scripts (github.com) as its core.
The original script is from Akegarasu/lora-scripts: LoRA training scripts using kohya-ss's trainer, for diffusion models. (github.com)
How to use:
0、(Windows) Give unrestricted script access to PowerShell so the venv can work:
Open an administrator PowerShell window.
Type Set-ExecutionPolicy Unrestricted and answer A (Yes to All).
Close the admin PowerShell window.
1、Unzip this anywhere you want (recommended alongside another training program that already has a venv).
If you update it, just rerun install-cn-qinglong.ps1, then enter N.
2、Run install-cn-qinglong.ps1 on Windows (on Linux just use the command line).
It will automatically install the environment (if you already have a venv, just put this over it).
3、Put your datasets in the /input dir.
Formatting:
./input/2_lora/(dataset images)
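The layout above can be created with a short script. The `2` in `2_lora` is the per-epoch repeat count and `lora` is an arbitrary concept name; both are examples from this package, not required names:

```python
from pathlib import Path

# Create the kohya-style dataset layout: input/<repeats>_<name>/
# "2_lora" is just an example: 2 repeats, concept name "lora".
dataset_dir = Path("input") / "2_lora"
dataset_dir.mkdir(parents=True, exist_ok=True)

# Each image sits next to its caption .txt with the same stem,
# e.g. input/2_lora/0001.png and input/2_lora/0001.txt
print(dataset_dir.as_posix())  # input/2_lora
```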
4、Run tagger.ps1 to tag them automatically (on Linux use tagger.sh).
5、Edit your datasets if you want.
6、Edit the train script (train_8Glora or train_16G_DB; on Linux use train_tacanime).
You only need to change:
train_mode (lora, db, sdxl_lora, sdxl_db, cn3l)
pretrained_model
train_data_dir
formatted as ./input/
resolution (SDXL: 1024*1024, sd1.5: 576*576)
max and min bucket reso (SDXL: 640~1536, sd1.5: 256~1024)
optimizer (SDXL: AdaFactor for batch size 1, PagedAdamW8bit for batch size 4)
Other settings don't need changing if you don't know what they do.
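As a sanity check for these settings, the number of optimizer steps per epoch follows from the image count, the repeat count taken from the dataset folder name, and the batch size. A rough sketch (the helper is illustrative, not part of the scripts):

```python
import math

def steps_per_epoch(num_images: int, repeats: int, batch_size: int) -> int:
    """Rough estimate: each image is seen `repeats` times per epoch,
    and the repeated set is split into batches."""
    return math.ceil(num_images * repeats / batch_size)

# e.g. 21 images in a "2_lora" folder (2 repeats) at batch size 1
print(steps_per_epoch(21, 2, 1))  # 42
```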
If I can help you, I will be very happy~
Support 青龍聖者@bdsqlsz on Ko-fi! ❤️ ko-fi.com/bdsqlsz
Description
Fixed bugs with bitsandbytes above 0.35.
Comments
It doesn't see images.
I put them in
C:\Lora\kohya_ss\input
In the train_8Glora file the setting is:
$train_data_dir = "./input"
What is the right way to set up the folder paths?
You should mkdir a subdirectory there, then put the images in it,
named: Number_name (whatever name you like).
train_data_dir=./input/
./input/2_lora/your-dataset-images (such as .png and .txt files)
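The leading number in the folder name is what sets the repeat count. A sketch of how such a name splits into its two parts (`parse_dataset_dir` is a hypothetical helper, not a function from the trainer):

```python
def parse_dataset_dir(name: str) -> tuple[int, str]:
    """Split a kohya-style dataset folder name like '2_lora' into
    (repeat count, concept name). Raises ValueError if malformed."""
    repeats, _, concept = name.partition("_")
    if not repeats.isdigit() or not concept:
        raise ValueError(f"expected '<number>_<name>', got {name!r}")
    return int(repeats), concept

print(parse_dataset_dir("2_lora"))  # (2, 'lora')
```

A folder with no leading number (e.g. just `lora`) fails this check, which matches the "it doesn't see images" symptom above.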
It worked, but now it shows this error:
[Dataset 0]
loading image sizes.
100%|████████████████████████████████████████████████████████████████████████████████| 21/21 [00:00<00:00, 2632.17it/s]
make buckets
number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む)
bucket 0: resolution (1024, 1024), count: 2100
mean ar error (without repeats): 0.0
clip_skip will be unexpected / SDXL学習ではclip_skipは動作しません
Warning: SDXL has been trained with noise_offset=0.0357 / SDXLはnoise_offset=0.0357で学習されています
noise_offset is set to 0.0375 / noise_offsetが0.0375に設定されました
preparing accelerator
loading model for process 0/1
load StableDiffusion checkpoint: ./Stable-diffusion/dreamshaperXL10_alpha2Xl10.safetensors
building U-Net
loading U-Net from checkpoint
U-Net: <All keys matched successfully>
building text encoders
loading text encoders from checkpoint
text encoder 1: <All keys matched successfully>
text encoder 2: <All keys matched successfully>
building VAE
loading VAE from checkpoint
VAE: <All keys matched successfully>
Enable xformers for U-Net
import network module: networks.lora
[Dataset 0]
caching latents.
checking cache validity...
100%|█████████████████████████████████████████████████████████████████████████████████| 21/21 [00:00<00:00, 346.91it/s]
caching latents...
0it [00:00, ?it/s]
move vae and unet to cpu to save memory
[Dataset 0]
caching text encoder outputs.
checking cache existence...
100%|██████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:00<?, ?it/s]
caching text encoder outputs...
100%|██████████████████████████████████████████████████████████████████████████████████| 21/21 [00:02<00:00, 7.99it/s]
move vae and unet back to original device
create LoRA network. base dim (rank): 32, alpha: 16.0
neuron dropout: p=None, rank dropout: p=None, module dropout: p=None
create LoRA for Text Encoder 1:
create LoRA for Text Encoder 2:
create LoRA for Text Encoder: 264 modules.
create LoRA for U-Net: 722 modules.
enable LoRA for U-Net
prepare optimizer, data loader etc.
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ C:\Lora\kohya_ss\sd-scripts\sdxl_train_network.py:174 in <module> │
│ │
│ 171 │ args = train_util.read_config_from_file(args, parser) │
│ 172 │ │
│ 173 │ trainer = SdxlNetworkTrainer() │
│ ❱ 174 │ trainer.train(args) │
│ 175 │
│ │
│ C:\Lora\kohya_ss\sd-scripts\train_network.py:325 in train │
│ │
│ 322 │ │ │ ) │
│ 323 │ │ │ trainable_params = network.prepare_optimizer_params(args.text_encoder_lr, ar │
│ 324 │ │ │
│ ❱ 325 │ │ optimizer_name, optimizer_args, optimizer = train_util.get_optimizer(args, train │
│ 326 │ │ │
│ 327 │ │ # dataloaderを準備する │
│ 328 │ │ # DataLoaderのプロセス数:0はメインプロセスになる │
│ │
│ C:\Lora\kohya_ss\sd-scripts\library\train_util.py:3222 in get_optimizer │
│ │
│ 3219 │ │
│ 3220 │ elif optimizer_type.endswith("8bit".lower()): │
│ 3221 │ │ try: │
│ ❱ 3222 │ │ │ import bitsandbytes as bnb │
│ 3223 │ │ except ImportError: │
│ 3224 │ │ │ raise ImportError("No bitsandbytes / bitsandbytesがインストールされていない │
│ 3225 │
│ │
│ c:\Lora\kohya_ss\venv\lib\site-packages\bitsandbytes\__init__.py:6 in <module> │
│ │
│ 3 # This source code is licensed under the MIT license found in the │
│ 4 # LICENSE file in the root directory of this source tree. │
│ 5 │
│ ❱ 6 from . import cuda_setup, utils, research │
│ 7 from .autograd._functions import ( │
│ 8 │ MatmulLtState, │
│ 9 │ bmm_cublas, │
│ │
│ c:\Lora\kohya_ss\venv\lib\site-packages\bitsandbytes\research\__init__.py:1 in <module> │
│ │
│ ❱ 1 from . import nn │
│ 2 from .autograd._functions import ( │
│ 3 │ switchback_bnb, │
│ 4 │ matmul_fp8_global, │
│ │
│ c:\Lora\kohya_ss\venv\lib\site-packages\bitsandbytes\research\nn\__init__.py:1 in <module> │
│ │
│ ❱ 1 from .modules import LinearFP8Mixed, LinearFP8Global │
│ 2 │
│ │
│ c:\Lora\kohya_ss\venv\lib\site-packages\bitsandbytes\research\nn\modules.py:8 in <module> │
│ │
│ 5 from torch import Tensor, device, dtype, nn │
│ 6 │
│ 7 import bitsandbytes as bnb │
│ ❱ 8 from bitsandbytes.optim import GlobalOptimManager │
│ 9 from bitsandbytes.utils import OutlierTracer, find_outlier_dims │
│ 10 │
│ 11 T = TypeVar("T", bound="torch.nn.Module") │
│ │
│ c:\Lora\kohya_ss\venv\lib\site-packages\bitsandbytes\optim\__init__.py:6 in <module> │
│ │
│ 3 # This source code is licensed under the MIT license found in the │
│ 4 # LICENSE file in the root directory of this source tree. │
│ 5 │
│ ❱ 6 from bitsandbytes.cextension import COMPILED_WITH_CUDA │
│ 7 │
│ 8 from .adagrad import Adagrad, Adagrad8bit, Adagrad32bit │
│ 9 from .adam import Adam, Adam8bit, Adam32bit, PagedAdam, PagedAdam8bit, PagedAdam32bit │
│ │
│ c:\Lora\kohya_ss\venv\lib\site-packages\bitsandbytes\cextension.py:13 in <module> │
│ │
│ 10 │
│ 11 setup = CUDASetup.get_instance() │
│ 12 if setup.initialized != True: │
│ ❱ 13 │ setup.run_cuda_setup() │
│ 14 │
│ 15 lib = setup.lib │
│ 16 try: │
│ │
│ c:\Lora\kohya_ss\venv\lib\site-packages\bitsandbytes\cuda_setup\main.py:126 in run_cuda_setup │
│ │
│ 123 │ │ self.initialized = True │
│ 124 │ │ self.cuda_setup_log = [] │
│ 125 │ │ │
│ ❱ 126 │ │ binary_name, cudart_path, cc, cuda_version_string = evaluate_cuda_setup() │
│ 127 │ │ self.cudart_path = cudart_path │
│ 128 │ │ self.cuda_available = torch.cuda.is_available() │
│ 129 │ │ self.cc = cc │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: too many values to unpack (expected 4)
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ D:\Python\lib\runpy.py:196 in runmodule_as_main │
│ │
│ 193 │ main_globals = sys.modules["__main__"].__dict__ │
│ 194 │ if alter_argv: │
│ 195 │ │ sys.argv[0] = mod_spec.origin │
│ ❱ 196 │ return runcode(code, main_globals, None, │
│ 197 │ │ │ │ │ "__main__", mod_spec) │
│ 198 │
│ 199 def run_module(mod_name, init_globals=None, │
│ │
│ D:\Python\lib\runpy.py:86 in runcode │
│ │
│ 83 │ │ │ │ │ loader = loader, │
│ 84 │ │ │ │ │ package = pkg_name, │
│ 85 │ │ │ │ │ spec = mod_spec) │
│ ❱ 86 │ exec(code, run_globals) │
│ 87 │ return run_globals │
│ 88 │
│ 89 def runmodule_code(code, init_globals=None, │
│ │
│ in <module>:7 │
│ │
│ 4 from accelerate.commands.accelerate_cli import main │
│ 5 if name == '__main__': │
│ 6 │ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0]) │
│ ❱ 7 │ sys.exit(main()) │
│ 8 │
│ │
│ c:\Lora\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py:45 in main │
│ │
│ 42 │ │ exit(1) │
│ 43 │ │
│ 44 │ # Run │
│ ❱ 45 │ args.func(args) │
│ 46 │
│ 47 │
│ 48 if name == "__main__": │
│ │
│ c:\Lora\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py:918 in launch_command │
│ │
│ 915 │ elif defaults is not None and defaults.compute_environment == ComputeEnvironment.AMA │
│ 916 │ │ sagemaker_launcher(defaults, args) │
│ 917 │ else: │
│ ❱ 918 │ │ simple_launcher(args) │
│ 919 │
│ 920 │
│ 921 def main(): │
│ │
│ c:\Lora\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py:580 in simple_launcher │
│ │
│ 577 │ process.wait() │
│ 578 │ if process.returncode != 0: │
│ 579 │ │ if not args.quiet: │
│ ❱ 580 │ │ │ raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) │
│ 581 │ │ else: │
│ 582 │ │ │ sys.exit(1) │
│ 583 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
CalledProcessError: Command '['c:\\Lora\\kohya_ss\\venv\\Scripts\\python.exe', './sd-scripts/sdxl_train_network.py',
'--pretrained_model_name_or_path=./Stable-diffusion/dreamshaperXL10_alpha2Xl10.safetensors', '--output_dir=./output',
'--logging_dir=./logs', '--resolution=1024,1024', '--max_train_epochs=40', '--learning_rate=2e-4',
'--lr_scheduler=constant_with_warmup', '--output_name=sdxl_8glora', '--train_batch_size=1', '--save_every_n_epochs=10',
'--save_precision=bf16', '--seed=1026', '--max_token_length=225', '--caption_extension=.txt',
'--save_model_as=safetensors', '--vae_batch_size=4', '--xformers', '--cache_text_encoder_outputs',
'--cache_text_encoder_outputs_to_disk', '--bucket_reso_steps=32', '--train_data_dir=./input/', '--clip_skip=2',
'--network_dim=32', '--network_alpha=16', "--training_comment=this LoRA model created from bdsqlsz by bdsqlsz'script",
'--persistent_data_loader_workers', '--cache_latents', '--cache_latents_to_disk', '--gradient_checkpointing',
'--noise_offset=0.0375', '--adaptive_noise_scale=0.0375', '--optimizer_type=PagedAdamW8bit', '--optimizer_args',
'weight_decay=0.01', '--network_train_unet_only', '--unet_lr=1e-5', '--text_encoder_lr=2e-5', '--keep_tokens=1',
'--min_snr_gamma=5', '--lr_scheduler_num_cycles=1', '--enable_bucket', '--min_bucket_reso=640',
'--max_bucket_reso=1536', '--full_bf16', '--mixed_precision=bf16', '--network_module=networks.lora',
'--lr_warmup_steps=50']' returned non-zero exit status 1.
Train finished
@sajeas Sorry, I found that it changed in bnb 0.41.0. You can use another optimizer, such as Adafactor, instead of 8bit. I will fix it soon.
I updated to V3.5; it should work now.
@bdsqlsz Thank you, it seems to be working.
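Until a fixed package ships, the workaround above can be automated: try importing bitsandbytes and fall back to Adafactor when it fails (a sketch; `pick_optimizer_type` is a hypothetical helper, not part of the scripts):

```python
def pick_optimizer_type(preferred: str = "PagedAdamW8bit") -> str:
    """Fall back to Adafactor when an 8-bit optimizer is requested
    but bitsandbytes cannot be imported (e.g. the bnb 0.41.0 breakage)."""
    if not preferred.lower().endswith("8bit"):
        return preferred
    try:
        import bitsandbytes  # noqa: F401  # may raise on broken installs
    except Exception:
        return "Adafactor"
    return preferred

print(pick_optimizer_type("Adafactor"))  # Adafactor
```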
Is the sdxl_db parameter in the 16G train script for checkpoint finetuning (16GB)?
How do I set the learning rate to something like 0.0004?
My PC can handle only 1 epoch, and a warm-up like 2e-5 or 4e-7 overtrains the LoRA.
Do I need to change $lr_scheduler to constant?
What would be the best settings for 1 epoch?
If you only want 5 epochs, what should you change? Have you tried your settings with 5 epochs? The results are still not very similar.
Fewer epochs need a higher LR.
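That rule of thumb can be written down as a simple inverse scaling sketch. This is illustrative only; the proportionality and the base values are assumptions, not tested recommendations:

```python
def scaled_lr(base_lr: float, base_epochs: int, epochs: int) -> float:
    """Heuristic: fewer epochs -> proportionally higher learning rate."""
    return base_lr * base_epochs / epochs

# If 2e-4 works at 40 epochs, 5 epochs would suggest roughly 1.6e-3
print(scaled_lr(2e-4, 40, 5))
```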
In sdxl_db mode, optimizers other than AdaFactor throw an error:
prepare optimizer, data loader etc.
Traceback (most recent call last):
File "E:\TRAINING\lora-scripts\sd-scripts\sdxl_train.py", line 648, in <module>
train(args)
File "E:\TRAINING\lora-scripts\sd-scripts\sdxl_train.py", line 251, in train
_, _, optimizer = train_util.get_optimizer(args, trainable_params=params_to_optimize)
File "E:\TRAINING\lora-scripts\sd-scripts\library\train_util.py", line 3222, in get_optimizer
import bitsandbytes as bnb
File "E:\TRAINING\lora-scripts\venv\lib\site-packages\bitsandbytes\__init__.py", line 6, in <module>
from . import cuda_setup, utils, research
File "E:\TRAINING\lora-scripts\venv\lib\site-packages\bitsandbytes\research\__init__.py", line 1, in <module>
from . import nn
File "E:\TRAINING\lora-scripts\venv\lib\site-packages\bitsandbytes\research\nn\__init__.py", line 1, in <module>
from .modules import LinearFP8Mixed, LinearFP8Global
File "E:\TRAINING\lora-scripts\venv\lib\site-packages\bitsandbytes\research\nn\modules.py", line 8, in <module>
from bitsandbytes.optim import GlobalOptimManager
File "E:\TRAINING\lora-scripts\venv\lib\site-packages\bitsandbytes\optim\__init__.py", line 6, in <module>
from bitsandbytes.cextension import COMPILED_WITH_CUDA
File "E:\TRAINING\lora-scripts\venv\lib\site-packages\bitsandbytes\cextension.py", line 13, in <module>
setup.run_cuda_setup()
File "E:\TRAINING\lora-scripts\venv\lib\site-packages\bitsandbytes\cuda_setup\main.py", line 126, in run_cuda_setup
binary_name, cudart_path, cc, cuda_version_string = evaluate_cuda_setup()
ValueError: too many values to unpack (expected 4)
Traceback (most recent call last):
File "E:\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "E:\Python\Python310\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "E:\TRAINING\lora-scripts\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>
File "E:\TRAINING\lora-scripts\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
args.func(args)
File "E:\TRAINING\lora-scripts\venv\lib\site-packages\accelerate\commands\launch.py", line 918, in launch_command
simple_launcher(args)
File "E:\TRAINING\lora-scripts\venv\lib\site-packages\accelerate\commands\launch.py", line 580, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['E:\\TRAINING\\lora-scripts\\venv\\Scripts\\python.exe', './sd-scripts/sdxl_train.py', '--pretrained_model_name_or_path=./sd-models/stableDiffusionXLAnime_v10.safetensors', '--output_dir=./output', '--logging_dir=./logs', '--resolution=1024,1024', '--max_train_epochs=10', '--learning_rate=5e-5', '--lr_scheduler=cosine_with_restarts', '--output_name=darkPizzaXLv21anime', '--train_batch_size=1', '--save_every_n_epochs=1', '--save_precision=bf16', '--seed=1026', '--max_token_length=225', '--caption_extension=.txt', '--save_model_as=safetensors', '--vae_batch_size=4', '--xformers', '--cache_text_encoder_outputs', '--cache_text_encoder_outputs_to_disk', '--bucket_reso_steps=32', '--train_data_dir=./train/darkPizza/trainxlv2', '--clip_skip=2', '--persistent_data_loader_workers', '--cache_latents', '--cache_latents_to_disk', '--noise_offset=0.1', '--adaptive_noise_scale=0.0375', '--optimizer_type=PagedAdamW8bit', '--optimizer_args', 'weight_decay=0.01', '--keep_tokens=1', '--min_snr_gamma=5', '--lr_scheduler_num_cycles=1', '--enable_bucket', '--min_bucket_reso=640', '--max_bucket_reso=1536', '--full_bf16', '--mixed_precision=bf16', '--lr_warmup_steps=50']' returned non-zero exit status 1.
Train finished
bnb will release a new version soon.
