Credit:
This script package is from bdsqlsz.
It uses kohya-ss/sd-scripts (github.com) as its core.
The original script is from Akegarasu/lora-scripts: LoRA training scripts using kohya-ss's trainer, for diffusion models. (github.com)
How to use:
0、(Windows) Give unrestricted script access to PowerShell so the venv can work:
Open an administrator PowerShell window.
Type Set-ExecutionPolicy Unrestricted and answer A (Yes to All).
Close the admin PowerShell window.
1、Unzip this anywhere you want (recommended alongside another training program that already has a venv).
If you update it, just rerun install-cn-qinglong.ps1, then enter N.
2、Run install-cn-qinglong.ps1 on Windows (on Linux just use the command line).
It will automatically install the environment (if you already have a venv, just put this over it).
3、Put your datasets in the /input dir.
Formatting:
./input/2_lora/(dataset images)
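The layout above can be created with a short script. The `2` in `2_lora` is the per-epoch repeat count and `lora` is an arbitrary concept name; both are examples from this package, not required names:

```python
from pathlib import Path

# Create the kohya-style dataset layout: input/<repeats>_<name>/
# "2_lora" is just an example: 2 repeats, concept name "lora".
dataset_dir = Path("input") / "2_lora"
dataset_dir.mkdir(parents=True, exist_ok=True)

# Each image sits next to its caption .txt with the same stem,
# e.g. input/2_lora/0001.png and input/2_lora/0001.txt
print(dataset_dir.as_posix())  # input/2_lora
```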
4、Run tagger.ps1 to tag them automatically (on Linux use tagger.sh).
5、Edit your datasets if you want.
6、Edit the train script (train_8Glora or train_16G_DB; on Linux use train_tacanime).
You only need to change:
train_mode (lora, db, sdxl_lora, sdxl_db, cn3l)
pretrained_model
train_data_dir
formatted as ./input/
resolution (SDXL: 1024*1024, sd1.5: 576*576)
max and min bucket reso (SDXL: 640~1536, sd1.5: 256~1024)
optimizer (SDXL: AdaFactor for batch size 1, PagedAdamW8bit for batch size 4)
Other settings don't need changing if you don't know what they do.
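As a sanity check for these settings, the number of optimizer steps per epoch follows from the image count, the repeat count taken from the dataset folder name, and the batch size. A rough sketch (the helper is illustrative, not part of the scripts):

```python
import math

def steps_per_epoch(num_images: int, repeats: int, batch_size: int) -> int:
    """Rough estimate: each image is seen `repeats` times per epoch,
    and the repeated set is split into batches."""
    return math.ceil(num_images * repeats / batch_size)

# e.g. 21 images in a "2_lora" folder (2 repeats) at batch size 1
print(steps_per_epoch(21, 2, 1))  # 42
```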
If I can help you, I will be very happy~
Support 青龍聖者@bdsqlsz on Ko-fi! ❤️ ko-fi.com/bdsqlsz
Description
Fixed bugs with bitsandbytes above 0.35.
Comments
It doesn't see images.
I put them in
C:\Lora\kohya_ss\input
In the train_8Glora file the setting is:
$train_data_dir = "./input"
What is the right way to set up the folder paths?
You should mkdir a subdirectory there, then put the images in it,
named: Number_name (whatever name you like).
train_data_dir=./input/
./input/2_lora/your-dataset-images (such as .png and .txt files)
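The leading number in the folder name is what sets the repeat count. A sketch of how such a name splits into its two parts (`parse_dataset_dir` is a hypothetical helper, not a function from the trainer):

```python
def parse_dataset_dir(name: str) -> tuple[int, str]:
    """Split a kohya-style dataset folder name like '2_lora' into
    (repeat count, concept name). Raises ValueError if malformed."""
    repeats, _, concept = name.partition("_")
    if not repeats.isdigit() or not concept:
        raise ValueError(f"expected '<number>_<name>', got {name!r}")
    return int(repeats), concept

print(parse_dataset_dir("2_lora"))  # (2, 'lora')
```

A folder with no leading number (e.g. just `lora`) fails this check, which matches the "it doesn't see images" symptom above.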
It worked, but now it shows this error:
[Dataset 0]
loading image sizes.
100%|████████████████████████████████████████████████████████████████████████████████| 21/21 [00:00<00:00, 2632.17it/s]
make buckets
number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む)
bucket 0: resolution (1024, 1024), count: 2100
mean ar error (without repeats): 0.0
clip_skip will be unexpected / SDXL学習ではclip_skipは動作しません
Warning: SDXL has been trained with noise_offset=0.0357 / SDXLはnoise_offset=0.0357で学習されています
noise_offset is set to 0.0375 / noise_offsetが0.0375に設定されました
preparing accelerator
loading model for process 0/1
load StableDiffusion checkpoint: ./Stable-diffusion/dreamshaperXL10_alpha2Xl10.safetensors
building U-Net
loading U-Net from checkpoint
U-Net: <All keys matched successfully>
building text encoders
loading text encoders from checkpoint
text encoder 1: <All keys matched successfully>
text encoder 2: <All keys matched successfully>
building VAE
loading VAE from checkpoint
VAE: <All keys matched successfully>
Enable xformers for U-Net
import network module: networks.lora
[Dataset 0]
caching latents.
checking cache validity...
100%|█████████████████████████████████████████████████████████████████████████████████| 21/21 [00:00<00:00, 346.91it/s]
caching latents...
0it [00:00, ?it/s]
move vae and unet to cpu to save memory
[Dataset 0]
caching text encoder outputs.
checking cache existence...
100%|██████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:00<?, ?it/s]
caching text encoder outputs...
100%|██████████████████████████████████████████████████████████████████████████████████| 21/21 [00:02<00:00, 7.99it/s]
move vae and unet back to original device
create LoRA network. base dim (rank): 32, alpha: 16.0
neuron dropout: p=None, rank dropout: p=None, module dropout: p=None
create LoRA for Text Encoder 1:
create LoRA for Text Encoder 2:
create LoRA for Text Encoder: 264 modules.
create LoRA for U-Net: 722 modules.
enable LoRA for U-Net
prepare optimizer, data loader etc.
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ C:\Lora\kohya_ss\sd-scripts\sdxl_train_network.py:174 in <module> │
│ │
│ 171 │ args = train_util.read_config_from_file(args, parser) │
│ 172 │ │
│ 173 │ trainer = SdxlNetworkTrainer() │
│ ❱ 174 │ trainer.train(args) │
│ 175 │
│ │
│ C:\Lora\kohya_ss\sd-scripts\train_network.py:325 in train │
│ │
│ 322 │ │ │ ) │
│ 323 │ │ │ trainable_params = network.prepare_optimizer_params(args.text_encoder_lr, ar │
│ 324 │ │ │
│ ❱ 325 │ │ optimizer_name, optimizer_args, optimizer = train_util.get_optimizer(args, train │
│ 326 │ │ │
│ 327 │ │ # dataloaderを準備する │
│ 328 │ │ # DataLoaderのプロセス数:0はメインプロセスになる │
│ │
│ C:\Lora\kohya_ss\sd-scripts\library\train_util.py:3222 in get_optimizer │
│ │
│ 3219 │ │
│ 3220 │ elif optimizer_type.endswith("8bit".lower()): │
│ 3221 │ │ try: │
│ ❱ 3222 │ │ │ import bitsandbytes as bnb │
│ 3223 │ │ except ImportError: │
│ 3224 │ │ │ raise ImportError("No bitsandbytes / bitsandbytesがインストールされていない │
│ 3225 │
│ │
│ c:\Lora\kohya_ss\venv\lib\site-packages\bitsandbytes\__init__.py:6 in <module> │
│ │
│ 3 # This source code is licensed under the MIT license found in the │
│ 4 # LICENSE file in the root directory of this source tree. │
│ 5 │
│ ❱ 6 from . import cuda_setup, utils, research │
│ 7 from .autograd._functions import ( │
│ 8 │ MatmulLtState, │
│ 9 │ bmm_cublas, │
│ │
│ c:\Lora\kohya_ss\venv\lib\site-packages\bitsandbytes\research\__init__.py:1 in <module> │
│ │
│ ❱ 1 from . import nn │
│ 2 from .autograd._functions import ( │
│ 3 │ switchback_bnb, │
│ 4 │ matmul_fp8_global, │
│ │
│ c:\Lora\kohya_ss\venv\lib\site-packages\bitsandbytes\research\nn\__init__.py:1 in <module> │
│ │
│ ❱ 1 from .modules import LinearFP8Mixed, LinearFP8Global │
│ 2 │
│ │
│ c:\Lora\kohya_ss\venv\lib\site-packages\bitsandbytes\research\nn\modules.py:8 in <module> │
│ │
│ 5 from torch import Tensor, device, dtype, nn │
│ 6 │
│ 7 import bitsandbytes as bnb │
│ ❱ 8 from bitsandbytes.optim import GlobalOptimManager │
│ 9 from bitsandbytes.utils import OutlierTracer, find_outlier_dims │
│ 10 │
│ 11 T = TypeVar("T", bound="torch.nn.Module") │
│ │
│ c:\Lora\kohya_ss\venv\lib\site-packages\bitsandbytes\optim\__init__.py:6 in <module> │
│ │
│ 3 # This source code is licensed under the MIT license found in the │
│ 4 # LICENSE file in the root directory of this source tree. │
│ 5 │
│ ❱ 6 from bitsandbytes.cextension import COMPILED_WITH_CUDA │
│ 7 │
│ 8 from .adagrad import Adagrad, Adagrad8bit, Adagrad32bit │
│ 9 from .adam import Adam, Adam8bit, Adam32bit, PagedAdam, PagedAdam8bit, PagedAdam32bit │
│ │
│ c:\Lora\kohya_ss\venv\lib\site-packages\bitsandbytes\cextension.py:13 in <module> │
│ │
│ 10 │
│ 11 setup = CUDASetup.get_instance() │
│ 12 if setup.initialized != True: │
│ ❱ 13 │ setup.run_cuda_setup() │
│ 14 │
│ 15 lib = setup.lib │
│ 16 try: │
│ │
│ c:\Lora\kohya_ss\venv\lib\site-packages\bitsandbytes\cuda_setup\main.py:126 in run_cuda_setup │
│ │
│ 123 │ │ self.initialized = True │
│ 124 │ │ self.cuda_setup_log = [] │
│ 125 │ │ │
│ ❱ 126 │ │ binary_name, cudart_path, cc, cuda_version_string = evaluate_cuda_setup() │
│ 127 │ │ self.cudart_path = cudart_path │
│ 128 │ │ self.cuda_available = torch.cuda.is_available() │
│ 129 │ │ self.cc = cc │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: too many values to unpack (expected 4)
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ D:\Python\lib\runpy.py:196 in runmodule_as_main │
│ │
│ 193 │ main_globals = sys.modules["__main__"].__dict__ │
│ 194 │ if alter_argv: │
│ 195 │ │ sys.argv[0] = mod_spec.origin │
│ ❱ 196 │ return runcode(code, main_globals, None, │
│ 197 │ │ │ │ │ "__main__", mod_spec) │
│ 198 │
│ 199 def run_module(mod_name, init_globals=None, │
│ │
│ D:\Python\lib\runpy.py:86 in runcode │
│ │
│ 83 │ │ │ │ │ loader = loader, │
│ 84 │ │ │ │ │ package = pkg_name, │
│ 85 │ │ │ │ │ spec = mod_spec) │
│ ❱ 86 │ exec(code, run_globals) │
│ 87 │ return run_globals │
│ 88 │
│ 89 def runmodule_code(code, init_globals=None, │
│ │
│ in <module>:7 │
│ │
│ 4 from accelerate.commands.accelerate_cli import main │
│ 5 if name == '__main__': │
│ 6 │ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0]) │
│ ❱ 7 │ sys.exit(main()) │
│ 8 │
│ │
│ c:\Lora\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py:45 in main │
│ │
│ 42 │ │ exit(1) │
│ 43 │ │
│ 44 │ # Run │
│ ❱ 45 │ args.func(args) │
│ 46 │
│ 47 │
│ 48 if name == "__main__": │
│ │
│ c:\Lora\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py:918 in launch_command │
│ │
│ 915 │ elif defaults is not None and defaults.compute_environment == ComputeEnvironment.AMA │
│ 916 │ │ sagemaker_launcher(defaults, args) │
│ 917 │ else: │
│ ❱ 918 │ │ simple_launcher(args) │
│ 919 │
│ 920 │
│ 921 def main(): │
│ │
│ c:\Lora\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py:580 in simple_launcher │
│ │
│ 577 │ process.wait() │
│ 578 │ if process.returncode != 0: │
│ 579 │ │ if not args.quiet: │
│ ❱ 580 │ │ │ raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) │
│ 581 │ │ else: │
│ 582 │ │ │ sys.exit(1) │
│ 583 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
CalledProcessError: Command '['c:\\Lora\\kohya_ss\\venv\\Scripts\\python.exe', './sd-scripts/sdxl_train_network.py',
'--pretrained_model_name_or_path=./Stable-diffusion/dreamshaperXL10_alpha2Xl10.safetensors', '--output_dir=./output',
'--logging_dir=./logs', '--resolution=1024,1024', '--max_train_epochs=40', '--learning_rate=2e-4',
'--lr_scheduler=constant_with_warmup', '--output_name=sdxl_8glora', '--train_batch_size=1', '--save_every_n_epochs=10',
'--save_precision=bf16', '--seed=1026', '--max_token_length=225', '--caption_extension=.txt',
'--save_model_as=safetensors', '--vae_batch_size=4', '--xformers', '--cache_text_encoder_outputs',
'--cache_text_encoder_outputs_to_disk', '--bucket_reso_steps=32', '--train_data_dir=./input/', '--clip_skip=2',
'--network_dim=32', '--network_alpha=16', "--training_comment=this LoRA model created from bdsqlsz by bdsqlsz'script",
'--persistent_data_loader_workers', '--cache_latents', '--cache_latents_to_disk', '--gradient_checkpointing',
'--noise_offset=0.0375', '--adaptive_noise_scale=0.0375', '--optimizer_type=PagedAdamW8bit', '--optimizer_args',
'weight_decay=0.01', '--network_train_unet_only', '--unet_lr=1e-5', '--text_encoder_lr=2e-5', '--keep_tokens=1',
'--min_snr_gamma=5', '--lr_scheduler_num_cycles=1', '--enable_bucket', '--min_bucket_reso=640',
'--max_bucket_reso=1536', '--full_bf16', '--mixed_precision=bf16', '--network_module=networks.lora',
'--lr_warmup_steps=50']' returned non-zero exit status 1.
Train finished
@sajeas Sorry, I found that it changed in bnb 0.41.0. You can use another optimizer, such as Adafactor, instead of 8bit. I will fix it soon.
I updated to V3.5; it should work now.
@bdsqlsz Thank you, it seems to be working.
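Until a fixed package ships, the workaround above can be automated: try importing bitsandbytes and fall back to Adafactor when it fails (a sketch; `pick_optimizer_type` is a hypothetical helper, not part of the scripts):

```python
def pick_optimizer_type(preferred: str = "PagedAdamW8bit") -> str:
    """Fall back to Adafactor when an 8-bit optimizer is requested
    but bitsandbytes cannot be imported (e.g. the bnb 0.41.0 breakage)."""
    if not preferred.lower().endswith("8bit"):
        return preferred
    try:
        import bitsandbytes  # noqa: F401  # may raise on broken installs
    except Exception:
        return "Adafactor"
    return preferred

print(pick_optimizer_type("Adafactor"))  # Adafactor
```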
Is the sdxl_db parameter in the 16G train script for checkpoint finetuning (16GB)?
How do I set the learning rate to something like 0.0004?
My PC can handle only 1 epoch, and a warm-up like 2e-5 or 4e-7 overtrains the LoRA.
Do I need to change $lr_scheduler to constant?
What would be the best settings for 1 epoch?
If you only want 5 epochs, what should you change? Have you tried your settings with 5 epochs? The results are still not very similar.
Fewer epochs need a higher LR.
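That rule of thumb can be written down as a simple inverse scaling sketch. This is illustrative only; the proportionality and the base values are assumptions, not tested recommendations:

```python
def scaled_lr(base_lr: float, base_epochs: int, epochs: int) -> float:
    """Heuristic: fewer epochs -> proportionally higher learning rate."""
    return base_lr * base_epochs / epochs

# If 2e-4 works at 40 epochs, 5 epochs would suggest roughly 1.6e-3
print(scaled_lr(2e-4, 40, 5))
```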
In sdxl_db mode, optimizers other than AdaFactor throw an error:
prepare optimizer, data loader etc.
Traceback (most recent call last):
File "E:\TRAINING\lora-scripts\sd-scripts\sdxl_train.py", line 648, in <module>
train(args)
File "E:\TRAINING\lora-scripts\sd-scripts\sdxl_train.py", line 251, in train
_, _, optimizer = train_util.get_optimizer(args, trainable_params=params_to_optimize)
File "E:\TRAINING\lora-scripts\sd-scripts\library\train_util.py", line 3222, in get_optimizer
import bitsandbytes as bnb
File "E:\TRAINING\lora-scripts\venv\lib\site-packages\bitsandbytes\__init__.py", line 6, in <module>
from . import cuda_setup, utils, research
File "E:\TRAINING\lora-scripts\venv\lib\site-packages\bitsandbytes\research\__init__.py", line 1, in <module>
from . import nn
File "E:\TRAINING\lora-scripts\venv\lib\site-packages\bitsandbytes\research\nn\__init__.py", line 1, in <module>
from .modules import LinearFP8Mixed, LinearFP8Global
File "E:\TRAINING\lora-scripts\venv\lib\site-packages\bitsandbytes\research\nn\modules.py", line 8, in <module>
from bitsandbytes.optim import GlobalOptimManager
File "E:\TRAINING\lora-scripts\venv\lib\site-packages\bitsandbytes\optim\__init__.py", line 6, in <module>
from bitsandbytes.cextension import COMPILED_WITH_CUDA
File "E:\TRAINING\lora-scripts\venv\lib\site-packages\bitsandbytes\cextension.py", line 13, in <module>
setup.run_cuda_setup()
File "E:\TRAINING\lora-scripts\venv\lib\site-packages\bitsandbytes\cuda_setup\main.py", line 126, in run_cuda_setup
binary_name, cudart_path, cc, cuda_version_string = evaluate_cuda_setup()
ValueError: too many values to unpack (expected 4)
Traceback (most recent call last):
File "E:\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "E:\Python\Python310\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "E:\TRAINING\lora-scripts\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>
File "E:\TRAINING\lora-scripts\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
args.func(args)
File "E:\TRAINING\lora-scripts\venv\lib\site-packages\accelerate\commands\launch.py", line 918, in launch_command
simple_launcher(args)
File "E:\TRAINING\lora-scripts\venv\lib\site-packages\accelerate\commands\launch.py", line 580, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['E:\\TRAINING\\lora-scripts\\venv\\Scripts\\python.exe', './sd-scripts/sdxl_train.py', '--pretrained_model_name_or_path=./sd-models/stableDiffusionXLAnime_v10.safetensors', '--output_dir=./output', '--logging_dir=./logs', '--resolution=1024,1024', '--max_train_epochs=10', '--learning_rate=5e-5', '--lr_scheduler=cosine_with_restarts', '--output_name=darkPizzaXLv21anime', '--train_batch_size=1', '--save_every_n_epochs=1', '--save_precision=bf16', '--seed=1026', '--max_token_length=225', '--caption_extension=.txt', '--save_model_as=safetensors', '--vae_batch_size=4', '--xformers', '--cache_text_encoder_outputs', '--cache_text_encoder_outputs_to_disk', '--bucket_reso_steps=32', '--train_data_dir=./train/darkPizza/trainxlv2', '--clip_skip=2', '--persistent_data_loader_workers', '--cache_latents', '--cache_latents_to_disk', '--noise_offset=0.1', '--adaptive_noise_scale=0.0375', '--optimizer_type=PagedAdamW8bit', '--optimizer_args', 'weight_decay=0.01', '--keep_tokens=1', '--min_snr_gamma=5', '--lr_scheduler_num_cycles=1', '--enable_bucket', '--min_bucket_reso=640', '--max_bucket_reso=1536', '--full_bf16', '--mixed_precision=bf16', '--lr_warmup_steps=50']' returned non-zero exit status 1.
Train finished
bnb will release a new version soon.
