31 Jan 2024
v2.0 isn’t a better version of MarblingTIXL. Just different. v1.0 still works fine.
With the changes in kohya it turns out the way I made v1 of this TI no longer works, or at least doesn’t produce anything very useful.
Thanks to @raken for letting me know about this.
I still think there’s great potential in SDXL embeddings so I’ve done a fresh kohya_ss install (v22.6.0 at time of writing) and worked my way through various parameters/settings until I found a combo that makes a close relative of the original MarblingTIXL.
In case anyone’s interested in SDXL TIs (and I know there are at least 2 of you out there!), I’ve included my training data and kohya_ss config JSON. Possibly some notes as well if I can think of something useful.
On the upside, this TI trained faster... on the downside it’s not as consistent as the older TI. Or maybe I haven’t played with it enough. Who can tell out here on the bleeding edge?!
If anyone’s got any questions, observations, opinions or wisdom to share please stick a comment below. There doesn’t seem to be much hard info out there about how to create TI styles at the moment... I’ve read/watched lots of contradictory viewpoints. It can be done though, and I think there’s scope for better TIs than I’ve managed so far.
Competition for LoRAs? Nope, not really - LoRAs add something to a checkpoint whereas TIs leverage what’s already in the checkpoint. If I’ve understood it right, TIs let you reach areas in a checkpoint’s possibility space that it would be difficult to reach consistently. So TIs and LoRAs are different things for different purposes... that you can use together. So everyone is happy :-)
There are technical papers around (covering what a TI is, how you should train one, stuff about text encoders, etc) but I’m usually out of my depth by half-way through the first page :-(
As far as I can tell, kohya_ss is only training the first TE (text encoder) in SDXL. That’s the one from SD v1.x that should work in auto1111 SDXL generation but doesn’t. (Some people have reported that SD v1.x TIs do work in Comfy, but the experience seems to be variable.) As far as I can tell the second TE isn’t being trained in kohya_ss (it’s the one from SD v2.x). Or maybe it’s a duplicate of TE1?
I had a go with OneTrainer (which has options for both TEs) but didn’t have any success with the few runs I tried, so I’m sticking with kohya_ss for now.
For reference, I’m using an RTX-3060 with 12GB on a reasonable PC. Current kohya_ss runs just overflow the 12GB (+ another 6GB if I’m generating samples) so it’s heavier on resources than doing a LoRA. I thought TIs would need less (or the same) resources so I’m a bit surprised. Perhaps there’s no perceived need to optimise for TIs? Yet :-)
The TI here was trained on:
sd_xl_base_1.0_0.9vae.safetensors
The showcase images were generated using:
crystalClearXL_ccxl.safetensors [0b76532e03]
i.e. a TI trained on vanilla base should work with other checkpoints.
Images are generated in a1111 v1.7.0 and I’ve used Hires.fix but no other adjusters.
The additional gallery below shows without/with pairs so you can see how the TI affects some selected prompts. Label “xlmrblnh36-500” means without, label “xlmrblng36-500” means with. I’ve done it that way to keep the two prompts as similar as possible.
If you’re interested, the Training Data zip contains all the saved TIs at 25-step (*4 gradient accumulation = 100 normal steps) intervals.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
NOTE: There is a problem with SDXL in the current version of automatic1111’s webui (v1.6.0). If you use a refiner checkpoint, webui forgets all your embeddings until you load a different checkpoint and then reload your original checkpoint (or restart webui). I have raised the issue with the developers:
https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/13117
and it has been confirmed as a bug.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
***SUMMARY***
This embedding will apply a surreal/fantasy aesthetic inspired by vintage marbled paper patterns. The effect varies from low to extreme depending on how “close” your prompt already is to this aesthetic.
The training for this TI did not include any artist works or tags.
Copy the generation data from one of the showcase images and adjust it to taste, or start with a prompt like this that should give a decent result with any seed:
award-winning Art Nouveau xlmrblng15-1300, analog realistic colour photo of a Japanese mermaid sitting on a rock in the midst of crashing waves, very detailed
checkpoint: crystalClearXL_ccxl.safetensors [0b76532e03]
sampler: DPM++ 2M Karras
steps: 40
CFG: 7
height=width=1024
and then vary the terms as you please. Try to keep between 3 and 5 words before “xlmrblng15-1300”.
The simplest prompts worth trying are this sort of thing:
cybernetic nun, xlmrblng15-1300
fantasy winter landscape, xlmrblng15-1300
but generally you’ll need more words to get interesting results.
After lots of experimentation I found I was getting my best results with prompts between 30 and 45 tokens, with no negative prompts.
I have provided some before/after image pairs in the extra galleries below.
xlmrblnh15 = without this TI
xlmrblng15 = with this TI
As you’ll see, this TI does more than simply adding marbled paper patterns :-)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
***MORE DETAIL & TRAINING INFO***
This is a TI (textual inversion) embedding that adjusts your image generations by adding marbled paper patterns, or adjusting things towards marbled paper patterns depending on your prompts. Because of the way the SDXL system works, the effect with longer/complex prompts will often be structural rather than simplistic.
It’s the SDXL successor of my MarblingTI for SD v1.5:
https://civarchive.com/models/69768/marblingti
Because of all the changes in SDXL I had quite a lot of false starts (20+), but I think this new TI is more useful than the old one... at least for the surreal/illustrative stuff I like to create.
Switching from automatic1111 to kohya_ss for training was not an easy process. More on that below.
The TI is 8 vectors (i.e. it takes 8 tokens of your prompt). It is overpowered for short/simple prompts. That’s by design - I did make a few subtle versions but they were no use for the longer/complex prompts I’ve been using with SDXL. From what I understand of Stable Diffusion, 4 vectors should have been enough but I couldn’t get consistent results with 4 vectors.
The source material consists of scans/photos of vintage marbled paper that were made into several precursor TIs, that were then used to create hybrid pictures, that became the inputs for this TI.
For prompting you’ll need to front-load with 3 to 5 tokens.
i.e.
portrait of a woman, xlmrblng15-1300
rather than
xlmrblng15-1300, portrait of a woman
If you use a short/simple prompt you’ll likely just get a vintage marbled paper pattern. Fine but boring. Also, for shorter prompts the TI might add a slight green cast to images. I’m not sure why; the training images don’t have any overall cast.
Weight/emphasis: from 0.81 up to 1.33 is usable depending on prompt. I find I get more consistent results by moving the TI token rather than using weighting.
All the image generation for this TI was done in automatic1111 webui v1.6.0. The only non-built-in extension I’m using is Dynamic Prompts (installed via Extensions tab). I’ve not used Hires.fix or in/outpainting or detailers or other TIs or LoRAs etc, so that you can get an idea from the showcase/gallery images about whether it’s worth your while to try this TI.
https://github.com/AUTOMATIC1111/stable-diffusion-webui
https://github.com/adieyal/sd-dynamic-prompts
I usually use the CrystalClearXL:
https://civarchive.com/models/122822?modelVersionId=133832
or SDXL FaeTastic
https://civarchive.com/models/129681?modelVersionId=157988
checkpoints for SDXL image generation, but this TI works with every SDXL checkpoint I tried.
Because of the way prompting works, if you want to see the effects with/without this TI then change a single letter only.
e.g.
WITH: cybernetic nun, xlmrblng15-1300
WITHOUT: cybernetic nun, xlmrblnh15-1300
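If you’re generating a lot of comparison pairs, the single-letter swap is easy to script. A minimal sketch (the trigger names are the ones used on this page):

```python
# Build matched with/without prompt pairs by swapping the one letter in
# the trigger name: "...g..." = with the TI, "...h..." = a dummy token.
def make_pair(prompt: str) -> tuple[str, str]:
    """Return (with_ti, without_ti) versions of a prompt."""
    return prompt, prompt.replace("xlmrblng15-1300", "xlmrblnh15-1300")

with_ti, without_ti = make_pair("cybernetic nun, xlmrblng15-1300")
print(without_ti)  # cybernetic nun, xlmrblnh15-1300
```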
You can change the trigger word by renaming the safetensors file you downloaded. PROBLEM: if you change the trigger to a word that SDXL “knows” such as marbling, you’ll get unexpected results. Even if you stick words together like newmarbling, SDXL will pick out “new” and “marbling” and, umm, do stuff with those rather than the TI.
The name I’ve used is to tell me it’s an SDXL TI, it’s marbling (mrblng), and it’s the 1300 step iteration of version 15.
I often use an art movement at the start of my prompts, e.g. Art Nouveau, either as-is or weighted down to somewhere between 0.3 and 0.5. Jumping off pages for Art Movements:
https://en.wikipedia.org/wiki/List_of_art_movements
https://en.wikipedia.org/wiki/Periods_in_Western_art_history
If that doesn’t suit the prompt purists, try something like “award-winning illustrative” instead. For me, adding an art movement I enjoy means I don’t have to fiddle with the rest of the prompt as much to get a similar effect. The short-list of art movements I like is kept in a txt file in Dynamic Prompts’ wildcards folder so that I can just use __Art_Movements__ in my prompts.
As a rule I don’t use artist names except, occasionally, posthumous ones to get a very particular effect. e.g. René Lalique
https://en.wikipedia.org/wiki/Ren%C3%A9_Lalique
I’ve started using kohya_ss (v21.8.9) for TI training since it looks like automatic1111 will not be adding SDXL training to webui.
https://github.com/bmaltais/kohya_ss
There are a lot of settings/config in kohya_ss and I still don’t know what half of them mean :-( However, I’m going to give some info here that might help people wanting to train SDXL Textual Inversion styles using kohya_ss. I haven’t tried an SDXL TI for an object, and I can’t get LoRA training to work in kohya_ss (it either fails to start training or falls over partway through).
I can only describe the settings that worked on my own PC, but I hope it’s still relevant for similar PCs. So...
The PC I’m using is:
Nvidia 3060/12GB (not the Ti version), MSI X570 mb, Ryzen 7-2700 (8c/16t), 64GB system RAM, multiple SSDs, Win10pro.
Created a folder structure:
XLmrblng15
\--img
\--\--50_XLmrblng15 style
\--log
\--model
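If you’re scripting the setup, the layout above can be created like this (a sketch; the root path is up to you):

```python
# Recreate the kohya_ss folder layout described above. The "50_" prefix
# on the image subfolder is how kohya encodes repeats-per-image.
import os

root = "XLmrblng15"  # hypothetical project root
for sub in ("img/50_XLmrblng15 style", "log", "model"):
    os.makedirs(os.path.join(root, sub), exist_ok=True)
```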
Training images:
I created 45 1024x1024 images and put them in the “50_XLmrblng15 style” folder. Then created a .caption file for each image. Example:
cliff with waterfall.png
cliff with waterfall.caption
Caption files are just text files so I used a simple text editor. The contents of each .caption file follow the same pattern:
xlmrblng15, cliff with waterfall
That’s the name of the TI I’m creating, a comma, a space, and the descriptive filename.
I don’t use captioning utilities.
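Since each caption is just the trigger plus the descriptive filename, writing the .caption files can be automated. A minimal sketch (assumes your PNGs already have descriptive names):

```python
# Write one .caption file per image: "trigger, descriptive filename".
from pathlib import Path

def write_captions(img_dir: str, trigger: str = "xlmrblng15") -> None:
    for png in Path(img_dir).glob("*.png"):
        png.with_suffix(".caption").write_text(
            f"{trigger}, {png.stem}", encoding="utf-8"
        )
```

e.g. “cliff with waterfall.png” gets a “cliff with waterfall.caption” containing “xlmrblng15, cliff with waterfall”.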
For the stuff below, if a parameter isn’t mentioned it means I left it at default.
In the main “Textual Inversion” tab in kohya_ss:
Source model tab
Model Quick Pick = custom
Save trained model as = safetensors
Pretrained model name or path = G:/stable-diffusion-webui-master/webui/models/Stable-diffusion/SDXL/sd_xl_base_1.0_0.9vae.safetensors
SDXL model = ticked
Folders tab
Image folder = G:/KOHYA/TRAIN/XLmrblng15/img
Output folder = G:/KOHYA/TRAIN/XLmrblng15/model
Logging folder = G:/KOHYA/TRAIN/XLmrblng15/log
Model output name = xlmrblng15
Parameters (basic) tab
Token string = xlmrblng
Init word = pattern
Vectors = 8
Template = caption
Mixed precision = bf16
Save precision = bf16
Number of CPU threads per core = 1
Cache latents = ticked
Cache latents to disk = ticked
LR Scheduler = constant
Optimizer = AdamW8bit
Learning rate = 0.001
Max resolution = 1024,1024
No half VAE = ticked
Parameters (advanced) tab
VAE = G:/KOHYA/sdxl_vae.safetensors
Save every N steps = 100
Gradient checkpointing = ticked
Memory efficient attention = ticked
Max num workers for DataLoader = 4
Parameters (samples) tab
Sample every n steps = 100
Sample prompts =
an analog realistic photograph of a magnificent jug on a table with glass tumblers, very detailed, intricate, xlmrblng15 --w 1024 --h 1024
xlmrblng15, an analog realistic photograph of a magnificent English lady wearing a Victorian bathing dress, very detailed, intricate, --w 1024 --h 1024
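For orientation, the settings above end up in the saved kohya_ss config roughly like this. NOTE: the key names below are my reconstruction, not a copy of the real file - the actual JSON is included in the Training Data zip and is the one to trust:

```json
{
  "pretrained_model_name_or_path": "G:/stable-diffusion-webui-master/webui/models/Stable-diffusion/SDXL/sd_xl_base_1.0_0.9vae.safetensors",
  "sdxl": true,
  "token_string": "xlmrblng",
  "init_word": "pattern",
  "num_vectors_per_token": 8,
  "mixed_precision": "bf16",
  "save_precision": "bf16",
  "lr_scheduler": "constant",
  "optimizer": "AdamW8bit",
  "learning_rate": 0.001,
  "max_resolution": "1024,1024",
  "no_half_vae": true,
  "cache_latents": true,
  "cache_latents_to_disk": true,
  "gradient_checkpointing": true,
  "save_every_n_steps": 100,
  "sample_every_n_steps": 100
}
```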
With all the above settings, training time settled at around 6s/it. Variable because I still use the PC for other (simple!) things while kohya is doing its stuff. The xlmrblng15-1300 was produced at around 2hr10min into the run.
For the bulk of the training run, GPU RAM usage was just inside the 12GB of my 3060. However, during sample generation and saving the TI every 100 steps an extra 7GB was used (i.e. 19GB in total). That extra 7GB comes from “Shared GPU memory”, i.e. main system RAM. After the sample generations the memory usage went back to just the 12GB GPU RAM.
The slowdown when using “Shared GPU Memory” was about 10x. Bleeping bleep :-(
The sample images kohya produces are very poor compared to using the base SDXL model in automatic1111 webui. But I left them in because at least I could see if the training was heading roughly in the right direction.
Obviously your training dataset is very important. I tried lots of different combinations of generated and real images until I got a set that produced the TI on this page.
For 45 images, using a batch size of 1 (the default), the special folder name “50_XLmrblng15 style” tells kohya to repeat each image 50 times. 45 * 50 = 2250 steps in total. After testing the various saved TIs at 100, 200, 300 steps and so on, I decided the 1300-step one worked best.
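The step arithmetic, as a sanity check (numbers from this run):

```python
# 45 images x 50 repeats (from the "50_" folder prefix) at batch size 1
# = 2250 optimizer steps for the whole run.
images, repeats, batch_size = 45, 50, 1
total_steps = images * repeats // batch_size
print(total_steps)  # 2250

# A TI saved every 100 steps gives this many checkpoints to test:
print(total_steps // 100)  # 22
```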
On the Parameters-Basic tab there is an “Init word” field. I found that training was very very sensitive to what I used as the init word. In this particular case I used “pattern” which is a 1-token word as far as SDXL is concerned. Theoretically I should have been using an 8-token phrase (kohya gives a console warning if vectors =/= init tokens). For some runs I did use more tokens and got some very interesting TIs, but not what I was looking for.
Using “pattern” has a drawback: depending on your prompt to SDXL, you may get lots of repetition - like a repeating pattern on wallpaper or gift-wrap.
Using “marblng” or “paper marbling” didn’t work: compared to SDv1.x, SDXL “knows” much more about marbling. Try it in your prompts! Ask for marble/marbling/marbled things and SDXL does so much better than SDv1.x. Any time I made a TI where the init word was marbling or a conceptually close term, what I got was a TI that used SDXL’s inbuilt marbling rather than the training from my dataset. :-(
I looked into the history of marbled paper and tried some terms such as “ebru”, the Turkish version of marbled paper. That didn’t work very well either. In the end, the very wide term “pattern” gave me most of what I wanted.
Kohya_ss has the option of a “style” template on the Parameters-Basic tab. I’ve had a couple of decent results using “style” for some of my non-published SDXL TIs, but for this marbled paper one the results were not good.
Textual Inversion .vs. LoRA
I’m concentrating on TIs because (a) I can’t get a LoRA to train to completion, and (b) I want to leverage the content in SDXL rather than layer more data on top of it. I’m not against LoRAs - far from it! I’m having great fun with LoRAs by konyconi and others. Getting some great wow! results :-)
But I feel more of an affinity to TIs at the moment. The way I think of it is that TIs allow you to adjust your prompts into SDXL areas that simple words can’t reach, but LoRAs add new stuff on top of SDXL that you “merge” with SDXL’s own content by way of prompts.
That’s very simplistic but I don’t want to get into a discussion about SDXL’s full sample space .vs. its probability space, and what happens in a superset. This is a hobby for me, not a job :-)
Last thing I’ll mention here is that I’m using Dynamic Prompts’ wildcard system heavily. My typical prompts using this xlmrblng15-1300 TI look like this:
(__Art_Movements__:0.5) xlmrblng15-1300, mature __Nationalities__ (__Character_MF__) riding a __BW_Animals__ in a white-tinted __Landscapes__, __Metal_Color__ filigree inlay
Instantiated prompts (i.e. looking at it after Dynamic Prompts has done its thing) tend to be between 30 and 45 tokens.
When I drag a generated image into the “PNG Info” tab in automatic1111 webui, a typical result of the above dynamic prompt is 34 tokens long:
(Surrealism:0.5) xlmrblng15-1300, mature Swedish (male vampire) riding a dalmation in a white-tinted mudflats with scarlet cranes, black filigree inlay
Why put things like nationalities in when SDXL pays little attention to it in longer prompts? Because SDXL is biased and will add little extras associated with nationalities. It can be things like red hair if Scottish is mentioned, pyramids if Egyptian is mentioned, or Mt Fuji if Japanese is mentioned. Works for other things too; context linking/association seems to be much heavier in SDXL than SDv1.x. Trying to control it is a pain in the butt :-(
Resolutions I use for SDXL are usually 1024x1024, 960x1344 and 1344x960. Advice I’ve seen here and there on the net suggests using the base 1mp (megapixel) 1024x1024 of course, or other resolutions as close as possible to 1mp. So if I want 1344 width I should use 768 height. I tried that and my perception of quality for the 1344x768 image was much lower than for the 1024x1024 and 1344x960 ones. Also, 1344x960 scales exactly to my 7" by 5" photo paper. So there’s that :-)
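The “stay near 1mp” rule of thumb can be enumerated. A sketch that lists 64-pixel-aligned sizes within 25% of the base 1024x1024 pixel count (the tolerance is my choice; SDXL’s official bucket list may differ):

```python
# List width/height pairs (multiples of 64) whose pixel count is within
# 25% of the 1024x1024 base resolution.
target = 1024 * 1024
near_1mp = [
    (w, h)
    for w in range(512, 2049, 64)
    for h in range(512, 2049, 64)
    if abs(w * h - target) / target <= 0.25
]

assert (1344, 960) in near_1mp  # the 7x5-ratio size mentioned above
assert (1344, 768) in near_1mp
```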
Description
This is a TI so you can rename the file to change the trigger word.
Comments (11)
@chromesun Thanks for doing this write-up of how you make your SDXL TIs. You should consider copying it and posting as an article - I think people would find it really useful, as there really is so little info about making TIs in SDXL.
Anyway, I have a couple of questions before I try this out...
1. Do images have to be 1024x1024? Would bigger be ok? What about 832x1216? Or even 1664x2432?
2. When you caption your images, are you describing everything that isn't normally part of the concept/style, the same as I do for lora captioning? Or are you just keeping captioning short - just the trigger word and a few words of description?
@raken I’m hoping to do an article once I’m happy I’ve got the kohya_ss process under control. I’ve been trying LoRAs with the same training sets as my TIs and so far the LoRAs cook faster (about 3.5s/it) compared to the TIs (about 5.5s/it). But the LoRAs are not good; they work but very differently to the TIs and not in a way I’m happy with. Testing these things is slow for me - 5 or more hours per run means only 1 test a day. <sigh>
Your questions...
(1) No, the trainset doesn’t need to be 1024x1024, but I’ve found that straying too far from dimensions that multiply to 1mp (megapixel) doesn’t add anything, and in fact can make things worse. So your 832x1216 is reasonable but 1664x2432 probably pointless. Smaller ones like 768x1024 can work, but 768x1344 gives me better results. If your trainset has multiple resolutions remember to turn bucketing on. Generally the best results I’ve had have come from generating/acquiring 1280x1280 and then using a graphics app to colour balance, reduce to 1024x1024, and unsharp-mask at about 0.67, radius 1, threshold 0. I’m using CorelPhotoPaint2019 for that because you can batch with a recorded script. I didn’t find using non-square images added any value except maybe when I did a couple of person TIs for family.
(2) Captioning. Just wild guessing sometimes! Most advice I’ve seen is for captioning LoRAs not Textual Inversions. Generally I use the trigger token then a comma then a short description. Tried using various caption generators but the results were [bleep]. For TIs I think you’re supposed to caption the things you don’t want in the final TI file. For this MarblingTIXL I prepared images of landscapes with paper marbling patterns and then described the landscape in the hope that the training process would fixate on the stuff that SDXL doesn’t “see” as a landscape. It seemed to work and I’ve been using the same technique for other (unpublished) TIs successfully. On the other paw, I generally try the style template in kohya_ss first (top right of the Parameters... Basic page). If that doesn’t give reasonable results only then do I try captioning as described above.
I’ve uploaded a zip “training data” to this TI. It’s got the trainset images, the .caption files, and the .json I saved from the kohya_ss run. Might be a help to you?
@chromesun , awesome. Thanks so much. I'll try this over the next few days.
@chromesun - just to give you an update - So far I've made progress, but had no real success with SDXL embeddings.
Test1 - Foot pop (pose) embedding - ran successfully with your json, but had no effect on generated images. I thought my captions were wrong or I had too few ref images.
Test2 - Your marbling embed - didn't run in Kohya (eventually I found out that I'd knocked one of the settings out)
Test3 - Proa (a type of sailing canoe) - ran successfully in Kohya but had an unusable effect (cutup windsurf boards flying over the sea). (Proa is a word that SD doesn't know, and I've since read that you shouldn't use embeddings to teach SD new words, apparently loras should be used for that!)
Test4 - Your marbling effect again - ran successfully but was EXTREMELY weak compared to your embedding. I compared both your and my json files and they were the same except for line 74.
Your Line 74: "stop_text_encoder_training": 0,
My Line 74: "stop_text_encoder_training_pct": 0,
I did a bit of research and found out that there was an update to Kohya in Oct 2023, where they changed that part. It seems with the settings I used, it now only trains the UNET and not the text encoder. (Maybe you haven't updated kohya for a while)
So I need to do more research to find out what the new settings should be for this. Once again, thanks for your help in getting me this far :)
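Comparing config keys like that can be automated — a small sketch for spotting renamed settings between two saved kohya_ss JSONs:

```python
# Report keys present in one config but not the other, e.g. the
# stop_text_encoder_training vs stop_text_encoder_training_pct rename.
import json

def diff_keys(cfg_a: dict, cfg_b: dict) -> tuple[set, set]:
    """Return (only_in_a, only_in_b)."""
    a, b = set(cfg_a), set(cfg_b)
    return a - b, b - a

old = json.loads('{"stop_text_encoder_training": 0, "learning_rate": 0.001}')
new = json.loads('{"stop_text_encoder_training_pct": 0, "learning_rate": 0.001}')
print(diff_keys(old, new))
# ({'stop_text_encoder_training'}, {'stop_text_encoder_training_pct'})
```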
A couple of other things.
1. I read a post from @malcolmrey saying he was going to start looking at doing SDXL embeddings. That would be awesome :)
2. On a separate note - I'm the opposite of you in that I have no problem doing loras on Kohya. If it's useful to you, I first did a lora by following this guy on youtube, https://www.youtube.com/watch?v=1BCYdd9r1To&t=1s
@Raken You’ve been busy! I can see I’ll need to re-run a previously successful TI to see what happens. MarblingTIXL was made in Sep 2023, so definitely in an old ver of kohya_ss as I’m sure I git pulled kohya_ss in December. I'll try on current setup and see if the result is much weaker. I read about Unet-only training but I thought that was more LoRA than TI. Or is TI a text-encoder-only thing? kohya seems to be training encoder1 and leaving encoder2 blank (or duplicating encoder1 maybe). And then there's the whole thing about encoder1 being the SD v1.5 encoder, and encoder2 being the SD v2 encoder... so both types should work in SDXL! I saw someone say that the old TIs worked OK in Comfy. Not in a1111 though. So confusing :-(
Thanks for the vid link, I'll certainly watch it. That said, I did finally manage a LoRA a couple of days ago: using the same trainset as MarblingTIXL so I could compare. The LoRA seems weak - perhaps from Unet-only settings, or over/under-trained? And also different from the TI... much more specific, so I can see now I’ll need more subject variability in input images. ’Cos I used landscapes, the LoRA will only do landscapes. Try a coffee machine or a car and it just distorts it a bit. My OOM fails in LoRA training seem to have been because I didn’t reduce net dim & alpha from the defaults. Also maybe because I was trying to use Adam... Adafactor and Prodigy worked if I added the secret sauce of extra parameters.
If kohya_ss produces a weak TI with no obvious solution I’ll check my dates and see if I can make a 2nd kohya_ss installation with old version. More reading/research required! At least with your info about line 74 I’ve got somewhere to start looking - thank you.
Or if you get there first please post again!
I agree about @malcolmrey - would be great if some of the high-powered folk pushed SDXL TIs.
Teaching SD new words... were you using proa in the captions? Or just making a proa.safetensors file?
@chromesun I'll certainly let you know if I have any success.
Regarding the proa captions. Yeah, I used 'proa' as my first key word, then a description of what it was 'small single outrigger sailing canoe'. Then had a list of many terms in the ref picture that wouldn't necessarily be in other generated pictures - I had used the captions successfully for a lora before, so I guessed they'd work ok for the TI. But, for now, I'll just keep trying to reproduce your marbling TI, then I'll start looking into my own stuff!
SD1.5 embeddings can certainly be used in comfy, but...sometimes they don't seem to work very well for me. More importantly, they are, at best, the typical SD1.5 low quality.
@Raken You’re right, redoing MarblingTIXL in current kohya_ss (and I did a git pull last night to make sure I was on current version) produces a weaker TI. I’ve trained it twice and the two new versions are similar (to the limit of training randomness) but both are different from original in Sep 2023. Less contrast, less detail, more real elements of prompt coming through instead of being converted to the paper marbling patterns. If I do a basic prompt such as “award-winning xlmrblng15-1300 simple close-up pattern, very detailed, intricate” the new TIs work well but not quite as punchy. More complex prompts such as “award-winning xlmrblng15-1300 magnificent angel walking past seaside chalk cliffs, hundreds of birds in the distance, very detailed, intricate” are hit-and-miss - some gens are great, others are insipid. I’ll need to look at the kohya and kohya_ss sites to see if I can spot what’s changed... or at least grab a zip of old version to compare. At the moment I’ve left the TI training up past 1300 steps, so I’ll see if it’s the same weak story at, say, 4000 steps.
The thing about:
Your Line 74: "stop_text_encoder_training": 0,
My Line 74: "stop_text_encoder_training_pct": 0,
I checked and found that my manual saves were without “_pct” and the auto saves in the model folder had “_pct”. So I think that’s probably not a significant issue.
EDIT: Hmmph :-( 4000 steps is only marginally better. Tried one of my family person TIs and, sure enough, it's weak as well. I can't tell if the earlier kohya had issues so that TIs were stronger than they should be... so it could be that the new behaviour is "better". MarblingTIXL ver 15 was the best out of 22 attempts with different training images, kohya settings, captions. I was surprised when some of those attempts produced poor TIs so I think I'll try some of those old ideas in the current kohya to see what happens. That'll take a few days :-(
EDIT: Backed up current kohya_ss, got the zip of v21.8.10 from 2023sep23 and unpacked it into current kohya_ss, started it as normal, let the downgrade requirements run, set up the old MarblingTIXL and trained it for 1300 steps. Success: back to detailed, punchy output. Different due to randomness of course, so perhaps 1200 or 1400 step TI would be better this time around, but the result is enough to say either
(1) my settings are wrong and I need to adjust for current kohya_ss, or
(2) my settings are right and there’s an issue with current kohya_ss
Probably (1) of course, but not many folk trying SDXL TIs so it could be (2). Or (3) something else!
I’ll go through the kohya/kohya_ss changelogs and try to spot what changes have affected my settings. Maybe need to dl each released version of kohya_ss and try a MarblingTIXL run on each to pinpoint change. Joy :-(
@raken Did you manage to get any further forward with your embeddings? I did a fair few tests over the past week with all sorts of settings... getting basically nowhere. But... kohya_ss got updated to v22.6.0 and I thought what the heck and did a clean install. Well, sorta clean, I didn’t change my python 3.10.9, Git, or the Visual Studio redistributable. But a fresh kohya_ss all the same :-)
I’ve got the MarblingTIXL to train much better now. No weak/non-existent effect any more. It’s not quite the same as the original I published here on Civitai - gives similar look but goes wild more often. Although I’ve been reading through the kohya_ss change logs, the technicalities are beyond me for the moment. So I dl’d the v21.8.10 (that I did the original MarblingTIXL in) and installed it in a different folder. Trying to stick with the new v22.6.0 but comforting to know I can use the older one if need be.
Would it be of any use if I upload the new working version of MarblingTIXL embedding
+ trainset (which is just the same as before)
+ new settings JSON file
+ sample log of what my kohya_ss console looks like when doing a successful run?
The last is because there are some odd things on the console - not sure if it’s just my local setup or if it’s the fun of being on the bleeding edge.
Hi @chromesun ,
Wow, good job on making progress. TBH, I'd kinda lost momentum over the past week. You mention that the technicalities are beyond you - well that is doubly so for me!!! Actually, I was going to drift into making loras (like everyone else). However, your news has given me inspiration again :)
I will also do a clean install of the latest Kohya and try that to see what I get with your marbling ti. Yes, the new json config and sample log would be very helpful if you could upload them.
If I still have no joy that way, I could also download an old version (v21.8.10) of Kohya to try out - I should have thought of doing that a week ago!!
I see they have a new chat system up here now, maybe we should take our conversation to that.
Thanks for the effort you put in and the news :)
@Raken Shifted to Chat. I think! Please ignore this if the chat shows up OK :-)