Update 2/28/25: I released the first successful Flat model (it took too many attempts to count)
This method was difficult and needed more time. After many trials it has become better than Squeeze. I did not expect this...
Details
Base Model: Illustrious
Detox Methods: Flat (v1)
Starting Tips: Use artist tags before using quality tags. Quality tags can improve the visuals, but can remove variability and knowledge of concepts. All detox models require specific prompting. If you do not specify you want an element, it will not read your mind (there is still some natural variance).
Positive Tags (optional): original,newest,masterpiece,best quality,amazing quality,high quality,very aesthetic,absurdres,highres
Using too many positives can make generation slow to denoise (more detail must be added), so use higher steps if you have trouble.
Negative Tags (optional): worst quality,low quality,normal quality,scanned,scanlines,sketch,unfinished,jpeg artifacts,lowres,blurry
bold = destructive, italics = unreliable, quality negatives can make style worse (use as you prefer)
Sampler: Euler A, DPM++.
Scheduler: Normal/Karras/Beta
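A minimal sketch of one way to follow the starting tips above: begin with artist tags and the elements you want, and append quality tags only if needed. The helper name build_prompt, the tag order inside the prompt, and the artist and subject tags are assumptions for illustration. Only the quality tag lists are copied from this page.

```python
# Quality tags copied from the "Positive Tags" and "Negative Tags" lines above.
QUALITY_POSITIVE = [
    "original", "newest", "masterpiece", "best quality", "amazing quality",
    "high quality", "very aesthetic", "absurdres", "highres",
]
QUALITY_NEGATIVE = [
    "worst quality", "low quality", "normal quality", "scanned", "scanlines",
    "sketch", "unfinished", "jpeg artifacts", "lowres", "blurry",
]


def build_prompt(artist_tags, subject_tags, quality_tags=()):
    """Join tags with commas: artist tags, then subject tags, then optional quality tags.

    Hypothetical helper. Placing artist tags first is an assumption of this sketch.
    """
    seen = set()
    result = []
    for tag in list(artist_tags) + list(subject_tags) + list(quality_tags):
        tag = tag.strip()
        # Skip empty entries and duplicates while keeping the original order.
        if tag and tag not in seen:
            seen.add(tag)
            result.append(tag)
    return ",".join(result)


# Placeholder tags: start specific, without any quality tags.
positive = build_prompt(["artist_name"], ["1girl", "outdoors", "smile"])
# Add a few quality tags only if the plain prompt is not good enough.
positive_with_quality = build_prompt(
    ["artist_name"], ["1girl", "outdoors", "smile"], QUALITY_POSITIVE[:3]
)
negative = ",".join(QUALITY_NEGATIVE)
```

Because these models need specific prompting, every element you want in the image should appear in subject_tags.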
Update 12/7/24: I released the first successful Squeeze model (it took 4 attempts)
I do not like the results of the other methods yet, so I will wait. The next release will most likely be a Smooth model because it is closest to Squeeze for quality.
Details
Base Model: Illustrious
Detox Methods: Squeeze (v1)
Starting Tips: Use artist tags before using quality tags. Quality tags can improve the visuals, but can remove variability and knowledge of concepts. All detox models require specific prompting. If you do not specify you want an element, it will not read your mind (there is still some natural variance).
Positive Tags (optional): original,newest,masterpiece,best quality,amazing quality,high quality,very aesthetic,absurdres,highres
Using too many positives can make generation slow to denoise (more detail must be added), so use higher steps if you have trouble.
Negative Tags (optional): worst quality,low quality,normal quality,sketch,unfinished,jpeg artifacts,lowres,blurry
bold = destructive, italics = unreliable, quality negatives can make style worse (use as you prefer)
Sampler: Euler A. I have not tested others and expect them to fail, because Squeeze models expect noise to be added at each step.
Scheduler: Normal/Karras
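To illustrate what adding noise at each step means for an ancestral sampler such as Euler A, here is a sketch of the noise split used in k-diffusion-style ancestral steps. This is general sampler math and not something specific to Squeeze models. The function name and the example sigma values are illustrative assumptions.

```python
import math


def ancestral_step(sigma_from, sigma_to, eta=1.0):
    """Split one step from sigma_from down to sigma_to into two parts.

    sigma_down: noise level reached by the deterministic Euler update.
    sigma_up:   amount of fresh random noise re-injected afterwards.
    The two always satisfy sigma_down**2 + sigma_up**2 == sigma_to**2.
    """
    if sigma_to == 0:
        # Final step: nothing left to re-inject.
        return 0.0, 0.0
    sigma_up = min(
        sigma_to,
        eta * math.sqrt(sigma_to**2 * (sigma_from**2 - sigma_to**2) / sigma_from**2),
    )
    sigma_down = math.sqrt(sigma_to**2 - sigma_up**2)
    return sigma_down, sigma_up


# Illustrative noise levels only; real values come from the chosen scheduler.
sigma_down, sigma_up = ancestral_step(10.0, 5.0)
```

With eta=0.0 the function returns sigma_up == 0, which reduces to a plain Euler step with no fresh noise. That is the non-ancestral case the note above says is untested.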
Intro
This is a series of models named "Detox" models. The name "Detox" means that we apply a destructive finetuning process to a base model: it removes weights which contain synthetic attributes and replaces them with fresh weights trained on non-synthetic data. A specific set of objectives is chosen for each method. You can read them below.
Detox Methods
The base model is processed and retrained using a specific objective. The methods are below.
Squeeze (v2) - Relearn details and prompt understanding. Version one keeps the weights of the base model; version two retrains weights from scratch to retain prompt understanding. LoRAs often break if the base model is heavily trained on synthetic data.
Smooth - Maximize the intersection of the Squeeze method and the base model. It will share more of the base model, so it is good for LoRAs.
Flat - Maximize the stability and do more retraining after. It will discard many parts of the base model. For example, Flat v1 is closer to base SDXL than to Illustrious. This method might ignore or break your LoRA. It will be a unique model which has consistent details and prompt understanding. Because the goal is consistency, it can produce unfavorable results, but it should be a good base model for finetuning (I have not tested this). The model will be sensitive to new data because this method tries to distribute them evenly.
Shine - Maximize the details, but it can ignore prompt knowledge. This method is almost successful now that I have reworked it. It is now easy to control and still seems to be okay for LoRAs, but there is a blur issue, so it likely needs more training time.
What is synthetic data?
If you see some models trained with generated images, these are considered synthetic because the generation process leaves noise in all images. If the training data does not specify the difference between normal images and generated images, this noise will be present in the trained model for all images it creates. There are some further issues with generated images, such as prompts that include hallucinations. Generated images from SD1 do not respect the prompt, and SDXL can fail at this too, but it is less common. If a model is trained to hallucinate, it will do it very well!
Usage
Refer to the Details section above and the example images.
Description
Second successful version. Refer to the description for prompting and details on Flat models.
Recommended settings: Euler/Euler A, 30-50 steps, cfg 5.0-7.0, Normal/Karras/Beta.
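As a quick sanity check, the recommended settings above can be written down as data and compared against whatever you plan to run. This is only a sketch: the names RECOMMENDED and check_settings are hypothetical, and the sampler and scheduler names are plain strings that may not match the identifiers in your UI.

```python
# Values copied from the "Recommended settings" line above.
RECOMMENDED = {
    "samplers": {"Euler", "Euler A"},
    "schedulers": {"Normal", "Karras", "Beta"},
    "steps": (30, 50),    # inclusive range
    "cfg": (5.0, 7.0),    # inclusive range
}


def check_settings(sampler, scheduler, steps, cfg):
    """Return a list of warnings for values outside the recommended settings."""
    warnings = []
    if sampler not in RECOMMENDED["samplers"]:
        warnings.append(f"sampler {sampler!r} is not in the recommended list")
    if scheduler not in RECOMMENDED["schedulers"]:
        warnings.append(f"scheduler {scheduler!r} is not in the recommended list")
    low, high = RECOMMENDED["steps"]
    if not low <= steps <= high:
        warnings.append(f"steps {steps} is outside {low}-{high}")
    low, high = RECOMMENDED["cfg"]
    if not low <= cfg <= high:
        warnings.append(f"cfg {cfg} is outside {low}-{high}")
    return warnings


# An empty list means every value is inside the recommended settings.
issues = check_settings("Euler A", "Karras", steps=40, cfg=6.0)
```

Values outside these ranges are not necessarily wrong. The check only flags them so you know you have left the tested settings.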
FAQ
Comments (6)
I like your “squeeze” model the most and I’m still using it. I’d like to move to the illustrious V2.0 series soon—do you have any plans to develop one?
What objectives does the Illustrious v2.0 perform better? If it is a big gap, I will consider it for a new version.
@reptilekiller if you care about cleaner high-res outputs, better handling of natural-language prompts, crisper small-text/glyphs, and a tighter “realism” score, the gap is large enough that v2.0 is worth moving to.
Visual Fidelity (FID)
– v0.1: ~35 FID on Danbooru-style benchmarks
– v2.0: ~25 FID (≈30% reduction in “distance” to real images)
Prompt Adherence (CLIP Alignment)
– v0.1: Average CLIP-score ≈0.24
– v2.0: Average CLIP-score ≈0.27 (+12% semantic alignment)
High-Resolution Artifact Reduction
– At 1536 px and above, v2.0 shows ~40% fewer edge‐bleeding and texture glitches, so your 2K+ renders come out clean.
Natural-Language Prompt Robustness
– v0.1 struggles with anything beyond pure tag lists—CLIP‐based tests show only ~50% correct semantic response to English sentences.
– v2.0 reliably handles mixed English/tag prompts with ~70–75% correct adherence.
Text/Glyph Legibility
– OCR-style tests: v0.1 recovers ~65% of embedded text correctly; v2.0 jumps up to ~90% accuracy on small characters.
LoRA & Fine-Tuning Stability
– v2.0 was trained to be LoRA‐resilient—weight sweeps from 0.1 to 1.0 show <5% variance in image quality, versus ~15% variance on v0.1.
Inference Performance
– Slight trade-off: v2.0’s checkpoint is ~10% larger, adding ~3–5 % to runtime memory and inference time.
@yakinamashake Which version have you tested for your results? I see there are two, normal and "stable". From my quick testing, both versions have an artist knowledge issue, but the normal version expresses the concepts and features better than the "stable" version.
@reptilekiller The Normal version is a mid‐training checkpoint that can produce more varied outputs but lacks stability. The Stable version is a fully converged checkpoint after the final epoch, delivering consistent outputs. It seems that the Stable version is better suited for fine-tuning, though this is just what I gathered from my research since I’m not an expert…
Can you use this as a refiner model or only a base? Was it made with that in mind?








