# Z-Image Base Anime Finetuning: Full Technical Test Report
Epoch 100 Evaluation
This guide documents a complete testing and evaluation process of a
Z-Image Base anime finetuning checkpoint, including training details,
inference settings, prompt engineering findings, and sampler recommendations.
All findings are based on real testing with the Epoch 100 checkpoint.
------------------------------------------------------------
TRAINING DETAILS
------------------------------------------------------------
Base Model: Z-Image Base (Tongyi-MAI, released Jan 27, 2026)
Architecture: S3-DiT (Single-Stream Diffusion Transformer)
Text Encoder: Qwen-based (bilingual EN/CN)
Training Type: Checkpoint Finetuning (not LoRA)
Epochs: 100
Steps: 68,830
Dataset Size: 1,375 Anime Images
Tagging System: WD Tagger (Booru-style tags)
Avg Tags/Image: ~47 tags
Unique Tags: 4,834
Total Tag Count: 64,276
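
The figures above can be cross-checked with a little arithmetic. The sketch below recomputes the average tags per image and the steps per epoch from the listed totals. The implied batch size of about 2 is only an inference from those numbers; the report does not state the batch size.

```python
# Figures copied from the training details above.
epochs = 100
total_steps = 68_830
dataset_size = 1_375
total_tag_count = 64_276

# Average number of tags per image (the report lists ~47).
avg_tags_per_image = total_tag_count / dataset_size

# Optimizer steps per pass over the dataset.
steps_per_epoch = total_steps / epochs

# Images consumed per step. This is an inferred value, not stated in the report.
implied_batch_size = dataset_size / steps_per_epoch

print(f"avg tags/image:     {avg_tags_per_image:.2f}")
print(f"steps per epoch:    {steps_per_epoch:.1f}")
print(f"implied batch size: {implied_batch_size:.2f}")
```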
------------------------------------------------------------
DATASET RESOLUTION DISTRIBUTION
------------------------------------------------------------
Resolution | Count | Ratio | Quality
-----------|-------|-------|------------
1344x1728  | 132   | 7:9   | Good
768x1086   | 95    | ~9:13 | Odd-Size
832x1216   | 47    | ~2:3  | SD-Standard
1152x1536  | 45    | 3:4   | Good
768x1084   | 36    | Odd   | Odd-Size
768x1024   | 28    | 3:4   | Perfect
896x1152   | 22    | 7:9   | Good
1365x768   | 20    | ~16:9 | Landscape
1248x1824  | 19    | ~2:3  | Good
768x768    | 19    | 1:1   | Standard
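
The ratio and quality columns can be reproduced with a short check. This is a minimal sketch: it reduces each resolution to its exact ratio and flags sizes whose sides are not multiples of 16. The value 16 is an assumed alignment chosen for illustration, since the report does not define what makes a size "Odd-Size". The ten rows add up to 463 of the 1,375 images, so the table appears to list only the most common sizes.

```python
from math import gcd

# (width, height, image count) copied from the table above.
BUCKETS = [
    (1344, 1728, 132), (768, 1086, 95), (832, 1216, 47), (1152, 1536, 45),
    (768, 1084, 36), (768, 1024, 28), (896, 1152, 22), (1365, 768, 20),
    (1248, 1824, 19), (768, 768, 19),
]
ALIGN = 16  # assumed alignment; not a value taken from the report


def reduced_ratio(width, height):
    """Return the exact aspect ratio as a reduced (w, h) pair."""
    g = gcd(width, height)
    return width // g, height // g


def is_aligned(width, height, align=ALIGN):
    """True when both sides are multiples of the assumed alignment."""
    return width % align == 0 and height % align == 0


unaligned = [(w, h) for w, h, _ in BUCKETS if not is_aligned(w, h)]
listed_images = sum(count for _, _, count in BUCKETS)

for w, h, count in BUCKETS:
    rw, rh = reduced_ratio(w, h)
    print(f"{w}x{h}: {count} images, exact ratio {rw}:{rh}, aligned={is_aligned(w, h)}")
print(f"images covered by the table: {listed_images}")
```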
------------------------------------------------------------
TOP 50 TRAINING TAGS
------------------------------------------------------------
1. 1162x 1girl
2. 1034x looking_at_viewer
3. 1033x solo
4. 1001x breasts
5. 980x long_hair
6. 839x blush
7. 689x smile
8. 595x large_breasts
9. 529x long_sleeves
10. 520x closed_mouth
11. 466x open_mouth
12. 456x bare_shoulders
13. 426x hair_between_eyes
14. 422x shirt
15. 420x thighs
16. 417x blue_eyes
17. 380x cleavage
18. 376x medium_breasts
19. 370x short_hair
20. 354x hair_ornament
21. 344x black_hair
22. 340x collarbone
23. 328x dress
24. 327x simple_background
25. 317x jewelry
26. 308x holding
27. 299x indoors
28. 298x navel
29. 297x sitting
30. 285x outdoors
31. 284x standing
32. 282x gloves
33. 275x skirt
34. 270x very_long_hair
35. 269x jacket
36. 269x white_background
37. 268x animal_ears
38. 259x brown_hair
39. 253x blonde_hair
40. 236x thighhighs
41. 232x white_shirt
42. 225x red_eyes
43. 220x parted_lips
44. 219x multicolored_hair
45. 216x cowboy_shot
46. 214x bow
47. 214x sky
48. 214x sweat
49. 207x ribbon
50. 207x purple_eyes
------------------------------------------------------------
TOP 50 TRAINING TAGS (NSFW)
------------------------------------------------------------
1. 139x nipples
2. 120x nude
3. 117x uncensored
4. 111x pussy
5. 100x from_behind
6. 98x lying
7. 94x penis
8. 81x sex
9. 80x covered_nipples
10. 73x completely_nude
11. 72x bra
12. 70x sideboob
13. 67x ass_visible_through_thighs
14. 65x spread_legs
15. 64x vaginal
16. 61x pussy_juice
17. 60x testicles
18. 59x saliva
19. 57x cameltoe
20. 53x erection
21. 52x anus
22. 50x pov
23. 45x sex_from_behind
24. 44x sex_from_behind
25. 41x huge_breasts
26. 40x pubic_hair
27. 39x clothed_sex
28. 38x cum
29. 36x bottomless
30. 35x bent_over
31. 34x wet_clothes
32. 33x oral
33. 32x straddling
34. 31x no_bra
35. 31x breasts_apart
36. 30x ass_grab
37. 29x cum_in_pussy
38. 29x clitoris
39. 27x ahegao
40. 27x rolling_eyes
41. 26x yuri
42. 25x fellatio
43. 25x breasts_out
44. 24x underwear_only
45. 23x bdsm
46. 22x standing_sex
47. 22x cleft_of_venus
48. 22x doggystyle
49. 22x anal
50. 21x cum_overflow
------------------------------------------------------------
INFERENCE SETTINGS: WHAT WORKS
------------------------------------------------------------
Recommended Setup:
CFG: 4 to 6 (sweet spot confirmed)
Steps: 30 to 40
Resolution: 768x1024 (primary), 832x1216 (more detail)
ModelSamplingFlow: Shift 3.0 (important)
CFG Normalization: NOT tested
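
To give a feel for what a shift of 3.0 does, here is a minimal sketch. It assumes the node applies the time shift commonly used for flow-matching models, sigma' = shift * sigma / (1 + (shift - 1) * sigma). The report does not give the formula, and the function names are illustrative only.

```python
def shift_sigma(sigma: float, shift: float = 3.0) -> float:
    """Remap a noise level in [0, 1] with the assumed flow-matching shift."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


def shifted_schedule(steps: int, shift: float = 3.0) -> list[float]:
    """Uniform sigmas from 1.0 down to 0.0, each remapped by the shift."""
    uniform = [1.0 - i / steps for i in range(steps + 1)]
    return [shift_sigma(s, shift) for s in uniform]


# Show how a few evenly spaced noise levels move under shift 3.0.
for s in (0.25, 0.5, 0.75):
    print(f"sigma {s:.2f} -> {shift_sigma(s):.3f}")
```

Under this assumption a shift above 1 pushes the intermediate noise levels upward, so more of the sampling steps are spent at high noise, while the endpoints 0 and 1 stay fixed.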
------------------------------------------------------------
SAMPLER & SCHEDULER RESULTS
------------------------------------------------------------
CONFIRMED WORKING (anime-style output):
* Euler Ancestral + Simple
* Euler Ancestral + Normal
* DPM++ 2M + Simple
* DPM++ 2M + Normal
* DPM++ 2M SDE + Simple
* DPM++ 3M SDE + Simple
* Res Multistep + Simple
* Res Multistep + Normal
COMPLETELY BROKEN (unrecognizable output):
* All Karras variants
* All Exponential variants
Notes:
* DPM++ 2M SDE and DPM++ 3M SDE tend to produce more realistic-looking backgrounds
* All 8 working sampler/scheduler combinations produce top-quality results
* Personal preference decides the final choice
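
The results above can be kept as a small lookup so that a workflow script refuses known-bad combinations. This is a sketch only: the lowercase identifiers are assumed ComfyUI-style names for the samplers and schedulers listed, and `check_combo` is a hypothetical helper. Combinations the report does not mention, such as DPM++ 2M SDE + Normal, are reported as untested and not guessed.

```python
# Assumed ComfyUI-style identifiers for the combinations listed above.
WORKING = {
    ("euler_ancestral", "simple"),
    ("euler_ancestral", "normal"),
    ("dpmpp_2m", "simple"),
    ("dpmpp_2m", "normal"),
    ("dpmpp_2m_sde", "simple"),
    ("dpmpp_3m_sde", "simple"),
    ("res_multistep", "simple"),
    ("res_multistep", "normal"),
}
# Schedulers reported as broken regardless of sampler.
BROKEN_SCHEDULERS = {"karras", "exponential"}


def check_combo(sampler: str, scheduler: str) -> str:
    """Classify a sampler/scheduler pair as working, broken, or untested."""
    if scheduler in BROKEN_SCHEDULERS:
        return "broken"
    if (sampler, scheduler) in WORKING:
        return "working"
    return "untested"


print(check_combo("dpmpp_2m_sde", "simple"))
print(check_combo("euler_ancestral", "karras"))
print(check_combo("dpmpp_2m_sde", "normal"))
```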
------------------------------------------------------------
PROMPT ENGINEERING FINDINGS
------------------------------------------------------------
WD Tags (Booru-style):
+ Fast to write
+ Good character details
+ Good clothing recognition
- Slightly flatter clothing textures
- Less atmospheric backgrounds
- Less "alive" feeling overall
Fulltext English (natural-language sentences):
+ Richer clothing details and textures
+ Better atmospheric backgrounds
+ More dynamic and "alive" feeling
+ Makes full use of the Qwen text encoder
+ Better scene composition
- Slightly longer to write
------------------------------------------------------------
WINNING PROMPT STRUCTURE: LAYERED FULLTEXT
------------------------------------------------------------
1. Opening line: Subject + Style
2. Character details: Clothing + Features
3. Action + Pose
4. Foreground + immediate environment
5. Background description
6. Composition + Lighting + Meta
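
The six layers above can be assembled programmatically. The sketch below is one possible way to do it. The helper name, the layer keys, and the example wording are all hypothetical illustrations and not prompts taken from the test runs.

```python
# Layer keys in the order given by the structure above (names are illustrative).
LAYER_ORDER = (
    "opening", "character", "action", "foreground", "background", "composition",
)


def build_layered_prompt(layers: dict[str, str]) -> str:
    """Join the six layers into one full-text prompt, one sentence per layer."""
    missing = [name for name in LAYER_ORDER if not layers.get(name, "").strip()]
    if missing:
        raise ValueError(f"missing layers: {', '.join(missing)}")
    sentences = []
    for name in LAYER_ORDER:
        text = layers[name].strip().rstrip(".")
        sentences.append(text + ".")
    return " ".join(sentences)


# Hypothetical example content for each layer.
example = {
    "opening": "An anime-style illustration of a young woman with long silver hair",
    "character": "She wears a navy school blazer with a red ribbon and has bright blue eyes",
    "action": "She is sitting on a bench and reading a small book",
    "foreground": "Fallen cherry petals cover the bench and the path at her feet",
    "background": "Behind her, a quiet park stretches toward a distant city skyline",
    "composition": "Cowboy shot, soft afternoon light, detailed shading",
}
prompt = build_layered_prompt(example)
print(prompt)
```

Keeping one sentence per layer makes it easy to swap a single layer, for example the background, without rewriting the rest of the prompt.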
------------------------------------------------------------
NEGATIVE PROMPT FINDINGS
------------------------------------------------------------
Rule:
POSITIVE: Fulltext
NEGATIVE: Short keyword tags
------------------------------------------------------------
TEXT GENERATION CAPABILITY
------------------------------------------------------------
Status after finetuning: INTACT
Tested:
* Comic book covers with title text
* "BLADE ZERO" title text
* "ANIME MONTHLY" magazine cover
* Issue numbers and dates
Notes:
* Large text works very well
* Small text is slightly blurry (base limitation)
* Occasional spelling errors (base model behavior)
------------------------------------------------------------
KNOWN LIMITATIONS
------------------------------------------------------------
Anatomy:
* Extra fingers / malformed hands still occur
* Floating limbs appear occasionally
* Manageable with negative prompts
* Known Z-Image Base issue, not a training fault
Style Consistency:
* Base model produces anime style ~50% of the time
* Finetuned model produces anime style consistently
Details:
* Best detail at CFG 5 to 6, Steps 35 to 40
* ModelSamplingFlow Shift 3.0 is essential
* Without Shift, results are noticeably worse
------------------------------------------------------------
QUICK START SETTINGS
------------------------------------------------------------
Node: ModelSamplingFlow, Shift 3.0
Sampler: DPM++ 2M SDE or Euler Ancestral
Scheduler: Simple
CFG: 5
Steps: 35
Resolution: 768x1024
Prompt style: Layered Fulltext
Negative: Short keyword tags
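
For scripted workflows, the quick start values can be held in one settings dictionary and checked against the recommended ranges from the inference section (CFG 4 to 6, 30 to 40 steps). This is a sketch under assumptions: the key names follow a ComfyUI-style sampler node, and `within_recommended_range` is a hypothetical helper.

```python
# Quick start values from the list above; key names are assumed, not from the report.
QUICK_START = {
    "shift": 3.0,
    "sampler_name": "dpmpp_2m_sde",  # the report also suggests Euler Ancestral
    "scheduler": "simple",
    "cfg": 5.0,
    "steps": 35,
    "width": 768,
    "height": 1024,
}


def within_recommended_range(settings: dict) -> bool:
    """Check CFG and step count against the ranges the report recommends."""
    return 4.0 <= settings["cfg"] <= 6.0 and 30 <= settings["steps"] <= 40


print(within_recommended_range(QUICK_START))
```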
Release version with more than 100 hours of computing time on an RTX 5090.
Comments (6)
Excellent model!
Very nice. Why does it fail to draw nipples on exposed breasts unless they are specified?
Yeah, I know, it happens sometimes. I am on it, but during training it saw a pretty big bunch of nipples xD. I don't know the cause yet.
Other than the list of tags (interesting), can you tell whether the checkpoint is aware of specific artists or styles?
The model only knows specific anime characters and series; overall, the dataset contained no artist works or artist prompting. I have noted it for version 2.
V02 is in training and will be released in about a week.



















