Hello there, finally tackling a Z-Image Base full finetune. It is complex but interesting.
This model adds nudity, anatomy, and variation to the base model, helping with proportions, posing, and angles, adding concepts like genitalia, and also giving a push to in-between body shapes and expressions, ranging from chubby, slender, soft, muscular, and realistic proportions all the way to extreme proportions.
The model can now produce nude men, nude feminine men, nude trans women, and nude women.
Call them like that and experiment; there are no real trigger words. Captioning was done with Kimi K2.5 for the natural-language part, and I added a tag line with Joytag so that some concepts repeat, helping the text encoder grab recurrent elements. So prompt in a mix of natural language and tags.
Should be able to do a lot of various styles.
Shift 5, CFG 4, Euler Beta, and 40–50 steps to see good results.
It works even better with the Fun 4-step distill LoRA: CFG 1, with 4 or 6 steps.
Still imperfect, obviously. V2 is in the back of my mind, with a larger dataset, better selection in mind, a deeper and harder funnel technique, and some tests around BF16 precision.
Still adjusting. Always learning.
(Human-written part)
Z-Image Base Full Finetune – Technical Notes (Experimental Observations)
This model was trained as a full finetune on Z-Image Base (BF16) using Musubi Tuner.
Training was performed on an H100 80GB, at 1024 resolution, with:
Batch size: 9
Gradient accumulation: 1
Full BF16
Flash attention enabled
Gradient checkpointing
Bucketed dataset (many varied buckets)
Dataset of 2100 images, with highly varied styles, poses, angles, and physiological variation.
The following notes are based on practical experimentation.
They are not official documentation, and should be considered empirical observations that worked in this setup.
1. Z-Image Base behaves differently from SD models
Z-Image Base uses a DiT transformer architecture and flow-based timestep sampling.
Compared to SD1.5 / SDXL:
Structural changes take longer to appear.
The model shows strong internal coherence.
It resists abrupt shifts.
Visible improvement often depends heavily on sampling settings.
It feels layered — structural changes must propagate across multiple refinement stages.
Z-Image Base is harder to move, but very stable once shaped.
2. About shift (Flow Shift) During Training
Using:
--timestep_sampling shift
--discrete_flow_shift X
modifies how timesteps are distributed during training.
From experimentation:
Higher shift (≈ 2.5+)
Emphasizes global structure.
Useful for early structural imprinting.
Mid shift (~2.0–2.2)
Appears to consolidate structure.
Balances geometry and detail.
Lower shift (1.5–1.7)
Seems to refine fine details.
Useful for finishing phases.
This suggests a staged approach:
Start higher for structure, progressively lower for refinement.
This is an experimental strategy — not an official rule.
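The shift remapping above can be sketched with the standard flow-matching shift formula (the form used by SD3-style schedulers; assuming Musubi Tuner's `--discrete_flow_shift` applies the same mapping, which is worth verifying against its source):

```python
def shift_timestep(t: float, shift: float) -> float:
    """Remap a uniform timestep t in [0, 1] with a flow shift.

    Higher shift pulls sampling density toward the high-noise end
    (global structure); shift = 1.0 is the identity (no remapping),
    and lower shift values concentrate steps on low-noise details.
    """
    return shift * t / (1.0 + (shift - 1.0) * t)

# A mid-schedule timestep under shift 2.5 lands well past 0.5,
# i.e. the model spends more of training on structural timesteps.
mid = shift_timestep(0.5, 2.5)
```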
3. Optimizer Choice (Adafactor vs AdamW)
In this setup, Adafactor performed better than AdamW for full finetuning.
Observed behavior:
Lower VRAM usage
Larger stable batch size (9 at 1024 on H100)
More stable long-phase convergence
Example configuration used:
--optimizer_type adafactor
--optimizer_args "relative_step=False" "scale_parameter=False" "warmup_init=False"
--lr_scheduler constant_with_warmup
Again, this is empirical — other setups may vary.
4. Learning Rate Funnel Strategy
A progressive reduction strategy was used:
Early phase: higher LR for structural change
Mid phase: moderate LR for stabilization
Final phases: very low LR for micro-refinement
The core idea:
Move the structure first.
Refine later without breaking global coherence.
Z-Image Base appears to benefit from staged training rather than a flat learning rate schedule.
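The funnel can be sketched as a piecewise-constant schedule. The phase fractions and learning-rate values below are illustrative placeholders, not the exact values used for this finetune:

```python
def funnel_lr(step: int, total_steps: int,
              phases=((0.4, 2e-5), (0.35, 1e-5), (0.25, 4e-6))) -> float:
    """Piecewise-constant 'funnel' learning rate.

    phases: (fraction_of_training, lr) pairs, in order. Early phase
    uses a higher LR to move structure, later phases drop the LR to
    refine without breaking global coherence. Values are examples only.
    """
    progress = step / total_steps
    cumulative = 0.0
    for fraction, lr in phases:
        cumulative += fraction
        if progress < cumulative:
            return lr
    return phases[-1][1]  # final-phase LR past the end of the schedule
```

In practice each phase would be a separate training run resumed from the previous checkpoint, with the LR set via the trainer's flags rather than computed per step.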
5. Dataset and Bucketing
The dataset was:
1024 resolution aligned with Z-Image Base
Fully bucketed
With many varied aspect buckets
Multi-distribution (varied morphologies)
Using many buckets helped:
Preserve structural consistency
Avoid overfitting to a narrow framing
Maintain prompt flexibility
6. Sampling Matters More Than Expected
Z-Image Base can look soft or “vaporous” under weak guidance.
Under stronger guidance (e.g. 4.0) and sufficient steps (e.g. 50):
Structural improvements become significantly clearer.
Anatomical refinement is more visible.
Prompt conditioning becomes stronger.
When evaluating finetunes:
Use consistent seeds
Test guidance 3–5
Try multiple flow_shift sampling values (e.g. 3 and 5)
Compare phases side by side. Loss is not very telling here: just watch that it does not spike. You will not see deep dives in the curve, so your best bet is sampling between training phases.
The model’s internal changes may not appear under weak sampling settings.
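The evaluation checklist above can be sketched as a simple parameter sweep. The `generate` callable here is a hypothetical stand-in for whatever inference pipeline you use (ComfyUI API, diffusers, etc.); only the sweep structure is the point:

```python
from itertools import product

def evaluation_grid(generate, prompt: str):
    """Render a fixed grid of (seed, guidance, flow_shift) settings.

    Keeping seeds and settings identical across checkpoints makes
    phase-to-phase comparison meaningful.
    """
    seeds = [42, 1234]                 # fixed across all checkpoints
    guidance_scales = [3.0, 4.0, 5.0]  # guidance 3-5 range
    flow_shifts = [3.0, 5.0]           # e.g. shift 3 and 5
    results = []
    for seed, cfg, shift in product(seeds, guidance_scales, flow_shifts):
        image = generate(prompt, seed=seed, cfg=cfg, shift=shift, steps=50)
        results.append(((seed, cfg, shift), image))
    return results
```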
7. Important: These Are Hypotheses
Everything above is based on hands-on experimentation with:
Musubi Tuner
Z-Image Base (BF16)
H100 80GB
1024 resolution
Large batch (9)
Progressive shift funnel
Z-Image Base is complex.
Different datasets, hardware, or goals may respond differently.
These notes should be treated as:
Practical observations
Not universal truth
A starting point for experimentation
This finetune aimed to preserve:
Z-Image Base’s native photorealistic grain
Structural coherence
Prompt responsiveness
Stability under guidance
If you experiment further with Z-Image Base,
structured training and careful sampling evaluation seem essential.
(This note was written with AI assistance, to avoid confusion: I am not a native speaker, but I reread it, approve it, and take responsibility for this write-up of my experience.)
Description
The first one, not the last.