Super Mario — Z-Image Turbo LoRA

A character LoRA for Z-Image Turbo that brings everyone's favourite plumber to life with cinematic CGI quality. It offers clean identity retention and sharp fabric detail, and plays well with a wide range of environments and lighting setups.
Model Details
Base Model: Z-Image Turbo (Tongyi-MAI)
LoRA Rank: 8
Alpha: 4
Training Resolution: 512px
Training Steps: 1750
Optimizer: AdamW
Precision: bf16
Trigger Word: mrioman
Usage
Trigger Words
mrioman, a 3D CGI style character with short brown hair, wearing a red cap, a black moustache, blue overalls, and white gloves
Example Prompt
masterpiece, absurd res, close up shot of mrioman, a 3D CGI style character with short brown hair, wearing a red cap, a black moustache, blue overalls, and white gloves, sitting comfortably on a grey sofa while holding a slice of pepperoni pizza in one hand and giving a thumbs-up gesture with the other. He has an enthusiastic open-mouthed smile as he watches a video game displayed on a large television screen to his left. The cozy living room background features warm string lights, a lit lamp, bookshelves filled with games, and a window showing it is nighttime outside.
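The example prompt follows a simple pattern: quality tags, a shot type, the trigger word with the full character description, then the scene. A minimal sketch of that pattern in Python is below. The names build_prompt, TRIGGER and CHARACTER are hypothetical helpers for illustration and are not part of any tool.

```python
# Trigger word and character description copied from the "Trigger Words" section.
TRIGGER = "mrioman"
CHARACTER = (
    "a 3D CGI style character with short brown hair, wearing a red cap, "
    "a black moustache, blue overalls, and white gloves"
)


def build_prompt(
    scene: str,
    shot: str = "close up shot of",
    quality: str = "masterpiece, absurd res",
) -> str:
    """Assemble: quality tags, shot type, trigger + description, scene."""
    scene = scene.strip()
    if not scene:
        raise ValueError("scene must not be empty")
    return f"{quality}, {shot} {TRIGGER}, {CHARACTER}, {scene}"


# Example: reuse the opening of the scene from the example prompt.
prompt = build_prompt("sitting comfortably on a grey sofa while holding a slice of pepperoni pizza.")
```

Keeping the trigger word and description in one place makes it harder to drop part of the description by accident when swapping scenes.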
Negative Prompt
(leave empty — Z-Image Turbo does not use negative prompts)
Recommended Settings
Setting: Value
Sampler: res_multistep
Scheduler: simple
Steps: 8-9
CFG Scale: 1.0
LoRA Strength: 0.8 – 1.2
Resolution: 768x1280 / 832x1216 / 1024x1024
⚠️ Z-Image Turbo requires CFG 1.0 and no negative prompt. Do not use standard samplers like Euler or DPM — use res_multistep for best results.
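If you drive generation from a script, the recommended settings can be kept in one place and checked before a run. The sketch below is plain Python and does not call ComfyUI or any other library. Settings and check_settings are hypothetical names, and the defaults are simply the values listed above.

```python
from dataclasses import dataclass

# Resolutions listed in the recommended settings (width, height).
RECOMMENDED_RESOLUTIONS = {(768, 1280), (832, 1216), (1024, 1024)}


@dataclass
class Settings:
    sampler: str = "res_multistep"
    scheduler: str = "simple"
    steps: int = 8
    cfg: float = 1.0
    lora_strength: float = 1.0
    width: int = 768
    height: int = 1280
    negative_prompt: str = ""


def check_settings(s: Settings) -> list[str]:
    """Return one warning per value that departs from the recommendations."""
    warnings = []
    if s.sampler != "res_multistep":
        warnings.append(f"sampler is {s.sampler!r}; res_multistep is recommended")
    if s.scheduler != "simple":
        warnings.append(f"scheduler is {s.scheduler!r}; simple is recommended")
    if s.steps not in (8, 9):
        warnings.append(f"steps is {s.steps}; 8-9 is recommended")
    if s.cfg != 1.0:
        warnings.append(f"CFG is {s.cfg}; Z-Image Turbo requires 1.0")
    if not 0.8 <= s.lora_strength <= 1.2:
        warnings.append(f"LoRA strength {s.lora_strength} is outside 0.8-1.2")
    if (s.width, s.height) not in RECOMMENDED_RESOLUTIONS:
        warnings.append(f"resolution {s.width}x{s.height} is not in the recommended list")
    if s.negative_prompt:
        warnings.append("negative prompt is set; leave it empty")
    return warnings
```

For example, check_settings(Settings(sampler="euler", cfg=7.0)) returns two warnings, while the defaults return an empty list.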
Tips
Portrait ratio gives the cleanest character renders — try 768x1280
LoRA strength 1.0 is the sweet spot for most prompts
Drop to 0.8 if colours look oversaturated
Push to 1.2 if the Mario identity feels weak
Works great with indoor environments, game-inspired backgrounds, and toy/collectible aesthetics
Background characters from the Mario universe appear naturally at higher strengths
ComfyUI Setup
Place the .safetensors file in ComfyUI/models/loras/
Add a Load LoRA node to your workflow
Connect it between your Z-Image Turbo checkpoint loader and your sampler
Wire both Model and CLIP outputs through the LoRA node
Set strength to 1.0 and prompt with the trigger word
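The first step can also be scripted. This is a minimal sketch that copies a downloaded .safetensors file into the models/loras folder under a ComfyUI install. install_lora is a hypothetical helper, and both paths are whatever applies on your machine.

```python
import shutil
from pathlib import Path


def install_lora(lora_file: str, comfyui_root: str) -> Path:
    """Copy a .safetensors LoRA into <comfyui_root>/models/loras/ and return the new path."""
    src = Path(lora_file)
    if src.suffix != ".safetensors":
        raise ValueError(f"expected a .safetensors file, got {src.name!r}")
    if not src.is_file():
        raise FileNotFoundError(src)
    dest_dir = Path(comfyui_root) / "models" / "loras"
    dest_dir.mkdir(parents=True, exist_ok=True)  # create the folder if it is missing
    dest = dest_dir / src.name
    shutil.copy2(src, dest)
    return dest


# Usage (paths are placeholders, adjust to your setup):
# install_lora("Downloads/my_lora.safetensors", "ComfyUI")
```

After copying, refresh or restart ComfyUI so the file shows up in the Load LoRA node.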
💙 Enjoyed this LoRA?
Every release is trained on curated datasets built from scratch. If you want to support more character releases, get early access to future drops, and access exclusive models that never go public — come join us on Ko-fi:
I'm also saving toward a hardware upgrade so I can expand into Klein9b and LTX-2.3 video LoKRs and deliver releases faster. Every coffee helps make it happen. ☕
💙 Discord:
Join my Discord, where I talk about all of my releases and entertain requests for future LoRAs!
Comments (9)
Your description image shows another game character. Should we expect more game characters to come to Z-Image from you?
Yes, Diana from Pragmata is coming later today! That's the teaser. She's coming out for Z-Image Turbo and Anima!
@Winnougan Will it be trained on Base? If it is trained on Base, it could also be used on Turbo with better performance.
Base for almost all models, but for Z-Image you can use either base or turbo.
@Winnougan If it is trained on Base, then that's good. Z-Image Turbo LoRAs do not work well on Z-Image Base. Thank you.
How do I train a LoRA like this?
Easy. There are no secrets here.
1. Make sure you have a capable GPU with at least 12 GB of VRAM (minimum RTX 3060 12 GB, but I recommend an RTX 5060 Ti 16 GB or an RTX 5070 12 GB).
2. Get AI Toolkit from Ostris on GitHub
3. Make sure you have a clean, high quality dataset
4. Caption your images - I use Gemma-4 9b in LM Studio, since it's vision-based. Use natural language for Z-Image Turbo
5. Create your LoRA inside AI Toolkit - the default parameters may need adjusting. For example, take the LoRA rank down to 16 instead of 32 if you are running out of VRAM, or lower the input resolution.
6. Use at least 2000 steps or 10-15 epochs.
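To see how a step count and an epoch count relate for your own dataset, here is a rough arithmetic sketch. It assumes the common convention that one epoch is one pass over every image (times any repeats) and that each step consumes one batch. Trainers can count differently, so treat the numbers as estimates. The function names and the 150-image dataset in the comments are made up for illustration.

```python
import math


def steps_per_epoch(num_images: int, batch_size: int = 1, repeats: int = 1) -> int:
    """Steps needed to see every image once (times repeats), one batch per step."""
    return math.ceil(num_images * repeats / batch_size)


def total_steps(num_images: int, epochs: int, batch_size: int = 1, repeats: int = 1) -> int:
    """Total training steps for a given number of epochs."""
    return epochs * steps_per_epoch(num_images, batch_size, repeats)


def epochs_for_steps(num_images: int, steps: int, batch_size: int = 1, repeats: int = 1) -> float:
    """How many passes over the dataset a given step budget amounts to."""
    return steps / steps_per_epoch(num_images, batch_size, repeats)


# Hypothetical 150-image dataset at batch size 1:
# total_steps(150, 15) is 2250 steps, and
# epochs_for_steps(150, 2000) is roughly 13.3 epochs.
```

With a small dataset, 2000 steps covers far more than 15 epochs, so it is worth checking both numbers before committing to a long run.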
@Winnougan The dataset stage is where I struggle. 1. Resolution: which resolutions work? Should I mix several in the same dataset or use only one? 2. What should be in the pictures? If I want to train on pictures of X, should X be in close-up or not? Should the images contain other things around it, or should those be masked? How will the diffusion stage then understand whether X is part of the image when it is requested? So many of my struggles relate to the dataset alone. SDXL is my first test, assuming this part works the same way for training on any base model.
I just need to prepare the dataset and pair it with the text files that hold the tags. There is still no decent guide anywhere that covers only this stage, and this stage is all I need to understand. If you can help, please do.
@amazingbeauty
1. For dataset resolution, I prefer 1024x1024, or 1024x1280 for portrait. I don't use landscape data.
2. What should be inside? The character, clear with their distinct features, uniform, etc. Backgrounds should be included - never plain white.
3. I don't recommend mixing and matching resolutions.
4. SDXL, which you're training for, requires distinct tagging in your captions
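For the pairing step mentioned in the question (each image next to a text file that holds its caption or tags), a quick sanity check before training can catch gaps. This sketch assumes the common layout where image.png sits beside image.txt in one folder. The function name and the extension list are my own choices, so adjust them to whatever your trainer expects.

```python
from pathlib import Path

# Image types to look for; extend as needed.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def find_caption_problems(dataset_dir: str) -> tuple[list[str], list[str]]:
    """Return (images with no .txt file, images whose .txt file is empty)."""
    missing, empty = [], []
    for image in sorted(Path(dataset_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_EXTENSIONS:
            continue  # skip captions and any other non-image files
        caption = image.with_suffix(".txt")
        if not caption.is_file():
            missing.append(image.name)
        elif not caption.read_text(encoding="utf-8").strip():
            empty.append(image.name)
    return missing, empty
```

Run it on the dataset folder and fix anything it reports before starting a training job.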



