QWEN-Anime | Beta3-AIO
Advanced Anime Generation with Image Editing
Note: All versions are combined on this model card for convenience.
VERSION UPDATES
VERSION 3 - LATEST (Beta3-AIO)
MAJOR UPDATE - Image Editing Revolution!
NEW FEATURES:
Image Editing functionality - Edit 1-3 images simultaneously
Dual workflow - Text-to-Image AND Image-to-Image
Upgraded base model - Qwen Image Edit 2511 (from 2509)
Faster generation - 4 steps minimum (down from 8)
Custom uncut model - Qwen 2.5 VL 7B FP8 for maximum creative freedom
NSFW capabilities - Partial nudity and clothing removal possible
FP8 only - Other formats available on request
IMPROVEMENTS:
Combine multiple characters from different images
Transform and merge scenes
Style transfer between images
Enhanced detail preservation
More consistent results
TIPS FOR BEST RESULTS:
Results depend on seed, prompt, and input images
For NSFW content: Load NSFW image as second image for better guidance
Experiment with different combinations
WORKFLOW EVOLUTION:
V1/V2: Text-to-Image only
V3: Text-to-Image + Image-to-Image
VERSION 2 (Beta2-AIO)
FEATURES:
All-in-One format (no separate VAE/Text Encoder needed)
Two variants: Full (20+ steps) and Pruned (6-8 steps)
FP8 precision (26.99 GB)
Integrated VAE + Text Encoder
Single file, plug-and-play
IMPROVEMENTS:
Easier setup vs Beta1
Same quality, simpler workflow
Lightning LoRA compatible
VERSION 1 (Beta1 - Legacy)
FEATURES:
Original release
FP16 only (38.05 GB)
Requires separate VAE + Text Encoder
Base training
Feature         | BETA3-AIO     | BETA2-AIO   | BETA2     | BETA1
----------------+---------------+-------------+-----------+----------
Image Editing   | 1-3 images    | No          | No        | No
Text-to-Image   | Yes           | Yes         | Yes       | Yes
Base Model      | 2511          | 2509        | 2509      | Base
Min Steps       | 4             | 6-8         | 6-8       | 20+
Setup           | Single file   | Single file | 3 files   | 3 files
VAE/Encoder     | Integrated    | Integrated  | Separate  | Separate
NSFW            | Yes (limited) | Limited     | Limited   | Limited
File Size       | 27 GB         | 27 GB       | 19-38 GB  | 38 GB
Format          | FP8           | FP8         | Multi     | FP16
Speed (8 steps) | ⚡⚡⚡           | ⚡⚡⚡         | ⚡⚡        | ⚡
Available Versions & Formats
Beta3-AIO (Recommended)
Format:
FP8 (26.99 GB) - 4+ steps, CFG 3.5
Other formats: Available on request
What's included:
Image Editing (1-3 images)
Text-to-Image
Integrated VAE + Text Encoder
Uncut model for creative freedom
Settings:
Steps: 4-20 (recommend 20 for quality)
CFG: 3.5
Sampler: Euler
Scheduler: Beta
Beta2-AIO
Variants:
Full Model FP8 (26.99 GB) - 20+ steps, CFG 2.5-4.0 (quality mode)
Pruned Model FP8 (26.99 GB) - 6-8 steps, CFG 1.0 (speed mode)
What's included:
Integrated VAE + Text Encoder
Single file, plug-and-play
Use regular "Load Checkpoint" node
Beta2 (Safetensors & GGUF)
Requires separate VAE + Text Encoder
SafeTensor Versions:
BF16 (38.05 GB)
FP16 (38.05 GB)
FP8 (19.03 GB)
GGUF Versions (requires ComfyUI-GGUF):
F16 (38.07 GB)
Q8 (20.23 GB)
Q6_K (15.63 GB)
Q4_K_S (10.72 GB)
Beta1 (Legacy)
FP16 only (38.05 GB)
Requires separate VAE + Text Encoder
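As a rough aid for picking a download, the sizes listed above can be put in a small lookup and filtered by a size budget. This is only a sketch: the dictionary keys and the function name `files_within_budget` are made up for illustration, and the sizes cover the model file only (the separate Beta2 and Beta1 files still need a VAE and text encoder on top).

```python
# File sizes in GB, copied from the version list above.
# The keys are illustrative labels, not official file names.
FILE_SIZES_GB = {
    "Beta3-AIO FP8": 26.99,
    "Beta2-AIO Full FP8": 26.99,
    "Beta2-AIO Pruned FP8": 26.99,
    "Beta2 BF16": 38.05,
    "Beta2 FP16": 38.05,
    "Beta2 FP8": 19.03,
    "Beta2 GGUF F16": 38.07,
    "Beta2 GGUF Q8": 20.23,
    "Beta2 GGUF Q6_K": 15.63,
    "Beta2 GGUF Q4_K_S": 10.72,
    "Beta1 FP16": 38.05,
}


def files_within_budget(budget_gb: float) -> list[str]:
    """Return the labels whose file size fits the budget, largest first."""
    fitting = [(size, name) for name, size in FILE_SIZES_GB.items() if size <= budget_gb]
    # Sort by descending size, then by name so ties come out in a stable order.
    return [name for size, name in sorted(fitting, key=lambda pair: (-pair[0], pair[1]))]


if __name__ == "__main__":
    for name in files_within_budget(20):
        print(name, FILE_SIZES_GB[name], "GB")
```

With a 20 GB budget this leaves the Beta2 FP8 safetensors file and the two smaller GGUF quantizations.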
TEST RESULTS
Beta3-AIO Image Editing Test
Tested on Nvidia RTX 4060 with Euler sampler
Test: Multi-Image Composition
Prompt:
Place the two figures in a fantasy medieval tavern, laughing and clinking two beer glasses.
Images Used:
Image 1: Character A
Image 2: Character B
Result:
Successfully combined both characters
Tavern setting accurately generated
Natural interaction and poses
Consistent anime style maintained
Beta2-AIO Test Results
Tested on Nvidia RTX 4060 with Euler A sampler
Full Model fp8 (20+ Steps Version)
Test 1: Elegant Shrine Maiden
Resolution: 1024×1024 | Steps: 24 | CFG: 3.6 | Time: ~176.48s

Prompt:
anime, masterpiece, best quality, 1girl, shrine maiden, long black hair,
red hakama, white kimono top, holding paper talisman, sacred shrine background,
cherry blossoms falling, soft sunlight, detailed face, serene expression,
traditional japanese architecture, torii gate in background,
cinematic lighting, depth of field
Test 2: Cyberpunk Street Scene
Resolution: 1536×1024 | Steps: 28 | CFG: 4.0 | Time: ~229.52s

Prompt:
anime, 2k quality, ultra-detailed, 1girl, cyberpunk hacker,
neon-lit tokyo street, rain reflections, holographic advertisements,
purple and cyan color scheme, tech wear jacket, mechanical arm augmentation,
confident pose, sharp focus, cinematic composition, bokeh background,
night city atmosphere, detailed eyes
Test 3: Fantasy Dragon Knight
Resolution: 832×1216 | Steps: 32 | CFG: 3.8 | Time: ~227.94s

Prompt:
anime, masterpiece, high detail, 1girl, dragon knight,
silver armor with blue accents, flowing cape, dragon companion beside her,
epic fantasy landscape, castle ruins background, dramatic sky,
wind effect on hair and cape, detailed armor patterns,
heroic pose, cinematic lighting, depth of field
Pruned Model fp8 (6-8 Steps Version)
Test 4: Cozy Cafe Moment
Resolution: 1024×1024 | Steps: 8 | CFG: 1.0 | Time: ~32.47s

Prompt:
anime, best quality, 1girl, casual outfit, sitting in cafe,
holding coffee cup, warm lighting, bokeh background,
soft smile, detailed eyes, cozy atmosphere,
window light, autumn colors, relaxed pose
Test 5: Magical Girl Transformation
Resolution: 512×768 | Steps: 7 | CFG: 1.0 | Time: ~19.28s

Prompt:
anime, masterpiece, 1girl, magical girl, transformation pose,
sparkles and light effects, flowing hair, colorful costume,
magic circle background, dynamic composition,
vibrant colors, detailed ribbons, glowing effects
Test 6: Beach Sunset Portrait
Resolution: 1024×1536 | Steps: 6 | CFG: 1.0 | Time: ~32.07s

Prompt:
anime, best quality, 1girl, summer dress, beach sunset,
golden hour lighting, ocean waves, soft wind effect on hair,
warm colors, peaceful expression, detailed face,
cinematic sunset, depth of field, romantic atmosphere
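To compare the six Beta2-AIO runs on equal footing, the reported times can be divided by the step counts. The sketch below does only that arithmetic. It assumes the reported time is the total wall-clock time per image, so model loading or VAE decoding (if included) would inflate the per-step figure, and the resolutions differ between tests. The helper name `seconds_per_step` is illustrative.

```python
# (test name, steps, reported time in seconds), copied from the results above.
TESTS = [
    ("Elegant Shrine Maiden", 24, 176.48),
    ("Cyberpunk Street Scene", 28, 229.52),
    ("Fantasy Dragon Knight", 32, 227.94),
    ("Cozy Cafe Moment", 8, 32.47),
    ("Magical Girl Transformation", 7, 19.28),
    ("Beach Sunset Portrait", 6, 32.07),
]


def seconds_per_step(steps: int, total_seconds: float) -> float:
    """Average time per sampling step, given a total time for the whole run."""
    if steps <= 0:
        raise ValueError("steps must be positive")
    return total_seconds / steps


if __name__ == "__main__":
    for name, steps, total in TESTS:
        print(f"{name}: {seconds_per_step(steps, total):.2f} s/step")
```

On these numbers the full-model runs come out at roughly 7 to 8 seconds per step and the pruned-model runs at roughly 3 to 5.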
SETTINGS & USAGE
Recommended Settings by Version
Beta3-AIO Settings
Text-to-Image Mode:
Steps: 4-8
CFG: 1
Sampler: Euler
Scheduler: Simple
Resolution: 512×512 to 4K
Image Editing Mode:
Steps: 4-8
CFG: 1
Sampler: Euler
Images: 1-3 (Image 1 required)
Tip: Higher steps for complex edits
NSFW Content:
Load NSFW reference as Image 2 or 3
Be specific in prompt
Results vary - experiment with seeds
Beta2-AIO Full Model (Quality Mode)
Steps: 20-32
CFG: 2.5-4.0 (sweet spot: 3.6)
Sampler: Euler A or Euler, with the Normal, Beta, or Simple scheduler
Use for: High quality, detailed work, final renders
Beta2-AIO Pruned Model (Speed Mode)
Steps: 6-8 (optimal: 8)
CFG: 1.0 (max 2.0, but stay at 1.0)
Sampler: Euler A recommended
Use for: Fast iterations, testing, quick generations
Universal Settings (All Versions)
Resolution: 512×512 to 2048×1152
VRAM: 8GB+ recommended
Lightning LoRAs: Compatible (4-step or 8-step)
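The per-version settings in this section can be kept in a small preset table so a script does not drift outside the recommended ranges. This is a minimal sketch, not part of any workflow shipped with the model: the `Preset` class, the preset keys, and `clamp_steps` are hypothetical names. The Beta3 values follow this section (4-8 steps, CFG 1); the format list earlier on the card gives 4-20 steps and CFG 3.5 instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preset:
    min_steps: int
    max_steps: int
    cfg: float
    sampler: str


# Values taken from "Recommended Settings by Version"; keys are illustrative.
PRESETS = {
    "beta3-aio": Preset(min_steps=4, max_steps=8, cfg=1.0, sampler="Euler"),
    "beta2-aio-full": Preset(min_steps=20, max_steps=32, cfg=3.6, sampler="Euler A"),
    "beta2-aio-pruned": Preset(min_steps=6, max_steps=8, cfg=1.0, sampler="Euler A"),
}


def clamp_steps(version: str, steps: int) -> int:
    """Pull a requested step count into the recommended range for a version."""
    preset = PRESETS[version]
    return max(preset.min_steps, min(steps, preset.max_steps))


if __name__ == "__main__":
    print(clamp_steps("beta2-aio-pruned", 20))
    print(PRESETS["beta2-aio-full"])
```

For example, asking for 20 steps on the pruned model is clamped down to its recommended maximum of 8.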
Which Version Should I Choose?
Choose Beta3-AIO if:
- Want image editing capabilities
- Need to combine multiple images
- Want latest features and improvements
- Need NSFW capabilities
- Want fastest base model (4+ steps)
Choose Beta2-AIO (Pruned) if:
- Want fastest text-to-image (6-8 steps)
- Need quick iterations/testing
- Prefer simplicity (single file)
- 8GB+ VRAM available
Choose Beta2-AIO (Full) if:
- Want maximum quality
- Need more control (CFG 2.5-4.0)
- Creating final/detailed work
- Prefer traditional workflow
Choose Beta2 FP8 if:
- Want flexibility (separate VAE/encoder)
- Using custom VAE/encoders
- Need maximum compatibility
Choose Beta2 GGUF if:
- Limited VRAM (6-8GB)
- Want smallest files (Q4 = 10GB)
- CPU inference needed
Choose Beta1 if:
- Compatibility with old workflows
- Testing/comparison purposes
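The checklist above can be condensed into a simple decision function. The sketch below is one possible reading of it, not an official rule: the function name, its parameters, and above all the order in which the criteria are checked are assumptions, and Beta1 is left out because it is listed only for legacy workflows and comparisons.

```python
def choose_version(
    needs_editing: bool = False,
    vram_gb: float = 8.0,
    max_quality: bool = False,
    separate_components: bool = False,
) -> str:
    """Pick a version label from a few of the criteria listed above.

    The priority order is an assumption made for this sketch.
    """
    if needs_editing:
        # Only Beta3-AIO supports image editing.
        return "Beta3-AIO"
    if vram_gb < 8:
        # The card suggests GGUF for limited VRAM (6-8GB).
        return "Beta2 GGUF"
    if separate_components:
        # Custom VAE or text encoder: use the non-AIO FP8 file.
        return "Beta2 FP8"
    if max_quality:
        return "Beta2-AIO (Full)"
    return "Beta2-AIO (Pruned)"


if __name__ == "__main__":
    print(choose_version(needs_editing=True))
    print(choose_version(vram_gb=6))
    print(choose_version(max_quality=True))
```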
INSTALLATION GUIDE
Beta3-AIO (Easiest!)
Download Beta3-AIO FP8
Place in ComfyUI/models/checkpoints/
Load with standard "Load Checkpoint" node
For Image Editing: Use provided workflow
Generate!
No extra files needed!
Beta2-AIO
Download your preferred version (Full or Pruned)
Place in ComfyUI/models/checkpoints/
Load with standard "Load Checkpoint" node
Generate!
No extra files needed!
Beta2 (Safetensors)
Download checkpoint → diffusion_models/
Download Text Encoder → text_encoders/QWEN/
Download VAE → vae/QWEN/
Use "Load Diffusion Model" node
Beta2 (GGUF)
Install ComfyUI-GGUF: https://github.com/city96/ComfyUI-GGUF
Download GGUF → unet/
Download Text Encoder + VAE (same as Safetensors)
Use "GGUF Loader" node
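The folder layout in the steps above can be written down as a lookup from file type to target directory. This is a minimal sketch under one assumption that the guide does not spell out: that the short folder names (diffusion_models/, text_encoders/QWEN/, vae/QWEN/, unet/) all sit under ComfyUI/models/, like the checkpoints folder. The kind labels, the function name, and the file name in the usage line are hypothetical.

```python
from pathlib import Path

# Assumption: every folder named in the guide lives under ComfyUI/models/.
MODELS_ROOT = Path("ComfyUI") / "models"

# File kind (illustrative labels) -> folder from the installation steps.
TARGET_DIRS = {
    "aio_checkpoint": Path("checkpoints"),
    "diffusion_model": Path("diffusion_models"),
    "text_encoder": Path("text_encoders") / "QWEN",
    "vae": Path("vae") / "QWEN",
    "gguf": Path("unet"),
}


def target_path(kind: str, filename: str, root: Path = MODELS_ROOT) -> Path:
    """Return where a downloaded file of the given kind should be placed."""
    if kind not in TARGET_DIRS:
        raise KeyError(f"unknown file kind: {kind!r}")
    return root / TARGET_DIRS[kind] / filename


if __name__ == "__main__":
    # "example.safetensors" is a placeholder, not a real file name from the card.
    print(target_path("vae", "example.safetensors").as_posix())
```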
PROMPTING TIPS
General Tips
Quality Tags (All Versions):
anime, masterpiece, best quality, ultra-detailed,
2k resolution, sharp focus, cinematic lighting
Style Modifiers:
MOE STYLE, official art, anime coloring,
detailed eyes, depth of field, bokeh
Negative Prompt:
low quality, blurry, bad anatomy, bad hands,
text, watermark, mutation, distorted
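For tag-style prompts, the quality tags above can be prepended automatically so they are never forgotten or duplicated. The sketch below is a small illustration of that idea; `build_prompt` is a hypothetical helper, and the tag lists are copied from this section.

```python
QUALITY_TAGS = [
    "anime", "masterpiece", "best quality", "ultra-detailed",
    "2k resolution", "sharp focus", "cinematic lighting",
]
NEGATIVE_PROMPT = (
    "low quality, blurry, bad anatomy, bad hands, "
    "text, watermark, mutation, distorted"
)


def build_prompt(subject_tags: list[str], quality_tags: list[str] = QUALITY_TAGS) -> str:
    """Join quality tags and subject tags into one prompt, dropping duplicates."""
    seen = set()
    merged = []
    for tag in [*quality_tags, *subject_tags]:
        cleaned = tag.strip()
        key = cleaned.lower()
        # Skip empty tags and tags already present (case-insensitive).
        if key and key not in seen:
            seen.add(key)
            merged.append(cleaned)
    return ", ".join(merged)


if __name__ == "__main__":
    print(build_prompt(["1girl", "shrine maiden", "anime"]))
    print(NEGATIVE_PROMPT)
```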
Beta3-AIO Specific Tips
Text-to-Image Prompts:
anime girl with long blue hair, wearing school uniform,
cherry blossoms in background, soft lighting, detailed
eyes, anime style, high quality
Image Editing Prompts (Single Image):
change hair color to pink, add cat ears, school uniform,
keep the same pose and composition
Image Editing Prompts (Multiple Images):
combine the character from image 1 with the background
from image 2, match the lighting and style, anime aesthetic
Important for Editing:
Be specific about changes
Describe both images and desired result
Mention style consistency if needed
Natural language works best
Beta3-AIO Specifications
Base Model: Qwen Image Edit 2511
Text Encoder: Qwen 2.5 VL 7B FP8 (uncut)
Precision: FP8
Format: AIO (All-in-One)
File Size: ~27 GB
VRAM: 8GB minimum
Steps: 4-20 (4 min, 20 recommended)
CFG: 3.5
Sampler: Euler
Scheduler: Beta
Capabilities:
Text-to-Image generation
Image-to-Image editing (1-3 images)
Character combination
Scene composition
Style transfer
Partial NSFW support
CONTENT NOTICE
NSFW Capabilities
Beta3-AIO:
Partial nudity - Supported
Clothing removal - Possible (results vary)
Artistic nudity - Breasts/underboob
Full explicit content - Not supported
Age restriction - 18+ only, use responsibly
Tips for NSFW:
Load NSFW reference image as Image 2 or 3
Results depend on seed, prompt, and input images
Experiment with different combinations
Beta2-AIO & Earlier:
Limited NSFW - Artistic nudity (breasts/underboob) supported
Full explicit content - Not supported
Age restriction - 18+ only
FAQ
General Questions
Q: Which version should I download? A: Beta3-AIO for latest features + image editing. Beta2-AIO Pruned for fastest text-to-image.
Q: Do I need separate VAE/encoder for AIO versions? A: No! AIO has everything integrated.
Q: Can I use Lightning LoRAs? A: Yes! All versions support Lightning LoRAs (4-step or 8-step).
Beta3-AIO Specific
Q: How many images can I edit at once? A: 1-3 images (Image 1 required, Images 2-3 optional).
Q: Can I still do text-to-image with Beta3? A: Yes! Beta3 supports both Text-to-Image AND Image-to-Image.
Q: How do I get better NSFW results? A: Load an NSFW reference image as Image 2 or 3 for guidance.
Q: What's the minimum steps for Beta3? A: 4 steps minimum, but 20 steps recommended for quality.
Beta2-AIO Specific
Q: What's the difference between Full and Pruned AIO? A: Full = quality mode (CFG 2.5-4.0, 20+ steps). Pruned = speed mode (CFG 1.0, 6-8 steps).
Q: Why does Pruned need CFG 1.0? A: It's optimized for low-step high-speed generation. CFG 1.0 works best.
Q: Can I use CFG 3.0 with Pruned? A: Not recommended. Max is 2.0, but results are best at 1.0.
Compatibility
Q: Is quality different between AIO and Beta2 FP8? A: No, same training - AIO just bundles files together.
Q: Which has better quality: Full AIO or Beta2 BF16? A: Beta2 BF16 has slightly better precision, but difference is minimal.
Q: Can I mix versions (e.g., Beta3 with Beta2 VAE)? A: Not recommended. Each version is optimized as a complete package.
CREDITS
Training: Custom dataset, Dual Tesla P40 GPUs
Base Model: Qwen Image Edit 2511 (Beta3), 2509 (Beta2)
Text Encoder: Qwen 2.5 VL 7B FP8 (uncut for Beta3)
Architecture: Qwen-Image-Edit framework
Community: Thanks to all Beta1, Beta2, and Beta3 testers!
Special Thanks:
Qwen team for the base models
ComfyUI community for feedback
All users who provided testing data
QUICK START
Getting Started (Beta3-AIO)
Download Beta3-AIO FP8 from files section
Place in ComfyUI/models/checkpoints/
Download the provided workflow
Load workflow in ComfyUI
Choose mode:
Text-to-Image: Write prompt, generate
Image Editing: Upload 1-3 images, write edit prompt, generate
Generate amazing anime art!
Version Information
Current Version: Beta3-AIO
Previous Versions: Beta2-AIO, Beta2, Beta1
Release Date: December 2025
License: Apache 2.0
Format: Safetensors (AIO)
Created with ❤️ for the anime AI community
Choose Beta3-AIO for the complete experience!
Description
Here's what each file contains:
SafeTensor Versions:
Pruned Model bf16 (38.05 GB) = BF16 version
Pruned Model fp16 (38.05 GB) = FP16 version
Pruned Model fp8 (19.03 GB) = FP8 version
GGUF Versions:
(Requires ComfyUI-GGUF: https://github.com/city96/ComfyUI-GGUF)
Full Model fp16 (38.07 GB) = F16 GGUF
Full Model fp8 (20.23 GB) = Q8 GGUF
Full Model bf16 (15.63 GB) = Q6_K GGUF
Full Model nf4 (10.72 GB) = Q4_K_S GGUF
Comments (20)
Hey!
Thanks for providing such an awesome checkpoint to the community!
The degree of details is astonishing!
I saw, that you used the edit model as base. However, if i try to use it with my edit workflow I get weird results. (Mostly the subject unchanged and oversaturated in the middle and artifacts to the sides.) Is the model after the finetune no longer capable to work as edit model, or do I just need another workflow?
@darios_manaris245 Hey, thanks a lot for your comment!
The part of the model responsible for editing images has been overwritten during finetuning. So if you want to edit images, you should use the original model instead. If you want your edits to match the style of my version, you can use my LoRA "Qwen Anime V2": https://civitai.com/models/1994924?modelVersionId=2373282
Update Time!
I'm currently uploading QWEN-Anime-Beta2!
Precision Versions Included
BF16
FP16
FP8
GGUF Versions
F16
Q8
Q6
Q4
I'm still figuring out the best way to list them during the upload; ideally, I'd like everything to appear on one single model card instead of spreading the files across separate tabs.
Stay tuned!
GGUF Q8 and FP16 Safetensor are still uploading
I have to be honest and admit that I am positively surprised, I am still in the testing phase but the results so far are excellent, I downloaded the fp8 and went with 25 steps but then I tried it with the 8 step LoRa and I am really satisfied, an excellent checkpoint that can replace dozens of additional LoRa, congratulations on the excellent model and thanks for sharing ....
@DiabolicX Thank you so much for your comment. I'm really glad to hear that you're enjoying the checkpoint! It's great to know that the FP8 version and even the 8-step LoRA workflow are working so well for you.
If you like how the 8-step FP8 variant performs, I think you'll be happy with what I'm releasing in the next few days: an all-in-one version in two editions, one optimized for 20+ steps and another designed for 6-8 steps. Both can be used normally with the "Load Checkpoint" node.
Thanks again for the support and for taking the time to test it!
@SeeSeeLP You're welcome, this is a really great checkpoint and a real refresh on the platform, you really did a good job with this model. As I already said, I first started testing this model with 25 steps, but then I tried with the 8-step LoRA (in this case 10 steps) and the result was equally good. Honestly, I can't wait for the new version. Thanks once more for sharing.
Update on the GGUF Versions
It looks like the GGUF builds aren't running correctly. The ARCHS type has changed from qwen to qwen_image, which is causing issues.
I'll be updating each GGUF version individually to fix this.
Thanks for your patience!
Waiting for FP8 version, it's still too difficult to run now
@513080545 Hi! Thanks for your interest!
Good news: The FP8 version is already available!
All versions are uploaded - please check the "Details" section on the right side. Click on "7 Files" to see all available models.
Here's what each file contains:
SafeTensor Versions:
Pruned Model bf16 (38.05 GB) = BF16 version
Pruned Model fp16 (38.05 GB) = FP16 version
Pruned Model fp8 (19.03 GB) = FP8 version (this is what you're looking for!)
GGUF Versions:
(Requires ComfyUI-GGUF: https://github.com/city96/ComfyUI-GGUF)
Full Model fp16 (38.07 GB) = F16 GGUF
Full Model fp8 (20.23 GB) = Q8 GGUF
Full Model bf16 (15.63 GB) = Q6_K GGUF
Full Model nf4 (10.72 GB) = Q4_K_S GGUF
The FP8 safetensors version (19.03 GB) is perfect for 8GB VRAM and runs great! This information is also listed at the top of the model description under "SafeTensor Versions".
Happy generating!
Thanks for the update!
Is there a chance for a V3 with working edit capabilities?
Unfortunately no.
My hardware simply isn't strong enough to support training edit-capable models.
When training, I already need to load the base model, the text encoder and the VAE into VRAM, and 48GB VRAM is just barely enough.
My dataset has to stay in normal system RAM already.
For edit capabilities I would need to load an additional copy of the model (EMA) during training, so the training process can correct and update the editing-related parts of the model at every step.
That's my current understanding, and maybe future training scripts won't require a second model anymore, but right now my hardware can't handle it.
I could try restoring the original edit-related layers and then check whether the anime look and everything else stays intact.
That might actually work, so I'll give it a try.
Thanks for your question. It gave me a new idea to follow up on!
please specify in future is it able to image edit or not, i broke my mind figuring out why it is not working :((
but great job, man !
@SeeSeeLP Sounds like a good idea. I hope you are successful! That would be a really nice addition.
@darios_manaris245 Qwen Anime V3 is currently being tested.
- Image Editing: done
- New base Qwen Image Edit 2511: done
What's still missing:
- Finding out how good it turned out!
- Workflow and descriptions