PONY CLIP 100k Finetune
JoyCLIP is a further advancement of this CLIP
Note: this is still a great base to start a finetune of PONY CLIP from, as the original gradient was around 50 out; starting with this model you will be in the 4-8 range.
100k is a full finetune of base pony CLIP-L and CLIP-G
100k can be used with any model base: V6, Autism, anime or realistic (even non-Pony SDXL models).
CLIP-G took 68GB and 30 hours to train.
Forge users: you will need to download ComfyUI, as in Forge CLIP replacement is only supported for FLUX (this may apply to Auto1111 as well). Once a model has been saved with the replaced CLIP, it can be used in Forge or Auto1111.
The model can be run in any UI (Forge, Auto1111, ComfyUI); the CLIP will be downcast by default. The settings below improve output complexity but are not required. (Full FP32 is not recommended, but FP32 CLIP is.)
ComfyUI: --fp32-text-enc OR --force-fp32
Forge/Auto1111: --clip-in-fp32 OR --all-in-fp32
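For example, to keep the text encoder in FP32 (illustrative launch commands only, adapt them to your own install; on Forge/Auto1111 flags normally go into COMMANDLINE_ARGS in webui-user.bat or webui-user.sh):

ComfyUI: python main.py --fp32-text-enc
Forge/Auto1111: set COMMANDLINE_ARGS=--clip-in-fp32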
Comments
Sorry, I must be doing something wrong, but it generates only noise for me... Do I need to select CLIP-L as the VAE for the Pony_100k_CLIP-G model, or should I download a CLIP-G VAE somewhere?
Here is the error from console:
Loading Model: {'checkpoint_info': {'filename': '/webui_forge_cu121_torch231/webui/models/Stable-diffusion/xl/ponyCLIP100kFinetune_pony100kCLIPG.safetensors', 'hash': '8d98e610'}, 'additional_modules': ['/webui_forge_cu121_torch231/webui/models/text_encoder/clip_l.safetensors'], 'unet_storage_dtype': None}
Traceback (most recent call last):
File "/webui_forge_cu121_torch231/webui/backend/loader.py", line 274, in forge_loader
state_dicts, estimated_config = split_state_dict(sd, additional_state_dicts=additional_state_dicts)
File "/webui_forge_cu121_torch231/webui/backend/loader.py", line 240, in split_state_dict
guess = huggingface_guess.guess(sd)
File "/webui_forge_cu121_torch231/webui/repositories/huggingface_guess/huggingface_guess/__init__.py", line 7, in guess
result.unet_key_prefix = [unet_key_prefix]
AttributeError: 'NoneType' object has no attribute 'unet_key_prefix'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/webui_forge_cu121_torch231/webui/modules_forge/main_thread.py", line 30, in work
self.result = self.func(*self.args, **self.kwargs)
File "/webui_forge_cu121_torch231/webui/modules/txt2img.py", line 131, in txt2img_function
processed = processing.process_images(p)
File "/webui_forge_cu121_torch231/webui/modules/processing.py", line 836, in process_images
manage_model_and_prompt_cache(p)
File "/webui_forge_cu121_torch231/webui/modules/processing.py", line 804, in manage_model_and_prompt_cache
p.sd_model, just_reloaded = forge_model_reload()
File "/webui_forge_cu121_torch231/webui/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/webui_forge_cu121_torch231/webui/modules/sd_models.py", line 504, in forge_model_reload
sd_model = forge_loader(state_dict, additional_state_dicts=additional_state_dicts)
File "/webui_forge_cu121_torch231/webui/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/webui_forge_cu121_torch231/webui/backend/loader.py", line 276, in forge_loader
raise ValueError('Failed to recognize model type!')
ValueError: Failed to recognize model type!
Failed to recognize model type!
Dual CLIP loader. Not as VAE.
Since I don't use ComfyUI, I gave a shot to the CLIP-L finetune by manually replacing the keys in one of my models, with success, and it is very nice. 😊
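(For reference, that kind of manual replacement boils down to something like the following; a minimal sketch, assuming both the checkpoint and the CLIP-L finetune are safetensors files and that the checkpoint stores CLIP-L under the usual conditioner.embedders.0.transformer. prefix. File names are illustrative.)

from safetensors.torch import load_file, save_file

model = load_file("my_pony_model.safetensors")        # full SDXL/Pony checkpoint
clip_l = load_file("pony_100k_clip_l.safetensors")    # the CLIP-L finetune

# CLIP-L keys match one-to-one; only the prefix differs
prefix = "conditioner.embedders.0.transformer."
replaced = 0
for k, v in clip_l.items():
    if prefix + k in model:
        model[prefix + k] = v.to(model[prefix + k].dtype)
        replaced += 1
print(f"replaced {replaced} tensors")

save_file(model, "my_pony_model_100k_clip_l.safetensors")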
But when I tried to do the same with CLIP-G, I hit a wall, and I need to understand how to adapt the key names:
In an SDXL model:

conditioner.embedders.1.model.transformer.resblocks.0.attn.in_proj_bias
conditioner.embedders.1.model.transformer.resblocks.0.attn.in_proj_weight
conditioner.embedders.1.model.transformer.resblocks.0.attn.out_proj.bias
conditioner.embedders.1.model.transformer.resblocks.0.attn.out_proj.weight
conditioner.embedders.1.model.transformer.resblocks.0.ln_1.bias
conditioner.embedders.1.model.transformer.resblocks.0.ln_1.weight
conditioner.embedders.1.model.transformer.resblocks.0.ln_2.bias
conditioner.embedders.1.model.transformer.resblocks.0.ln_2.weight
conditioner.embedders.1.model.transformer.resblocks.0.mlp.c_fc.bias
conditioner.embedders.1.model.transformer.resblocks.0.mlp.c_fc.weight
conditioner.embedders.1.model.transformer.resblocks.0.mlp.c_proj.bias
conditioner.embedders.1.model.transformer.resblocks.0.mlp.c_proj.weight

In the CLIP-G model:

text_model.encoder.layers.0.layer_norm1.bias
text_model.encoder.layers.0.layer_norm1.weight
text_model.encoder.layers.0.layer_norm2.bias
text_model.encoder.layers.0.layer_norm2.weight
text_model.encoder.layers.0.mlp.fc1.bias
text_model.encoder.layers.0.mlp.fc1.weight
text_model.encoder.layers.0.mlp.fc2.bias
text_model.encoder.layers.0.mlp.fc2.weight
text_model.encoder.layers.0.self_attn.k_proj.bias
text_model.encoder.layers.0.self_attn.k_proj.weight
text_model.encoder.layers.0.self_attn.out_proj.bias
text_model.encoder.layers.0.self_attn.out_proj.weight
text_model.encoder.layers.0.self_attn.q_proj.bias
text_model.encoder.layers.0.self_attn.q_proj.weight
text_model.encoder.layers.0.self_attn.v_proj.bias
text_model.encoder.layers.0.self_attn.v_proj.weight

In the attention head, I need to understand how to combine the query, key, and value proj weights into the "in" proj weights (and figure out which mlp.fc is which).
If you have any pointers it would be great; otherwise, I'll just dig a bit more into the CLIP code from SDXL 😉
Well, I figured it out thanks to the diffusers convert code: https://github.com/huggingface/diffusers/blob/c934720629837257b15fd84d27e8eddaa52b76e6/scripts/convert_diffusers_to_original_stable_diffusion.py#L233
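(For reference, per transformer layer the conversion amounts to roughly the following; a minimal sketch rather than the actual diffusers script, where hf_sd stands for the HF-style CLIP-G state dict and the returned keys go under conditioner.embedders.1.model.transformer.resblocks.{i}. in the SDXL checkpoint.)

import torch

def hf_layer_to_sdxl(hf_sd, i):
    p = f"text_model.encoder.layers.{i}."
    out = {}
    # q/k/v projections are concatenated, in that order, into a single in_proj tensor
    out["attn.in_proj_weight"] = torch.cat([hf_sd[p + "self_attn.q_proj.weight"],
                                            hf_sd[p + "self_attn.k_proj.weight"],
                                            hf_sd[p + "self_attn.v_proj.weight"]], dim=0)
    out["attn.in_proj_bias"] = torch.cat([hf_sd[p + "self_attn.q_proj.bias"],
                                          hf_sd[p + "self_attn.k_proj.bias"],
                                          hf_sd[p + "self_attn.v_proj.bias"]], dim=0)
    out["attn.out_proj.weight"] = hf_sd[p + "self_attn.out_proj.weight"]
    out["attn.out_proj.bias"] = hf_sd[p + "self_attn.out_proj.bias"]
    # straight renames: layer_norm1 -> ln_1, layer_norm2 -> ln_2, mlp.fc1 -> mlp.c_fc, mlp.fc2 -> mlp.c_proj
    for sd_name, hf_name in [("ln_1", "layer_norm1"), ("ln_2", "layer_norm2"),
                             ("mlp.c_fc", "mlp.fc1"), ("mlp.c_proj", "mlp.fc2")]:
        out[sd_name + ".weight"] = hf_sd[p + hf_name + ".weight"]
        out[sd_name + ".bias"] = hf_sd[p + hf_name + ".bias"]
    return out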
Well, I managed to merge it into one of my own models and, even in FP16, it looks marvelous. Now, if I add the source of the CLIP in my model description, would I be allowed to release this version, or would you prefer I keep it to myself?
@n_Arno All CLIP models are subject to the original license and required to be open source. So you're welcome to do what you wish with it, without attribution; thanks for asking though.
Hmm, using the L model for FLUX is improving the output notably; however, you have to use the right text encoder, like T5XXL. Looks better than the pro model, interesting.
I have not tested 100k CLIP-L with T5xxl or FlanT5xxl. Was it an NSFW finetune that showed improvement, or base?
@Felldude Interesting question. Normally I use jibmix, but it holds true for dev as well; it seems to affect models in general. Aside from improving general picture quality, it gets many more details right that are usually dropped in longer prompts.
@androsynth7610 Very interesting findings. I would not have expected any improvement on FLUX, given it was PONY based and then realigned to ViT CLIP. Thanks for posting.
@Felldude Fantastic work.
I tested the CLIP-L in FLUX; it appears to be static with the 100k CLIP for PONY. It's possible that it did not load correctly for you.
No, it's not a loading mistake, but I guess the workflows are as different as they come.
I'm a bit confused, sorry. Is this supposed to be used as CLIP only? As in, I can use CyberRealism Pony for the model and this as the CLIP driving it?
Yes, CLIP-G and CLIP-L are the two CLIPs in SDXL or Pony.
