CivArchive

    PONY CLIP 100k Finetune

JoyCLIP is a further advancement of this CLIP.

Note: this is still a great base from which to start a finetune of PONY CLIP, as the gradient started out around 50. Starting with this model, you will be in the 4-8 range.

    • 100k is a full finetune of base pony CLIP-L and CLIP-G

• 100k can be used with any model base: V6, Autism, Anime, or Realistic (even non-Pony SDXL models)

    • CLIP-G took 68GB and 30 hours to train.

Forge users: you will need to download ComfyUI, as CLIP replacement is only supported for FLUX (this may apply to Auto1111 as well). Once a model is saved with the replaced CLIP, it can be used in Forge or Auto1111; a scripted sketch of that key swap follows the flag list below.


Run the text encoder in full precision (fp32) with one of these launch flags:

• ComfyUI: --fp32-text-enc OR --force-fp32

• Forge/Auto1111: --clip-in-fp32 OR --all-in-fp32
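
For reference, a minimal sketch of the manual key replacement described above, assuming the finetuned CLIP-L ships in the usual text_model.* layout and the checkpoint is a standard sgm-format SDXL file (file names are placeholders, not from this page):

# Hypothetical sketch: swap the CLIP-L tensors inside an SDXL checkpoint.
# Assumes the standalone CLIP-L uses keys like text_model.encoder.layers...
from safetensors.torch import load_file, save_file

clip_l = load_file("pony_100k_clip_l.safetensors")  # placeholder file name
model = load_file("my_sdxl_model.safetensors")      # placeholder file name

# SDXL stores CLIP-L (the first text encoder) under this prefix.
PREFIX = "conditioner.embedders.0.transformer."

replaced = 0
for key, tensor in clip_l.items():
    target = PREFIX + key
    if target in model:
        model[target] = tensor.to(model[target].dtype)
        replaced += 1

print(f"replaced {replaced} tensors")
save_file(model, "my_sdxl_model_100k_clip_l.safetensors")

This works for CLIP-L because its keys map one-to-one under that prefix; CLIP-G needs the extra conversion discussed in the comments below.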


    Comments (15)

wd40th · Jun 1, 2025

Sorry, I must be doing something wrong, but it generates only noise for me... Do I need to select CLIP-L as the VAE for the Pony_100k_CLIP-G model, or should I download a CLIP-G VAE somewhere?

    Here is the error from console:

Loading Model: {'checkpoint_info': {'filename': '/webui_forge_cu121_torch231/webui/models/Stable-diffusion/xl/ponyCLIP100kFinetune_pony100kCLIPG.safetensors', 'hash': '8d98e610'}, 'additional_modules': ['/webui_forge_cu121_torch231/webui/models/text_encoder/clip_l.safetensors'], 'unet_storage_dtype': None}

Traceback (most recent call last):
  File "/webui_forge_cu121_torch231/webui/backend/loader.py", line 274, in forge_loader
    state_dicts, estimated_config = split_state_dict(sd, additional_state_dicts=additional_state_dicts)
  File "/webui_forge_cu121_torch231/webui/backend/loader.py", line 240, in split_state_dict
    guess = huggingface_guess.guess(sd)
  File "/webui_forge_cu121_torch231/webui/repositories/huggingface_guess/huggingface_guess/__init__.py", line 7, in guess
    result.unet_key_prefix = [unet_key_prefix]
AttributeError: 'NoneType' object has no attribute 'unet_key_prefix'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/webui_forge_cu121_torch231/webui/modules_forge/main_thread.py", line 30, in work
    self.result = self.func(*self.args, **self.kwargs)
  File "/webui_forge_cu121_torch231/webui/modules/txt2img.py", line 131, in txt2img_function
    processed = processing.process_images(p)
  File "/webui_forge_cu121_torch231/webui/modules/processing.py", line 836, in process_images
    manage_model_and_prompt_cache(p)
  File "/webui_forge_cu121_torch231/webui/modules/processing.py", line 804, in manage_model_and_prompt_cache
    p.sd_model, just_reloaded = forge_model_reload()
  File "/webui_forge_cu121_torch231/webui/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/webui_forge_cu121_torch231/webui/modules/sd_models.py", line 504, in forge_model_reload
    sd_model = forge_loader(state_dict, additional_state_dicts=additional_state_dicts)
  File "/webui_forge_cu121_torch231/webui/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/webui_forge_cu121_torch231/webui/backend/loader.py", line 276, in forge_loader
    raise ValueError('Failed to recognize model type!')
ValueError: Failed to recognize model type!

Failed to recognize model type!

akak7952 · Jun 1, 2025

Dual CLIP loader. Not as VAE.

n_Arno · Jun 4, 2025 (1 reaction)

Since I don't use ComfyUI, I gave the CLIP-L finetune a shot by manually replacing the keys in one of my models, with success, and it is very nice.

But when I tried to do the same with CLIP-G, I hit a wall, and I need to understand how to adapt the key names:

    In a SDXL model:

conditioner.embedders.1.model.transformer.resblocks.0.attn.in_proj_bias
conditioner.embedders.1.model.transformer.resblocks.0.attn.in_proj_weight
conditioner.embedders.1.model.transformer.resblocks.0.attn.out_proj.bias
conditioner.embedders.1.model.transformer.resblocks.0.attn.out_proj.weight
conditioner.embedders.1.model.transformer.resblocks.0.ln_1.bias
conditioner.embedders.1.model.transformer.resblocks.0.ln_1.weight
conditioner.embedders.1.model.transformer.resblocks.0.ln_2.bias
conditioner.embedders.1.model.transformer.resblocks.0.ln_2.weight
conditioner.embedders.1.model.transformer.resblocks.0.mlp.c_fc.bias
conditioner.embedders.1.model.transformer.resblocks.0.mlp.c_fc.weight
conditioner.embedders.1.model.transformer.resblocks.0.mlp.c_proj.bias
conditioner.embedders.1.model.transformer.resblocks.0.mlp.c_proj.weight

    In the CLIPG model:

text_model.encoder.layers.0.layer_norm1.bias
text_model.encoder.layers.0.layer_norm1.weight
text_model.encoder.layers.0.layer_norm2.bias
text_model.encoder.layers.0.layer_norm2.weight
text_model.encoder.layers.0.mlp.fc1.bias
text_model.encoder.layers.0.mlp.fc1.weight
text_model.encoder.layers.0.mlp.fc2.bias
text_model.encoder.layers.0.mlp.fc2.weight
text_model.encoder.layers.0.self_attn.k_proj.bias
text_model.encoder.layers.0.self_attn.k_proj.weight
text_model.encoder.layers.0.self_attn.out_proj.bias
text_model.encoder.layers.0.self_attn.out_proj.weight
text_model.encoder.layers.0.self_attn.q_proj.bias
text_model.encoder.layers.0.self_attn.q_proj.weight
text_model.encoder.layers.0.self_attn.v_proj.bias
text_model.encoder.layers.0.self_attn.v_proj.weight

In the attention head, I need to understand how to combine the query, key, and value projection weights into the "in" projection weights (and figure out which mlp.fc is which).

If you have any pointers it would be great; otherwise, I'll just check the CLIP code from SDXL a bit more.
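
For reference: in OpenCLIP's layout, attn.in_proj_weight and attn.in_proj_bias are the Q, K, and V projections concatenated in that order along dim 0, mlp.fc1/fc2 correspond to mlp.c_fc/mlp.c_proj, and layer_norm1/2 to ln_1/ln_2. A hypothetical sketch of the per-layer conversion; the file names are placeholders, the 32-layer depth is the standard CLIP-G text tower, and embeddings, ln_final, and text_projection are not covered here:

# Hypothetical sketch: convert per-layer CLIP-G tensors (HF/transformers
# layout) into SDXL's OpenCLIP layout under conditioner.embedders.1.
import torch
from safetensors.torch import load_file, save_file

clip_g = load_file("pony_100k_clip_g.safetensors")  # placeholder file name
sdxl = load_file("my_sdxl_model.safetensors")       # placeholder file name

SRC = "text_model.encoder.layers.{}.{}"
DST = "conditioner.embedders.1.model.transformer.resblocks.{}.{}"
RENAMES = {  # HF/transformers name -> OpenCLIP name, per layer
    "layer_norm1": "ln_1", "layer_norm2": "ln_2",
    "mlp.fc1": "mlp.c_fc", "mlp.fc2": "mlp.c_proj",
    "self_attn.out_proj": "attn.out_proj",
}

dtype = sdxl[DST.format(0, "attn.in_proj_weight")].dtype
for i in range(32):  # CLIP-G (ViT-bigG) text tower has 32 layers
    for kind in ("weight", "bias"):
        # Fuse the separate Q/K/V projections into OpenCLIP's in_proj tensor.
        sdxl[DST.format(i, f"attn.in_proj_{kind}")] = torch.cat(
            [clip_g[SRC.format(i, f"self_attn.{p}_proj.{kind}")] for p in "qkv"],
            dim=0,
        ).to(dtype)
        for src_name, dst_name in RENAMES.items():
            key = DST.format(i, f"{dst_name}.{kind}")
            sdxl[key] = clip_g[SRC.format(i, f"{src_name}.{kind}")].to(dtype)

save_file(sdxl, "my_sdxl_model_clip_g_swapped.safetensors")

Note that text_projection is typically stored transposed between the two layouts, which is one reason it is excluded from this sketch.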

n_Arno · Jun 4, 2025 (1 reaction)

Well, I managed to merge it into one of my own models, and even in FP16 it looks marvelous. Now, if I add the source of the CLIP in my model description, would I be allowed to release this version, or would you prefer I keep it to myself?

Felldude (Author) · Jun 4, 2025 (1 reaction)

@n_Arno All CLIP models are subject to the original license and required to be open source, so you're welcome to do what you wish with it without attribution. Thanks for asking, though.

androsynth7610 · Jun 4, 2025 (1 reaction)

Hmm, using the L model for FLUX improves the output notably; however, you have to use the right text encoder, like T5XXL. Looks better than the pro model, interesting.

Felldude (Author) · Jun 4, 2025

I have not tested 100k CLIP-L with T5xxl or FlanT5xxl. Was it an NSFW finetune that showed improvement, or base?

androsynth7610 · Jun 4, 2025 (2 reactions)

@Felldude Interesting question. Normally I use jibmix, but it holds true for dev as well; it seems to affect models in general. Aside from improving overall picture quality, it gets many more details right that are usually dropped in longer prompts.

Felldude (Author) · Jun 4, 2025 (1 reaction)

@androsynth7610 Very interesting findings. I would not have expected any improvement on FLUX, given it was PONY-based and then realigned to ViT CLIP. Thanks for posting.

androsynth7610 · Jun 4, 2025 (1 reaction)

@Felldude Fantastic work.

Felldude (Author) · Jun 20, 2025

I tested the CLIP-L in FLUX; it appears to be static with the 100k CLIP for PONY. It's possible that it did not load correctly for you.

androsynth7610 · Nov 22, 2025

No, it's not a loading mistake, but I guess the workflows are as different as they come.

generater1997 · Jun 5, 2025

I'm a bit confused, sorry. Is this supposed to be used as CLIP only? As in, I can use CyberRealism Pony for the model and this as the CLIP driving it?

Felldude (Author) · Jun 6, 2025 (2 reactions)

Yes, CLIP-G and CLIP-L are the two CLIPs in SDXL or Pony.

Type: Checkpoint
Base Model: Pony

Details

Downloads: 1,808
Platform: CivitAI
Platform Status: Available
Created: 6/1/2025
Updated: 5/12/2026
Deleted: -
