Long CLIP (Distilled)
Teacher/Student Distillation from 248/218 token length to projected 77
Pruned for use in SDXL, FLUX, SD 1.5, SD3, Hunyuan Video
DO NOT USE IN HiDream, Pony (in most cases), or Illustrious
Some of the top onsite models are built with FP32 Distilled CLIP/FP32 VAE
Forcing FP32 CLIP recommended for Comfy, Forge, Auto1111
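As a rough illustration of what forcing FP32 buys you, the sketch below rounds one value to half precision (FP16) and to single precision (FP32) and compares the rounding error. It uses only Python's standard `struct` module; the helper names and the sample value are made up for this example, and it says nothing about how Comfy, Forge, or Auto1111 handle precision internally.

```python
import struct

def to_fp16_and_back(x: float) -> float:
    """Round a Python float to IEEE 754 half precision, then widen it again."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_fp32_and_back(x: float) -> float:
    """Round a Python float to IEEE 754 single precision, then widen it again."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# Hypothetical sample value standing in for one text-encoder weight.
value = 0.1234567

err16 = abs(to_fp16_and_back(value) - value)
err32 = abs(to_fp32_and_back(value) - value)

print(f"fp16 rounding error: {err16:.3e}")
print(f"fp32 rounding error: {err32:.3e}")
```

The FP16 error is several orders of magnitude larger than the FP32 error for the same value, which is the kind of loss the FP32 recommendation avoids.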
HiDream CLIP has been trained on a distillation set, with the 248 and 218 token lengths reduced to 77 based on the pooled vision/text model output.
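A minimal sketch of teacher/student distillation on a pooled output, to make the idea concrete. The page only says the reduction is based on the pooled vision/text output; the cosine-distance loss, the function names, and the toy 4-dimensional vectors below are assumptions for illustration, not the training code used for this model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def distillation_loss(teacher_pooled, student_pooled):
    """Assumed loss: 0 when the student's pooled vector points the same way
    as the teacher's, growing as the two diverge."""
    return 1.0 - cosine_similarity(teacher_pooled, student_pooled)

# Toy stand-ins: the teacher would be the long-context (248-token) encoder,
# the student the 77-token encoder being trained to match it.
teacher = [0.2, -0.5, 0.8, 0.1]
student_close = [0.21, -0.48, 0.79, 0.12]
student_far = [-0.6, 0.3, 0.1, 0.9]

print(distillation_loss(teacher, student_close))
print(distillation_loss(teacher, student_far))
```

A student whose pooled vector stays close to the teacher's gets a loss near zero, so minimizing this kind of loss pushes the short-context encoder toward the long-context encoder's summary of the same prompt.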
Comments (7)
Because that's what heroes do...
Jokes aside, I'm loving what you're doing. Even though I don't understand the tech part, the results are amazing!
Thanks
Are you planning on making a version of this for Pony?
Once again, amazing work!
It is unlikely, as I would have to duplicate the Pony weights or project them to 248, and the character structure would likely be lost. It would be very helpful, though, as some artist name + character name combinations exceed the token limit just by virtue of how they get broken down into tokens.
It survived the projection and fine-tuning (FT).
Can you explain what it does in practical terms, and how it differs from a normal CLIP-L?
It should help with longer prompts; it acts as a token summarizer.
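To put the token limit from the replies above in concrete terms, here is a toy sketch of what a fixed context length does to a long prompt. It assumes plain truncation and reserves two slots for the start and end markers; the `fit_to_context` helper and the placeholder tokens are hypothetical, and real CLIP tokenization splits words into sub-word pieces, so a single artist or character name can cost several tokens.

```python
def fit_to_context(tokens, context_length):
    """Split a token list into the part that fits and the part that is cut.

    Two positions are reserved for the start and end markers, so a
    77-token context leaves 75 usable slots.
    """
    usable = context_length - 2
    return tokens[:usable], tokens[usable:]

# Placeholder tokens standing in for a 120-token prompt.
prompt_tokens = [f"tok{i}" for i in range(120)]

kept_77, dropped_77 = fit_to_context(prompt_tokens, 77)
kept_248, dropped_248 = fit_to_context(prompt_tokens, 248)

print(len(kept_77), len(dropped_77))    # 75 45
print(len(kept_248), len(dropped_248))  # 120 0
```

With a 77-token window the last 45 tokens of this prompt never reach the encoder, while a 248-token window sees all of it; the distilled model aims to carry that longer view into the 77-token slot.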







