Google FLAN
Tested working with COMFY and FORGE
Comments (9)
What kind of hardware is a user supposed to have for a 17GB CLIP? I get OOM even with the 9GB version in Comfy on 12GB of VRAM.
So for a 12GB video card, 12GB of system RAM will be allotted as "shared video memory". If you have 32GB of RAM, that leaves about 18GB for CPU offloading, which is a bit tight, so I would recommend setting a large amount of virtual memory.
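As a rough sanity check, the RAM math above can be written out. Note the 12GB "shared video memory" reservation and the ~2GB OS overhead are this comment's assumptions, not guaranteed OS behaviour:

```python
def cpu_offload_budget_gb(system_ram_gb: float, vram_gb: float,
                          os_overhead_gb: float = 2.0) -> float:
    """RAM left for CPU offloading, assuming Windows reserves shared
    video memory equal to the card's VRAM (per the comment above) plus
    some OS overhead."""
    return system_ram_gb - vram_gb - os_overhead_gb

# 32GB RAM, 12GB card: about 18GB left, tight for a 17GB text encoder.
print(cpu_offload_budget_gb(32, 12))  # → 18.0
```

With only ~1GB of headroom over the 17GB encoder, a large pagefile is the safety net the comment recommends.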
Also set the FP32 CLIP command-line flag so that Comfy does not even try to load it onto the GPU, and make sure neither highvram nor gpu-only is set.
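For reference, a launch line along those lines might look like this. Flag names can vary by ComfyUI version, so check `python main.py --help` before relying on them:

```shell
# Keep the text encoder in FP32 so Comfy runs it from CPU/system RAM
# rather than trying to fit it into VRAM:
python main.py --fp32-text-enc

# And, per the comment above, do NOT pass --highvram or --gpu-only,
# which would force models onto the GPU.
```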
There are several solutions:
1) Fully offload to the CPU and system RAM
2) Use virtual VRAM backed by system RAM
3) Go multi-GPU and distribute models/layers across GPUs
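Option 3 amounts to filling each GPU in turn and spilling the remainder to the CPU. A toy sketch of that greedy placement (the device names, sizes, and the helper itself are illustrative, not a real ComfyUI API):

```python
def place_layers(layer_sizes_gb, device_budgets_gb):
    """Return {layer_index: device_name}, filling each device in order
    and spilling to the next when a layer no longer fits; the last
    device (here "cpu") acts as the unlimited fallback."""
    placement = {}
    devices = list(device_budgets_gb.items())  # [(name, free_gb), ...]
    d = 0
    for i, size in enumerate(layer_sizes_gb):
        # Move to the next device while the current one can't fit this layer.
        while d < len(devices) - 1 and devices[d][1] < size:
            d += 1
        name, free = devices[d]
        placement[i] = name
        devices[d] = (name, free - size)
    return placement

# Five 4GB layer groups across two 9GB cards plus system RAM:
print(place_layers([4, 4, 4, 4, 4], {"cuda:0": 9, "cuda:1": 9, "cpu": 64}))
# → {0: 'cuda:0', 1: 'cuda:0', 2: 'cuda:1', 3: 'cuda:1', 4: 'cpu'}
```

Tools like the ComfyUI-MultiGPU nodes handle this placement for you; the point here is just that whatever doesn't fit on the GPUs ends up in system RAM.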
A 24GB VRAM GPU.
If it does not fit in VRAM, just offload it to system RAM or a second CUDA device's VRAM. To do this, install the ComfyUI-MultiGPU custom nodes from ComfyUI-Manager and replace your checkpoint, CLIP, and VAE loaders with the matching MultiGPU versions. It is quite straightforward to try. T5-XXL only runs at the beginning of generation, so even if it executes slowly from system RAM, the impact on overall run time is relatively small.
The lack of any appealing images is no deterrent for me; having seen that book cover, I am as certain as I am in God that this will be perfect.
Pruned as in "removed the encoder", or pruned as in "removed everything that isn't needed"?
Based on the size, it seems like both?
Comfy makes use of the remaining blocks, but in its current configuration it would not need the blocks that handle translation or logic.
