Flux.1-Dev Hyper NF4:
Source: https://huggingface.co/ZhenyaYang/flux_1_dev_hyper_8steps_nf4/tree/main (ZhenyaYang's Hyper-SD checkpoint converted to NF4, 8 steps)
Flux.1-Dev BNB NF4 (v1 & v2):
Source: https://huggingface.co/lllyasviel/flux1-dev-bnb-nf4/tree/main from lllyasviel
Flux.1-Schnell BNB NF4:
Source: https://huggingface.co/silveroxides/flux1-nf4-weights/tree/main from silveroxides
ComfyUI: https://github.com/comfyanonymous/ComfyUI_bitsandbytes_NF4
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/981
💪Train your own model: https://runpod.io?ref=gased9mt
🍺 Join my discord: https://discord.com/invite/pAz4Bt3rqb
Description
Main model in bnb-nf4
T5xxl in fp8e4m3fn
CLIP-L in fp16
VAE in bf16
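For anyone wondering what "bnb-nf4" actually does to the weights: below is a minimal, illustrative Python sketch of blockwise 4-bit quantization with an NF4-style codebook. It is not the real bitsandbytes implementation (which packs two 4-bit codes per byte and runs CUDA kernels); the 16 levels are the approximate NF4 quantiles from the QLoRA paper.

```python
# Illustrative sketch of NF4-style blockwise quantization, NOT the real
# bitsandbytes code. Approximate NF4 codebook: 16 levels placed at the
# quantiles of a standard normal distribution, normalized to [-1, 1].
NF4_LEVELS = [-1.0, -0.6962, -0.5251, -0.3949, -0.2844, -0.1848, -0.0911,
              0.0, 0.0796, 0.1609, 0.2461, 0.3379, 0.4407, 0.5626, 0.7230, 1.0]

def quantize_block(values):
    """Quantize one block of floats: scale by the block's absmax, then
    snap each value to the nearest codebook level (a 4-bit index)."""
    absmax = max(abs(v) for v in values) or 1.0
    indices = [min(range(16), key=lambda i: abs(v / absmax - NF4_LEVELS[i]))
               for v in values]
    return absmax, indices

def dequantize_block(absmax, indices):
    """Reconstruct approximate floats from the 4-bit indices and the scale."""
    return [NF4_LEVELS[i] * absmax for i in indices]

weights = [0.12, -0.50, 0.33, 0.01, -0.09, 0.75]
absmax, idx = quantize_block(weights)
restored = dequantize_block(absmax, idx)
# Each restored value is close to the original, at ~4 bits per weight
# (plus one absmax scale per block) instead of 16 or 32 bits.
```

This is why NF4 roughly quarters the main model's size compared to fp16, at some cost in precision.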
FAQ
Comments (103)
I'm quite happy this is here. Downloaded it already, and this just means more people will get to use the Dev edition, you know, the people with 8GB give or take a bit. I just hope you had permission... from Ill, no offense.
I don't, but I reached out. I'll have to take it down if he's not cool with it.
@RalFinger Cool. Just lookin out.
Yeah, it runs really well on my 8GB 3070 Ti. I'm getting sub-55-second outputs using NF4 through Forge. Also, that NF4 setting, if used with SDXL, speeds up outputs for both SDXL and Pony by about 25-30% as well. Good times.
ELI5 Comfy Tutorial?
e.g What folder does this model go into?
(Sorry but my small brain cannot keep up with the speed of AI image generation developments!)
Hey there! No problem, always feel free to ask! At this point the model only works with Forge. You can read the post here: https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/981#top
@RalFinger Got it! Thanks for replying I really appreciate it.
@apricotgames Update! Here is your comfy node: https://github.com/comfyanonymous/ComfyUI_bitsandbytes_NF4
@RalFinger Amazing! Thanks!
lllyasviel writes that RTX 20xx series GPUs might not benefit from NF4 but an NVIDIA post does seem to indicate it should. Should I download this for my RTX 2060 12GB?
Hey grannikji, I don't think you should. He wrote: "If your device is GPU with GTX 10XX/20XX then your device may not support NF4, please download the flux1-dev-fp8.safetensors."
Source: https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/981
I have a 2080 Ti (11GB vram) and 48GB of system ram, and my inference speed using the FP8 Dev/Schnell models went from about 21s/it at 1024x1024 res, down to about 2.5s/it using the NF4 Dev model, using the ComfyUI NF4 Checkpoint Loader node. So it definitely works for 20xx series cards that support NF4.
You can open a command prompt and type "nvidia-smi" to see your video card's basic info and supported CUDA version. If it's higher than 11.7, as stated in lllyasviel's GitHub post, your card should benefit from the NF4 models. My 2080 Ti supports CUDA 12.5, and I'd guess all other 20xx series cards should also support CUDA 12.5, but I haven't confirmed this myself.
But as always, you just have to test for yourself to see if your system benefits from it.
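One gotcha with the version check above: comparing "12.5" against "11.7" must be done numerically, not as strings, or minor versions with two digits (like 11.10) sort wrong. A small hypothetical helper, taking the "CUDA Version" value that nvidia-smi prints in its header:

```python
def cuda_at_least(reported: str, required: str = "11.7") -> bool:
    """Numerically compare dotted CUDA versions, e.g. '12.5' vs '11.7'.
    `reported` is the "CUDA Version" shown in the nvidia-smi header."""
    parse = lambda s: tuple(int(p) for p in s.split("."))
    return parse(reported) >= parse(required)

print(cuda_at_least("12.5"))    # a 2080 Ti reporting CUDA 12.5 -> True
print(cuda_at_least("11.4"))    # too old -> False
print(cuda_at_least("11.10"))   # a naive string compare would get this wrong -> True
```

The function name and the 11.7 default are assumptions for illustration; the threshold comes from the comment above.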
How do I install ComfyUI_bitsandbytes_NF4?
I have this error:
0.0 seconds (IMPORT FAILED): F:\ComfyUI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI_bitsandbytes_NF4
Hey there NoobFromEgypt, please post your question here: https://github.com/comfyanonymous/ComfyUI_bitsandbytes_NF4/issues
@RalFinger Thanks i will do that
I'm having the same issues I will have to watch the GitHub for a solution...
@moxie1776 if you found something please let me know
@raidmachine132017712 I did that when I installed the custom node.
@raidmachine132017712 By the way, they changed the requirement to bitsandbytes>=0.43.0.
@moxie1776 https://www.youtube.com/watch?v=NkKyu181d_U
It doesn't make sense, but I did the same thing with cmd instead of PowerShell and it somehow fixed it.
lllyasviel is a genius once again. Can't believe how well it works and how little VRAM it uses. And I haven't even tried Forge yet.
Will NF4 just run on the CPU? And what is BNB?
BNB is an acronym for bitsandbytes.
It's a little more precise than normal NF4.
Getting an error with the 5GB Schnell version:
AttributeError: 'NoneType' object has no attribute 'unet_key_prefix'
*** Error completing request
Unsure what this is about. I also notice you're still using 20 steps on that model; wouldn't the point of making Schnell NF4 be to make it even faster?
Not sure about the error, but you are right about the steps. I was just generating images while uploading and didn't notice that error, thanks for pointing it out!
@RalFinger I tried the 5GB model; I'll try the other Schnell one, though I'm unsure what the difference is supposed to be. If you got them working, I presume they do work and I can probably solve the issue somehow. Did you disable xformers or anything? I've read that can solve some errors currently.
I'm curious about your speeds with 4 steps.
How do I run the 6GB version? Comfy and Forge don't support it:
"ERROR: Could not detect model type of"
"TypeError: 'NoneType' object is not iterable"
Hey SuperSmuser, did you update ComfyUI? I am running forge, so maybe this helps: https://github.com/comfyanonymous/ComfyUI_bitsandbytes_NF4/issues/2
@RalFinger Have you run the 6GB version, yes or no? If not, then it is most likely just not supported yet, because you can't use a CLIP separately in Forge, and the Comfy node is just a copy of the Forge implementation for now. I have the 11GB versions working in both Forge and Comfy, but I need a version without CLIPs because I only have 6GB of VRAM.
Apparently it's an xformers issue with Forge in the latest update. The suggestion given by some people is to disable xformers or uninstall it.
You can disable xformers by adding the flag to the args line in the webui-user.bat file: set COMMANDLINE_ARGS=--disable-xformers
@RalFinger Seems to be an xformers compatibility issue with Forge.
Comfy supports it but you need to manually upgrade bitsandbytes to latest version.
@Birnir In what node? CheckpointLoaderNF4? It just gives an error message when loading the 6GB version. Of course it's on the latest version, I just updated and got the same error. The 6GB version seems to be without any CLIP, and it still gives me an error when I load the CLIP from DualClipLoader.
@Birnir Is that 6GB file the UNet only, or everything with the CLIPs?
FYI to others, I couldn't get the 5gb schnell to work for whatever reason.
Then with the 10GB model, I was getting an error where Forge was looking for the backend files it installs for the Dev version, but for Schnell.
Follow the path it gives you:
Stable Diffusion WebUI Forge\backend\huggingface\black-forest-labs/FLUX.1-schnell
Go up one directory and copy the contents of the Flux Dev folder into a new folder named FLUX.1-schnell as above; then it should run.
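The manual copy above can also be scripted. Here is a hedged sketch using Python's shutil; the folder layout follows the error path quoted above, but `mirror_dev_backend` and the install path are illustrative, so adjust them to your own setup:

```python
import shutil
from pathlib import Path

def mirror_dev_backend(forge_root: str) -> Path:
    """Copy Forge's downloaded FLUX.1-dev backend folder to the
    FLUX.1-schnell name that the Schnell checkpoint looks for.
    (Hypothetical helper; folder names follow the error path above.)"""
    hf_dir = Path(forge_root) / "backend" / "huggingface" / "black-forest-labs"
    src = hf_dir / "FLUX.1-dev"       # backend files Forge fetched for Dev
    dst = hf_dir / "FLUX.1-schnell"   # folder the error message points at
    if not dst.exists():
        shutil.copytree(src, dst)     # duplicate the Dev files under the Schnell name
    return dst

# Example (adjust to your own install location):
# mirror_dev_backend(r"C:\Stable Diffusion WebUI Forge")
```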
On a laptop 4060 8GB, I'm getting 25 seconds or so which is fucking insane...
Thanks OP and the original DEV of the Nf4 implementation (he has a weird username and i'll never remember it heh)
May i ask what BNB means?
What is the difference between them?
BNB = "bitsandbytes". There was an update from lllyasviel earlier on SD Forge about bitsandbytes and NF4/FP4:
"(BitsandBytes is a standard low-bit accelerator that are already used by most Large Language Models like LLama, Phi, etc)"
https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/981
As for the specifics of it all, I don't fully understand myself
@ChikkiBotta Ah thx m8. I was a little bit lost there. xD
This model is insanely fast with really good quality. In the images section you can see some of my creations with old DALL-E 3 prompts I have saved.
For me, the speed is the same as fp8 in ComfyUI. Is it supposed to be faster?
It's a pity the 5.5GB version won't load in ComfyUI :'(
I'm downloading the 10.7GB version now, will report back on whether it's any faster on my subpar hardware :P
Update: the 10.7GB version ran about 4 times slower than the standard FP8 Schnell model with CLIPs and VAE loaded separately.
@BilboTaggins Is it even slower than the other all-in-one FP8 Flux Schnell models?
Where did you find the 5.5gb version, I don't see it on CivitAI?
Hey guys, I had to remove the file since it was UNet-only and couldn't be used yet, sorry for the misunderstanding!
Something I don't understand: what happened to CLIP-L and the T5 encoder? Are they merged into these files? Wouldn't that reduce the quality significantly? The T5 fp16 alone was larger than these files.
Yeah, I think that it is using T5xxl in fp8e4m3fn instead of t5xxl fp16. Not sure how much of an effect that has on quality or prompt adherence.
I just added more information to the model:
Main model in bnb-nf4
T5xxl in fp8e4m3fn
CLIP-L in fp16
VAE in bf16
Yes, NF4 quality is worse. And on top of the model being 4-bit, the T5xxl is 8-bit, which degrades quality even more.
TypeError: 'NoneType' object is not iterable
@Retyz I need Flux Schnell for lower step counts.
try --disable-xformers
I fixed this in Forge with:
--disable-xformers --attention-pytorch
Error occurred when executing CheckpointLoaderNF4: load_checkpoint_guess_config() got an unexpected keyword argument 'model_options'
File "C:\ComfyUI_windows_portable\ComfyUI\execution.py", line 152, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
File "C:\ComfyUI_windows_portable\ComfyUI\execution.py", line 82, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
File "C:\ComfyUI_windows_portable\ComfyUI\execution.py", line 75, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
File "C:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI_bitsandbytes_NF4\__init__.py", line 178, in load_checkpoint
    out = comfy.sd.load_checkpoint_guess_config(ckpt_path, output_vae=True, output_clip=True, embedding_directory=folder_paths.get_folder_paths("embeddings"), model_options={"custom_operations": OPS})
Update ComfyUI to fix this.
Any benefit to using this 5GB NF4 besides it using less memory? Any speed-up in steps?
I had to remove the file since it was UNet-only and couldn't be used yet, sorry for the misunderstanding!
@RalFinger I actually need the UNet-only version, lol. I'm searching for a Schnell merge that is capable of just 4 steps and is UNet-only, in FP8 or NF4.
3080 10GB: the first Schnell run takes 25s; from the 2nd run on it shows lowvram mode 1735 and takes 60s. I'm using ComfyUI.
I had the same issue with my 2080 Ti 11GB card. The way I fixed the increasing iteration time was by adding one of the following command line arguments to my ComfyUI desktop shortcut. (Note: only use one of these at a time; they don't work together. I've also ordered them from top to bottom in the order that worked best for me, but you just have to try each one to see which works best for your system):
--use-quad-cross-attention
--use-split-cross-attention
--use-pytorch-cross-attention
Good luck and hope this helps.
You can try Forge UI, another UI based on AUTOMATIC1111. I can use this 10.7GB Dev model with my RTX 3050 8GB. Approx. 1 minute 40 seconds to create an image with the recommended settings, and no problems with RAM.
@Lolailo Forge is working fine; I think my ComfyUI is a mess.
Is this RalFinger model, or just repost of models from lllyasviel?
File "E:\webui_forge_cu121_torch21\webui\modules\call_queue.py", line 74, in f
res = list(func(*args, **kwargs))
TypeError: 'NoneType' object is not iterable
I have the same error and saw this discussion on editing model.py. I had no luck after editing, but a couple of people said it worked for them. Hope it helps. https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/981#discussioncomment-10321551
The author of this model points to this same discussion on the Forge hub page as well.
lllyasviel's model on Hugging Face is listed as 11.5GB, while your model is 10.7GB. Can you explain why that is?
i was gonna ask this too.
If you download the model, it has the same size. I guess it's Civitai calculating bits and bytes differently.
@RalFinger Can confirm this. Actually it's Hugging Face that lists the size differently; it's 10.7GB for both.
Hugging Face uses the SI system (1 GB = 10⁹ bytes) instead of the usual binary units (1 GiB = 2³⁰ bytes).
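The two listings describe the same file in different units, which is easy to check with plain arithmetic:

```python
# Hugging Face shows decimal gigabytes (10^9 bytes), Civitai shows
# binary gibibytes (2^30 bytes); the file itself is the same size.
size_bytes = 11.5e9                  # "11.5 GB" as listed on Hugging Face
size_gib = size_bytes / 2**30        # the same number of bytes in binary units
print(f"{size_gib:.1f}")             # -> 10.7, the size Civitai reports
```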
NF4 + t5xxl_fp8 will not give you any speed boost compared to FP8 + t5xxl_fp16 if your video card has 16+ GB of VRAM. Tested in ComfyUI.
Update: if you don't have enough RAM, or the checkpoint is on an HDD, you will get a speed boost, but that has nothing to do with the performance of your GPU!
Update: I tested pytorch 2.4.0+cu121 and 2.4.0+cu124; results below. Hardware: RTX 4070 Ti Super (16GB), i7-12700, 64GB RAM CL16 3600 dual-rank, NVMe 7000MB/s.
cu124, fp8 and fp16 t5xxl.
6 seconds first launch.
26 sec - 1024x1024, 20 steps, euler simple.
cu124, nf4 and fp8 t5xxl.
3 seconds first launch.
26 sec - 1024x1024, 20 steps, euler simple.
cu121, fp8 and fp16 t5xxl.
27 sec first launch.
28 sec - 1024x1024, 20 steps, euler simple.
cu121, nf4 and fp8 t5xxl.
9 sec first launch.
28 sec - 1024x1024, 20 steps, euler simple.
Please help, guys, I can't understand this error:
"Error occurred when executing UNETLoader: Error(s) in loading state_dict for Flux:
size mismatch for img_in.weight: copying a param with shape torch.Size([98304, 1]) from checkpoint, the shape in current model is torch.Size([3072, 64]).
size mismatch for time_in.in_layer.weight: copying a param with shape torch.Size([393216, 1]) from checkpoint, the shape in current model is torch.Size([3072, 256]).
size mismatch for time_in.out_layer.weight: copying a param with shape torch.Size([4718592, 1]) from checkpoint, the shape in current model is torch.Size([3072, 3072]).
size mismatch for vector_in.in_layer.weight: copying a param with shape torch.Size([1179648, 1]) from checkpoint, the shape in current model is torch.Size([3072, 768]).
size mismatch for guidance_in.in_layer.weight: copying a param with shape torch.Size([393216, 1]) from checkpoint, the shape in current model is torch.Size([3072, 256]).
size mismatch for txt_in.weight: copying a param with shape torch.Size([6291456, 1]) from checkpoint, the shape in current model is torch.Size([3072, 4096]).
[... the same "size mismatch" line repeats for every remaining weight, including all double_blocks.* tensors: each checkpoint tensor has a flattened [N, 1] shape where the model expects a 2-D matrix ...]"
size mismatch for double_blocks.12.img_mlp.2.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([3072, 12288]). size mismatch for double_blocks.12.txt_mod.lin.weight: copying a param with shape torch.Size([28311552, 1]) from checkpoint, the shape in current model is torch.Size([18432, 3072]). size mismatch for double_blocks.12.txt_attn.qkv.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for double_blocks.12.txt_attn.proj.weight: copying a param with shape torch.Size([4718592, 1]) from checkpoint, the shape in current model is torch.Size([3072, 3072]). size mismatch for double_blocks.12.txt_mlp.0.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([12288, 3072]). size mismatch for double_blocks.12.txt_mlp.2.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([3072, 12288]). size mismatch for double_blocks.13.img_mod.lin.weight: copying a param with shape torch.Size([28311552, 1]) from checkpoint, the shape in current model is torch.Size([18432, 3072]). size mismatch for double_blocks.13.img_attn.qkv.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for double_blocks.13.img_attn.proj.weight: copying a param with shape torch.Size([4718592, 1]) from checkpoint, the shape in current model is torch.Size([3072, 3072]). size mismatch for double_blocks.13.img_mlp.0.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([12288, 3072]). size mismatch for double_blocks.13.img_mlp.2.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([3072, 12288]). 
size mismatch for double_blocks.13.txt_mod.lin.weight: copying a param with shape torch.Size([28311552, 1]) from checkpoint, the shape in current model is torch.Size([18432, 3072]). size mismatch for double_blocks.13.txt_attn.qkv.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for double_blocks.13.txt_attn.proj.weight: copying a param with shape torch.Size([4718592, 1]) from checkpoint, the shape in current model is torch.Size([3072, 3072]). size mismatch for double_blocks.13.txt_mlp.0.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([12288, 3072]). size mismatch for double_blocks.13.txt_mlp.2.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([3072, 12288]). size mismatch for double_blocks.14.img_mod.lin.weight: copying a param with shape torch.Size([28311552, 1]) from checkpoint, the shape in current model is torch.Size([18432, 3072]). size mismatch for double_blocks.14.img_attn.qkv.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for double_blocks.14.img_attn.proj.weight: copying a param with shape torch.Size([4718592, 1]) from checkpoint, the shape in current model is torch.Size([3072, 3072]). size mismatch for double_blocks.14.img_mlp.0.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([12288, 3072]). size mismatch for double_blocks.14.img_mlp.2.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([3072, 12288]). size mismatch for double_blocks.14.txt_mod.lin.weight: copying a param with shape torch.Size([28311552, 1]) from checkpoint, the shape in current model is torch.Size([18432, 3072]). 
size mismatch for double_blocks.14.txt_attn.qkv.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for double_blocks.14.txt_attn.proj.weight: copying a param with shape torch.Size([4718592, 1]) from checkpoint, the shape in current model is torch.Size([3072, 3072]). size mismatch for double_blocks.14.txt_mlp.0.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([12288, 3072]). size mismatch for double_blocks.14.txt_mlp.2.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([3072, 12288]). size mismatch for double_blocks.15.img_mod.lin.weight: copying a param with shape torch.Size([28311552, 1]) from checkpoint, the shape in current model is torch.Size([18432, 3072]). size mismatch for double_blocks.15.img_attn.qkv.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for double_blocks.15.img_attn.proj.weight: copying a param with shape torch.Size([4718592, 1]) from checkpoint, the shape in current model is torch.Size([3072, 3072]). size mismatch for double_blocks.15.img_mlp.0.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([12288, 3072]). size mismatch for double_blocks.15.img_mlp.2.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([3072, 12288]). size mismatch for double_blocks.15.txt_mod.lin.weight: copying a param with shape torch.Size([28311552, 1]) from checkpoint, the shape in current model is torch.Size([18432, 3072]). size mismatch for double_blocks.15.txt_attn.qkv.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). 
size mismatch for double_blocks.15.txt_attn.proj.weight: copying a param with shape torch.Size([4718592, 1]) from checkpoint, the shape in current model is torch.Size([3072, 3072]). size mismatch for double_blocks.15.txt_mlp.0.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([12288, 3072]). size mismatch for double_blocks.15.txt_mlp.2.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([3072, 12288]). size mismatch for double_blocks.16.img_mod.lin.weight: copying a param with shape torch.Size([28311552, 1]) from checkpoint, the shape in current model is torch.Size([18432, 3072]). size mismatch for double_blocks.16.img_attn.qkv.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for double_blocks.16.img_attn.proj.weight: copying a param with shape torch.Size([4718592, 1]) from checkpoint, the shape in current model is torch.Size([3072, 3072]). size mismatch for double_blocks.16.img_mlp.0.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([12288, 3072]). size mismatch for double_blocks.16.img_mlp.2.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([3072, 12288]). size mismatch for double_blocks.16.txt_mod.lin.weight: copying a param with shape torch.Size([28311552, 1]) from checkpoint, the shape in current model is torch.Size([18432, 3072]). size mismatch for double_blocks.16.txt_attn.qkv.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for double_blocks.16.txt_attn.proj.weight: copying a param with shape torch.Size([4718592, 1]) from checkpoint, the shape in current model is torch.Size([3072, 3072]). 
size mismatch for double_blocks.16.txt_mlp.0.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([12288, 3072]). size mismatch for double_blocks.16.txt_mlp.2.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([3072, 12288]). size mismatch for double_blocks.17.img_mod.lin.weight: copying a param with shape torch.Size([28311552, 1]) from checkpoint, the shape in current model is torch.Size([18432, 3072]). size mismatch for double_blocks.17.img_attn.qkv.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for double_blocks.17.img_attn.proj.weight: copying a param with shape torch.Size([4718592, 1]) from checkpoint, the shape in current model is torch.Size([3072, 3072]). size mismatch for double_blocks.17.img_mlp.0.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([12288, 3072]). size mismatch for double_blocks.17.img_mlp.2.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([3072, 12288]). size mismatch for double_blocks.17.txt_mod.lin.weight: copying a param with shape torch.Size([28311552, 1]) from checkpoint, the shape in current model is torch.Size([18432, 3072]). size mismatch for double_blocks.17.txt_attn.qkv.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for double_blocks.17.txt_attn.proj.weight: copying a param with shape torch.Size([4718592, 1]) from checkpoint, the shape in current model is torch.Size([3072, 3072]). size mismatch for double_blocks.17.txt_mlp.0.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([12288, 3072]). 
size mismatch for double_blocks.17.txt_mlp.2.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([3072, 12288]). size mismatch for double_blocks.18.img_mod.lin.weight: copying a param with shape torch.Size([28311552, 1]) from checkpoint, the shape in current model is torch.Size([18432, 3072]). size mismatch for double_blocks.18.img_attn.qkv.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for double_blocks.18.img_attn.proj.weight: copying a param with shape torch.Size([4718592, 1]) from checkpoint, the shape in current model is torch.Size([3072, 3072]). size mismatch for double_blocks.18.img_mlp.0.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([12288, 3072]). size mismatch for double_blocks.18.img_mlp.2.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([3072, 12288]). size mismatch for double_blocks.18.txt_mod.lin.weight: copying a param with shape torch.Size([28311552, 1]) from checkpoint, the shape in current model is torch.Size([18432, 3072]). size mismatch for double_blocks.18.txt_attn.qkv.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for double_blocks.18.txt_attn.proj.weight: copying a param with shape torch.Size([4718592, 1]) from checkpoint, the shape in current model is torch.Size([3072, 3072]). size mismatch for double_blocks.18.txt_mlp.0.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([12288, 3072]). size mismatch for double_blocks.18.txt_mlp.2.weight: copying a param with shape torch.Size([18874368, 1]) from checkpoint, the shape in current model is torch.Size([3072, 12288]). 
size mismatch for single_blocks.0.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). size mismatch for single_blocks.0.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). size mismatch for single_blocks.0.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for single_blocks.1.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). size mismatch for single_blocks.1.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). size mismatch for single_blocks.1.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for single_blocks.2.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). size mismatch for single_blocks.2.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). size mismatch for single_blocks.2.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for single_blocks.3.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). size mismatch for single_blocks.3.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). 
size mismatch for single_blocks.3.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for single_blocks.4.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). size mismatch for single_blocks.4.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). size mismatch for single_blocks.4.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for single_blocks.5.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). size mismatch for single_blocks.5.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). size mismatch for single_blocks.5.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for single_blocks.6.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). size mismatch for single_blocks.6.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). size mismatch for single_blocks.6.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for single_blocks.7.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). 
size mismatch for single_blocks.7.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). size mismatch for single_blocks.7.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for single_blocks.8.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). size mismatch for single_blocks.8.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). size mismatch for single_blocks.8.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for single_blocks.9.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). size mismatch for single_blocks.9.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). size mismatch for single_blocks.9.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for single_blocks.10.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). size mismatch for single_blocks.10.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). size mismatch for single_blocks.10.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). 
size mismatch for single_blocks.11.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). size mismatch for single_blocks.11.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). size mismatch for single_blocks.11.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for single_blocks.12.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). size mismatch for single_blocks.12.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). size mismatch for single_blocks.12.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for single_blocks.13.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). size mismatch for single_blocks.13.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). size mismatch for single_blocks.13.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for single_blocks.14.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). size mismatch for single_blocks.14.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). 
size mismatch for single_blocks.14.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for single_blocks.15.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). size mismatch for single_blocks.15.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). size mismatch for single_blocks.15.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for single_blocks.16.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). size mismatch for single_blocks.16.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). size mismatch for single_blocks.16.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for single_blocks.17.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). size mismatch for single_blocks.17.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). size mismatch for single_blocks.17.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for single_blocks.18.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). 
size mismatch for single_blocks.18.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). size mismatch for single_blocks.18.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for single_blocks.19.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). size mismatch for single_blocks.19.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). size mismatch for single_blocks.19.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for single_blocks.20.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). size mismatch for single_blocks.20.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). size mismatch for single_blocks.20.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for single_blocks.21.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). size mismatch for single_blocks.21.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). size mismatch for single_blocks.21.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). 
size mismatch for single_blocks.22.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). size mismatch for single_blocks.22.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). size mismatch for single_blocks.22.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for single_blocks.23.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). size mismatch for single_blocks.23.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). size mismatch for single_blocks.23.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for single_blocks.24.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). size mismatch for single_blocks.24.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). size mismatch for single_blocks.24.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]). size mismatch for single_blocks.25.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]). size mismatch for single_blocks.25.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]). 
size mismatch for single_blocks.25.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]).
size mismatch for single_blocks.26.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]).
size mismatch for single_blocks.26.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]).
size mismatch for single_blocks.26.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]).
size mismatch for single_blocks.27.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]).
size mismatch for single_blocks.27.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]).
size mismatch for single_blocks.27.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]).
size mismatch for single_blocks.28.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]).
size mismatch for single_blocks.28.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]).
size mismatch for single_blocks.28.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]).
size mismatch for single_blocks.29.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]).
size mismatch for single_blocks.29.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]).
size mismatch for single_blocks.29.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]).
size mismatch for single_blocks.30.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]).
size mismatch for single_blocks.30.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]).
size mismatch for single_blocks.30.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]).
size mismatch for single_blocks.31.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]).
size mismatch for single_blocks.31.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]).
size mismatch for single_blocks.31.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]).
size mismatch for single_blocks.32.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]).
size mismatch for single_blocks.32.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]).
size mismatch for single_blocks.32.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]).
size mismatch for single_blocks.33.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]).
size mismatch for single_blocks.33.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]).
size mismatch for single_blocks.33.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]).
size mismatch for single_blocks.34.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]).
size mismatch for single_blocks.34.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]).
size mismatch for single_blocks.34.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]).
size mismatch for single_blocks.35.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]).
size mismatch for single_blocks.35.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]).
size mismatch for single_blocks.35.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]).
size mismatch for single_blocks.36.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]).
size mismatch for single_blocks.36.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]).
size mismatch for single_blocks.36.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]).
size mismatch for single_blocks.37.linear1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([21504, 3072]).
size mismatch for single_blocks.37.linear2.weight: copying a param with shape torch.Size([23592960, 1]) from checkpoint, the shape in current model is torch.Size([3072, 15360]).
size mismatch for single_blocks.37.modulation.lin.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model is torch.Size([9216, 3072]).
size mismatch for final_layer.linear.weight: copying a param with shape torch.Size([98304, 1]) from checkpoint, the shape in current model is torch.Size([64, 3072]).
size mismatch for final_layer.adaLN_modulation.1.weight: copying a param with shape torch.Size([9437184, 1]) from checkpoint, the shape in current model is torch.Size([6144, 3072]).
File "E:\cumfy ui\ComfyUI_windows_portable\ComfyUI\execution.py", line 152, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
File "E:\cumfy ui\ComfyUI_windows_portable\ComfyUI\execution.py", line 82, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
File "E:\cumfy ui\ComfyUI_windows_portable\ComfyUI\execution.py", line 75, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
File "E:\cumfy ui\ComfyUI_windows_portable\ComfyUI\nodes.py", line 836, in load_unet
    model = comfy.sd.load_unet(unet_path, dtype=dtype)
File "E:\cumfy ui\ComfyUI_windows_portable\ComfyUI\comfy\sd.py", line 645, in load_unet
    model = load_unet_state_dict(sd, dtype=dtype)
File "E:\cumfy ui\ComfyUI_windows_portable\ComfyUI\comfy\sd.py", line 637, in load_unet_state_dict
    model.load_model_weights(new_sd, "")
File "E:\cumfy ui\ComfyUI_windows_portable\ComfyUI\comfy\model_base.py", line 225, in load_model_weights
    m, u = self.diffusion_model.load_state_dict(to_load, strict=False)
File "E:\cumfy ui\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 2189, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
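The shape mismatches in the log above are actually diagnostic: bitsandbytes NF4 packs two 4-bit values into each byte, so a quantized weight is stored as a flat [N, 1] uint8 column with exactly half as many elements as the original matrix. A minimal arithmetic check in plain Python (shapes copied from the log; the helper name is my own, not part of any loader) confirms this, which is why the stock UNet loader rejects the file:

```python
# Shapes from the error log: (checkpoint shape, expected model shape).
mismatches = {
    "single_blocks.25.modulation.lin.weight": ((14155776, 1), (9216, 3072)),
    "single_blocks.26.linear1.weight": ((33030144, 1), (21504, 3072)),
    "single_blocks.26.linear2.weight": ((23592960, 1), (3072, 15360)),
    "final_layer.linear.weight": ((98304, 1), (64, 3072)),
}

def looks_nf4(ckpt_shape, model_shape):
    """NF4 stores two 4-bit weights per uint8 byte, so the quantized
    tensor is a flat [N, 1] column with N = rows * cols / 2."""
    rows, cols = model_shape
    return ckpt_shape == (rows * cols // 2, 1)

for name, (ckpt, model) in mismatches.items():
    print(name, looks_nf4(ckpt, model))  # every entry prints True
```

Every entry matching the half-size pattern means the file is an NF4-packed checkpoint, so it needs an NF4-aware loader (the ComfyUI_bitsandbytes_NF4 node or Forge) rather than the plain "Load Diffusion Model"/UNet path.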
Not compatible with Comfy so far. You have to use https://github.com/lllyasviel/stable-diffusion-webui-forge instead (until an update is available for Comfy).
@pampelmann it is compatible with Comfy https://github.com/comfyanonymous/ComfyUI_bitsandbytes_NF4
Put the checkpoint in the \ComfyUI\models\checkpoints folder, not \unet.
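If you are unsure whether a downloaded file is one of the NF4 builds, you can list its tensor names without loading any weights: the safetensors format is just an 8-byte little-endian length prefix followed by a JSON header. The header-parsing below follows that published format; the specific key suffixes I test for (`.absmax`, `quant_state`) are an assumption based on how bitsandbytes serializes quantized layers, so treat this as a sketch rather than an official check:

```python
import json
import struct

def safetensors_keys(path):
    """Read only the JSON header of a .safetensors file
    (8-byte little-endian length prefix + JSON), skipping tensor data."""
    with open(path, "rb") as f:
        (hdr_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(hdr_len))
    return [k for k in header if k != "__metadata__"]

def probably_bnb_nf4(path):
    """Heuristic: bitsandbytes-quantized checkpoints carry extra
    per-tensor state keys alongside the packed weights (assumed names)."""
    return any("quant_state" in k or k.endswith(".absmax")
               for k in safetensors_keys(path))
```

A file that `probably_bnb_nf4` flags should go through the NF4 checkpoint loader; feeding it to the regular UNet loader produces the size-mismatch wall shown earlier in the comments.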
In this comment I say thank you to lllyasviel; feel free to join in the comments :)
Very fast in Forge
Once you understand how to install everything NF4 is so amazing. Quality and speed at the same time, love it! 💕
Has anyone managed to run a LoRA with this version yet?
Yes, one old LoRA that seems to work from SDXL to SD3 to Flux, but it doesn't do much and you have to set it really high. So it's kinda useless; I'm afraid we need separate NF4 LoRAs again.
Pick a LoRA, set its weight to something like 10 and see what happens: it doesn't even do what the LoRA is supposed to do, it's just like changing the seed. My woman in a dress suddenly looked the other way.
@JayNL it gives me an error, I'm afraid.
@JayNL oh hold up, so LoRAs need to be made specifically for NF4 models?
@beepbopbip yes, it gives me an error with this version, and it doesn't seem to work with the regular version either. I heard it needs a special workflow to work in ComfyUI.
I hope someone finds a way to make LoRAs compatible, otherwise we're screwed.
Is there any chance of this working on a 6GB of VRAM?
I've had no problem so far running the plain flux-schnell model, even without bitsandbytes, on 6GB. It takes about 32GB of normal RAM as well, though.
@Ferarn I only have 16GB of normal RAM, I guess my potato PC can't run it.
@Luxerion if you have 4GB of VRAM and lots of virtual memory, you can.
@Luxerion Try it, it may take longer though. Start with small latent images and grow them until you or the PC can't stand it anymore :D
Maybe it can, but what's the point?
I mean, I'm usually the first to try new stuff, but one must be realistic.
I don't have a super-duper card myself; I tried, played a little, and for me it's worth waiting instead.
Once curiosity is satisfied, all that's left is the knowledge that Flux can technically run on low-end cards, while only the highest tier of cards can really run it, if you know what I mean. :)
Which VAE?
@RalFinger, plan on updating to this version... https://huggingface.co/lllyasviel/flux1-dev-bnb-nf4/blob/main/flux1-dev-bnb-nf4-v2.safetensors ?
It's sooooooooooo much faster, if you'd believe that. I know it's not in Comfy yet, but maybe you could make it so. Works in Forge, of course.
Thank you for the info, I am on it 😊
I made some comparisons between different sampling steps
Details
Files
flux1DevHyperNF4Flux1DevBNB_flux1DevBNBNF4V1.safetensors
Mirrors
flux1DevHyperNF4Flux1DevBNB_flux1DevBNBNF4V1.safetensors
nf4Flux1_nf4Bnb.safetensors
pixelforge.safetensors
flux1-dev-bnb-nf4.safetensors
flux1DevSchnellBNB_flux1DevBNBNF4.safetensors
