https://github.com/Dao-AILab/flash-attention
Flash Attention 2 (version 2.8.3), precompiled for the latest ComfyUI 3.50 on Windows.
An attention implementation that some models require.
I couldn't find a wheel for Windows with Python 3.13, PyTorch 2.8.0, and CUDA 12.9, which is what ComfyUI is using, so I compiled one myself.
It is a long, annoying process that requires a beefy PC, so I saved you some time.
When ComfyUI changes and I need an updated build, I'll post it here again if enough people find it useful.
Description
2.8.3, CUDA 12.9, Python 3.13, PyTorch 2.8.0, Windows
Comments (6)
Install:
ComfyUI_windows_portable\python_embeded\python.exe -m pip install [DirToTheWheelFIle]flash_attn-2.8.3+cu129torch2.8.0cxx11abiTRUE-cp313-cp313-win_amd64.whl
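After installing, a quick way to confirm the wheel actually loads is a small smoke test. The sketch below is only an illustration, not part of the wheel: the function name check_flash_attn and the tensor sizes are made up for the example, and it assumes a CUDA GPU that Flash Attention 2 supports. Save it to a file and run it with the same python_embeded\python.exe used for the install.

```python
import importlib.util


def check_flash_attn():
    """Return a short status string describing whether flash_attn is usable."""
    # Look for the package without importing it, so a missing install is reported cleanly.
    if importlib.util.find_spec("flash_attn") is None:
        return "flash_attn is not installed in this Python"
    try:
        import torch
        from flash_attn import flash_attn_func
    except Exception as exc:
        # A wheel built for a different Python/PyTorch/CUDA combination usually fails here.
        return f"flash_attn import failed: {exc}"
    if not torch.cuda.is_available():
        return "flash_attn imported, but no CUDA device is visible"
    try:
        # Illustrative sizes: (batch, seqlen, heads, head_dim), fp16 on the GPU.
        q = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
        out = flash_attn_func(q, q, q, causal=False)
    except Exception as exc:
        return f"flash_attn imported, but the kernel call failed: {exc}"
    return f"flash_attn OK, output shape {tuple(out.shape)}"


print(check_flash_attn())
```

If the import step fails, the wheel most likely does not match the Python, PyTorch, or CUDA build that ComfyUI is running.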
I am running PyTorch 2.9.1+cu128 with Python 3.12.0. Do you think your wheel will work for my setup, or will I have to compile my own wheel as well? Also, if I need to compile, do you have a step-by-step guide on how to compile my own wheel? TIA.
@Noob_ee FA2 is EXTREMELY finicky about the platform (Python, PyTorch, and CUDA versions), so I don't think it will work. You can always try, but it will likely fail.
As for compiling it, it is a bit complex but nothing crazy. It just takes a lot of time.
I found instructions here:
https://huggingface.co/lldacing/flash-attention-windows-wheel
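To see why the setup asked about above (PyTorch 2.9.1+cu128, Python 3.12) is unlikely to work with this wheel, you can read the tags straight out of the wheel filename and compare them with your environment. This is a rough sketch only: the helper names parse_wheel_tags and find_mismatches are invented for the example, the filename pattern is inferred from this one wheel, and treating any CUDA tag difference as a mismatch is a deliberately strict assumption.

```python
import re

WHEEL = "flash_attn-2.8.3+cu129torch2.8.0cxx11abiTRUE-cp313-cp313-win_amd64.whl"


def parse_wheel_tags(filename):
    """Split a flash_attn wheel filename into the fields that must match the environment."""
    pattern = (
        r"flash_attn-(?P<version>[\d.]+)\+cu(?P<cuda>\d+)torch(?P<torch>[\d.]+)"
        r"cxx11abi(?P<abi>TRUE|FALSE)-cp(?P<py>\d+)-cp\d+-(?P<platform>\w+)\.whl"
    )
    match = re.fullmatch(pattern, filename)
    if match is None:
        raise ValueError(f"unrecognised wheel name: {filename}")
    return match.groupdict()


def find_mismatches(tags, py_version, torch_version):
    """Compare wheel tags with a Python (major, minor) tuple and a torch version string."""
    problems = []
    # Python tag: cp313 means CPython 3.13.
    have_py = f"{py_version[0]}{py_version[1]}"
    if tags["py"] != have_py:
        problems.append(f"Python: wheel is cp{tags['py']}, environment is cp{have_py}")
    # PyTorch: compare major.minor only, e.g. 2.8 vs 2.9.
    base, _, local = torch_version.partition("+")
    if base.split(".")[:2] != tags["torch"].split(".")[:2]:
        problems.append(f"PyTorch: wheel is {tags['torch']}, environment is {base}")
    # CUDA: strict comparison of the cuXXX suffix (an assumption, see lead-in).
    if local and local != f"cu{tags['cuda']}":
        problems.append(f"CUDA: wheel is cu{tags['cuda']}, environment is {local}")
    return problems


tags = parse_wheel_tags(WHEEL)
for line in find_mismatches(tags, (3, 12), "2.9.1+cu128"):
    print(line)
```

For your own setup, pass sys.version_info[:2] and torch.__version__ instead of the hard-coded values. An empty list means the tags line up, which is necessary but not a guarantee that the wheel will load.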
@PabloFG Greatly appreciated. I'll try it out when I am not at my PC. I read somewhere that it will take a few hours to compile from scratch. I wish there was already a wheel for my setup. I'll keep you posted.
@PabloFG Just to update, I've compiled Flash Attention for my setup. It did not take too long.
There's a weird error message about a schema function during inference, but the models requiring FA2 work. No idea why the error appears.
Inference also seems fast. If someone has an idea about the error, I'd appreciate knowing.
