HUNYUAN | Img 2 Vid LeapFusion
Requirements: LeapFusion LoRA v2 (544p) or v1 (320p)
In short: it uses a special LoRA to do the trick.
It works combined with the LoRAs already available around. Prompting helps a lot, but it works even without.
Raise the resolution for more consistency and similarity with the input image.
*You may want to adjust the steps to your needs. I used a low step count for testing.
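If you're not sure what width/height to feed it, here's a rough helper I'd use outside ComfyUI: scale the input image so its short side matches the LoRA's training res (544 for v2, 320 for v1), keep the aspect ratio, and round to multiples of 16. The multiple-of-16 rounding is an assumption on my part, not something the workflow enforces.

```python
# Rough sketch: pick a generation resolution from the input image.
# Assumption: sides rounded down to multiples of 16 keep the VAE/sampler happy.
from PIL import Image

def pick_resolution(image_path: str, target_short_side: int = 544, multiple: int = 16):
    w, h = Image.open(image_path).size
    scale = target_short_side / min(w, h)
    new_w = max(multiple, int(w * scale) // multiple * multiple)
    new_h = max(multiple, int(h * scale) // multiple * multiple)
    return new_w, new_h

print(pick_resolution("input.png"))        # e.g. (960, 544) for a 16:9 source
print(pick_resolution("input.png", 320))   # for the v1 (320p) LoRA
```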

Bonus TIPS:
Here's an article with all the tips and tricks I've been writing as I test this model since December:
https://civarchive.com/articles/9584
You'll get a lot of precious quality-of-life tips for building and improving your Hunyuan experience.
No need to buzz me, ty💗 ..feedback is much more appreciated.
Comments
Yeah but
(IMPORT FAILED) comfyui-art-venture
(IMPORT FAILED) ComfyUI-SaveImageWithMetaData
(IMPORT FAILED) ComfyUI-ImageMetadataExtension
:(
Uhm, why this? Maybe try installing those nodes manually? I don't know.
It should work; the workflow is pretty simple, with common nodes.
I did everything. I even added some new code which I found as a solution on GitHub for comfyui-art-venture; this time it broke my ReActor, but I fixed it after deleting the code :D
Does this technique allow both source and target frames or just source?
just the source
Seems like they updated their lora model yesterday with one trained on a higher resolution (960x544 instead of 512x320)
https://huggingface.co/leapfusion-image2vid-test/image2vid-960x544/blob/main/img2vid544p.safetensors
Yes, this V2 workflow is based on that update 😁
extremely slow, gets stuck on sampler.
Now I need help with SageAttention, it's not working. I tried downloading SageAttention and it's in my folder, but nope.
I have it in User\ComfyUI\SageAttention
SageAttention doesn't work as a node. Follow the guide and install it (preferably use a conda environment for ComfyUI):
https://github.com/thu-ml/SageAttention?tab=readme-ov-file#install-package
Now, if you have more than one GPU and only one is capable of sage v2 (I have a 1660 and a 3090), you will get an error.
What I did to fix that (because I'm on Ubuntu and the GPUs index weirdly, I guess) was to modify setup.py,
line 108 "for i in range(device_count):"
to
"for i in range(device_count-1):"
No quotes, obviously.
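If you're not sure which GPU index is the one SageAttention 2 can't use, a quick capability check helps before touching setup.py. A rough sketch; the exact minimum is whatever their README states (roughly Ampere / compute capability 8.0 and up):

```python
# List each GPU's compute capability; SageAttention 2 targets roughly sm_80+ (Ampere or newer).
import torch

for i in range(torch.cuda.device_count()):
    major, minor = torch.cuda.get_device_capability(i)
    name = torch.cuda.get_device_name(i)
    status = "ok for sage v2" if major >= 8 else "too old for sage v2"
    print(f"cuda:{i} {name} (sm_{major}{minor}) -> {status}")
```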
@tangentplum598 I'm at the last step and its saying this,
AttributeError: module 'pkgutil' has no attribute 'ImpImporter'. Did you mean: 'zipimporter'?
@Santaonholidays does this help https://stackoverflow.com/questions/77364550/attributeerror-module-pkgutil-has-no-attribute-impimporter-did-you-mean#:~:text=Due%20to%20the%20removal%20of%20the%20long%2Ddeprecated%20pkgutil.ImpImporter%20class%2C%20the%20pip%20command%20may%20not%20work%20for%20Python%203.12.
It would be helpful to know the environment you are installing in.
@tangentplum598 I was using the ComfyUI desktop app terminal, doesn't that work? :D
@Santaonholidays what
@tangentplum598 Nvm, I don't know how to make a conda env.
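For the pkgutil.ImpImporter error a few replies up: that class was removed in Python 3.12, so an old setuptools/pip inside the environment ComfyUI uses will trip over it. The usual fix from that Stack Overflow thread is just upgrading the packaging tools with the same interpreter; a minimal sketch, assuming you can run Python from the desktop app's environment:

```python
# Upgrade pip/setuptools/wheel inside the Python that ComfyUI actually runs on.
# The ImpImporter error comes from an old setuptools on Python 3.12.
import subprocess, sys

subprocess.check_call([
    sys.executable, "-m", "pip", "install", "--upgrade",
    "pip", "setuptools", "wheel",
])
```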
Installation of this was really hard. I had to redo the whole Python, CUDA, and Triton installation and use a version of Comfy that has embedded Python 3.11; my main issue was that my Comfy used 3.10. Install system-wide Python 3.11, CUDA 12.4, Visual Studio with the right packages, and Triton 3.1. Build the Triton wheel, then build Sage, and it should work if everything is installed like that. You need a 30-series+ Nvidia video card minimum.
@tenstrip My Brain exploded xD
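Before building anything, it's worth confirming that the stack described above (Python 3.11, CUDA 12.4, Triton 3.1, a 30-series+ card) is what your ComfyUI environment actually sees. A rough check, run with the same Python ComfyUI uses; the import names (triton, sageattention) are the usual ones, but double-check against your install:

```python
# Rough environment check against the stack described above.
import sys
import torch

print("python :", sys.version.split()[0])                    # want 3.11.x
print("torch  :", torch.__version__, "| built for CUDA", torch.version.cuda)
if torch.cuda.is_available():
    print("gpu    :", torch.cuda.get_device_name(0),
          "sm", torch.cuda.get_device_capability(0))          # 30-series is sm (8, 6)

for pkg in ("triton", "sageattention"):
    try:
        mod = __import__(pkg)
        print(f"{pkg}: {getattr(mod, '__version__', 'installed')}")
    except ImportError:
        print(f"{pkg}: not installed")
```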
What about the other attention modes in the Hunyuan wrapper, like sdpa, comfy, flash attn? Will they not work?
@sikasolutionsworldwide709 I tried that and it does work, but only if you're lucky. I tried an ice cream to make it drip and it was a static, pixelated picture.
@Santaonholidays The i2v LoRA is not perfect; almost half of the gens are usually just bad. After messing with settings for a few days, it seems like lowering the LoRA to 0.7 or 0.5 can help, but you need the prompt to make the model line the motion up with the input image perfectly, or it just peels into some completely random noise. And definitely don't use any LoRAs that aren't trained on motion, they ruin it more. This is just a scuffed shortcut, because the real i2v model release will have much more contextual data and training to base off an input image.
@sikasolutionsworldwide709 They work, but sage cut a massive amount of time off for me on a 3090: a 50-step, 3-second clip went down to under 2 minutes at 540p. It's definitely the only one that's really optimizing things.
@tenstrip That's why I need to fix it, to be cool like you c:
This workflow works great, thanks for continuing to develop it. I do have one issue: I can do 512x768 but I can't do 768x768 without running out of memory. I am on a 4090, so I am wondering if there is some setting I need to change when going up in resolution?
Reduce the total number of frames or lower the resolution.
@fayer1688 Thanks, I was able to generate 576x576, that seems to be fine.
@yajukun Don't set it to a square resolution; crop the image to a rectangle, use a rectangular resolution, and you can set the height higher. I tried 896x496 with num_frames 101 and it worked fine.
@fayer1688 Oh, let me try that! Thanks!
@fayer1688 I just tried 896x496 @ 101 frames and got an OOM error? Not sure if it's because I am using 3 LoRAs, including the new img2vid one? Maybe I could do 101 if I only use 2 LoRAs? Seems like 73 frames works with 3 LoRAs, though.
@yajukun Don't use teacache, it will use a lot of extra video memory
@fayer1688 Oh, interesting! I will try to bypass it and test it out again. Thanks!
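For a rough sense of why frame count and resolution trade off like this: the sampler's memory pressure tracks the number of video tokens the DiT has to attend over, and teacache keeps extra cached copies on top of that. A back-of-the-envelope sketch, assuming the usual HunyuanVideo compression (VAE /8 spatial and /4 temporal, then 2x2 patchify); treat it only as a way to compare settings, not as a VRAM predictor:

```python
# Rough token-count comparison for different resolution/frame settings.
# Assumes HunyuanVideo-style compression: VAE /8 spatial, /4 temporal, 2x2 patchify.
def video_tokens(width: int, height: int, num_frames: int) -> int:
    latent_frames = (num_frames - 1) // 4 + 1
    return latent_frames * (height // 16) * (width // 16)

for w, h, f in [(512, 768, 73), (768, 768, 73), (896, 496, 73), (896, 496, 101)]:
    print(f"{w}x{h} @ {f:>3} frames -> {video_tokens(w, h, f):,} tokens")
```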
Anyone with this problem?
HyVideoTextImageEncode
unsupported operand type(s) for //: 'int' and 'NoneType'
same here
It's non-stop now, even on the original django workflows. Something changed since the first i2v examples. Even when I seemingly get it to work at random, all I get is black screens anyway.
Hi, I think we have seen a similar issue before, it had to do with having to downgrade transformers. Take a look at the comments section for this article: https://stable-diffusion-art.com/hunyuan-video-ip2v/
I feel like I should comment that I turned enhance-a-video down to 0 then just bypassed it and never went back. I think it's not latching to the initial image and is reading the prompt and overlaying random noise that is meant for t2v. It's what causes those weird bubbles and warped distortions in the bg.
Yeah, I noticed that happening sometimes too, but it's useful. It depends on a lot of factors:
size, steps... prompt... it's all so unstable 🤣
I hope you can point me in the right direction; you said you got this to work on an RTX 3090.
But the model + LoRA already take 24 GB of VRAM, and the text encoder model another 16 GB.
So how did you make that work? Your workflow doesn't work for me like this.
Were you able to get it working? I have a 4090, not a 3090, but the same 24 GB of VRAM, and I can run it with the 544p img2vid LoRA plus 2 other LoRAs. Per @fayer1688's suggestion, I use 896x496. If you are getting OOM issues, maybe try the smaller 320p img2vid LoRA and a smaller image and see if that runs?
@yajukun Not exactly, I just completely reinstalled Comfy and now it works. It's slow as hell, but it works. It's just that most of the time the output is very fragmented and I don't know why.
@Jellon So today I tried another member's Hunyuan WF that had missing nodes; when I installed them, it started uninstalling my Triton and Sage and borked my Comfy. So I too had to reinstall today. I did notice that my initial reinstall ran this img2vid workflow without popup errors, but all my output looked like ghosts that are glitching out. Is that what your output looks like? I remember I missed that config.ini security setting in the Manager and wasn't sure if that borked my setup of Sage, Triton, and the Hunyuan nodes, so I removed all the nodes and reinstalled everything. All my crap seems to work OK now. I just tested 896x496 @ 73 frames (using the 544p img2vid LoRA and 2 other LoRAs); it finishes in about 1:40 with teacache and about 2:45 with teacache bypassed. I can also confirm that @fayer1688 was correct: by turning off teacache I was able to extend my frames out to 101 without getting OOM errors.
@yajukun I don't think so. A PC restart seems to have fixed the issue for now without me doing anything. I restarted comfy a couple of times already, but that didn't help. Weird issue. I hope it doesn't come up again.
@Jellon Something I just noticed in my cmd window when I queue my prompts: if the Request to Load HyVideoModel = False, the output will be borked no matter how many times I generate. It will not give me a GUI error; it will let me keep generating garbage. I adjusted LoRA strengths and it reloaded the model, then I was good to go. Glad you got it working.
For some reason this workflow crashes my ComfyUI, it apparently loads but I see no nodes at all.
Guys, there's a new native workflow on the kijai wrapper GitHub. Can anyone try it and confirm if it works with the fast fp8 model? Because while the workflow itself works without errors, all I get is noisy artifacts. Adding other LoRAs didn't help.
UPD: I downloaded the regular fp8 model and it seems to work well; still wondering if it's possible to make the fast one work.
Stuck at 59% after much messing around to get it this far. It is stuck at
encoded latents shape torch.Size([1, 16, 1, 64, 64])
and goes no further. I edited out a section of the transformers timm module at lines 24 and 25 because that was giving an error, and I was advised that would fix that part at least. Now it's just sitting there. There seems to be drive activity (the LED is flashing on and off at a constant rate), but not much else.
For anyone else getting this error, remove --lowvram from the startup batch file.
However, I am unable to get satisfactory output from any kijai-based workflows. I either get OOMs or garbled output at lower res/frames. I can get up to 720x720 at 129 frames on a 4080 Super with a 5800X3D and only 32 GB of DDR4, yet on kijai I get OOMs even at 480x480.
If someone can work out how to get a GGUF and custom-sampler version of this workflow, it would be a godsend.
No idea what I'm doing wrong, I just get a black video output with no visuals. The workflow goes through each node, but the output is just a blank, black video file. wtf?
same thing is happening to me. did you find the problem?
Try this fix https://github.com/kijai/ComfyUI-HunyuanVideoWrapper/issues/118
One of the fixes fixed my problem.
AMD users will have an issue using SDPA with a torch version < 2.5.1, which gives a black screen; not choosing bf16 in the Hunyuan model loader can also give a black screen.
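A quick way to check both conditions from the comment above before blaming the workflow (the 2.5.1 threshold is per that comment, not something verified independently):

```python
# Check torch version and bf16 support on the active device.
import torch

print("torch:", torch.__version__)   # comment above says >= 2.5.1 avoids the SDPA black screen
if torch.cuda.is_available():
    print("bf16 supported:", torch.cuda.is_bf16_supported())
```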
I'm attempting to run this on a 4070 12 GB and am getting OOM errors. Is there any hope for this, or are there any tweaks I can make to get it to run?
I initially had trouble running Kijai's nodes on my 12GB 4080 too, until I set the HunyuanVideo Model Loader to auto_cpu_offload and turned off upcast_rope. In addition, I enabled quantization for the text encoder (bnb_nf4) by installing bitsandbytes on Windows.
If I didn't enable quantization, my 32 GB of RAM would be entirely used by the LLM and CLIP models, and my VRAM would fill up entirely. With quantization turned on, the LLM and CLIP models only take around ~15 GB and some VRAM. When the main video model is loading, that memory gets released back to the pool.
Of course, resolution and number of frames can still cause OOMs; reduce those until you use around 90-95% of VRAM while the video is being generated.
@funscripter627 Thanks for the tips! The first couple of config changes got me a little leeway, but it's still running OOM very often. I installed bitsandbytes using the pip installer in ComfyUI Manager and am getting an error: "module 'bitsandbytes' has no attribute 'nn'". Is there a parameter I have to configure to define this? Thanks.
@radiantResistor No problem man. I'm unfortunately unfamiliar with that specific error. Maybe try asking ChatGPT. I think I installed it by opening my venv and executing the pip command. I'm using bitsandbytes-0.45.1 if that helps.
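If anyone else hits the "no attribute 'nn'" error, it's worth checking that bitsandbytes actually imports cleanly inside the environment ComfyUI uses; a half-broken install (e.g. a wheel that doesn't match your setup) can leave the package importable but missing pieces. A small check:

```python
# Run with the same Python that ComfyUI uses.
import bitsandbytes as bnb

print("version:", bnb.__version__)       # 0.45.1 is the version reported working above
print("has nn :", hasattr(bnb, "nn"))    # False usually points to a broken/partial install
```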
Hello. Is there a way to change the sampler (like euler or others) in this workflow? Thank you.
DownloadAndLoadHyVideoTextEncoder
Failed to import transformers.models.timm_wrapper.configuration_timm_wrapper because of the following error (look up to see its traceback): cannot import name 'ImageNetInfo' from 'timm.data'
Any ideas how I can solve this please?
For me the error was gone after replacing the timm folder in the mentioned path with version 1.0.13.
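Rather than swapping the timm folder by hand, pinning the version with pip inside the ComfyUI environment should have the same effect; 1.0.13 is just the version reported working above, not an official requirement:

```python
# Pin timm to the version reported to fix the ImageNetInfo import error,
# then confirm the import that transformers needs actually works.
import subprocess, sys

subprocess.check_call([sys.executable, "-m", "pip", "install", "timm==1.0.13"])

from timm.data import ImageNetInfo  # should import without error on a recent timm
print("timm ok")
```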
Is it possible without sageattn? (Because of Windows.)
I tried comfy attn, but the generated video was only some black and white pixels. I'm not sure if attn was the issue.
I was able to get Sage Attention installed on Windows. It won't work if you're running Comfy in Stability Matrix, because there's a bug in how a Python package is loaded. You need to install Triton, which has a bunch of dependencies, and then it should install okay. https://github.com/woct0rdho/triton-windows Just carefully follow the directions under "install from wheel". It's worth the pain; a real boost in speed and capabilities.
@radiantResistor using Stability Matrix 🤣
Maybe I got it to run, but now I'm stuck on VRAM allocation. I only have 12 GB.
Not working. sloooooooooow
4070 Ti S 16G, the progress is stuck at 0%. Help me, please. Thanks.
Every time I want to create a video using the flow for the second time in a row, i.e. just simply running the task a second time, my 4090 crashes and I need to reboot my PC. Has anyone faced this issue? I am using Python 3.12.8 for ComfyUI.
Greetings.
Maybe you can help:
shape '[1, 1536, 1, 128]' is invalid for input of size 3342336
It stops at HunyuanVideoSampler node.
Regards,
It means there is a res mismatch: the image size and the sampler size are different. Do an image resize on your input image and connect the width and height outputs from the resizer node to the sampler inputs.
@jtsanborn Thanks, I'll try.
Help, please
If I choose the llm_model from Kijai, I get this error:
DownloadAndLoadHyVideoTextEncoder
Unrecognized configuration class <class 'transformers.models.llava.configuration_llava.LlavaConfig'> for this kind of AutoModel: AutoModel.
Your workflow almost made my 4070TIS die heroically. Suddenly, the computer crashed with a click! Oh my God!
For anyone having issues with this workflow, or with i2v using Hunyuan in general, may I suggest you try Cosmos workflow v1.2. I am getting good results from it. The only downside is that it is not great for nsfw.
Hi @LatentDream, any chance of updating your workflow to work with Skyreels Img2Vid? Thanks!
Doesn't get past the text part;
it just stops after loading the model and VAE.
same
Ran into some problems... any ideas? I tried updating everything I can think of...
HyVideoSampler 3:
20:49:54 - Return type mismatch between linked nodes: feta_args, received_type(TEACACHEARGS) mismatch input_type(FETAARGS)
20:49:54 - Return type mismatch between linked nodes: context_options, received_type(FETAARGS) mismatch input_type(HYVIDCONTEXT)

Could you please update this workflow to use this: https://github.com/deepbeepmeep/HunyuanVideoGP
It's not the same thing at all... Here it's i2v; HunyuanVideoGP is for t2v, and it can't be transposed to ComfyUI.
All that is, is fast Hunyuan compiled + sageattn in its own Gradio env. They make it seem like something new; you can easily recreate it in Comfy.
