Mostly based on martial arts movies of the 70s (and some from the 80s). This time I used some video clips to capture both some of the combat and some typical camera shots. Results may vary. It really struggles with fast character movements and probably needs a whole lot more video clips to really capture martial arts.
I've noticed it deviates from the style with longer clips.
Some example prompts from the training set:
"In the foreground, two men are engaged in combat. One man, wearing a sleeveless brown shirt and white pants, is on the left, while the other, in a red martial arts gi, is on the right. The man in red is using a trident to strike the other, who parries the blow. The background shows a dense, green forest under a clear blue sky. The camera angle is low, capturing the intensity of the fight."
"A man wearing traditional Chinese attire, He has long, dark hair and is wearing a black hat with a green emblem on the front. The background is a solid dark purple, creating a stark contrast with the subject. The camera angle is slightly above eye level, capturing the person from the shoulders up. The camera pans right to reveal a man dressed in white. He has long black hair and a stern expression. He stands in front of yellow curtains with red sashes. The lighting is soft and even, with a slight reddish hue on the person's face."
Description
Same output as diffusers without the need for modified comfy or special nodes
Comments (2)
I tried to use your lora, and I like it! Just a few suggestions:
-Include some example prompts for it. I found it very useful to feed ChatGPT all of the prompts used for training, or a few of them, and ask it to summarise them into a single prompt. I HATE natural language prompts for this reason; simple danbooru tags are simpler and work well, but most models require NL to work.
-At what strength should it be used? I tried at 1, but it makes the lora a bit harder to use.
-Have you used a very high rank for training? The lora is 1.2GB, which is very large for a Hunyuan lora. Looks like a rank 64 or larger.
-Since you used clips from old wuxia movies, I think the best aspect ratio for generations would be 4:3 (something like 640x480), since all of those were originally filmed in that aspect ratio?
I found that your model performs best using Hunyuan Video instead of FastVideo.
Thanks!
AFAIK, Hunyuan is trained on natural language, so I figured the lora should be the same. I also disliked NL prompts, but for video I've realized I like them. They can pick up on things that tags miss.
I usually have strength 0.8-0.9 for my loras and conditioning on 8-10.
It's rank 64. The Comfy lora is so large because of the conversion process, and I'm afraid I haven't been able to shrink it. It's about 4x the size of the diffusers lora.
I've mostly used screen caps for training, and aspect ratio may vary. For the next version I'll use more clips. That will hopefully make action shots better.
But it is a difficult lora to work with. I usually have to do quite a lot of generations to get good results.
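
For anyone who wants to try the 4:3 suggestion from the comment above, here is a minimal sketch of how to pick a width and height at that aspect ratio for a given pixel budget. The helper name `fit_resolution` is hypothetical, and snapping to a multiple of 16 is an assumption about what the video pipeline accepts, not something stated on this page; adjust it to whatever your setup requires.

```python
import math


def fit_resolution(target_pixels: int, aspect_w: int = 4, aspect_h: int = 3,
                   multiple: int = 16) -> tuple[int, int]:
    """Return (width, height) close to target_pixels at aspect_w:aspect_h.

    Both sides are rounded to the nearest `multiple`. The default of 16 is an
    assumption, not a documented requirement of any particular model.
    """
    # Solve width * height = target_pixels with width / height = aspect_w / aspect_h.
    height = math.sqrt(target_pixels * aspect_h / aspect_w)
    width = height * aspect_w / aspect_h

    def snap(value: float) -> int:
        # Round to the nearest multiple, never going below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    # The 640x480 example mentioned in the comment.
    print(fit_resolution(640 * 480))
```

Because both sides are rounded independently, the result can drift slightly from an exact 4:3 ratio for pixel budgets that do not divide cleanly.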


