    WAN VACE Clip Joiner - Smooth AI video transitions for Wan, LTX-2, Hunyuan, and any other video source - v2.4

    GitHub | Civitai


    New feature: seamless looping


    ComfyUI Frontend Compatibility Notice

    Affected versions: ComfyUI_frontend 1.40.x – 1.42.9 (known good: <= 1.39.19 or >= 1.42.10)

    Recent ComfyUI frontend updates have introduced significant issues with subgraph functionality that affect this workflow.

    If you are affected, this message appears in your ComfyUI console right after you start a workflow run:

    Failed to validate prompt for output 499: 
    * ColorMatch 587:586: 
     - Required input is missing: image_target 
    * Basic data handling: IfElse 598: 
     - Required input is missing: if_false

    The workflow may appear to run correctly, but only parts of it will actually produce output. It won't finish with a properly joined video.

    If you see this warning and the workflow isn't running as expected, downgrade your ComfyUI frontend to 1.39.19 or earlier, or upgrade to 1.42.10 or later, and reload a fresh copy of the workflow.


    What it Does

    Point this workflow at a directory of clips and it will automatically stitch them together. It's designed to work well with a few clips or dozens. At each transition, Wan VACE generates new frames guided by context on both sides, replacing the seam with motion that flows naturally between the clips. Noisy or artifacted frames at clip boundaries get replaced in the same pass. How many context frames and generated frames are used is configurable.
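
    To make that concrete, here's a rough Python sketch of the frame roles around one seam. It is purely illustrative (the function name and numbers are made up, not taken from the workflow), but the layout matches the minimum-length rule given under Troubleshooting.

    # Illustrative only; the real logic lives in the workflow's VACE prep nodes.
    # Around each A->B seam, context frames are kept as guidance and the
    # boundary frames on both sides are masked out for VACE to regenerate.
    def join_window(context_frames: int, replace_frames: int) -> list[str]:
        """Frame roles across one seam, from clip A's side to clip B's side."""
        keep = ["keep"] * context_frames          # untouched guidance frames
        regen = ["regenerate"] * replace_frames   # masked; VACE fills these in
        return keep + regen + regen + keep        # A-side half, then B-side half

    print(join_window(3, 2))
    # ['keep', 'keep', 'keep', 'regenerate', 'regenerate',
    #  'regenerate', 'regenerate', 'keep', 'keep', 'keep']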

    The workflow runs with either Wan 2.1 VACE or Wan 2.2 Fun VACE. Input clips can come from anywhere - Wan, LTX-2, phone footage, stock video, whatever you have.

    If you want the result to loop cleanly, there's a toggle for that.

    Usage

    1. Put your input clips in their own directory, named so they sort in the order you want them joined.

    2. Configure the workflow parameters. The notes in the workflow have full details on each one.

    3. Set the index to 0.

    4. Queue the workflow. You need to queue it once per transition: N-1 times for N clips, or N times if looping is enabled. If you'd rather not click Queue repeatedly, see the script sketched below.
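
    The runs can also be queued from a script against ComfyUI's HTTP API. This is an optional convenience sketch, not part of the workflow: the file name and node id below are placeholders, and it assumes you've saved the workflow in API format and located the node that holds the transition index.

    import json
    import urllib.request

    COMFY_URL = "http://127.0.0.1:8188/prompt"   # default local ComfyUI address
    INDEX_NODE_ID = "123"                        # placeholder: your index node's id
    NUM_CLIPS = 5
    MAKE_LOOP = False

    with open("clip_joiner_api.json") as f:      # workflow saved via Export (API)
        prompt = json.load(f)

    runs = NUM_CLIPS if MAKE_LOOP else NUM_CLIPS - 1
    for i in range(runs):
        prompt[INDEX_NODE_ID]["inputs"]["value"] = i   # one transition per run
        req = urllib.request.Request(
            COMFY_URL,
            data=json.dumps({"prompt": prompt}).encode(),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req)
        print(f"queued transition {i}")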

    Setup

    This is not a ready-to-run workflow. You need to configure it to fit your system.

    What runs well on my system will not necessarily run well on yours. Configure this workflow to use a VACE model of the same type that you use in your standard Wan workflow. Detailed configuration and usage instructions can be found in the workflow. Please read carefully.

    Dependencies

    I've used native nodes and tried to keep the custom node dependencies to a minimum. The following packages are required. All of them are installable through the Manager.

    Note: I have not tested this workflow under the new Nodes 2.0 UI.

    Configuration and Models

    You'll need some combination of these models to run the workflow. As already mentioned, it will not run properly until you configure it for your system. You probably already have a Wan video generation workflow that runs well on your machine; configure this workflow the same way.

    The Sampler subgraph contains KSampler nodes and model loading nodes. Inference is isolated in subgraphs, so it should be easy to modify this workflow for your preferred setup. Replace the provided sampler subgraph with one that implements your setup, then plug it into the workflow. Have your way with these until it feels right to you.

    Just make sure all the subgraph inputs and outputs are correctly getting and setting data, and crucially, that the diffusion model you load is either Wan 2.2 Fun VACE or Wan 2.1 VACE. GGUFs work fine, but non-VACE models do not. An example alternate sampler subgraph for VACE 2.1 is included.

    Enable sageattention and torch compile if you know your system supports them.

    Troubleshooting

    • The size of tensor a must match the size of tensor b at non-singleton dimension 1 - Check that both dimensions of your input videos are divisible by 16, and resize them if they're not. Fun fact: 1080 is not divisible by 16!
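
    A quick way to check and fix this (a sketch; snaps each dimension down to the nearest multiple of 16):

    def snap16(x: int) -> int:
        """Round a dimension down to the nearest multiple of 16."""
        return (x // 16) * 16

    w, h = 1920, 1080
    print(snap16(w), snap16(h))   # 1920 1072 -- 1080 snaps down to 1072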

    • Brightness/color shift - VACE can sometimes affect the brightness or saturation of the clips it generates. I don't know how to avoid this tendency; I think it's baked into the model, unfortunately. Disabling lightx2v speed loras can help, as can using the exact same lora(s) and strengths in this workflow that you used when generating your clips. Some people have reported success adding a color match node before this workflow's output, but specific solutions seem to vary by case. The most consistent mitigation I have found is to interpolate the framerate up to 30 or 60 fps after using this workflow. Interpolation makes the color shift less perceptible: the shift is still there, but it's spread out over 60 frames instead of 16, so it no longer reads as a sudden change to our eyes.

    • Regarding framerate - The Wan models are trained at 16 fps, so if your input videos are at some higher rate, you may get sub-optimal results. At the very least, you'll need to increase the number of context and replace frames by whatever factor your framerate exceeds 16 fps in order to achieve the same effect with VACE. I suggest forcing your inputs down to 16 fps for processing with this workflow, then re-interpolating back up to your desired framerate.
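
    As a back-of-the-envelope helper (illustrative only), scale both frame counts by fps/16:

    def scale_for_fps(context_frames: int, replace_frames: int, fps: float):
        """Scale frame counts to cover the same motion span as at 16 fps."""
        factor = fps / 16.0   # Wan models are trained at 16 fps
        return round(context_frames * factor), round(replace_frames * factor)

    print(scale_for_fps(8, 8, 32))   # (16, 16) -- twice as many at 32 fps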

    • IndexError: list index out of range - Your input video may be too small for the parameters you have specified. The minimum size for a video will be (context_frames + replace_frames) * 2 + 1. Confirm that all of your input videos have at least this minimum number of frames.
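
    You can pre-flight your inputs with something like this (a sketch; assumes opencv-python is installed and your clips are in input_clips/):

    import cv2
    from pathlib import Path

    CONTEXT_FRAMES = 8   # match your workflow settings
    REPLACE_FRAMES = 8
    MIN_FRAMES = (CONTEXT_FRAMES + REPLACE_FRAMES) * 2 + 1

    for clip in sorted(Path("input_clips").glob("*.mp4")):
        cap = cv2.VideoCapture(str(clip))
        frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
        cap.release()
        verdict = "ok" if frames >= MIN_FRAMES else f"TOO SHORT (need {MIN_FRAMES})"
        print(f"{clip.name}: {frames} frames - {verdict}")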

    • If you can't make the workflow work, update ComfyUI and try again. If you're not willing to update ComfyUI, I can't help you. We have to be working from the same starting point.

    • Feel free to open an issue on GitHub. This is the most direct way to engage me. If you want a head start, paste your complete console log from a failed run into your issue.


    Changelog

    • v2.5

      • Seamless Loops - Enable the Make Loop toggle and the workflow will generate a smooth transition between your final input video and the first one, allowing the video to be played on a loop.

      • Much lower RAM usage during final assembly - Enabled by default, VideoHelperSuite's Meta Batch Manager drastically reduces the amount of system RAM consumed while concatenating frames. If you were running out of RAM on the final step because you were joining hundreds or thousands of frames, that shouldn't be a problem any more. Additional details in the workflow notes.

    • v2.4 Minor tweaks. Adjust sage attention, torch compile defaults.

    • v2.3 This release prioritizes workflow reliability and maintainability. Core functionality remains unchanged. These changes reduce surface area for failures and improve debuggability. Stability and deterministic operation take priority over convenience features.

      • Looping workflow discontinued – While still functional, the loop-based approach obscured workflow status and complicated targeted reruns for specific transitions. The batch workflow provides better visibility and control.

      • Reverted to lossless ffv1 intermediate files – The 16-bit PNG experiment provided no practical benefit and made addressing individual joins more cumbersome. Returning to the proven method.

      • New custom nodes for cleaner workflows – WAN VACE Prep Batch and VACE Batch Context encapsulate operations that are awkward to express in visual nodes but straightforward in Python. Load Videos From Folder (simple) replaces the KJNodes equivalent to eliminate problematic VideoHelperSuite dependencies that fail in some environments.

      • Enhanced console logging – Additional diagnostic output when Debug=True to aid troubleshooting.

      • Fewer custom node dependencies

      • The Lightweight Workflow has moved to its own page. Check it out if you just need to quickly join two clips without the overhead required by the full workflow.

    • v2.2 Complexity Reduction Release

      • Removed the fancy model loader, which was causing headaches for safetensors users without any gguf models installed, and vice-versa.

      • Removed the MOE KSampler and TripleKSampler subgraphs. You can still use these samplers, but it's up to you to bring them and set them up.

      • Custom node dependencies reduced.

      • Un-subgraphed some functions. Sadly, this powerful and useful feature is still too unstable to distribute to users on varying versions of ComfyUI.

      • Updated documentation.

    • v2.1

      • Add Prune Outputs to Video Combine nodes, preventing extra frames from being added to the output

    • v2.0 - Workflow redesign. Core functionality is the same, but hopefully usability is improved

      • (Experimental) New looping workflow variant that doesn't require manual queueing and index manipulation. I am not entirely comfortable with this version and consider it experimental. The ComfyUI-Easy-Use For Loop implementation is janky and requires some extra, otherwise useless code to make it work. But it lets you run with one click! Use with caution. All VACE join features are identical between the workflows. Looping is the only difference.

      • (Experimental) Added cross fade at VACE boundaries to mitigate brightness/color shift

      • (Experimental) Added color match for VACE frames to mitigate brightness/color shift

      • Save intermediate work as 16 bit png instead of ffv1 to mitigate brightness/color shift

      • Integrated video join into the main workflow. It will run automatically after the last iteration. No more need to run the join part separately.

      • More documentation

      • Inputs and outputs are logged to the console for better progress tracking

    • v1.2 - Minor Update 2025-Oct-13

      • Sort the input directory list.

    • v1.1 - Minor Update 2025-Oct-11

      • Preserve input framerate in workflow VACE outputs. Previously, all output was forced to 16fps. Note, you must manually set the framerate in the Join & Save output.

      • Changed default model/sampler to Wan 2.2 Fun VACE fp8/KSampler. GGUF, MoE, 2.1 are still available in the bypassed subgraphs.

    Comments (21)

    TheRopeDude · Feb 13, 2026

    Thank you so much for your work!

    arkkotangens · Feb 25, 2026

    Hi. Thank you so much for this workflow, it's a brilliant solution.

    I'd like to clarify one thing.

    I'm using the workflow with Wan VACE 2.1 on a 4090. Wan VACE 2.2 always gave me some kind of small grainy warping in new frames; Wan VACE 2.1 works perfectly.

    I don't use loras or other optimizations, as I'm more interested in quality, not frame stitching speed. But I noticed that VRAM usage during generation stays at 78 percent. The total frame stitching time ranges from an hour and a half (first run) to an hour (subsequent runs). Comfy updated 3 days ago.

    Is this normal? I suspect there's a bottleneck somewhere.

    __Bob__
    Author
    Feb 25, 2026

    Hi, I'm glad you find the workflow useful.

    It's interesting that you get better results with 2.1. For me it's the opposite. I suppose it must vary according to parameters and content type.

    It sounds like you're saying that one iteration of the workflow runs for an hour to an hour and a half. Is that correct, or is that the total time to generate all the transitions and assemble them into the final video? How many clips are you joining together?

    This is fundamentally the same as any Wan workflow. It just uses a different model and includes some extra plumbing for control setup and scalable batching. The time to generate one transition should be roughly the same as the time it takes to run a normal Wan t2v generation at the same resolution, frames and parameters. If that's not your experience, then yes, there might be a bottleneck somewhere.

    On the other hand, if you're using 30 steps for each generation and doing that for a large number of clips, an hour doesn't seem unreasonable.

    Depending on the total number of frames in the final joined video, the final stitching step can be very resource intensive. Before the work files can be joined into one video, all frames must be loaded into system RAM, and this can quickly fill up memory and trigger slowdown due to paging. If this happens to be the problem, I have a good solution that I haven't integrated into the workflow yet.

    If you want, share more details about your environment and how you're using the workflow (number of inference steps, number of input clips, resolution, length, system RAM, VRAM, etc) and we can try to identify whether there's a bottleneck.

    ron01468 · Feb 27, 2026

    Thanks for making this workflow. A couple of questions as I get started:

    1. For files in the input folder: do they need to be sorted alphanumerically in the correct order? Like Wan_0001, Wan_0002, etc?

    2. Can the input videos be longer than 5 seconds / 81 frames? I typically gen 97 or 113 frames.

    3. I normally upscale (SeedVR2) and interpolate (to 32fps). Should I do that on individual clips and then use this joiner or would that cause problems because of the increased size?

    Thanks!

    __Bob__
    Author
    Feb 27, 2026

    Hi,

    Yes, your input files should be named such that a lexical sort will put them into the order you want. This means that numbers in the names should be zero-padded, as in your example. Without zero-padding, they'll sort wrong: wan_1, wan_10, wan_2, etc.
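
    A quick Python illustration of why:

    names = ["wan_1.mp4", "wan_2.mp4", "wan_10.mp4"]
    print(sorted(names))
    # ['wan_1.mp4', 'wan_10.mp4', 'wan_2.mp4']  <- wan_10 sorts before wan_2

    padded = ["wan_0001.mp4", "wan_0002.mp4", "wan_0010.mp4"]
    print(sorted(padded))
    # ['wan_0001.mp4', 'wan_0002.mp4', 'wan_0010.mp4']  <- correct order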

    Input videos can be any length that your system RAM can handle, as long as they're long enough to provide the number of context and replace frames that you configure. The workflow will complain if this isn't the case. 97 or 113 frames should be fine.

    If you can help it, join the clips first, then run scaling and interpolation. If you scale and interpolate first, the workflow will require more time and system resources to accomplish the joins. Additionally, you'll need to specify more context and replace frames (twice as many for 32fps) for the VACE controls in order for frame generation to be effective. This will also increase VRAM and runtime requirements because VACE will need to generate more frames.

    guiltyai69 · Mar 12, 2026

    Is it possible to use this with only one clip to make a looping video?

    __Bob__
    Author
    Mar 12, 2026

    The workflow needs two or more clips or else there's nothing to join. You're probably better off trying a workflow made for producing looping video. I think I've seen some that use VACE.

    guiltyai69 · Mar 13, 2026

    @__Bob__ Unfortunately the only workflows I've been able to find for looping videos all use Wan 2.1 VACE and not Wan 2.2 VACE, so the results don't look as good.

    agentgerbil · Mar 22, 2026

    I keep getting this error and I don't know how to fix it:

    Failed to validate prompt for output 499:
    * ColorMatch 587:586:
     - Required input is missing: image_target
    * Basic data handling: IfElse 598:
     - Required input is missing: if_false

    __Bob__
    Author
    Mar 23, 2026

    It sounds like some nodes have come unconnected. People report similar issues caused by a recent ComfyUI frontend package update. If you have updated ComfyUI in the past week or so, that may be the cause. If this sounds like your problem, downgrading ComfyUI to a version with comfyui_frontend_package==1.39.19 or lower might be the most straightforward solution.

    Otherwise, try:

    - close all open copies of this workflow, restart ComfyUI, and then load a fresh, untouched copy of the workflow

    - if that didn't help, and you're comfortable editing workflows, you can try reconnecting the nodes. I don't recommend this if you're not familiar with the basic mechanics of ComfyUI editing. The nodes mentioned in the console messages you shared are on the right side of the workflow in the group labelled Post-Processing: Color Match. You'll need to edit your workflow to match the screenshots at this link: https://imgur.com/a/0E2qslO. Note the second image shows the nodes inside the Apply Color Match subgraph.

    zs7758 · Mar 23, 2026

    I’ve encountered this issue too — even after downgrading the ComfyUI version, it still hasn’t been resolved. Have you found a solution?

    __Bob__
    Author
    Mar 23, 2026

    @zs7758 If you have downgraded ComfyUI, make sure that you open a new copy of the workflow after that. The broken ComfyUI version may have saved a bad copy of the workflow, so start fresh.

    zs7758 · Mar 23, 2026

    @__Bob__ I tried downgrading the ComfyUI version, but it didn’t help — the same error still appears.

    zs7758 · Mar 23, 2026

    @__Bob__ Yes, after downgrading the ComfyUI version, I reloaded the workflow, but the same error still occurred.

    zs7758 · Mar 23, 2026

    @__Bob__ I recommend that in the next version you remove the “bracket-style” nodes (often called “Primitive Node” or “Node Group Port”). I've found this type of node is prone to issues, especially after ComfyUI updates, where the likelihood of encountering abnormal problems is very high. I've tried every possible method, including reloading the workflow, but nothing has worked. It seems that this “bracket-style” node fails to properly receive data. Even though I carefully checked all node connections and they appear correct, I suspect that this kind of port node sometimes incorrectly identifies the actual data (e.g., a Tensor) flowing from its “backend” connection. Though visually a wire extends from it, the real image data does not actually pass through to downstream nodes. I suggest you switch to KJ's Set and Get nodes.

    agentgerbil · Mar 23, 2026

    The lightweight workflow still works, at least; it just takes more time to join clips.

    __Bob__
    Author
    Mar 23, 2026

    @zs7758 Are you referring to the input and output connectors in the subgraph when you say "bracket-style node"? Those are not optional. That's how subgraphs work.

    The recent comfyui_frontend_package update to 1.41.x broke many subgraph-related features in previously working workflows. Apparently it's also a problem here.

    I'm sorry you're having trouble, but until ComfyUI fixes their broken front end, the only real solution seems to be downgrading to an earlier version. In case it helps, I'm currently running ComfyUI 0.17.0 + ComfyUI_frontend 1.39.19. A newly-downloaded copy of this workflow runs fine with this setup.
    After you downgrade, don't just reopen the workflow. Open an untouched copy, extracted from the zip file you got from CivitAI.

    zs7758 · Mar 23, 2026

    @__Bob__ Thank you for your guidance. Even after downgrading ComfyUI to version 0.17.0, the issue persisted — since the ComfyUI_Frontend version was 1.41. I then tried downgrading to ComfyUI version 0.16.0 and ComfyUI_Frontend version 1.39.19, and the workflow ran successfully. It’s now working properly.

    For the newer versions of ComfyUI, I still recommend replacing the input/output connectors used in subgraphs. It isn't really necessary to use subgraphs for workflow simplification; for beginners, avoiding them makes a workflow easier to understand and debug.

    zs7758 · Mar 23, 2026

    @__Bob__ I have another question to ask you: How can I prevent these videos from being merged together? For example, if I have four 5-second clips, I want to preserve them as either two 10-second clips or four individual 5-second clips — instead of merging them into one 20-second video.

    Once merged, I still need to use mmAudio to generate audio, perform video frame interpolation, and upscale resolution. If the total length is too long, mmAudio will consume more VRAM and take longer. After generating the audio, I manually merge the four clips back together.

    What should I do to achieve this?

    __Bob__
    Author
    Mar 23, 2026

    @zs7758 If you have 4 clips but only want to join 2 of them together, just put those 2 in the input directory. Or, even simpler, try my Lightweight Edition workflow. It's linked on this page and is meant to join two clips together rather than the larger batch that this workflow is intended for.

    However, if your goal is to ultimately join all 4 into one longer video after some other processing, you'll want to generate transitions between all of the clips, not just 1+2 and 3+4, but also 2+3. So consider using the work files created by this workflow instead of its final output. In the project directory that you specify, a new directory called vace-work is created. As the workflow runs, it puts all of the generated files into vace-work. Then the final step combines everything in vace-work into one long video.

    The vace-work files are saved with the lossless ffv1 codec. If the mkv format makes it hard for you to work with, you can find the three "Lossless Save" nodes in the workflow and change their output format to something that suits you better.
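
    If you'd rather convert the work files outside ComfyUI instead of editing those nodes, something along these lines works (a sketch driving ffmpeg from Python; the project path is a placeholder, and -crf 0 keeps x264 lossless):

    import subprocess
    from pathlib import Path

    work_dir = Path("my_project/vace-work")   # placeholder: your project directory
    for mkv in sorted(work_dir.glob("*.mkv")):
        out = mkv.with_suffix(".mp4")
        subprocess.run(
            ["ffmpeg", "-i", str(mkv), "-c:v", "libx264", "-crf", "0", str(out)],
            check=True,   # raise if ffmpeg fails on a file
        )
        print(f"converted {mkv.name} -> {out.name}")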

    zs7758 · Mar 23, 2026

    @__Bob__ Thank you! I found a node that can split a 20-second video into four separate clips. My goal is to use your workflow to seamlessly transition between segments at the end of each clip. I’ll process each segment individually for audio generation, frame interpolation, and resolution upscaling.

    Thank you for your excellent work!

    Workflows
    Wan Video 14B t2v

    Details

    Downloads: 625
    Platform: CivitAI
    Platform Status: Available
    Created: 2/10/2026
    Updated: 5/12/2026
    Deleted: -

    Files

    wanVACEClipJoinerSmoothAIVideo_v24.zip