The LTX 2.3 KJ Stripped 1.1 workflow is a specialized audio-visual (AV) generation pipeline designed to interpolate between two static images. It utilizes a "Guided" architecture to ensure the generated video adheres strictly to a designated start and end point.
Core Architecture & Processing
Model Loading: The workflow employs the LTX-Video 2.3 22B Distilled transformer model via a specialized KJ loader node.
Dual-Modality Processing: It initializes both an EmptyLTXVLatentVideo and an LTXVEmptyLatentAudio space.
AV Concatenation: Visual and audio latents are merged into a single latent stream using the LTXVConcatAVLatent node, allowing the sampler to process both simultaneously.
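The wiring described above can be sketched in the API-style prompt format ComfyUI uses, where each node is a dictionary entry and a link is a [source_node_id, output_index] pair. This is only an illustration: the class_type names are taken from the description, but the input names (video_latent, audio_latent) and the width/height values are placeholders and may not match the real node definitions.

```python
# Sketch of the AV latent wiring in ComfyUI's API-style prompt format.
# class_type values come from the workflow description; every input name
# and value below is a placeholder, not the node's confirmed signature.
graph = {
    "1": {"class_type": "EmptyLTXVLatentVideo",
          "inputs": {"width": 1024, "height": 1024}},
    "2": {"class_type": "LTXVEmptyLatentAudio", "inputs": {}},
    "3": {"class_type": "LTXVConcatAVLatent",
          "inputs": {"video_latent": ["1", 0], "audio_latent": ["2", 0]}},
}


def upstream_classes(graph, node_id):
    """Return the class_type of every node feeding directly into node_id."""
    found = []
    for value in graph[node_id]["inputs"].values():
        # A link is a two-item list: [source_node_id, output_index].
        if isinstance(value, list) and len(value) == 2 and value[0] in graph:
            found.append(graph[value[0]]["class_type"])
    return found


# Lists the two empty-latent nodes that feed the concat node.
print(upstream_classes(graph, "3"))
```

Walking the links this way is a quick check that both the video and the audio latent actually reach the concat node before sampling.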
Frame Guidance System
Start Frame Anchor: An image is loaded and preprocessed to serve as the reference for frame_idx: 0.
End Frame Anchor: A second image is loaded and preprocessed to serve as the reference for the final frame (frame_idx: -1).
Guide Application: The LTXVAddGuide nodes apply these images to the latent space with a configurable strength (set to 0.7 in this version) to dictate the video's trajectory.
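To make the two anchors concrete, here is a small sketch of how frame_idx values of 0 and -1 could map onto absolute frame positions. It assumes Python-style negative indexing, which may differ from how LTXVAddGuide resolves indices internally. The Guide class, the image labels, and the 97-frame clip length are all hypothetical; only the frame_idx values and the 0.7 strength come from the description.

```python
from dataclasses import dataclass


@dataclass
class Guide:
    """Hypothetical stand-in for one LTXVAddGuide setting."""
    image: str             # placeholder label, not a real file path
    frame_idx: int         # 0 = first frame, -1 = last frame
    strength: float = 0.7  # guide strength used in this workflow version


def resolve_frame_idx(frame_idx, num_frames):
    """Map a possibly negative frame index to an absolute position.

    Assumes Python-style negative indexing; the real node may differ.
    """
    if not -num_frames <= frame_idx < num_frames:
        raise IndexError(f"frame_idx {frame_idx} outside clip of {num_frames} frames")
    return frame_idx % num_frames


num_frames = 97  # assumed clip length, for illustration only
guides = [Guide("start_image", 0), Guide("end_image", -1)]
for guide in guides:
    print(guide.image, resolve_frame_idx(guide.frame_idx, num_frames), guide.strength)
```

Under these assumptions the start image anchors frame 0 and the end image anchors frame 96, both at strength 0.7.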
Sampling & Decoding
KSampler Configuration: The workflow uses a high-speed 8-step Euler sampler with a simple scheduler.
Separation & Decoding: After sampling, the LTXVSeparateAVLatent node splits the results back into distinct video and audio streams.
Tiled Decoding: Visuals are decoded using VAEDecodeTiled to manage high-resolution output (1024x1024) efficiently.
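For readers unfamiliar with the sampler setting, the following is a toy sketch of an 8-step Euler loop of the kind the KSampler configuration refers to. It is not the workflow's actual code. The evenly spaced sigma schedule is a simplification (the real "simple" scheduler picks values from the model's own sigma table), and the denoiser is a dummy function that always predicts a fixed target instead of calling a model.

```python
def simple_sigmas(steps, sigma_max=1.0):
    """Evenly spaced noise levels from sigma_max down to 0 (simplified stand-in)."""
    return [sigma_max * (1 - i / steps) for i in range(steps + 1)]


def euler_sample(x, denoise, sigmas):
    """Run one Euler step per pair of consecutive sigmas."""
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        denoised = denoise(x, sigma)     # prediction of the clean sample
        d = (x - denoised) / sigma       # slope at the current noise level
        x = x + d * (sigma_next - sigma) # Euler update toward sigma_next
    return x


target = 0.25  # dummy "clean" value the toy denoiser always predicts
result = euler_sample(1.0, lambda x, sigma: target, simple_sigmas(8))
print(result)  # ends very close to the target after 8 steps
```

With only 8 steps each update covers a large jump in noise level, which is why low step counts like this are usually paired with distilled models.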
Enhancement & Output
NVIDIA RTX Integration: The workflow includes an RTXVideoSuperResolution node that provides a 2x hardware-accelerated upscale on the final image sequence.
Final Assembly: The VHS_VideoCombine nodes produce two versions of the video, a base generation and a super-resolved version, both featuring synchronized, AI-generated audio.
Comments (4)
Bro, this is clean.... 👍👍👍
I just get a static, non-moving image at the end. The workflow had all nodes already, no errors in terminal.
There's a cleanup node for that towards the end of the workflow; check to make sure it isn't bypassed. The Latent Crop node has the output latent pass through it to clean up those end frames.
Where's the LORA loader? Can't really use it without that.
