CivArchive

Welcome to my 💫🎦 Friendly LTX-2 T2V+I2V+Lipsync

LTX-2.3 is better at everything! Coming soon...

    โœจ Less mess, more magic

UniVibe, the all-in-one lipsync version with the high-quality VibeVoice TTS model, has been released.

New v1.2 with simplified model loading, plus quality and performance improvements.

LTX-2 is a new video generation model with 19B parameters under the hood. It is the first DiT-based (Diffusion Transformer) foundation model to generate synchronized audio and video simultaneously in a single pass! It supports native 4K resolution at up to 50 FPS, providing cinematic-grade fidelity suitable for professional VFX and film production, and it can generate clips of 10–20 seconds with consistent style and motion.

    ๐Ÿ’ป System requirements:

    • Minimum system requirements for 540p i2v and 720p t2v:

RTX 3000-series, 8 GB+ VRAM, 45 GB+ RAM, 8-core processor, SSD, latest ComfyUI

    ๐Ÿš€ Low VRAM optional optimization:

• For systems with low VRAM, use the --reserve-vram ComfyUI parameter in run_nvidia_gpu.bat: --reserve-vram 4 (or another number, in GB).
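As a sketch of where the flag goes, assuming the standard ComfyUI portable layout (your exact launch line may differ), run_nvidia_gpu.bat would look like this:

```shell
REM run_nvidia_gpu.bat -- append --reserve-vram to the ComfyUI launch line
.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build --reserve-vram 4
pause
```

Reserving 4 GB tells ComfyUI to leave that much VRAM free for the OS and other apps; raise or lower the number to fit your GPU.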

    ๐Ÿ“Œ Detailed tips and links to models in the workflow

    โœจ Workflow features:

    • Extremely user-friendly interface

• Maximum performance and optimization starting from 8 GB of VRAM: GGUF or the 8-step distilled model with an fp4 or fp8 text encoder + MultiGPU memory optimization

    • All-in-one: i2v, t2v, and interpolation

    • Convenient one-click mode switching

    • Generation time setting in seconds

• LoRA support (up to 3)

    • Detailed tips and links to all necessary models

    • Manual random seed for complete control over generations

    ๐Ÿค—๐Ÿ™๐Ÿผ Thanks to Lightricks Team

Original repo: GitHub

    Description

    • Even greater VRAM & RAM optimization

• Links to the new LTX-2 fp4, the best model for balancing quality and performance

    • VAE fixed and optimized

    • Bugs fixed

• Updated tips

    FAQ

    Comments (11)

vokar28 · Feb 9, 2026

    It doesn't seem to lipsync properly. The audio and video are generated, but they don't line up (no mouth movement at all). I prompted things like "she is saying:" with the text used in the audio section. Perhaps I missed something?

    RusselX
    Author
    Feb 10, 2026

@vokar28 This is a known issue on LTX-2, especially with vertical video; it is less common with horizontal videos. You can use the LTX-2-Image2Vid-Adapter LoRA, which often helps. There is a link in the workflow. The Lightricks team promises to fix this in future model updates.

vokar28 · Feb 10, 2026 · 1 reaction

@RusselX Cool, I'll give that a shot. Thanks!

hamajor · Feb 10, 2026

    Does this work for video 2 video?

    RusselX
    Author
Feb 10, 2026 · 1 reaction

@hamajor Not for now; only image-to-video and text-to-video.

waltuh_07 · Feb 17, 2026 · 1 reaction

Thanks for your work, but why didn't you include the model links and file locations in the description?

    RusselX
    Author
    Feb 19, 2026

@waltuh_07 Hi! You can find the links in the workflow, in the links section.

33251215a613 · Feb 25, 2026

Where does one find ltx-av-step-1751000_vocoder_24K.safetensors? I tried Google-fu and Claude, but they couldn't locate it on the open web. They said it was part of LTX-2's official release, but there's no trace of it to be found. It looks like a VAE, but I can't tell if it's for audio or video.

    RusselX
    Author
Feb 26, 2026 · 1 reaction

@33251215a613 You don't need that file separately. Load the audio VAE (LTX2_audio_vae_bf16.safetensors) and the video VAE (LTX2_video_vae_bf16.safetensors) and select them in the appropriate checkboxes in the LTX-2 Module.

jv12802224 · Feb 26, 2026

Hi,

When I tested this structure in the prompt, with the main speaker node (female sampler) and the speaker2 node (male sampler):

    [1]: female text....

    [2]: male text....

    [1]: female text....

The problem is that on the last [1], the male speaker says this text with a female voice. If I delete this last [1], there is no problem.

    Shouldn't [1] represent the main speech node and [2] represent the speaker2 node?

    RusselX
    Author
    Feb 26, 2026

@jv12802224 Hello! Yes, you are right: [1] is for the main speaker and [2] for the second speaker.
You should write without ":" and put each entry strictly on its own line, like this:
[1] female
[2] male
[1] female
Or you can use this format instead:
Speaker 1: female
Speaker 2: male
Speaker 1: female
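Since the two tag styles above are equivalent, here is a small illustrative Python snippet (not part of the workflow; the function name is hypothetical) that rewrites the bracket format into the Speaker format:

```python
import re

def to_speaker_format(script: str) -> str:
    """Rewrite '[n] text' lines as 'Speaker n: text' lines."""
    out = []
    for line in script.strip().splitlines():
        m = re.match(r"\[(\d+)\]\s*(.*)", line)
        # Lines without a [n] tag are passed through unchanged.
        out.append(f"Speaker {m.group(1)}: {m.group(2)}" if m else line)
    return "\n".join(out)

print(to_speaker_format("[1] female line\n[2] male line\n[1] female line"))
# Speaker 1: female line
# Speaker 2: male line
# Speaker 1: female line
```

Either form should alternate speakers line by line, matching the [1]/[2] numbering to the main and second speaker nodes.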

    Workflows
    LTXV2

    Details

    Downloads
    639
    Platform
    CivitAI
    Platform Status
    Available
    Created
    2/8/2026
    Updated
    5/13/2026
    Deleted
    -

    Files

    friendlyLTX2T2VI2V_univibe11Lipsync.zip


    Mirrors

    friendlyLTX2T2VI2V_univibe11Lipsync.zip
