Welcome to my 💫🎦 Friendly LTX-2 T2V+I2V+Lipsync
LTX-2.3 is better in everything! Coming soon...
✨ Less mess, more magic
UniVibe - the all-in-one Lipsync version with the HQ VibeVoice TTS model is released.
New v1.2 with simplified model loading and quality and performance improvements.
LTX-2 is a new video generation model with 19B parameters under the hood. It is the first DiT-based (Diffusion Transformer) foundation model that generates synchronized audio and video simultaneously in a single pass! It supports native 4K resolution at up to 50 FPS, delivering cinematic-grade fidelity suitable for professional VFX and film production, and it can generate clips of up to 10–20 seconds with consistent style and motion.
💻 System requirements:
Minimum system requirements for 540p i2v and 720p t2v:
RTX 3000-series, 8GB+ VRAM, 45GB+ RAM, 8-core processor, SSD, latest ComfyUI
🚀 Low VRAM optional optimization:
For systems with low VRAM, add the --reserve-vram parameter to the ComfyUI launch command in run_nvidia_gpu.bat:
--reserve-vram 4 (or another value, in GB).
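As a sketch, the launch line in run_nvidia_gpu.bat would look something like the following. The executable path here is the one used by the ComfyUI portable build; your install may differ, so only the appended --reserve-vram flag is the point of this example.

```shell
REM Example run_nvidia_gpu.bat line (ComfyUI portable build assumed).
REM --reserve-vram 4 tells ComfyUI to leave ~4 GB of VRAM free
REM for the OS and other applications; tune the number for your GPU.
.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build --reserve-vram 4
```

Start with a small reserve (e.g. 2) and raise it if you still hit out-of-memory errors.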
📌 Detailed tips and links to models in the workflow
✨ Workflow features:
Extremely user-friendly interface
Maximum performance and optimization from 8GB of VRAM: GGUF or an 8-step distilled model with an fp4 or fp8 text encoder + MultiGPU memory optimization
All-in-one: i2v, t2v, and interpolation
Convenient one-click mode switching
Generation time setting in seconds
Lora support (up to 3)
Detailed tips and links to all necessary models
Manual random seed for complete control over generations
🤗🙏🏼 Thanks to Lightricks Team
Original repo — GitHub
Description
❗For correct operation, you need to update the ComfyUI-GGUF node
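If you installed the node from git rather than through ComfyUI Manager, updating it can be sketched as below. The paths assume the standard custom_nodes layout of a ComfyUI install; adjust them to your setup, and run pip from the same Python environment ComfyUI uses (Manager users can simply update from the Manager UI instead).

```shell
# Assumes a git-based install of ComfyUI-GGUF under custom_nodes;
# adjust the path to match your ComfyUI installation.
cd ComfyUI/custom_nodes/ComfyUI-GGUF
git pull
# Refresh the node's Python dependencies in ComfyUI's environment.
pip install -r requirements.txt
```

Restart ComfyUI after updating so the new node version is loaded.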
The workflow is built on separate nodes from Kijai; links to the models have been updated
Links to the updated VAE and the lightweight distilled LoRA are included. The Q6 model and text encoder are recommended as the most balanced option
Resolution control logic has been changed: you can now switch between manual and auto resolution for both T2V and I2V generation
Added FPS adjustment from 16 to 60
Added a VRAM Optimizer plus video-memory load monitoring, allowing you to tune the optimal value for your system
The workflow has been cleaned of all unnecessary details to reduce RAM consumption
Tooltips have been updated
Comments (2)
Hi! I’m testing this workflow exactly as provided, without changing any nodes or models, and I’m consistently getting this error when it tries to load Gemma 3 GGUF as the text encoder:
Unexpected text model architecture type in GGUF file: 'gemma3'. The same error also happens with a Gemma 2 GGUF, so it doesn't seem to be a corrupted file. The error comes from ComfyUI-GGUF's loader.py during text model loading.
Could you please confirm:
which Python version you used to test this workflow (e.g. 3.10 / 3.11)?
and whether a specific ComfyUI-GGUF version or fork is required for Gemma 3 GGUF support?
I just want to align my environment with yours, without modifying the workflow.
Thanks!
@Marconasc_ Hi! I had the same error with an outdated GGUF node. I tested it on the latest ComfyUI nightly version with GGUF node 1.1.10 (no specific fork, just the regular node), and it works well.
My config: Python 3.13.9, torch 2.9.0+cu130