⚠️ If this workflow shows as updated, nothing has changed. I had to re-upload the file because the site has been broken lately.
This workflow is a modular and flexible text/image/audio-to-video generation system built in ComfyUI, designed to give full control over video creation using LTX-based models. It allows you to easily switch between multiple generation modes—such as text-to-video, image-to-video, lipsync, and fully guided animation—by enabling or disabling grouped nodes.
The pipeline supports advanced features including LoRA-based character and style conditioning, voice identity transfer (ID LoRA), custom or generated audio, and ControlNet-guided animation using reference videos. Users can also incorporate keyframe images for structured motion control or rely on a single reference image for consistent character appearance.
Performance and quality can be balanced through options like half-resolution sampling with 2× upscaling, as well as post-processing tools like the LTX detailer.
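As a rough illustration of the half-resolution option, the sketch below works out the sampling size and the final size for a given target resolution. The function name and the 1280×720 target are assumptions chosen for the example; the workflow itself sets these values through its own nodes.

```python
def half_res_plan(width, height, scale=2):
    """Return (sampling_size, final_size) when sampling at 1/scale resolution
    and upscaling by the same factor afterwards.

    Hypothetical helper for illustration only; not part of the workflow.
    """
    sample_w, sample_h = width // scale, height // scale
    final_w, final_h = sample_w * scale, sample_h * scale
    return (sample_w, sample_h), (final_w, final_h)


sampling, final = half_res_plan(1280, 720)
pixel_ratio = (sampling[0] * sampling[1]) / (final[0] * final[1])

print(sampling)     # (640, 360)
print(final)        # (1280, 720)
print(pixel_ratio)  # 0.25
```

Halving both dimensions means the sampler works on a quarter of the pixels per frame, which is where the speed gain comes from; the 2× upscale then restores the target size.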
Main features
Modular, toggle-based workflow (quickly switch modes)
Text, image, audio, and ControlNet-driven video generation
LoRA support (character, style, and voice via ID LoRA)
Custom or AI-generated audio with automatic syncing
Reference images + up to 4 keyframes (FFLF animation control)
ControlNet video guidance with hybrid reference support
Half-res sampling + 2× upscaling for faster high-quality results
LTX detailer for enhanced final output
Common setups
Text to video:
All bypassers disabled + Prompt + Default audio
Image to video:
Prompt + Reference image + Default audio
Lipsync:
Prompt + Reference image + Custom audio
Audio to video:
Prompt + Custom audio only
Character LoRA + voice reference:
Prompt + Character LoRA + ID LoRA + Default audio
Voice reference to video:
Prompt + ID LoRA + Default audio
OR
Prompt + ID LoRA + Reference image + Default audio
Character animation:
Prompt + ControlNet + Reference image + (Custom or Default audio)
First frame → last frame:
Prompt + Keyframe 1 + Keyframe 2 + (Custom or Default audio)
First → middle → last frame:
Prompt + Keyframe 1 + Keyframe 2 + Keyframe 3 + (Custom or Default audio)
Character animation with custom voice:
Prompt + Reference image + ID LoRA + ControlNet + Default audio
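The setups listed above can also be read as a lookup table: each mode is a set of optional inputs plus an audio source, with the prompt always present. The sketch below encodes that table in Python. The names `Setup`, `SETUPS`, and `matching_setups` are hypothetical, and the input labels are taken from the list rather than from the actual node or group names in the workflow. It also assumes that "All bypassers disabled" means no optional input is active.

```python
from typing import NamedTuple


class Setup(NamedTuple):
    inputs: frozenset  # optional inputs enabled in addition to the prompt
    audio: str         # "default", "custom", or "either"


# Labels mirror the setup list; they are not real node names.
SETUPS = {
    "Text to video": Setup(frozenset(), "default"),  # assumption: nothing optional enabled
    "Image to video": Setup(frozenset({"Reference image"}), "default"),
    "Lipsync": Setup(frozenset({"Reference image"}), "custom"),
    "Audio to video": Setup(frozenset(), "custom"),
    "Character LoRA + voice reference": Setup(
        frozenset({"Character LoRA", "ID LoRA"}), "default"),
    "Voice reference to video": Setup(frozenset({"ID LoRA"}), "default"),
    "Voice reference to video (with reference image)": Setup(
        frozenset({"ID LoRA", "Reference image"}), "default"),
    "Character animation": Setup(
        frozenset({"ControlNet", "Reference image"}), "either"),
    "First frame -> last frame": Setup(
        frozenset({"Keyframe 1", "Keyframe 2"}), "either"),
    "First -> middle -> last frame": Setup(
        frozenset({"Keyframe 1", "Keyframe 2", "Keyframe 3"}), "either"),
    "Character animation with custom voice": Setup(
        frozenset({"Reference image", "ID LoRA", "ControlNet"}), "default"),
}


def matching_setups(enabled, custom_audio=False):
    """Return the names of listed setups that match the enabled optional inputs."""
    enabled = frozenset(enabled)
    audio = "custom" if custom_audio else "default"
    return [
        name
        for name, setup in SETUPS.items()
        if setup.inputs == enabled and setup.audio in (audio, "either")
    ]


print(matching_setups({"Reference image"}))                     # ['Image to video']
print(matching_setups({"Reference image"}, custom_audio=True))  # ['Lipsync']
```

Combinations that return an empty list are simply not among the listed setups; that does not mean the workflow rejects them.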
Detailed instructions are contained in the workflow itself:
Red nodes are instructions and useful notes.
Yellow nodes are configurable elements you can adjust to your needs.

Description
First version release