Use LTX 2.3 as a Text-to-Speech (TTS) model.
This workflow is designed to generate audio-only output from a prompt and a speech sample. The generated audio clones the voice in the sample and applies it to the specified dialogue and prompt. If you want to make more than one video with a consistent character voice, you can't rely on LTX's random voice assignment. This workflow gives you a consistent voice that you can use for whatever spoken script you want.
Voice cloning is made possible by the ID-LoRA models and the new ComfyUI node, "LTXV Reference Audio". In my experience, the video generated with this LoRA isn't very reliable or high-quality, so I got better results by creating an audio file and applying that pre-made audio track to a new LTX 2.3 video based on a starting image. Those results haven't been 100% perfect either, but the success rate was higher than with any other method I tried.
IMPORTANT: If you haven't updated your ComfyUI installation since about March 25, 2026, you will need to run an update to get the new ComfyUI native node for ID-LoRA.
Please see the ID-LoRA GitHub page for important guidance on prompt formatting and usage. The default values in this workflow have worked well for me, but all the nodes are clearly exposed and labeled so you can tweak and experiment to get your own favorite results.
GitHub: https://github.com/ID-LoRA/ID-LoRA
One more piece of advice for generating audio tracks with LTX 2.3: The length of the generated audio clip is very important, probably more important than any other setting in the workflow. If the time is too short, LTX will rush through some of the script with almost no pause between sentences, and the result doesn't sound as natural as LTX is capable of. If the time is too long, LTX will stretch out pauses and sometimes repeat sections of the script. My recommendation is to say your script out loud to yourself at a normal conversational pace and time yourself doing it. Use that time as the duration for your clip and then adjust it longer or shorter as needed.
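If you want a rough starting number before you time yourself, a small script can estimate the duration from the word count. This is only a sketch: the function name, the 150 words-per-minute rate, and the 0.4 second pause per sentence break are my own assumptions, not values from LTX or ID-LoRA, and timing yourself reading the script is still the better guide.

```python
import re

def estimate_duration_seconds(script, words_per_minute=150.0, pause_per_sentence=0.4):
    """Rough clip length for a spoken script (all defaults are assumptions).

    words_per_minute: assumed conversational speaking rate.
    pause_per_sentence: assumed silence, in seconds, between sentences.
    """
    # Count word-like tokens (letters, digits, apostrophes).
    words = re.findall(r"[A-Za-z0-9']+", script)
    # Split on sentence-ending punctuation and ignore empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", script) if s.strip()]
    speaking = len(words) / words_per_minute * 60.0
    # One pause between each pair of sentences, none after the last one.
    pauses = max(len(sentences) - 1, 0) * pause_per_sentence
    return round(speaking + pauses, 1)

print(estimate_duration_seconds("Hello there. How are you today?"))  # 2.8
```

Treat the result as a first guess for the clip duration, then lengthen or shorten it depending on whether LTX rushes or stretches the script.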
Finally, don't be afraid to generate multiple audio clips if you have a long script with several breaks or pauses and LTX can't seem to get the pauses right. It's much easier to combine audio files into one track than it is to combine video, and there are lots of online tools to help with that. When you assemble your own final audio track, you can insert pauses as long or short as you want.
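If you would rather script the assembly than use an online tool, here is a minimal sketch that joins clips with pauses of your choosing, using only Python's standard `wave` module. It assumes the clips have already been exported or converted to uncompressed PCM WAV files (16-bit or wider) with matching sample rate and channel count. The function name and file names are hypothetical and not part of the workflow.

```python
import wave

def join_wavs_with_pauses(clip_paths, pauses_seconds, out_path):
    """Concatenate WAV clips, inserting pauses_seconds[i] of silence after clip i."""
    if len(pauses_seconds) != len(clip_paths) - 1:
        raise ValueError("need exactly one pause between each pair of clips")
    params = None
    chunks = []
    for i, path in enumerate(clip_paths):
        with wave.open(path, "rb") as clip:
            fmt = (clip.getnchannels(), clip.getsampwidth(), clip.getframerate())
            if params is None:
                params = fmt
            elif fmt != params:
                raise ValueError(f"{path} has a different format: {fmt} vs {params}")
            chunks.append(clip.readframes(clip.getnframes()))
        if i < len(pauses_seconds):
            channels, width, rate = params
            silent_frames = int(round(pauses_seconds[i] * rate))
            # Zero bytes are silence for signed PCM (16-bit or wider);
            # 8-bit WAV is unsigned and would need a different fill value.
            chunks.append(b"\x00" * (silent_frames * channels * width))
    with wave.open(out_path, "wb") as out:
        out.setnchannels(params[0])
        out.setsampwidth(params[1])
        out.setframerate(params[2])
        out.writeframes(b"".join(chunks))
```

For example, `join_wavs_with_pauses(["part1.wav", "part2.wav", "part3.wav"], [0.8, 1.5], "full_track.wav")` would put 0.8 seconds of silence after the first clip and 1.5 seconds after the second.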