CivArchive
    LTX2.3 | Z-Image + Ollama T2I to I2V | 12GB Friendly - v1.0
    NSFW

    LTX 2.3 I2V | LLM Enhancer | T2I -> I2V | Detailer/Upscale

    Features:

    Turn any text-to-image (T2I) result into an image-to-video (I2V) clip.

    Z-Image T2I:

    • Uses an initial, relatively simple prompt

    • Enhances the prompt via Ollama

    • Image selection (if more than one image is generated as a draft)

    • Adds detail to the selected image and upscales it (via SeedVR2)
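
    The prompt-enhancement step above can be sketched outside ComfyUI as a plain call to a local Ollama server. This is only an illustration, not the workflow's actual node setup: the instruction text and the helper names are hypothetical, and the model tag is assumed to match the name mentioned in the note further down (check `ollama list` for the real tag on your machine).

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_TAG = "qwen3-vl-abliterated-8b"  # assumption: replace with the tag `ollama list` shows

# Hypothetical instruction; the workflow's real system prompt is not shown in this listing.
INSTRUCTION = (
    "Rewrite the following short idea as one detailed text-to-image prompt. "
    "Describe subject, setting, lighting and camera framing. Return only the prompt."
)


def build_enhance_request(short_prompt: str, model: str = MODEL_TAG) -> dict:
    """Build the JSON body for a non-streaming /api/generate call."""
    return {
        "model": model,
        "prompt": f"{INSTRUCTION}\n\nIdea: {short_prompt.strip()}",
        "stream": False,  # ask for one JSON object instead of a token stream
    }


def enhance_prompt(short_prompt: str, url: str = OLLAMA_URL) -> str:
    """Send the request to a running Ollama server and return the enhanced prompt."""
    body = json.dumps(build_enhance_request(short_prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=300) as resp:
        return json.loads(resp.read())["response"].strip()
```

    Calling `enhance_prompt("a lighthouse at dusk")` requires Ollama to be running locally with the model already pulled; `build_enhance_request` can be inspected on its own without a server.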

    LTX 2.3 I2V:

    • The image and a (short context) prompt are sent to Ollama to produce an enhanced prompt.

    • Upscale sampler
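
    For the I2V side, a vision-capable model receives the image together with the short context prompt. A minimal sketch of what such a request body might look like, assuming Ollama's `/api/generate` endpoint, which accepts base64-encoded images in an `images` list. The function name, the instruction wording, and the model tag are assumptions for illustration, not taken from the workflow file.

```python
import base64

# Hypothetical instruction; the workflow's own wording is not part of this listing.
I2V_INSTRUCTION = (
    "Look at the attached image and write one detailed image-to-video prompt "
    "describing motion, camera movement and sound. Use this context: "
)


def build_i2v_request(
    image_bytes: bytes,
    context_prompt: str,
    model: str = "qwen3-vl-abliterated-8b",  # assumed tag, adjust to your install
) -> dict:
    """Build a non-streaming /api/generate body that carries one image."""
    return {
        "model": model,
        "prompt": I2V_INSTRUCTION + context_prompt.strip(),
        # Ollama expects each image as a base64 string, not raw bytes.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }
```

    The resulting dictionary can be serialized with `json.dumps` and posted to a local Ollama server; the model's reply would then serve as the LTX prompt.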

    The Ollama steps can be skipped by flipping their true/false switches. The same goes for the T2I stage if you want to supply your own image.
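
    As a rough mental model of how those switches change the pipeline, the sketch below lists which stages would run for a given combination. The flag names, stage labels, and exact stage order are hypothetical; the real switches are nodes inside the ComfyUI workflow.

```python
def plan_stages(
    use_t2i: bool = True,
    enhance_t2i: bool = True,
    enhance_i2v: bool = True,
) -> list[str]:
    """Return the stages that would run for the given switch settings."""
    stages = []
    if use_t2i:
        if enhance_t2i:
            stages.append("ollama: enhance T2I prompt")
        stages += [
            "z-image: generate drafts",
            "select image",
            "detail + SeedVR2 upscale",
        ]
    else:
        # T2I switched off: the user provides the start image directly.
        stages.append("load user-supplied image")
    if enhance_i2v:
        stages.append("ollama: enhance I2V prompt from image + context")
    stages += ["ltx 2.3: image to video", "upscale sampler"]
    return stages


# Example: own image, no Ollama at all.
for stage in plan_stages(use_t2i=False, enhance_i2v=False):
    print(stage)
```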

    Runs without issues on a 12GB GPU with a Q8 GGUF model.

    Note: For both the T2I and I2V Ollama enhancement, I'm using the qwen3-vl-abliterated-8b model myself (I still need to evaluate a Gemma-based model).


    Comments (8)

    Jellai
    Mar 29, 2026
    CivitAI

    Are these examples using the full model? fp8/Q8? I assume not distilled because the quality is too good.

    dutchit288
    Author
    Mar 29, 2026 · 1 reaction

    All with regular Q8 GGUF (not distilled). Most with the Vantage one (from https://huggingface.co/vantagewithai/LTX-2.3-GGUF/tree/main/dev), some others with Unsloth.

    Jellai
    Mar 29, 2026

    @dutchit288 That's a relief. I grabbed the Unsloth one. Do you think it's about the same as Vantage, or are you team Vantage now? Thanks for answering these questions, btw.

    dutchit288
    Author
    Mar 29, 2026 · 1 reaction

    @Jellai The Vantage one does seem to be more consistent on longer (30 sec) videos, but I haven't compared them enough to say that reliably.

    fakolonya
    Apr 26, 2026
    CivitAI

    Can I ask how a 12 GB VRAM system manages to run a Q8 GGUF model of around 24 GB?

    Also, how long does it take to make a video? (Please also provide length and resolution info.)

    dutchit288
    Author
    Apr 26, 2026 · 1 reaction

    You'd probably need to ask the ComfyUI (and/or LTX) developers about the first one :), but it runs quite fine. Newer ComfyUI versions use VRAM offloading.
    As for generation time and length/resolution: usually between 10 and 14 minutes for up to 30 seconds at 1408x769, and close to 20 to 25 minutes when pushing it to 40 seconds (any longer and quality seriously degrades in my experience).
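
    To put the figures in this exchange side by side, here is a small back-of-the-envelope calculation. It only uses the numbers quoted above (a roughly 24 GB model, a 12 GB GPU, and the reported render times), treats the 30-second figure as the full clip length, and counts weights only, so the offload amount is a lower bound rather than a measurement.

```python
def seconds_per_video_second(render_minutes: float, clip_seconds: float) -> float:
    """Render time spent per second of finished video."""
    return render_minutes * 60 / clip_seconds


def min_offload_gb(model_gb: float, vram_gb: float) -> float:
    """Weights that cannot fit in VRAM at once (ignores activations, VAE, text encoder)."""
    return max(0.0, model_gb - vram_gb)


# Reported ranges: 10 to 14 min for a 30 s clip, 20 to 25 min for a 40 s clip.
for minutes, clip in [(10, 30), (14, 30), (20, 40), (25, 40)]:
    rate = seconds_per_video_second(minutes, clip)
    print(f"{minutes} min for {clip} s clip: {rate:.1f} s of rendering per video second")

print(f"At least {min_offload_gb(24, 12):.0f} GB of weights must live outside VRAM")
```

    On these numbers the cost per video second rises from roughly 20 to 28 seconds at 30 s clips to 30 to 37.5 seconds at 40 s clips, so longer clips are proportionally more expensive as well as lower in quality.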

    fakolonya
    Apr 27, 2026

    @dutchit288 Can you tell me exactly which models you use in this workflow, including the Z-Image model, if that's no problem? I will download them and test the workflow with the same settings you posted. For example, I couldn't find the Z-Image model that is shown in the workflow (the pornmaster one).

    Workflows
    LTXV 2.3

    Details

    Downloads: 790
    Platform: CivitAI
    Platform Status: Available
    Created: 3/28/2026
    Updated: 5/14/2026
    Deleted: -

    Files

    ltx23ZImageOllamaT2ITo_v10.zip

    Mirrors

    HuggingFace (1 mirror)