    LTX-2.3 All-In-One workflow for RTX 3060 with 12 GB VRAM + 32 GB RAM - v2.0

[edit:

24.04.2026: Update to version 4.3 (see version description).

Minor update and bug fix.

Thanks to all users for the many inputs over the last days and weeks 🙂

Attention:

If you struggle with node conflicts or get errors while running the workflow, please have a look at my short Troubleshooting Guide note in the workflow first. Most important: update all components successfully!]

    Special thanks to:

@ArcleinSK for investigating and solving the FLF issue, as well as pushing the First-Mid-Last Frame option forward, and last but not least for sharing fantastic knowledge.

@boinobin730 for initiating, driving and supporting this project in all kinds of ways: providing links, running tests, sharing knowledge and inspiring discussions.

@Urabewe for publishing the original, perfectly running 12 GB VRAM LTX-2.3 workflows that this workflow is mainly built on.

    Features:

Simple-to-use all-in-one LTX-2 workflow with options for:

    • Text to Video

    • Image to Video

    • First/Last Frame to Video

• First/Mid/Last Frame to Video

    • Video to Video

    • Text + Audio to Video

    • Image + Audio to Video

    • First/Last Frame + Audio to Video

    • First/Mid/Last Frame + Audio to Video

• easy switching between all options,

• all steps highly automated: no manual frame or width/height calculations necessary,

• easy input via predefined sliders and aspect ratio presets (no risk of setting wrong frame counts or width/height values),

• completely automated resizing and cropping (if necessary) of your input images/videos.

    • brilliant audio generation (speech/sound) with LTX-2.3.

    LTX-2.3 specifications:

Workflow version v4.3 consistently follows the LTX-2.3 specifications for 16:9/9:16 aspect ratios, including automatic width/height calculations as well as automatic input image/video resizing/cropping.

In addition, you can now simply choose any other aspect ratio to suit your needs, while still getting the right width/height values calculated and automatic image/video resize/crop.
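For reference, a rough sketch of the math these calculations boil down to (plain Python, a hypothetical helper, not one of the workflow's nodes): pick the longer side, derive the shorter side from the aspect ratio, and snap both to multiples of 32.

    def snap_resolution(long_side: int, aspect_w: int = 16, aspect_h: int = 9):
        long_side = (long_side // 32) * 32                              # longer side: floor to a multiple of 32
        short_side = round(long_side * aspect_h / aspect_w / 32) * 32   # shorter side: nearest multiple of 32
        return long_side, short_side

    print(snap_resolution(1536))  # -> (1536, 864), one of the recommended 16:9 resolutions
    print(snap_resolution(1024))  # -> (1024, 576)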

    Requirements:

• GPU with 12 GB VRAM (some users reported they got it running with 8 GB too),

• 32 GB RAM,

• Swap file size: 64 - 128 GB.

    Speed and video length:

Runs very fast: a 5-second video (1280 x 864) takes < 10 minutes.

Generation of long, high-quality videos in one run is possible: 10 - 20 seconds without any issues.

Test run: a 30-second video (1024 x 704) took around 40 minutes without any OOM errors. Longer videos might be possible, but have not been tested yet.

    Important:

This workflow is intended for advanced ComfyUI users who know how to install and operate the system and are able to resolve basic system errors themselves, such as node conflicts or general system issues.

    About this workflow:

    This workflow is mainly based on the fantastic LTX-2.3 workflows of @Urabewe.

As far as I know, those were the first workflows running LTX-2 with 12 GB VRAM. All credit goes to the original creator.

My job was only to combine and organise the different workflows into a simple-to-use all-in-one design.

    Description

    • Added First/Last Frame to Video + Audio option

    • Links to Spatial Upscaler models corrected


    Comments (178)

aigenre190226 · Mar 14, 2026

AMAZING. FAST GENERATIONS AND EZ TO USE. THX SO MUCH. I'm using --use-sage-attention in the run_nvidia_gpu.bat file, and the generation time went from 15 minutes to 7 minutes. Then I use the REAL Video Enhancer program (available on GitHub) for upscaling and interpolation, and in 8-9 minutes I have amazing videos.
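For reference, that flag goes on the ComfyUI launch line. A minimal sketch of run_nvidia_gpu.bat, assuming the standard portable-build layout (paths may differ on your install):

    .\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build --use-sage-attention
    pause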

arkinson (Author) · Mar 14, 2026

@aigenre190226 Thank you for your feedback, and I'm glad you like it 🙂

Yes, if it's installed, always use SageAttention.

Upscaling/frame-rate multiplying: with LTX, always try to generate at the highest resolution possible first. My "max" settings are reduced, speed-optimised values for 12 GB VRAM. If you have a better GPU, increase the resolution for significantly better quality.

Before I released the first LTX workflow I did a couple of tests with built-in ESRGAN 2x upscaling and RIFE frame-rate multiplying, as I do in my Wan workflows. What I saw was that it costs a lot of time, but the quality increase is only marginal.

rupertmurdochtown816 · Mar 14, 2026

Hi, I'm new to LTX generation and noticed something strange. I'm using the exact setup of this workflow for T2V, but with quite a few LoRAs. In the preview node, the early blurry steps have amazing composition and camera dynamics. But right at the end the scene shifts and becomes much less dynamic, almost like it's being generated a second time and overriding the first. Sure, it now has the LoRAs' benefits, like jiggle physics and stuff, but the scene's dynamics are completely gone. How can I lock in the motion from those early steps while keeping the benefit of the LoRAs?

arkinson (Author) · Mar 14, 2026

@rupertmurdochtown816 I haven't much experience with additional LoRAs, but I guess a lot of testing is necessary, especially if you try to use more than one LoRA. I would start very simply and systematically: short clips at low resolution, just one LoRA + different seeds and prompts. Step by step you can try more.... In my experience, the right LTX prompting is the most powerful "tool". And of course, experiment with different clip lengths.

@arkinson Yeah, it seems the LoRA affects the end result too much. It's truly peculiar how LTX compares to Wan: without LoRAs, LTX T2V is truly amazing, but it's a shame the physics part of LTX needs some work.

arkinson (Author) · Mar 14, 2026

@rupertmurdochtown816 Yes, that's right. And really have a look at the LTX prompting style if you come from Wan, because it is quite different and LTX seems to be very prompt-"sensitive". I still have to learn it myself. But from what I saw, a good systematic prompt can change a lot, even if most of us are always on the trial-and-error path 😂

avinashvijayan96828 · Mar 14, 2026


AttributeError: 'VAE' object has no attribute 'latent_frequency_bins' - how do I fix this?

arkinson (Author) · Mar 14, 2026

@avinashvijayan96828 Are all components up to date? Did you use my Troubleshooting Guide? Did you read the comments here? We have had the VAE issue several times.

DNFY · Mar 15, 2026

    I got that error when I selected the wrong VAE. Check if you selected all correct VAEs and also update your nodes.

aigenre190226 · Mar 14, 2026

AWESOME

10 SECONDS OF VIDEO -> Prompt executed in 413.92 seconds

THAT'S 7 MIN

I'M USING THE NEW DEVNVFP4 AND SageAttention

arkinson (Author) · Mar 14, 2026

    @aigenre190226 Uhh, what is DEVNVFP4???

arkinson (Author) · Mar 15, 2026

@LewdAnimeEmpire Thank you for the link. I had a short look at it. To be honest, I'm afraid I'll never really get the differences between all these models, no matter how often I try to understand them 🙄 Did you try it?

LewdAnimeEmpire · Mar 15, 2026

@arkinson I tried to use it, but an error appeared. I didn't dig deep to solve it; later I'll test to see if I can at least get more speed in video generation.

arkinson (Author) · Mar 15, 2026

@LewdAnimeEmpire Do you know if it will run with 12 GB VRAM?

aigenre190226 · Mar 15, 2026

@LewdAnimeEmpire Install SageAttention with CUDA 13 and PyTorch 2.10.x; you must also search for the .whl file matching your setup here -> https://github.com/woct0rdho/SageAttention/releases/tag/v2.2.0-windows.post4
Installing SageAttention took me like a day or two. But after learning how it works, you'll be able to reinstall SageAttention in 10 minutes, very easy.
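As a hypothetical example of the install step (the wheel filename below is a placeholder; pick the real one matching your Python/CUDA/PyTorch build from the release page above):

    python -m pip install "sageattention-2.2.0+cu130torch2.10.0-<your-python-tag>-win_amd64.whl"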

aigenre190226 · Mar 15, 2026

@arkinson I'm using an RTX 3060 12 GB + 32 GB RAM + a 128 GB page file.

aigenre190226 · Mar 15, 2026

@LewdAnimeEmpire Reinstall all of transformers, PyTorch, Triton, etc. It will be very hard at the beginning, but you'll learn a lot. Also install and activate SageAttention in the .bat file. I asked Gemini for a complete tutorial and it worked for me.

arkinson (Author) · Mar 16, 2026

@aigenre190226 Thank you. Could you please specify the exact version, files and download directories of the model you used? I only had a short look at the link @LewdAnimeEmpire published here, but there are a lot of different versions and open questions about the right files/directories.

mHn6ru9B3 · Mar 14, 2026

The workflow looks great and is well organised; unfortunately, all output videos are completely blurred. When I use the original ltx-23-22b-gguf-workflows-12gb-vram workflows I get correct output, but not with this one... I checked and rechecked all models and they are the correct ones.

delta45424155 · Mar 14, 2026

Did you add the LoRA? I forgot to and got blurry output.

arkinson (Author) · Mar 14, 2026

@mHn6ru9B3 All components up to date? No update errors? No node conflicts? Which workflow version? Blurry outputs with all options? Which ComfyUI version, which release version? Really no model mismatch? Sorry, without useful information, anything is just guessing.

arkinson (Author) · Mar 14, 2026

@delta45424155 Our responses overlapped. But I guess you are right 🙂

mHn6ru9B3 · Mar 15, 2026

Thanks to both @delta45424155 and @arkinson: it was the LoRA, as @delta45424155 suggested!! I was expecting to get an error for any missing model.

LewdAnimeEmpire · Mar 14, 2026

    Everything's ok and working 👏 Now we just need more speed to generate each video 😁

arkinson (Author) · Mar 14, 2026

    @LewdAnimeEmpire Much more speed 😂

delta45424155 · Mar 14, 2026

Thanks a lot, OP. Now I'm going to go to Micro Center and trade in my 5080 for a 5090. And it is all your fault. I hope you're happy with yourself.

arkinson (Author) · Mar 14, 2026

@delta45424155 My apologies for any inconvenience 🤣 and happy generating 🙂

DankOwl · Mar 14, 2026

MediaMixer status: Install Error

Node 'ID #201:616' has no class_type. The workflow may be corrupted or a custom node is missing.

    Nothing I do can fix the FinalFrameSelector node

arkinson (Author) · Mar 15, 2026

@DankOwl Use my Troubleshooting Guide (system updates + solving your node conflicts).

nemikin333799 · Mar 15, 2026

Does video-to-video work as Extend, or like Retake too?

arkinson (Author) · Mar 15, 2026

@nemikin333799 V2V extends your input video without any visible changes.

You would get a similar result if you took the last frame of your existing video, ran I2V and finally merged both videos.

But V2V works much better, because it extends the given audio + video "style" seamlessly (even if the last frames of your start video might be blurry). The only downside is that it takes longer, because the input video has to be processed/re-generated too.

DNFY · Mar 15, 2026

Thanks for the workflow! It worked great after fixing some errors (outdated nodes and a missing LoRA).
Specs:
RTX 3060 with 12 GB VRAM
64 GB system RAM
16 GB swap
10 seconds, 832x1216, T2VA
First run 660 seconds, subsequent runs ~429 seconds.

arkinson (Author) · Mar 15, 2026

    @DNFY Thank you for your feedback and happy generating 🙂

Eliz99 · Mar 16, 2026

Hii! I'm so new here, can someone share a correct workflow for my GPU? It's an RTX 4090. Can you help me? Thanks so much in advance! 🤗✨

boinobin730 · Mar 16, 2026

The current workflow 2.0 works fine. Just ensure your ComfyUI is up to date, install any missing nodes, then download the models, CLIP, VAE, etc.

Eliz99 · Mar 16, 2026

    @boinobin730 Hii! But do you know which one (model) I should download for my GPU? I mean, so I can make full use of the 24 GB of VRAM 😊

arkinson (Author) · Mar 16, 2026

@Eliz99 Just start with the provided models and be happy with the extremely short generation times 😉 As you get more experienced, first increase the resolution as high as possible to get much better quality. After that you might start experimenting with better models, higher frame rates, lots of testing, etc., or just look for workflows which better suit your hardware.

Zombovich · Mar 18, 2026

    @Eliz99 Use the Q8 quant

boinobin730 · Mar 16, 2026

@arkinson Great job on the first frame/last frame workflow. It's very useful for my purposes. If you have the patience, you can pretty much create a short film with it. I am not sure why I can't add to our last conversation. Maybe because of the Civitai Australia ban. I have turned on a free VPN that comes with the Opera browser. If you can read this, it's working atm. Catch u later.

arkinson (Author) · Mar 16, 2026

@boinobin730 Nice to see you back here, and yes, your VPN seems to work 🙂

Thank you for testing the first/last frame. I got only poor results with my own quick-and-dirty tests (I just used the same image for the first and last frame, and a clip length too short to get anything "artistic" out of it).

Kolompoi provided an interesting link (see the discussion here) for video-guided LTX generation. If I get it to run I will implement it too, because it looks very interesting.

boinobin730 · Mar 16, 2026

@arkinson There seems to be more rapid development again in regard to workflows and LTX, and even better video upscaling. I just installed the Nvidia RTX Nodes. It upscales very well up to x2; have a look if you are interested. https://github.com/Comfy-Org/Nvidia_RTX_Nodes_ComfyUI

You may need to upgrade your pip, since an old pip might not allow installation of the node.

I threw my example into your gallery. I didn't use the last frame of the generated clip; I used another workflow to create another shot angle and used that as a guide. As others have said, the last frame usually gets very corrupted and distorted. I was thinking that an output of the last 5 frames would allow users to pick the best frame to use as a base for the first frame input, but I didn't get around to working it out.

arkinson (Author) · Mar 16, 2026

@boinobin730 Is it just me, or is Civitai still not working? The gallery here at the model page is empty, and my videos published over the last few days are shown neither in my video gallery nor in the overview gallery 😕

Thank you for the link to the upscaler node. I will have a look at it. One user is "panicking" about new models (see here and here) but without any useful information. I'm not sure yet if this is just trolling or if there is a real background to it....

As mentioned, I can't see your example here. And if I open your video gallery, the latest videos I can see are 7 days old 😕

Selecting the last five frames: I searched for "comfyui select frame" and found the Frame Utility nodes. I have not used them yet, but it should be possible to select the desired frames. Maybe some simple math is necessary, like total frames - 5, for example (see the sketch below). If you'd like to test it, I would start with a simple workflow for quick testing, like this: Load Video -> Get total frames -> your math -> select frame -> preview image. Let me know if you get it working. Or when I find some time, I can have a look at it too.
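The "simple math" really is one subtraction; a plain-Python sketch with hypothetical numbers (in the workflow this would be a small math node, not code):

    total_frames = 241                                   # value coming from a Get Total Frames node
    last_five = list(range(total_frames - 5, total_frames))
    print(last_five)                                     # -> [236, 237, 238, 239, 240]
    # feed the chosen index into a Select Frame node, then preview it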

Video-guided LTX generation: I finally got it running (it only runs with the distilled models). It works similarly to what we did with the Wan video-guided workflow - if you can still remember 😂 However, the results with LTX are amazing. The original workflow needs a very long processing time and various manual inputs - I will see if I can "tweak" it for faster running and better usability....

boinobin730 · Mar 17, 2026

@arkinson You may be right about the galleries still being broken. I posted a lot of examples and it took a while for them to show up in my feed. They have only just appeared in the gallery examples for this workflow. But there has been no response from other users (no funny faces or likes), so I think a lot of people, including yourself, don't have access yet.

The workflow is still working fine for me. I just upgraded to ComfyUI 0.17.2 with Manager 0.39.2. I'm not sure what that dude is whinging about. Honestly, your workflows are pulling in a lot of newbie ComfyUI users, so they are crying out for help with basic stuff, and you have become the local ComfyUI help desk.

I even downloaded ltx-2-3-22b-dev-Q8_0.gguf and it works on my 16 GB VRAM. I rendered a 1280x704 first frame/last frame video clip, 20 seconds long, no Sage, and it took 20 minutes to create. I don't know whether it's because ComfyUI and the nodes are updated; it's just amazing. The quality is really good now. That was the 2nd video render, so not a first-off generation. I had always been running q4_m.gguf before the update, and 20 seconds took 30 minutes and would often freeze and get stuck. What was amazing is that the stronger model possibly gave me the best output. Previously I kept getting bad generations; for example, the person was speaking, the firefighters were not moving, the fire was a static picture, then the camera zooms into the action, the firefighters start moving a little and the fire starts - it was really bad. I cranked up the detail LoRA and finally got a good generation after about 8 tries. If my example shows up you can see what I am talking about; that example was created with the old q4_m model.

Ok. So with the LTXV2 first frame/last frame, at the end it will do a little shuffle move, display a few seconds of the first frame and then go back to the last frame. It does this with every generation; I don't know why, as I am not technical enough. I had to remove those frames in post using DaVinci Resolve. It has also been noted that a lot of those last frames will show words, closing-credits-type text or massive colour artifacts, so that the last frame is unusable. So any frame picking needs to accommodate that small shuffle move, bypassing it so it captures the true last frame, if that makes sense. I'm going to be busy this week, so I can't play with ComfyUI too much, but I will see what I can do.

Video-guided LTX? Are you talking about V2V? I tried it in your workflow and it stalled, or seemed to stall. I got frustrated after waiting for a long time, so I never tried it properly. I will try it again since ComfyUI got updated.

This stuff is too addictive now. I'm being blown away by the speed of development. They have developed a Wan-Animate-type LoRA and workflow, i.e. movement matching.

https://www.reddit.com/r/comfyui/comments/1ru7itb/comfyui_tutorial_vid_transformation_with_ltx_23/

Some dude did an inpaint workflow with SAM3:

https://www.reddit.com/r/StableDiffusion/comments/1rvgmfp/i_like_to_share_my_ltx23_inpaint_whit_sam3/

Open source is such a gift to us.

boinobin730 · Mar 17, 2026

@arkinson I think what the guy is saying is: go for the FP4 models. https://civitai.com/models/2445970?modelVersionId=2750197

They are apparently producing even better outputs. I need to download and check. He posted examples in the gallery; they do look good, but I don't know his specs and I didn't check which checkpoint and LoRAs he used until just now.

arkinson (Author) · Mar 17, 2026

@boinobin730 Just a short answer first: "I think what the guy is saying is: go for the FP4 models". The problem is that there is a lot of mismatch in what he is saying. In the first post we can assume he is talking about FP4, and he answers my question with: "I have a RTX 3060 12 GB". In the last post he recommends the FP8 model 🙄 Anyway, I believe you are right. I have not found the time yet to dive into this stuff.

arkinson (Author) · Mar 17, 2026

@boinobin730 Ok, I will try to answer "all" the other points too now 🙂

Q8 model: sounds very good, and I really believe that there is a big difference. Maybe I will give it a try on my mini-machine too.

By chance I got to see some of your videos. This one seems to be Q8-generated?

Yesterday I saw some of your videos with the "artefacts" at the last frames. I guess these were FLF2V generations? So, if I understand you right, your idea was to remove the last "artefact" frames, not to select some frames for extending the video in a new generation?

One guy recently gave a hint about resolution issues (without specifying which option he was talking about). I'm not sure if this is relevant in practice, but I will have a look at it.

Video-guided LTX: Noooo! 😆 Your short-term memory seems to be even worse than mine 🤣🙂 Look at the link in my first comment here. You can use an existing video for ControlNet + a prompt/start image to generate a completely new video guided by the motion of the input video. It is really cool! But the given workflow is much too slow at the moment.

I have not looked at your latest links yet. Too much stuff in too short a time - that's the flip side of open AI 😅

boinobin730 · Mar 17, 2026

@arkinson The firefighter clip was a Q4, cherry-picked, edited clip, definitely not straight FLFV. So it was a highly controlled experiment to see whether suitable storyline continuity can be achieved by guiding LTXV 2.3 through images. And definitely not pure I2V generation where you just use the last frame to make a new clip; I2V clip editing is needed to achieve this. Yes, the artifact frames need to be removed for this to work.

What the guy is saying about the end breakdown makes sense, but does that mean we are restricted to a very specific set of sizes for optimal video generation? Does it mean initial image seed generation also needs to be regulated for best results, to avoid end artifacts?

My wife always complains about my memory, so this is not surprising. Ok, I didn't follow his link initially. Yes, that is the animated control I mentioned before. I will have a look at it when I get the chance. As you mentioned, Wan Animate did something similar, but we were limited by time and it took a lot of time to create.

The Q8 model does work well after some more basic tests. The detailer LoRA definitely needs to be on for any sort of active elements, like fire and water. I will post some Q8 examples to the gallery soon.

arkinson (Author) · Mar 17, 2026

@boinobin730 Just a short one for today.

"we are restricted to a very specific set of sizes": No. The resolution (width and height) simply should be divisible by 32. I had, of course, taken that into account:

T2V: you can only choose the right values from the "Video Width and Height" node.

I2V and V2V: I've configured the "Image Resize" sliders so that you can only set the correct values. There is only one little issue: this way only the longer side gets the right value, and the shorter side might be just a bit off.

But I'm pretty sure this is all mostly theoretical. In practice we are more limited by the model itself. In my quick tests I cannot see any difference.....

Anyway, the math for also correcting the shorter side is simple. But there is one thing in my workflow I do not like: at the moment I have to maintain 6 options with 6 complex subgraphs. There are only small differences between the subgraphs for each option, so I would like to run only one subgraph for all options. This would make maintenance and all future editing much easier. Unfortunately, ComfyUI provides no true "sub-programming", only the weird "bypass" logic. So I am currently struggling a lot to find a solution which is usable and understandable for "common" users and, finally, for myself 🙄

boinobin730 · Mar 17, 2026

@arkinson I don't actually think your workflow is broken. Maybe it's not optimal as far as that user is concerned, but it works. So maybe I'm just confused: the reason you are attempting to adjust the workflow is to avoid the artifacts at the end? With this in mind, if we use an optimal frame rate with the current existing workflow, we shouldn't see artifacts? I am going to post my Q8 examples, just for testing so you can see the reproduction quality; no effort went into generating interesting output. I will post gen times as well.

arkinson (Author) · Mar 18, 2026

@boinobin730 Uhh - sorry, that was probably incomprehensible, so you got me completely wrong. The existing workflow is of course fine for use/generating. And as mentioned, I do not see a problem in the small issue of the not-quite-exact image resizing.

I meant something completely different: <MY own> specific problem with maintaining/editing/further developing the workflow is caused by the currently 6 separate groups + subgraphs (one for each option - and sooner or later maybe 7 or more). That's why even simple editing ends up being a lot of dumb work - you have to do it in every single group/subgraph. I simply realised that the current design is a dead end for further development.

So I'm currently working on a complete re-design of the workflow towards a very simple layout (with quite a bit more complex logic in the background, of course). Yesterday I found some custom nodes which are perfectly suited to controlling the necessary logic in a user-friendly way. The main concept is ready, but I am still struggling with a lot of "bugs" and issues in the details. But I'm optimistic, and I think it will become much easier to use - on both ends: "user experience" and maintenance....

boinobin730 · Mar 19, 2026

@arkinson No worries. I am a bit slow on the technical aspects with regard to ComfyUI, so I must apologise for not understanding the complexity involved in moving forward. I think that's why a lot of people make 4-5 separate workflows, one for each task, as it makes troubleshooting easier. Let me know if you want me to test functionality; I can help out with that at least.

boinobin730 · Mar 19, 2026

@arkinson I found this link regarding an LTX workflow with multiple frame injections. I'm not suggesting you do this, but the layout/workflow setup may help you in reconfiguring your workflow. https://www.reddit.com/r/StableDiffusion/comments/1rxl1ta/ltx23_jason_statham_in_30min_or_its_free_teaser/

https://drive.google.com/file/d/1w5jiaPFzMhOCGLe8UUKfO1pM0roxvELg/view?usp=sharing

The workflow is on the Google Drive. He is running serious hardware (an RTX 6000), so keep that in mind.

boinobin730 · Mar 19, 2026

@arkinson You must be thinking I'm leading you astray. Sorry, the mods just killed the post, I don't know why. But it was impressive considering it was done with LTX. Anyway, the workflow might give you clues as to LTX workflow knowledge.

arkinson (Author) · Mar 19, 2026

@boinobin730 I just had a short look at your linked workflow. If I got it right, it is a multi-frame solution. Interesting, and something to keep in mind - because this could be used to replace the first/last frame option with a general multi-frame option...

"...why a lot of people make 4-5 separate workflows....": No, the opposite is right 🙂 Take a simple example. Let's say you have 7 options -> so you would have 7 separate, nearly identical workflows. If you simply want to change one model, you have to do it in 7 workflows - and finally you have to publish 7 new workflows 🤢 No programmer in the whole wide world would go this way 😂🙂 What I would say is: in every programming language this is a simple task; you would just use a simple sub-routine. In ComfyUI it is quite a bit more "tricky" to handle "options".

boinobin730 · Mar 19, 2026

@arkinson Oh ok... I take your word for it. I'm not a programmer. 🙂

Yes, it's a multi-image input at different frames, I think. I am just not sure how sound would work. One sound file for all? It's a shame they deleted the video, because it was very good for LTX, all things considered. I know there is a workflow for something less ambitious, being first-middle-last frame: https://huggingface.co/RuneXX/LTX-2.3-Workflows/tree/main You may have seen it. I haven't had much time this whole week, but I plan to look at it soon. Just busy.

arkinson (Author) · Mar 22, 2026

@boinobin730 I haven't looked in here for "ages" 😅 I'm still struggling with the new workflow. ComfyUI is sooooooo limited. After 2 days of work I found out that "nested" subgraphs can get corrupted in ComfyUI after some executions. This buggy stuff is really annoying ☹️ And handling more complex options only with the barely implemented "if else" nodes and the silly switch nodes is annoying too. I did a lot of research over the last days, testing lots of custom nodes, but it seems ComfyUI is still limited to very basic programming logic only. Well, I might spend a few more days on it. If necessary, though, I'll have to give up on the idea. It would be a shame, really, because I do have a few ideas on how to expand the whole thing 🙄

Meanwhile, if you're bored or would like to test something, you might have a look here and here 🙂

boinobin730 · Mar 22, 2026

@arkinson That's ok. Sorry you couldn't make headway against the ComfyUI limitations. I think ComfyUI is a work in progress at the best of times. Hence my idea that keeping workflows simple makes them easier to troubleshoot and keeps them efficient. It is too early in its development at this stage.

I've been playing around a bit with your workflow, and I was also playing around with this new node. Just to keep you informed: https://github.com/ID-LoRA/ID-LoRA/tree/main Don't get me wrong, I'm not asking you to implement it. I think it needs more VRAM than we have, but it is a very big step towards having control over our clips. I couldn't even get it to install, I'm not sure why; I installed the requirements. I will have another crack at it later on.

That easy prompt has had a few new iterations, so I might play with that again. Hopefully the bugs are ironed out.

I just had a look at your discussion points. Funnily enough, I have had experience with both issues, one of them just recently. I will comment; it may shed some light.... or my comment might make things even murkier. Digging deeper doesn't always help with this stuff.

arkinson (Author) · Mar 23, 2026

@boinobin730 Thank you again for responding in the other comment thread 🙂 The reference to I2V has effectively narrowed down the possible causes of the error.

Uhhh - too much new stuff at the moment 😂 I had only a very quick look at your new link. When I have more time I will come back to it. It is all too interesting 🙂

I'm currently trying to push the new workflow update forward. Last night I made some advances in the right direction after cleaning up everything and simplifying the option/switch logic. I'm still wrestling with automatically triggered multiple node bypassing, and hunting bugs of course 🤣

The biggest problems are usually caused by custom nodes, which are often extremely poorly documented. So it always takes a lot of time to find the right solution by trial and error.

boinobin730 · Mar 23, 2026

@arkinson No problem at all. It was more of a coincidence, as I was wondering why it was so hard to prompt even though the prompt instructions were straightforward. I will rewire it and see if your idea is correct. I've been playing around more with Flux, since I was so behind on it because of other things. Speaking of other things, that last link... it was tricky to install correctly; there were a lot of things missing from my Comfy. In the end I managed to get it installed, but I realised that it uses the full safetensors LTXV 2.3, no GGUF yet. So I figured it will probably OOM on me.

You are always toiling away at these workflows. Good job. You probably don't get time to enjoy the fun stuff of actually using the workflows, when a few days later there are more nodes to add to the workflow. The madness never ends.

arkinson (Author) · Mar 25, 2026

@boinobin730 Hi, it's me again. I have the main parts of the new workflow mostly running. But I've just noticed a few really basic things while researching on the internet. Perhaps you could have a look too:

1. LTX-2.3 resolutions: 1080p, 2K, 4K <- ok, we have to use much smaller resolutions.

2. Aspect ratio: 16:9 portrait/landscape only <- this could be really important.

From the LTX-2.3 ComfyUI template I see it uses a setup resolution of 1280 x 720. Both values are divisible by 16 and the aspect ratio is 16:9.

So I assume the best way would be to limit/resize all resolutions to 1280 x 720, 1024 x 576 and 768 x 432. What do you think?

arkinson (Author) · Mar 25, 2026

@boinobin730 Short update. I have implemented the above settings and ran first quick tests. It seems I get better results with 16:9 only (no artefacts or face distortion so far). If I get all options running properly, I will publish the update for "public" testing.

boinobin730 · Mar 25, 2026

@arkinson That's great; I'm glad you are making progress. From my experience I have been getting better results in landscape mode: it doesn't hallucinate as much, and I feel like it better perceives the environment around it. It will still have trouble with basic physics, such as a character walking towards a car door to open it; I had so much trouble with that, even in landscape mode. Possibly I didn't give it enough time for the actions. But people and body parts are very good in landscape. Portrait, on the other hand, is nightmare fuel when it comes to a simple pan shot. For example, I did a portrait pan shot of a woman: it panned across a group of women sitting on bunk beds, and the nightmare fuel of distorted limbs was huge. Total immersion break.

I haven't played around too much with SPECIFIC exact landscape video clips. I have noticed, though, that doing a large square of 1024 x 1024 will increase compute time immensely. I think our GPUs are way underpowered for a square. I don't know why, but anything approaching square is not liked. I think 16:9 is probably the optimum we should be aiming for, but everyone's use case is different. Looking forward to it when you are done.

arkinson (Author) · Mar 26, 2026

@boinobin730 Thank you for your quick reply. I did not know before that LTX-2.3 was trained especially for 16:9 aspect ratios (landscape and portrait).

In the old workflow I "allowed" all aspect ratios divisible by 16 (and I did not check the shorter edge), which might have led to some of the reported issues.

In the new workflow I restrict aspect ratios consistently to 16:9 (landscape OR portrait) AND both sides divisible by 16, as mentioned above. Internally it is a little more "complicated", because the setup resolution gets halved in the first pass and then doubled in the second pass (2x upscaler). So the lowest resolution, 768 x 432, is only divisible by 8 after halving (see the quick check below).

Both higher resolutions should work properly, and I do not want to exclude the lowest resolution, because some people are running the workflow with 8 GB VRAM.
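A quick check of that two-pass arithmetic (plain Python, just to illustrate the point):

    w, h = 768, 432
    print((w // 2, h // 2))    # -> (384, 216): the first pass renders at half size
    print(216 % 8, 216 % 16)   # -> 0 8: 216 is divisible by 8, but not by 16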

I still have to check the FLF option. There are some differences between Urabewe's version and the template workflow. I will try to get the best out of both... 🙂

boinobin730 · Mar 27, 2026

@arkinson Sounds like you are making serious progress. LTX seems to get better and better.

Have a look at this interesting Reddit thread; it might help you regarding sound. He even put up a workflow: https://www.reddit.com/r/StableDiffusion/comments/1s50fji/i_think_i_figured_out_how_to_fix_the_audio_issues/

Talk soon.

arkinson (Author) · Mar 27, 2026

@boinobin730 Hi, thank you for the link. Maybe something for later experiments. He uses the large models and a 3-pass generation, so the whole generation process is quite different, and it is hard to say which component finally makes the difference in audio generation.

I learned something more in the meantime: LTX-2.3 definitely needs 16:9/9:16 aspect ratios, and our specific workflow additionally needs width AND height divisible by 32. This leaves us with only three realistic start resolutions for our hardware: 1536x864, 1024x576 and 512x288 (portrait/landscape). The highest resolution works well up to around 16 seconds of clip length (with 20 seconds I got OOMs). So I believe the resolution range is suitable for most users.

Unless I'm on drugs, the results are much better now 🙂 No, seriously - it might be that this is just my own subjective impression after struggling with this stuff for several days, but I would argue that the differences are striking 😇

FLF is up and running now too. There is just the known issue that the first/last frames are often corrupted. So I added an option to automatically cut off the first/last frames.

I will clean up everything for more graphical "aesthetics" now, add descriptions and check for remaining errors/mistakes. If everything runs as it should, I will try to publish it today or tomorrow.

arkinson (Author) · Mar 27, 2026

    @boinobin730 Workflow version 3.0 is out now 🙂

boinobin730 · Mar 27, 2026

@arkinson Excellent!! Thank you very much for your tireless effort. I will have a play late tonight. I have to go out today (worse luck).

arkinson (Author) · Mar 27, 2026

    @boinobin730 Poor boy 😂🤣

boinobin730 · Mar 30, 2026

@arkinson I finally got time to play with your workflow. It's excellent. The sound is so much better, even in T2V. I will throw some of my examples, as well as some I2V + audio, into the gallery. I haven't played with the FFLF yet. But I can see how much more complicated this workflow is compared to v2.0, so well done. You did a terrific job.

I finally found a decent lip-syncing LoRA that fixes my problem with LTX 2.3 not lip-syncing to the voice. Check this one out if you are interested: https://huggingface.co/elix3r/LTX-2.3-22b-AV-LoRA-talking-head

It even does a fairly good job on a non-human; see my singing, dancing cat video in the gallery.

arkinson (Author) · Mar 31, 2026

@boinobin730 Oh my - the cat videos are stealing the show here, and our dialogues become more and more "philosophical" - I like this stuff 🤣🙂

I will test your suggested LoRA soon, because I also saw some random issues with lip-syncing.

I believe focusing on the 16:9/9:16 standard aspect ratios (divisible by 32) has changed a lot, even if there was a "rebellion" against it in the beginning. I got lots of questions like: "that's all useless - how should I use my square images now", etc. 🙄

As I see from the comments, there is a growing number of users with better hardware. So I will expand the slider restrictions to 2K and 4K in the next update. With my baby GPU I never think about resolutions higher than 1536 🙄

FLF is still buggy, see here. (If Civitai only opens the model page instead of the right comment, just open Alden_Alzawa's last comment manually.)

I believe I scratched the limits of ComfyUI with the new workflow - but I learned a lot myself. The main problem with ComfyUI is that no runtime logic is possible and there are no real subroutines. The subgraphs are very useful for graphical "cosmetics" and to keep track of things, but they are not comparable to subroutines for organising more complex "processing". So, in the end, with every node you add, error debugging gets significantly more complicated.

boinobin730 · Mar 31, 2026

@arkinson Yes, cat videos are the best. I loved all the SORA cat videos, but alas, SORA is now dead.

Your v2.0 workflow is still useful; in fact, I think it allows for those times when you are not generating at the optimal resolution. I know I will still be using it, especially since they updated the spatial node to 1.1.

I will have a look at Alden's problem later on. FFLF was something I didn't get around to testing, nor V2V. I have started to play around with the Qwen3 TTS voice models. Yes, too many toys and not enough time. I also found out I've got real work on in the next few months, so I won't be able to play around with ComfyUI or any Stable Diffusion at all. Worse luck.

I am starting to understand the ComfyUI limitations. I am just amazed at some of these workflows that people produce. I don't even know where to begin in crafting a workflow from scratch and making it do something elaborate. I just tinker around and change a few nodes.

I think in time ComfyUI will get better and better, but I'm pretty sure the long game is that we are going to be paying for a subscription service to use the advanced features of ComfyUI. Just recently there have been a lot of rushed updates and broken nodes after updating, and a lot of people are complaining. Why are they rushing? I guess they want to develop faster.... for what reason, I don't know... but the cynic in me says it's money opportunities.

arkinson (Author) · Apr 2, 2026

@boinobin730 I just tested your "talking head" LoRA. Yes, perfect - I now got the complete speech lip-synced in an I2V generation on the first try, which did not work before. I will do more experiments, but it seems to work well.

I'm considering whether to relax the aspect ratio restrictions again in the next update (just to appease the hardliners 😂). For example, we could simply make the length and width selectable (each divisible only by 32, with text hints on the best values). That would completely eliminate the need for the subsequent calculations and various options in the subgraph.

On the other hand, I find it much easier to use at the moment: there's no need to manually enter two numbers that are hard to remember, and everything runs like clockwork....

FLF: A really good user hint came in today. I will test it.

ComfyUI in general: Unfortunately, what you say is how the world goes round. Yes, I too believe that the "exploration" and fun times are limited. I don't know which way ComfyUI will go in the future, but I think Civitai will go the commercial way first.

Btw, for me the most dramatic incident was Google's takeover of Panoramio many years ago and the subsequent demise of the service. Since then, the world has not seen any useful georeferenced photo service ☹️ Oh, and the collapse of the original music community, last.fm, really hit me hard too....

boinobin730 · Apr 4, 2026

@arkinson I think relaxing the aspect ratio is ok. There are probably a lot of people still quite happy using v2.0. I haven't been doing much LTX atm, just trying to make LoRAs again. The Illustrious LoRA wasn't very good, but I think it's the checkpoint, as the Pony LoRAs are better. Just a lot of time wasted.

I should get back into the FLF videos; they have a lot of depth for creative stories.

I think there have been some improvements in the ID LoRA I showed you a few weeks ago. I will try to get around to playing with it. It looks promising, if it can work on low VRAM.

Yes, it's a shame about certain digital services dying. I didn't know about Panoramio. In fact, Panoramio would have been a great tool for sourcing images to create a checkpoint that would allow users to generate geographically specific images. It would have been fantastic to say I want a panoramic shot of the Australian outback, or the winter wilderness of Canada's British Columbia.

arkinson (Author) · Apr 5, 2026

@boinobin730 Just a quick reply before sleeping 🙂

Panoramio was the worldwide photo community on Google Earth before this horrible Google Photos stuff we have nowadays (or whatever it is called now). You found lots of photos from amateurs, enthusiasts and sometimes skilled or professional photographers around the world, even from very remote places. I used it nearly every day - for planning journeys, exploring cities, countries and landscapes, following the tracks of others, uploading pictures of my own journeys, or just dreaming of travelling around the globe.... Google Photos never reached the quality of this dedicated community.

boinobin730 · Apr 5, 2026

    @arkinson Yep. It's a shame. Big business always kills cottage industries and grass roots activities.

arkinson (Author) · Apr 7, 2026

@boinobin730 I was quite busy the last few days re-designing and preparing the new workflow update. Version v4.0 is out now 🙂

Solving the FLF issue led to a complete re-design of the inputs and pre-calculations. Yesterday I found a smart node which handles the pretty complex aspect ratio inputs and width/height calculations for most use cases in just one node. And by adding some simple tricks, the whole process of setting resolutions, entering any aspect ratio or determining it automatically from input images/videos, as well as calculating correct width/height values including automatic image resize/cropping, is now done pretty simply.

So it was quite easy to implement the next step: First-Mid-Last Frame to Video generation.

Btw, your hints a while ago and the recent discussions with ArcleinSK motivated me to have a look at Qwen again. Wow, that's really more than I had expected. I had never imagined that it works so well with 12 GB VRAM now. Just the first quick tests gave me a good feeling about the possibilities. Do you know if there is any LoRA trainer for Qwen now which works with 12 GB VRAM?

I can't believe it - the workflow has been out for just 3 hours and there are more than 80 downloads 🙄 These guys here are crazy 🙂

ID LoRA: There was also a rather unspecific question about it from another user. I have not tried it myself yet. But it seems it will need a lot of VRAM?

boinobin730 · Apr 9, 2026

@arkinson Wow, you are so efficient. I haven't logged onto Civitai for a while, but I will grab your v4.2 now and have a play with it soon, now that I am not so busy. Thanks for putting in the effort. I think you are probably one of the easiest creators to talk to, as you are very invested and passionate. Some creators are very quiet. Perhaps it's a language thing too.

I've been using Qwen a bit as I am fine-tuning my character LoRA creation. Still on Pony LoRA creation, but I now get 90% consistency in the LoRA creation; before, it was about 70%. I'm still learning a lot, though, through trial and error. Creating the initial image dataset is a real pain in the ass. Theoretically I shouldn't worry about it, as the models are so good at generating from I2V with Wan and LTXV 2.3. I think it becomes more a matter of stubbornness to get it right, if you know what I mean.

Qwen is really good for setting up a storyboard situation. With the angles LoRA you can keep adjusting the camera position until it is just right. Then, using first frame/last frame, direct the action. Now that you have created an F-M-L frame option, it is going to be superb video direction. So good...!!!!

Use this for angles:

https://civitai.com/models/2099912/qwen-edit-angles-multiple-angles-lora

Storyboard - I swapped it for newer models, but there are other storyboard workflows:

https://civitai.com/models/2065461?modelVersionId=2337222

I will get around to looking at ID-LoRA again. Just busy, and soon busy with real-life stuff as well.

Umm, also a bit of gossip and a be-careful. You know how I was using a workflow for prompting and the vision module? The dude might be a bit sus. People claimed he was encouraging people to run a batch file that modified code to mine crypto. He got annoyed and wiped all his records, Civitai and GitHub as well. Read it here: https://www.reddit.com/r/StableDiffusion/comments/1s90hwm/not_a_fan_of_this_subreddit_anymore_peace_lora/

I thought it was weird that it wasn't showing up in ComfyUI Manager. Quite possibly it was true. I don't think the early versions were spiked, but I don't want to touch it now.

arkinson (Author) · Apr 9, 2026

@boinobin730 Nice to see you here 🙂 Yes, real life can be a challenge....

Here on Civitai there are still a lot of open-minded people interested in sharing their knowledge and experience, just to improve things together - in "real life" you often see the opposite 🙄

At the moment it is quite a lot of work to answer the questions regarding the new workflow version. And often it is hard to tell: is it just a "newbie" question, is there a bug in the workflow, or has ComfyUI updated something? In the end I did not find the time to run the workflow myself. But the good thing is: it won't make you more stupid in the end 😅

I already tested the Qwen multi-angle LoRA. It works really well. I will have a look at the storyboard too. With these tools, multi-frame generations become more and more interesting. ArcleinSK did some really cool generations.

I am always impressed by how much work you invest in perfecting a single character. Seriously, this would kill me 🤣🙃

Mmh, strange story with Lora Daddy. Thank you for the link.

dsa520123 · Mar 16, 2026

I encountered three problems; here are the problems and solutions:

1. Error "VAELoaderKJ". Solution: update KJNodes.

2. "Size mismatch for decoder.conv_in.conv.weight: copying a param with shape torch.Size([1024, 128, 3, 3, 3]) from checkpoint, the shape in the current model is torch.Size([256, 128, 3, 3, 3])." or "'Linear' object has no attribute 'weight'". Solution: update ComfyUI.

3. ltx-2.3-spatial-upscaler-x2-1.0.safetensors can't be found. Solution: put it in models/latent_upscale_models. If you don't have the "latent_upscale_models" folder, create one yourself.
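On Windows, point 3 boils down to something like this (hypothetical paths, assuming a portable ComfyUI install):

    mkdir "ComfyUI\models\latent_upscale_models"
    move "ltx-2.3-spatial-upscaler-x2-1.0.safetensors" "ComfyUI\models\latent_upscale_models\"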


By the way, this workflow is goooooooooood. Thank you very much!!!!

arkinson (Author) · Mar 16, 2026

    @dsa520123 Thank you for your feedback and buzzing 🙂

    latent_upscale_models folder: Thank you for the hint. You are right. In the workflow I accidentally published the wrong path. Will fix this in the next update.

aigenre190226 · Mar 16, 2026

    URGENTLY BRO!!!!!!!!!!! U HAVE TO UPDATE UR WORKFLOW!!!!!!

    Model VideoVAE prepared for dynamic VRAM loading. 1384MB Staged. 0 patches attached.

    Model LTXAV prepared for dynamic VRAM loading. 24206MB Staged. 0 patches attached.

    100%|████████████████████████████████████████████████████████████████| 8/8 [00:35<00:00, 4.39s/it]

    Model VideoVAE prepared for dynamic VRAM loading. 1384MB Staged. 0 patches attached.

    Model LTXAV prepared for dynamic VRAM loading. 24206MB Staged. 0 patches attached.

    100%|████████████████████████████████████████████████████████████████| 3/3 [00:39<00:00, 13.05s/it]

    0 models unloaded.

    Model VideoVAE prepared for dynamic VRAM loading. 1384MB Staged. 0 patches attached.

    Requested to load AudioVAE

    loaded completely; 3974.02 MB usable, 693.46 MB loaded, full load: True

    Prompt executed in 103.02 seconds

    103 SECONDS FOR 5 SEC VID -> 540x960

    using https://huggingface.co/Lightricks
    using the distilled fp8 version

    GET AMAZING RESULTS!!!!!!



Zeb101 · Mar 16, 2026

    What is this about?

arkinson (Author) · Mar 16, 2026

    @aigenre190226 Sorry - but without any information this is pretty useless.

agentgerbil · Mar 16, 2026

    sounds sus

boinobin730 · Mar 17, 2026

So are you saying: go for the new models? https://huggingface.co/Hippotes/LTX-2.3-various-formats

PyTorch needs to be upgraded, people. I'm running 2.9.1, so I need to upgrade to 2.10 to run this, apparently.

@aigenre190226 What is your GPU? What are your gen times like?

Psy_pmp · Mar 17, 2026

    The Video Width and Height node is ruining the image. The FINAL resolution must be a multiple of 32, because the VAE decodes as 2×2×2×2×2 = 32. If the number is not a multiple, the VAE breaks and artifacts appear along the edges. The math needs to be adjusted. It should be calculated based on the FINAL result and the spatial upscaler, because a spatial factor of 1.5 often produces non-integer values.
    I made this simple node https://github.com/PsypmP/two_stage_resolution
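In other words: choose the FINAL resolution first (a multiple of 32), then divide by the upscale factor to get the base generation size. A rough plain-Python sketch of that idea (not the linked node itself):

    def two_stage(final_w: int, final_h: int, factor: float = 2.0):
        assert final_w % 32 == 0 and final_h % 32 == 0, "final size must be a multiple of 32"
        base_w, base_h = final_w / factor, final_h / factor
        assert base_w.is_integer() and base_h.is_integer(), "factor yields a non-integer base size"
        return int(base_w), int(base_h)

    print(two_stage(1536, 864))        # -> (768, 432) with the x2 upscaler
    print(two_stage(1344, 768, 1.5))   # -> (896, 512); many 32-multiples fail this check at 1.5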

arkinson (Author) · Mar 17, 2026

@Psy_pmp Thank you for the hint. Which options did you test?

It seems I have overlooked a "pretty little detail", at least in the image/video-to-video options. The resolution is set by the slider nodes, which guarantees a multiple of 32 for the longer side. But you are right: the shorter side is always just resized, without checking and adjusting. I will check this.

But a short question first: did you already run some real side-by-side tests, or is this just theory?

What do you mean by ".... and the spatial upscaler"? We use the x2 spatial upscaler model. In my opinion everything should work fine if the right "start" resolution for the generation process is set - or am I wrong?

saurabhwe4u689 · Mar 17, 2026

Why are the colors fading after the video generation?

arkinson (Author) · Mar 17, 2026

All components updated? Did you check that the right models are selected in each loader node? Color issues with all options?

seanburtonnorwich118 · Mar 17, 2026

Although this is hands down the best video workflow I've ever tried (I've tried a lot), I don't seem to be able to keep face identity past maybe the first second or two, and it usually settles on something else.... I'd be grateful for a tip or two on how best to keep identity.

    Anyway, this is an excellent workflow and I appreciate the time you put into it 👍

ntrtales · Mar 17, 2026

I've been testing V2V, extending a video (mostly an upper body/face talking), and so far the generated results with 2.3 are worse than with 2.0. With 2.0 the generated face is more similar to the original video, and the face gestures are not over-exaggerated. The same applies to I2V in my case.

The positives of 2.3 for V2V:

1. The sound seems clearer (but the voice cloning doesn't seem to be better).

2. Extending a video in another language now works.

    I gave up testing for the moment. I will keep using the WF for 2.0 in the meantime.

arkinson (Author) · Mar 19, 2026

@seanburtonnorwich118 Hi, thank you so much for your feedback. I'm glad you like it 🙂 I randomly saw the issue of not keeping the face identity in my tests too. Sometimes it helps just to use a completely different start image or another prompt. But sometimes it is really annoying. I would say I have a similar experience to @ntrtales: LTX-2 still works better in some cases.

cell943 · Mar 18, 2026

Seems to do pretty well, but prompting against music output seems really tough. Sometimes it will listen to pos/neg prompts against it, but other times it insists my Cthulhu rampage video needs some weird Christian rock anthem in the background.

arkinson (Author) · Mar 19, 2026

Do a search for LTX prompting guides. Proper LTX prompting is essential, in my experience.

rajabadri385 · Mar 18, 2026

I am getting the following error: RuntimeError: ERROR: VAE is invalid:

arkinson (Author) · Mar 19, 2026

Please read the comments here - or even better, my Troubleshooting Guide first 😉 You have to update your system.

maythegiant494 · Mar 18, 2026

Getting a lot of out-of-memory errors on a 3060 12 GB; weirdly, re-running the job sometimes works. Impossible to use at high def. I'm curious how many actual 3060 users have tried it...

ntrtales · Mar 18, 2026

3060 12 GB, but I have 128 GB RAM. Using Q8, doing 720 x 1280.

I only get OOM errors when doing 20-second videos (V2V).

Same setup; I didn't get OOM errors when using the old WF for LTX 2.0.

Maybe you need more RAM, and don't have other programs in use/open.

maythegiant494 · Mar 18, 2026

@ntrtales Then maybe there's something wrong with my setup. 32 GB RAM, but the OOMs are about VRAM.

arkinson (Author) · Mar 19, 2026

@maythegiant494 Mmh - VRAM errors are strange. Did you update all components according to my Troubleshooting Guide? @ntrtales might be right - did you set your swap file size as mentioned in my description?

arkinson (Author) · Mar 19, 2026

@ntrtales Yes, LTX-2.3 needs slightly more VRAM than LTX-2, especially for the preview.

maythegiant494 · Mar 20, 2026

@arkinson Yes, I followed the guide. I do get some generations without errors, but some fail. Even 1280 px resolution does work, when it works. If I re-run the failed jobs multiple times (same parameters), eventually they complete without a fatal OOM. It's really weird to me. Thank you so much for your work; the results are amazing.

    arkinson
    Author
    Mar 20, 2026

    @maythegiant494 Just to be sure - what is the size of your swap file?

    maythegiant494Mar 20, 2026

    @arkinson I've gone up to 64 GB, but only about 16-20 GB max was used. I'm on Linux, so the OS doesn't need a lot of memory by itself. I also switched off zram, as I suspect that might have been at least part of the issue.

    arkinson
    Author
    Mar 22, 2026

    @maythegiant494 Sorry, on Linux I'm out of my depth, and in my experience there is not much knowledge about it here. If I remember right, some months ago a user reported a similar issue with my Wan workflow on Linux, but I never got feedback on whether he solved it. If you are running it on a rented service, I would try a more capable machine.

    maythegiant494Mar 23, 2026

    @arkinson I think I got to the root of the issue: basically, I used I2V with non-standard image sizes, and SamplerCustomAdvanced was crashing (not sure why it would throw an OOM error). Just using a square input image solves it, and the workflow works every time.

    arkinson
    Author
    Mar 23, 2026· 1 reaction

    @maythegiant494 As mentioned, I can only speak for Windows: as long as you use my input nodes for image sizes there should be no issues, because my nodes guarantee the right sizes. There is only one point (but I mentioned it in the description): if you use I2V or V2V with a square image and set the resize slider to 1280, for example, you will generate a 1280 x 1280 output - and that is of course much too high.
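
    Just to illustrate with some quick, hypothetical arithmetic (the 16:9 height of 720 here is only an example, not the workflow's exact resize math):

        # Pixel count is what drives the memory load, and a square frame at the
        # same slider value carries far more pixels than a 16:9 frame.
        slider = 1280
        square_px = slider * slider      # 1280 x 1280 = 1,638,400 px
        wide_px = 1280 * 720             # 1280 x 720 (16:9) = 921,600 px
        print(f"square frame is {square_px / wide_px:.2f}x the pixel load")  # ~1.78x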

    maythegiant494Mar 23, 2026· 1 reaction

    @arkinson Yeah I spoke too soon, it still crashes with squares. Now only 1024 px size works, not more and not less... it's like if a butterfly sneezes in Peru I get an OOM lol

    arkinson
    Author
    Mar 24, 2026

    @maythegiant494 Yeah - butterflies can do terrible things 😂

    No, seriously: if you run your Linux locally, I would check the swap file size first (I don't know what it's called on Linux). The newer ComfyUI core memory management automatically "swaps" huge amounts of data like this: VRAM is full -> RAM; RAM is full -> swap file. This means: if your swap file is too small, you can even get VRAM errors.
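
    If you want to sanity-check your headroom, here is a minimal Python sketch (assuming the third-party psutil package is installed; the 64 GB threshold is only the rule of thumb from this thread, not a hard requirement):

        import psutil

        gib = 1024 ** 3
        ram = psutil.virtual_memory()
        swap = psutil.swap_memory()

        print(f"RAM:  {ram.total / gib:.1f} GiB total, {ram.percent}% used")
        print(f"Swap: {swap.total / gib:.1f} GiB total, {swap.percent}% used")

        # With 32 GB RAM, this thread suggests a 64-128 GB swap/page file so the
        # VRAM -> RAM -> swap chain never runs dry mid-generation.
        if swap.total < 64 * gib:
            print("Swap is below 64 GiB - consider increasing it.")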

    honryindianMar 18, 2026· 1 reaction
    CivitAI

    How do I use T2V without the distilled LoRA? I don't understand how the sigma node works.

    arkinson
    Author
    Mar 19, 2026

    @honryindian Sorry, I don't know this myself. You might try to find workflows to adopt the right settings from. Have you had a look at the templates already?

    abyss259362Mar 20, 2026

    You need the distill LoRA; at a strength lower than 0.35 the output gets blurry and scrambled.

    honryindianMar 22, 2026

    @abyss259362 Is there no way of using it without the distilled LoRA? I see it changing the face of my character LoRA. Reducing the strength of the distilled LoRA improves the face, but completely kills the motion.

    arkinson
    Author
    Mar 23, 2026

    @honryindian The distilled LoRA is needed for low-step generation, if I understand it right (keep in mind we generate with only 8 + 3 steps). There might be ways to run without this LoRA, but you would need to check and understand if/how it works. It will definitely require a lot more steps and maybe other models, sigmas, etc. If you have the necessary hardware to get this up and running, check the templates for possible guidelines. But with a 12 GB VRAM limit I would not even try.

    ntrtalesMar 18, 2026· 7 reactions
    CivitAI

    ltx-2.3-spatial-upscaler-x2-1.1.safetensors

    New upscaler to fix those random graphics glitches in longer videos.

    arkinson
    Author
    Mar 19, 2026· 1 reaction

    @ntrtales Thank you for the link 👍 I'm currently working on a complete redesign of the workflow to simplify the handling of the different options, so I don't have much time for testing now. Please let me know if you have good practical experience with the new upscaler.

    SnapRYRY89Mar 26, 2026

    Thanks for posting this. Was having an issue with my video generations and updating with this upscaler file fixed it for me.

    RedditUser981Mar 19, 2026· 14 reactions
    CivitAI

    your workflow is useless i think you need to learn more

    arkinson
    Author
    Mar 19, 2026· 2 reactions

    @kumarkishank959811 Thank you for the intelligent compliment, bro 🤣 I hope we can all learn a lot from your knowledge 😉

    arleckkMar 19, 2026· 1 reaction
    CivitAI

    I'm getting a blurry video (I have this in the negative prompt: blurry, low quality, still frame, frames, watermark, overlay, titles, has blurbox, has subtitles). Any idea why?

    arkinson
    Author
    Mar 20, 2026

    @arleckk First: which option? All components up to date? No node conflicts?

    Did you double-check that you selected the right models in every loader node?

    arkinson
    Author
    Mar 20, 2026

    @arleckk Look here and let me know if it solves your problem.

    arleckkMar 23, 2026· 1 reaction

    @arkinson Ty, it was the LoRA. I changed it to a different one and it works. Amazing job!

    zexeorMar 20, 2026· 2 reactions
    CivitAI

    I can't seem to make the lipsync work. I even changed the audio to stereo, prompted the exact words the character should say, changed the video length to match the audio length, etc. Any help, please?

    arkinson
    Author
    Mar 20, 2026

    @zexeor It seems you are talking about Text or Image + Audio to Video? With audio input, lip syncing currently doesn't seem to work very well with LTX-2.3. It is often a lot of trial and error: you might try changing the prompt, image and audio input.

    zexeorMar 20, 2026· 3 reactions

    @arkinson Image + Audio to Video. But I managed to solve it: ElevenLabs audio comes out too clean; for whatever reason, adding some background noise makes the lip sync work.

    Simon_BelmontMar 20, 2026

    @zexeor thanks for the tip!!!

    arkinson
    Author
    Mar 22, 2026

    @zexeor Thank you, that is really interesting.

    boinobin730Mar 22, 2026

    I have had this problem many times. Specifically, it is an I+A2V problem. I ran into it a lot when I was trying to dub vocal music onto an image. I found that I needed to set the detailer LoRA strength slightly higher; it will almost never lip sync if you don't add the detailer LoRA. I haven't tried adding noise yet, but I might try out your suggestion, @zexeor.

    Also, it doesn't like to lip sync certain onomatopoeia. To give an example: I was trying to get the girl to lip sync the KISS song "I Was Made For Lovin' You", and it could not get the woman to even vocalise the "oohhh-ing" at the very beginning (you have to listen to the song to understand what I am talking about).

    It also just fights you if you are trying to lip sync a male vocal with a female image. Obviously KISS are all guys, but I like to flip the script for humour and have the women perform the male song. It doesn't like that, for whatever reason, and I gave up on the idea after many failed attempts. I guess it makes sense that from its inference it knows a male voice needs to come from a male-looking image and vice versa for women.

    arkinson
    Author
    Mar 23, 2026

    @boinobin730 and all: just a short info - the "Image + Audio to Video + Audio" option is wired the right way, so this issue is not related to the I2V wiring issue mentioned elsewhere in this thread.

    drfaker911219Mar 21, 2026· 1 reaction
    CivitAI

    Result is blurred. Posted image with the workflow. What's wrong?
    I did use the distilled version and it worked but I want to use the normal version.

    I have the files downloaded from unsloth

    Update:
    A restart probably fixed it - it works now.
    Any way to increase the duration of the video?

    arkinson
    Author
    Mar 22, 2026

    @drfaker911219 "any way to increase the duration of the video?" Longer then 30 seconds??? Technically just edit the slider properties. But with too long videos you might run into a lot of issues too.

    drfaker911219Mar 25, 2026

    @arkinson The node does not display the number of seconds, so I can't modify it. It did appear on my laptop, though.

    arkinson
    Author
    Mar 25, 2026

    @drfaker911219 Seems like a node conflict. Check if mxtoolkit is properly installed. Are all components up to date? Did you follow my Trouble Shooting Guide?

    drfaker911219Mar 25, 2026

    @arkinson I uninstalled mxtoolkit, then reinstalled the nightly - same issue. I did update everything through ComfyUI Manager.

    arkinson
    Author
    Mar 25, 2026

    @drfaker911219 Stupid question: are all the other sliders working? Because you only mention clip length.

    drfaker911219Mar 26, 2026

    @arkinson The resolution slider is like that also. Maybe that's why others are asking how to change the length - because they don't see the slider/value.

    arkinson
    Author
    Mar 26, 2026

    @drfaker911219 OK, so this should be the already-mentioned node conflict. Disable the mixlab nodes, if installed (see here). With any custom node conflicts, first look at the GitHub pages.

    ftw666Mar 21, 2026· 1 reaction
    CivitAI

    Why not ltx-2.3-spatial-upscaler-x2-1.1.safetensors rather than 1.0?

    dsa520123Mar 21, 2026· 1 reaction
    Because ltx-2.3-spatial-upscaler-x2-1.1.safetensors was only recently updated.

    arkinson
    Author
    Mar 22, 2026

    @ftw666 @dsa520123 Please look here and let me know if you get better results with the new upscaler.

    MegasherruMar 21, 2026· 1 reaction
    CivitAI

    Tried Text to Video and I'm just getting static; the sound is perfect. The image itself is just static.
    In the preview video I can see what I prompted, but at some stage after that it just dies and gives up. I reloaded the workflow and have the exact same models in each one, but this happens every time.

    Have not tried the image to video yet.

    arkinson
    Author
    Mar 22, 2026

    @Megasherru and all others, please follow the link here.

    Rifler1Mar 21, 2026· 7 reactions
    CivitAI

    I've come up with a new tip. I spent almost 3 hours testing how to connect Sage to this workflow, and it works fantastically well.

    10 I2V test generations, 11-second video (not cold start):

    Without Sage: ~470-480 seconds

    With Sage: ~290-320 seconds!

    How to set it up:

    1. Install Sage+Triton - https://www.youtube.com/watch?v=-S39owjSsMo

    2. Add the following node to the workflow: "LTX2 Mem Eff Sage Attention" (KJNodes). It should be connected between the Unet Loader (GGUF) and the Power LoRA node (a quick import check is sketched below).
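
    Before wiring the node in, it may help to verify that both libraries actually import from the ComfyUI environment - a hypothetical Python sketch, assuming the usual pip package names (triton, sageattention):

        import importlib.util

        # Report whether each speed-up package can be imported from this venv.
        for pkg in ("triton", "sageattention"):
            found = importlib.util.find_spec(pkg) is not None
            print(f"{pkg}: {'installed' if found else 'MISSING - install it first'}")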

    arkinson
    Author
    Mar 22, 2026· 1 reaction

    @Rifler1 Thank you for your feedback. Yes, using Triton + SageAttention is essential for speeding things up. You can of course install it manually, like you did.

    But for most users I recommend a separate "ComfyUI-Easy-Install" installation just for video generation/experimenting (see my short guide here, scroll down). It needs only about two mouse clicks and around 30 minutes of time, and you are up and running with system-wide SageAttention (no need to add nodes to every workflow). I have often mentioned this approach and have used it myself for several months. It works perfectly. And because it is so simple now, I no longer provide any switch options for SageAttention in my workflows.

    LewdAnimeEmpireMar 21, 2026· 1 reaction
    CivitAI

    I’ve noticed that in most cases, while the video is being generated, the blurry preview follows the prompt correctly. However, during the final steps, it seems to lose its way, and the final result doesn't match the prompt.

    MegasherruMar 21, 2026· 2 reactions

    I have noticed that as well actually

    arkinson
    Author
    Mar 22, 2026· 1 reaction

    @LewdAnimeEmpire @Megasherru Thank you for the hint.

    I'm working on a complete rebuild of the existing workflow to simplify it for possible future updates (still not sure if my ideas will really work), so right now I don't have much time for testing.

    To help us narrow down the problem you mentioned, could you please do the following:

    1. Update the spatial upscaler model (see here) and run some tests.

    2. If the issue still exists, please test whether it occurs with different/all options or only with a specific one, like T2V.

    Please let me know about your results.

    boinobin730Mar 22, 2026· 2 reactions

    @arkinson et al. I will chime in here with an experience I had last night that exactly matches yours, @LewdAnimeEmpire.

    I was trying to prompt an I2V action sequence of a woman turning around with a shotgun in her hand and firing at a group of zombies. I was very specific in my prompt, having asked ChatGPT to give me an LTXV2 prompt after looking at my image. What seemed to happen was: she turned slightly and fired a few shots, and the zombies started to look like they would fall down in a heap, but as the video was being generated they were made to be standing again, as if the prompt was cancelled/ignored by the finished render. I will add some of my examples to the gallery. And yes, I have used the new spatial upscaler when trying to create the action clips.

    Edit: I decided to have another go at this prompt. It finally shows the cause and effect - shoot, then zombie falls down - just badly.

    I originally surmised it might be a safety-filter-type situation to keep us from making something too graphic, but I'm not sure what's going on.

    arkinson
    Author
    Mar 23, 2026· 2 reactions

    @boinobin730 Thank you so much for quick testing 👍

    @LewdAnimeEmpire @Megasherru

    Argh, it seems I made a silly mistake in my workflow (I just checked my I2V part against the template workflow). Could you please run a short test:

    Open the I2V subgraph -> go all the way to the right to "SamplerCustomAdvanced" -> connect "output" instead of "denoised_output" to "av_latent" (LTXVSeparateAVLatent node), and run a new test with your setup. I am pretty sure this will solve the strange issue.

    CybernixMar 23, 2026· 1 reaction
    CivitAI

    Sorry for the dumb question. The V2V + audio process doesn't generate audio for an existing video? I tried to add audio to a video generated in Wan, but it throws an error:
    Exception: VHS failed to extract audio from /../input/2026-02-14_00007.mp4

    arkinson
    Author
    Mar 23, 2026· 2 reactions

    @Cybernix No - V2V just extends an existing video, taking the "style" from the input video. Let's say you have an existing 3-second video of your dog looking into the camera. Now you can generate a 10-second video, for example: the first 3 seconds are exactly your input video, and in the following 7 seconds your dog tells a lip-synced joke - or whatever you prompt 🙂

    [edit: Generally, however, no error message should appear]

    CetlhoMar 23, 2026· 2 reactions
    CivitAI

    The workflow uses Unet GGUF, but I see examples from other users' generations that use, for example, the LTX2.3 10Eros model (LTX2.3 10Eros - Beta | LTXV23 Checkpoint | Civitai), which is a safetensors model. How is this possible? What customizations do I need to make to achieve the same?

    arkinson
    Author
    Mar 24, 2026

    @Cetlho If you want to use other models, you have to manage it yourself. But I would guess this is not an easy "plug-and-play" task.

    mcjasonchu0906Mar 24, 2026· 1 reaction
    CivitAI

    How to change the length?

    arkinson
    Author
    Mar 24, 2026

    @mcjasonchu0906 Using the slider 😉

    DozorMar 24, 2026· 2 reactions
    CivitAI

    Thanks for the workflow. It runs on 8 GB VRAM + 32 GB RAM + 30 GB swap space. But I’m not sure how long my SSD will last. The total storage usage is about 56 GB for each generation of a 10-second video with upscaling.

    arkinson
    Author
    Mar 24, 2026· 1 reaction

    @Dozor Yes, that's normal and is how ComfyUI handles it.

    dsa520123Mar 25, 2026
    Is there any way to reduce SSD wear and tear, or is increasing RAM the only option?

    arkinson
    Author
    Mar 25, 2026· 1 reaction

    @dsa520123 I'm not an "expert", but if I understand it right, the newer ComfyUI core handles VRAM management, simply spoken, this way: VRAM -> RAM -> swap file. With low VRAM and low RAM, it always uses the swap file. With 12 (8) GB VRAM and 32 GB RAM we are operating at the absolute minimum. Some users upgraded to 64 GB RAM to increase speed, but I would guess the swap file is still needed.

    One thing you could try is model unloading and cache clearing - manually or via nodes at certain places in the workflow (see the sketch below). But it might all be a lot of trial and error.
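
    For reference, a minimal sketch of what such a manual cleanup step could look like, using standard Python/PyTorch calls (where and how to hook this into the workflow - e.g. via a cache-clearing custom node - is left open):

        import gc

        import torch

        def free_memory() -> None:
            """Best-effort release of cached memory between generations."""
            gc.collect()                  # drop unreferenced Python objects
            if torch.cuda.is_available():
                torch.cuda.empty_cache()  # hand cached VRAM blocks back to the driver
                torch.cuda.ipc_collect()  # clean up stale CUDA IPC handles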

    dsa520123Mar 25, 2026

    @arkinson Thanks for replying. I'm currently generating a 10-second video with 12 GB VRAM + 32 GB RAM. Sometimes it writes 0.5-3 GB, sometimes it writes 5 GB. This might be due to different page write sizes caused by the error message. I'm still looking for ways to reduce SSD wear and tear.

    arkinson
    Author
    Mar 25, 2026

    @dsa520123 Uhh - I didn't get you. Which error message? Btw, on Windows my swap file size is set to 64-128 GB - and it is mostly used.

    dsa520123Mar 26, 2026

    @arkinson I mean it will use a lot of virtual memory paging, which leads to heavy SSD writes and shortens the SSD's lifespan significantly. The ultimate goal is to reduce the wear and tear on the computer.

    arkinson
    Author
    Mar 26, 2026· 1 reaction

    @dsa520123 Yes, I know, but you spoke about an "error message". Anyway, as already mentioned, with ComfyUI you don't have many options - except buying a new GPU with at least 24 GB VRAM plus new RAM. But believe it or not, buying a new SSD is by far the most cost-effective solution.

    dsa520123Mar 26, 2026

    @arkinson I'm sorry, I made a mistake. It shouldn't be "error message", it should be "different prompt word". I'm not an English speaker; I'm using Google Translate...

    Yes, I know upgrading VRAM and adding new RAM is the best way; I'm just wondering if there are other models or software that can reduce the hardware burden.

    arkinson
    Author
    Mar 26, 2026· 1 reaction

    @dsa520123 No problem. I understand your sentence now.

    Other models: I'm pretty sure a 12 GB VRAM GPU will currently run LTX-2.3 only with a GGUF model. The pro: you get it running on such low-end hardware at all. The con: high RAM usage, and with low RAM, high swap file usage of course.

    Other software: there might be something out there, probably specialised in a few tasks and with better performance. But I guess you will hardly find anything as programmable and comparable to ComfyUI.

    [edit: thank you for buzzing 😋]

    arkinson
    Author
    Mar 26, 2026· 1 reaction
    CivitAI

    @kAIbold Hi - thank you so much for buzzing 😋

    charlesdelavigere743Mar 26, 2026· 2 reactions
    CivitAI

    Thank you, this workflow is awesome! Had to update a few things, but it's running smoothly and produces amazing results! Only a few minutes of GPU time for a 10-second video! Still need to work a bit on getting the audio to sound the way I want, but... thank you!!!!

    arkinson
    Author
    Mar 26, 2026

    @charlesdelavigere743 Hi - thank you for your feedback 😋🙂 And "stay tuned", because I will try to publish an update shortly that takes better account of the aspect ratio specifications set out for LTX-2.3. Hopefully this will reduce the issues some users reported with artefacts, face distortions, etc.

    @arkinson Looking forward to testing this update. I noticed some visual artefacts at the end of longer renders (i.e. 18 sec), where some red things appear in the image background. Otherwise, almost only positive results! Really amazing! What is the longest video you have achieved with this workflow?

    arkinson
    Author
    Mar 27, 2026

    @charlesdelavigere743 New update just released 🙂

    The LTX-2.3 specifications say: clip length up to 20 seconds. With the new workflow and 12 GB VRAM you can reach around 16 seconds at the highest resolution. At medium resolution, 20 seconds or more should be possible (but not tested yet).

    VIXAIMar 26, 2026
    CivitAI

    Hey!

    It's running really well on my 8 GB VRAM card after switching to the distilled model, but I have a few questions.

    How can I manually change the resolution in Text-to-Video? Also, in Image-to-Video, is there a way to adjust the image resize so it doesn't downscale the image so much? Additionally, I don't see the Clip Length (in seconds) slider at all.

    Thanks in advance!

    arkinson
    Author
    Mar 26, 2026

    @VIXAI Hi - thank you.

    Changing the resolution in Text-to-Video: simply replace the input node with whatever you like.

    Slider issue: see my last comment here.

    BbirdMar 27, 2026· 2 reactions
    CivitAI

    First frame > last frame is working very well now, thank you!
    Maybe for version 2.1: FF/LF + audio file?
    That would be great!

    arkinson
    Author
    Mar 27, 2026

    Thank you. I am about to release a completely revamped workflow (hopefully today or tomorrow) which should provide much better results - and yes, FLF + audio to video is already implemented 🙂

    mandbzMar 27, 2026· 1 reaction
    CivitAI

    Really good workflow!

    arkinson
    Author
    Mar 27, 2026

    @mandbz Thank you - and see my just released update 🙂

    Workflows
    LTXV 2.3

    Details

    Downloads
    3,917
    Platform
    CivitAI
    Platform Status
    Available
    Created
    3/13/2026
    Updated
    5/13/2026
    Deleted
    -

    Files

    ltx23AllInOneWorkflowForRTX_v20.zip

    Mirrors