DaSiWa LTX 2.3 | Lightspeed
My new LTX 2.3 model for I2V, T2V, and V2V generation.
Version overview: https://civarchive.com/articles/23495/dasiwa-model-versions-and-timeline
v2 is a BETA RELEASE! | v1 is an ALPHA RELEASE!
IMPORTANT: I found a bug that made the v2 model worse than it should be!
I uploaded a fixed version. Everyone who spent Buzz can simply re-download the new files.
FP8/NVFP4 are back again!
I apologize for the inconvenience and hassle.
Expect that not everything is perfect, and keep in mind that LTX 2.3 is not as stable as WAN 2.2 finetunes.
Key Features:
- Best with I2V and V2V
- Really fast generation
- Better sound
- Better voices
- Enhanced quality and reasoning
- Unrestricted
- Better prompt responsiveness
- Better understanding of anime/manga-style composition
- FP8+ mixed precision
- Reduced some hallucinations
- Strengthened visual consistency/understanding for anime
Workflow
Make sure to check out my easy-to-use workflows!
LoRAs
This checkpoint is not meant to replace all LoRAs; it is meant to:
- Perform better overall on its own
- Be as easy as possible to use
- Work even better together with LoRAs
Read the corresponding announcements.
Make sure to check them out for in-depth information and a detailed comparison!
Recommended Settings
- CFG: 1
- Sampler/scheduler: euler / linear_quadratic
- Steps: 8
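For ComfyUI, these settings correspond to the inputs of the stock KSampler node. Below is a minimal sketch in ComfyUI's API-prompt format, assuming a standard graph; the node IDs in the connection references are hypothetical placeholders, not part of my workflows:

```python
# Sketch: the recommended settings expressed as a ComfyUI API-format
# KSampler node. The node IDs ("4", "5", "6", "7") are hypothetical
# placeholders for your model loader, latent, and prompt-encoder nodes.
sampler_node = {
    "class_type": "KSampler",
    "inputs": {
        "model": ["4", 0],
        "positive": ["6", 0],
        "negative": ["7", 0],          # ignored at CFG 1 (see Fixes & Feedback)
        "latent_image": ["5", 0],
        "seed": 42,                    # fix the seed for reproducible comparisons
        "steps": 8,                    # recommended step count
        "cfg": 1.0,                    # recommended CFG
        "sampler_name": "euler",
        "scheduler": "linear_quadratic",
        "denoise": 1.0,
    },
}
```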
Dependencies
VAE:
- LTX23_audio_vae_bf16.safetensors
- LTX23_video_vae_bf16.safetensors
Dual CLIP (encoder and projection):
- gemma-3-12b-it-heretic-v2_fp8_e4m3fn.safetensors
- ltx-2.3_text_projection_bf16.safetensors
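An interrupted download leaves a truncated .safetensors file that only fails later with confusing load errors. Here is a quick sanity check, a sketch using the safetensors library rather than official tooling; the folder paths are assumptions, so point them at your own ComfyUI model directories:

```python
# Open each dependency's header with the `safetensors` library; a
# truncated or corrupt download raises here instead of mid-generation.
from safetensors import safe_open

files = [
    "ComfyUI/models/vae/LTX23_audio_vae_bf16.safetensors",           # assumed paths --
    "ComfyUI/models/vae/LTX23_video_vae_bf16.safetensors",           # adjust to your setup
    "ComfyUI/models/text_encoders/gemma-3-12b-it-heretic-v2_fp8_e4m3fn.safetensors",
    "ComfyUI/models/text_encoders/ltx-2.3_text_projection_bf16.safetensors",
]

for path in files:
    with safe_open(path, framework="pt") as f:
        print(f"{path}: {len(f.keys())} tensors, header OK")
```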
Known Issues
- Tell me!
- LTX 2.3 be LTX 2.3
- Hands are sometimes unstable
- Fine details (e.g. eyes) shift without being prompted, unless you use a really high resolution
- Needs far more runs for good results than WAN 2.2
Fixes & Feedback
- If you use LoRAs, respect the LoRA training triggers and try some versatile descriptions; most LoRAs work at a strength of 0.3-1.2 (start with 0.3).
- Do not mass-add LoRAs; just add 1 or 2.
- Negative prompting does not work with CFG 1; that is a limitation of CFG-1 speed-ups (see the sketch below this list).
- Before posting any questions, I suggest reading my guide.
- Update your ComfyUI.
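For the curious, here is why negative prompts do nothing at CFG 1: classifier-free guidance mixes the conditional and unconditional (negative-prompt) predictions at every step, and at a guidance scale of 1 the unconditional term cancels out. A minimal illustrative sketch of the standard CFG formula (not this model's actual sampling code):

```python
# Standard classifier-free guidance mix: guided = uncond + cfg * (cond - uncond)
def cfg_mix(cond: float, uncond: float, cfg: float) -> float:
    return uncond + cfg * (cond - uncond)

print(cfg_mix(cond=2.0, uncond=1.0, cfg=1.0))  # 2.0 -> the negative branch drops out
print(cfg_mix(cond=2.0, uncond=1.0, cfg=4.0))  # 5.0 -> at cfg > 1 the negative prompt matters
```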
Why I Made This
Pushing LTX 2.3 to its limits!
This checkpoint is also my personal playground.
Closing words
I want to thank all the other fantastic creators who made super nice LoRAs and concepts to play with! Support these awesome creators by using their LoRAs, posting to their galleries, and sharing the metadata!
I made all of this with permission or from open-source resources (as of the time they were incorporated).
I share as many insights as I can without compromising my work. I'm doing this for fun as a hobby and simply don't want that hobby to be destroyed.
More details can be found in the corresponding announcements!
If you would like to contribute to my checkpoint or are willing to share resources, I'll gladly give credit! Just contact me!
All credits/resources are mentioned in the announcements, since different versions may use different resources.
YOU are responsible for your outputs, as always! If you make ToS-violating content and I become aware of it, I WILL report it.
Disclaimer
These models are shared without warranty and on the condition that they are used in a lawful and responsible way. I do not support or take responsibility for illegal, harmful, or harassing uses. By downloading or using them, you accept that you are solely responsible for how they are used.
LTX-2.3 Custom Addendum: Fine-Tune Integrity & Attribution
Base License: LTX-2.3 Community License Agreement
1. Verification & Integrity Requirement
This model is a fine-tuned or merged derivative of LTX-2.3. To ensure users receive the correct weights, safety metadata, and version updates, the Official Source is maintained at: https://civarchive.com.
Notice of Non-Support: Any versions hosted on third-party platforms (mirrors) are considered "Unverified." The creator provides zero warranty, support, or safety guarantees for unverified files.
2. Trademark & Branding Restriction (Pursuant to LTX-2.3 Section 8)
While the underlying weights are subject to the LTX-2.3 distribution rights, the name "DaSiWa [Model Name]" and any associated logos or promotional imagery are the intellectual property of the creator.
Renaming Rule: Any Entity or individual redistributing or mirroring this model on a third-party platform (including but not limited to Hugging Face, Tensor.art, or SeaArt) MUST remove the original model name and branding unless explicit written permission is granted.
Source Attribution: Redistributors must provide a prominent link back to the Official Source as the primary point of origin.
3. Commercial Platform Restriction (Pursuant to LTX-2.3 Section 2)
Commercial Entities (as defined in the base license) that generate revenue through the provision of "Generation-as-a-Service" or ad-supported hosting are prohibited from using the official branding of this model to market their services without a separate agreement.
If your platform charges "credits" or subscriptions to access this specific fine-tune, you are required to contact the creator to ensure compliance with this project.
Description
- Best with I2V and V2V
- Enhanced T2V (compared to TreasureChest v1)
- Really fast generation
- Better sound
- Better voices
- Enhanced quality and reasoning
- Unrestricted
- Better prompt responsiveness
- Better understanding of anime/manga-style composition
- FP8+ mixed precision
- Reduced some hallucinations
- Strengthened visual consistency/understanding for anime
FAQ
Comments (47)
Hello! Thanks for the efforts. In the I2V workflow I can see the option to modify the CFG, but I cannot see anywhere to change the steps in ComfyUI; which node is it? I don't see much speed improvement. Any other ways to optimize?
I couldn't tag it as a resource in my video post. I would have added it manually here, but Civit currently has errors for new posts. Kudos on the small file size. I did a higher-res 15-second video in record time!
Actually, I was just using the regular distilled fp8 like a dummy when I posted this.
Update: DaSiWa v2 saved about 2 minutes over the Kijai fp8 when generating a 10-second video, nice, from about 8 minutes down to 6 minutes, maybe also because I used almost no LoRAs.
Your Omni-Forge workflows are money! Thanks!
I keep getting this error constantly: Video Combine 🎥🅥🅗🅢 Exception: An error occurred in the ffmpeg subprocess: [aac @ 0000021f2fcccac0] Input contains (near) NaN/+-Inf
Moreover, when I use a regular distilled model in your workflow, this error does not appear.
This is the VHS node that combines images into a video; the model used has no influence on it. I cannot imagine why this would happen with my model.
@darksidewalker
I also can't understand it. I tried to fix this error with the help of AI, but none of its advice helped. I used your models and Wan workflow before and everything was fine, but now I'm stuck.
@llayfar602373 Did you try my ComfyUI installer to make sure everything is set up correctly?
@darksidewalker
I didn't know about this. I haven't used your ComfyUI installer, since there were no problems before. Can you give me a link so I can get acquainted with it?
@llayfar602373 https://civitai.red/models/2364056/tool-dasiwa-comfyui-installer
@darksidewalker Thank you
1 - Is there a workflow planned for the T2V mode?
2 - Why is the FP4 version of the model 11GB? That's very small.
3 - As far as I understand, this is a fine-tuned version of the Sulphur 2 model. What dataset did you train your model on? The information is interesting.
Sorry if these are stupid questions.
1# The workflow already has this.
2# Aggressive quantization with an advanced scheduler and optimized logic.
3# v2 is made with sulphur-base - refer to the announcement for detailed info.
@darksidewalker Your recommended workflow has three nodes for frames: a start frame, a middle frame, and an end frame. How can this be used for T2V generations? I don't understand. And about this fine-tune: what data did you use to create it? Is it 2D animation, 3D animation? Does the model understand classic tags like 1girl, 1boy? I'm asking about DaSiWa LTX 2.3, not Sulphur.
@yuduz367 It's all described inside the workflow notes. You enable T2V and write your prompt.
LTX 2.3 understands natural language, not a tag-based approach (for example, "a woman in a red coat walks through the rain as the camera pans left" rather than "1girl, red coat, rain").
It's getting really good. Good work, as always!
Testing SolsticeCoin on WanGP: super fast generations at 720 and 1080 with my 5080 GPU. Great motion and audio as well, but I'm getting some horrible ghosting effects... this is with 0 LoRAs, trying both 24 and 30 fps with RIFE upscaling on and off.
How are people affording 50-series cards nowadays?
~~Some settings and inputs may degrade the model due to a bug. See my comment on the front page. I'll re-upload as soon as possible.~~ Already updated.
@darksidewalker Is the NF4 version the NVFP4?
@NUGGZ1616 Yes.
For 2 days I was posting videos on this page thinking I was using this checkpoint, but I was using the regular distilled fp8 instead, SMH. I had this checkpoint in an unused node.
Hey, my generations are taking 1-2 hours for a 10-second video at 0.65 MP - Balanced. I'm using a 4060 with 8 GB VRAM and 32 GB RAM. I don't know what else to try; any idea what the problem might be?
Lower your settings. 8 GB VRAM is low.
@darksidewalker I tried, but even 0.26 - Preview at 16 fps took 2 hours for a 5-second video. Generating tokens takes 2 hours, arriving at 71-76/256, and then it goes to the steps, which take 2 minutes.
@hectorium If you need more than a few minutes, you cannot run this with your setup. A broken Comfy install could also be an issue, but normally such long times happen because you offload to disk instead of RAM.
@darksidewalker I changed the model to Eros fp8 and now I can get 1.05 quality in 25 minutes; maybe you're right and my setup isn't enough for your model. Thank you for answering me. One last question: it looks like the model generates a similar character and doesn't use exactly my image; even the first frame is very different. Is that how it works, or am I missing something?
No one has come here to suggest the simplest answer: if anything is using your GPU by even 10-15%, generations can take HOURS instead of minutes. Comfy does not do well when other things are using the GPU. You can't even have a frigging game open idling at 3 fps, using 3 GB VRAM and only 5% utilization, or everything crawls to a slug when generating.
@MadCat2k Well, I thought it was obvious that one should run only the AI and nothing else besides it.
@hectorium I don't think this is the checkpoint I made.
Here is a quick comparison of what each run uses in resources at 0.52 MP, 24 fps, 8 s:
Sulphur-Base FP8: 54% (of 64 GB RAM), 95% (of 16 GB VRAM), ~62 s
SolsticeCoin v2 FP8: 56% (of 64 GB RAM), 95% (of 16 GB VRAM), ~40 s
SolsticeCoin v2 NVFP4: 30% (of 64 GB RAM), 95% (of 16 GB VRAM), ~45 s
They all used nearly the same amount. The NVFP4 uses less RAM but does not run well, because I do not have a 50xx card.
Anyone having issues with TextGenerateLTX2Prompt being censored? Any tips or tricks for that? My videos are chastising me for everything, lol.
All normal text encoders are censored; you would need an abliterated or heretic CLIP text encoder.
@darksidewalker Ah yeah, I'm already using a heretic CLIP text encoder, but when I set it to use the default system prompt it just gets weird. I'll try a different abliterated one.
Hey, I've tested the model quite a few times.
I'm not sure which model was used as the base, but compared to Sulphur 2, the 10eros model (which is based on Sulphur 2) gives much better motion stability and accuracy in I2V and V2V, because it's specifically tuned for that.
Also, I'd suggest not merging the Distilled LoRA in and leaving it out instead. You really need to fine-tune the Distilled LoRA strength separately at the stage-1 sampling and the stage-2 sampling (after upscale) to get good results. Steps, CFG, and Distilled LoRA strength all affect the output quite sensitively, so having it baked in takes away that control. The official Distilled LoRA also has some issues on its own.
Hope this helps with your next version.
I've been using your Wan model a lot too; thanks for working on the LTX model as well!
I hope you did not test the broken file. That would explain your experience.
Did you test the 28GB fp8?
Also wanted to mention that the whole point of a Lightspeed model is the baked-in distillation, since this saves huge amounts of VRAM when you do not have to load extra LoRAs.
Today's uploaded file was giving me an error: RuntimeError: shape '[2048, 2048]' is invalid for input of size 533183. Other fp8 models run without that error, including yesterday's version of this model.
I ran 8 tests today with the fp8... all worked fine with my workflow.
Maybe it's your workflow or an incompatible setting. Did you make sure to update ComfyUI?
My environment: ComfyUI 0.18.1 / ComfyUI_frontend v1.41.21.
v2 is running well on this setup.
The Update Comfy and Nodes .bat fixed it, thanks. Please T2V butthole in v3. LOL. Sulphur 2 doesn't do it.
Have you fixed the issue with the model yet? I want to download it, but I have one more question: can this model generate and understand furry content? It's problematic with the standard Sulphur 2 model; that model only understands where the organs are on non-standard characters and how they should interact with each other with about a 20% success rate. Also, the standard Sulphur 2 model doesn't work well with 2D content, whether text2video or image2video. Please answer these questions.
And when can we expect the next iteration of your model?
Yes, I uploaded the fixed version.
As for furry content: I did include some, but not specifically, so I cannot say in depth. I also did not test it.
The next iteration may need some time; it won't be soon, at least.
Can you make a version that works with the normal ComfyUI workflow?
Unlikely, for now.
I'm assuming most other people also won't be using it, since they don't want to download 100 custom nodes for something that's already been figured out.
@solidframegaming301 I think you are on the wrong track; there are 2 types of LTX 2.3 checkpoints: the ones with a baked VAE and the ones without.
Most custom workflows work with the ones without a baked VAE, and you can always add the VAE even in the standard LTX 2.3 workflow, so there is no need for a separate checkpoint.
And anyone who wants to make better videos will not use the standard workflows anyway.
Also, there are not 100 custom nodes; there are 6, and 3 of them are so common that they are in almost all workflows anyway.
So I'm assuming "most" AI fans will use custom workflows to achieve better results. But if not, well... I make what I would use, and I use it this way atm.