WAN 2.1 IMAGE to VIDEO with Caption and Postprocessing - Experimental

    Workflow: Image -> Autocaption (Prompt) -> WAN I2V with Upscale and Frame Interpolation and Video Extension

    • Creates video clips at up to 480p resolution (720p with the corresponding model)

    There is a Florence caption version and an LTX Prompt Enhancer (LTXPE) version. LTXPE is heavier on VRAM.

    LTX Prompt Enhancer (LTXPE) might have issues with the latest Comfy and Lightricks updates.


    MultiClip: Wan 2.1 I2V version supporting the Fusion X Lora to create clips in 8 steps and extend them up to 3 times; see the posted examples, 15-20 seconds long.

    The workflow creates a clip from the input image and extends it with up to 3 further clips/sequences. It uses a colormatch feature to keep color and lighting consistent in most cases. See the notes in the workflow for full details.
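
The colormatch step can be pictured as simple statistical color transfer. Below is a minimal, hypothetical sketch in pure Python (NOT the workflow's actual ColorMatch node): remap one channel of an extension clip's frame so its mean and spread match a reference frame from the previous clip.

```python
# Hypothetical sketch of statistical color matching (NOT the actual
# ColorMatch node): shift one channel of a new clip's frame so its
# mean and spread match a reference frame from the previous clip.
from statistics import mean, pstdev

def match_channel(target, reference):
    """Remap target values so their mean/std match the reference."""
    t_mean, t_std = mean(target), pstdev(target)
    r_mean, r_std = mean(reference), pstdev(reference)
    scale = r_std / t_std if t_std else 1.0
    # Clamp to the valid 8-bit pixel range after remapping.
    return [max(0.0, min(255.0, (v - t_mean) * scale + r_mean)) for v in target]

reference = [100.0, 120.0, 140.0, 160.0]   # channel from last frame of clip 1
target = [150.0, 170.0, 190.0, 210.0]      # same spread, but shifted brighter
matched = match_channel(target, reference) # pulled back to reference levels
```

This kind of mean/std matching is why the extension clips keep roughly the same brightness and saturation as the first clip.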

    There is a normal version that lets you use your own prompts, and a version using LTXPE for autoprompting. The normal version works well for specific or NSFW clips with Loras, while the LTXPE version is made to just drop in an image, set width/height and hit run. The clips are combined into one full video at the end.

    Update 16th of July 2025: A new Lora, "LightX2v", has been released as an alternative to the Fusion X Lora. To use it, switch the Lora in the black "Lora Loader" node. It can create great motion with only 4-6 steps: https://huggingface.co/lightx2v/Wan2.1-I2V-14B-480P-StepDistill-CfgDistill-Lightx2v/tree/main/loras

    More info/tips & help: https://civarchive.com/models/1309065/wan-21-image-to-video-with-caption-and-postprocessing?dialog=commentThread&commentId=869306


    V3.1: Wan 2.1 I2V version supporting the Fusion X Lora for fast processing

    Fusion X Lora: processes the video in just 8 steps (or fewer, see notes in the workflow). It does not have the issues of the CausVid Lora from V3.0 and does not require a color match correction.

    Fusion X Lora can be downloaded here: https://civarchive.com/models/1678575?modelVersionId=1900322 (i2V)


    V3.0: Wan 2.1 I2V version supporting the Optimal Steps Scheduler (OSS) and the CausVid Lora

    • OSS is a newer Comfy core node that allows a lower number of steps with a boost in quality: instead of 50+ steps you can get the same result with around 24 steps. https://github.com/bebebe666/OptimalSteps

    • CausVid uses a Lora to process the video in just 8-10 steps; it is fast, at a lower quality. The workflow includes a Color Match option in postprocessing to cope with the increased saturation the Lora introduces. The Lora can be downloaded here: https://huggingface.co/Kijai/WanVideo_comfy/tree/main

      (Wan21_CausVid_14B_T2V_lora_rank32.safetensors)

    • Both have a version with Florence or LTX Prompt Enhancer (LTXPE) for captioning, can use Loras, and have Teacache included.


    V2.5: Wan 2.1 Image to Video with Lora support and Skip Layer Guidance (improves motion)

    There are 2 versions: Standard with Teacache, Florence caption, upscale, frame interpolation etc., plus a version with the LTX Prompt Enhancer as an additional captioning tool (see notes for more info; requires custom nodes: https://github.com/Lightricks/ComfyUI-LTXVideo).

    For Lora use, it is recommended to switch to your own prompt with the Lora trigger phrase; complex prompts might confuse some Loras.


    V2.0: Wan 2.1 Image to Video with Teacache support for the GGUF model, speeds up generation by 30-40%

    It renders the first steps at normal speed and the remaining steps at higher speed. There is a minor impact on quality with more complex motion. You can bypass the Teacache node with Ctrl-B.
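
The caching idea behind Teacache can be sketched like this (a toy illustration only; the real node compares transformer embeddings across diffusion steps, not scalars): skip the expensive model call when a step's input barely changed from the previous computed one, and reuse the cached output.

```python
# Toy illustration of the caching idea behind Teacache (NOT the real node):
# skip the expensive model call when the step input barely changed, and
# reuse the cached output instead.

def run_steps(inputs, model, rel_threshold=0.05):
    cached_in, cached_out = None, None
    outputs, skipped = [], 0
    for x in inputs:
        if cached_in is not None and abs(x - cached_in) / abs(cached_in) < rel_threshold:
            outputs.append(cached_out)            # cheap: reuse cache
            skipped += 1
        else:
            cached_in, cached_out = x, model(x)   # expensive: recompute
            outputs.append(cached_out)
    return outputs, skipped

expensive_model = lambda x: x * x
# Inputs close to the last computed one are skipped; big jumps recompute.
outs, skipped = run_steps([1.0, 1.02, 1.03, 2.0, 2.01], expensive_model)
```

This is also why the quality impact shows up mostly with complex motion: fast-changing steps are exactly the ones where reusing a stale output is least accurate.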

    Example clips with workflow in Metadata: https://civarchive.com/posts/13777557

    Info and help with Teacache: https://civarchive.com/models/1309065/wan-21-image-to-video-with-caption-and-postprocessing?dialog=commentThread&commentId=724665


    V1.0: WAN 2.1 Image to Video with Florence caption or your own prompt, plus upscale, frame interpolation and clip extension.

    The workflow is set up to use a GGUF model.

    When generating a clip you can choose to apply upscaling and/or frame interpolation. The upscale factor depends on the upscale model used (2x or 4x, see the "load upscale model" node). Frame interpolation is set to increase the frame rate from 16 fps (model standard) to 32 fps. The result is shown in the "Video Combine Final" node on the right, while the left node shows the unprocessed clip.
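
The frame interpolation step can be illustrated with a naive sketch. The workflow uses a learned interpolator node rather than this simple average, which is only meant to show how doubling 16 fps to 32 fps inserts one new frame between each consecutive pair:

```python
# Naive sketch of the 16 -> 32 fps frame interpolation step (the workflow
# uses a learned interpolator; this simple average only shows the idea of
# inserting one in-between frame per consecutive pair).

def double_fps(frames):
    """frames: list of frames, each a flat list of pixel values."""
    out = []
    for a, b in zip(frames, frames[1:]):
        mid = [(pa + pb) / 2 for pa, pb in zip(a, b)]
        out.extend([a, mid])
    out.append(frames[-1])
    return out

clip = [[0, 0], [10, 20], [20, 40]]   # 3 frames at 16 fps
doubled = double_fps(clip)            # 5 frames at 32 fps
```

Note that doubling N frames yields 2N-1 frames, since no frame is added after the last one.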

    It is recommended to use "Toggle Link visibility" to hide the cables.


    Models can be downloaded here:

    Wan 2.1 I2V (480p): https://huggingface.co/city96/Wan2.1-I2V-14B-480P-gguf/tree/main

    Clip (fp8): https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/text_encoders

    Clip Vision: https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/clip_vision

    VAE: https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/vae


    Wan 2.1 I2V (720p): https://huggingface.co/city96/Wan2.1-I2V-14B-720P-gguf/tree/main

    Wan 2.1 Text to Video (also works): https://huggingface.co/city96/Wan2.1-T2V-14B-gguf/tree/main


    Where to save those files within your ComfyUI folder:

    Wan GGUF Model -> models/unet

    Textencoder -> models/clip

    Clipvision -> models/clip_vision

    Vae -> models/vae
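
The layout above can be created with a small helper; `prepare_model_dirs` is a hypothetical convenience function, not part of ComfyUI:

```python
# Hypothetical helper (not part of ComfyUI) that creates the model
# sub-folders listed above under a given ComfyUI install directory.
from pathlib import Path

def prepare_model_dirs(comfy_root):
    layout = {
        "wan_gguf_model": "models/unet",
        "text_encoder": "models/clip",
        "clip_vision": "models/clip_vision",
        "vae": "models/vae",
    }
    paths = {}
    for name, rel in layout.items():
        p = Path(comfy_root) / rel
        p.mkdir(parents=True, exist_ok=True)   # no-op if it already exists
        paths[name] = p
    return paths

# Example: dirs = prepare_model_dirs("/path/to/ComfyUI")
```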


    Tips:

    • Lower the framerate in the "Video Combine Final" node from 30 to 24 for a slow motion effect.

    • You can also use the Text to Video GGUF model; it works as well.

    • If the video output shows strange artifacts on the very right side of a frame, try changing the "divisible_by" parameter in the "Define Width and Height" node from 8 to 16; this may better match the standard Wan resolutions and avoid the artifacts.

    • See this thread if you face issues with the LTX Prompt Enhancer: https://civarchive.com/models/1823416?dialog=commentThread&commentId=955337

    • Last Frame: if you have trouble finding the node pack for that node: https://github.com/DoctorDiffusion/ComfyUI-MediaMixer
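
What the "divisible_by" tip does can be sketched as rounding a dimension down to the nearest multiple (a hypothetical helper, not the actual node code):

```python
# Sketch of what a "divisible_by" setting does (hypothetical helper, not
# the actual node code): round a dimension down to the nearest multiple.

def snap(value, divisible_by):
    return (value // divisible_by) * divisible_by

# Both results are valid sizes, but snapping to a multiple of 16 lines up
# better with the model's internal stride, which is what avoids the
# right-edge artifacts the tip above describes.
w8 = snap(844, 8)     # 840
w16 = snap(844, 16)   # 832
```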

    Full Video with Audio example:


    WAN 2.2. TI2V 5b GGUF Model support


    Comments (39)

    FuSoliko - Mar 11, 2025 (CivitAI)

    For NSFW, does the original photo need to contain nudity? If I prompt a photo for NSFW, it just ignores it and does something else, non-NSFW.
    Am I missing anything?

    jm112368767 - Mar 11, 2025

    Well, in general I2V is best for keeping the original content but just animating it. i.e., it's not great, nor really meant for, changing a lot of the base content. Recommend you use Flux Fill to change the source image to your NSFW liking then do I2V.

    tremolo28 (Author) - Mar 11, 2025 · 1 reaction

    If the LTX Prompt Enhancer from the experimental tab is causing issues (Error: "Expected all tensors..."), see the thread below for a solution; this might occur with <16 GB VRAM:

    https://civitai.com/models/995093?modelVersionId=1511863&dialog=commentThread&commentId=727932

    more infos: https://civitai.com/models/995093?modelVersionId=1511863&commentId=722660&dialog=commentThread

    GFrost - Mar 11, 2025

    Sometimes the result looks like slow motion. How can I fix this, or is it normal behaviour?
    P.S. Hope to see your version of Text to Video

    tremolo28 (Author) - Mar 12, 2025 · 1 reaction

    Agreed, sometimes the output looks like slow motion. I did not try it, but assume the following could help: add text to the negative prompt (e.g. "slow motion") or increase the framerate in the Final Video Combine node from 30 to maybe 40.

    Text to Video is next on my list :)

    tremolo28 (Author) - Mar 13, 2025 · 1 reaction

    Turns out the workflow works as Text to Video as well, by just using the T2V GGUF model: https://huggingface.co/city96/Wan2.1-T2V-14B-gguf/tree/main

    GFrost - Mar 13, 2025

    @tremolo28 Ha! So what should I do? Just disable the image input and use my own prompt?

    tremolo28 (Author) - Mar 13, 2025

    @GrandpaFrost Just load the T2V model and use your own prompt, or insert an image and let Florence do the job.

    GFrost - Mar 13, 2025

    @tremolo28 Hmmm, I have loaded "wan2.1-t2v-14b-Q5_K_M.gguf"
    and am getting

    "Unexpected architecture type in GGUF file, expected one of flux, sd1, sdxl, t5encoder but got 'pig'"

    For example i use "wan2.1-i2v-14b-480p-Q4_K_M.gguf" for i2v

    So which one i need to download?

    tremolo28 (Author) - Mar 13, 2025

    @GrandpaFrost Yes, this model should work: "wan2.1-t2v-14b-Q5_K_M.gguf". I used Q4, but it shouldn't matter. Anyway, you might need to have an input image in the workflow, even if you use your own prompt.

    GFrost - Mar 14, 2025 · 1 reaction

    @tremolo28 with wan2.1-t2v-14b-Q4_K_M.gguf it worked.

    GFrost - Mar 15, 2025

    I'm trying to add your workflow via the "resources" button, but recently it stopped showing up there. In some posts, however, I was able to put it there. Did you encounter this, and maybe have a solution?

    tremolo28 (Author) - Mar 15, 2025 · 1 reaction

    I load/save workflows just from the default directory

    CivitIsTooCensored - Mar 15, 2025

    Praying WAN comes to Stable. Comfy seems to require a copy of my checkpoints and LoRAs and I can't keep doubling 6.4GB.

    Modify the "extra_model_paths.yaml" file located in the ComfyUI folder and point the respective models and loras path locations to your Stable Diffusion models/loras folders. I also pointed the VAE, ControlNet models and such this way.
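
An extra_model_paths.yaml along the lines the comment describes might look like this (paths are placeholders; check the extra_model_paths.yaml.example shipped with ComfyUI for the authoritative format and key names):

```yaml
# Placeholder paths -- adjust to your own install. See the
# extra_model_paths.yaml.example shipped with ComfyUI for the full format.
a111:
    base_path: /path/to/stable-diffusion-webui/
    checkpoints: models/Stable-diffusion
    vae: models/VAE
    loras: models/Lora
    controlnet: models/ControlNet
```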

    Sniza_007522 - Mar 20, 2025 · 1 reaction

    I have my own setup that works without any problems. I tried yours and, like other extended setups, it says: "mat1 and mat2 shapes cannot be multiplied (154x768 and 4096x5120)". Can someone please advise me how to solve this problem? Thank you very much

    Sniza_007522 - Mar 21, 2025 · 1 reaction

    Sorry, my fault, I wasn't using the scaled CLIP model. But it helped me understand the system better. Thank you very much for your work.

    lecozizu385 - Jul 29, 2025

    I have the same problem, can you help me? I'm a layman on the subject.

    GFrost - Mar 21, 2025 · 1 reaction

    Today I got an error:

    DownloadAndLoadFlorence2Model

    Unrecognized configuration class <class 'transformers_modules.Florence-2-large.configuration_florence2.Florence2LanguageConfig'> for this kind of AutoModel: AutoModelForCausalLM. Model type should be one of AriaTextConfig, BambaConfig, BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BlenderbotConfig, BlenderbotSmallConfig, BloomConfig, CamembertConfig, LlamaConfig, CodeGenConfig, CohereConfig, Cohere2Config, CpmAntConfig, CTRLConfig, Data2VecTextConfig, DbrxConfig, DiffLlamaConfig, ElectraConfig, Emu3Config, ErnieConfig, FalconConfig, FalconMambaConfig, FuyuConfig, GemmaConfig, Gemma2Config, Gemma3Config, Gemma3TextConfig, GitConfig, GlmConfig, GotOcr2Config, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, GraniteConfig, GraniteMoeConfig, GraniteMoeSharedConfig, HeliumConfig, JambaConfig, JetMoeConfig, LlamaConfig, MambaConfig, Mamba2Config, MarianConfig, MBartConfig, MegaConfig, MegatronBertConfig, MistralConfig, MixtralConfig, MllamaConfig, MoshiConfig, MptConfig, MusicgenConfig, MusicgenMelodyConfig, MvpConfig, NemotronConfig, OlmoConfig, Olmo2Config, OlmoeConfig, OpenLlamaConfig, OpenAIGPTConfig, OPTConfig, PegasusConfig, PersimmonConfig, PhiConfig, Phi3Config, PhimoeConfig, PLBartConfig, ProphetNetConfig, QDQBertConfig, Qwen2Config, Qwen2MoeConfig, RecurrentGemmaConfig, ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, Speech2Text2Config, StableLmConfig, Starcoder2Config, TransfoXLConfig, TrOCRConfig, WhisperConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, XmodConfig, ZambaConfig, Zamba2Config.

    Related post:

    https://github.com/huggingface/transformers/issues/36886

    tremolo28 (Author) - Mar 22, 2025

    Cool, another update breaking custom nodes…. Anyway according to the linked thread, a fix seems to be in progress.

    CaulShivers - Mar 22, 2025

    any fix for this yet??

    GFrost - Mar 22, 2025

    @CaulShivers Apparently not; I just turn off the nodes to bypass them. There is another problem: after the recent updates, video generations are taking longer again. The Manager shows my GPU fully used, but the temperature is not rising as much as it did before.

    Quackquack - Mar 27, 2025 · 1 reaction

    I was able to get it working using the steps to replace "AutoModelForCausalLM" with "AutoModelForSeq2SeqLM" as mentioned in the comment below.
    I also tried downgrading transformers (python -m pip install --upgrade transformers==4.49.0) as they suggest, but I don't believe that actually did anything for me (I think the downgrade actually failed).
    https://github.com/kijai/ComfyUI-Florence2/issues/134#issuecomment-2745372425

    GFrost - Mar 27, 2025 · 1 reaction

    @Quackquack I use StabilityMatrix, so I used the command line from the "...\StabilityMatrix\Packages\ComfyUI\venv\Scripts" folder. It worked for me. Thanks for sharing, man.

    GFrost - Mar 30, 2025 · 1 reaction

    They fixed it.

    Lw24AI - Mar 22, 2025 · 3 reactions

    I love your workflows. I've been following your posts since the LTX models. Your workflows always work, even though I only have 12GB of video memory. )))

    tremolo28 (Author) - Mar 22, 2025 · 1 reaction

    thanks, mate

    tremolo28 (Author) - Mar 26, 2025 · 4 reactions

    Finally made my first Wan music video. It is me and the lads.

    https://youtu.be/4oq2JOp5o5w?si=my6fxAUKeSddL_Lc

    zczcg - Mar 27, 2025

    Can anyone resolve this? Missing Node Types: WanImageToVideo. I updated my ComfyUI version, but that didn't solve it.

    tremolo28 (Author) - Mar 27, 2025

    WanImageToVideo is a Comfy core node. You might be on an outdated Comfy version.

    Current version (March 27th):

    ComfyUI: v0.3.27-6-g3661c833
             (2025-03-26)
    Manager: V3.31.8

    zczcg - Mar 28, 2025

    @tremolo28 I have solved my problem, thanks a lot!

    cbm27 - Mar 28, 2025

    @zczcg How did you resolve the problem? My Manager cannot find the missing nodes.

    zczcg - Mar 28, 2025 · 1 reaction

    @cbm27 You must update your ComfyUI: run \ComfyUI_windows_portable\update\update_comfyui.bat

    EechiZero - Mar 31, 2025 · 2 reactions

    It is very good. It would be great to add Loras and SageAttention.

    3rdny467 - Mar 31, 2025 · 1 reaction

    Great workflow! I would also like to know where/how to add Loras.

    tremolo28 (Author) - Mar 31, 2025 · 1 reaction

    I might release an update this weekend for Lora support.

    For now you could add a "LoraLoaderModelOnly" node, place it after the Unet loader, and add the Lora trigger word to the Pre- or After-text node. I just tried that with the "squish" Wan Lora and it worked.

    3rdny467 - Apr 2, 2025

    Awesome, thank you! I tried something similar but it wasn't working. I'll give it a go. Do you think multiple Loras would work?

    tremolo28 (Author) - Apr 3, 2025 · 1 reaction

    @3rdny467 I think daisy-chaining the Lora loaders would work, but using 2+ video Loras might be overkill.

    Workflows
    Wan Video 14B i2v 480p

    Looks like we don't have an active mirror for this file right now.

    CivArchive is a community-maintained index — we catalog mirrors that volunteers upload to HuggingFace, torrents, and other public hosts. Looks like no one has uploaded a copy of this file yet.

    Some files do get recovered over time through contributions. If you're looking for this one, feel free to ask in Discord, or help preserve it if you have a copy.

    Details

    Downloads: 240
    Platform: CivitAI
    Platform Status: Deleted
    Created: 3/11/2025
    Updated: 4/21/2026
    Deleted: 4/13/2026

    Files

    wan21IMAGEToVIDEOWith_experimental.zip

    Mirrors
