CivArchive
    VACE CONTROLNET simple workflow WAN2.1 | GGUF | LoRA | UPSCALE - v1.0

    This workflow extracts the motion from a reference video via ControlNet (pose, canny, or depth) and generates a new video, from an image of your choice, that follows that motion.

    Resources you need:

    📁 Files:

    Recommendation:
    >24 GB VRAM: base or Q8_0
    16 GB VRAM: Q5_K_S
    <12 GB VRAM: Q4_K_S

    For the base version
    VACE Model: wan2.1_vace_14B_fp8_e4m3fn.safetensors or wan2.1_vace_1.3B_fp16.safetensors
    In models/diffusion_models

    For the GGUF version
    VACE Quant Model: Wan2.1-VACE-14B-QX_0.gguf
    In models/diffusion_models

    CLIP: umt5_xxl_fp8_e4m3fn_scaled.safetensors
    in models/clip

    VAE: wan_2.1_vae.safetensors
    in models/vae

    LoRA: Wan21_CausVid_14B_T2V_lora_rank32.safetensors
    in models/loras

    ANY upscale model:

    in models/upscale_models
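    The destination folders above can be created in one step. A minimal sketch, assuming a portable ComfyUI install rooted at ./ComfyUI (adjust the path to your install; note ComfyUI's default LoRA folder is named "loras"):

```shell
# Hypothetical root of a ComfyUI install -- change to match yours.
COMFY=./ComfyUI

# Create every model folder this workflow expects.
mkdir -p "$COMFY/models/diffusion_models" \
         "$COMFY/models/clip" \
         "$COMFY/models/vae" \
         "$COMFY/models/loras" \
         "$COMFY/models/upscale_models"

# Quick check that the layout is in place.
ls "$COMFY/models"
```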

    📦 Custom Nodes:

    Description

    base version

    FAQ

    Comments (42)

    Peticree435 · May 29, 2025
    CivitAI

    Great workflow as always. Which GGUF would work on 24 GB VRAM? Also, does VACE work with split sigmas?

    UmeAiRT
    Author
    May 29, 2025

    I've done some testing and the split sigma isn't working at the moment. I've tried to publish a working version but I still have a lot of testing to do to optimize.

    schsch · May 29, 2025
    CivitAI

    Is 8 GB VRAM + 32 GB RAM possible with Q4?

    UmeAiRT
    Author
    May 29, 2025 · 1 reaction

    This will be complicated but you should try by loading as much of the model as possible into RAM.

    Defect450 · May 29, 2025 · 1 reaction
    CivitAI

    The legend returns! I can't wait to test this one out, thank you UmeAiRT!!

    Peticree435 · Jun 2, 2025

    Literally thought I wrote this. But agreed.

    SouthernLights · May 29, 2025
    CivitAI

    Excellent work! I am delighted with this workflow and the excellent images it helps produce. Thank you for your hard work!

    CupofTea · May 29, 2025
    CivitAI

    I was waiting for this before I tried VACE. Works brilliantly and is presented amazingly as well. Thanks!

    tarajiyeon5201314927 · May 29, 2025 · 1 reaction
    CivitAI

    Thank you very much. Could you share a good workflow for the recently released CausVid and AccVid acceleration LoRAs? I have tried the ones released on the C site, but they are not easy to use. I look forward to your update.

    UmeAiRT
    Author
    May 30, 2025

    In this workflow I added the CausVid LoRA; does this method not seem easy to use?

    @UmeAiRT Thanks, I'll try it

    alain57160 · Jun 3, 2025

    Hello @UmeAiRT, I really love your workflow. Small question: have you tried something like this one: https://civitai.com/models/1622023/causvid-2-sampler-workflow-for-wan-480p720p-i2v?modelVersionId=1835720

    I read that CausVid should use different values; unfortunately I'm not as much of an expert as you in this domain.
    Maybe you could add such a feature to your different workflows.

    smolusha · May 30, 2025
    CivitAI

    Everything ends on a StringConcatenate node; the control preview works and shows the result, then nothing happens.

    Failed to validate prompt for output 398:

    * StringConcatenate 511:

    - Failed to convert an input value to a INT value: frame_c, , invalid literal for int() with base 10: ' '

    - Failed to convert an input value to a INT value: frame_b, , invalid literal for int() with base 10: ''

    - Required input is missing: text_a

    - Required input is missing: text_b

    - Failed to convert an input value to a INT value: frame_a, , invalid literal for int() with base 10: ''

    Output will be ignored

    Failed to validate prompt for output 413:

    Output will be ignored

    Prompt executed in 0.41 seconds
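    For context (this is an illustration, not part of the workflow): the "invalid literal for int()" lines above mean the node received empty or whitespace-only strings where frame numbers were expected, and Python's int() rejects those. A hypothetical safe_int helper shows the failure mode:

```python
# Why the log complains: Python's int() rejects empty or whitespace-only
# strings, which is what an unwired or blank widget feeds the node's
# frame inputs. safe_int is a hypothetical helper, not a ComfyUI API.
def safe_int(value: str, default: int = 0) -> int:
    """Fall back to a default when the widget input is blank."""
    try:
        return int(value)
    except ValueError:
        return default

print(safe_int("81"))  # 81
print(safe_int(""))    # 0 -- int('') raises "invalid literal for int()..."
print(safe_int(" "))   # 0 -- same for a lone space, as in the log
```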

    BBBAAA2 · May 30, 2025

    I was having a similar problem with another one of Ume's (excellent) workflows and my workaround was to connect the positive prompt directly to the positive encode node. I think the problem is occurring when combining the Florence auto-prompt with the regular prompt. If you aren't using the auto-prompt anyways, connecting directly won't compromise anything else.

    UmeAiRT
    Author
    May 30, 2025

    This bug is related to different versions of ComfyUI that changed the "Concatenate" node. You need the latest version of ComfyUI for this to work; alternatively, right-click the node and select "fix node".

    smolusha · May 30, 2025 · 1 reaction
    CivitAI

    Here is another error in the console:

    File "F:\AI\ComfyUI_windows_portable\ComfyUI\comfy\ldm\wan\model.py", line 244, in forward

    c = self.before_proj(c) + x

    ~~~~~~~~~~~~~~~~~~~~^~~

    RuntimeError: The size of tensor a (46620) must match the size of tensor b (47880) at non-singleton dimension 1

    smolusha · Jun 1, 2025

    An interesting fact: it only works for me when the resolution is set to 480x480; if I change any parameter, I get the error I described above.
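    A plausible explanation for the tensor-size pair in the error, inferred from the numbers using the commonly cited Wan 2.1 compression factors (x4 temporal, x8 spatial VAE, then 2x2 patchify); treat this as a back-of-the-envelope sketch, not the workflow's actual code:

```python
# Rough token count for a Wan 2.1 latent sequence, under the assumed
# compression factors: x4 temporal, x8 VAE spatial, 2x2 patchify
# (i.e. one token per 16x16 pixel patch per latent frame).
def wan_tokens(frames: int, height: int, width: int) -> int:
    latent_frames = (frames - 1) // 4 + 1
    return latent_frames * (height // 16) * (width // 16)

print(wan_tokens(81, 480, 480))  # 18900 -- a "standard" size that works

# A 16-pixel difference between the control video's resolution and the
# target resolution is enough to desynchronize the two tensors:
print(wan_tokens(81, 960, 592))  # 46620
print(wan_tokens(81, 960, 608))  # 47880 -- the exact pair in the error
```

    This is consistent with the later comments: sticking to a standard resolution keeps both tensors the same size.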

    p1042779030337 · Jun 2, 2025
    CivitAI

    Just a reminder that the CLIP can also be GGUF.

    https://huggingface.co/city96/umt5-xxl-encoder-gguf/tree/main

    williamkenji523 · Jun 3, 2025
    CivitAI

    Congratulations on your work, it's very good. I'm really enjoying it.

    jay_rich · Jun 4, 2025
    CivitAI

    Thanks again for your work! However, the output video keeps being just a brown dense fog. What settings am I missing here?
    (Using DWPose, and the ControlNet video renders fine from my input video; it is "only" the actual output video that does not show.)

    jay_rich · Jun 4, 2025

    Now the output is a yellow-orange, slow-paced video of two women's faces!? So weird... I have tried all models and all apparent settings; it doesn't change.

    greedsmith353 · Aug 1, 2025

    @jay_rich I have the same thing.

    2600angroup · Aug 4, 2025

    I also have the same problem

    busyahn · Jun 6, 2025
    CivitAI

    The dynamic range drops a lot, and the blacks get crushed. What should I adjust? Thank you as always.

    BeyondMaster · Jun 6, 2025 · 2 reactions
    CivitAI

    Excellent work!

    Question: what setting can I change to keep a higher similarity with the input image and really only get the movement from the video? The style changed too much for my taste. Thank you.

    RobertBobertson · Jul 17, 2025

    Have the same question. I ran with the default settings (CFG 1.0, combination of DWPose, Depth and Canny enabled) and kind of got something close to my input image, but it was still a different style than I intended. When I switched to just using DWPose, which is what I use in the other popular non-VACE WAN ControlNet workflow to some success, it turned my cartoony source image into a real person in the output video lol.

    If we had control over the ControlNet strength for video like we do for images, then I feel like that would allow us to get closer to the source image.

    It's been about a month since your comment though. Were you able to figure this out, either with this workflow or a different one?

    RobertBobertson · Jul 17, 2025

    So I just figured out that the CausVid_T2V lora the guide for this workflow suggests causes the realistic output I was getting. Without that, I get much closer to the original style but it could still be pushed much further.

    I'm trying to mess with the strength value in the WanVaceToVideo node to see if that helps.

    _VI_ · Jun 13, 2025
    CivitAI

    Always wanted to know: which is better, the base or the GGUF version? What are the cardinal differences between them?

    lost_moon · Jun 18, 2025 · 3 reactions

    Going by my knowledge: GGUF is compressed, loads slower, and loses some quality. Base is larger but faster in the sense that it doesn't need to be decompressed at runtime; however, it requires more VRAM and also takes somewhat longer to load into VRAM because it's bigger. GGUF might be slightly worse at everything in a direct comparison. For image GGUFs I remember Q8 being basically the default, Q5 the first with slight quality loss, and Q4 still acceptable. But, for example, on a 10 GB VRAM GPU one had to use Q4, as it was the only model below 10 GB in size. In terms of workflow, the basic one has no GGUF loader nodes included, so if you use GGUF quants you need the GGUF workflow.
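    The quality/size tradeoff described above can be illustrated with a toy symmetric quantizer. Real GGUF formats like Q8_0 or Q4_K are block-wise and more sophisticated; this sketch only shows the principle that fewer bits per weight means larger reconstruction error:

```python
import numpy as np

# Toy symmetric quantization of a weight vector: map floats to a small
# signed-integer grid and back, then measure the reconstruction error.
rng = np.random.default_rng(0)
w = rng.standard_normal(4096).astype(np.float32)

def quantize_dequantize(w: np.ndarray, bits: int) -> np.ndarray:
    qmax = 2 ** (bits - 1) - 1          # e.g. 127 for 8-bit
    scale = np.abs(w).max() / qmax      # one scale for the whole tensor
    q = np.round(w / scale).clip(-qmax, qmax)
    return (q * scale).astype(np.float32)

for bits in (8, 5, 4):                  # roughly Q8 / Q5 / Q4 territory
    err = np.abs(w - quantize_dequantize(w, bits)).mean()
    print(f"{bits}-bit mean abs error: {err:.4f}")
```

    The error grows as the bit width shrinks, which mirrors the "Q8 near-default, Q5 slight loss, Q4 still acceptable" rule of thumb.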

    _VI_ · Jun 18, 2025

    @lost_moon Thanks for the detailed answer! I couldn't understand because using both methods I get the same result in processing time.

    Psy_pmp · Jun 16, 2025
    CivitAI

    String Concatenate broken

    jfgjrty55 · Jun 18, 2025
    CivitAI

    The mxslider2d nodes just look blank; if I do fix v2 they reappear for a second, then disappear again. Anyone know how to fix this?

    ApchXi · Jun 30, 2025
    CivitAI

    Hi! Please help me with this!!!!

    KSampler

    mat1 and mat2 shapes cannot be multiplied (154x768 and 4096x5120)

    arandomuser2839 · Jul 4, 2025

    For me that usually means a different text encoder is needed. Try changing the CLIP model used.
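    For reference, the error is the basic matrix-multiplication shape rule: in A @ B, the inner dimensions must match. Reading the shapes, 768 looks like the hidden size of a CLIP-L-style encoder, while umt5-xxl outputs 4096-dim embeddings that the model's projection expects; that identification is an inference from the numbers, not something the workflow states. A sketch:

```python
import numpy as np

# A @ B requires A.shape[1] == B.shape[0].
wrong_emb = np.zeros((154, 768))    # 154 tokens from a 768-dim encoder
proj = np.zeros((4096, 5120))       # projection expecting 4096-dim input

try:
    wrong_emb @ proj                # 768 != 4096 -> fails, as in the error
except ValueError as e:
    print("shape mismatch:", e)

right_emb = np.zeros((154, 4096))   # embeddings from a 4096-dim encoder
print((right_emb @ proj).shape)     # (154, 5120)
```

    Hence the fix: load the umt5_xxl text encoder listed in the resources above rather than a 768-dim CLIP.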

    ApchXi · Jul 4, 2025

    @arandomuser2839 Thanks!!

    spanking_cutie · Aug 2, 2025
    CivitAI

    10/10

    tupu · Aug 4, 2025 · 1 reaction
    CivitAI

    Maybe it works well, but it is not simple at all.

    UmeAiRT
    Author
    Aug 4, 2025

    "Simple" is the name of my workflow series; advanced things like VACE can look complex, but you just have to import one image and one video, and everything is automatic.

    GoldenCharacters · Aug 11, 2025
    CivitAI

    Really great workflow. How can I make the generated video keep the same style as my input image? I used the auto install script and left all the settings at default, but the output still changes a lot from the original image.

    phdal · Aug 19, 2025 · 1 reaction
    CivitAI

    No matter the settings, it is either an OOM or a 'The size of tensor a (60840) must match the size of tensor b (62010) at non-singleton dimension 1' error.

    Maxed-out virtual memory, on a 24 GB VRAM card.

    Essentially, with Q8 the workflow is using upwards of 50 GB of memory for some magical reason.

    Anyone else having this?

    zamael · Dec 21, 2025

    I had the same issue, but I could resolve it by using a standard resolution like 480p or 720p, and it worked for me. (RTX 3090)

    zamael · Dec 21, 2025
    CivitAI

    It worked for me. If you have encountered an issue like "The size of tensor a (####) must match the size of tensor b (####) at non-singleton dimension 1" at the KSampler, your latent image size could be the problem.

    Try using a standard resolution such as 480p or 720p. I used the exact model/LoRA/CLIP/VAE that the OP listed and ran this workflow on an RTX 3090 (base version).

    Thank you for creating such a simple and easy workflow, OP!

    Workflows
    Wan Video 14B i2v 720p

    Details

    Downloads
    2,629
    Platform
    CivitAI
    Platform Status
    Available
    Created
    5/29/2025
    Updated
    5/13/2026
    Deleted
    -

    Files

    vaceCONTROLNETSimpleWorkflow_v10.zip

    Mirrors