CivArchive
    (NSFW) Dead-Simple MMAudio + RIFE Interpolation Setup for WAN 2.2 I2V 14B - v1.0.3

    Changelog

    Version 1.0.3: Connected both steps so no more re-uploading is required. Just upload your video in Step 1 and hit Run.

    Version 1.0.2: Changed VHS nodes to VHS ffmpeg nodes to avoid color drift (thank you LastAssignment). Also changed FPS flow from 24 to 25 to more closely align to MMAudio specs.

    Version 1.0.1: RIFE Group output was set to 8fps by accident. Changed it to 24fps

    Version 1.0: Initial release

    A TRIBUTE TO GOONERS EVERYWHERE

    Your WAN 2.2 video is great. It looks awesome. But where's the sound? We moved from images to videos, and WAN 2.2 is incredible for video. The missing piece...AUDIO!

    This is my first article ever, so I'm sorry if I made any mistakes. Please leave a comment if I've made an error or if you need any help. For your reference, I'm running:

    • ComfyUI 0.3.68

    • Torch 2.9

    • CUDA 13

    • Python 3.13.9

    • Sage Attention 2.2

    • NVIDIA 5070 Ti (16 GB VRAM)

    And here are the custom nodes (3 in total):

    • ComfyUI-VideoHelperSuite 1.7.7 (https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite)

    • ComfyUI-MMAudio Nightly (https://github.com/kijai/ComfyUI-MMAudio)

      • I recommend manually git cloning this node pack into your /ComfyUI/custom_nodes folder and then installing its requirements.txt with your embedded Python. I'm on portable Comfy, so the command looks something like this:

        • "C:\ComfyUI\python_embeded\python.exe" -m pip install -r "C:\ComfyUI\ComfyUI\custom_nodes\ComfyUI-MMAudio\requirements.txt"

    • ComfyUI-VFI Unknown (https://github.com/GACLove/ComfyUI-VFI)

      • I think there's a more popular RIFE custom node that a lot of other people use, but I couldn't figure out how to get fractional multiples for interpolation with it (16 → 25fps is a ~1.56x interpolation). This node allows it.
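    To make the "fractional multiple" concrete, here is a minimal sketch of the arithmetic. The function names are made up for illustration, and the 81-frame clip length is only an assumed example. The real node may count frames differently at the clip boundaries.

```python
from fractions import Fraction


def interpolation_factor(source_fps: int, target_fps: int) -> Fraction:
    """Exact ratio of output frames to input frames."""
    return Fraction(target_fps, source_fps)


def output_frame_count(source_frames: int, source_fps: int, target_fps: int) -> int:
    """Approximate frame count after resampling while keeping the duration fixed."""
    return round(source_frames * interpolation_factor(source_fps, target_fps))


factor = interpolation_factor(16, 25)
print(factor, float(factor))  # 25/16 1.5625

# Assumed example: an 81-frame clip at 16fps resampled to 25fps.
print(output_frame_count(81, 16, 25))
```

    Because 25/16 is not a whole number, a plain 2x or 3x interpolator can't land on 25fps directly, which is why a node with fractional support is used here.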

    Onto the workflow...

    ------------------------------------

    This workflow handles two jobs:

    1. Fix WAN 2.2’s native 16fps output by interpolating it to 25fps with RIFE.

    2. Generate synced audio with MMAudio using the final 25fps video.

    The setup is plug-and-play. Drop in your WAN video → interpolate → feed it into MMAudio → get synced output. The included notes explain the reasoning for FPS, step settings, and seed behavior.

    What this workflow covers:

    1. RIFE interpolation from 16 → 25 fps.

    2. MMAudio sampler

      1. Upon some further testing, 50-100 steps works well. The node runs pretty fast in general, and it's also worthwhile toying with CFG (4.5 - 8). 100 steps and CFG 8 works well for high-quality output and better prompt adherence.

    3. Automatic audio + video combine at 25fps.

    4. Optional re-interpolation afterward if you want 30fps+ output.

      1. You can plug your finished 25fps video into the 'Step 1: Rife Interpolation' group and just change the 'source_fps' to 25 and the 'target_fps' to 30.
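    For intuition on what a 25 → 30 re-interpolation has to do, this sketch lists where each output frame sits between two source frames. It is only timing arithmetic under the assumption of evenly spaced frames, not the RIFE node's actual implementation, and source_positions is a hypothetical helper.

```python
def source_positions(num_source_frames: int, source_fps: int, target_fps: int):
    """For each output frame, return (left_source_index, blend_fraction).

    blend_fraction 0.0 means the output frame coincides with a source frame;
    anything else is an in-between frame the interpolator has to synthesize.
    """
    last_index = num_source_frames - 1
    num_target = int(last_index * target_fps / source_fps) + 1
    positions = []
    for k in range(num_target):
        t = k * source_fps / target_fps  # position measured in source frames
        left = min(int(t), last_index)
        positions.append((left, round(t - left, 4)))
    return positions


# Six source frames at 25fps re-timed to 30fps.
for left, frac in source_positions(6, 25, 30):
    print(left, frac)
```

    With these settings only every sixth output frame lines up with an original frame, and the rest are synthesized.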

    Required MMAudio files

    Download all of these into:

    ComfyUI/models/mmaudio

    MMAudio NSFW Model (fine-tuned off the base model)

    https://huggingface.co/phazei/NSFW_MMaudio/resolve/main/mmaudio_large_44k_nsfw_gold_8.5k_final_fp16.safetensors?download=true

    MMAudio VAE (fp16)

    https://huggingface.co/Kijai/MMAudio_safetensors/resolve/5984623e6b436818c6ff287ef6eec93e3e05aa3f/mmaudio_vae_44k_fp16.safetensors

    MMAudio Synchformer (fp16)

    https://huggingface.co/Kijai/MMAudio_safetensors/resolve/main/mmaudio_synchformer_fp16.safetensors

    MMAudio CLIP Encoder (fp16)

    https://huggingface.co/Kijai/MMAudio_safetensors/resolve/main/apple_DFN5B-CLIP-ViT-H-14-384_fp16.safetensors


    Nvidia BigVGAN v2 44kHz 128band 512x

    This seems to be required for MMAudio to work. You can manually download all the files, git clone the repo, or use the Hugging Face CLI (for example, huggingface-cli download nvidia/bigvgan_v2_44khz_128band_512x --local-dir followed by your target folder). The repo should end up as a folder named bigvgan_v2_44khz_128band_512x inside ComfyUI/models/mmaudio.

    https://huggingface.co/nvidia/bigvgan_v2_44khz_128band_512x
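    If you want to double-check the downloads, here is a small sketch that looks for the four files and the BigVGAN folder listed above. The file names are taken from the download URLs. The relative path ComfyUI/models/mmaudio is an assumption, so adjust it to your install.

```python
from pathlib import Path

# File names as they appear in the download links above.
REQUIRED_FILES = [
    "mmaudio_large_44k_nsfw_gold_8.5k_final_fp16.safetensors",
    "mmaudio_vae_44k_fp16.safetensors",
    "mmaudio_synchformer_fp16.safetensors",
    "apple_DFN5B-CLIP-ViT-H-14-384_fp16.safetensors",
]
REQUIRED_DIRS = ["bigvgan_v2_44khz_128band_512x"]


def missing_mmaudio_assets(mmaudio_dir):
    """Return the names of required files/folders not found in mmaudio_dir."""
    root = Path(mmaudio_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in REQUIRED_DIRS if not (root / name).is_dir()]
    return missing


if __name__ == "__main__":
    # Assumed location; change this to where your ComfyUI lives.
    for name in missing_mmaudio_assets("ComfyUI/models/mmaudio"):
        print("missing:", name)
```

    An empty result only means the names are present. It does not verify that the downloads are complete.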

    Bonus

    Once you've created a good MMAudio track, there are some further steps you can take depending on what you'd like to create.

    1. Import your audio/video into some type of software (CapCut/Shotcut) and layer on some music in the background. I've done this with a few of my videos. I added a 'radio' filter to make it seem like the music was kinda tinny and playing in the background.

    2. Layer other audio tracks alongside the NSFW audio track. You can see KaptainSisay very elegantly did something like that here (https://civarchive.com/images/110700679)


    Comments (85)

    seedilyDec 12, 2025
    CivitAI

    Thank you for this, it's great! Working well on a portable install, takes around 20-30s to generate a 7s audio clip.

    Do you have any advice for improving the audio quality once I get some that i like? Like, is there a way to essentially "upscale" the audio after generation?

    Melodic_Possible_582589Dec 12, 2025· 2 reactions
    CivitAI

    automatic synched audio generated? that's some wan 2.5 level stuff :)

    9832676Dec 12, 2025· 2 reactions
    CivitAI

    i added a fast group bypasser switch, and put RIFE in its own group so it didn't redo RIFE every time i ran mmaudio

    9832676Dec 12, 2025
    CivitAI

    it would be nice if you could add an NSFW Qwen video describer and then feed it exactly what you want extracted for MMAudio, like "explain the video in a very concise way". I always have problems finding one that reads NSFW content.

    msiaigensDec 13, 2025
    CivitAI

    anyone tested this with 8gb vram?

    FangNightDec 13, 2025· 2 reactions
    CivitAI

    any prompt examples?

    usarigDec 19, 2025
    CivitAI

    I don't get it. Somehow the workflow isn't fully connected. The image input on "Video Combine" is flashing. Where does it go? I've only just started with videos :)

    sih9312Dec 22, 2025
    CivitAI

    lol i literally cannot find the output. any idea where it could be? it's not where it's supposed to be...

    daspin335Dec 26, 2025· 4 reactions
    CivitAI

    got the error:

    **error solved**

    error(s) in loading state_dict for MMAudio: Missing key(s) in state_dict: "clip_input_proj.2.w1.weight", "clip_input_proj.2.w2.weight", "clip_input_proj.2.w3.weight", "text_input_proj.2.w1.weight", "text_input_proj.2.w2.weight", "text_input_proj.2.w3.weight". Unexpected key(s) in state_dict: "clip_input_proj.1.w1.weight", "clip_input_proj.1.w2.weight", "clip_input_proj.1.w3.weight", "text_input_proj.1.w1.weight", "text_input_proj.1.w2.weight", "text_input_proj.1.w3.weight". size mismatch for t_embed.mlp.0.weight: copying a param with shape torch.Size([896, 256]) from checkpoint, the shape in current model is torch.Size([896, 896])

    The cause was outdated nodes (check the nodes list tab). Update the nodes and double-check the model name.
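    The error above is a mismatch between the layer names the node code expects and the ones stored in the checkpoint. Here is a minimal sketch of that comparison in plain Python, using a few of the key names from the message. diff_state_dict_keys is a hypothetical helper, not part of PyTorch or MMAudio.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Report keys the model wants but the file lacks, and vice versa."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    return {
        "missing": sorted(model_keys - checkpoint_keys),
        "unexpected": sorted(checkpoint_keys - model_keys),
    }


# Key names copied from the error message above (shortened list).
model_keys = ["clip_input_proj.2.w1.weight", "text_input_proj.2.w1.weight"]
checkpoint_keys = ["clip_input_proj.1.w1.weight", "text_input_proj.1.w1.weight"]

report = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", report["missing"])
print("unexpected:", report["unexpected"])
```

    When the same layers show up under a different index on each side (.2. versus .1. here), the code and the checkpoint were built for different layouts, which fits the fix of updating the nodes and checking the model name.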

    _xxxBigMemerxxx_Dec 30, 2025· 5 reactions
    CivitAI

    Even a braindead Degen like myself got this to work. You're an absolute legend m8

    oldthrashbarJan 2, 2026
    CivitAI

    nice but has anyone had any luck prompting out the moaning and chinese speaking?

    ParadantozJan 4, 2026
    CivitAI

    Bro, how do I make the video 60 fps? At 60 fps the sound is really bad and doesn't match the video, but at 25-30 fps everything is fine. Could you help or tell me what's wrong?

    texaspartygirlJan 7, 2026· 1 reaction
    CivitAI

    this looks great. Is it possible to always get the same voice? Or maybe that requires training with a dataset with just the one voice?

    PlasmadoseJan 17, 2026
    CivitAI

    This is so simple and quick to use that I am kicking myself for not using it sooner.

    ShortLoveJan 17, 2026· 5 reactions
    CivitAI

    Perfect workflow. A tip I just discovered: type just "music" in the negative prompt and you will get a clearer result more often

    AlexMeowlerJan 18, 2026
    CivitAI

    Not sure I get it, but why does the final Video Combine node produce 3 files: the original video, an image (last frame?), and the video with sound attached? Is there a way to save only the final video with sound?

    orveronJan 18, 2026
    CivitAI

    Please fix the output location to export into default "output" folders, rather than dumping to the appdata\local\temp\etc. directories.

    SeoulSeeker
    Author
    Jan 19, 2026· 1 reaction

    All you need to do is enable the 'save_output' toggle on the final Video Combine node. Then the output video will save into /output

    HermeusMoronJan 23, 2026
    CivitAI

    the RIFEInterpolation custom node seems to be missing for me, and I can't seem to find it on the web to download separately. I have found "RIFE VFI" nodes, but I don't think that's the same thing as what's needed for this workflow.

    SeoulSeeker
    Author
    Jan 24, 2026

    https://github.com/GACLove/ComfyUI-VFI

    git clone this repo into your /custom_nodes folder

    HermeusMoronJan 26, 2026

    @SeoulSeeker thanks!

    nikecoy884925Jan 24, 2026
    CivitAI

    I got this error:
    'audio_input_proj.0.bias'

    SeoulSeeker
    Author
    Jan 24, 2026

    DM me, I need more information to help

    kaizercowan699133Jan 28, 2026
    CivitAI

    so i dont have ComfyUI/models/mmaudio ? So where do i put the audio files

    SeoulSeeker
    Author
    Jan 28, 2026

    just make a new folder in /models called mmaudio and put all the files there!

    Billyhank800Jan 31, 2026
    CivitAI

    hello sir, everything was working perfect just the other day, but now its not working. i think there was a node update. (im new) i would sincerley appreciate a workflow update. thank you for your hard work as always

    SeoulSeeker
    Author
    Jan 31, 2026

    try updating ComfyUI to the latest version

    serjrenarJan 31, 2026
    CivitAI

    It feels like the model recognizes 2.5D, 3D, and realistic input images much better than pure 2D ones. What do you think?

    SeoulSeeker
    Author
    Feb 1, 2026

    Probably, I wouldn't reasonably expect the training set to include a lot of 2D content, mostly real-life videos if anything

    DelavestraFeb 3, 2026
    CivitAI

    So I use the smoothing WF to create my wan2.2 animations. Does anyone have a WF that incorporates mmaudio smoothly into a full gen workflow, rather than a standalone WF to run?

    SeoulSeeker
    Author
    Feb 3, 2026

    You can just copy the content of the workflow and add it after your VAE decode. That way your decoded images get passed into the input of the MMAudio sampler

    DelavestraFeb 7, 2026

    @SeoulSeeker Cool ty. Would the prompt for the wan animation ever be adequate for the prompt for the audio? Or will that always require a manual/independent prompt?

    SeoulSeeker
    Author
    Feb 7, 2026

    @Delavestra best practice would probably be an independent prompt I’d say. You really want to provide a varied and complete spectrum of what would be heard in the audio. I use two wildcard nodes for oral sex and regular sex, I might just update the workflow description with that info because it works pretty well and basically eliminates the need to type a prompt

    elmejor9369Feb 7, 2026· 3 reactions
    CivitAI

    It keeps giving me this error: MMAudioFeatureUtilsLoader

    BigVGAN._from_pretrained() missing 2 required keyword-only arguments: 'proxies' and 'resume_download'

    Any tip?

    SeoulSeeker
    Author
    Feb 7, 2026

    Do you have a folder called 'bigvgan_v2_44khz_128band_512x' in /models/mmaudio?

    elmejor9369Feb 7, 2026

    @SeoulSeeker yes

    dolphinblowgal382Feb 8, 2026· 1 reaction

    I'm seeing the same issue. This is new. This workflow worked for me about a month ago.

    I also have nvidia/bigvgan_v2_44khz_128band_512x

    Fixed by cloning https://github.com/kijai/ComfyUI-MMAudio in custom_nodes

    elmejor9369Feb 8, 2026

    @dolphinblowgal382 Still having the issue after cloning https://github.com/kijai/ComfyUI-MMAudio in custom_nodes

    SeoulSeeker
    Author
    Feb 8, 2026

    @elmejor9369 a newer comment from a person with the same problem just said 'yah i had to manually clone the MMaudio repo not install from manager'

    elmejor9369Feb 9, 2026

    @SeoulSeeker nope, still nothing. I've cloned it manually and it still gives the same error.

    SeoulSeeker
    Author
    Feb 9, 2026

    @elmejor9369 are you on a new-ish Comfy version? Try updating to the latest stable version

    migero731Feb 8, 2026· 1 reaction
    CivitAI

    i keep getting
    BigVGAN._from_pretrained() missing 2 required keyword-only arguments: 'proxies' and 'resume_download'
    i did download all 4 models clip vae and such

    migero731Feb 8, 2026· 1 reaction

    yah i had to manually clone the MMaudio repo not install from manager

    SeoulSeeker
    Author
    Feb 8, 2026

    @migero731 thank you for replying with your fix!

    migero731Feb 8, 2026

    @SeoulSeeker np nothing like learning from your mistakes :D

    btw is 100 samples ok ? it seems to produce bad sounds 300 seems better to me

    SeoulSeeker
    Author
    Feb 8, 2026

    @migero731 it all depends on the seed and your prompt, ideally higher steps should be better but there are often diminishing returns and sometimes it can overcook the result. i usually do 50-100 steps myself

    TohnoAkihaFeb 14, 2026
    CivitAI

    Can I change the voice, or is it only randomly selected via a seed?

    SeoulSeeker
    Author
    Feb 17, 2026

    Keeping a fixed seed should help a little, but I haven't found a way to reliably control the voice

    TheSemenDemonFeb 17, 2026
    CivitAI

    i created the mmaudio folder and placed 4 files in there. however the nodes are still missing? did i do something wrong?

    mpayne1985106Feb 17, 2026

    Me too. Reinstalling via the custom nodes manager gives the same error.

    You need to install all missing nodes from the 'ComfyUI custom nodes manager'.

    mpayne1985106Feb 17, 2026· 1 reaction

    OK, I finally fixed it with this command.

    1. Open the 'python_embeded' folder. Ex: " C:\ComfyUI\ComfyUI-Easy-Install\python_embeded "

    2. Type cmd in the File Explorer address bar at the top and press Enter. Ex: replace " C:\ComfyUI\ComfyUI-Easy-Install\python_embeded " with " cmd ".

    3. Paste this command. Ex: "python.exe -m pip install -r C:\ComfyUI\ComfyUI-Easy-Install\ComfyUI\custom_nodes\comfyui-mmaudio\requirements.txt"

    Dracken1986Feb 17, 2026
    CivitAI

    Answer: BigVGAN._from_pretrained() missing 2 required keyword-only arguments: 'proxies' and 'resume_download'

    OK, so for anyone really new to this like me who doesn't understand what people might consider basic (clone), here is a fix for the above error.

    Open your Comfy directory and go into custom_nodes (if you're using Pinokio like me: C:\pinokio\api\comfy.git\app\custom_nodes). Find and delete "ComfyUI-MMAudio" (if it's not there, skip this step).

    Then right-click anywhere inside the custom_nodes window and select "Open in Terminal" to open a command prompt. Type in "git clone https://github.com/kijai/ComfyUI-MMAudio"

    (this should then install). It worked fine for me after this.

    If you're still having issues, check that the requirements are installed:

    "pip install -r requirements.txt" within the ComfyUI-MMAudio folder.

    Hope this helps (sorry if this isn't helpful, This error been driving me nuts).

    hazzy00584Feb 21, 2026

    i had the same issue. i had to go into bigvgan.py and copy-paste the code provided on github, as there is a fix for that. use that code. use gemini to find the fix, it knows which link to give you to change the code in bigvgan.py. now it works fine for me
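    The error text in this thread is an ordinary Python TypeError: something called BigVGAN._from_pretrained() without two keyword-only arguments that the method declares as required. Here is a minimal sketch of that failure mode using hypothetical stand-in classes (OldStyleModel, TolerantModel and load_with_newer_caller are not part of MMAudio or huggingface_hub). Which library versions actually trigger it is an open question in this thread, so treat the "newer caller" as an assumption.

```python
class OldStyleModel:
    """Stand-in for a model class that requires two legacy keyword arguments."""

    @classmethod
    def _from_pretrained(cls, *, model_id, proxies, resume_download):
        return cls()


class TolerantModel:
    """Same idea, but the legacy arguments are optional and extras are ignored."""

    @classmethod
    def _from_pretrained(cls, *, model_id, proxies=None, resume_download=None, **_ignored):
        return cls()


def load_with_newer_caller(model_cls, model_id):
    """Hypothetical caller that no longer forwards the two legacy arguments."""
    return model_cls._from_pretrained(model_id=model_id)


try:
    load_with_newer_caller(OldStyleModel, "bigvgan_v2_44khz_128band_512x")
except TypeError as exc:
    message = str(exc)
    print(message)

# The tolerant signature loads without complaint.
loaded = load_with_newer_caller(TolerantModel, "bigvgan_v2_44khz_128band_512x")
```

    This would explain why re-cloning the node pack or replacing bigvgan.py helped several people: both swap in code whose signature matches what the caller sends.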

    MuIeiFeb 19, 2026
    CivitAI

    im having issue on step 2 mmaudio

    free_upper_bound + pytorch_used_bytes[device] <= device_total INTERNAL ASSERT FAILED at "C:\\actions-runner\\_work\\pytorch\\pytorch\\pytorch\\c10\\cuda\\CUDAMallocAsyncAllocator.cpp":563, please report a bug to PyTorch.

    SeoulSeeker
    Author
    Feb 19, 2026· 1 reaction

    looks like an OOM error to me. try clearing your model/execution cache and run it again with other programs closed

    stenny13street654Feb 21, 2026· 1 reaction

    You just uploaded a video that was too long.

    MuIeiFeb 23, 2026

    @stenny13street654 yes, i tried with a 5 second video and worked

    TribalbladeFeb 23, 2026
    CivitAI

    Works fairly well! I wonder how hard it is to fine-tune the mmaudio model some more, feels like it could be quite a bit better. Thanks for uploading and explaining

    kosenMar 9, 2026· 1 reaction
    CivitAI

    an error MMAudioModelLoader

    Error(s) in loading state_dict for MMAudio: Missing key(s) in state_dict: "clip_input_proj.2.w1.weight", "clip_input_proj.2.w2.weight", "clip_input_proj.2.w3.weight", "text_input_proj.2.w1.weight", "text_input_proj.2.w2.weight", "text_input_proj.2.w3.weight". Unexpected key(s) in state_dict: "clip_input_proj.1.w1.weight", "clip_input_proj.1.w2.weight", "clip_input_proj.1.w3.weight", "text_input_proj.1.w1.weight", "text_input_proj.1.w2.weight", "text_input_proj.1.w3.weight". size mismatch for t_embed.mlp.0.weight: copying a param with shape torch.Size([896, 256]) from checkpoint, the shape in current model is torch.Size([896, 896]).

    Same error here, I had to install official model

    kosenMar 22, 2026

    @la440soundproject752 I have already solved this problem. It's an issue with the plugin environment. You can send this error report to DeepSeek or other AI, and it will guide you step by step on how to install the environment

    Kazuki_kunMar 18, 2026
    CivitAI

    Is there a way to run it with 8GB VRAM? If so, how should I config it?

    SeoulSeeker
    Author
    Mar 30, 2026

    You can just give it a try. I can't say whether it will work, but worth a shot

    OtherworldsMar 22, 2026
    CivitAI

    Can someone tell me what caused this error? "An error happened while trying to locate the files on the Hub and we cannot find the appropriate snapshot folder for the specified revision on the local disk. Please check your internet connection and try again."

    ## Stack Trace
    ```
    File "D:\ComfyUI-aki-v2\ComfyUI-aki-v2\ComfyUI\execution.py", line 515, in execute
      output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
    File "D:\ComfyUI-aki-v2\ComfyUI-aki-v2\ComfyUI\execution.py", line 329, in get_output_data
      return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
    File "D:\ComfyUI-aki-v2\ComfyUI-aki-v2\ComfyUI\execution.py", line 303, in _async_map_node_over_list
      await process_inputs(input_dict, i)
    File "D:\ComfyUI-aki-v2\ComfyUI-aki-v2\ComfyUI\execution.py", line 291, in process_inputs
      result = f(**inputs)
    File "D:\ComfyUI-aki-v2\ComfyUI-aki-v2\ComfyUI\custom_nodes\ComfyUI-MMAudio-main\nodes.py", line 243, in loadmodel
      snapshot_download(
    File "D:\ComfyUI-aki-v2\ComfyUI-aki-v2\python\Lib\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
      return fn(*args, **kwargs)
    File "D:\ComfyUI-aki-v2\ComfyUI-aki-v2\python\Lib\site-packages\huggingface_hub\_snapshot_download.py", line 248, in snapshot_download
      raise LocalEntryNotFoundError(
    ```

    ODBsGoonerExtravaganzaMar 24, 2026· 1 reaction
    CivitAI

    Lip syncing is virtually non-existent with the fine-tune. Moans are driven almost entirely by body motion. There's still plenty to work with (and it is fun), but it severely limits its usefulness

    balrogk877Mar 26, 2026· 1 reaction
    CivitAI

    Works great. Thanks for sharing !

    aznableMiaoMar 28, 2026
    CivitAI

    What prompt did you use in the MMAudio sampler node? The prompt shown on the demo video is for WAN, I think.

    SeoulSeeker
    Author
    Mar 30, 2026· 1 reaction

    I use a wildcard node with two different prompt blocks for oral sex and regular sex. These can easily be modified with an LLM like Grok

    Oral sex:

    Generate intimate, rhythmic sounds: {wet skin slapping|throbbing flesh slapping|slick body impacts} at a {slow sensual pace|steady moderate pace|building fast pace}, syncing with thrusts, {sensual female orgasmic moans that rise and fall|breathy female gasps building to cries|playful female whimpers turning to moans} with {rising intensity|teasing build-up|explosive peaks}, faint {squelching with each movement|wet friction sounds|slick glide noises}. Add {breathy whispers of dirty talk|soft bed creaks under motion|fabric rustle against skin}. Keep it close-mic’d and natural, with {subtle echo in room|raw unfiltered intimacy|layered breathy reverb}.

    Regular sex:

    Generate intimate oral sex sounds: {wet slurping on flesh|sloppy sucking noises|deepthroat glide sounds} at a {slow teasing pace|steady rhythmic pace|building intense pace}, syncing with bobbing motion, {sensual female moans muffled around shaft|breathy female hums building to gasps|playful female gagging turning to whimpers} with {rising intensity|teasing build-up|explosive peaks}, prominent {saliva drooling and dripping|wet suction pops|throat gurgles and licks}. Add {breathy dirty talk encouragements|soft hand stroking sounds|fabric rustle or knee shifts}. Keep it close-mic’d and natural, with {subtle room ambiance|raw wet intimacy|layered breathy reverb}.
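    The {a|b|c} blocks in these prompts are wildcard choices: each run picks one option per block. Here is a minimal sketch of that expansion, using a neutral example template. expand_wildcards is a hypothetical helper that only handles flat, un-nested blocks, and the wildcard node you use may support more syntax than this.

```python
import random
import re

# Matches one {option|option|...} block with no nested braces.
_CHOICE = re.compile(r"\{([^{}]*)\}")


def expand_wildcards(template, rng=None):
    """Replace every {a|b|c} block with one randomly chosen option."""
    if rng is None:
        rng = random.Random()

    def pick(match):
        return rng.choice(match.group(1).split("|"))

    return _CHOICE.sub(pick, template)


template = "{rain|wind} at a {slow|fast} pace, with {subtle|strong} reverb"
print(expand_wildcards(template, random.Random(0)))  # fixed seed, repeatable pick
print(expand_wildcards(template))                    # different pick each run
```

    Passing a seeded random.Random makes the choice repeatable, which is handy when you also keep the sampler seed fixed.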

    aznableMiaoMar 31, 2026

    @SeoulSeeker thanks! I will try it

    VektorApr 2, 2026
    CivitAI

    I'm getting:

    RIFEInterpolation
    module 'torch' has no attribute 'nullcontext'

    Any ideas? Thanks in advance.

    SeoulSeeker
    Author
    Apr 3, 2026

    Looks like your PyTorch might be out of date. You might need a fresh Comfy install, or update PyTorch individually (might cause problems)

    VektorApr 5, 2026

    But I just installed Comfy, and this workflow was working fine last week.

    TezozomoctliApr 6, 2026
    CivitAI

    This workflow used to work for me about a month ago, but when I tried using it now a bunch of nodes no longer worked (MMAudioLoader, FeatureUtilLoader). I tried updating in the manager settings and rolling back to previous versions, but that didn't seem to work. I think this issue occurred after the major ComfyUI update.

    WeazeApr 10, 2026· 2 reactions

    Update ComfyUI to v0.18.5. I recommend cloning/copying your old installation first. Then navigate to the comfyui/user folder and search for config.ini. Edit the security level in config.ini to: = weak. Save the file so you can install the MMAudio-Suite, otherwise ComfyUI blocks the installation. First install ComfyUI-MMAudio from Kijai. You'll notice the nodes don't show up. Then install ComfyUI-MMAudio-Suite from Takenoko3333 on top. Close ComfyUI and change the security level back to: = normal. Hope this helps m8.

    TezozomoctliApr 10, 2026

    @Weaze Changing the security level worked! Thank you so much!
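    If you'd rather script the security-level toggle than edit the file by hand, here is a rough sketch using Python's configparser. It assumes the setting is a key named security_level inside a [default] section, which is a guess about the file layout. The config path in the usage comment is a placeholder for whichever config.ini you found in the step above. Back the file up first, since configparser drops comments when it rewrites a file.

```python
import configparser


def set_security_level(config_path, level):
    """Set security_level in the [default] section and return the old value."""
    parser = configparser.ConfigParser()
    parser.read(config_path, encoding="utf-8")
    if not parser.has_section("default"):  # assumed section name
        parser.add_section("default")
    previous = parser.get("default", "security_level", fallback=None)
    parser.set("default", "security_level", level)
    with open(config_path, "w", encoding="utf-8") as fh:
        parser.write(fh)
    return previous


# Usage (placeholder path, adjust to your install):
# old = set_security_level(r"path\to\config.ini", "weak")
# ... install the node packs, close ComfyUI ...
# set_security_level(r"path\to\config.ini", old or "normal")
```

    Returning the previous value makes it easy to restore the original setting afterwards.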

    RavagedCherryApr 7, 2026
    CivitAI

    Is it possible to only get the sounds of the bodies interacting? I don't want any voices, moaning etc.

    SeoulSeeker
    Author
    Apr 7, 2026

    You can try putting voices/moaning/etc. in the negative prompt, though I think the dataset for the NSFW finetune is pretty heavily biased towards vocalization

    OrrodrarverApr 7, 2026
    CivitAI

    Hey, MMAudio only generates sound effects if I'm right. Is there any chance to make the characters actually speak what I wrote in the prompt?

    SeoulSeeker
    Author
    Apr 7, 2026

    Not with this workflow/model. You'd probably want to try LTX 2.3 for that

    HmNikeApr 10, 2026· 4 reactions
    CivitAI

    If some of you want a very cool trick with mmaudio I used on some of my video

    First you will need editing software. Maybe you can do it in ComfyUI, but I don't know how.

    Because MMAudio generates sounds synced to motion, most of the time the model gets pretty confused if you have too much going on or your subject/action is too small or too far from the "lens".

    So the trick is to edit your video: crop it to the part you want and zoom in so there is only one single, clear motion, face, expression, etc. for the model to understand.

    Then put this edited video into the MMAudio gen. Most of the time the result is way better like that.

    And then of course you will need to combine the original video with the new audio track. It's a bit of extra work, but I prefer that to playing the "please give me a good seed" game
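    If you want to work out the crop numerically instead of eyeballing it in an editor, here is a small sketch that computes a zoomed crop rectangle around a point of interest and formats it in the w:h:x:y order used by ffmpeg's crop filter. crop_box and to_ffmpeg_crop are hypothetical helpers, and the 1280x720 frame size and the center point are example values only.

```python
def crop_box(frame_w, frame_h, center_x, center_y, zoom):
    """Return (x, y, w, h) of a crop that zooms in by `zoom` around a point.

    Width and height are rounded down to even numbers, and the box is
    shifted as needed so it never leaves the frame.
    """
    if zoom < 1:
        raise ValueError("zoom must be >= 1")
    crop_w = int(frame_w / zoom) // 2 * 2
    crop_h = int(frame_h / zoom) // 2 * 2
    x = min(max(center_x - crop_w // 2, 0), frame_w - crop_w)
    y = min(max(center_y - crop_h // 2, 0), frame_h - crop_h)
    return x, y, crop_w, crop_h


def to_ffmpeg_crop(box):
    """Format a crop box as an ffmpeg crop filter argument (w:h:x:y)."""
    x, y, w, h = box
    return f"crop={w}:{h}:{x}:{y}"


box = crop_box(1280, 720, center_x=900, center_y=300, zoom=2)
print(box)                  # (580, 120, 640, 360)
print(to_ffmpeg_crop(box))  # crop=640:360:580:120
```

    The cropped clip is only used to generate the audio. The resulting track then goes back onto the original, uncropped video.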

    SeoulSeeker
    Author
    Apr 14, 2026

    Great idea, thank you for the comment!

    HmNikeApr 14, 2026· 1 reaction

    @SeoulSeeker you're welcome. I've posted a workflow with this idea that uses an Ultralytics detector to crop the face and generate from only that part, if you want to check it out.

    lTWASNTMEApr 17, 2026
    CivitAI

    For some reason, though I installed everything, it still can't find these nodes:

    - MMAudioFeatureUtilsLoader

    - MMAudioModelLoader

    - MMAudioSampler

    Any ideas why that would be?

    AIai88Apr 27, 2026

    Transformers are limited to version 4.x (e.g., 4.45.0 or 4.46.0). Try changing it.
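    To see which versions you actually have before changing anything, here is a short sketch that reads installed package versions with the standard library. Run it with the same Python that ComfyUI uses (for portable installs, the embedded one). The package list is just a guess at what is relevant here, and installed_versions and major_version are hypothetical helper names.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def major_version(version):
    """Return the leading integer of a version string such as '4.46.0'."""
    return int(version.split(".")[0])


if __name__ == "__main__":
    found = installed_versions(["transformers", "huggingface_hub", "torch"])
    for name, version in found.items():
        print(name, version or "not installed")
    if found["transformers"] and major_version(found["transformers"]) != 4:
        print("transformers is not a 4.x release")
```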

    Workflows
    Wan Video 2.2 I2V-A14B

    Details

    Downloads
    12,945
    Platform
    CivitAI
    Platform Status
    Available
    Created
    12/12/2025
    Updated
    6/14/2026
    Deleted
    -

    Files

    NSFWDeadSimpleMmaudioRIFE_v103.zip
