Latest: v4.20, not to be confused with v4.2 (I just don't want to go past 4.2, sorry lol). Added Any Node to duplicate the input image. Not much of a change, but this is now a single-image version of wav2lip: one image is all you need for the wav2lip video, thanks to Any Node. There are other ways to do it, but this was a test and it worked, so why not.
Best Version - v4.2 - Added wav2lip to the workflow, since someone had a working node going. Now you can load a video, have whisper create the voice for your assistant, and end up with a voiced, animated avatar. Fun workflow, and piecing it together was awesome lol: crafting LLM profiles, plugging different connectors into places they didn't belong. Play Sound (loop) is the only node that gives me good audio inside the workflow. You have to convert its path to an input, then connect the wav output from whisper to that input. This makes the voice play so you can hear it.
The workflow then sends the audio, along with the video you uploaded, to wav2lip to create the final video. The node works, but it's not as good as Forge and Auto yet. The creator of the node did an awesome job nonetheless. It can only go up from here, and here's not bad lol.
older-------------
Ran voice through wav2lip and sadtalker for some fun and pulled an old character I made to be the face of Darwin lol.
Added more groups and notes explaining things a bit.
AI Assistant through text or voice.
Inpainting and outpainting
SVD, Cascade, AnimateDiff
Spritesheet maker
Text-to-voice generation using .ogg audio files as training data for IF Whisper to Speech (vocals-only audio; 3 minutes seems to be fine, but 10, as in their example, seems to be better)
Added my layering group node setup
Voice assistant test workflow. Trying out some nodes for my Rosebud AI workflow. Darwin is a custom personality, so not included.
Needs Ollama installed and running
Impact Frames or IF nodes make this possible
WIP
Description
Added Wav2Lip
Comments (4)
Thanks for this great workflow.
How do I download speakers, like darwin.ogg?
You just need to put a 10-minute vocal file in the audio folder. It requires .ogg format, but it's easy to convert mp3 to ogg, and there are tons of free apps that do it. The format is used a lot in indie game development. 10 minutes is best, but I've gotten decent results from around 3 minutes of vocals that I separated from Udio and Suno songs I made.
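If you'd rather script the mp3-to-ogg conversion than use an app, here is a minimal Python sketch that calls ffmpeg. It assumes ffmpeg is installed, on your PATH, and built with the libvorbis encoder. The `audio` folder name and both function names are placeholders for illustration, so point it at wherever your node actually reads its speaker files. Whether the node wants Vorbis specifically inside the .ogg container is also an assumption here.

```python
import subprocess
from pathlib import Path


def build_ogg_command(src: str, dst_dir: str = "audio") -> list[str]:
    """Build an ffmpeg command that converts `src` to Ogg Vorbis inside `dst_dir`.

    `dst_dir` is a placeholder; use the folder your speaker files live in.
    """
    out = Path(dst_dir) / (Path(src).stem + ".ogg")
    # -y overwrites an existing file, -vn drops any video/cover-art stream,
    # -c:a libvorbis selects the Vorbis audio encoder.
    return ["ffmpeg", "-y", "-i", src, "-vn", "-c:a", "libvorbis", str(out)]


def convert_to_ogg(src: str, dst_dir: str = "audio") -> Path:
    """Run the conversion and return the path of the new .ogg file."""
    cmd = build_ogg_command(src, dst_dir)
    Path(dst_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(cmd, check=True)  # raises CalledProcessError if ffmpeg fails
    return Path(cmd[-1])
```

Calling `convert_to_ogg("darwin.mp3")` would write `audio/darwin.ogg`, given the assumptions above.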
Thank you for making an interesting workflow.
I was able to create a video with voice, but the character's mouth didn't move.
The picture output of Any Node (local LLM) is the same 19 pictures. Is that correct?
Yeah, it needs Ollama installed and running to use Any Node with your LLMs. If you'd like an easier go, you can use a video input instead. I wanted to incorporate Any Node and test it out. Plus, I didn't have a video on hand at the time, so it worked out lol.
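A quick way to check that Ollama is actually up before running the workflow is to ask its local API for the installed models. This is only a sketch: it assumes Ollama is listening on its default address, `http://localhost:11434`, and the function name `list_ollama_models` is made up for this example. Change the URL if you run Ollama on a different host or port.

```python
import json
import urllib.error
import urllib.request


def list_ollama_models(base_url: str = "http://localhost:11434", timeout: float = 2.0):
    """Return the names of locally installed models, or None if Ollama is unreachable."""
    try:
        # /api/tags is the Ollama endpoint that lists local models.
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            data = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON reply.
        return None
    return [m.get("name", "") for m in data.get("models", [])]
```

If this returns `None`, start Ollama first. If it returns an empty list, Ollama is running but no model has been pulled yet, so the LLM nodes will have nothing to talk to.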
