CivArchive
    WAN2.2 5B - First-Last-Frame to Video (FLF2V) - v2.0

    Unlock the power of 14B-style animation control on the 5B model! This workflow uses a custom node to perform First-Last-Frame conditioning, letting you precisely control where your video begins and ends.

    Breakthrough Accessibility: This workflow brings a flagship feature from the demanding WAN2.2 14B model to the efficient and popular 5B variant. By leveraging the powerful Wan22FirstLastFrameToVideoLatent custom node, you can now define both the starting frame AND the ending frame of your generated video, giving you unprecedented control over the animation's narrative arc.

    🎯 Precision-Guided Video Generation:

    • First-Last-Frame (FLF) Conditioning: The core innovation. You provide two images:

      1. Start Image: The first frame of your video.

      2. End Image: The desired final frame.
        The model then generates a coherent video that smoothly transitions between these two defined points.

    • Unmatched Creative Control: This is far more powerful than standard image-to-video. Want a flower to bloom? A character to turn and smile? An object to transform? Define the start and end states, and let the AI handle the complex in-between motion.

    • Perfect for the 5B Model: Enjoy this advanced capability without the massive VRAM requirements of the 14B model. This makes high-control animation accessible to a much wider audience.

    • Simple & Efficient: The workflow is clean, focused, and designed for simplicity. Load your two images, set your prompt, and generate. It's perfect for both experimentation and production.

    ⚙️ How It Works:

    1. Input: You load two key images: the initial state and the target state.

    2. Encoding: The custom node Wan22FirstLastFrameToVideoLatent encodes both images into the model's latent space alongside your text prompt.

    3. Generation: The KSampler uses this combined information to generate a video latent that respects both the starting condition and the desired outcome.

    4. Output: Decode the latent into a video file that seamlessly animates from your first frame to your last.
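    Conceptually, the FLF step amounts to building a video latent in which the first and last frames are pinned to the VAE-encoded input images, with a mask telling the sampler which frames are fixed. The sketch below illustrates that idea only; it is not the actual code of the Wan22FirstLastFrameToVideoLatent node, and the shapes and names are invented for illustration:

```python
import numpy as np

def make_flf_latent(start_lat, end_lat, num_frames, rng=None):
    """Sketch of first-last-frame conditioning on a video latent.

    start_lat / end_lat: (C, H, W) latents from VAE-encoding the two images.
    Returns a seed latent plus a per-frame mask (1.0 = frame is fixed).
    """
    rng = rng or np.random.default_rng(0)
    C, H, W = start_lat.shape

    # Middle frames start as noise; the sampler will fill them in.
    latent = rng.standard_normal((num_frames, C, H, W)).astype(np.float32)
    mask = np.zeros(num_frames, dtype=np.float32)

    latent[0] = start_lat      # pin the first frame to the start image
    latent[-1] = end_lat       # pin the last frame to the end image
    mask[0] = mask[-1] = 1.0   # mark these frames as conditioned/fixed

    return latent, mask

# Example: a 16-frame latent from two dummy 4x8x8 image latents
start = np.zeros((4, 8, 8), np.float32)
end = np.ones((4, 8, 8), np.float32)
lat, m = make_flf_latent(start, end, 16)
```

    The real node hands an equivalent latent-plus-mask pair to the KSampler, which denoises only the unmasked middle frames, producing the in-between motion.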

    ✨ Key Features:

    • Custom Node Power: Utilizes stduhpf/ComfyUI--Wan22FirstLastFrameToVideoLatent to enable this advanced feature.

    • Full 5B Compatibility: Works with wan2.2_ti2v_5B_fp16.safetensors and standard WAN components (umt5 CLIP, wan2.2_vae).

    • Complete Pipeline: Includes everything from model loading to final MP4 video output with the Video Helper Suite node.

    • Flexible: A note node in the workflow reminds you that you can use the full (non-quantized) GPU models if you have the VRAM, making it adaptable to different hardware setups.

    🎨 Ideal For:

    • Storyboarding: Plan and visualize scenes with exact beginning and ending frames.

    • Controlled Transformations: Create precise morphing, growth, rotation, or state-change animations.

    • Artistic Exploration: Experiment with having the AI solve the "how" of getting from point A to point B.

    • Users who want the creative control of the 14B model but need the practicality of the 5B model.

    ⚠️ Requirements:

    • Custom Node: You must install the Wan22FirstLastFrameToVideoLatent node from its repository (e.g., stduhpf/ComfyUI--Wan22FirstLastFrameToVideoLatent).

    • Standard WAN Dependencies: The usual suspects: ComfyUI-VideoHelperSuite and the required WAN2.2 5B model files.
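    If ComfyUI Manager can't install the custom node, the usual fallback is cloning it into ComfyUI's custom_nodes folder. A sketch, assuming the repository URL matches the name given above and that your ComfyUI path differs:

```shell
# Adjust the path to your own ComfyUI installation.
cd /path/to/ComfyUI/custom_nodes
git clone https://github.com/stduhpf/ComfyUI--Wan22FirstLastFrameToVideoLatent.git
git clone https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite.git
pip install -r ComfyUI-VideoHelperSuite/requirements.txt
# Restart ComfyUI afterwards so the new nodes are registered.
```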

    This workflow is a game-changer for WAN2.2 5B users. It dramatically expands the creative possibilities of the model, moving it from a simple text-to-video tool to a powerful keyframe-based animation system.

    Stop just generating videos. Start directing them. Download the workflow and define your story's beginning and end.

    Description

    Updated workflow: now using Wan2.2-Fun-5B-InP-GGUF instead of the regular Wan2.2_ti2v_5B model.


    Comments (6)

    blobby99 · Sep 4, 2025 · 3 reactions

    14B does NOT have massive VRAM requirements. Can people please stop repeating this lie? No video model does or can. Instead you need (very cheap) system RAM to hold/cache the model so it can be streamed, once per iteration, across your PCIe bus (and, being a LINEAR data structure, it is used as it is streamed, so it needs only a tiny amount of VRAM regardless of size).

    civitai3463 · Sep 29, 2025

    I agree. Unfortunately the available tools like ComfyUI don't do a great job advertising how to activate this functionality.

    paco02 · Sep 4, 2025 · 1 reaction

    Works really well on an RTX 4070 with 16 GB VRAM.

    Thanks for sharing this!

    zardozai (Author) · Sep 4, 2025

    You're welcome, buddy! I'm glad you enjoyed it!

    maxtusrordey512 · Sep 5, 2025

    Missing node types: GGUF loader, GGUF CLIP loader, GGUF VAE loader. Where can I find these nodes? ComfyUI Manager won't install them. I have already downloaded and installed Wan22FirstLastFrameToVideoLatent, and I have ComfyUI Video Helper Suite installed. Where can I find those custom nodes? Why do I get this error?

    GrimmHelltrap · Sep 13, 2025

    I replaced them with the regular Load CLIP, Load VAE, and UNET Loader (GGUF) nodes. The workflow processed, but the output I got was nothing like either image I had loaded, so this might not be a solution. I've only tried it once, but on my 8 GB VRAM card it took 40 minutes to generate a clip that was nothing like my input. Further testing is required before I can confirm.

    Workflows
    Wan Video 2.2 TI2V-5B

    Details

    Downloads
    1,346
    Platform
    CivitAI
    Platform Status
    Available
    Created
    9/3/2025
    Updated
    5/13/2026
    Deleted
    -

    Files

    wan225BFirstLastFrameTo_v20.zip

    Mirrors