Video nodes
Generate AI video clips, combine them into sequences, and upscale the result. From a text prompt to broadcast-ready output.
Video nodes let you generate, combine, and enhance video content directly inside your Spaces workflow. The Video Generator supports 40+ AI models from Kling, MiniMax, Runway, Google Veo, Sora, PixVerse, Seedance, Wan, Luma, and more. Once you have clips, the Video Combiner merges them into longer sequences, and the Video Upscaler enhances resolution and quality.
In this article
- Video Generator
- Video Generator models
- Key model capabilities
- Video Combiner
- Video Upscaler
- Common workflows
- Coming soon
- Tips and best practices
Video Generator
The Video Generator creates AI video clips from text prompts and optional visual references. With 40+ models you can generate everything from cinematic scenes to animated illustrations, lip-synced characters, and motion-controlled sequences.
How to use it
Add the node
Search for Video Generator in Spotlight. You can also search for a specific model like Kling 3.0 or Veo 3 to add a pre-configured node.
Write your prompt
Describe the scene you want to generate. Some models can work without a prompt when a Start Frame image is provided.
Choose a model
Click the model selector to browse 40+ options. Hover over any model for a tooltip showing features, duration, and credit cost.
Set your options
Pick an aspect ratio, duration (2s to 10s depending on model), resolution (720p or 1080p), and number of generations (1 to 10).
Connect references (optional)
Connect a Start Frame image to animate a still, an End Frame for keyframe interpolation, or reference images to guide style and composition.
Run the node
Results appear on the card. Use the arrow buttons to browse multiple generations.
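Spaces is a visual editor, so there is no scripting involved in the steps above. Still, the settings and their documented limits can be sketched as a small validation routine. Everything here (field names, the `validate_generator_settings` helper) is illustrative, not a real Spaces API:

```python
# Hypothetical sketch of a Video Generator configuration.
# Field names and limits mirror the settings described in this
# article; none of this is a real Spaces API.

ALLOWED_RESOLUTIONS = {"720p", "1080p"}

def validate_generator_settings(settings: dict) -> dict:
    """Check a settings dict against the documented limits."""
    duration = settings.get("duration_s", 5)
    if not 2 <= duration <= 10:
        raise ValueError("duration must be between 2s and 10s")
    generations = settings.get("generations", 1)
    if not 1 <= generations <= 10:
        raise ValueError("generations must be between 1 and 10")
    if settings.get("resolution", "720p") not in ALLOWED_RESOLUTIONS:
        raise ValueError("resolution must be 720p or 1080p")
    return settings

settings = validate_generator_settings({
    "prompt": "A slow dolly shot through a neon-lit alley at night",
    "model": "Kling 3.0",
    "aspect_ratio": "16:9",
    "duration_s": 5,
    "resolution": "1080p",
    "generations": 2,
})
```

Exact limits vary by model (the tooltip in the model selector shows each model's durations and resolutions), so treat the ranges above as the common case rather than hard rules.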
Settings
| Setting | What it does |
|---|---|
| Prompt | Text description of the scene. Supports @mentions to reference other nodes. |
| Model | Which AI model to use. 40+ options from multiple providers. |
| Aspect Ratio | 1:1, 16:9, 9:16, 4:3, 3:4, and more. Varies by model. |
| Duration | Length in seconds. Common options: 2s, 4s, 5s, 8s, 10s. |
| Resolution | 720p or 1080p (model-dependent). |
| Sound Effects | Toggle to include AI-generated audio that matches the video content (model-dependent). |
| Number of Generations | 1 to 10 clips per run. |
Input and output
| Direction | Port | Data type | Notes |
|---|---|---|---|
| Input | Prompt | Text | Scene description |
| Input | Start Frame | Image | First frame for image-to-video |
| Input | End Frame | Image | Last frame for keyframe interpolation |
| Input | References | Image (multiple) | Guide visual style, character, product, or composition |
| Input | Video Reference | Video | Existing video as motion guide. Availability depends on model. |
| Input | Audio | Audio | For lip-sync models |
| Output | Output | Video | The generated video clip |
| Output | Start Frame | Image | First frame of the generated video |
| Output | End Frame | Image | Last frame of the generated video |
| Output | Audio Output | Audio | Audio track (if model supports sound) |
Use cases
- Text-to-video. Type a prompt, pick a model, generate. The simplest starting point.
- Image-to-video. Connect an Image Generator output to the Start Frame port to animate a still image.
- Keyframe interpolation. Set both a Start Frame and an End Frame. The AI generates the transition between them.
- With voiceover. Generate a video, then combine it with a Voiceover track using Video Audio Mix.
- Storyboard to video. Create multiple clips from scene descriptions and assemble them with Video Combiner.
Video Generator models
The Video Generator supports models from 13+ providers. Each model appears in Spotlight as a variant (for example, search "Kling 3.0" to add a pre-configured node).
| Provider | Known for |
|---|---|
| Kling | Consistent quality, motion control, wide range of options |
| MiniMax | Fast generation, live illustration style |
| Runway | High-quality cinematic output |
| Google (Veo) | Photorealism, prompt adherence, sound generation |
| OpenAI (Sora) | Creative output, strong prompt-following |
| PixVerse | Versatile generation, good value |
| ByteDance (Seedance) | Rich reference system, lip-sync, person detection |
| Wan | Anime and illustration focus |
| Luma | Dreamy, artistic visual style |
| Hunyuan | Chinese art style, versatile output |
| LTX | Fast budget-friendly option |
| xAI (Grok) | Fast, creative generation |
| VEED (Fabric) | Fast, practical video creation |
Key model capabilities
Not all models support all features. Here is what to look for depending on what you need.
| Capability | What it does | Supported by |
|---|---|---|
| Start Frame | Animate a still image as the first frame | Most models |
| End Frame | Set a target last frame for keyframe interpolation | Select Kling, Wan, and Seedance variants |
| Character references | Guide generation with a character image | Act Two, Wan Animate, Seedance, Veo 3.1, Kling 3.0 |
| Product references | Guide generation with a product image | Seedance, Veo 3.1, Kling 3.0 |
| Style references | Transfer visual style from a reference | Seedance 2.0, Veo 3.1, Kling 3.0, MiniMax Reference |
| Motion control | Preset camera movements (pan, zoom, orbit, tilt) | Kling Motion Control 2.6, Kling Motion Control 3.0 |
| Sound effects | AI-generated audio matching the video content | Veo 3 and others |
| Lip-sync | Synchronize character mouth with audio input | Omni Human 1.5, Kling Motion Control variants |
| Negative prompt | Specify elements to exclude from generation | Select models; check the model tooltip in the selector |
| Multishot | Generate multi-shot sequences from a single prompt | Select models; check the model tooltip in the selector |
Video Combiner
The Video Combiner merges multiple video clips into a single continuous video. Connect multiple video sources and arrange their order to create longer sequences.
How to use it
Add the node
Search for Video Combiner in Spotlight.
Connect your clips
Connect multiple video outputs to the Videos input port. Order matters.
Arrange the sequence
Use the Video Order control to drag and reorder clips. The View Mode setting lets you switch between timeline and list views for managing your clip order.
Run the node
The combined video appears on the card.
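Conceptually, the Combiner concatenates its inputs in the order you arrange them. A minimal sketch of that behavior, using a made-up `Clip` type (this is a model of the node's behavior, not Spaces code):

```python
# Hypothetical model of the Video Combiner: merge clips in list
# order into one longer clip. The Clip type is illustrative.
from dataclasses import dataclass

@dataclass
class Clip:
    name: str
    duration_s: float

def combine(clips: list[Clip]) -> Clip:
    """Merge clips in list order; order determines playback order."""
    if not clips:
        raise ValueError("connect at least one video input")
    total = sum(c.duration_s for c in clips)
    name = " + ".join(c.name for c in clips)
    return Clip(name=name, duration_s=total)

combined = combine([
    Clip("scene-1", 5.0),
    Clip("scene-2", 8.0),
    Clip("scene-3", 5.0),
])
```

Reordering the list (like dragging clips in the Video Order control) changes the output sequence but not its total length.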
Input and output
| Direction | Port | Data type | Notes |
|---|---|---|---|
| Input | Videos | Video (multiple) | Clips to combine. Order matters. |
| Output | Output | Video | The combined video |
| Output | Start Frame | Image | First frame of the combined video |
| Output | End Frame | Image | Last frame of the combined video |
Use cases
- Story sequence. Generate multiple scenes with the Video Generator, then assemble them into a complete narrative.
- Music video. Combine a series of generated clips into one continuous video, then add a soundtrack with Video Audio Mix.
- Smooth transitions. Connect the End Frame output of one Video Generator to the Start Frame input of the next, then combine everything for smooth scene-to-scene flow.
Video Upscaler
The Video Upscaler enhances video resolution and quality using AI. Three modes cover different needs.
| Mode | Powered by | Best for | Speed |
|---|---|---|---|
| Topaz | Topaz | Professional video. Resolution control, frame interpolation for smoother playback. | Moderate |
| Magnific AI | Magnific | Creative enhancement. Three presets: Realistic, Animation 3D, Artistic. Optional FPS boost. | Slower |
| Sharpen | SeedVR2 | Quick quality boost. Fast resolution enhancement without creative changes. | Fast |
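The mode choice reduces to matching your need to the table above. A hypothetical helper (the function is illustrative; only the mode names come from this article):

```python
# Hypothetical mapping of need -> documented Video Upscaler mode.
# Mode names match the table above; the helper itself is not
# part of Spaces.

def pick_upscaler_mode(need: str) -> str:
    modes = {
        "professional": "Topaz",    # resolution control, frame interpolation
        "creative": "Magnific AI",  # Realistic / Animation 3D / Artistic
        "quick": "Sharpen",         # fast boost powered by SeedVR2
    }
    try:
        return modes[need]
    except KeyError:
        raise ValueError(f"unknown need: {need!r}") from None
```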
How to use it
Add the node
Search for Video Upscaler in Spotlight.
Connect a video
Connect any video output to the Input Video port.
Choose a mode
Topaz for professional output, Magnific AI for creative enhancement, Sharpen for a quick fix.
Configure settings
Set the target resolution and any mode-specific options like FPS boost or preset.
Run the node
The upscaled video appears on the card.
Use cases
- Post-generation enhancement. Take a 720p generated video to broadcast-ready 1080p with Topaz.
- Creative video art. Transform a generated video with the Magnific AI Artistic preset.
- Quick fix. Sharpen and clean up any video fast with the Sharpen mode.
Common workflows
- Text to video. Text node with your prompt connected to Video Generator. The simplest starting point.
- Image to video. Image Generator connected to Video Generator Start Frame port. Animate any still image.
- Generate and upscale. Video Generator at 720p, then Video Upscaler to 1080p for the final output.
- Multi-scene production. Multiple Video Generators (one per scene) connected to a Video Combiner for a complete sequence.
- Full audiovisual pipeline. Video Generator plus Voiceover plus Music Generator, all combined with Video Audio Mix.
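The full audiovisual pipeline is easiest to see as a node graph. A sketch representing it as a list of connections (node names are from this article; the graph representation is illustrative):

```python
# Hypothetical sketch of the full audiovisual pipeline as
# (source, destination) node connections.

edges = [
    ("Text", "Video Generator"),           # prompt in
    ("Video Generator", "Video Audio Mix"),
    ("Voiceover", "Video Audio Mix"),
    ("Music Generator", "Video Audio Mix"),
]

def downstream_of(node: str) -> set[str]:
    """Nodes directly fed by `node`."""
    return {dst for src, dst in edges if src == node}
```

All three media sources converge on Video Audio Mix, which produces the final output.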
Tips and best practices
Connect an Image Generator to Start Frame for image-to-video. This is one of the most popular workflows. Generate a still image first, then animate it.
Use both Start Frame and End Frame for smooth transitions. The AI generates the video between your two keyframes, giving you precise control over the beginning and end.
Enable Sound Effects on models that support it. Veo 3 and others can generate matching audio automatically, saving you the step of adding sound separately.
Match the model to the style. Kling for consistent quality, MiniMax for speed, Veo for photorealism, Wan for anime, Luma for dreamy artistic looks.
Use Run Downstream when changing only the video model. It skips re-generating upstream nodes like prompts and images, saving time and credits.
Connect End Frame outputs to the next scene's Start Frame input. When building multi-scene videos with the Combiner, this creates smooth transitions between clips.
Topaz is the professional choice for upscaling. Use it for final delivery. Magnific AI is better for creative experimentation. Sharpen is fastest when you just need a quick boost.
For lip-sync, use Omni Human 1.5 or Kling Motion Control. Connect a Voiceover audio output to the Video Generator Audio input on these models.
Can't find an answer to your question?
Our support team is here to help you with any questions or issues.
Submit a request