
    Video nodes

    Generate AI video clips, combine them into sequences, and upscale the result. From a text prompt to broadcast-ready output.

    Video nodes let you generate, combine, and enhance video content directly inside your Spaces workflow. The Video Generator supports 40+ AI models from Kling, MiniMax, Runway, Google Veo, Sora, PixVerse, Seedance, Wan, Luma, and more. Once you have clips, the Video Combiner merges them into longer sequences, and the Video Upscaler enhances resolution and quality.

    Video Generator

    The Video Generator creates AI video clips from text prompts and optional visual references. With 40+ models you can generate everything from cinematic scenes to animated illustrations, lip-synced characters, and motion-controlled sequences.

    How to use it

    1. Add the node. Search for Video Generator in Spotlight. You can also search for a specific model, such as Kling 3.0 or Veo 3, to add a pre-configured node.

    2. Write your prompt. Describe the scene you want to generate. Some models can work without a prompt when a Start Frame image is provided.

    3. Choose a model. Click the model selector to browse the 40+ options. Hover over any model for a tooltip showing its features, supported durations, and credit cost.

    4. Set your options. Pick an aspect ratio, duration (2s to 10s, depending on the model), resolution (720p or 1080p), and number of generations (1 to 10).

    5. Connect references (optional). Connect a Start Frame image to animate a still, an End Frame for keyframe interpolation, or reference images to guide style and composition.

    6. Run the node. Results appear on the card. Use the arrow buttons to browse multiple generations.

    Settings

    Setting | What it does
    Prompt | Text description of the scene. Supports @mentions to reference other nodes.
    Model | Which AI model to use. 40+ options from multiple providers.
    Aspect Ratio | 1:1, 16:9, 9:16, 4:3, 3:4, and more. Varies by model.
    Duration | Length in seconds. Common options: 2s, 4s, 5s, 8s, 10s.
    Resolution | 720p or 1080p (model-dependent).
    Sound Effects | Toggle to include AI-generated audio that matches the video content (model-dependent).
    Number of Generations | 1 to 10 clips per run.
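
The constraints in the settings table can be summarized as a small validation sketch. This is illustrative only: the field names (`prompt`, `start_frame`, `duration`, and so on) are assumptions for the example, not Freepik's actual schema, and allowed values vary by model.

```python
# Illustrative validation of a Video Generator configuration.
# Field names and value sets are assumptions based on the settings table.

ALLOWED_DURATIONS = {2, 4, 5, 8, 10}      # seconds; actual options vary by model
ALLOWED_RESOLUTIONS = {"720p", "1080p"}

def validate_settings(settings: dict) -> list[str]:
    """Return a list of problems with a configuration (empty if it looks valid)."""
    problems = []
    # Some models can run from a Start Frame alone, so either input is acceptable.
    if not settings.get("prompt") and not settings.get("start_frame"):
        problems.append("need a prompt or a Start Frame image")
    if settings.get("duration") not in ALLOWED_DURATIONS:
        problems.append("duration must be one of 2, 4, 5, 8, or 10 seconds")
    if settings.get("resolution") not in ALLOWED_RESOLUTIONS:
        problems.append("resolution must be 720p or 1080p")
    if not 1 <= settings.get("num_generations", 1) <= 10:
        problems.append("number of generations must be between 1 and 10")
    return problems

settings = {
    "prompt": "A slow dolly shot through a neon-lit street at night",
    "model": "kling-3.0",
    "aspect_ratio": "16:9",
    "duration": 5,
    "resolution": "1080p",
    "num_generations": 2,
}
print(validate_settings(settings))  # → []
```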

    Input and output

    Direction | Port | Data type | Notes
    Input | Prompt | Text | Scene description
    Input | Start Frame | Image | First frame for image-to-video
    Input | End Frame | Image | Last frame for keyframe interpolation
    Input | References | Image (multiple) | Guide visual style, character, product, or composition
    Input | Video Reference | Video | Existing video as motion guide. Availability depends on model.
    Input | Audio | Audio | For lip-sync models
    Output | Output | Video | The generated video clip
    Output | Start Frame | Image | First frame of the generated video
    Output | End Frame | Image | Last frame of the generated video
    Output | Audio Output | Audio | Audio track (if the model supports sound)

    Use cases

    • Text-to-video. Type a prompt, pick a model, generate. The simplest starting point.
    • Image-to-video. Connect an Image Generator output to the Start Frame port to animate a still image.
    • Keyframe interpolation. Set both a Start Frame and an End Frame. The AI generates the transition between them.
    • With voiceover. Generate a video, then combine it with a Voiceover track using Video Audio Mix.
    • Storyboard to video. Create multiple clips from scene descriptions and assemble them with Video Combiner.

    Video Generator models

    The Video Generator supports models from 13+ providers. Each model appears in Spotlight as a variant (for example, search "Kling 3.0" to add a pre-configured node).

    Provider | Known for
    Kling | Consistent quality, motion control, wide range of options
    MiniMax | Fast generation, live illustration style
    Runway | High-quality cinematic output
    Google (Veo) | Photorealism, prompt adherence, sound generation
    OpenAI (Sora) | Creative output, strong prompt-following
    PixVerse | Versatile generation, good value
    ByteDance (Seedance) | Rich reference system, lip-sync, person detection
    Wan | Anime and illustration focus
    Luma | Dreamy, artistic visual style
    Hunyuan | Chinese art style, versatile output
    LTX | Fast, budget-friendly option
    xAI (Grok) | Fast, creative generation
    VEED (Fabric) | Fast, practical video creation

    The model list is dynamic and updated frequently. Check the model selector in Spaces for the latest options.

    Key model capabilities

    Not all models support all features. Here is what to look for depending on what you need.

    Capability | What it does | Supported by
    Start Frame | Animate a still image as the first frame | Most models
    End Frame | Set a target last frame for keyframe interpolation | Select Kling, Wan, and Seedance variants
    Character references | Guide generation with a character image | Act Two, Wan Animate, Seedance, Veo 3.1, Kling 3.0
    Product references | Guide generation with a product image | Seedance, Veo 3.1, Kling 3.0
    Style references | Transfer visual style from a reference | Seedance 2.0, Veo 3.1, Kling 3.0, MiniMax Reference
    Motion control | Preset camera movements (pan, zoom, orbit, tilt) | Kling Motion Control 2.6, Kling Motion Control 3.0
    Sound effects | AI-generated audio matching the video content | Veo 3 and others
    Lip-sync | Synchronize character mouth movement with audio input | Omni Human 1.5, Kling Motion Control variants
    Negative prompt | Specify elements to exclude from generation | Varies by model; check the model tooltip
    Multishot | Generate multi-shot sequences from a single prompt | Varies by model; check the model tooltip
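
If you script around your workflows, the capability table can be encoded as plain data and queried. The mapping below mirrors the table's rows but is illustrative only; model support changes frequently, so the model selector in Spaces is the source of truth.

```python
# The capability table encoded as data, so a helper can answer
# "does model X support feature Y?". Lists mirror the article and
# are illustrative snapshots, not a live registry.

CAPABILITIES = {
    "character_references": ["Act Two", "Wan Animate", "Seedance", "Veo 3.1", "Kling 3.0"],
    "product_references": ["Seedance", "Veo 3.1", "Kling 3.0"],
    "style_references": ["Seedance 2.0", "Veo 3.1", "Kling 3.0", "MiniMax Reference"],
    "motion_control": ["Kling Motion Control 2.6", "Kling Motion Control 3.0"],
    "sound_effects": ["Veo 3"],
    "lip_sync": ["Omni Human 1.5", "Kling Motion Control variants"],
}

def supports(feature: str, model: str) -> bool:
    """True if `model` appears in the (illustrative) support list for `feature`."""
    return model in CAPABILITIES.get(feature, [])

print(supports("lip_sync", "Omni Human 1.5"))  # → True
```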

    Video Combiner

    The Video Combiner merges multiple video clips into a single continuous video. Connect multiple video sources and arrange their order to create longer sequences.

    How to use it

    1. Add the node. Search for Video Combiner in Spotlight.

    2. Connect your clips. Connect multiple video outputs to the Videos input port. Order matters.

    3. Arrange the sequence. Use the Video Order control to drag and reorder clips. The View Mode setting switches between timeline and list views for managing clip order.

    4. Run the node. The combined video appears on the card.

    Input and output

    Direction | Port | Data type | Notes
    Input | Videos | Video (multiple) | Clips to combine. Order matters.
    Output | Output | Video | The combined video
    Output | Start Frame | Image | First frame of the combined video
    Output | End Frame | Image | Last frame of the combined video

    Use cases

    • Story sequence. Generate multiple scenes with the Video Generator, then assemble them into a complete narrative.
    • Music video. Combine a series of generated clips into a seamless video.
    • Smooth transitions. Connect the End Frame output of one Video Generator to the Start Frame input of the next, then combine everything for smooth scene-to-scene flow.
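
The smooth-transitions pattern (each scene starting from the previous scene's last frame) can be sketched in a few lines. The classes and return values below are stand-ins for the actual nodes; only the port names (Start Frame, End Frame) come from the article.

```python
# Illustrative sketch of the End Frame → Start Frame chaining workflow.
# VideoGenerator here is a toy stand-in, not a real Freepik API.

class VideoGenerator:
    def __init__(self, prompt, start_frame=None):
        self.prompt = prompt
        self.start_frame = start_frame  # previous scene's End Frame, if any

    def run(self):
        # Stand-in for a real generation call: return a fake clip handle
        # plus its last frame, mirroring the node's End Frame output port.
        clip = f"clip({self.prompt!r})"
        end_frame = f"last_frame_of({clip})"
        return clip, end_frame

def generate_sequence(prompts):
    """Chain generators so scene N starts where scene N-1 ended."""
    clips, prev_end = [], None
    for prompt in prompts:
        clip, prev_end = VideoGenerator(prompt, start_frame=prev_end).run()
        clips.append(clip)
    return clips  # pass these to the Video Combiner in order; order matters

clips = generate_sequence(["opening shot", "chase scene", "finale"])
print(len(clips))  # → 3
```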

    Video Upscaler

    The Video Upscaler enhances video resolution and quality using AI. Three modes cover different needs.

    Mode | Powered by | Best for | Speed
    Topaz | Topaz | Professional video. Resolution control, frame interpolation for smoother playback. | Moderate
    Magnific AI | Magnific | Creative enhancement. Three presets: Realistic, Animation 3D, Artistic. Optional FPS boost. | Slower
    Sharpen | SeedVR2 | Quick quality boost. Fast resolution enhancement without creative changes. | Fast

    How to use it

    1. Add the node. Search for Video Upscaler in Spotlight.

    2. Connect a video. Connect any video output to the Input Video port.

    3. Choose a mode. Topaz for professional output, Magnific AI for creative enhancement, Sharpen for a quick fix.

    4. Configure settings. Set the target resolution and any mode-specific options, such as FPS boost or preset.

    5. Run the node. The upscaled video appears on the card.

    Use cases

    • Post-generation enhancement. Take a 720p generated video to broadcast-ready 1080p with Topaz.
    • Creative video art. Transform a generated video with the Magnific AI Artistic preset.
    • Quick fix. Sharpen and clean up any video fast with the Sharpen mode.

    Common workflows

    1. Text to video. Text node with your prompt connected to Video Generator. The simplest starting point.
    2. Image to video. Image Generator connected to Video Generator Start Frame port. Animate any still image.
    3. Generate and upscale. Video Generator at 720p, then Video Upscaler to 1080p for the final output.
    4. Multi-scene production. Multiple Video Generators (one per scene) connected to a Video Combiner for a complete sequence.
    5. Full audiovisual pipeline. Video Generator plus Voiceover plus Music Generator, all combined with Video Audio Mix.
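
Workflow 3 ("Generate and upscale") reduces to a two-stage pipeline. The functions below are hypothetical stand-ins for the two nodes; the 720p-to-1080p values and the Topaz default come from the article.

```python
# Illustrative two-stage pipeline: generate at 720p, upscale to 1080p.
# generate_video / upscale_video are toy stand-ins, not a real Freepik API.

def generate_video(prompt, resolution="720p"):
    # Stand-in for the Video Generator node.
    return {"kind": "video", "prompt": prompt, "resolution": resolution}

def upscale_video(video, mode="Topaz", target="1080p"):
    # Stand-in for the Video Upscaler node; copies rather than mutates
    # so the original (draft) clip is left untouched.
    upscaled = dict(video)
    upscaled["resolution"] = target
    upscaled["upscaler"] = mode
    return upscaled

draft = generate_video("aerial view of a coastline at sunrise")
final = upscale_video(draft)    # Topaz for final delivery
print(final["resolution"])      # → 1080p
```
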

    Additional video nodes, including frame extraction (Extract Frames), frame-to-video assembly (Frames to Video), and a dedicated video editor (Edit Video), are available in advanced mode and may require enabling debug features. Their functionality may change. This article will be updated as these nodes reach general availability.

    Tips and best practices

    Connect an Image Generator to Start Frame for image-to-video. This is one of the most popular workflows. Generate a still image first, then animate it.

    Use both Start Frame and End Frame for smooth transitions. The AI generates the video between your two keyframes, giving you precise control over the beginning and end.

    Enable Sound Effects on models that support it. Veo 3 and others can generate matching audio automatically, saving you the step of adding sound separately.

    Match the model to the style. Kling for consistent quality, MiniMax for speed, Veo for photorealism, Wan for anime, Luma for dreamy artistic looks.

    Use Run Downstream when changing only the video model. It skips re-generating upstream nodes like prompts and images, saving time and credits.

    Connect End Frame outputs to the next scene's Start Frame. When building multi-scene videos with the Combiner, this creates smooth transitions between clips.

    Topaz is the professional choice for upscaling. Use it for final delivery. Magnific AI is better for creative experimentation. Sharpen is fastest when you just need a quick boost.

    For lip-sync, use Omni Human 1.5 or Kling Motion Control. Connect a Voiceover audio output to the Video Generator Audio input on these models.
