Does Crevid support both text and image inputs?

Yes. Crevid supports creating content from text or images, including text-to-video, image-to-video, text-to-image, and image-to-image workflows.

What does “Reference-First” mean in Crevid?

Crevid uses reference inputs—such as images, video references, and audio—to anchor scene elements, rather than relying only on a text prompt.

How does Crevid handle lip-sync and audio?

Crevid generates synchronized dialogue with phoneme-level lip-sync (8+ languages) and environmental SFX as part of the generation process.

Can users revise parts of a generated clip without remaking everything?

Yes. Crevid supports targeted revisions such as character replacement and movement adjustments within existing clips without changing the overall style or rhythm.

Crevid

Direct the Future: The Professional Multimodal AI Studio with Precise Control.

Launched Apr 20, 2026

Maker: Fei

Product Overview

Crevid is an all-in-one online AI video and image generator that creates high-quality content from text or images. It supports multiple generation options, including Sora 2, Veo 3, Veo 3.1, Runway, and Seedance 2.0, alongside other listed models such as Midjourney and GPT-4o.

The studio is built for creators who need more than prompt-only outputs. Crevid uses a “Reference-First” architecture to anchor scenes using image, video reference, and audio inputs, aiming to make AI video generation feel more like professional production with precise control.

Key features

Reference-First scene direction using up to 9 images, 3 video references, and audio files to anchor characters, art style, and camera trajectories.
Native audio-visual sync generation, including synchronized dialogue, phoneme-level lip-syncing (8+ languages), and environmental SFX without separate post-production steps.
Absolute identity stacking to reduce character and costume flickering across multi-shot sequences and complex lighting environments.
Targeted revisions that allow character replacement and movement adjustments within existing clips without altering overall style or rhythm.
Creation workflows for both video and images, including text-to-video, image-to-video, text-to-image, and image-to-image modes.
Controls for repeatable output styling using a seed value and options like frame mode and video ratio selection (e.g., 16:9, 9:16).
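
The seed control in the last feature can be illustrated with a generic sketch. This is not Crevid's API: the function name `sample_noise` and the use of Python's standard `random` module are assumptions chosen only to show why reusing a seed gives repeatable output while a different seed gives a different result.

```python
import random


def sample_noise(seed, n=4):
    """Hypothetical stand-in for the random starting point of a generation run."""
    rng = random.Random(seed)  # a private generator, fully determined by the seed
    return [round(rng.random(), 6) for _ in range(n)]


first_run = sample_noise(42)
repeat_run = sample_noise(42)   # same seed: identical values
other_run = sample_noise(7)     # different seed: different values

print(first_run == repeat_run)
print(first_run == other_run)
```

In a real generator, other settings (model, prompt, references, ratio) would presumably also need to stay fixed for a seed to reproduce the same output.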

How Crevid works

1. Choose video or image mode
Select whether to create a video or an image, then pick a generation path such as text-to-video or image-to-video.
2. Provide reference inputs and prompt
Upload images, video references, and audio as needed, then enter a prompt, translating it to English for better results.
3. Generate and revise with settings
Set options like aspect ratio (video ratio) and use seeds for repeatability, then regenerate or apply targeted revisions when edits are needed.
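
As a rough illustration of what a video ratio setting implies, the sketch below converts a ratio string such as 16:9 or 9:16 into pixel dimensions. The helper `frame_size` and the 720-pixel short side are assumptions made for the example; the listing does not state which resolutions Crevid produces.

```python
def frame_size(ratio, short_side=720):
    """Return (width, height) for a ratio like "16:9".

    short_side is a hypothetical default, not a documented Crevid setting.
    """
    w, h = (int(part) for part in ratio.split(":"))
    scale = short_side / min(w, h)  # scale so the shorter edge equals short_side
    return round(w * scale), round(h * scale)


print(frame_size("16:9"))  # landscape: (1280, 720)
print(frame_size("9:16"))  # portrait: (720, 1280)
```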

Use cases

An independent filmmaker revising a multi-shot scene: they can replace a character or adjust movement inside existing clips while keeping the scene’s style and rhythm consistent.
An e-commerce operator producing product visuals from references: they can anchor art style and camera trajectories with multiple images, then iterate quickly using seeds and repeatable settings.
A marketer generating localized dialogue and sound design: they can create synchronized dialogue with phoneme-level lip-sync across 8+ languages and add environmental SFX as part of the generation process.

Who is it for?

Crevid is designed for creators, marketers, and filmmakers who want more precise control over AI-generated video and image outputs than text prompts alone provide. It fits teams that need repeatable direction (via references and seeds) and iterative revisions during production.

Frequently asked questions
