Unified models are smarter than previous generations, but they still benefit from clear prompts. Because UniVideo uses Qwen2.5-VL for understanding, it responds better to descriptive, natural-language prompts than to "tag salad" (e.g., "4k, 8k, best quality").
Structure of a Perfect Prompt
We recommend a 3-part structure:
[Subject] + [Action/Motion] + [Atmosphere/Style]
Example:
"A cybernetic samurai (Subject) drawing a katana (Action/Motion), standing in neon rain on a rooftop, 35mm film grain, moody lighting (Atmosphere/Style)."
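If you assemble prompts programmatically, the structure can be sketched as a small helper. This is only an illustration: `build_prompt` and its parameter names are hypothetical and are not part of UniVideo or Qwen2.5-VL. It simply joins the parts into one descriptive sentence.

```python
def build_prompt(subject: str, action: str, atmosphere: str = "", style: str = "") -> str:
    """Join prompt parts into one sentence (hypothetical helper, not a UniVideo API)."""
    # Subject and action read best as a single phrase: "A samurai drawing a katana".
    lead = " ".join(p for p in (subject.strip(), action.strip()) if p)
    # Drop empty parts so optional fields do not leave stray commas.
    parts = [p for p in (lead, atmosphere.strip(), style.strip()) if p]
    return ", ".join(parts).rstrip(".") + "."


prompt = build_prompt(
    subject="A cybernetic samurai",
    action="drawing a katana",
    atmosphere="standing in neon rain on a rooftop",
    style="35mm film grain, moody lighting",
)
print(prompt)
```

Keeping the parts as separate arguments makes it easy to vary one element, such as the style, while holding the subject and action fixed.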
Editing Prompts
When using the editing mode, be direct. You don't need to describe the whole scene again; state only the change.
- Good: "Make the car red."
- Good: "Change the weather to snow."
- Bad: "Generate a video of the same car but now it is red and driving." (This might confuse the model into generating a new video rather than editing the existing one.)
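If edit prompts come from user input, a quick pre-check can catch phrasing like the "Bad" example before it reaches the model. The sketch below is a crude heuristic with made-up names (`GENERATION_OPENERS`, `looks_like_generation_request`); it says nothing about how UniVideo itself decides between editing and generating.

```python
# Hypothetical list of openers that read as "make a new video" rather than "edit this one".
GENERATION_OPENERS = ("generate", "create", "produce", "make a video")


def looks_like_generation_request(edit_prompt: str) -> bool:
    """Return True if an edit prompt starts like a request for a brand-new video."""
    text = edit_prompt.strip().lower()
    # str.startswith accepts a tuple and matches any of its entries.
    return text.startswith(GENERATION_OPENERS)


for candidate in (
    "Make the car red.",
    "Change the weather to snow.",
    "Generate a video of the same car but now it is red and driving.",
):
    flag = "rephrase" if looks_like_generation_request(candidate) else "ok"
    print(f"{flag}: {candidate}")
```

A flagged prompt is only a hint to rephrase it as a direct change; the list of openers is an assumption and will need tuning for real input.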
Camera Control
You can direct the camera movement with simple keywords: