Omni-Video
Create an Omni-Video generation task (text-to-video, image-to-video, video editing, and more).
Authorizations
Request Body
Model name. Enum: kling-video-o1, kling-v3-omni
Text prompt; may include positive and negative descriptions.
Prompts can be templated for different video generation needs.
Up to 2500 characters.
Required when multi_shot is false.
List of reference images. Each image can serve as a reference (subject, scene, style, etc.) or as the first or end frame of the generated video.
When used as first/end frame:
Use type to mark frames: first_frame for the first frame, end_frame for the last frame.
An end frame on its own is not supported: if end_frame is set, first_frame is required.
First/end frame generation cannot be combined with video editing.
Reference video via URL.
Can be a feature reference video or a video to edit (default: to edit); optionally keep the original audio.
refer_type: feature = feature reference, base = video to edit.
When the reference is a video to edit (base), first/end frames cannot be set.
keep_original_sound: yes = keep audio, no = remove; also applies to feature reference videos.
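The frame and video constraints above can be checked on the client before a task is submitted. The following is a minimal sketch, not part of the API: the function name check_reference_rules and its arguments are hypothetical, and it assumes the caller has already collected the type value of each reference image and the refer_type of the reference video (if any).

```python
from typing import List, Optional


def check_reference_rules(
    frame_types: List[Optional[str]],
    refer_type: Optional[str] = None,
) -> List[str]:
    """Return rule violations; an empty list means the combination is allowed.

    frame_types: the `type` value of each reference image (None when unset).
    refer_type:  "feature", "base", or None when no reference video is given.
    Both argument names are hypothetical helpers for this sketch.
    """
    errors = []
    has_first = "first_frame" in frame_types
    has_end = "end_frame" in frame_types

    # An end frame on its own is not supported.
    if has_end and not has_first:
        errors.append("end_frame requires first_frame")

    # First/end frames cannot be combined with video editing (refer_type base).
    if refer_type == "base" and (has_first or has_end):
        errors.append("first/end frames cannot be used with refer_type base")

    return errors
```

For example, check_reference_rules(["end_frame"]) reports the missing first frame, while a plain reference image combined with a feature reference video passes.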
Video generation mode.
Enum: std, pro.
std: Standard mode — balanced quality and cost.
pro: Pro mode — higher quality output.
Output aspect ratio (width:height).
Enum: 16:9, 9:16, 1:1.
Required unless a first-frame reference or video editing is used.
Video duration in seconds.
Enum: 3, 4, 5, 6, 7, 8, 9, 10.
For text-to-video and first-frame image-to-video, only 5 and 10 are supported.
With video editing (refer_type base), this field is ignored and the output duration matches the input video. Billing rounds the input duration to the nearest second.
kling-v3-omni supports 3 to 15 seconds; the same editing rules apply when refer_type is base.
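The duration rules can be summarized in a small helper. This is a sketch under stated assumptions rather than the service's own validation: the function name effective_duration and the flag text_or_first_frame are hypothetical, and it assumes the "only 5 and 10" restriction applies to kling-video-o1 but not to kling-v3-omni, which the text does not state explicitly.

```python
from typing import Optional


def effective_duration(
    model: str,
    duration: int,
    refer_type: Optional[str] = None,
    text_or_first_frame: bool = False,
) -> Optional[int]:
    """Return the duration the task would use, or None when the field is ignored.

    text_or_first_frame is a hypothetical flag meaning "text-to-video or
    first-frame image-to-video".
    """
    # Video editing: the field is ignored and output length follows the input video.
    if refer_type == "base":
        return None

    if model == "kling-v3-omni":
        allowed = range(3, 16)  # 3 to 15 seconds inclusive
    elif text_or_first_frame:
        # Assumption: this restriction is applied to kling-video-o1 only.
        allowed = (5, 10)
    else:
        allowed = range(3, 11)  # 3 to 10 seconds inclusive

    if duration not in allowed:
        raise ValueError(f"duration {duration} is not supported for {model}")
    return duration
```

Returning None for the editing case keeps "ignored" distinct from "invalid", so a caller can skip sending the field instead of treating it as an error.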
Subjects
Whether to generate a multi-shot video.
When true, prompt is ignored.
When false, shot_type and multi_prompt are ignored.
Shot segmentation mode.
Enum: customize.
Required when multi_shot is true.
Per-shot details (prompt, duration, etc.).
Define shots with index, prompt, and duration:
Up to 6 shots, at least 1.
Each shot's prompt is at most 512 characters.
Each shot duration is between 1 and the task total duration.
Sum of shot durations must equal the task total duration.
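The shot constraints lend themselves to a pre-submission check. The sketch below is illustrative only: validate_shots is a hypothetical helper, each shot is assumed to be a dict with the index, prompt, and duration keys named above, and durations are assumed to be plain numbers (the API may expect another encoding).

```python
from typing import Dict, List


def validate_shots(shots: List[Dict], total_duration: int) -> List[str]:
    """Return rule violations for a multi-shot definition (empty list if valid)."""
    errors = []

    # Between 1 and 6 shots.
    if not 1 <= len(shots) <= 6:
        errors.append("number of shots must be between 1 and 6")

    for shot in shots:
        label = f"shot {shot['index']}"
        # Each shot's text is limited to 512 characters.
        if len(shot["prompt"]) > 512:
            errors.append(f"{label}: prompt exceeds 512 characters")
        # Each shot lasts at least 1 second and at most the task total.
        if not 1 <= shot["duration"] <= total_duration:
            errors.append(f"{label}: duration out of range")

    # Shot durations must add up to the task total.
    if sum(shot["duration"] for shot in shots) != total_duration:
        errors.append("shot durations must sum to the task total duration")

    return errors


shots = [
    {"index": 1, "prompt": "A sailboat leaves the harbor at dawn.", "duration": 4},
    {"index": 2, "prompt": "Close-up of the sail catching the wind.", "duration": 6},
]
problems = validate_shots(shots, total_duration=10)
```

Here problems is an empty list because the two shots total 10 seconds and each stays within the limits.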
Whether to also generate a watermarked result.
Set via the enabled field, for example:
"watermark_info": { "enabled": true }
true = generate, false = do not generate.
Whether to generate audio with the video.
Enum: on, off.
Responses
OK
Task ID
Task status