Omni-Video

POST
/kling/v1/videos/omni-video

Create an Omni-Video generation task (text-to-video, image-to-video, video editing, and more).

Authorizations

bearer
Type
HTTP (bearer)
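
As a rough illustration of the bearer scheme above, the sketch below builds (but does not send) a POST request to the endpoint path. The host in `BASE_URL` and the `build_request` helper are placeholders that do not come from this page; only the path and the bearer authorization type are documented here.

```python
import json
import urllib.request

# Hypothetical host: this page documents only the path, not the base URL.
BASE_URL = "https://api.example.com"
ENDPOINT = "/kling/v1/videos/omni-video"


def build_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (without sending) a JSON POST request with bearer authentication."""
    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request("YOUR_API_KEY", {"prompt": "A paper boat drifting down a rainy street"})
```

Passing `req` to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without network access or a real key.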

Request Body

application/json
object

Model name. Enum: kling-video-o1, kling-v3-omni

Text prompt; may include positive and negative descriptions.
Prompts can be templated for different video generation needs.
Up to 2500 characters.
Required when multi_shot is false.

object[]

Reference images (subject, scene, style, etc.); an image can also serve as the first or end frame of the generated video.
When used as first/end frame:
Use type to mark frames: first_frame for the first frame, end_frame for the last frame.
An end frame alone is not supported; if end_frame is set, first_frame is also required.
First/end frame generation cannot be combined with video editing.
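
The frame rule above can be checked on the client before submitting. This is a minimal sketch: `check_frames` is a hypothetical helper, and the only key it relies on is `type` with the values `first_frame` and `end_frame`, as described above. Entries without a `type` are treated as plain reference images.

```python
def check_frames(images: list[dict]) -> None:
    """Raise ValueError if an end frame is given without a first frame."""
    types = [img.get("type") for img in images]
    if "end_frame" in types and "first_frame" not in types:
        raise ValueError("end_frame requires first_frame")
```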

object[]

Reference video via URL.
Can be a feature reference video or a video to edit (default: to edit); optionally keep the original audio.
refer_type: feature = feature reference, base = video to edit.
When the reference is a video to edit (base), first/end frames cannot be set.
keep_original_sound: yes = keep audio, no = remove; also applies to feature reference videos.
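
A possible client-side check for the reference-video rules above. The keys `refer_type`, `keep_original_sound`, and `type` are taken from this page; the helper names are hypothetical. A missing `refer_type` is treated as `base` because the page states that "to edit" is the default. No default is documented for `keep_original_sound`, so it is only validated when present.

```python
FRAME_TYPES = {"first_frame", "end_frame"}


def is_video_editing(videos: list[dict]) -> bool:
    """True if any reference video is a video to edit (refer_type base)."""
    return any(v.get("refer_type", "base") == "base" for v in videos)


def check_video_refs(videos: list[dict], images: list[dict]) -> None:
    """Validate reference videos against the rules described on this page."""
    for video in videos:
        if video.get("refer_type", "base") not in {"feature", "base"}:
            raise ValueError("refer_type must be 'feature' or 'base'")
        sound = video.get("keep_original_sound")
        if sound is not None and sound not in {"yes", "no"}:
            raise ValueError("keep_original_sound must be 'yes' or 'no'")
    # A video to edit cannot be combined with first/end frames.
    has_frames = any(img.get("type") in FRAME_TYPES for img in images)
    if is_video_editing(videos) and has_frames:
        raise ValueError("first/end frames cannot be combined with video editing")
```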

Video generation mode.
Enum: std, pro.
std: Standard mode — balanced quality and cost.
pro: Pro mode — higher quality output.

Output aspect ratio (width:height).
Enum: 16:9, 9:16, 1:1.
Required when not using first-frame reference or video editing.

Video duration in seconds.
Enum: 3, 4, 5, 6, 7, 8, 9, 10.
For text-to-video and first-frame image-to-video, only 5 and 10 are supported.
With video editing (refer_type is base), the output duration matches the input video and this field is ignored; billing rounds the input duration to the nearest second.
For kling-v3-omni, the supported range is 3 to 15 seconds; the same editing rules apply when refer_type is base.
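
The duration rules can be summarized as a small lookup. This sketch rests on one loudly stated assumption: the page does not say whether the "only 5 and 10" restriction also applies to kling-v3-omni, and the code assumes it does not. `allowed_durations` is a hypothetical helper, not part of the API.

```python
from typing import Optional


def allowed_durations(
    model: str, editing: bool, text_or_first_frame: bool
) -> Optional[set[int]]:
    """Return the accepted durations in seconds, or None if the field is ignored."""
    if editing:
        # refer_type base: output duration follows the input video.
        return None
    if model == "kling-v3-omni":
        # Assumption: the 5/10 restriction is not applied to this model.
        return set(range(3, 16))
    if text_or_first_frame:
        return {5, 10}
    return set(range(3, 11))
```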

object[]

Subjects

Whether to generate a multi-shot video.
When true, prompt is ignored.
When false, shot_type and multi_prompt are ignored.

Shot segmentation mode.
Enum: customize.
Required when multi_shot is true.
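
The interplay between `prompt`, `multi_shot`, and `shot_type` might be checked as follows. The field names come from this page; `check_prompt_fields` is a hypothetical helper, and treating a missing `multi_shot` as false is an assumption, since no default is documented.

```python
def check_prompt_fields(body: dict) -> None:
    """Validate prompt-related fields against the rules described on this page."""
    # Assumption: an absent multi_shot behaves like false.
    if body.get("multi_shot", False):
        # prompt is ignored in this mode; shot_type is required.
        if body.get("shot_type") != "customize":
            raise ValueError("shot_type must be 'customize' when multi_shot is true")
        return
    prompt = body.get("prompt")
    if not prompt:
        raise ValueError("prompt is required when multi_shot is false")
    if len(prompt) > 2500:
        raise ValueError("prompt is limited to 2500 characters")
```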

object[]

Per-shot details (prompt, duration, etc.).
Define shots with index, prompt, and duration:
Between 1 and 6 shots.
Each shot's content is at most 512 characters.
Each shot's duration is between 1 second and the task's total duration.
The sum of the shot durations must equal the task's total duration.
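
The per-shot constraints above can be sketched as a validator. The keys `index`, `prompt`, and `duration` are named on this page; `check_multi_prompt` is a hypothetical helper. The sketch assumes durations are integers in seconds and that the 512-character limit applies to each shot's `prompt`, neither of which the page states explicitly.

```python
def check_multi_prompt(shots: list[dict], total_duration: int) -> None:
    """Validate a list of shots against the limits described on this page."""
    if not 1 <= len(shots) <= 6:
        raise ValueError("between 1 and 6 shots are required")
    for shot in shots:
        if len(shot["prompt"]) > 512:
            raise ValueError(f"shot {shot['index']}: prompt exceeds 512 characters")
        if not 1 <= shot["duration"] <= total_duration:
            raise ValueError(f"shot {shot['index']}: duration out of range")
    if sum(shot["duration"] for shot in shots) != total_duration:
        raise ValueError("shot durations must sum to the total duration")


shots = [
    {"index": 1, "prompt": "Wide shot of a harbor at dawn", "duration": 3},
    {"index": 2, "prompt": "Close-up of a fisherman coiling rope", "duration": 2},
]
check_multi_prompt(shots, total_duration=5)  # passes: 3 + 2 == 5
```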

object

Whether to also generate a watermarked result.
Set via enabled, e.g.:
"watermark_info": { "enabled": true }
true = generate, false = do not generate.

Whether to generate audio with the video.
Enum: on, off.

Responses

OK

application/json
object

Task ID

Task status