
Text to Video

POST /kling/v1/videos/text2video

Create a video generation task from a text prompt.

Authorizations

bearer
Type: HTTP (bearer)
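
A minimal sketch of how the bearer authorization could be attached to a request, using only Python's standard library. The base URL `https://api.example.com` and the key `YOUR_API_KEY` are placeholders and do not come from this page. The helper only builds the request object; it does not send it.

```python
import json
import urllib.request

# Path taken from this page; the host is NOT documented here.
ENDPOINT_PATH = "/kling/v1/videos/text2video"


def build_request(base_url, api_key, body):
    """Build (but do not send) a POST request with bearer authorization."""
    url = base_url.rstrip("/") + ENDPOINT_PATH
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder host and key, for illustration only.
request = build_request("https://api.example.com", "YOUR_API_KEY", {"prompt": "a cat"})
```

Passing `request` to `urllib.request.urlopen` would send it; that step is left out because the real host is not stated here.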

Request Body

application/json
object

Model name. Enum: kling-v1, kling-v1-6, kling-v2-master, kling-v2-1-master, kling-v2-5-turbo, kling-v3

Positive text prompt, up to 2500 characters

Whether to generate a multi-shot video.
When true, prompt is ignored.
When false, shot_type and multi_prompt are ignored.

Shot segmentation mode.
Enum: customize
Required when multi_shot is true.
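
The interaction between multi_shot, prompt, shot_type, and multi_prompt can be sketched as a small helper that emits only the fields the server would read. The helper name `prompt_fields` is hypothetical. Treating prompt as mandatory in single-shot mode and multi_prompt as mandatory in multi-shot mode is an assumption; this page only states that shot_type is required when multi_shot is true.

```python
MAX_PROMPT_CHARS = 2500  # limit stated for the positive text prompt


def prompt_fields(multi_shot, prompt=None, multi_prompt=None):
    """Return only the prompt-related fields relevant to the chosen mode."""
    if multi_shot:
        # prompt is ignored in this mode, so it is left out entirely.
        if not multi_prompt:
            raise ValueError("multi_prompt needs at least one shot (assumed mandatory)")
        return {
            "multi_shot": True,
            "shot_type": "customize",  # the only documented enum value
            "multi_prompt": multi_prompt,
        }
    # shot_type and multi_prompt are ignored in this mode, so they are left out.
    if prompt is None:
        raise ValueError("prompt is expected when multi_shot is false (assumption)")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    return {"multi_shot": False, "prompt": prompt}
```

Omitting the ignored fields is a design choice, not a requirement: it keeps the request body from suggesting that those fields have any effect.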

multi_prompt (object[])

Per-shot details (prompt, duration, etc.).
● Each entry sets the shot's index, prompt, and duration through its index, prompt, and duration fields:
○ Up to 6 shots, at least 1
○ Each shot content up to 512 characters
○ Each shot duration ≤ total task duration and ≥ 1
○ Sum of shot durations equals total task duration
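
The shot constraints above can be checked on the client before a task is submitted. The following is a sketch only: the function name `validate_multi_prompt` is hypothetical, and it assumes durations are plain integers in seconds (convert first if the API expects another type).

```python
MAX_SHOTS = 6
MAX_SHOT_PROMPT_CHARS = 512


def validate_multi_prompt(shots, total_duration):
    """Return a list of problems; an empty list means every documented limit holds."""
    problems = []
    if not 1 <= len(shots) <= MAX_SHOTS:
        problems.append(f"expected 1 to {MAX_SHOTS} shots, got {len(shots)}")
    for shot in shots:
        label = f"shot {shot['index']}"
        if len(shot["prompt"]) > MAX_SHOT_PROMPT_CHARS:
            problems.append(f"{label}: prompt exceeds {MAX_SHOT_PROMPT_CHARS} characters")
        if not 1 <= shot["duration"] <= total_duration:
            problems.append(f"{label}: duration must be between 1 and {total_duration}")
    if sum(shot["duration"] for shot in shots) != total_duration:
        problems.append("shot durations do not add up to the total task duration")
    return problems


shots = [
    {"index": 1, "prompt": "A fox runs across a snowy field", "duration": 3},
    {"index": 2, "prompt": "Close-up of the fox looking at the camera", "duration": 2},
]
print(validate_multi_prompt(shots, total_duration=5))  # prints []
```

Returning a list of problems instead of raising on the first one lets a caller report every violation in a single pass.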

Negative text prompt, up to 2500 characters

Degree of freedom in video generation. Higher values give the model less freedom and align the output more closely with the prompt.

Video generation mode.
Enum: std, pro
std: Standard mode — cost-effective baseline quality
pro: Pro mode — higher quality output

Whether to generate audio with the video.
Enum: on, off
Supported on V2.6 and later models only.

Output aspect ratio (width:height)

Video duration in seconds

watermark_info (object)

Whether to also return a watermarked result.
● Set via the enabled field, e.g.:
"watermark_info": {
  "enabled": boolean // true to generate, false to skip
}
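
The fragment above is a schema-style illustration and not valid JSON as written. As a small sketch, a concrete value could be serialized from Python like this (the variable name `body_fragment` is illustrative only):

```python
import json

# Ask for a watermarked result in addition to the regular one.
body_fragment = {"watermark_info": {"enabled": True}}

print(json.dumps(body_fragment))  # prints {"watermark_info": {"enabled": true}}
```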

object

Camera motion control (if omitted, the model infers motion from text/image input)

Responses

OK

application/json
object

Task ID

Task status
