Multimodal Video
POST
/kling/v1/videos/multi-elements
Edits a video with a multimodal operation (adding, swapping, or removing an element), based on an initialized session and its selections.
Authorizations
bearer
Type
HTTP (bearer)
Request Body
application/json
model_name
string
Required
Model name. Enum: kling-v1-6
session_id
string
Required
Session ID generated by the video init task. It does not change when selections are edited.
edit_mode
string
Required
Operation type.
Enum: addition, swap, removal
addition: add an element.
swap: replace an element.
removal: remove an element.
image_list
string[]
List of cropped reference images
prompt
string
Required
Positive text prompt
negative_prompt
string
Negative text prompt
mode
string
Required
Video generation mode.
Enum: std, pro
std: Standard mode, with balanced quality and cost.
pro: Pro mode, with higher quality output.
duration
string
Required
Video duration in seconds.
Enum: 5, 10
callback_url
string
external_task_id
string
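The request body fields above can be assembled and checked before sending. The following is a minimal sketch, not an official client: the helper names `build_payload` and `build_request` and the `base_url` argument are hypothetical, and `image_list` is treated as a plain list of strings because that is the type listed here (the expected encoding of each image is not stated in this section). Only the documented enum values are validated.

```python
import json
import urllib.request

# Enum values copied from the request body fields documented above.
MODEL_NAMES = {"kling-v1-6"}
EDIT_MODES = {"addition", "swap", "removal"}
MODES = {"std", "pro"}
DURATIONS = {"5", "10"}  # duration is typed as a string, not a number


def build_payload(session_id, edit_mode, prompt, mode="std", duration="5",
                  model_name="kling-v1-6", image_list=None,
                  negative_prompt=None):
    """Return a request body dict, rejecting values outside the documented enums."""
    checks = [
        ("model_name", model_name, MODEL_NAMES),
        ("edit_mode", edit_mode, EDIT_MODES),
        ("mode", mode, MODES),
        ("duration", duration, DURATIONS),
    ]
    for name, value, allowed in checks:
        if value not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}, got {value!r}")

    payload = {
        "model_name": model_name,
        "session_id": session_id,
        "edit_mode": edit_mode,
        "prompt": prompt,
        "mode": mode,
        "duration": duration,
    }
    # Optional fields are only included when the caller supplies them.
    if image_list:
        payload["image_list"] = list(image_list)
    if negative_prompt:
        payload["negative_prompt"] = negative_prompt
    return payload


def build_request(base_url, token, payload):
    """Build (but do not send) the POST request with bearer authorization.

    base_url is a placeholder; this section does not state the API host.
    """
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/kling/v1/videos/multi-elements",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the resulting object to `urllib.request.urlopen` would send it. Validating the enums locally keeps an invalid `edit_mode` or a numeric `duration` from reaching the server.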
Responses
OK
application/json
object
task_id
string
Task ID
task_status
string
Task status
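A response with these two fields could be read as follows. This is a sketch under assumptions: `TaskInfo` and `parse_task_response` are hypothetical names, and the code assumes `task_id` and `task_status` sit at the top level of the returned JSON object, as the listing above suggests. If the live API wraps them in an envelope, the lookup would need adjusting. The possible values of `task_status` are not listed in this section, so the value is passed through as-is.

```python
import json
from dataclasses import dataclass


@dataclass
class TaskInfo:
    task_id: str
    task_status: str


def parse_task_response(body):
    """Parse a JSON response body (str or bytes) into a TaskInfo.

    Raises ValueError if either documented field is absent.
    """
    data = json.loads(body)
    missing = [key for key in ("task_id", "task_status") if key not in data]
    if missing:
        raise ValueError(f"response is missing fields: {missing}")
    return TaskInfo(task_id=str(data["task_id"]),
                    task_status=str(data["task_status"]))
```

The returned `task_id` is the value to keep for any later lookup of the task.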