
Digital Human

POST
/kling/v1/videos/avatar/image2video

Create a task that generates a digital human video from a reference image and an audio track.

Authorizations

bearer
Type
HTTP (bearer)
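
As a rough illustration of the bearer scheme above, the sketch below builds (but does not send) an authenticated POST request using only the Python standard library. The endpoint path comes from this page; `BASE_URL` is a hypothetical placeholder, since the page does not state the host, and `build_request` is an illustrative helper name, not part of any SDK.

```python
import json
import urllib.request

# Hypothetical host: this page only gives the path, not the base URL.
BASE_URL = "https://api.example.com"
ENDPOINT = "/kling/v1/videos/avatar/image2video"


def build_request(token: str, body: dict) -> urllib.request.Request:
    """Build an authenticated JSON POST request without sending it."""
    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # HTTP bearer authentication: "Authorization: Bearer <token>"
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the response schema is not described on this page, so parsing it is left out here.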

Request Body

application/json
object

Reference image for the digital human.
Supports Base64 or an accessible image URL.
Formats: .jpg / .jpeg / .png.
Maximum file size is 10MB. Width and height must each be at least 300px, and the aspect ratio must be between 1:2.5 and 2.5:1.
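
The image constraints above can be checked client-side before submitting a task. The following is a minimal sketch, assuming the caller already knows the file size and pixel dimensions; `check_reference_image` is a hypothetical helper, and reading "10MB" as 10 × 1024 × 1024 bytes is an assumption, since the page does not define the unit.

```python
import os

ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png"}
MAX_BYTES = 10 * 1024 * 1024  # assumption: "10MB" interpreted as 10 MiB
MIN_SIDE_PX = 300


def check_reference_image(filename: str, size_bytes: int,
                          width: int, height: int) -> list:
    """Return a list of constraint violations (empty if the image looks valid)."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append("file is larger than 10MB")
    if width < MIN_SIDE_PX or height < MIN_SIDE_PX:
        problems.append("width and height must each be at least 300px")
    # 1:2.5 <= width:height <= 2.5:1, written with integers to avoid
    # floating-point edge cases: 5*w >= 2*h and 2*w <= 5*h.
    if not (5 * width >= 2 * height and 2 * width <= 5 * height):
        problems.append("aspect ratio must be between 1:2.5 and 2.5:1")
    return problems
```

The server remains the authority on what it accepts; a local check like this only helps fail fast on obviously invalid inputs.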

Audio ID from the preview (TTS) API.
Only audio generated within the last 30 days, with a duration of 2–300 seconds, is supported.
Provide exactly one of audio_id or sound_file: supplying both, or neither, is invalid.

Audio file.
Supports Base64 or an accessible audio URL.
Formats: .mp3 / .wav / .m4a / .aac. Maximum file size is 5MB; duration must be 2–300 seconds.
Provide exactly one of audio_id or sound_file: supplying both, or neither, is invalid.
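
The "exactly one of audio_id or sound_file" rule is easy to get wrong, so a small guard can help. This sketch uses the two field names given on this page; the helper names `audio_field` and `duration_ok` are hypothetical.

```python
from typing import Optional


def audio_field(audio_id: Optional[str] = None,
                sound_file: Optional[str] = None) -> dict:
    """Return the single audio field to place in the request body.

    Raises ValueError when both or neither of the two values are given.
    """
    candidates = {"audio_id": audio_id, "sound_file": sound_file}
    provided = {name: value for name, value in candidates.items() if value}
    if len(provided) != 1:
        raise ValueError("provide exactly one of audio_id or sound_file")
    return provided


def duration_ok(seconds: float) -> bool:
    """Check the documented 2-300 second duration window (inclusive)."""
    return 2 <= seconds <= 300
```

For example, `audio_field(audio_id="abc")` yields a one-entry dictionary that can be merged into the request body, while passing both arguments raises immediately instead of producing a request that would be rejected.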

Positive text prompt

Video generation mode.
Enum: std, pro

Callback URL when the task completes

Custom external task ID
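
Putting the fields above together, a request body might be assembled as in the sketch below. Only `audio_id` and `sound_file` are named on this page; the keys `image`, `prompt`, `mode`, `callback_url`, and `external_task_id` are assumed names for the remaining fields, and which fields are required is also an assumption. Check both against the live schema before relying on them.

```python
from typing import Optional

ALLOWED_MODES = {"std", "pro"}  # enum values listed on this page


def build_body(image: str,
               audio: dict,
               prompt: Optional[str] = None,
               mode: Optional[str] = None,
               callback_url: Optional[str] = None,
               external_task_id: Optional[str] = None) -> dict:
    """Assemble a JSON-serializable body, omitting unset optional fields.

    `audio` must hold exactly one of the keys audio_id / sound_file.
    All other key names below are assumptions, not confirmed by the page.
    """
    if set(audio) not in ({"audio_id"}, {"sound_file"}):
        raise ValueError("audio must contain exactly one of audio_id or sound_file")
    if mode is not None and mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}")

    body = {"image": image, **audio}  # "image" is an assumed key name
    optional = {
        "prompt": prompt,
        "mode": mode,
        "callback_url": callback_url,
        "external_task_id": external_task_id,
    }
    # Leave out anything the caller did not set so server defaults apply.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body
```

Omitting unset fields, rather than sending nulls, keeps the payload minimal and avoids guessing at how the server treats explicit null values.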

Responses

OK

application/json
