Audio Understanding
POST /v1beta/models/gemini-2.5-pro:generateContent
- Upload audio via inline_data (base64, e.g. audio/mp3)
- Use text to specify tasks such as transcription, summarization, or Q&A
- Supports the native multimodal generateContent format
- Official docs: Audio understanding
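The points above can be sketched as a small helper that builds the request body. This is a minimal illustration, assuming the body follows the native generateContent shape (a contents array whose parts hold either inline_data or text). The function name build_audio_request and its default MIME type are hypothetical, chosen only for this example.

```python
import base64


def build_audio_request(audio_bytes: bytes, prompt: str, mime_type: str = "audio/mp3") -> dict:
    """Build a generateContent body with one audio part and one text part.

    Hypothetical helper: the field layout assumes the native generateContent
    format (contents -> parts -> inline_data / text).
    """
    # inline_data carries the audio as a base64 string, not raw bytes.
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return {
        "contents": [
            {
                "parts": [
                    {"inline_data": {"mime_type": mime_type, "data": encoded}},
                    # The text part states the task: transcription, summary, Q&A, ...
                    {"text": prompt},
                ]
            }
        ]
    }
```

For example, build_audio_request(open("clip.mp3", "rb").read(), "Transcribe this recording.") would produce a body for a transcription task; the file name here is only a placeholder.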
Authorizations
bearer
Type
HTTP (bearer)
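As a rough sketch of the bearer scheme, the request below attaches the key in an Authorization header using only the Python standard library. The base URL https://api.example.com is a placeholder, since this page does not state the host, and build_request is a hypothetical helper name.

```python
import json
import urllib.request

# Placeholder host: an assumption, replace with the actual API base URL.
BASE_URL = "https://api.example.com"
ENDPOINT = "/v1beta/models/gemini-2.5-pro:generateContent"


def build_request(api_key: str, payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Prepare (but do not send) a POST request with HTTP bearer authorization."""
    return urllib.request.Request(
        base_url + ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen sends the request; building it separately keeps the header logic easy to inspect before any network call is made.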
Request Body
application/json
contents
object[]
Required
Responses
Success
application/json
object
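This page describes the success response only as a JSON object. Assuming it follows the native generateContent response shape (candidates, each with content.parts), the generated text could be pulled out roughly as follows. The helper name extract_text is hypothetical, and the field names are an assumption rather than something this page confirms.

```python
def extract_text(response: dict) -> str:
    """Join the text parts of the first candidate in a generateContent response.

    Assumes the shape {"candidates": [{"content": {"parts": [{"text": ...}]}}]};
    returns an empty string when no candidate or text part is present.
    """
    candidates = response.get("candidates", [])
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    # Skip any non-text parts and concatenate the rest in order.
    return "".join(part["text"] for part in parts if "text" in part)
```

Returning an empty string instead of raising keeps the caller simple when a response carries no text, for instance if the request was blocked or truncated.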