Claude OpenAI-Compatible Format

Fast-Token supports calling Anthropic Claude models via the OpenAI API protocol. If you already have apps built on the OpenAI SDK or /v1/chat/completions, you can usually connect to Claude by changing only base_url and model, without rewriting to the native Anthropic Messages API.

This page focuses on usage guidance. For request bodies, response fields, and online debugging of each endpoint, see the ChatGPT documentation and the links in the capability table below.

Two integration approaches

Comparison	OpenAI-compatible format (this page)	Native Claude API (other docs in this folder)
Typical path	`POST /v1/chat/completions`	`POST /v1/messages`
Request body	OpenAI fields such as `messages`, `model`, `stream`	Anthropic fields such as `messages`, `system`, `thinking`
Best for	Existing OpenAI clients, unified multi-model gateways, quick migration	Claude-only features (extended thinking budget, Tool Use, PDF, web search, etc.)
Documentation	This page + Chat section	API pages under the Chat group in this folder

Both approaches share the same API Key and gateway URL. Billing follows the corresponding model in the model marketplace.

Setup

Gateway and authentication

Base URL: https://fast-token.com/v1 (same as Getting Started)
Authentication: request header Authorization: Bearer <Fast-Token_API_KEY>
Model name: copy a model ID containing claude from the model marketplace into the model field
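
For clients that do not use an SDK, the three settings above map directly onto a plain HTTP request. The following is a minimal sketch using only Python's standard library; the helper names `build_chat_request` and `send` are illustrative assumptions, while the base URL, path, and Authorization header are the ones listed on this page.

```python
import json
import urllib.request

BASE_URL = "https://fast-token.com/v1"  # gateway base URL from this page


def build_chat_request(api_key, model, messages):
    """Assemble (but do not send) a POST /v1/chat/completions request."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def send(request):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    return reply["choices"][0]["message"]["content"]


# Building the request needs no network access; call send(request) to execute it.
request = build_chat_request(
    api_key="<Fast-Token_API_KEY>",
    model="claude-sonnet-4-20250514",  # replace with a model marketplace ID
    messages=[{"role": "user", "content": "Hello"}],
)
print(request.full_url)
```

Keeping request construction separate from sending makes it easy to inspect the URL, headers, and body before any tokens are billed.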

Using the OpenAI SDK (recommended)

Point the official SDK base_url to Fast-Token; everything else works like OpenAI:

python

from openai import OpenAI

client = OpenAI(
    base_url="https://fast-token.com/v1",
    api_key="<Fast-Token_API_KEY>",
)

completion = client.chat.completions.create(
    model="claude-sonnet-4-20250514",  # use model marketplace IDs
    messages=[
        {"role": "user", "content": "Introduce yourself in one sentence."},
    ],
)
print(completion.choices[0].message.content)

javascript

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://fast-token.com/v1",
  apiKey: "<Fast-Token_API_KEY>",
});

const completion = await client.chat.completions.create({
  model: "claude-sonnet-4-20250514",
  messages: [{ role: "user", content: "Introduce yourself in one sentence." }],
});
console.log(completion.choices[0].message.content);

Set stream: true for streaming chat. See Create chat completion (streaming).

Chat-compatible capabilities overview

The following capabilities are accessed via the OpenAI-compatible path. In the console or Apifox they are often grouped as “chat-compatible format”. In practice they mostly share the same POST /v1/chat/completions endpoint and differ only in model, streaming, and message structure.

Capability	Description	Reference
Create thinking chat	Conversations with model “thinking” output	Streaming (`extra_body.enable_thinking`)
Create chat completion (streaming)	Standard multi-turn text chat, SSE streaming	Create chat completion (streaming)
Create chat completion (non-streaming)	Standard multi-turn text chat, single response	Create chat completion (non-streaming)
Create chat vision (streaming)	Upload images for understanding, description, or Q&A	Create chat vision (streaming)
Create chat vision (non-streaming)	Vision scenarios, full non-streaming response	Create chat vision (non-streaming)

Model selection

Use the model ID from the model marketplace that matches your scenario: pick a Claude model (an ID containing claude) that supports the relevant chat, vision, or thinking capability. If the request fails with “model not found” or an unsupported-capability error, switch to another Claude entry that covers the same scenario.

Usage notes by scenario

Standard chat (streaming / non-streaming)

Endpoint: POST /v1/chat/completions
Use a messages array for multi-turn dialogue; role supports system / user / assistant (same as OpenAI)
Non-streaming: stream: false or omit stream → Create chat completion (non-streaming)
Streaming: stream: true → Create chat completion (streaming)
Common parameters (temperature, max_tokens, top_p, etc.) behave as described in the OpenAI documentation
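
With stream: true the response arrives as SSE `data:` lines terminated by `data: [DONE]`, in the same shape as OpenAI streaming Chat Completions. The OpenAI SDK parses this for you; the sketch below is only for clients that read the raw stream themselves. The function name `iter_content_deltas` and the sample lines are illustrative and hand-written, not captured gateway output.

```python
import json


def iter_content_deltas(lines):
    """Yield text fragments from OpenAI-style SSE lines until [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample lines in the OpenAI chunk shape.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_content_deltas(sample)))  # prints "Hello"
```

The first chunk carries only the role, so the parser skips any delta without a content string instead of assuming every chunk has text.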

system role

In OpenAI format, put the system prompt in messages as an entry with role: "system". The native Claude API instead takes the system prompt in a top-level system field, outside the messages array; see Create chat completion (streaming).
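
To make the difference concrete, here is a small sketch that converts an OpenAI-style messages list into the pair the native API expects. The helper name `to_native_messages` is hypothetical, it assumes every system message has plain string content, and joining several system messages with a blank line is a choice made for this sketch rather than documented gateway behavior.

```python
def to_native_messages(messages):
    """Split OpenAI-style messages into (system, messages) for /v1/messages."""
    system_parts = []
    rest = []
    for message in messages:
        if message["role"] == "system":
            system_parts.append(message["content"])
        else:
            rest.append(message)
    return "\n\n".join(system_parts), rest


openai_style = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is SSE?"},
]
system, native_messages = to_native_messages(openai_style)
print(system)
print(native_messages)
```

When you stay on the OpenAI-compatible path, no such conversion is needed: send the system entry inside messages as usual.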

Thinking mode (create thinking chat)

Some Claude thinking models can enable thinking output in streaming requests via an extension field:

json

{
  "model": "claude-sonnet-4-20250514",
  "messages": [{ "role": "user", "content": "Reason step by step: why is the sky blue?" }],
  "stream": true,
  "extra_body": {
    "enable_thinking": true
  }
}

How thinking is surfaced in the compatible format depends on the gateway and the model; in the model marketplace, choose a Claude model that supports “thinking”
For precise control of thinking token budget (thinking.budget_tokens) and separate thinking vs. answer blocks, use the native Create extended thinking chat (POST /v1/messages)

Vision (image understanding)

Use a multimodal array in the user message content: text + image_url (URL or Base64).

json

{
  "model": "claude-sonnet-4-20250514",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "Describe the main content of this image" },
        {
          "type": "image_url",
          "image_url": { "url": "https://example.com/photo.jpg" }
        }
      ]
    }
  ],
  "stream": true
}

Streaming vision: stream: true — see Create chat vision (streaming)
Non-streaming vision: stream: false — see Create chat vision (non-streaming)
Base64 images: see Create chat vision (streaming) Base64
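
For local files, OpenAI-style chat completions accept an inline image as a `data:` URL in the same image_url field. The sketch below builds such a message; the helper names are illustrative, the three placeholder bytes stand in for a real file, and the exact Base64 form the gateway accepts should be confirmed against Create chat vision (streaming) Base64.

```python
import base64


def image_part_from_bytes(image_bytes, mime_type="image/jpeg"):
    """Wrap raw image bytes as an image_url content part using a data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


def build_vision_message(prompt, image_bytes, mime_type="image/jpeg"):
    """Build a user message with a text part followed by an image part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            image_part_from_bytes(image_bytes, mime_type),
        ],
    }


# Placeholder bytes; in practice read them with open("photo.jpg", "rb").read().
message = build_vision_message(
    "Describe the main content of this image",
    b"\xff\xd8\xff",
)
print(message["content"][1]["image_url"]["url"])
```

The resulting dictionary can be placed directly in the messages array of the request shown above, with stream set to true or false as needed.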

Choosing OpenAI-compatible vs. native Claude API

Your need	Recommendation
Quick integration, reuse OpenAI SDK / existing code	OpenAI-compatible format (this page)
Extended thinking, `budget_tokens`, thinking-block SSE structure	Native API — Create extended thinking chat
Tool Use / function calling	Native API — Create function calling (streaming)
Structured output (JSON Schema)	Native API — Create structured outputs
PDF documents, web search	Native API — PDF support, Web search
Only need chat + vision + thinking with an OpenAI client	OpenAI-compatible format covers the main path

FAQ

Q: Why doesn’t this match the official OpenAI docs exactly?
A: The compatibility layer aligns requests/responses with OpenAI while Claude runs underneath. Some OpenAI-only parameters may be ignored; rely on what the model actually supports.

Q: What should I put in model?
A: Use the full model ID from the model marketplace (usually including claude). Do not use shorthand or outdated names.

Q: Can I use the official Anthropic Python SDK?
A: That SDK targets POST /v1/messages by default. To keep using it, follow the native API docs in this folder; use the OpenAI SDK for the compatible format on this page.

Q: What is the streaming response format?
A: SSE with data: {...} lines and data: [DONE] at the end, same as OpenAI streaming Chat Completions. See Chat completion chunk object.

Q: Full response JSON structure?
A: See Chat completion object. For native Messages API responses, see Chat completion object (Anthropic format).
