Claude OpenAI-Compatible Format
Fast-Token supports calling Anthropic Claude models via the OpenAI API protocol. If you already have apps built on the OpenAI SDK or /v1/chat/completions, you can usually connect to Claude by changing only base_url and model, without rewriting to the native Anthropic Messages API.
This page focuses on usage guidance. For request bodies, response fields, and online debugging of each endpoint, see the ChatGPT documentation and the links in the capability table below.
Two integration approaches
| Comparison | OpenAI-compatible format (this page) | Native Claude API (other docs in this folder) |
|---|---|---|
| Typical path | POST /v1/chat/completions | POST /v1/messages |
| Request body | OpenAI fields such as messages, model, stream | Anthropic fields such as messages, system, thinking |
| Best for | Existing OpenAI clients, unified multi-model gateways, quick migration | Claude-only features (extended thinking budget, Tool Use, PDF, web search, etc.) |
| Documentation | This page + Chat section | API pages under the Chat group in this folder |
Both approaches share the same API Key and gateway URL. Billing follows the corresponding model in the model marketplace.
Setup
Gateway and authentication
- Base URL: https://fast-token.com/v1 (same as Getting Started)
- Authentication: request header Authorization: Bearer <Fast-Token_API_KEY>
- Model name: copy a model ID containing claude from the model marketplace into the model field
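For clients that do not use an SDK, the three settings above map onto a plain HTTP request. The following is a minimal sketch using only the Python standard library; the helper name build_chat_request is hypothetical, and the request is built but not sent, so it can be inspected safely.

```python
import json
import urllib.request

BASE_URL = "https://fast-token.com/v1"  # Base URL from the setup list


def build_chat_request(api_key, model, messages, stream=False):
    """Build (but do not send) a POST /v1/chat/completions request.

    Hypothetical helper for illustration; not part of any SDK.
    """
    body = {"model": model, "messages": messages, "stream": stream}
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # Bearer authentication
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request(
    api_key="<Fast-Token_API_KEY>",
    model="claude-sonnet-4-20250514",  # replace with an ID from the model marketplace
    messages=[{"role": "user", "content": "Hello"}],
)
# To actually send it: urllib.request.urlopen(request)
```

Passing the request to urllib.request.urlopen would perform the call; with a real key, the response body should be the Chat completion object described in the FAQ.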
Using the OpenAI SDK (recommended)
Point the official SDK base_url to Fast-Token; everything else works like OpenAI:
from openai import OpenAI

client = OpenAI(
    base_url="https://fast-token.com/v1",
    api_key="<Fast-Token_API_KEY>",
)

completion = client.chat.completions.create(
    model="claude-sonnet-4-20250514",  # use the model ID from the model marketplace
    messages=[
        {"role": "user", "content": "Introduce yourself in one sentence"},
    ],
)
print(completion.choices[0].message.content)

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://fast-token.com/v1",
  apiKey: "<Fast-Token_API_KEY>",
});

const completion = await client.chat.completions.create({
  model: "claude-sonnet-4-20250514",
  messages: [{ role: "user", content: "Introduce yourself in one sentence" }],
});
console.log(completion.choices[0].message.content);

Set stream: true for streaming chat. See Create chat completion (streaming).
Chat-compatible capabilities overview
The following capabilities are accessed via the OpenAI-compatible path. In the console or Apifox they are often grouped as “chat-compatible format”. In practice they mostly use the same POST /v1/chat/completions, differentiated by model, streaming, and message structure.
| Capability | Description | Reference |
|---|---|---|
| Create thinking chat | Conversations with model “thinking” output | Streaming (extra_body.enable_thinking) |
| Create chat completion (streaming) | Standard multi-turn text chat, SSE streaming | Create chat completion (streaming) |
| Create chat completion (non-streaming) | Standard multi-turn text chat, single response | Create chat completion (non-streaming) |
| Create chat vision (streaming) | Upload images for understanding, description, or Q&A | Create chat vision (streaming) |
| Create chat vision (non-streaming) | Vision scenarios, full non-streaming response | Create chat vision (non-streaming) |
Model selection
Use the model ID from the model marketplace for each scenario. Try Claude models whose names include claude and that support the relevant chat / vision / thinking capability. If you get “model not found” or unsupported capability, switch to another Claude entry for the same scenario.
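One way to shortlist candidates is to filter the output of GET /v1/models (see List models) for IDs containing claude. The sketch below assumes the response follows the OpenAI list shape, an object with a data array whose entries carry an id; that shape is an assumption here, and the sample response is hand-written, including the placeholder ID example-other-model.

```python
def claude_model_ids(models_response):
    """Return model IDs containing "claude" from a list-models response dict.

    Assumes an OpenAI-style shape: {"data": [{"id": "..."}, ...]}.
    """
    return [
        entry["id"]
        for entry in models_response.get("data", [])
        if "claude" in entry.get("id", "").lower()
    ]


# Hand-written sample for illustration, not real gateway output.
sample_response = {
    "object": "list",
    "data": [
        {"id": "claude-sonnet-4-20250514", "object": "model"},
        {"id": "example-other-model", "object": "model"},
    ],
}

candidates = claude_model_ids(sample_response)
```

The filter only narrows by name. Whether a given entry supports vision or thinking still has to be checked in the model marketplace.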
Usage notes by scenario
Standard chat (streaming / non-streaming)
- Endpoint: POST /v1/chat/completions
- Use a messages array for multi-turn dialogue; role supports system / user / assistant (same as OpenAI)
- Non-streaming: stream: false or omit stream → Create chat completion (non-streaming)
- Streaming: stream: true → Create chat completion (streaming)
- Common parameters (temperature, max_tokens, top_p, etc.) behave as described in the OpenAI documentation
system role
In OpenAI format, put the system prompt in messages with role: "system". The native Claude API instead takes it in a top-level system field; see Create chat completion (streaming).
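To make the difference concrete, the sketch below splits an OpenAI-style messages list into the two pieces the native format expects. The function name to_native_system is hypothetical, message content is assumed to be plain strings, and joining several system messages with a blank line is an illustrative choice rather than documented gateway behavior.

```python
def to_native_system(messages):
    """Split OpenAI-style messages into (system_text, remaining_messages).

    Assumes every "content" value is a plain string.
    """
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    remaining = [m for m in messages if m["role"] != "system"]
    return "\n\n".join(system_parts), remaining


openai_messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is SSE?"},
]

system_text, native_messages = to_native_system(openai_messages)
# system_text would go in the top-level system field of POST /v1/messages,
# native_messages in its messages field.
```

On the OpenAI-compatible path no such conversion is needed on the client side; the messages list is sent unchanged.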
Thinking mode (create thinking chat)
Some Claude thinking models can enable thinking output in streaming requests via an extension field:
{
  "model": "claude-sonnet-4-20250514",
  "messages": [{ "role": "user", "content": "Reason step by step: why is the sky blue?" }],
  "stream": true,
  "extra_body": {
    "enable_thinking": true
  }
}

- How thinking is surfaced in compatible format depends on the gateway and model; choose a Claude model that supports “thinking” in the model marketplace
- For precise control of the thinking token budget (thinking.budget_tokens) and separate thinking vs. answer blocks, use the native Create extended thinking chat (POST /v1/messages)
Vision (image understanding)
Use a multimodal array in the user message content: text + image_url (URL or Base64).
{
  "model": "claude-sonnet-4-20250514",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "Describe the main content of this image" },
        {
          "type": "image_url",
          "image_url": { "url": "https://example.com/photo.jpg" }
        }
      ]
    }
  ],
  "stream": true
}

- Streaming vision: stream: true, see Create chat vision (streaming)
- Non-streaming vision: stream: false, see Create chat vision (non-streaming)
- Base64 images: see Create chat vision (streaming) Base64
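For local files, the image can be embedded instead of linked. The sketch below builds an image_url content part from raw bytes, assuming the gateway accepts the OpenAI-style data URL form (data:<mime>;base64,<payload>). That form is an assumption here; the Base64 reference above is authoritative. The helper name image_part_from_bytes is hypothetical.

```python
import base64


def image_part_from_bytes(image_bytes, mime_type="image/jpeg"):
    """Build an image_url content part carrying the image as a Base64 data URL.

    Assumes the gateway accepts OpenAI-style data URLs.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


# Placeholder bytes for illustration; in practice, read them from a file:
#   with open("photo.jpg", "rb") as f: image_bytes = f.read()
part = image_part_from_bytes(b"abc", mime_type="image/png")

user_message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe the main content of this image"},
        part,
    ],
}
```

The resulting user_message slots into the messages array exactly like the URL-based example.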
Choosing OpenAI-compatible vs. native Claude API
| Your need | Recommendation |
|---|---|
| Quick integration, reuse OpenAI SDK / existing code | OpenAI-compatible format (this page) |
| Extended thinking, budget_tokens, thinking-block SSE structure | Native API: Create extended thinking chat |
| Tool Use / function calling | Native API: Create function calling (streaming) |
| Structured output (JSON Schema) | Native API: Create structured outputs |
| PDF documents, web search | Native API: PDF support, Web search |
| Only need chat + vision + thinking with an OpenAI client | OpenAI-compatible format covers the main path |
FAQ
Q: Why doesn’t this match the official OpenAI docs exactly?
A: The compatibility layer aligns requests/responses with OpenAI while Claude runs underneath. Some OpenAI-only parameters may be ignored; rely on what the model actually supports.
Q: What should I put in model?
A: Use the full model ID from the model marketplace (usually including claude). Do not use shorthand or outdated names.
Q: Can I use the official Anthropic Python SDK?
A: That SDK targets POST /v1/messages by default. To keep using it, follow the native API docs in this folder; use the OpenAI SDK for the compatible format on this page.
Q: What is the streaming response format?
A: SSE with data: {...} lines and data: [DONE] at the end, same as OpenAI streaming Chat Completions. See Chat completion chunk object.
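As a rough illustration of that format, the sketch below extracts text fragments from a sequence of SSE lines and stops at data: [DONE]. It reads choices[].delta.content as in OpenAI streaming chunks. The sample lines are hand-written for illustration, not captured gateway output, and the function name iter_content_deltas is hypothetical.

```python
import json


def iter_content_deltas(lines):
    """Yield text fragments from OpenAI-style streaming SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample stream for illustration.
sample_stream = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant", "content": "Hel"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]

answer = "".join(iter_content_deltas(sample_stream))
```

When using the OpenAI SDK with stream: true, this parsing is handled by the SDK; a manual parser like this is only relevant for raw HTTP clients.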
Q: Full response JSON structure?
A: See Chat completion object. For native Messages API responses, see Chat completion object (Anthropic format).
Further reading
- Getting Started: first call and API Key
- API quick start guide: Base URL and client setup
- List models: GET /v1/models for available models
- Other pages in this folder: detailed parameters and examples for Claude native Messages API