Claude OpenAI-Compatible Format

Fast-Token supports calling Anthropic Claude models via the OpenAI API protocol. If you already have apps built on the OpenAI SDK or /v1/chat/completions, you can usually connect to Claude by changing only base_url and model, without rewriting to the native Anthropic Messages API.

This page focuses on usage guidance. For request bodies, response fields, and online debugging of each endpoint, see the ChatGPT documentation and the links in the capability table below.

Two integration approaches

Comparison | OpenAI-compatible format (this page) | Native Claude API (other docs in this folder)
Typical path | POST /v1/chat/completions | POST /v1/messages
Request body | OpenAI fields such as messages, model, stream | Anthropic fields such as messages, system, thinking
Best for | Existing OpenAI clients, unified multi-model gateways, quick migration | Claude-only features (extended thinking budget, Tool Use, PDF, web search, etc.)
Documentation | This page + Chat section | API pages under the Chat group in this folder

Both approaches share the same API Key and gateway URL. Billing follows the corresponding model in the model marketplace.

Setup

Gateway and authentication

  • Base URL: https://fast-token.com/v1 (same as Getting Started)
  • Authentication: request header Authorization: Bearer <Fast-Token_API_KEY>
  • Model name: copy a model ID containing claude from the model marketplace into the model field

Point the official SDK base_url to Fast-Token; everything else works like OpenAI:

python
from openai import OpenAI

client = OpenAI(
    base_url="https://fast-token.com/v1",
    api_key="<Fast-Token_API_KEY>",
)

completion = client.chat.completions.create(
    model="claude-sonnet-4-20250514",  # use model marketplace IDs
    messages=[
        {"role": "user", "content": "Introduce yourself in one sentence"},
    ],
)
print(completion.choices[0].message.content)

javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://fast-token.com/v1",
  apiKey: "<Fast-Token_API_KEY>",
});

const completion = await client.chat.completions.create({
  model: "claude-sonnet-4-20250514",
  messages: [{ role: "user", content: "Introduce yourself in one sentence" }],
});
console.log(completion.choices[0].message.content);

Set stream: true for streaming chat. See Create chat completion (streaming).
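
As a minimal sketch of consuming a streamed reply with the OpenAI Python SDK, the helpers below request a stream and join the text deltas into one string. The names collect_stream_text and stream_reply are illustrative and not part of any SDK. The code assumes the gateway returns OpenAI-style chunks whose text arrives in choices[0].delta.content.

```python
def collect_stream_text(chunks):
    """Join the text deltas from an iterable of chat completion chunks."""
    parts = []
    for chunk in chunks:
        # Some chunks (for example a trailing usage chunk) may carry no choices.
        if not chunk.choices:
            continue
        text = chunk.choices[0].delta.content
        if text:  # the delta may be None or empty on role-only or final chunks
            parts.append(text)
    return "".join(parts)


def stream_reply(client, model, prompt):
    """Send one user prompt with stream=True and return the full reply text.

    `client` is expected to be an OpenAI SDK client pointed at the gateway.
    """
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    return collect_stream_text(stream)
```

With the client from the setup example, stream_reply(client, "claude-sonnet-4-20250514", "Introduce yourself in one sentence") would return the reply as a single string. To show output as it arrives, print each delta inside the loop instead of collecting it.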

Chat-compatible capabilities overview

The following capabilities are accessed via the OpenAI-compatible path. In the console or Apifox they are often grouped as “chat-compatible format”. In practice most of them use the same POST /v1/chat/completions endpoint and differ only in model, streaming, and message structure.

Capability | Description | Reference
Create thinking chat | Conversations with model “thinking” output | Streaming (extra_body.enable_thinking)
Create chat completion (streaming) | Standard multi-turn text chat, SSE streaming | Create chat completion (streaming)
Create chat completion (non-streaming) | Standard multi-turn text chat, single response | Create chat completion (non-streaming)
Create chat vision (streaming) | Upload images for understanding, description, or Q&A | Create chat vision (streaming)
Create chat vision (non-streaming) | Vision scenarios, full non-streaming response | Create chat vision (non-streaming)

Model selection

Use the model ID from the model marketplace for each scenario. Pick Claude models whose names include claude and that support the relevant chat / vision / thinking capability. If you get a “model not found” or unsupported-capability error, switch to another Claude entry for the same scenario.
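
One way to shortlist candidates is to filter the IDs returned by List models (GET /v1/models). This is a sketch that assumes the endpoint returns OpenAI-style model objects with an id field. The helper names claude_model_ids and list_claude_models are illustrative only.

```python
def claude_model_ids(model_ids):
    """Return the IDs that contain "claude" (case-insensitive), sorted."""
    return sorted(m for m in model_ids if "claude" in m.lower())


def list_claude_models(client):
    """Fetch models through an OpenAI SDK client and keep the Claude entries.

    Assumes GET /v1/models returns OpenAI-style objects with an `id` field.
    """
    return claude_model_ids(model.id for model in client.models.list())
```

The filter only checks the name. Whether a given entry supports vision or thinking still has to be confirmed in the model marketplace.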

Usage notes by scenario

Standard chat (streaming / non-streaming)

  • Endpoint: POST /v1/chat/completions
  • Use a messages array for multi-turn dialogue; role supports system / user / assistant (same as OpenAI)
  • Non-streaming: stream: false or omit stream; see Create chat completion (non-streaming)
  • Streaming: stream: true; see Create chat completion (streaming)
  • Common parameters (temperature, max_tokens, top_p, etc.) behave as described in the OpenAI documentation

system role

In OpenAI format, put the system prompt in messages with role: "system". The native Claude API often uses a top-level system field; see Create chat completion (streaming).
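
To make the difference concrete, the sketch below builds an OpenAI-style messages list with a leading system message, then separates the system text the way a request with a top-level system field would need it. Both helper names are illustrative, and the native request shape is only outlined here; check the native API pages for the exact fields.

```python
def build_messages(system_prompt, turns):
    """OpenAI-compatible shape: the system prompt is the first message.

    `turns` is a list of (role, content) pairs with role "user" or "assistant".
    """
    messages = [{"role": "system", "content": system_prompt}]
    messages.extend({"role": role, "content": content} for role, content in turns)
    return messages


def split_system(messages):
    """Native-style shape: system text pulled out into its own value.

    Returns (system_text, remaining_messages). Multiple system messages are
    joined with blank lines, which is a choice made for this sketch.
    """
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return "\n\n".join(system_parts), rest
```

On this page's compatible path, the list from build_messages can be passed directly as messages. split_system is only relevant when moving the same conversation to the native API.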

Thinking mode (create thinking chat)

Some Claude thinking models can enable thinking output in streaming requests via an extension field:

json
{
  "model": "claude-sonnet-4-20250514",
  "messages": [{ "role": "user", "content": "Explain step by step: why is the sky blue?" }],
  "stream": true,
  "extra_body": {
    "enable_thinking": true
  }
}

  • How thinking is surfaced in compatible format depends on the gateway and model; choose a Claude model that supports “thinking” in the model marketplace
  • For precise control of thinking token budget (thinking.budget_tokens) and separate thinking vs. answer blocks, use the native Create extended thinking chat (POST /v1/messages)
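
When sending this request through the OpenAI Python SDK, note that the SDK's extra_body argument merges its keys into the top level of the JSON body. The sketch below builds keyword arguments for client.chat.completions.create in both shapes: one that reproduces the literal "extra_body" field from the JSON above, and one that sends enable_thinking at the top level. Which shape the gateway reads is not stated on this page, so the default here is an assumption to verify against a real response. The helper name thinking_request_kwargs is illustrative.

```python
def thinking_request_kwargs(model, prompt, nest_under_extra_body=True):
    """Build keyword arguments for client.chat.completions.create(**kwargs)."""
    flag = {"enable_thinking": True}
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
        # The SDK merges the keys of `extra_body` into the top level of the
        # JSON body. Nesting once more reproduces the literal
        # "extra_body": {...} field shown in the JSON example (assumption:
        # that is the shape the gateway expects).
        "extra_body": {"extra_body": flag} if nest_under_extra_body else flag,
    }
```

A call would then look like client.chat.completions.create(**thinking_request_kwargs("claude-sonnet-4-20250514", "Explain step by step: why is the sky blue?")).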

Vision (image understanding)

Use a multimodal array in the user message content: text + image_url (URL or Base64).

json
{
  "model": "claude-sonnet-4-20250514",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "Describe the main content of this image" },
        {
          "type": "image_url",
          "image_url": { "url": "https://example.com/photo.jpg" }
        }
      ]
    }
  ],
  "stream": true
}
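
The JSON example uses a URL. For the Base64 case, a common convention in OpenAI-style requests is to put a data URL in image_url.url. The sketch below assumes the gateway accepts that form; the helper names image_part_from_bytes and vision_message are illustrative.

```python
import base64
import mimetypes


def image_part_from_bytes(data, filename="photo.jpg"):
    """Return an image_url content part carrying the image as a Base64 data URL."""
    # The MIME type is inferred from the file name only, not from the bytes.
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"cannot infer an image type from {filename!r}")
    encoded = base64.b64encode(data).decode("ascii")
    return {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}}


def vision_message(prompt, image_part):
    """Build a user message with a text part followed by one image part."""
    return {"role": "user", "content": [{"type": "text", "text": prompt}, image_part]}
```

Read the file with open(path, "rb").read(), pass the bytes and the file name to image_part_from_bytes, and send the resulting message in messages as in the URL example.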

Choosing OpenAI-compatible vs. native Claude API

Your need | Recommendation
Quick integration, reuse OpenAI SDK / existing code | OpenAI-compatible format (this page)
Extended thinking, budget_tokens, thinking-block SSE structure | Native API: Create extended thinking chat
Tool Use / function calling | Native API: Create function calling (streaming)
Structured output (JSON Schema) | Native API: Create structured outputs
PDF documents, web search | Native API: PDF support, Web search
Only need chat + vision + thinking with an OpenAI client | OpenAI-compatible format covers the main path

FAQ

Q: Why doesn’t this match the official OpenAI docs exactly?
A: The compatibility layer aligns requests/responses with OpenAI while Claude runs underneath. Some OpenAI-only parameters may be ignored; rely on what the model actually supports.

Q: What should I put in model?
A: Use the full model ID from the model marketplace (usually including claude). Do not use shorthand or outdated names.

Q: Can I use the official Anthropic Python SDK?
A: That SDK targets POST /v1/messages by default. To keep using it, follow the native API docs in this folder; use the OpenAI SDK for the compatible format on this page.

Q: What is the streaming response format?
A: SSE with data: {...} lines and data: [DONE] at the end, same as OpenAI streaming Chat Completions. See Chat completion chunk object.
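
For clients that read the stream without an SDK, a minimal parser for this format could look like the sketch below. It assumes each event is a single data: line holding one JSON object, as described above. The function name parse_sse_lines is illustrative.

```python
import json


def parse_sse_lines(lines):
    """Yield the decoded JSON payload of each `data:` line until `[DONE]`."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end-of-stream marker; stop without yielding it
        yield json.loads(payload)
```

Each yielded object is one chunk. The reply text can be rebuilt by joining choices[0].delta.content across chunks where that field is present.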

Q: Where is the full response JSON structure documented?
A: See Chat completion object. For native Messages API responses, see Chat completion object (Anthropic format).

Further reading

  • Getting Started: first call and API Key
  • API quick start guide: Base URL and client setup
  • List models: GET /v1/models for available models
  • Other pages in this folder: detailed parameters and examples for Claude native Messages API