OpenAI Compatible API

Puyun AI is fully compatible with the OpenAI API protocol. You can use the OpenAI SDK or any OpenAI-compatible client to connect directly.

Base URL

text
https://ai.tracup.com/v1

Authentication

All endpoints require authentication. Send your API key in one of the following headers:

  • Bearer Token: Authorization: Bearer <your-api-key>
  • API Key Header: x-api-key: <your-api-key>
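
As a minimal sketch, the two header styles above can be built from Python with only the standard library. The helper name `auth_headers` and its `scheme` argument are hypothetical and chosen for illustration; only the header names and the base URL come from this page.

```python
import urllib.request

BASE_URL = "https://ai.tracup.com/v1"


def auth_headers(api_key: str, scheme: str = "bearer") -> dict:
    """Return the header for one of the two documented authentication styles.

    `scheme` is a hypothetical switch used only in this sketch.
    """
    if scheme == "bearer":
        return {"Authorization": f"Bearer {api_key}"}
    if scheme == "x-api-key":
        return {"x-api-key": api_key}
    raise ValueError(f"unknown scheme: {scheme!r}")


# Build (but do not send) a request to the models endpoint.
request = urllib.request.Request(
    f"{BASE_URL}/models",
    headers=auth_headers("sk-your-api-key"),
)
# urllib.request.urlopen(request) would perform the actual call.
```

Keeping header construction in one place makes it easy to switch between the two styles if a client library only supports one of them.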

List Models

Returns the list of currently available models.

bash
curl https://ai.tracup.com/v1/models \
  -H "Authorization: Bearer sk-your-api-key"

Response example:

json
{
  "object": "list",
  "data": [
    { "id": "gpt-4o", "object": "model", "owned_by": "openai" },
    { "id": "claude-sonnet-4-6", "object": "model", "owned_by": "anthropic" },
    { "id": "gemini-2.5-pro", "object": "model", "owned_by": "google" }
  ]
}
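
A small sketch of how a client might consume this response, assuming the body has the shape shown above. The function name `models_by_owner` is hypothetical; it groups model IDs by their `owned_by` field.

```python
import json
from collections import defaultdict


def models_by_owner(payload: dict) -> dict:
    """Group model IDs by the `owned_by` field of a list-models response."""
    grouped = defaultdict(list)
    for model in payload.get("data", []):
        grouped[model["owned_by"]].append(model["id"])
    return dict(grouped)


# Sample body copied from the response example above.
sample = json.loads("""
{
  "object": "list",
  "data": [
    { "id": "gpt-4o", "object": "model", "owned_by": "openai" },
    { "id": "claude-sonnet-4-6", "object": "model", "owned_by": "anthropic" },
    { "id": "gemini-2.5-pro", "object": "model", "owned_by": "google" }
  ]
}
""")

print(models_by_owner(sample))
```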

Chat Completions

OpenAI-compatible chat completion endpoint. Supports all upstream models, not just OpenAI models.

bash
curl https://ai.tracup.com/v1/chat/completions \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain quantum computing"}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'

Request parameters:

Parameter          Type           Required  Description
model              string         Yes       Model ID (e.g., gpt-4o, claude-sonnet-4-6)
messages           array          Yes       Array of message objects
stream             boolean        No        Enable streaming response
temperature        number         No        Sampling temperature (0-2)
max_tokens         integer        No        Maximum tokens to generate
top_p              number         No        Nucleus sampling probability
n                  integer        No        Number of completion candidates
stop               string/array   No        Stop sequences
presence_penalty   number         No        Presence penalty (-2 to 2)
frequency_penalty  number         No        Frequency penalty (-2 to 2)
tools              array          No        Function Calling tool definitions
tool_choice        string/object  No        Tool calling strategy
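
The parameter list above can be turned into a small client-side builder that rejects unknown options and out-of-range values before a request is sent. This is only a sketch: `build_chat_request` is a hypothetical helper, and validating on the client is an assumption of this example, not documented platform behavior.

```python
# Optional parameters and numeric ranges taken from the table above.
OPTIONAL_PARAMS = {
    "stream", "temperature", "max_tokens", "top_p", "n", "stop",
    "presence_penalty", "frequency_penalty", "tools", "tool_choice",
}
RANGES = {
    "temperature": (0, 2),
    "presence_penalty": (-2, 2),
    "frequency_penalty": (-2, 2),
}


def build_chat_request(model: str, messages: list, **options) -> dict:
    """Assemble a chat completions request body from documented parameters."""
    unknown = set(options) - OPTIONAL_PARAMS
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    for name, (low, high) in RANGES.items():
        if name in options and not low <= options[name] <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
    return {"model": model, "messages": messages, **options}


body = build_chat_request(
    "gpt-4o",
    [{"role": "user", "content": "Explain quantum computing"}],
    temperature=0.7,
    max_tokens=1024,
)
```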

Response example:

json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing is a type of computing that leverages quantum mechanics..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 128,
    "total_tokens": 153
  }
}

Streaming Response

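When "stream": true is set, the response is delivered as Server-Sent Events (SSE). Each data: line carries one chat.completion.chunk object, and data: [DONE] marks the end of the stream: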

text
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"Qu"},"index":0}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"an"},"index":0}]}

data: [DONE]
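
A minimal sketch of reading such a stream by hand, assuming each event arrives as a single `data:` line as in the sample above. The function name `collect_stream_text` is hypothetical. It concatenates the `delta.content` fragments and stops at `[DONE]`.

```python
import json


def collect_stream_text(lines) -> str:
    """Concatenate delta content from SSE `data:` lines until [DONE]."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


# Lines copied from the sample stream above.
sample_stream = [
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"Qu"},"index":0}]}',
    "",
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"an"},"index":0}]}',
    "",
    "data: [DONE]",
]

print(collect_stream_text(sample_stream))  # prints "Quan"
```

In practice the OpenAI SDK handles this parsing when `stream=True` is passed, so a hand-written parser is mainly useful for clients without SDK support.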

Embeddings

Generates vector embeddings for the input text.

bash
curl https://ai.tracup.com/v1/embeddings \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "Hello world"
  }'

Request parameters:

Parameter        Type          Required  Description
model            string        Yes       Embedding model ID
input            string/array  Yes       Text to embed
encoding_format  string        No        Encoding format (float or base64)
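
The two `encoding_format` values can be handled by one decoder. The sketch below assumes that `base64` embeddings follow the OpenAI convention of base64-encoded little-endian 32-bit floats. That layout is an assumption about this platform, not something this page states, and `decode_embedding` is a hypothetical helper.

```python
import base64
import struct


def decode_embedding(value) -> list:
    """Return an embedding as a list of floats.

    Accepts either a list of numbers (encoding_format "float") or a
    base64 string, assumed here to hold little-endian float32 values.
    """
    if isinstance(value, list):
        return [float(x) for x in value]
    raw = base64.b64decode(value)
    count = len(raw) // 4  # 4 bytes per float32
    return list(struct.unpack(f"<{count}f", raw))


# Round-trip a small vector to show both input forms decode the same way.
vector = [0.5, -1.0, 2.0]
encoded = base64.b64encode(struct.pack("<3f", *vector)).decode("ascii")
decoded = decode_embedding(encoded)
```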

Automatic Protocol Conversion

One of Puyun AI's core advantages is automatic protocol conversion. When you call a non-OpenAI model (such as Claude or Gemini) through the OpenAI-compatible endpoint, the platform automatically:

  1. Request conversion: Converts the OpenAI-format request body to the target model's native format
  2. Response conversion: Converts the target model's native response to OpenAI format
  3. Streaming adaptation: Converts streaming responses to OpenAI format as well

This means you can use the same OpenAI SDK code to call all models without worrying about underlying protocol differences.
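
To illustrate what the request conversion step involves, here is a rough sketch that maps an OpenAI-format body to the general shape of Anthropic's Messages API (top-level `system` string, `stop_sequences` array, required `max_tokens`). It is not the platform's implementation. The function name `to_anthropic_style` and the default of 1024 tokens are assumptions made for this example.

```python
def to_anthropic_style(openai_body: dict, default_max_tokens: int = 1024) -> dict:
    """Illustrative mapping from an OpenAI chat body to an Anthropic-style body."""
    messages = openai_body["messages"]
    # OpenAI carries system prompts as messages; Anthropic uses a top-level field.
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    converted = {
        "model": openai_body["model"],
        "messages": [m for m in messages if m["role"] != "system"],
        # max_tokens is required on the Anthropic side (default is assumed here).
        "max_tokens": openai_body.get("max_tokens", default_max_tokens),
    }
    if system_parts:
        converted["system"] = "\n".join(system_parts)
    stop = openai_body.get("stop")
    if stop is not None:
        # OpenAI accepts a string or an array; Anthropic expects an array.
        converted["stop_sequences"] = [stop] if isinstance(stop, str) else list(stop)
    return converted


native = to_anthropic_style({
    "model": "claude-sonnet-4-6",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing"},
    ],
    "stop": "END",
})
```

A real converter also has to handle tools, multimodal content, and the reverse mapping for responses, which this sketch leaves out.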

Using the OpenAI SDK

python
from openai import OpenAI

client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://ai.tracup.com/v1"
)

# Call an OpenAI model
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}]
)

# Call a Claude model — same code, automatic protocol conversion
response = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Hello"}]
)

# Call a Gemini model — same approach
response = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=[{"role": "user", "content": "Hello"}]
)

Error Responses

All errors follow the OpenAI error format:

json
{
  "error": {
    "message": "Invalid API key",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}

HTTP Status Code  Meaning
400               Invalid request parameters
401               Authentication failed: invalid API Key
403               Insufficient permissions: model unavailable
429               Rate limit or insufficient balance
500               Internal server error
502               Upstream service error
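
A sketch of how a client might interpret these responses, assuming the error body has the shape shown above. `describe_error` and the `MAY_RETRY` set are hypothetical. Treating 429, 500, and 502 as worth retrying is a choice made for this example, not documented platform guidance.

```python
import json

# Status codes this sketch treats as possibly transient (an assumption).
# Note that 429 can also mean insufficient balance, where retrying will not help.
MAY_RETRY = {429, 500, 502}


def describe_error(status: int, body: str) -> tuple:
    """Return (message, code, may_retry) for an error response."""
    try:
        error = json.loads(body).get("error", {})
    except (ValueError, AttributeError):
        error = {}  # body was not JSON, or not a JSON object
    if not isinstance(error, dict):
        error = {}
    message = error.get("message") or f"HTTP {status}"
    return message, error.get("code"), status in MAY_RETRY


sample_body = """
{
  "error": {
    "message": "Invalid API key",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}
"""
print(describe_error(401, sample_body))
```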