OpenAI Compatible API

Puyun AI is fully compatible with the OpenAI API protocol. You can use the OpenAI SDK or any OpenAI-compatible client to connect directly.

Base URL

https://ai.tracup.com/v1

Authentication

All endpoints require authentication. Pass your API key in either of the following headers:

Bearer Token: Authorization: Bearer <your-api-key>
API Key Header: x-api-key: <your-api-key>
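
As a minimal sketch of the two header styles listed above, the helper below builds the header for either one. The function name `auth_headers` and the `style` values are illustrative assumptions, not part of the Puyun AI API.

```python
def auth_headers(api_key: str, style: str = "bearer") -> dict[str, str]:
    """Return the HTTP header for one of the two documented auth styles.

    `auth_headers` and the `style` argument are hypothetical helpers
    introduced for illustration only.
    """
    if not api_key:
        raise ValueError("api_key must not be empty")
    if style == "bearer":
        # Authorization: Bearer <your-api-key>
        return {"Authorization": f"Bearer {api_key}"}
    if style == "x-api-key":
        # x-api-key: <your-api-key>
        return {"x-api-key": api_key}
    raise ValueError(f"unknown auth style: {style!r}")


print(auth_headers("sk-your-api-key"))
print(auth_headers("sk-your-api-key", style="x-api-key"))
```

Either dictionary can be passed as the headers of any HTTP client request.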

List Models

Returns the list of currently available models.

curl https://ai.tracup.com/v1/models \
  -H "Authorization: Bearer sk-your-api-key"

Response example:

{
  "object": "list",
  "data": [
    { "id": "gpt-4o", "object": "model", "owned_by": "openai" },
    { "id": "claude-sonnet-4-6", "object": "model", "owned_by": "anthropic" },
    { "id": "gemini-2.5-pro", "object": "model", "owned_by": "google" }
  ]
}
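
One way to work with this response is to group the model IDs by their `owned_by` field, for example to show users which providers are available. The sketch below parses the sample response shown above; the helper name `models_by_owner` is an assumption made for illustration.

```python
import json
from collections import defaultdict


def models_by_owner(response_text: str) -> dict[str, list[str]]:
    """Group model IDs from a /v1/models response by their owner."""
    payload = json.loads(response_text)
    grouped = defaultdict(list)
    for model in payload.get("data", []):
        # Fall back to "unknown" if a model entry has no owner field.
        grouped[model.get("owned_by", "unknown")].append(model["id"])
    return dict(grouped)


# The sample response from this section, used as offline test data.
sample_models = """
{
  "object": "list",
  "data": [
    { "id": "gpt-4o", "object": "model", "owned_by": "openai" },
    { "id": "claude-sonnet-4-6", "object": "model", "owned_by": "anthropic" },
    { "id": "gemini-2.5-pro", "object": "model", "owned_by": "google" }
  ]
}
"""

print(models_by_owner(sample_models))
```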

Chat Completions

Creates a chat completion using the OpenAI request and response format. The endpoint accepts every upstream model, not only OpenAI models.

curl https://ai.tracup.com/v1/chat/completions \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain quantum computing"}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'

Request parameters:

Parameter	Type	Required	Description
model	string	Yes	Model ID (e.g., gpt-4o, claude-sonnet-4-6)
messages	array	Yes	Array of message objects
stream	boolean	No	Enable streaming response
temperature	number	No	Sampling temperature (0-2)
max_tokens	integer	No	Maximum tokens to generate
top_p	number	No	Nucleus sampling probability
n	integer	No	Number of completion candidates
stop	string/array	No	Stop sequences
presence_penalty	number	No	Presence penalty (-2 to 2)
frequency_penalty	number	No	Frequency penalty (-2 to 2)
tools	array	No	Tool definitions for function calling
tool_choice	string/object	No	Tool calling strategy
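
The ranges in the table can be checked on the client before a request is sent. The following is a rough sketch, assuming you want to fail fast locally; the helper name `build_chat_request` and the rule that `messages` must be non-empty are assumptions, and the server remains the authority on what it accepts.

```python
# Numeric ranges taken from the parameter table above.
RANGES = {
    "temperature": (0.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
    "frequency_penalty": (-2.0, 2.0),
}


def build_chat_request(model: str, messages: list[dict], **options) -> dict:
    """Build a chat completion request body, validating documented ranges."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        # Assumption: an empty messages array is treated as invalid here.
        raise ValueError("messages is required and must not be empty")

    body = {"model": model, "messages": messages}
    for name, value in options.items():
        if value is None:
            continue  # omit optional parameters that were not set
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(
                    f"{name} must be between {low} and {high}, got {value}"
                )
        body[name] = value
    return body


chat_body = build_chat_request(
    "gpt-4o",
    [{"role": "user", "content": "Explain quantum computing"}],
    temperature=0.7,
    max_tokens=1024,
    stop=None,
)
print(chat_body)
```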

Response example:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing is a type of computing that leverages quantum mechanics..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 128,
    "total_tokens": 153
  }
}
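
To illustrate where the useful fields sit in this response, the sketch below pulls out the assistant text, the finish reason, and the token usage from a shortened copy of the sample above. The helper name `summarize_completion` is hypothetical.

```python
import json


def summarize_completion(response_text: str) -> dict:
    """Extract the first choice's text and the token usage from a response."""
    payload = json.loads(response_text)
    first = payload["choices"][0]
    usage = payload.get("usage", {})
    return {
        "text": first["message"]["content"],
        "finish_reason": first.get("finish_reason"),
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
        "total_tokens": usage.get("total_tokens"),
    }


# Shortened copy of the sample response from this section.
sample_completion = """
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Quantum computing is..."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 25, "completion_tokens": 128, "total_tokens": 153}
}
"""

summary = summarize_completion(sample_completion)
print(summary["text"], summary["total_tokens"])
```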

Streaming Response

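When "stream": true is set, the response is delivered as Server-Sent Events (SSE). Each event is a data: line carrying one JSON chunk, and the stream ends with data: [DONE]: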

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"Qu"},"index":0}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"an"},"index":0}]}

data: [DONE]
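
A client reassembles the reply by concatenating the `delta.content` fragments until it sees `[DONE]`. The sketch below does this for the sample events shown above; the function name `collect_stream_text` is an assumption, and only the choice with index 0 is read, which assumes a single completion candidate.

```python
import json
from typing import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Concatenate delta.content fragments from SSE data lines."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            if choice.get("index", 0) != 0:
                continue  # assumption: only the first candidate is needed
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


# The sample events from this section.
sample_stream = [
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"Qu"},"index":0}]}',
    "",
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"an"},"index":0}]}',
    "",
    "data: [DONE]",
]

print(collect_stream_text(sample_stream))  # prints "Quan"
```

In practice the lines would come from a streaming HTTP response body rather than a list.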

Embeddings

Generates vector embeddings for the input text.

curl https://ai.tracup.com/v1/embeddings \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "Hello world"
  }'

Request parameters:

Parameter	Type	Required	Description
model	string	Yes	Embedding model ID
input	string/array	Yes	Text to embed
encoding_format	string	No	Encoding format (float or base64)

Automatic Protocol Conversion

One of Puyun AI's core advantages is automatic protocol conversion. When you call a non-OpenAI model (such as Claude or Gemini) through the OpenAI-compatible endpoint, the platform automatically performs the following steps:

Request conversion: Converts the OpenAI-format request body to the target model's native format
Response conversion: Converts the target model's native response to OpenAI format
Streaming adaptation: Converts streamed chunks from the target model into OpenAI-format chunks

This means you can use the same OpenAI SDK code to call all models without worrying about underlying protocol differences.
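
To give a feel for what request conversion involves, here is a purely illustrative sketch. It is not Puyun AI's implementation, which this page does not describe. It assumes a target format in the style of Anthropic's Messages API, where the system prompt is a top-level `system` field and `max_tokens` is required. The function name `to_anthropic_style` and the default of 1024 tokens are hypothetical.

```python
def to_anthropic_style(openai_body: dict, default_max_tokens: int = 1024) -> dict:
    """Illustrative conversion of an OpenAI-format body (hypothetical helper).

    Moves system messages to a top-level "system" field and makes sure
    "max_tokens" is present. Other parameters are ignored in this sketch.
    """
    system_parts = []
    messages = []
    for message in openai_body["messages"]:
        if message["role"] == "system":
            system_parts.append(message["content"])
        else:
            messages.append(
                {"role": message["role"], "content": message["content"]}
            )

    converted = {
        "model": openai_body["model"],
        "messages": messages,
        # Assumption: fall back to a default when the caller sets no limit.
        "max_tokens": openai_body.get("max_tokens", default_max_tokens),
    }
    if system_parts:
        converted["system"] = "\n".join(system_parts)
    return converted


openai_request = {
    "model": "claude-sonnet-4-6",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing"},
    ],
}
print(to_anthropic_style(openai_request))
```

The platform performs this kind of mapping on the server, so client code never needs a helper like this one.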

Using the OpenAI SDK

from openai import OpenAI

client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://ai.tracup.com/v1"
)

# Call an OpenAI model
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}]
)
print(response.choices[0].message.content)

# Call a Claude model: same code, automatic protocol conversion
response = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Hello"}]
)
print(response.choices[0].message.content)

# Call a Gemini model: same approach
response = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=[{"role": "user", "content": "Hello"}]
)
print(response.choices[0].message.content)

Error Responses

All errors follow the OpenAI error format:

{
  "error": {
    "message": "Invalid API key",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}

HTTP Status Code	Meaning
400	Invalid request parameters
401	Authentication failed — invalid API Key
403	Insufficient permissions — model unavailable
429	Rate limit exceeded or insufficient balance
500	Internal server error
502	Upstream service error
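
A client can combine the status code with the error body to decide how to react. The sketch below is one possible approach: the helper name `describe_error` is hypothetical, and treating 429, 500, and 502 as retryable is an assumption rather than documented behavior. A 429 caused by insufficient balance, for example, will not succeed on retry.

```python
import json

# Meanings copied from the status code table above.
STATUS_MEANINGS = {
    400: "Invalid request parameters",
    401: "Authentication failed",
    403: "Insufficient permissions or model unavailable",
    429: "Rate limit or insufficient balance",
    500: "Internal server error",
    502: "Upstream service error",
}

# Assumption: these statuses may be transient and worth retrying with backoff.
RETRYABLE = {429, 500, 502}


def describe_error(status: int, body_text: str) -> dict:
    """Combine an HTTP status with the OpenAI-format error body."""
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        payload = {}  # body was not JSON, e.g. a proxy error page
    error = payload.get("error", {}) if isinstance(payload, dict) else {}
    return {
        "status": status,
        "meaning": STATUS_MEANINGS.get(status, "Unknown status"),
        "message": error.get("message"),
        "code": error.get("code"),
        "retryable": status in RETRYABLE,
    }


sample_error = (
    '{"error": {"message": "Invalid API key", '
    '"type": "invalid_request_error", "code": "invalid_api_key"}}'
)
print(describe_error(401, sample_error))
```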