Usage

Learn about Puyun AI's service tiers, pricing model, and usage tracking.

Service Tiers

Puyun AI offers three service tiers: Test, Flex, and Enterprise. Each targets a different use case and differs in service quality and price.

Test

Use case: API debugging, feature verification, early-stage development testing
Features:
- Lowest cost, ideal for high-volume test calls
- No guarantee of service stability or response speed
- Request queuing may occur
Recommended for: Initial integration testing, CI/CD automated testing

Flex

Use case: Individual development, independent projects, daily use
Features:
- Balanced cost-performance ratio
- Good service stability
- Supports most mainstream models
Recommended for: Personal tools, script automation, small-to-medium projects

Enterprise

Use case: Enterprise production environments, mission-critical systems
Features:
- Highest service stability with SLA guarantees
- Priority scheduling, low latency
- Dedicated resource pool, isolated from other users
- Dedicated technical support
Recommended for: Enterprise applications, SaaS product backends, high-concurrency production services

Tier Comparison

Feature	Test	Flex	Enterprise
Cost	Lowest	Medium	Highest
Stability	Basic	Good	Highest
Response Speed	Standard	Fast	Fastest
SLA Guarantee	None	Basic	Full
Technical Support	Community	Ticket	Dedicated
Use Case	Testing	Individual development	Enterprise production

Pricing Model

Pricing is based on token consumption. Each model has its own rates, and input and output tokens are priced separately:

- Input tokens: Tokens in prompts and conversation history
- Output tokens: Tokens in model-generated responses

Prices for the same model may differ across service tiers. The Enterprise tier costs slightly more because it provides a more stable service.

Viewing Prices

Visit the homepage to view real-time pricing for all models. Prices are displayed per million tokens.
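
Because prices are quoted per million tokens, the cost of a request can be estimated from its input and output token counts. The following is a minimal sketch of that arithmetic, not an official client: the function name `estimate_cost` and the example rates are hypothetical, and real rates should be taken from the pricing page.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_rate: Decimal, output_rate: Decimal) -> Decimal:
    """Estimate the cost of one request.

    input_rate and output_rate are prices per million tokens, in whatever
    currency the pricing page uses.
    """
    input_cost = Decimal(input_tokens) * input_rate / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * output_rate / TOKENS_PER_MILLION
    return input_cost + output_cost


# Hypothetical rates for illustration only; they are not real Puyun AI prices.
example = estimate_cost(
    input_tokens=12_000,
    output_tokens=3_000,
    input_rate=Decimal("2.00"),
    output_rate=Decimal("8.00"),
)
print(example)
```

Decimal is used instead of float so that small per-request amounts add up without rounding drift.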

Usage Tracking

Monitor your usage through the Portal dashboard:

1. Log in to the Portal
2. Go to the Usage page
3. View token consumption and cost breakdown by model

Billing

- Prepaid model: Top up your account balance before use
- Real-time deduction: Token costs are deducted from your balance in real time
- No hidden fees: You are only billed for actual token consumption
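
The prepaid model can be pictured as a balance that is topped up first and then reduced by the cost of each request. The sketch below is a local illustration only: `PrepaidAccount` and `InsufficientBalance` are hypothetical names, and how the service actually behaves when the balance runs out is not described here, so raising an error is an assumption.

```python
from decimal import Decimal


class InsufficientBalance(Exception):
    """Raised when a charge would exceed the remaining prepaid balance."""


class PrepaidAccount:
    """Local model of a prepaid balance (hypothetical, not a Puyun AI API)."""

    def __init__(self, balance: Decimal = Decimal("0")) -> None:
        self.balance = balance

    def top_up(self, amount: Decimal) -> None:
        """Add funds to the balance before use."""
        if amount <= 0:
            raise ValueError("top-up amount must be positive")
        self.balance += amount

    def charge(self, cost: Decimal) -> Decimal:
        """Deduct the token cost of one request and return the new balance."""
        if cost < 0:
            raise ValueError("cost must not be negative")
        if cost > self.balance:
            # Assumption: a request that cannot be paid for is rejected.
            raise InsufficientBalance(
                f"cost {cost} exceeds remaining balance {self.balance}"
            )
        self.balance -= cost
        return self.balance


account = PrepaidAccount()
account.top_up(Decimal("10.00"))
remaining = account.charge(Decimal("0.048"))
```

Tracking a local estimate like this can help decide when to top up, but the Portal remains the authoritative record of the balance.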

Rate Limits

Rate limits vary by service tier. For the limits that apply to your account, check the current quota information in your dashboard. The Enterprise tier supports custom rate limits.
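
One way to stay within a quota is to throttle requests on the client side. The sketch below assumes the limit is expressed as a maximum number of requests per time window, which is not confirmed here. `SlidingWindowLimiter` and the numbers in the example are hypothetical, so substitute the values shown in your dashboard.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Client-side throttle: at most max_requests per window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if max_requests < 1:
            raise ValueError("max_requests must be at least 1")
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sent: Deque[float] = deque()  # timestamps of recent requests

    def wait_time(self) -> float:
        """Return how many seconds to wait before sending (0.0 if none)."""
        now = self._clock()
        # Forget requests that have left the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        # The oldest request must leave the window before another may go out.
        return self.window_seconds - (now - self._sent[0])

    def record(self) -> None:
        """Note that a request was just sent."""
        self._sent.append(self._clock())


# Hypothetical quota: 60 requests per 60 seconds.
limiter = SlidingWindowLimiter(max_requests=60, window_seconds=60.0)
delay = limiter.wait_time()
if delay > 0:
    time.sleep(delay)
limiter.record()  # call right after sending each request
```

The clock is injectable so the limiter can be tested without real waiting.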