Usage

Learn about Puyun AI's service tiers, pricing model, and usage tracking.

Service Tiers

Puyun AI offers three service tiers, each designed for different use cases with varying service quality and pricing.

Test

  • Use case: API debugging, feature verification, early-stage development testing
  • Features:
    • Lowest cost, ideal for high-volume test calls
    • No guarantee of service stability or response speed
    • Request queuing may occur
  • Recommended for: Initial integration testing, CI/CD automated testing

Flex

  • Use case: Individual development, independent projects, daily use
  • Features:
    • Balanced cost-performance ratio
    • Good service stability
    • Supports most mainstream models
  • Recommended for: Personal tools, script automation, small-to-medium projects

Enterprise

  • Use case: Enterprise production environments, mission-critical systems
  • Features:
    • Highest service stability with SLA guarantees
    • Priority scheduling, low latency
    • Dedicated resource pool, isolated from other users
    • Dedicated technical support
  • Recommended for: Enterprise applications, SaaS product backends, high-concurrency production services

Tier Comparison

Feature            Test       Flex                    Enterprise
Cost               Lowest     Medium                  Higher
Stability          Basic      Good                    Highest
Response Speed     Standard   Fast                    Fastest
SLA Guarantee      None       Basic                   Full
Technical Support  Community  Ticket                  Dedicated
Use Case           Testing    Individual development  Enterprise production

Pricing Model

Pricing is based on token consumption. Each model has different input and output token rates:

  • Input tokens: Tokens in prompts and conversation history
  • Output tokens: Tokens in model-generated responses

Prices for the same model may differ across service tiers. The Enterprise tier is priced slightly higher due to its more stable service.

Viewing Prices

Visit the homepage to view real-time pricing for all models. Prices are displayed per million tokens.
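
As a rough illustration of token-based pricing, the cost of a single request can be estimated from its input and output token counts and the per-million-token rates. This is a minimal sketch: the function name `estimate_cost` is hypothetical, and the rates are placeholders rather than actual Puyun AI prices.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_rate: Decimal, output_rate: Decimal) -> Decimal:
    """Estimate one request's cost from per-million-token rates."""
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * input_rate
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * output_rate
    return input_cost + output_cost


# Placeholder rates (currency units per million tokens), NOT real prices.
cost = estimate_cost(
    input_tokens=12_000,
    output_tokens=3_000,
    input_rate=Decimal("2.00"),
    output_rate=Decimal("8.00"),
)
print(cost)
```

`Decimal` is used instead of `float` so that small per-request amounts add up without rounding drift. Because rates may differ across service tiers, take the actual rates from the pricing page for the tier you use.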

Usage Tracking

Monitor your usage through the Portal dashboard:

  1. Log in to the Portal
  2. Go to the Usage page
  3. View token consumption and cost breakdown by model
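
If you also log usage on your own side, a per-model breakdown similar to the one on the Usage page can be computed locally for cross-checking. The sketch below assumes a hypothetical record layout (model name, input tokens, output tokens, cost) and made-up model names; it does not reflect any Portal export format.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical locally logged records:
# (model name, input tokens, output tokens, cost)
records = [
    ("model-a", 1200, 300, Decimal("0.0048")),
    ("model-b", 800, 200, Decimal("0.0020")),
    ("model-a", 500, 100, Decimal("0.0018")),
]


def breakdown_by_model(records):
    """Sum token counts and cost for each model."""
    totals = defaultdict(
        lambda: {"input_tokens": 0, "output_tokens": 0, "cost": Decimal("0")}
    )
    for model, input_tokens, output_tokens, cost in records:
        entry = totals[model]
        entry["input_tokens"] += input_tokens
        entry["output_tokens"] += output_tokens
        entry["cost"] += cost
    return dict(totals)


summary = breakdown_by_model(records)
for model, entry in sorted(summary.items()):
    print(model, entry["input_tokens"], entry["output_tokens"], entry["cost"])
```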

Billing

  • Prepaid model: Top up your account balance before use
  • Real-time deduction: Token costs are deducted from your balance in real time
  • No hidden fees: You are only billed for actual token consumption
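
The prepaid, real-time deduction model can be pictured as a balance that each request's token cost is subtracted from. The following is an illustrative sketch only: the class and method names are hypothetical, and rejecting a charge that exceeds the balance is an assumption, since the behavior on an exhausted balance is not specified here.

```python
from decimal import Decimal


class InsufficientBalance(Exception):
    """Raised when a charge would exceed the remaining prepaid balance."""


class PrepaidAccount:
    """Toy model of a prepaid balance (not Puyun AI's implementation)."""

    def __init__(self, balance: Decimal = Decimal("0")) -> None:
        self.balance = balance

    def top_up(self, amount: Decimal) -> None:
        """Add funds to the balance before use."""
        if amount <= 0:
            raise ValueError("top-up amount must be positive")
        self.balance += amount

    def charge(self, cost: Decimal) -> None:
        """Deduct the token cost of one request from the balance."""
        if cost > self.balance:
            # Assumption: requests are refused once the balance is too low.
            raise InsufficientBalance(
                f"cost {cost} exceeds remaining balance {self.balance}"
            )
        self.balance -= cost


account = PrepaidAccount()
account.top_up(Decimal("10.00"))
account.charge(Decimal("0.048"))  # cost of one request, placeholder value
print(account.balance)
```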

Rate Limits

Rate limits vary by service tier. Check the current quota information in your dashboard for specific limits. The Enterprise tier supports custom rate limits.