Request-based plans designed for developers and small teams. Start free, scale when you need to, and never worry about token calculations.
For developers exploring Oxlo and testing ideas.
(Early bird discount)
For builders running early apps and prototypes.
7-day free trial
For teams running production workloads.
7-day free trial
For high-volume or custom requirements.
All plans use request-based pricing. No token calculations.
| Usage & Limits | ||||
|---|---|---|---|---|
| Requests included | 60 / day | 300 / day | High request limits | Custom |
| Burst rate limit | 5 / min | 30 / min | 120 / min (tunable) | Custom |
| Monthly request cap | Yes | Yes | No small daily cap | Custom |
| Queued behind paid traffic | Yes | Yes (Behind Premium) | No | No (Dedicated GPUs) |
| Models & Performance | ||||
| Optimized models over 8B | No | Limited | Yes | Yes |
| Production-grade inference | No | No | Yes | Yes |
| Priority execution | Lowest | Medium | Highest | Optional |
| Average response latency | ≤ 7 seconds | ≤ 1 second | ≤ 100 ms | Tunable |
| Request & Context Limits (Caps are for safety and performance, not billing) | ||||
| Input tokens / request | Up to 2k | Up to 4k | 8k-16k | Custom |
| Output tokens / request | Up to 512 | Up to 1k | Up to 4k | Custom |
| Pricing & Billing | ||||
| Request-based pricing | Yes | Yes | Yes | Yes |
| Token-based billing | No | No | No | No |
| Fixed monthly limits | Yes | Yes | Yes | Custom |
| Usage limits visible upfront | Yes | Yes | Yes | Yes |
| Developer Experience | ||||
| Open-source models | Yes | Yes | Yes | Yes |
| Simple API integration | Yes | Yes | Yes | Yes |
| Model-agnostic pricing | Yes | Yes | Yes | Yes |
| Support level | Community | Community | Priority | Dedicated |
| Infrastructure & Technical Differentiation | ||||
| Gateway-level request metering | Yes | Yes | Yes | Yes |
| Pricing independent of prompt length | Yes | Yes | Yes | Yes |
| Traffic prioritization by plan | No | Yes | Yes | Yes |
| Async and batch-friendly workloads | Yes | Yes | Yes | Yes |
With Oxlo.ai's request-based pricing, you pay a flat monthly subscription that includes a set number of API requests per day. Each request costs the same regardless of how many tokens are in your prompt or response: a 100-token prompt costs the same as a 50,000-token prompt. This is fundamentally different from the token-based pricing used by OpenAI, Together AI, Fireworks AI, OpenRouter, and Replicate.
For long-context workloads, yes. Together AI, Fireworks AI, and OpenRouter all charge per token, so costs scale linearly with prompt length. Running 500 API calls per day with 3,000-token prompts costs approximately $40-60/month on these providers, versus $49.90/month on Oxlo.ai Premium. As prompt length grows beyond 10,000 tokens, Oxlo.ai can be 10-100x cheaper, since every request costs the same flat rate.
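The comparison above is simple arithmetic, and a short sketch makes it easy to plug in your own numbers. The $49.90 flat price and the 500 calls per day come from the text; the per-million-token rate of $1.00 is a hypothetical blended figure chosen only because it lands inside the quoted $40-60 range, not a price quoted by any provider. The function names are illustrative.

```python
def per_token_monthly_cost(calls_per_day, tokens_per_call,
                           usd_per_million_tokens, days=30):
    """Monthly cost when every token is billed."""
    total_tokens = calls_per_day * tokens_per_call * days
    return total_tokens / 1_000_000 * usd_per_million_tokens


def break_even_tokens(flat_monthly_usd, calls_per_day,
                      usd_per_million_tokens, days=30):
    """Tokens per call at which per-token billing equals the flat price."""
    calls = calls_per_day * days
    return flat_monthly_usd / calls / usd_per_million_tokens * 1_000_000


FLAT_PREMIUM_USD = 49.90      # flat monthly price from the text
ASSUMED_USD_PER_MTOK = 1.00   # ASSUMPTION: hypothetical blended token rate

for tokens in (3_000, 10_000, 50_000):
    cost = per_token_monthly_cost(500, tokens, ASSUMED_USD_PER_MTOK)
    print(f"{tokens:>6} tokens/call: per-token ${cost:,.2f} "
          f"vs flat ${FLAT_PREMIUM_USD:.2f}")
```

Under this assumed rate the two models cost about the same at a few thousand tokens per call, and the gap widens in proportion to prompt length. Swap in the actual rate of the provider you are comparing against before drawing conclusions.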
Yes. New users get a 7-day free trial with full access to all 40+ models including Qwen 3 32B, Llama 3.3 70B, DeepSeek R1, and premium image generation. No credit card required to start. The Free tier (60 requests/day, 16+ models) is available permanently.
When you reach your daily request limit, additional requests are queued until the next day; to raise the limit, you can upgrade your plan. There are no overage charges, so your costs stay fixed and predictable. By contrast, with token-based providers a single runaway prompt can spike your bill.
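If you would rather not have requests sit in the queue, one option is to track the limits on the client side and hold back calls before they are sent. The sketch below is a minimal example, assuming the Free plan figures from the table (5 requests per minute, 60 per day). The `RequestBudget` class is hypothetical and not part of any Oxlo SDK, and it treats the daily limit as a rolling 24-hour window, which may differ from how the service actually resets its counter.

```python
import time
from collections import deque


class RequestBudget:
    """Client-side guard for a per-minute and a per-day request limit."""

    def __init__(self, per_minute, per_day, clock=time.monotonic):
        self.per_minute = per_minute
        self.per_day = per_day
        self.clock = clock      # injectable so the class can be tested
        self.sent = deque()     # timestamps of requests in the last 24 hours

    def try_acquire(self):
        """Return True and record the request if both limits allow it."""
        now = self.clock()
        # Drop timestamps older than 24 hours (ASSUMPTION: rolling window).
        while self.sent and now - self.sent[0] >= 86_400:
            self.sent.popleft()
        if len(self.sent) >= self.per_day:
            return False
        in_last_minute = sum(1 for t in self.sent if now - t < 60)
        if in_last_minute >= self.per_minute:
            return False
        self.sent.append(now)
        return True


budget = RequestBudget(per_minute=5, per_day=60)
if budget.try_acquire():
    pass  # send the API request here
else:
    pass  # wait, or defer the work to a later batch
```

Checking before sending keeps the decision about what to delay or drop in your own code, rather than discovering it when a response arrives late.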
Yes, you can upgrade or downgrade your plan at any time. When upgrading, you get immediate access to the higher plan's limits. All plans are billed monthly with no long-term contracts required.