Cerebras
Cerebras provides a generous free tier, but its tokens-per-minute limit is somewhat lower than that of other providers.
Model | Token limits | Request limits | Max context length (tokens) |
---|---|---|---|
qwen-3-235b-a22b | 64,000/minute; 1,000,000/hour; 1,000,000/day | 30/minute; 900/hour; 14,400/day | 40,000 |
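
Because several windows (minute, hour, day) apply at once, a client has to check all of them before sending a request. The following is a minimal sketch of a client-side tracker using the numbers from the table. The class `SlidingWindowBudget` and its methods are hypothetical names, not part of any Cerebras SDK, and it assumes sliding windows; the provider may count fixed windows instead.

```python
from collections import deque

# Limits copied from the table for qwen-3-235b-a22b: window length in seconds -> cap.
TOKEN_LIMITS = {60: 64_000, 3_600: 1_000_000, 86_400: 1_000_000}
REQUEST_LIMITS = {60: 30, 3_600: 900, 86_400: 14_400}


class SlidingWindowBudget:
    """Hypothetical client-side tracker (assumes sliding windows)."""

    def __init__(self, token_limits=TOKEN_LIMITS, request_limits=REQUEST_LIMITS):
        self.token_limits = token_limits
        self.request_limits = request_limits
        self.events = deque()  # (timestamp_seconds, tokens) per recorded request

    def _usage(self, now, window):
        # Count requests and tokens recorded within the last `window` seconds.
        recent = [tokens for ts, tokens in self.events if now - ts < window]
        return len(recent), sum(recent)

    def allows(self, now, tokens):
        # A request is allowed only if it fits under every window's cap.
        for window, cap in self.request_limits.items():
            if self._usage(now, window)[0] + 1 > cap:
                return False
        for window, cap in self.token_limits.items():
            if self._usage(now, window)[1] + tokens > cap:
                return False
        return True

    def record(self, now, tokens):
        # Drop events older than the longest window, then store the new one.
        longest = max(max(self.token_limits), max(self.request_limits))
        while self.events and now - self.events[0][0] >= longest:
            self.events.popleft()
        self.events.append((now, tokens))
```

In practice you would pass `time.monotonic()` as `now` and the request's estimated token count as `tokens`, calling `allows` before the request and `record` after it.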
ModelScope.AI
ModelScope offers 2,000 free API calls per day to users in mainland China. https://github.com/QwenLM/qwen-code/commit/dc087deace4f30c21793ec6c78a400de7a24b2a2
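- The free inference API runs on compute provided by Alibaba Cloud, so your ModelScope account must be linked to an Alibaba Cloud account before you can use it.
- Each registered ModelScope user is currently allowed a total of 2,000 API-Inference calls per day; the platform may adjust this quota at any time.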
Gemini Code Assist
https://developers.google.com/gemini-code-assist/resources/quotas
Individual usage quota of requests from Gemini Code Assist agent mode and the Gemini CLI:
- 60 requests per user per minute
- 1000 requests per user per day
Gemini API
https://ai.google.dev/gemini-api/docs/rate-limits
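Rate limits are measured along three dimensions: requests per minute (RPM), tokens per minute (TPM), and requests per day (RPD). See the link above for details.
Requests per day / requests per minute: 100 / 5 for Gemini 2.5 Pro and 250 / 10 for Gemini 2.5 Flash.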
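
To see what these numbers mean in practice, here is a small arithmetic sketch. It uses only the RPD/RPM figures quoted above; the `pacing` helper is a hypothetical name introduced for illustration. It computes the minimum spacing between requests, how quickly the daily quota runs out at full speed, and the spacing needed to spread the quota evenly over 24 hours.

```python
# (requests per day, requests per minute), as quoted above.
LIMITS = {
    "Gemini 2.5 Pro": (100, 5),
    "Gemini 2.5 Flash": (250, 10),
}


def pacing(rpd, rpm):
    """Return (min seconds between requests,
               minutes until the daily quota is spent at max RPM,
               seconds between requests to spread the quota over a day)."""
    min_interval_s = 60 / rpm
    minutes_to_exhaust = rpd / rpm
    even_interval_s = 86_400 / rpd
    return min_interval_s, minutes_to_exhaust, even_interval_s


for model, (rpd, rpm) in LIMITS.items():
    gap, burn, even = pacing(rpd, rpm)
    print(f"{model}: >= {gap:.0f}s apart, quota gone in {burn:.0f} min at full "
          f"speed, or one request every {even:.1f}s to last all day")
```

At the per-minute maximum, the daily quota for either model is used up in under half an hour, so the daily limit is the binding constraint for sustained use.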
OpenRouter
https://openrouter.ai/docs/api-reference/limits
Free usage limits: If you’re using a free model variant (with an ID ending in `:free`), you can make up to 20 requests per minute. The following per-day limits apply:
- If you have purchased less than 10 credits, you’re limited to 50 `:free` model requests per day.
- If you purchase at least 10 credits, your daily limit is increased to 1000 `:free` model requests per day.
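
The daily rule above reduces to a single threshold check. A minimal sketch follows; the function and constant names are hypothetical and this is not an OpenRouter API call.

```python
FREE_REQUESTS_PER_MINUTE = 20  # applies to every :free model variant


def free_daily_limit(credits_purchased: float) -> int:
    """Daily :free request cap based on total credits purchased (hypothetical helper)."""
    return 1000 if credits_purchased >= 10 else 50
```

For example, `free_daily_limit(0)` gives 50 and `free_daily_limit(10)` gives 1000.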