Cerebras

Cerebras provides a generous free tier, but its tokens-per-minute limit is somewhat lower than that of other providers.

Model: qwen-3-235b-a22b

  • Tokens: 64,000 per minute; 1,000,000 per hour; 1,000,000 per day
  • Requests: 30 per minute; 900 per hour; 14,400 per day
  • Max context length: 40,000
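
To see how the limits listed above interact, here is a small arithmetic sketch. The constant and function names are illustrative, and how the provider counts tokens (for example, whether input and output both count) is not stated here, so treat the results as rough guidance rather than an exact model of the quota.

```python
# Limits copied from the figures listed above for qwen-3-235b-a22b.
TOKENS_PER_MINUTE = 64_000
TOKENS_PER_HOUR = 1_000_000
REQUESTS_PER_MINUTE = 30


def minutes_until_hourly_cap(tokens_per_minute: int, tokens_per_hour: int) -> float:
    """Minutes of sustained peak usage before the hourly token cap is reached."""
    return tokens_per_hour / tokens_per_minute


def avg_tokens_per_request(tokens_per_minute: int, requests_per_minute: int) -> float:
    """Average request size at which both per-minute limits are reached together."""
    return tokens_per_minute / requests_per_minute


if __name__ == "__main__":
    print(minutes_until_hourly_cap(TOKENS_PER_MINUTE, TOKENS_PER_HOUR))  # 15.625
    print(round(avg_tokens_per_request(TOKENS_PER_MINUTE, REQUESTS_PER_MINUTE)))  # 2133
```

Because the hourly and daily token limits are both 1,000,000, running at the full per-minute rate would use up the whole day's token allowance in under 16 minutes.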

ModelScope.AI

ModelScope offers 2,000 free API calls per day to users in mainland China. https://github.com/QwenLM/qwen-code/commit/dc087deace4f30c21793ec6c78a400de7a24b2a2

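  • The free inference API runs on compute provided by Alibaba Cloud, so your ModelScope account must be linked to an Alibaba Cloud account before you can use it.
  • Each registered ModelScope user is currently allowed a total of 2,000 API-Inference calls per day; the platform may adjust this quota at any time.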

Gemini Code Assist

https://developers.google.com/gemini-code-assist/resources/quotas

Individual usage quota of requests from Gemini Code Assist agent mode and the Gemini CLI:

  • 60 requests per user per minute
  • 1000 requests per user per day
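
A client can stay under both quotas by tracking its own request timestamps. The following is a minimal sketch of a sliding-window limiter; the class name and its methods are hypothetical, and it assumes both quotas behave as rolling windows, which may not match how the service actually resets its daily count.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Tracks request timestamps and enforces several (limit, window) pairs."""

    def __init__(self, limits, clock=time.monotonic):
        # limits: list of (max_requests, window_seconds) pairs
        self.limits = limits
        self.clock = clock
        self.longest = max(window for _, window in limits)
        self.stamps = deque()

    def wait_time(self) -> float:
        """Seconds to wait before one more request fits inside every window."""
        now = self.clock()
        # Forget timestamps that are older than the longest window.
        while self.stamps and now - self.stamps[0] >= self.longest:
            self.stamps.popleft()
        wait = 0.0
        for max_requests, window in self.limits:
            recent = [t for t in self.stamps if now - t < window]
            if len(recent) >= max_requests:
                # This timestamp must leave the window before a new request fits.
                blocking = recent[-max_requests]
                wait = max(wait, blocking + window - now)
        return wait

    def record(self) -> None:
        """Call once for every request actually sent."""
        self.stamps.append(self.clock())


# Quotas from the list above: 60 per minute and 1000 per day.
limiter = SlidingWindowLimiter([(60, 60), (1000, 86_400)])
```

Before each request, sleep for `limiter.wait_time()` seconds and then call `limiter.record()`. The `clock` parameter exists only so the limiter can be exercised with a fake clock.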

Gemini API

https://ai.google.dev/gemini-api/docs/rate-limits

Rate limits are measured in three dimensions: requests per minute (RPM), tokens per minute (TPM), and requests per day (RPD). See the link above for details.

Gemini 2.5 Pro allows 100 requests per day and 5 requests per minute; Gemini 2.5 Flash allows 250 requests per day and 10 requests per minute.
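
As a rough illustration of what these numbers mean in practice, the sketch below derives the minimum spacing between requests and how quickly the daily quota runs out at the full per-minute rate. The dictionary keys are display labels, not API model IDs, and the function names are illustrative.

```python
# (requests per minute, requests per day), as listed above.
GEMINI_LIMITS = {
    "Gemini 2.5 Pro": (5, 100),
    "Gemini 2.5 Flash": (10, 250),
}


def min_spacing_seconds(model: str) -> float:
    """Evenly spaced gap between requests that stays within the RPM limit."""
    rpm, _ = GEMINI_LIMITS[model]
    return 60 / rpm


def minutes_to_exhaust_daily(model: str) -> float:
    """Minutes until the daily quota is used up when sending at the full RPM."""
    rpm, rpd = GEMINI_LIMITS[model]
    return rpd / rpm


if __name__ == "__main__":
    for name in GEMINI_LIMITS:
        print(name, min_spacing_seconds(name), minutes_to_exhaust_daily(name))
```

With these figures, the daily quota is the tighter constraint for any sustained workload: at the full per-minute rate it is gone in 20 minutes for Pro and 25 minutes for Flash.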

OpenRouter

https://openrouter.ai/docs/api-reference/limits

Free usage limits: If you’re using a free model variant (with an ID ending in :free), you can make up to 20 requests per minute. The following per-day limits apply:

  • If you have purchased less than 10 credits, you’re limited to 50 :free model requests per day.
  • If you purchase at least 10 credits, your daily limit is increased to 1000 :free model requests per day.
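
The rules above can be written down as a small lookup, which is handy when budgeting requests in a script. This is only a sketch of the stated rules: the function names are hypothetical, the model ID in the test call is a made-up placeholder, and nothing here queries OpenRouter for your actual credit balance.

```python
FREE_REQUESTS_PER_MINUTE = 20  # per-minute cap for ":free" model variants


def is_free_variant(model_id: str) -> bool:
    """A free model variant is identified by an ID ending in ':free'."""
    return model_id.endswith(":free")


def free_daily_limit(credits_purchased: float) -> int:
    """Daily ':free' request limit as a function of credits purchased."""
    return 1000 if credits_purchased >= 10 else 50


if __name__ == "__main__":
    # "example/model:free" is a placeholder, not a real model ID.
    print(is_free_variant("example/model:free"))  # True
    print(free_daily_limit(0))   # 50
    print(free_daily_limit(10))  # 1000
```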