Cloudflare Workers AI
Cloudflare Workers AI is Cloudflare's edge AI inference platform. $5/mo Workers plan includes 10,000 free AI calls per day, permanently valid. 50+ open-source models: LLM (Llama 3.1/3.3, Gemma, Mistral), image generation (Stable Diffusion XL), embeddings (BGE), speech-to-text (Whisper), and more. Key advantage: if you already use Cloudflare Workers, this is essentially free. Inference runs on 300+ global edge nodes with ultra-low latency. Direct China access. Pay-as-you-go after free quota, no hard cutoff.
🎁 Free Tier
Daily Limit: 10,000 free requests/day
| Model | Context | Limit | Notes |
|---|---|---|---|
| @cf/meta/llama-3.1-8b-instruct | 128k | 10000/day | Meta Llama 3.1 8B, lightweight chat model |
| @cf/meta/llama-3.3-70b-instruct-fp8-fast | 128k | 10000/day | Llama 3.3 70B FP8 accelerated |
| @cf/google/gemma-7b-it-lora | 8k | 10000/day | Google Gemma 7B with LoRA fine-tuning support |
| @cf/stabilityai/stable-diffusion-xl-base-1.0 | | 10000/day | Stable Diffusion XL image generation, completely free |
| @cf/baai/bge-base-en-v1.5 | | 10000/day | BGE embedding model for RAG and semantic search |
| @cf/microsoft/phi-2 | 2k | 10000/day | Microsoft Phi-2 small model |
| @cf/mistral/mistral-7b-instruct-v0.2-lora | 32k | 10000/day | Mistral 7B with LoRA support |
🔑 Free API
Free Credits: 每天 10000 神经元(永久有效)
Rate Limit: 10000 requests/day
Cloudflare Workers $5/mo plan includes Workers AI with 10,000 free neurons/day. 50+ open-source models available including LLM, image generation (SD XL), embeddings, speech-to-text. OpenAI-compatible API via AI Gateway, or direct Workers AI binding. Inference runs on Cloudflare's global edge network with ultra-low latency. Direct China access.