DeepInfra Free API Credits and Pricing Guide
DeepInfra is a developer-focused hosted API platform for open models, including Llama, Qwen, Mistral, embeddings, rerankers, and image models. It is useful when you want an OpenAI-compatible path to open-source models without operating GPUs. Free or trial credits, rate limits, and available models change over time, so verify the DeepInfra dashboard and official docs before production.
🎁 Free Tier
Daily Limit: Free serverless model testing after account signup; exact quota varies by model and promotion
| Model | Context | Limit | Notes |
|---|---|---|---|
| meta-llama/Meta-Llama-3.1-8B-Instruct | 128k | Account/model dependent | Popular open chat model for low-cost API tests |
| Qwen/Qwen2.5-Coder-32B-Instruct | 32k | Account/model dependent | Useful for coding and refactoring workloads |
| BAAI/bge-large-en-v1.5 | 512 tokens | Account/model dependent | Embedding model for RAG prototypes |
🔑 Free API
Free Credits: Free trial credits / promotional balance varies by account
Rate Limit: Model and account dependent; verify dashboard before production
Offers serverless APIs, OpenAI-compatible chat endpoints, and many open-source models; free credits and pricing vary by account and model.