Modal GPU Pricing: $30 Credits, A100/H100 & Setup

Serverless GPU cloud — deploy Python code directly, per-second billing

✅ Free Tier

What is Modal

Modal is a serverless GPU cloud platform, often called "Vercel for GPUs." The core idea: write Python code, add a decorator, and it runs on cloud GPUs — no Docker, Kubernetes, or infra management needed.

Modal offers A100, H100, and other high-end GPUs with per-second billing and no charge when idle. The free tier includes $30 in credits each month, enough for plenty of experiments. Cold starts are among the fastest available, typically 1-2 seconds.

Free Tier & Pricing

Free credits: $30/month (roughly 10.8 hours on an A100 40GB or 7.6 hours on an H100 at the rates listed here)

Popular GPU pricing:
- T4: $0.59/hr
- A10G: $1.10/hr
- A100 40GB: $2.78/hr
- A100 80GB: $3.72/hr
- H100: $3.95/hr

Billing is per second, and GPUs are released automatically when idle. Hourly rates are higher than RunPod's, but the developer experience is much better.
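To make the per-second billing concrete, here is a minimal sketch of the arithmetic, using only the hourly rates listed above. The function names (`job_cost`, `credit_hours`) and the dictionary keys are illustrative and not part of any Modal API, and the sketch assumes cost scales linearly with GPU seconds, with no CPU, memory, or storage charges included.

```python
# Hourly GPU rates in USD, copied from the price list in this article.
HOURLY_RATE_USD = {
    "T4": 0.59,
    "A10G": 1.10,
    "A100-40GB": 2.78,
    "A100-80GB": 3.72,
    "H100": 3.95,
}


def job_cost(gpu: str, seconds: float) -> float:
    """Cost in USD of holding `gpu` for `seconds`, assuming per-second billing."""
    return HOURLY_RATE_USD[gpu] / 3600 * seconds


def credit_hours(gpu: str, credits_usd: float = 30.0) -> float:
    """GPU hours that `credits_usd` covers at the listed hourly rate."""
    return credits_usd / HOURLY_RATE_USD[gpu]


if __name__ == "__main__":
    for gpu in HOURLY_RATE_USD:
        hours = credit_hours(gpu)
        cost_90s = job_cost(gpu, 90)
        print(f"{gpu}: {hours:.1f} h per $30, a 90 s job costs ${cost_90s:.4f}")
```

Under these assumptions a short job costs a fraction of a cent to a few cents, which is why per-second billing suits bursty workloads.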

Editor's note

If you only need API inference, you may not need to rent a GPU at all. Compare free quotas, rate limits, and latency across API providers first.

China Access Guide

Modal requires proxy access from China. Both registration and day-to-day use need a stable international network connection.

For GPU workloads based in China, consider AutoDL or RunPod. If you only need model APIs, use an API aggregator that is directly accessible from China.

FAQ

Q: Modal vs RunPod?
A: Modal has better DX (Python-native), ideal for rapid prototyping and serverless. RunPod is cheaper for long-running tasks.

Q: Are the $30 free credits enough?
A: Yes, for plenty of experiments. Continuous production workloads need a paid plan.
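As a rough check on that answer, the sketch below compares the monthly cost of two usage patterns on an H100 against the $30 credit. It uses the $3.95/hr rate from the pricing section; the 30-day month, the helper name `monthly_cost`, and the two usage patterns are assumptions chosen for illustration.

```python
H100_RATE_USD_PER_HOUR = 3.95  # from the pricing list in this article
FREE_CREDITS_USD = 30.0        # monthly free credits


def monthly_cost(rate_per_hour: float, hours_per_day: float, days: int = 30) -> float:
    """Monthly GPU spend in USD for a fixed number of GPU hours per day."""
    return rate_per_hour * hours_per_day * days


if __name__ == "__main__":
    # Hypothetical usage patterns: short daily experiments vs. an always-on service.
    patterns = {
        "15 min/day of experiments": 0.25,
        "24/7 production service": 24.0,
    }
    for label, hours_per_day in patterns.items():
        cost = monthly_cost(H100_RATE_USD_PER_HOUR, hours_per_day)
        verdict = "within" if cost <= FREE_CREDITS_USD else "beyond"
        print(f"{label}: ${cost:,.2f}/month ({verdict} the free credits)")
```

With these numbers, a quarter hour of H100 time per day stays just under the credit, while an always-on H100 costs thousands of dollars a month.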

Q: What frameworks are supported?
A: PyTorch, TensorFlow, vLLM, Hugging Face, and any Python code.
