vLLM cURL API Setup
Use cURL to smoke-test vLLM before wiring SDK code. Confirm the exact endpoint, model name, and quota in the provider dashboard.
Quick verdict
- Free API: Self-hosted OpenAI-compatible API; no vendor credits required.
- Rate limits: Hardware-bound; depends on GPU memory, model size, and concurrency.
- Best starting point: any open-weight model served through vLLM's OpenAI-compatible server
- China access: direct or relatively friendly
cURL smoke test
Use this to verify endpoint, auth header, model name, response shape, and quota before adding SDK abstractions.
curl http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer $VLLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<served-model-id>",
    "messages": [{"role": "user", "content": "Hello from yangmao.ai"}]
  }'
Replace <served-model-id> with the model name you launched vLLM with; it must match an id returned by GET /v1/models.
Free API and pricing notes
Self-hosted OpenAI-compatible API; no vendor credits required.
vLLM can turn open models into an OpenAI-compatible API for private deployments, lower-cost inference, and high throughput.
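As a sketch, a single-GPU deployment can be started with the vllm serve CLI. The model id, key, and tuning value below are illustrative assumptions; check the vLLM docs for the flags your version supports:

```shell
# Illustrative launch; model id, key, and memory fraction are assumptions.
vllm serve meta-llama/Llama-3.1-8B-Instruct \
  --port 8000 \
  --api-key "$VLLM_API_KEY" \
  --gpu-memory-utilization 0.90   # leave headroom for other GPU processes
```

Once the server reports it is listening, the cURL smoke test above should return a chat completion from the served model.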
Access and production risk
China-friendly / direct path likely
Self-hosted deployment; China access depends on your cluster, mirrors, and model download path.
How to set it up
Launch the vLLM server with an API key of your choosing and note its endpoint (host and port); being self-hosted, there is no vendor dashboard to copy from.
Export the key into your shell session.
Send a minimal chat completion payload with cURL.
Check status code, JSON body, and rate-limit headers.
Move the tested endpoint into your app or fallback relay.
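The steps above can be sketched as one script; the key, endpoint, and model id are placeholders for your own deployment, and the request is non-fatal so the sketch does not abort when no server is running yet:

```shell
#!/bin/sh
# Steps 1-2: key and endpoint for your own deployment (placeholder values).
export VLLM_API_KEY="sk-local-dev"
BASE_URL="http://localhost:8000/v1"
# Step 3: minimal chat payload; the model id must match what vLLM is serving.
PAYLOAD='{"model":"meta-llama/Llama-3.1-8B-Instruct","messages":[{"role":"user","content":"ping"}]}'
# Step 4: -i prints the status line and response headers alongside the JSON body.
curl -sS -i "$BASE_URL/chat/completions" \
  -H "Authorization: Bearer $VLLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD" || true   # non-fatal when the server is not yet running
```

Once the response looks right, move the same base URL and payload shape into your application (step 5).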
Fallback CTA with tracked UTM
If you do not want to juggle provider keys, rate limits, and regional access, use openllmapi.com as a unified API fallback.
Try openllmapi with one key → UTM: utm_source=yangmao.ai · utm_medium=seo · utm_campaign=provider · utm_content=vllm-setup-curl
Source snapshot
Data source: yangmao.ai provider YAML tracker plus provider docs reviewed by the daily crawler. Official dashboards can change quota and pricing without notice; verify before production.
- yangmao.ai provider id
- vllm
- Official source
- https://docs.vllm.ai/
- Last updated
- 2026-05-16
- Free tier
- Apache-2.0 open-source.
- API credits
- Self-hosted OpenAI-compatible API; no vendor credits required.
- Rate limit
- Hardware-bound; depends on GPU memory, model size, and concurrency.
- Access note
- Self-hosted deployment; China access depends on your cluster, mirrors, and model download path.
FAQ
Does vLLM have a free API?
Yes, in the sense that there is no vendor to bill you: per the current yangmao.ai record, vLLM is a self-hosted OpenAI-compatible API, so no vendor credits are required. Rate limits are hardware-bound, depending on GPU memory, model size, and concurrency.
Is vLLM OpenAI-compatible?
Yes. The vLLM server exposes OpenAI-compatible endpoints such as /v1/chat/completions and /v1/models, so OpenAI-style SDK calls work once the base URL points at your deployment. Validate the latest base URL and model names in the vLLM docs.
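One quick way to validate the model name is to list what the server is actually serving. The base URL below assumes a local deployment on vLLM's default port; adjust it to yours:

```shell
# Assumed local deployment on vLLM's default port; adjust BASE_URL to yours.
BASE_URL="http://localhost:8000/v1"
# The "model" field in chat requests must match an id returned here.
curl -s "$BASE_URL/models" \
  -H "Authorization: Bearer $VLLM_API_KEY" || true  # non-fatal when no server is up
```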
Can I use vLLM from China?
Yes. vLLM is marked as direct or relatively China-friendly in the current tracker; since deployment is self-hosted, access ultimately depends on your cluster, mirrors, and model download path.
What should I do when vLLM credits run out?
Compare the alternatives below, check /en/free-ai-api/, or use the openllmapi CTA on this page as a one-key fallback with tracked UTM: campaign=provider, content=vllm-setup-curl.