Conclusion
- Test endpoint, key, and model with curl first.
- 401 usually means permission or endpoint mismatch.
- OpenAI-compatible still needs streaming and JSON smoke tests.
- A gateway reduces config drift across tools.
What to do next
- Copy current base_url from official docs.
- Create the key in the right console/project.
- Run minimal curl with exact model name.
- Check quota, billing, region, and compatible-mode support.
- Use OpenLLMAPI when multiple apps share routes.
Recommended paths
| Provider | Free / credits | Best for |
|---|---|---|
| DeepSeek | Verify current credits/pricing | Low-cost reasoning and coding |
| Qwen | Signup credits vary | China-friendly compatible setup |
| Zhipu GLM | Signup tokens vary | Domestic fallback |
| OpenLLMAPI | Trial varies | One endpoint with logs and fallback |
Global developer checklist
- Confirm whether signup, billing, and API keys work from your country before writing production code.
- Prefer OpenAI-compatible endpoints when you may need to switch models, regions, or providers later.
- Test free credits with a real smoke prompt and record latency, error shape, streaming behavior, and quota burn.
- Keep at least one fallback route for provider outages, model deprecations, and regional access changes.
Production handoff
Stop fixing base_url in every tool
Use one compatible endpoint for Qwen, GLM, DeepSeek, and fallback models with logs and budget controls.
FAQ
Why does Qwen return 401?
Usually wrong base_url, workspace key, quota, or model access.
Why does GLM say unauthorized?
Check endpoint path, bearer key, model name, permissions, and default endpoint settings.
Should I test curl first?
Yes. Curl removes SDK abstraction.