Conclusion
- First check: key, endpoint, model name, and account permission must come from the same GLM/Zhipu console environment.
- Do not debug inside an agent first; run a minimal cURL request so you can see the real HTTP status and response body.
- If the tool hides headers or rewrites base_url, test with Python OpenAI SDK separately.
- If GLM access keeps failing, switch the same OpenAI-compatible app to Qwen, DeepSeek, SiliconFlow, or a gateway fallback.
What to do next
- Regenerate or re-copy the GLM/Zhipu API key and remove extra spaces, quotes, or old environment-variable values.
- Copy the current base_url from official docs or the console; do not mix old v3/v4 snippets with current model names.
- Verify the model name is enabled for your account and matches the endpoint family and client format.
- Run cURL before Cursor, Claude Code-style tools, SillyTavern, or custom agents so auth errors are visible.
- After the raw request works, move the same base_url, model, and key into your tool and set a fallback provider.
Recommended paths
| Provider | Free / credits | Best for |
|---|---|---|
| Zhipu GLM | 5M signup tokens / GLM Flash route | Native GLM setup and China-friendly testing |
| Qwen | 70M signup tokens | DashScope compatible-mode fallback |
| DeepSeek | $5 signup / current credit | Low-cost chat and coding fallback |
| SiliconFlow | ¥14 + free model routes | China-direct OpenAI-compatible testing |
| OpenLLMAPI | Signup credit varies | One endpoint when GLM auth or routing is unstable |
Global developer checklist
- Confirm whether signup, billing, and API keys work from your country before writing production code.
- Prefer OpenAI-compatible endpoints when you may need to switch models, regions, or providers later.
- Test free credits with a real smoke prompt and record latency, error shape, streaming behavior, and quota burn.
- Keep at least one fallback route for provider outages, model deprecations, and regional access changes.
Production handoff
Need a fallback while GLM auth is failing?
Use a single OpenAI-compatible endpoint and route GLM-like workloads to Qwen, DeepSeek, Gemini, Claude, or GPT until the direct provider key is fixed.
FAQ
Why does GLM return Unauthorized even with the right key?
The key may be from a different account/project, the model may not be enabled, the base URL may be stale, or the client may be sending the key in the wrong header. Test with cURL first.
Should I use the OpenAI SDK for GLM?
Use it only if the current GLM endpoint supports the OpenAI-compatible path you are configuring. Set base_url, api_key, and model explicitly.
Can a proxy or relay cause GLM Unauthorized?
Yes. Some relays strip Authorization headers, rewrite paths, or point to a different model route. Test direct provider access and relay access separately.
What is the fastest fallback if GLM auth fails?
For compatible chat/coding workloads, try Qwen, DeepSeek, SiliconFlow, or a single-key relay while you continue debugging GLM access.