Nvidia’s NIM now runs GLM‑5, and a community proxy lets Claude Code use it for free today.
Photo by Javier Esteban (unsplash.com/@javiestebaan) on Unsplash
Nvidia’s NIM platform now lists the GLM‑5 model, and an updated community proxy lets Anthropic’s Claude Code CLI run on it through NIM’s free tier, according to a recent report. The update also ships tool‑calling fixes, giving developers the Claude Code experience at no cost.
Key Facts
- Key company: Nvidia
Nvidia’s NIM inventory now lists the GLM‑5 model with a fresh tool‑calling patch, and the community‑run free‑claude‑code proxy has been updated to route Anthropic’s Claude Code CLI through it — all at zero cost, the maintainer announced on X 【report】. The change means developers can tap the same agentic coding experience that powers Claude Code without an Anthropic subscription, simply by pointing the CLI at Nvidia’s free‑tier NIM endpoint, which offers 40 requests per minute without a credit‑card requirement 【report】.
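The mechanics are straightforward in principle: Claude Code speaks Anthropic’s Messages API, so a local proxy can accept those requests and forward them to NIM’s OpenAI‑compatible chat‑completions endpoint. The sketch below is a minimal illustration of that translation layer, not the free‑claude‑code implementation; the model identifier and key handling are placeholders assumed for the example.

```python
# Minimal sketch of an Anthropic-to-NIM translation proxy (illustrative only).
# Assumption: NIM exposes an OpenAI-compatible /v1/chat/completions endpoint;
# the model ID below is a placeholder, not a confirmed NIM identifier.
import os
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)
NIM_URL = "https://integrate.api.nvidia.com/v1/chat/completions"
NIM_MODEL = "zai/glm-5"  # placeholder -- check NIM's catalog for the real ID

@app.post("/v1/messages")
def messages():
    body = request.get_json()
    # Translate Anthropic Messages-API input into OpenAI-style messages.
    msgs = []
    if body.get("system"):
        msgs.append({"role": "system", "content": body["system"]})
    for m in body.get("messages", []):
        content = m["content"]
        if isinstance(content, list):  # Anthropic allows a list of content blocks
            content = "".join(b.get("text", "") for b in content if b.get("type") == "text")
        msgs.append({"role": m["role"], "content": content})

    resp = requests.post(
        NIM_URL,
        headers={"Authorization": f"Bearer {os.environ['NIM_API_KEY']}"},
        json={"model": NIM_MODEL, "messages": msgs,
              "max_tokens": body.get("max_tokens", 1024)},
        timeout=120,
    )
    resp.raise_for_status()
    text = resp.json()["choices"][0]["message"]["content"]

    # Wrap the completion back into an Anthropic-style response envelope.
    return jsonify({
        "type": "message",
        "role": "assistant",
        "content": [{"type": "text", "text": text}],
        "stop_reason": "end_turn",
    })
```

Claude Code is then pointed at the proxy instead of api.anthropic.com, typically through a base‑URL override such as an ANTHROPIC_BASE_URL environment variable.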
Beyond the GLM‑5 fix, the proxy now supports a broader ecosystem of back‑ends. OpenRouter integration lets users select any model on that marketplace, while LMStudio adds a fully local option for those who prefer on‑device inference 【report】. The author also rolled out Discord bot support alongside the existing Telegram bot, enabling remote task submission from a server channel—a handy feature for developers who spend more time in chat than at a terminal 【report】. Additional polish includes a VS Code extension that embeds Claude Code’s interleaved‑thinking UI directly into the editor, and a configurable sliding‑window rate limiter that smooths concurrent sessions 【report】.
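Of those pieces, the rate limiter is the most self‑contained. A sliding‑window limiter records request timestamps and only admits a new call once the oldest ones fall out of the window; a minimal sketch sized for NIM’s 40‑requests‑per‑minute free tier might look like this (class and parameter names are illustrative, not the project’s own):

```python
# Illustrative sliding-window rate limiter (not the free-claude-code implementation).
import threading
import time
from collections import deque

class SlidingWindowLimiter:
    """Admit at most `max_calls` requests per `window` seconds, across threads."""

    def __init__(self, max_calls: int = 40, window: float = 60.0):
        self.max_calls = max_calls
        self.window = window
        self._timestamps: deque[float] = deque()
        self._lock = threading.Lock()

    def acquire(self) -> None:
        """Block until a request slot is available, then record the call."""
        while True:
            with self._lock:
                now = time.monotonic()
                # Drop timestamps that have slid out of the window.
                while self._timestamps and now - self._timestamps[0] >= self.window:
                    self._timestamps.popleft()
                if len(self._timestamps) < self.max_calls:
                    self._timestamps.append(now)
                    return
                # Wait until the oldest call expires, then re-check.
                wait = self.window - (now - self._timestamps[0])
            time.sleep(wait)

# Usage: share one limiter across concurrent sessions and call
# limiter.acquire() before each upstream NIM request.
limiter = SlidingWindowLimiter(max_calls=40, window=60.0)
```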
The appeal of this stack lies in its cost‑free operation and its preservation of “interleaved thinking” tokens across turns, a capability that many hosted Claude Code front‑ends lack. According to the maintainer, models such as GLM‑5 and Kimi‑K2.5 can carry over reasoning from previous calls, improving the fidelity of multi‑step coding tasks 【report】. Five built‑in optimizations further trim unnecessary LLM calls, among them fast prefix detection, title‑generation skips, and suggestion‑mode bypasses, making the workflow faster and cheaper than the official Anthropic offering 【report】. Because the proxy’s architecture is modular, new models that land on Nvidia’s NIM can be plugged in without code changes, and community contributors can add custom providers or messaging platforms via pull requests 【report】.
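That modularity claim is easiest to picture as a provider registry: each back‑end (NIM, OpenRouter, LMStudio) implements one small interface, and a new model is added by registering a name rather than editing call sites. The sketch below shows the generic pattern under that assumption; it is not the project’s actual code, and the provider names are invented for illustration.

```python
# Illustrative provider-registry pattern (assumed architecture, not the project's code).
from typing import Callable, Protocol

class ChatProvider(Protocol):
    def complete(self, messages: list[dict], **options) -> str: ...

_REGISTRY: dict[str, Callable[[], ChatProvider]] = {}

def register(name: str):
    """Decorator: make a provider selectable by name without touching call sites."""
    def wrap(factory: Callable[[], ChatProvider]):
        _REGISTRY[name] = factory
        return factory
    return wrap

def get_provider(name: str) -> ChatProvider:
    return _REGISTRY[name]()

@register("nim-glm5")
class NimGLM5Provider:
    def complete(self, messages: list[dict], **options) -> str:
        # Forward to NIM's chat-completions endpoint (see the earlier proxy sketch).
        raise NotImplementedError

@register("lmstudio-local")
class LMStudioProvider:
    def complete(self, messages: list[dict], **options) -> str:
        # Talk to a locally running LMStudio server instead of a cloud endpoint.
        raise NotImplementedError

# A config value such as provider = "nim-glm5" is all that changes to swap models.
```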
The move arrives as Nvidia doubles down on AI‑centric partnerships, most recently announcing a $1 billion investment in Nokia to push AI to the edge and laying groundwork for AI‑native 6G wireless networks — coverage of those deals appears in VentureBeat, The Register and Wccftech 【Additional Coverage】. While those initiatives target large‑scale infrastructure, the free‑claude‑code project illustrates how Nvidia’s cloud‑AI services are being repurposed for developer‑level tooling. By surfacing GLM‑5 on NIM and bundling it with open‑source glue, Nvidia effectively lowers the barrier to entry for sophisticated code‑generation assistants, potentially expanding the user base that will later migrate to paid tiers or Nvidia‑hosted AI products.
Sources
No primary source found (coverage-based)
- Reddit - r/ClaudeAI
This article was created using AI technology and reviewed by the SectorHQ editorial team for accuracy and quality.