Is GLM-5.2 free to use?

GLM-5.2 is an open-weight model released under the MIT license, so you can download the weights for free from Hugging Face or ModelScope and use them commercially. That said, at 753B total parameters it is far too large to run casually on a personal computer. The realistic way to try it is through Z.ai's chat, its coding plan, the API, or a coding agent such as Claude Code.

Is GLM-5.2 better than GPT-5.5?

On the SWE-bench Pro and FrontierSWE coding benchmarks, GLM-5.2 beats GPT-5.5. However, GPT-5.5 is ahead on Terminal-Bench 2.1, and the top SWE-bench Pro score belongs to Claude Opus 4.8. GLM-5.2 does not beat GPT-5.5 or Opus 4.8 across every benchmark; it is best understood as an open model that has drawn level with the top closed models on long-horizon coding.

Can I use GLM-5.2 in Claude Code?

Yes. Coding agents such as Claude Code, ZCode, and OpenCode support it, and you switch to it by setting the model name to "GLM-5.2". To use the 1-million-token context, specify "GLM-5.2[1m]", and you can set the thinking effort level to High or Max.

What Is GLM-5.2? The Open-Weight Model Beating GPT-5.5 on Coding

What GLM-5.2 Is (Z.ai's Open-Weight Model)

GLM-5.2 is a large language model built for long-horizon tasks, released on June 17, 2026 by Z.ai (formerly Zhipu AI, based in Beijing). Its defining trait is that the model is distributed as "open weights" under the permissive MIT license. Anyone can download the model, build it into a commercial service, or adapt it to their own needs. That is the decisive difference from the closed models behind ChatGPT or Claude, whose weights are not public.

GLM-5.2 key specs (official)

Developer: Z.ai (formerly Zhipu AI, Beijing)

Released: June 17, 2026

Total params: 753B (MoE / Mixture-of-Experts)

Context: Up to 1 million (1M) tokens

License: MIT (commercial use and modification allowed)

Distribution: Open weights (via Hugging Face / ModelScope)

A 753B MoE and How IndexShare Works

GLM-5.2 has 753B total parameters, but it uses a Mixture-of-Experts (MoE) design. In an MoE model only a subset of the network (the experts selected for that input) is active for any given token, so not every parameter is computed each time. Reports put the active size at roughly 40B parameters per token, which keeps such a large model manageable to run. On top of that, GLM-5.2's new "IndexShare" mechanism reduces the per-token compute (FLOPs) by 2.9× at a 1M-token context, according to the model card.
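As a rough check on those figures, the share of the network that is active per token can be worked out directly. This is a minimal sketch that uses only the two numbers quoted above; the roughly 40B active figure comes from reports rather than the model card, so the result is approximate.

```python
# Back-of-the-envelope check of the MoE figures quoted in the article.
TOTAL_PARAMS_B = 753   # total parameters in billions (model card)
ACTIVE_PARAMS_B = 40   # active parameters per token in billions (reported, approximate)


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the fraction of parameters used to process a single token."""
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active share per token: {fraction:.1%}")
```

In other words, on these numbers only about one parameter in nineteen takes part in any single token's computation.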

Z.ai official model card (GLM-5.2):

"We propose IndexShare, which reuses the same indexer across every four sparse attention layers, reducing per-token FLOPs by 2.9× at a 1M context length." (From the model card's description of the IndexShare architecture; total parameters are listed separately as "753B params".)
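To make "reuses the same indexer across every four sparse attention layers" concrete, here is a toy sketch that only counts how often a token-selection step runs with and without sharing. Everything in it is hypothetical: the layer count, the `dummy_indexer` stand-in, and the `run_layers` helper are illustrative names, not Z.ai's implementation, and the sketch does not model attention itself.

```python
# Toy illustration only: counts indexer calls, it does not model real attention.
from typing import Callable, List

SHARE_GROUP = 4   # layers per shared indexer, per the model card quote
NUM_LAYERS = 8    # hypothetical layer count, chosen for illustration


def dummy_indexer(layer: int) -> List[int]:
    """Hypothetical stand-in for the step that picks which tokens to attend to."""
    return [0, 1, 2]


def run_layers(num_layers: int, indexer: Callable[[int], List[int]], share: bool) -> int:
    """Walk a stack of sparse-attention layers and return how many times the indexer ran."""
    calls = 0
    selected: List[int] = []
    for layer in range(num_layers):
        first_in_group = layer % SHARE_GROUP == 0
        if not share or first_in_group:
            selected = indexer(layer)  # recompute the token selection
            calls += 1
        # A real layer would now attend only to `selected`; omitted here.
    return calls


print(run_layers(NUM_LAYERS, dummy_indexer, share=False))  # 8 indexer calls
print(run_layers(NUM_LAYERS, dummy_indexer, share=True))   # 2 indexer calls
```

The sketch shows a 4× drop in indexer calls only. The 2.9× figure in the model card refers to total per-token FLOPs, which include work the indexer does not account for, so the two numbers should not be expected to match.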

The 1M-Token Context and the MIT License

GLM-5.2 can handle up to 1 million tokens of text (its context) at once. That is enough to load an entire large codebase or a stack of documents and keep working while preserving context. The core of GLM-5.2 is that it is designed to run long stretches of coding and agent-style autonomous execution stably. Training uses Z.ai's own RL framework, "slime," which supports everything from training through large-scale inference.
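Whether "an entire large codebase" fits depends on its size, which can be estimated before sending anything to the model. The sketch below assumes a rough heuristic of four characters per token; that ratio is an assumption, not GLM-5.2's actual tokenizer, and the function names and file suffixes are illustrative.

```python
from pathlib import Path

CONTEXT_LIMIT = 1_000_000  # tokens, from the GLM-5.2 spec
CHARS_PER_TOKEN = 4        # rough heuristic, NOT GLM-5.2's real tokenizer


def estimate_tokens(root: str, suffixes=(".py", ".md")) -> int:
    """Estimate the token count of all matching text files under `root`."""
    total_chars = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            total_chars += len(path.read_text(encoding="utf-8", errors="ignore"))
    return total_chars // CHARS_PER_TOKEN


def fits_in_context(token_count: int, limit: int = CONTEXT_LIMIT) -> bool:
    """Return True if the estimated token count is within the context limit."""
    return token_count <= limit


# Example with a made-up estimate of 250,000 tokens.
print(fits_in_context(250_000))
```

To check a real project, call `estimate_tokens("path/to/repo")` and pass the result to `fits_in_context`. Leave headroom for the prompt and the model's own output, since both count against the same limit.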

Z.ai official blog (GLM-5.2: Built for Long-Horizon Tasks):

"slime serves as an integrated infrastructure layer from training to large-scale inference rollout." (From the description of the "slime" training framework.)

GLM-5.2 Performance vs GPT-5.5

The main reason GLM-5.2 drew attention is that its coding performance drew level with the top closed models. But lumping it together as "beats GPT-5.5" is not accurate. Look at the official benchmark table and the wins and losses are clearly split.

Key coding benchmarks compared (figures from the official model card)

Scores normalized to 100. Terminal-Bench uses the Terminus-2 harness. Source: Z.ai official model card.

SWE-bench Pro (GLM-5.2 beats GPT-5.5; Opus 4.8 leads): GLM-5.2 62.1, GPT-5.5 58.6, Opus 4.8 69.2

Terminal-Bench 2.1 (GPT-5.5 ahead of GLM-5.2; Opus 4.8 leads): GLM-5.2 81.0, GPT-5.5 84.0, Opus 4.8 85.0

FrontierSWE (GLM-5.2 beats GPT-5.5; Opus 4.8 leads): GLM-5.2 74.4, GPT-5.5 72.6, Opus 4.8 75.1

Where GLM-5.2 Beats GPT-5.5

GLM-5.2 beats GPT-5.5 on SWE-bench Pro (62.1 vs 58.6) and FrontierSWE (74.4 vs 72.6), both of which test long-horizon coding. It is rare, and close to unprecedented, for an open-weight model to outscore one of the leading closed models on real coding benchmarks. VentureBeat likewise reported that GLM-5.2 beat GPT-5.5 on multiple long-horizon coding benchmarks.

Z.ai official model card (benchmark table):

"SWE-bench Pro: GLM-5.2 62.1 vs GPT-5.5 58.6. FrontierSWE: GLM-5.2 74.4 vs GPT-5.5 72.6." (From the official benchmark table, Coding category.)

Opus 4.8 Leads on Terminal-Bench and SWE-bench Pro

That said, GLM-5.2 does not win everything. On Terminal-Bench 2.1, GPT-5.5 (84.0) edges out GLM-5.2 (81.0), and Claude Opus 4.8 posts the top score on both Terminal-Bench 2.1 (85.0) and SWE-bench Pro (69.2). Terminal-Bench rankings also shift with the harness (the execution environment), and some runs report GLM-5.2 ahead of Opus 4.8. The accurate read on GLM-5.2 is not "it dethroned the champion" but "an open model that pulled level with the very top on several key coding metrics." Keeping that distinction in mind helps keep expectations realistic.
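The split between wins and losses is easier to see when the margins are computed directly. The sketch below uses only the scores from the official benchmark table quoted in this article; the `leader` and `margin` helpers are illustrative names.

```python
# Scores copied from the official benchmark table quoted in the article.
SCORES = {
    "SWE-bench Pro": {"GLM-5.2": 62.1, "GPT-5.5": 58.6, "Opus 4.8": 69.2},
    "Terminal-Bench 2.1": {"GLM-5.2": 81.0, "GPT-5.5": 84.0, "Opus 4.8": 85.0},
    "FrontierSWE": {"GLM-5.2": 74.4, "GPT-5.5": 72.6, "Opus 4.8": 75.1},
}


def leader(results: dict) -> str:
    """Return the model with the highest score on one benchmark."""
    return max(results, key=results.get)


def margin(results: dict, a: str, b: str) -> float:
    """Return score(a) - score(b), rounded to one decimal place."""
    return round(results[a] - results[b], 1)


for name, results in SCORES.items():
    gap = margin(results, "GLM-5.2", "GPT-5.5")
    print(f"{name}: leader = {leader(results)}, GLM-5.2 vs GPT-5.5 = {gap:+.1f}")
```

On these figures GLM-5.2 is 3.5 points ahead of GPT-5.5 on SWE-bench Pro, 1.8 ahead on FrontierSWE, and 3.0 behind on Terminal-Bench 2.1, while Opus 4.8 holds the top score on all three.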

It Beats Claude Fable on Design Taste

It is also well regarded beyond coding. On Design Arena, which pits models against each other on design quality, GLM-5.2 was reported to beat Claude Fable (Fable 5). Scoring well on something a user feels immediately — the look of what comes back on the first prompt — is another reason GLM-5.2 became a talking point. AI researcher Nathan Lambert called it the first open model that feels right as a general agent inside coding harnesses.

Interconnects, "GLM-5.2 is the step change for open":

"GLM-5.2 is the open weight model that feels right in coding harnesses as a general agent. It's the first one." (From the analysis of GLM-5.2's significance.)

GLM-5.2 Pricing and How to Use It in Claude Code

Alongside performance, the low cost drew attention. VentureBeat reported that GLM-5.2 delivers comparable coding performance at about one-sixth the cost of GPT-5.5. Beyond being open-weight, the fact that it is also cheaper to use via API or a flat-rate plan is what motivates teams to consider switching.

Rough cost compared with GPT-5.5 (an estimate based on reporting)

Relative cost with GPT-5.5 set to 100. The GLM-5.2 figure is an estimate based on the "about one-sixth" reporting; actual cost varies by usage. Source: VentureBeat and other reporting.

GLM-5.2: about 1/6 of the baseline (roughly 17)

GPT-5.5: 100 (baseline)

About One-Sixth the Cost of GPT-5.5

Because GLM-5.2's weights are distributed for free, running it on your own servers incurs no usage-based model fee in principle; you pay only for the infrastructure. That is what creates the large cost gap with closed models such as the ChatGPT family. On Z.ai's coding-focused flat-rate plan, GLM-5.2 usage is counted at 3× during peak hours and 2× during off-peak hours. The official blog also announced a limited-time promotion that counts off-peak usage at 1× through the end of September 2026.

Z.ai official blog (GLM-5.2: Built for Long-Horizon Tasks):

"Consumes 3× during peak hours and 2× during off-peak hours." (From the description of the coding plan's consumption rate.)
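The effect of those multipliers on a plan's quota can be sketched as simple arithmetic. The multipliers below come from the blog quote and the promotion described above; the notion of a generic "usage unit" and the `quota_consumed` function are assumptions for illustration, since the article does not say what unit the plan meters.

```python
PEAK_MULTIPLIER = 3.0            # from the official blog quote
OFF_PEAK_MULTIPLIER = 2.0        # from the official blog quote
PROMO_OFF_PEAK_MULTIPLIER = 1.0  # limited-time promotion described in the article


def quota_consumed(peak_units: float, off_peak_units: float, promo_active: bool = False) -> float:
    """Return plan quota consumed for usage split between peak and off-peak hours.

    `peak_units` and `off_peak_units` are hypothetical raw usage units.
    """
    off_peak = PROMO_OFF_PEAK_MULTIPLIER if promo_active else OFF_PEAK_MULTIPLIER
    return peak_units * PEAK_MULTIPLIER + off_peak_units * off_peak


# 10 units at peak plus 20 units off-peak.
print(quota_consumed(10, 20))                     # 70.0
print(quota_consumed(10, 20, promo_active=True))  # 50.0
```

Under these assumptions, shifting work to off-peak hours during the promotion is the cheapest way to spend a fixed quota.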

Using It in Claude Code, ZCode, and OpenCode

GLM-5.2 is available from the major coding agents. Claude Code, Z.ai's own ZCode, and OpenCode all support it, so you can swap in just the model while staying in the tool you already know.

Steps to use GLM-5.2 in Claude Code

Step 1

Step 2: In Claude Code, switch the model name to "GLM-5.2"

Step 3: To use the 1M-token context, specify "GLM-5.2[1m]"

Step 4: Set the thinking effort level to High or Max
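One way to apply the model-name steps is to set environment variables before launching Claude Code. The sketch below assumes that Claude Code reads `ANTHROPIC_BASE_URL`, `ANTHROPIC_AUTH_TOKEN`, and `ANTHROPIC_MODEL`, and that Z.ai exposes a compatible endpoint; the URL shown is a placeholder, not Z.ai's real address, and `glm_env` is a hypothetical helper. The thinking effort level is left out because the article does not say how it is configured.

```python
import os


def glm_env(api_key: str, base_url: str, long_context: bool = False) -> dict:
    """Build an environment for launching Claude Code against GLM-5.2.

    The variable names are assumptions about Claude Code's configuration;
    check the current Claude Code and Z.ai documentation before relying on them.
    """
    # Model names as given in the article: "GLM-5.2" or "GLM-5.2[1m]" for 1M context.
    model = "GLM-5.2[1m]" if long_context else "GLM-5.2"
    env = dict(os.environ)
    env.update({
        "ANTHROPIC_BASE_URL": base_url,    # placeholder endpoint (assumption)
        "ANTHROPIC_AUTH_TOKEN": api_key,   # your own API key
        "ANTHROPIC_MODEL": model,
    })
    return env


env = glm_env("YOUR_API_KEY", "https://example.invalid/anthropic", long_context=True)
print(env["ANTHROPIC_MODEL"])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when starting the `claude` command, which keeps the setting local to that one session instead of changing your shell profile.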

Speaking as someone who uses Claude Code at the center of day-to-day work, being able to try just a different model without changing tools is no small thing. Being able to swap in models with a different cost-performance balance, without touching your existing workflow, is the biggest practical upside of having more open models around.

Pros and Cautions for GLM-5.2

Finally, here are the points to weigh when deciding whether GLM-5.2 is for you. Its strengths are clear, but a model this large also brings real-world hurdles.

What to weigh before choosing GLM-5.2

Good fit: Long-horizon coding and agent use, keeping costs down, running it in your own environment

Watch out: At 753B it is hard to run locally, and it does not beat GPT-5.5 or Opus 4.8 on every benchmark

The Strength of Open Weights You Can Self-Host

GLM-5.2's biggest strength is that, under the MIT license, you get the model itself. You can run it on your own server even for work where data cannot leave the building, and you can modify or fine-tune it to fit your use case. Not depending on a single AI provider — and not being at the mercy of pricing changes or a service shutting down — pays off the longer you use it in production. This kind of open option also dovetails with rising interest in sovereign AI and open-weight models.

The Catch of Running Such a Large Model Locally

On the other hand, 753B total parameters is not something you can casually run on a personal computer. Running it yourself takes a substantial GPU setup, so for most people an API or chat front-end is the realistic entry point. "Open weights" does not mean "anyone can run it on their own machine right away" — worth understanding correctly before you adopt it. A sensible path is to first check the performance through Z.ai's chat or Claude Code, then consider self-hosting if needed.
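A quick calculation shows why a substantial GPU setup is needed. The sketch below estimates the memory required just to hold 753B parameters at a few common numeric precisions. The precisions are illustrative assumptions (the article does not say which formats the released weights use), and the estimate ignores the KV cache, activations, and runtime overhead, so real requirements are higher.

```python
TOTAL_PARAMS = 753e9  # total parameters, from the model card

# Bytes per parameter for some common precisions (illustrative assumptions).
BYTES_PER_PARAM = {"BF16": 2.0, "FP8": 1.0, "4-bit": 0.5}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Return the memory needed for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


for name, size in BYTES_PER_PARAM.items():
    print(f"{name}: ~{weight_memory_gb(TOTAL_PARAMS, size):,.1f} GB for weights only")
```

Even at 4-bit precision the weights alone come to several hundred gigabytes, far beyond a typical personal computer, which is why an API or chat front-end is the practical starting point.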

When you research models from overseas, you will often need to read English official docs and model cards. If you want an AI to summarize a long English document, converting it to Markdown first preserves the heading and table structure and improves accuracy. Pasting a web page as-is tends to mix in styling noise, so tidying the source before handing it over is the shortcut to stable results.


GLM-5.2 is a milestone model: open-weight, yet beating GPT-5.5 on some coding benchmarks. It is not top of the table on every metric, but having an option that handles long-horizon coding cheaply — standing alongside the closed models — matters a great deal. To judge whether it fits your needs, the steady approach is to try its actual responses via chat or API first. For developers who want to keep costs down on long-horizon coding and agent work, it has become a strong candidate.