What LongCat-2.0 is (a 1.6T MoE)
LongCat-2.0: key specs
LongCat-2.0 is an AI model that China's Meituan, best known as a food-delivery giant, open-sourced on June 30, 2026. It has 1.6 trillion total parameters but uses an MoE (Mixture-of-Experts) design, so only a fraction of those parameters is active for any given token. For the broader context of China's AI investment, see our guide to China's AI Five-Year Plan.
A 1.6T MoE with a 1M-token context
The model's backbone is huge yet efficiency-focused. It has 1.6 trillion total parameters, but uses only 33B–56B (about 48B on average) per token, and natively supports a 1-million-token context.
"1.6T total parameters with dynamic activation of 33B–56B per token" / "natively supports 1 million token context" — from the LongCat official site
An MoE model routes each token to only the few "experts" (sub-networks) it needs, which keeps per-token compute in check. Even though the model as a whole is enormous, only a fraction of it runs for each token, making it easier to balance performance against running cost.
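To make the "only a fraction runs" idea concrete, here is a minimal sketch in Python. The first part computes the active share from the figures quoted above. The second part is a generic top-k gating toy: the expert functions, the random router scores, and the fixed `k=2` are all assumptions for illustration. It is not LongCat-2.0's actual router, whose "dynamic activation" implies the active parameter count varies per token.

```python
import math
import random

# Figures quoted from the LongCat official site, in billions of parameters.
TOTAL_PARAMS_B = 1600  # 1.6T total
ACTIVE_PARAMS_B = {"min": 33, "average": 48, "max": 56}

# Share of all parameters that is active for one token, in percent.
active_share_pct = {
    label: round(100.0 * active / TOTAL_PARAMS_B, 1)
    for label, active in ACTIVE_PARAMS_B.items()
}
print("active share per token (%):", active_share_pct)


def softmax(scores):
    """Convert raw router scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}


def moe_forward(x, experts, router_scores, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    gates = route_top_k(router_scores, k)
    mixed = sum(weight * experts[i](x) for i, weight in gates.items())
    return mixed, sorted(gates)


# Toy setup (illustrative only): 8 "experts" that are plain scalar functions.
NUM_EXPERTS = 8
experts = [lambda x, scale=i + 1: scale * x for i in range(NUM_EXPERTS)]

random.seed(0)
router_scores = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
output, used = moe_forward(1.0, experts, router_scores, k=2)
print(f"experts used: {used} of {NUM_EXPERTS}, output: {output:.3f}")
```

With the quoted figures, roughly 2% to 3.5% of the parameters are active for any one token. In the toy router, the six experts that were not selected are never called, which is where the compute saving comes from.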
Specialized for agentic coding
LongCat-2.0 is aimed not at general chat but at writing code. The official site positions it as a "1.6T open-source MoE for agentic coding," meant for handing off multi-step coding work. Its design philosophy contrasts with running long tasks on far fewer parameters — compare our guide to Scaling the Horizon (a 35B agent) to see the difference.
The "industry-first" claim of training on domestic chips
The compute used for training and inference
The biggest reason LongCat-2.0 drew attention is less its performance than the hardware it was built on. With US export controls limiting access to high-end GPUs, Meituan claims to have avoided that dependence.
Training and inference on a 50,000-card domestic cluster
The official site makes a pointed claim about its compute. It says LongCat-2.0 is the first trillion-parameter-class model to complete both training and inference entirely on a 50,000-card domestic compute cluster.
"it is the industry's first trillion-parameter model to complete full training and inference on a 50,000-card domestic compute cluster" — from the LongCat official site (LongCat-2 model page)
The official wording is "domestic compute cluster" and does not name NVIDIA explicitly. Still, external media frame it as a setup that avoids NVIDIA GPUs, treating it as a real-world example of moving away from foreign GPUs under a restrictive environment.
Training tokens: "30T+" on the official site (differs from reports)
The numbers come with a caveat. Training data is listed as "30T+ tokens" on the official site. Hugging Face's model description and some external reports say "35T+ tokens," so the figure differs by source. This article uses the primary source's "30T+ tokens" and flags the gap with reported figures.
"Pretrained from scratch on 30T+ tokens spanning Chinese, English, multilingual, and code data" — from the LongCat official site (LongCat-2 model page)
LongCat-2.0's coding performance and significance
SWE-bench Pro score comparison (higher is better)
Figures from the LongCat official site, on a 0–100 scale.
Let's check the claimed performance against the benchmarks the site published. In coding, the numbers come out ahead of established models.
Leading major models on SWE-bench Pro
The official benchmarks show strong coding performance. On SWE-bench Pro, which measures realistic software development work, LongCat-2.0 scores 59.5. On the same page, that puts it ahead of GPT-5.5 (58.6), Claude Opus 4.6 (57.3), and Gemini 3.1 Pro (54.2), by margins of 0.9 to 5.3 points.
"SWE-bench Pro 59.5 (leads Gemini 3.1 Pro 54.2, GPT-5.5 58.6, Claude Opus 4.6 57.3)" / "SWE-bench Multilingual 77.3 (on par with Claude Opus 4.6 77.8)" — from the LongCat official site (benchmark table)
On SWE-bench Multilingual it scores 77.3, roughly on par with Claude Opus 4.6 (77.8) though 0.5 points behind it. Note that the official page compares against Claude Opus 4.6, not the newer 4.7 or 4.8 versions of Claude Opus.
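The size of those gaps is easy to check with a few lines of Python. This is a small sketch using only the scores quoted from the official benchmark table; the helper name `margins_over` is our own, not something from the LongCat site.

```python
# SWE-bench Pro scores as listed on the LongCat official site (0-100 scale).
SWE_BENCH_PRO = {
    "LongCat-2.0": 59.5,
    "GPT-5.5": 58.6,
    "Claude Opus 4.6": 57.3,
    "Gemini 3.1 Pro": 54.2,
}


def margins_over(baseline, scores):
    """Return how many points `baseline` leads each other model by."""
    base = scores[baseline]
    return {
        name: round(base - score, 1)
        for name, score in scores.items()
        if name != baseline
    }


pro_margins = margins_over("LongCat-2.0", SWE_BENCH_PRO)

# SWE-bench Multilingual: LongCat-2.0 (77.3) minus Claude Opus 4.6 (77.8).
multilingual_gap = round(77.3 - 77.8, 1)

for name, margin in pro_margins.items():
    print(f"SWE-bench Pro lead over {name}: {margin} points")
print(f"SWE-bench Multilingual gap vs Claude Opus 4.6: {multilingual_gap} points")
```

The lead over GPT-5.5 is under one point, so small differences in evaluation setup could plausibly change the ordering at the top; the gap to Gemini 3.1 Pro is the only one above five points.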
Significance and caveats (reasoning results are mixed)
LongCat-2.0's significance lies less in the coding scores than in "building a frontier-class coding model on domestic chips alone." Showing that an independent path can keep pace even when high-end GPUs are hard to source echoes the "self-contained AI" theme in our guide to sovereign AI (Apertus).
There is a caveat, though. Some external coverage notes weaker results on reasoning-oriented benchmarks such as instruction following and math/science, but those scores do not appear on the official page. Overall strength beyond coding is hard to state definitively from public information alone. If your use is coding-centric, it's a strong option; if you also need general reasoning, it's best confirmed on your own tasks against other models.
When feeding long specs or documents to a model like this, converting them to Markdown first preserves structure and tends to improve accuracy.
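As a rough illustration of that conversion step, the sketch below renders a small structured spec as Markdown headings and bullet lists. The function name `spec_to_markdown` and the sample spec content are hypothetical, invented for this example. Real documents (PDF, Word, HTML) would need a proper converter, which is outside the scope of this sketch.

```python
def spec_to_markdown(title, sections):
    """Render a title plus {heading: [items]} sections as Markdown text."""
    lines = [f"# {title}", ""]
    for heading, items in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        lines.extend(f"- {item}" for item in items)
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical spec content, invented purely for illustration.
spec_sections = {
    "Requirements": ["Support CSV and JSON input", "Reject files over 10 MB"],
    "Out of scope": ["User authentication"],
}

markdown_spec = spec_to_markdown("Import service spec", spec_sections)
print(markdown_spec)
```

The headings and list markers keep the hierarchy of the spec explicit in plain text, so the structure survives when the document is passed to the model as a prompt.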



