Ant Group Open-Sources Ling-2.6-1T, a Trillion-Parameter Model Built for Agents

Ant Group's InclusionAI lab released the open weights for Ling-2.6-1T on April 30, 2026: a trillion-parameter Mixture-of-Experts (MoE) language model purpose-built for agentic workflows, tool use, and long-running software tasks. The weights are available on Hugging Face and ModelScope under a permissive license, adding to an already aggressive year for Chinese open-source AI.
Key Highlights
- 1 trillion total parameters, with roughly 50–63B active per token via Mixture-of-Experts routing.
- 262,144-token context window, with up to 32,800 output tokens per response.
- Hybrid attention architecture combining Multi-Head Latent Attention (MLA) with Linear Attention to enable a "Fast-Thinking" mode that cuts token overhead.
- Open weights published on Hugging Face and ModelScope, with a first-party API at $0.30 per million input tokens and $2.50 per million output tokens.
- 15-point jump on the Artificial Analysis Intelligence Index versus its predecessor Ling-1T, reaching 34 — comparable to DeepSeek V3.2 (32) and Kimi K2.5 (37).
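At the list prices quoted above, per-call costs are easy to estimate. The sketch below is purely illustrative; the function and constant names are invented for this article and are not part of any official Ling SDK.

```python
# Hypothetical cost calculator using the list prices quoted in the highlights
# ($0.30 per million input tokens, $2.50 per million output tokens).
# Names here are illustrative, not part of any official API or SDK.

INPUT_PRICE_PER_M = 0.30   # USD per 1M input tokens (list price from the release)
OUTPUT_PRICE_PER_M = 2.50  # USD per 1M output tokens

def api_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single call at list pricing."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Example: a long-context agent turn that reads 200K tokens and writes 8K.
print(round(api_cost(200_000, 8_000), 4))  # → 0.08
```

Even a turn that consumes most of the 262K context window stays well under a dollar at these rates.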
Details
Developed by the Bailing Large Model team at Ant Group's InclusionAI lab, Ling-2.6-1T is positioned as an "execution-first" flagship rather than a pure reasoning model. Ant frames the design as a deliberate departure from chain-of-thought heavy systems: instead of producing long visible reasoning traces, the model is tuned to act decisively on instructions, tool calls, and structured outputs.
According to Ant's release notes, the model achieves state-of-the-art open-source results on SWE-bench Verified, AIME 26, TAU2-Bench for agent workflows, and BFCL-V4 for tool invocation. Independent evaluation by Artificial Analysis confirmed strong scientific reasoning numbers — 75 percent on GPQA — putting Ling-2.6-1T in the same intelligence tier as DeepSeek V3.2 on graduate-level knowledge tasks.
The "Fast-Thinking" mechanism is the headline efficiency claim. Ant reports that Ling-2.6-1T uses roughly 16 million output tokens to complete the full Artificial Analysis Intelligence Index, compared with about 75 million for GLM-5.1 and 27 million for Kimi K2.6. At list pricing, that translates to roughly $95 to run the entire benchmark suite — an order of magnitude cheaper than several closed frontier models.
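The figures above can be sanity-checked with back-of-the-envelope arithmetic, under the assumption that the quoted counts are output tokens billed at the $2.50/M list price (input-token costs, which Ant does not break out, would account for the remainder of the ~$95 total):

```python
# Back-of-the-envelope check of the efficiency claim. Assumption: the quoted
# token counts are output tokens billed at the $2.50/M list price; input
# costs are not broken out in the release and are ignored here.

OUTPUT_PRICE_PER_M = 2.50  # USD per 1M output tokens (list price)

ling_tokens = 16_000_000   # Ling-2.6-1T, per Ant's report
glm_tokens = 75_000_000    # GLM-5.1
kimi_tokens = 27_000_000   # Kimi K2.6

output_cost = ling_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
print(output_cost)                          # → 40.0 (USD, output side only)
print(round(glm_tokens / ling_tokens, 1))   # → 4.7 (x more tokens than Ling)
print(round(kimi_tokens / ling_tokens, 1))  # → 1.7
```

The output side alone comes to $40, consistent with a ~$95 all-in figure once input tokens are added, and the token ratios show Ling emitting roughly a fifth of GLM-5.1's volume for the same benchmark suite.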
Impact
For developers building agents, the cost-to-capability ratio is the most consequential part of the announcement. A trillion-parameter open model that runs agent workflows for under one dollar per significant task removes a meaningful barrier for startups and research teams that have been priced out of frontier API tiers. The 262K context window also opens room for long codebase reviews, multi-document analysis, and extended tool-use loops without aggressive truncation.
The factual reliability picture is more mixed. Artificial Analysis flagged a 92 percent hallucination rate on its AA-Omniscience benchmark — close to GPT-5.5 non-reasoning — meaning teams deploying the model for retrieval-grounded or compliance-sensitive workloads will still need careful guardrails and verification layers.
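One minimal pattern for such a verification layer — entirely illustrative, and not part of any Ling tooling — is to flag generated sentences with little lexical overlap against the retrieved context before they reach the user:

```python
# Illustrative grounding check for retrieval-augmented use: flag generated
# sentences whose content-word overlap with the retrieved context falls
# below a threshold. A toy heuristic, not a production hallucination detector.

import re

def ungrounded_sentences(answer: str, context: str, threshold: float = 0.5):
    """Return sentences from `answer` poorly supported by `context`."""
    context_words = set(re.findall(r"[a-z]+", context.lower()))
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        # Only count content-bearing words (longer than 3 letters).
        words = [w for w in re.findall(r"[a-z]+", sentence.lower()) if len(w) > 3]
        if not words:
            continue
        overlap = sum(w in context_words for w in words) / len(words)
        if overlap < threshold:
            flagged.append(sentence)
    return flagged

context = "Ling-2.6-1T supports a 262,144-token context window."
answer = ("The model supports a 262,144-token context window. "
          "It was trained on Mars.")
print(ungrounded_sentences(answer, context))  # flags the unsupported claim
```

Real deployments would use semantic rather than lexical matching, but the shape — generate, check against sources, gate on a score — is the same.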
For the broader ecosystem, the release adds another data point to a narrative that has dominated 2026: Chinese labs are no longer trailing the open-source frontier — they are setting it. Ling-2.6-1T joins recent open releases from Alibaba's Qwen team, Moonshot's Kimi line, Zhipu's GLM-5, and DeepSeek as evidence that the gap to closed Western frontier models is, at minimum, narrowing.
Background
Ant Group, the fintech affiliate in which Alibaba holds roughly a 33 percent minority stake, operates InclusionAI as its open research lab. The Ling family began with smaller dense models, scaled to the original Ling-1T late last year, and has now iterated through Ling-2.5-1T, Ling-2.6-Flash, and Ling-2.6-1T over a period of roughly six months. A separate reasoning-oriented sibling, Ring-1T, was released earlier as the first open-source trillion-parameter "thinking" model, and the two product lines now serve distinct roles: Ring for deliberate reasoning, Ling for fast execution.
The strategy mirrors a broader pattern across Chinese AI labs of distributing open weights aggressively while monetizing through inference APIs and enterprise integrations. Ant has not disclosed training compute, but the architectural improvements — particularly the linear-attention component — suggest meaningful work on inference economics rather than raw scale alone.
What's Next
Ant has signaled that the Ling family will continue iterating on token efficiency and agent specialization, with deeper integration into Ant's own enterprise products expected later this year. Third-party hosts including Novita, OpenRouter, and ZenMux already serve Ling-2.6-1T, which should accelerate developer experimentation. Watch for community fine-tunes, distillations to smaller sizes, and benchmarks on agent-heavy workloads — those, more than raw intelligence-index numbers, will determine whether Ling-2.6-1T becomes a default choice for production agent stacks or a strong-but-niche alternative to DeepSeek and Qwen.
Source: AIBase