Moonshot AI released Kimi K2.7-Code on June 12, 2026, a coding-focused, open-weight successor to Kimi K2.6 that the company says posts a 21.8% gain on its Kimi Code Bench v2 while burning roughly 30% fewer reasoning tokens. The model ships under a Modified MIT license with weights live on Hugging Face, putting frontier-class agentic coding within reach of teams that want to run it on their own infrastructure.
It is the fifth major release in Moonshot's K-series in under a year, and it stakes out a clear position: rather than chase the largest general-purpose chatbot, the Beijing-based lab is doubling down on long-horizon, agentic software engineering as an open-weight alternative to closed coding models.
Key Highlights
- 1-trillion-parameter mixture-of-experts model with 32B active parameters per token, 384 experts (8 selected per token plus 1 shared), and 61 layers.
- 256K-token context window aimed at long, multi-file agentic workflows.
- Modified MIT license, with weights (roughly 595 GB) on Hugging Face and native INT4 quantization for cheaper deployment.
- Reported relative gains over K2.6 of +21.8% on Kimi Code Bench v2, +11.0% on Program Bench, and +31.5% on MLS Bench Lite.
- Roughly 30% fewer reasoning tokens than K2.6, lowering output costs on multi-step tasks.
Details
K2.7-Code is built as a mixture-of-experts (MoE) network: of its 1 trillion total parameters, only about 32 billion activate for any given token, which keeps inference compute far below what a dense model of comparable size would demand. The architecture pairs Multi-head Latent Attention (MLA) with SwiGLU feed-forward layers and includes a 400M-parameter MoonViT vision encoder for multimodal inputs. Moonshot says the weights can be served with vLLM, SGLang, or KTransformers, and native INT4 quantization is intended to make on-premises deployment more affordable.
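The sparsity figures can be sanity-checked with a few lines of arithmetic. The sketch below uses only numbers quoted in this article. It assumes that "GB" means decimal gigabytes, that the parameter counts are rounded, and that the shared expert is counted separately from the 384 routed experts; none of these are confirmed by the source, so the results are rough.

```python
# Back-of-envelope sparsity figures from the numbers quoted in the article.
# All inputs are rounded vendor figures, so the outputs are approximate.
TOTAL_PARAMS = 1_000_000_000_000   # 1 trillion total parameters
ACTIVE_PARAMS = 32_000_000_000     # about 32 billion active per token
NUM_ROUTED_EXPERTS = 384           # assumed to exclude the shared expert
ROUTED_PER_TOKEN = 8               # experts selected per token
WEIGHTS_BYTES = 595 * 10**9        # ~595 GB, assuming decimal gigabytes


def active_fraction(active: int, total: int) -> float:
    """Share of parameters that participate in a single token's forward pass."""
    return active / total


def bits_per_parameter(size_bytes: int, params: int) -> float:
    """Average storage cost per parameter implied by the checkpoint size."""
    return size_bytes * 8 / params


param_share = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
routed_share = ROUTED_PER_TOKEN / NUM_ROUTED_EXPERTS
avg_bits = bits_per_parameter(WEIGHTS_BYTES, TOTAL_PARAMS)

print(f"Active parameter share per token: {param_share:.1%}")
print(f"Routed experts used per token:    {routed_share:.1%}")
print(f"Average storage per parameter:    {avg_bits:.2f} bits")
```

Under these assumptions, about 3.2% of the parameters are active per token and the checkpoint averages a little under 5 bits per parameter. That would be consistent with INT4 weights plus some components stored at higher precision, though the article does not break the checkpoint down.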
On Moonshot's own benchmarks, the headline jump is Kimi Code Bench v2, which climbs from 50.9 to 62.0. Program Bench rises from 48.3 to 53.6, and MLS Bench Lite, a multi-language test, jumps from 26.7 to 35.1. For tool-use and agentic behavior, the company reports 81.1 on MCP Mark Verified (up from 72.8) and 76.0 on MCP Atlas. Moonshot also claims K2.7-Code edges out Claude Opus 4.8 on MCP Mark Verified, while trailing GPT-5.5 on most measures.
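The percentage gains Moonshot cites are relative improvements, not percentage-point differences, and they can be recomputed from the before and after scores. The sketch below does so with the vendor-reported scores from this article. The relative figure for MCP Mark Verified is derived here and is not a number Moonshot published.

```python
# Recompute relative gains from the vendor-reported K2.6 -> K2.7-Code scores.
SCORES = {
    "Kimi Code Bench v2": (50.9, 62.0),
    "Program Bench": (48.3, 53.6),
    "MLS Bench Lite": (26.7, 35.1),
    "MCP Mark Verified": (72.8, 81.1),  # relative gain derived here, not quoted
}


def relative_gain(old: float, new: float) -> float:
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100


for name, (old, new) in SCORES.items():
    points = new - old
    print(f"{name}: {old} -> {new} "
          f"(+{points:.1f} points, +{relative_gain(old, new):.1f}%)")
```

The distinction matters when comparing releases: the 11.1-point rise on Kimi Code Bench v2 is a 21.8% relative gain only because the K2.6 baseline was 50.9.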
On pricing, the Kimi API lists cache-miss input at $0.95 per million tokens, cached input at $0.19 per million, and output at $4.00 per million, undercutting most frontier closed models for comparable coding work.
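To see what those rates mean for an agentic session, the listed prices can be turned into a simple cost estimate. The sketch below is illustrative only: the `request_cost` helper, the token counts, and the 80% cache-hit ratio are hypothetical, and it treats all output as reasoning tokens when applying the roughly 30% reduction, which is a simplification.

```python
# Listed Kimi API prices, in USD per million tokens (from the article).
PRICE_INPUT_MISS = 0.95
PRICE_INPUT_CACHED = 0.19
PRICE_OUTPUT = 4.00


def request_cost(input_tokens: int, output_tokens: int,
                 cache_hit_ratio: float = 0.0) -> float:
    """Estimate cost in USD for a workload with a given share of cached input."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    cached = input_tokens * cache_hit_ratio
    uncached = input_tokens - cached
    total = (uncached * PRICE_INPUT_MISS
             + cached * PRICE_INPUT_CACHED
             + output_tokens * PRICE_OUTPUT)
    return total / 1_000_000


# Hypothetical agentic session: 2M input tokens, 80% served from cache.
baseline = request_cost(2_000_000, 200_000, cache_hit_ratio=0.8)
# Same session with 30% fewer output tokens (simplifying assumption).
reduced = request_cost(2_000_000, 140_000, cache_hit_ratio=0.8)

print(f"Baseline session:         ${baseline:.2f}")
print(f"With 30% fewer out tokens: ${reduced:.2f}")
```

For this hypothetical workload the estimate falls from about $1.48 to about $1.24. Output tokens cost more than four times as much as uncached input, so a reduction in generated tokens has a large effect on the bill.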
A Caveat on the Numbers
Every headline figure published so far comes from Moonshot's own proprietary benchmarks. As of June 12, 2026, there were no independent third-party results for K2.7-Code on standard public suites such as SWE-bench Verified, SWE-bench Pro, Terminal-Bench, or LiveCodeBench. The vendor-reported gains are notable, but the open-weight release means the community can now validate them directly — which is precisely the point of shipping the weights.
Impact
For development teams, the appeal is sovereignty and cost. An open-weight, commercially usable coding model that runs on owned hardware replaces per-token API bills with infrastructure costs the team controls, and it keeps proprietary source code off third-party servers, a meaningful consideration for organizations bound by data-residency rules. The roughly 30% reduction in reasoning tokens compounds that advantage across the long, multi-step agentic loops that now dominate AI-assisted coding.
The release also sharpens a trend that has defined 2026: open-weight Chinese coding models steadily closing the gap with closed Western systems. Alongside recent open-source pushes from Cohere, Alibaba's Qwen line, and Zhipu, K2.7-Code adds pressure on the premise that the most capable coding agents must be proprietary and API-gated.
Background
Moonshot AI, founded in 2023, has shipped its K-series at an aggressive cadence, moving from K2 through K2.7-Code between July 2025 and June 2026. The K2 family established the lab's reputation for large MoE models tuned for agentic tasks; K2.7-Code narrows that focus to software engineering specifically, trading some generality for measurable gains on coding and tool-use benchmarks.
What's Next
With the weights public, the immediate test is independent benchmarking on the standard suites the vendor numbers do not yet cover. If K2.7-Code holds up on SWE-bench-class evaluations, it strengthens the case for open-weight models as the default substrate for self-hosted coding agents — and gives MENA teams another route to building on AI without locking into a single foreign API provider.
Source: MarkTechPost