News · May 22, 2026 · 6 min read

Alibaba Launches Qwen3.7-Max and Custom Chips in Full-Stack AI Factory Bid

Alibaba unveiled Qwen3.7-Max at its Cloud Summit — a flagship reasoning agent with a 1M-token context window that sustained 35 hours of autonomous operation — alongside the Zhenwu M890 chip and Panjiu AL128 server, completing a five-layer AI factory stack.

Alibaba unveiled Qwen3.7-Max at its annual Cloud Summit in Hangzhou on May 20–21, 2026, positioning the model as its most capable AI agent to date and anchoring a full-stack "AI factory" strategy that extends from custom silicon all the way to end-user applications.

Key Highlights

  • Qwen3.7-Max carries a 1M-token context window, roughly four times the 256K limit of its predecessor Qwen3.6
  • The model executed more than 1,000 autonomous tool calls in a single internal test run
  • Demonstrated sustained autonomous execution of up to 35 hours on complex coding tasks without performance degradation
  • Achieved a roughly 10x inference speedup over the prior version in kernel optimization workloads
  • Ranked #5 overall on the Artificial Analysis Intelligence Index with a score of 56.6

The Three-Product Stack

Alibaba used the summit to launch three interconnected products that together form what the company calls its AI factory infrastructure.

Qwen3.7-Max is the model layer: a proprietary, closed-weight reasoning agent designed for long-horizon tasks. Before committing to an answer, the model generates an internal chain of thought. On benchmarks it produced around 97 million reasoning tokens, roughly four times the 24 million or so generated by comparable models.

Zhenwu M890 is Alibaba's purpose-built AI accelerator, developed by its semiconductor subsidiary T-Head and optimized for the large inference workloads that agent models require.

Panjiu AL128 is a rack-scale server that links 128 M890 accelerators into a single deployable unit, providing the compute density needed to sustain multi-hour autonomous tasks at scale.

Liu Weiguang, Alibaba Cloud's senior vice-president, framed the company's ambition directly: it now operates "all five layers of the full AI stack — chips, cloud infrastructure, AI models, service platforms, and applications."

Benchmark Results

Independent evaluations show meaningful gains over the previous generation:

  • CritPt: 13.4% — up 9.7 percentage points
  • Humanity's Last Exam: 38.1% — up 9.2 percentage points
  • Terminal-Bench Hard: 50.8% — up 6.9 percentage points
  • GDPval-AA (Elo): 1,546 — up 42 points
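
The figures above give each current score and its gain but not the previous generation's score. The following is a minimal Python sketch for backing those baselines out, assuming each gain is a plain difference from the prior score. The names REPORTED, implied_previous, and implied_baselines are illustrative, and the derived baselines are arithmetic on the article's numbers, not values published by Alibaba or the benchmark maintainers.

```python
# Current score and reported gain for each benchmark, as listed in the article.
# Percent benchmarks use percentage points; GDPval-AA uses Elo points.
REPORTED = {
    "CritPt (%)": (13.4, 9.7),
    "Humanity's Last Exam (%)": (38.1, 9.2),
    "Terminal-Bench Hard (%)": (50.8, 6.9),
    "GDPval-AA (Elo)": (1546, 42),
}


def implied_previous(current, gain):
    """Return the prior score implied by a current score and its gain.

    Assumption: gain = current - previous. Rounded to one decimal place
    to hide floating-point noise.
    """
    return round(current - gain, 1)


def implied_baselines(reported):
    """Map each benchmark name to its implied previous-generation score."""
    return {
        name: implied_previous(current, gain)
        for name, (current, gain) in reported.items()
    }


if __name__ == "__main__":
    for name, previous in implied_baselines(REPORTED).items():
        print(f"{name}: {previous}")
```

Under that assumption, the implied prior scores are 3.7% on CritPt, 28.9% on Humanity's Last Exam, 43.9% on Terminal-Bench Hard, and an Elo of 1,504 on GDPval-AA.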

Zhou Jingren, Alibaba's newly appointed Chief AI Architect, said the model "consistently ranked among the top tier" and outperformed competing Chinese AI models across categories.

One trade-off worth noting: on the AA-Omniscience benchmark, the model's attempt rate dropped to 48.0% from 67.3%, a 19.3-point decline that suggests it now abstains more often on uncertain knowledge-recall questions rather than risk a hallucinated answer.

Competitive Context

The announcement lands amid an intensifying race in China's AI sector. Tencent launched its Mavis AI assistant during the same window, while ByteDance released Seedance 2.0 video generation. Alibaba's vertical integration play — controlling chips through T-Head and cloud infrastructure through Alibaba Cloud — mirrors the strategy pursued by Nvidia and Google in Western markets.

What's Next

As of late May 2026, Qwen3.7-Max is available in preview only, with API access rolling out progressively on Alibaba Cloud. Pricing has not been announced; the predecessor, Qwen3.6 Max Preview, was priced at $1.30 per million input tokens and $7.80 per million output tokens. No open-weight version of Qwen3.7-Max has been released.
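
For a rough sense of scale, the predecessor's per-token prices can be turned into a per-request estimate. The sketch below is hypothetical: it uses the Qwen3.6 Max Preview prices quoted above as a stand-in, since Qwen3.7-Max pricing is unannounced, and the function name estimate_cost and the example token counts are illustrative only.

```python
from decimal import Decimal

# Qwen3.6 Max Preview prices quoted in the article (USD per million tokens).
# Assumption: used only as a placeholder; Qwen3.7-Max pricing is unannounced.
INPUT_PRICE_PER_M = Decimal("1.30")
OUTPUT_PRICE_PER_M = Decimal("7.80")
MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request, rounded to the nearest cent."""
    cost = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    ) / MILLION
    return cost.quantize(Decimal("0.01"))


if __name__ == "__main__":
    # Hypothetical request: a full 1M-token prompt and 20,000 output tokens.
    print(estimate_cost(1_000_000, 20_000))
```

At those placeholder prices, filling the full 1M-token context once with a 20,000-token response would cost about $1.46. The real figure depends on the pricing Alibaba eventually sets.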


Source: Alibaba Cloud via MarkTechPost