Cohere has released Command A+, its most powerful large language model to date and the first the company has published under a fully open-source Apache 2.0 license. Announced on May 20, 2026, the model uses a sparse Mixture-of-Experts (MoE) architecture with 218 billion total parameters, of which 25 billion are active during inference, and can run on as few as two NVIDIA H100 GPUs or a single NVIDIA B200.
Key Highlights
- 218B/25B MoE architecture: enterprise-grade capacity with only 25B parameters active during inference
- Apache 2.0 license: a permissive open-source license that allows commercial use, subject to its attribution and notice requirements
- Support for 48 languages: up from 23 in previous Command A models, with major gains in tokenization efficiency for non-European languages
- 128K context window: with up to 64K tokens of generated output
- 63% faster token generation than Command A Reasoning
- Native citation generation: every factual claim drawn from a source document is automatically grounded to that document
Details
Command A+ introduces what Cohere calls lossless quantization, a technique that compresses the model to 4-bit precision (W4A4, meaning 4-bit weights and 4-bit activations) without meaningful degradation in output quality. This makes it possible to run an enterprise-grade 218B-parameter model on two H100s, a milestone that significantly lowers the infrastructure barrier for self-hosted AI deployments.
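The two-H100 figure can be sanity-checked with simple arithmetic. The sketch below is a rough estimate, not Cohere's sizing method: it assumes 80 GB of memory per H100 and 192 GB per B200 (figures not given in the article), and it counts only the weights, ignoring quantization scales, the KV cache, and activations.

```python
# Back-of-the-envelope weight memory for a 218B-parameter model.
# Assumptions (not from the article): 80 GB per H100, 192 GB per B200,
# and no overhead for quantization scales, KV cache, or activations.

TOTAL_PARAMS = 218_000_000_000  # total parameter count quoted in the article


def weight_memory_gb(total_params: int, bits_per_weight: int) -> float:
    """Return the approximate size of the weights alone, in decimal GB."""
    return total_params * bits_per_weight / 8 / 1e9


def fits(total_params: int, bits_per_weight: int, gpu_gb: float, num_gpus: int) -> bool:
    """Check whether the weights alone fit in the combined GPU memory."""
    return weight_memory_gb(total_params, bits_per_weight) <= gpu_gb * num_gpus


if __name__ == "__main__":
    for bits in (16, 8, 4):
        print(f"{bits}-bit weights: {weight_memory_gb(TOTAL_PARAMS, bits):.0f} GB")
    print("4-bit on 2x H100:", fits(TOTAL_PARAMS, 4, 80, 2))
    print("16-bit on 2x H100:", fits(TOTAL_PARAMS, 16, 80, 2))
    print("4-bit on 1x B200:", fits(TOTAL_PARAMS, 4, 192, 1))
```

Under these assumptions, 4-bit weights occupy roughly 109 GB, which leaves headroom within the 160 GB of two H100s, whereas 16-bit weights (about 436 GB) would not fit.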
The model's tokenizer has been optimized for global reach. The token cost for Arabic drops by 20%, Japanese by 18%, and Korean by 16% compared to previous versions. This efficiency improvement translates directly to lower inference costs and faster response times for users in those language markets.
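To make the savings concrete, here is a minimal sketch of how the quoted reductions would affect a per-token bill. It assumes cost scales linearly with token count; the function names and the price used in the usage note are hypothetical and are not Cohere's published rates.

```python
# Token-count reductions quoted in the article, in percent.
TOKEN_REDUCTION_PCT = {"Arabic": 20, "Japanese": 18, "Korean": 16}


def tokens_after(old_tokens: int, language: str) -> int:
    """Estimate the token count for the same text under the new tokenizer."""
    pct = TOKEN_REDUCTION_PCT[language]
    return round(old_tokens * (100 - pct) / 100)


def cost(tokens: int, price_per_million: float) -> float:
    """Cost of processing `tokens` at a flat per-million-token price (hypothetical)."""
    return tokens * price_per_million / 1_000_000


if __name__ == "__main__":
    for language in TOKEN_REDUCTION_PCT:
        print(language, tokens_after(1000, language))
```

For example, an Arabic prompt that previously took 1,000 tokens would take about 800, so at any flat per-token price the bill for that prompt falls by the same 20%.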
Native citation generation is another major differentiator. When Command A+ retrieves information from external tools or documents, it generates explicit "grounding spans", links that tie each factual claim to the specific source that produced it. This is particularly valuable for regulated industries such as healthcare, finance, and legal services, where auditability is a compliance requirement.
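One way to picture a grounding span is as a character range in the answer paired with the sources that support it. The sketch below is illustrative only: the `GroundingSpan` class, its fields, and the `annotate` helper are hypothetical names, not Cohere's actual response schema, and the helper assumes spans do not overlap.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class GroundingSpan:
    """Hypothetical record linking part of an answer to its sources."""
    start: int                   # inclusive character offset into the answer
    end: int                     # exclusive character offset into the answer
    source_ids: Tuple[str, ...]  # identifiers of the supporting documents


def annotate(answer: str, spans: List[GroundingSpan]) -> str:
    """Insert a [source] marker after each grounded span (spans must not overlap)."""
    parts = []
    cursor = 0
    for span in sorted(spans, key=lambda s: s.start):
        parts.append(answer[cursor:span.end])
        parts.append("[" + ", ".join(span.source_ids) + "]")
        cursor = span.end
    parts.append(answer[cursor:])
    return "".join(parts)


if __name__ == "__main__":
    answer = "Revenue rose 12% in 2025."
    claim = "rose 12%"
    start = answer.index(claim)
    span = GroundingSpan(start, start + len(claim), ("doc_1",))
    print(annotate(answer, [span]))
```

Keeping offsets and source identifiers together is what makes an answer auditable: a reviewer can trace each marked claim back to the document that produced it.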
The model accepts text and image inputs and supports tool use, with a 128K context window and up to 64K tokens of output. This makes it well suited to long-document processing, complex agentic workflows, and retrieval-augmented generation (RAG) pipelines.
Impact
The Apache 2.0 release marks a strategic shift for Cohere. The company has historically been positioned as an enterprise-first, API-driven AI vendor. This release brings it directly into competition with Meta's Llama series and Mistral's open models, but with a sovereign AI focus that sets it apart.
For governments and regulated enterprises in the MENA region, Europe, and Southeast Asia, Command A+ offers a rare combination: a model powerful enough to handle complex agentic tasks, efficient enough to self-host without massive GPU fleets, and open enough to deploy in air-gapped environments without sharing data with a third-party API.
The 20% improvement in Arabic tokenization efficiency makes Command A+ one of the most cost-effective open-source models available for Arabic-language AI deployments in the MENA region.
Background
Cohere was founded in 2019 by former Google Brain researchers and has historically focused on enterprise customers via proprietary API models. Command A+ is its first model released with full weights under a permissive open-source license, a significant strategic departure.
The release follows Cohere's merger with Aleph Alpha, the German AI company, which strengthened its European sovereign AI footprint. Command A+ is positioned as the flagship model for both companies' combined customer base across public sector and critical infrastructure.
What's Next
Command A+ is available immediately on Hugging Face and Cohere's Model Vault, with a free API trial via the Cohere platform. Organizations can deploy it on-premises, in private cloud environments, or in fully air-gapped configurations. Cohere confirmed day-0 support via the vLLM inference framework, making integration into existing AI infrastructure straightforward.
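For teams planning a self-hosted deployment, a vLLM launch might look roughly like the sketch below. It only assembles a `vllm serve` command line and does not start a server. The model identifier is a placeholder because the article does not give the Hugging Face repository name, and the tensor-parallel size of 2 and context length of 131,072 tokens (128 × 1,024) are assumptions drawn from the two-H100 and 128K figures above, not settings confirmed by Cohere.

```python
import shlex
from typing import List, Optional

# Hypothetical placeholder: the article does not name the Hugging Face repository.
MODEL_ID = "<org>/<model-name>"


def build_vllm_serve_command(
    model_id: str,
    num_gpus: int = 2,
    max_model_len: Optional[int] = None,
) -> List[str]:
    """Assemble a `vllm serve` command that shards the model across `num_gpus` GPUs."""
    cmd = ["vllm", "serve", model_id, "--tensor-parallel-size", str(num_gpus)]
    if max_model_len is not None:
        # Cap the context length the server will accept.
        cmd += ["--max-model-len", str(max_model_len)]
    return cmd


if __name__ == "__main__":
    print(shlex.join(build_vllm_serve_command(MODEL_ID, num_gpus=2, max_model_len=131072)))
```

Building the command as a list rather than a string keeps it safe to pass to `subprocess.run` without shell quoting issues once the real model identifier is known.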