Groq Raises $650M to Build an Inference Neocloud After Nvidia's $20B Deal

AI chip startup Groq is raising $650 million from its existing investors to fund a second act as an inference-focused cloud provider, roughly six months after Nvidia paid an estimated $20 billion in a licensing deal that stripped away the company's founder and much of its core engineering team. The round was first reported by Axios in late May 2026 and is being led internally by interim CEO Adam Winter and CFO Matt Eng.

Key Highlights

Groq is raising $650 million from current backers, with investors Disruptive and Infinitum prepared to cover any portion of the round that other shareholders do not take up pro rata.
The capital follows Nvidia's roughly $20 billion December 2025 "not-acqui-hire," a non-exclusive license to Groq's chip technology that also moved founder Jonathan Ross and senior engineers to Nvidia.
The fresh money will fund GroqCloud capacity and next-generation Language Processing Unit (LPU) development, repositioning Groq as an AI inference neocloud.

Details

The arrangement that reshaped Groq was unusual even by 2026 standards. Rather than acquiring the company outright, Nvidia structured a licensing deal reportedly worth around $20 billion: it obtained rights to Groq's chip architecture, paid out the startup's existing investors in cash, and brought on a number of senior Groq employees, including founder and chief executive Jonathan Ross. Crucially, Groq itself remained an independent company.

What is left is the business that Groq is now betting on. The startup has refocused on its inference cloud, which lets developers and enterprises host inference-heavy applications on Groq's proprietary LPU hardware, an architecture purpose-built for inference: the processing that happens after a prompt is submitted and that powers live chatbots, agents, and real-time AI features. Groq says it has already shipped its chips to multiple model providers and cloud customers.

Leadership has changed alongside the strategy. With Ross gone to Nvidia, Adam Winter is serving as interim chief executive and Matt Eng as chief financial officer. The two are steering the $650 million raise, which is earmarked for expanding GroqCloud capacity and developing the next generation of LPU silicon.

Impact

The pivot reflects one of the clearest shifts in the AI hardware market: inference, the work that happens every time an AI prompt is answered, has overtaken model training as the larger and more durable source of demand. As enterprises move from experimenting with AI to running it in production, the cost and latency of serving models at scale matter more than the one-time expense of training them.

For developers, a well-capitalized Groq competing on inference speed and price could mean more choice beyond the dominant GPU clouds. Groq has long marketed its LPU as delivering low-latency, high-throughput inference, and a dedicated neocloud built around that hardware would put it in direct competition with the inference offerings of larger providers.

Background

Groq was founded by Jonathan Ross, a former Google engineer who helped create Google's Tensor Processing Unit, and it spent years positioning the LPU as a specialized alternative to general-purpose GPUs for language workloads. The December 2025 deal with Nvidia, by absorbing the founder and key talent while licensing the core technology, blurred the line between acquisition and partnership and left the remaining company to chart a new course with a familiar product.

What's Next

If the round closes as described, Groq will enter the second half of 2026 with fresh capital, a narrowed focus on inference, and the backstop of investors willing to fund the entire raise themselves. The open question is whether a neocloud strategy can sustain a company whose original founding team now sits inside its largest industry partner, and whether demand for fast, cheap inference is enough to carve out lasting ground in an increasingly crowded market.

Source: TechCrunch