AI inference startup Baseten is closing a roughly $1.5 billion funding round at a valuation of up to $13 billion, according to a report from The Wall Street Journal published June 18, 2026. The deal lands just five months after the company raised a $300 million Series E at a $5 billion valuation. Measured against the top-end $13 billion figure, that is a 160% valuation jump in under half a year.
The round underscores how serving AI models — not just training them — has become one of the most fiercely contested and well-capitalized layers in the entire technology stack.
Key Highlights
- Baseten is raising approximately $1.5 billion in new funding.
- The round uses a split-priced structure: some investors are entering at an $11 billion valuation, others at $13 billion.
- The deal is co-led by Spark Capital, Sands Capital, Altimeter Capital, Wellington Management, and Conviction.
- The company's annualized revenue run-rate reportedly tripled from $200 million to $600 million in a single quarter.
- Customers include Cursor, OpenEvidence, Abridge, Notion, Clay, Mercor, and Lovable.
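The percentages behind these highlights are simple arithmetic on the reported figures. The short Python sketch below reproduces them and also derives an implied valuation-to-run-rate multiple for each price tier. The multiple is an illustrative calculation, not a figure from the report, and the helper name `pct_change` and the tier labels are hypothetical.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100


# Figures as reported in the article, in billions of US dollars.
SERIES_E_VALUATION = 5.0
NEW_ROUND_VALUATIONS = {"lower tier": 11.0, "upper tier": 13.0}  # labels are illustrative
RUN_RATE = 0.6  # $600 million annualized revenue run-rate

for tier, valuation in NEW_ROUND_VALUATIONS.items():
    jump = pct_change(SERIES_E_VALUATION, valuation)
    # Implied multiple: derived here for illustration, not reported by the source.
    multiple = valuation / RUN_RATE
    print(f"{tier}: +{jump:.0f}% vs Series E, {multiple:.1f}x run-rate")
```

On these inputs the lower tier works out to a 120% jump and roughly 18.3 times run-rate, and the upper tier to a 160% jump and roughly 21.7 times run-rate.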
Details
Founded in 2019 by Tuhin Srivastava, Amir Haghighat, Phil Howes, and Pankaj Gupta, Baseten builds software and multi-cloud compute infrastructure that helps companies run, optimize, and customize AI models in production. Its focus is inference — the computationally expensive process of actually getting a trained model to produce useful outputs after a user submits a prompt.
The pitch is increasingly attractive to companies watching their AI bills climb. Baseten specializes in serving open-source models efficiently, and several of its customers report cost savings of up to 30% versus closed-source APIs. That positioning has powered an extraordinary growth curve: the company's $300 million Series E in early 2026 came just nine months after a $150 million Series D, and the new round would push total funding past $2 billion.
Impact
The financing reflects what investors are calling an "inference gold rush." For years, capital and attention flowed toward the labs training frontier models. Now the economics of running those models at scale — cheaply, reliably, and across multiple clouds — has become its own infrastructure battleground.
Industry estimates project that inference will account for roughly two-thirds of all AI compute by the end of 2026, up from about one-third in 2023. As model quality across providers converges, the competitive frontier is shifting from who has the smartest model to who can serve intelligence most economically at scale.
"If cloud was the foundation that enabled the last generation of great technology companies, inference is the foundation for the next," CEO Tuhin Srivastava said, framing the company's bet.
Background
Baseten competes in a crowded and rapidly funded category that includes Fireworks, Together AI, and a wave of "neocloud" providers building dedicated GPU and inference capacity. The split-priced valuation structure, in which different investor cohorts buy in at different prices, has become increasingly common among late-stage AI companies racing to raise before the next wave of growth.
The strategy hinges on a thesis that open-source and open-weight models have closed enough of the quality gap that the deciding factor for many production workloads is no longer raw capability, but the cost and control of serving them.
What's Next
For developers and businesses across regions where AI spend is heavily scrutinized — including the MENA market, where cost efficiency and data sovereignty weigh heavily on infrastructure decisions — the rise of specialized inference platforms is a meaningful shift. Cheaper, multi-cloud serving of open-weight models lowers the barrier to deploying production AI without locking into a single closed-API vendor.
Whether Baseten's valuation can keep pace with its revenue remains the open question. With inference compute demand still climbing and the round not yet formally closed, the company is betting that the market for serving AI will only get bigger.
Source: TechCrunch