
OpenAI has moved to significantly expand its artificial intelligence infrastructure by partnering with chipmaker Cerebras Systems, signalling a strategic push to scale inference capacity as demand for real-time AI services continues to surge. The agreement reflects a broader shift in AI infrastructure priorities, where running models efficiently at scale is becoming as critical as training them.
The multi-year deal focuses on deploying large volumes of specialised computing systems designed specifically for inference workloads. Cerebras is expected to provide up to 750 megawatts of wafer-scale computing capacity between 2026 and 2028, representing one of the largest dedicated inference infrastructure commitments in the AI sector. The scale highlights how inference, rather than training, is increasingly driving data centre expansion and long-term capacity planning.
Cerebras’ technology differs from conventional GPU-based clusters by integrating compute cores, memory, and interconnect onto a single wafer-scale chip. This architecture is designed to reduce data movement and latency, both of which are key constraints for interactive AI applications. For OpenAI, faster and more predictable inference performance supports smoother user experiences as model usage grows across consumer and enterprise services.
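The data-movement constraint can be illustrated with rough arithmetic. In low-batch autoregressive decoding, every model weight is typically streamed from memory once per generated token, so memory bandwidth sets a floor on per-token latency; keeping weights in fast on-chip memory raises that bandwidth and lowers the floor. The sketch below uses entirely illustrative figures (model size, byte widths, and bandwidths are assumptions, not Cerebras or OpenAI specifications):

```python
# Back-of-envelope: why memory bandwidth bounds per-token inference latency.
# Every figure here is an illustrative assumption, not a vendor specification.

def per_token_latency_s(weight_bytes: float, mem_bw_bytes_s: float) -> float:
    """Lower bound on seconds per generated token when each weight must be
    streamed from memory once per token (the memory-bound regime typical
    of low-batch autoregressive decoding)."""
    return weight_bytes / mem_bw_bytes_s

weight_bytes = 70e9 * 2      # hypothetical 70B-parameter model, 2 bytes/param
offchip_bw   = 3e12          # assumed off-chip (HBM-class) bandwidth: 3 TB/s
onchip_bw    = 1e15          # assumed aggregate on-chip SRAM bandwidth: 1 PB/s

t_offchip = per_token_latency_s(weight_bytes, offchip_bw)
t_onchip  = per_token_latency_s(weight_bytes, onchip_bw)

print(f"off-chip bound: {t_offchip * 1e3:.1f} ms/token")   # ~46.7 ms/token
print(f"on-chip bound:  {t_onchip * 1e3:.2f} ms/token")    # ~0.14 ms/token
```

Under these assumed numbers, the off-chip bound caps generation at roughly 20 tokens per second, while the on-chip bound sits orders of magnitude lower; real systems add compute, batching, and networking effects on top, but the bandwidth term is why architectures that minimise off-chip traffic matter for interactive workloads.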
The partnership also underscores OpenAI’s infrastructure diversification strategy. Rather than relying on a single hardware supplier, the company is assembling a heterogeneous compute stack to balance performance, cost efficiency, and energy consumption. As inference workloads scale rapidly, infrastructure decisions are increasingly shaped by power availability, cooling requirements, and the ability to deliver consistent throughput at scale.
From a data centre perspective, integrating wafer-scale systems introduces new operational challenges. Facilities must adapt to higher power densities and manage mixed hardware environments, placing greater emphasis on advanced cooling, network orchestration, and system resilience.
The agreement highlights intensifying competition for AI infrastructure. As access to specialised compute becomes a strategic asset, long-term partnerships with silicon providers are emerging as a critical factor in sustaining growth and maintaining service reliability.