// news · compute · industry · 2026-05-25 · source: aws / networkworld / cerebras

AWS becomes first hyperscaler to deploy Cerebras chips in its own data centers — Trainium for prefill, CS-3 for decode, both via Amazon Bedrock

AWS is now the first hyperscaler to deploy Cerebras CS-3 wafer-scale chips inside its own data centers, pairing them with Trainium accelerators in a split-architecture inference pipeline: Trainium handles the prefill (initial context ingestion), CS-3 handles the decode (token generation). The integration is exposed through Amazon Bedrock — meaning enterprise customers can opt into the wafer-scale inference path without changing their API calls.

The architectural split is the news. Prefill and decode are different workloads at the hardware level: prefill is matmul-dominant and benefits from training-class FLOPS, while decode is memory-bandwidth-dominant and benefits from wafer-scale interconnect. AWS has been one of the few hyperscalers with a credible non-NVIDIA training accelerator in Trainium; pairing it with CS-3, arguably the strongest non-NVIDIA inference accelerator, creates an internal NVIDIA-free path for the fraction of customers that wants it. Bedrock customers don't need to know: the routing happens below the API surface.
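To see why the two phases favor different hardware, a rough roofline-style estimate helps. The sketch below is illustrative only: it assumes a dense transformer that spends about 2 FLOPs per parameter per token and reads its weights once per forward pass, and it ignores attention and KV-cache traffic. The `Accelerator` numbers are hypothetical and do not describe Trainium, CS-3, or any real part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accelerator:
    """Hypothetical accelerator; the figures used below are made up."""
    peak_flops: float     # peak compute, FLOP/s
    mem_bandwidth: float  # peak memory bandwidth, bytes/s

    @property
    def ridge_point(self) -> float:
        # Arithmetic intensity (FLOP/byte) at which the chip stops being
        # limited by memory bandwidth and starts being limited by compute.
        return self.peak_flops / self.mem_bandwidth


def arithmetic_intensity(tokens_per_pass: int, bytes_per_param: float) -> float:
    """FLOPs per byte of weights read in one forward pass.

    Assumes ~2 FLOPs per parameter per token and one read of every weight
    per pass, so the parameter count cancels out of the ratio.
    """
    return 2.0 * tokens_per_pass / bytes_per_param


def bottleneck(intensity: float, hw: Accelerator) -> str:
    return "compute-bound" if intensity >= hw.ridge_point else "memory-bound"


hw = Accelerator(peak_flops=400e12, mem_bandwidth=2e12)  # hypothetical chip

# Prefill: a 2048-token prompt goes through the model in a single pass.
prefill = arithmetic_intensity(tokens_per_pass=2048, bytes_per_param=2)
# Decode: one new token per pass for a single request (batch size 1).
decode = arithmetic_intensity(tokens_per_pass=1, bytes_per_param=2)

print("prefill:", prefill, "FLOP/byte ->", bottleneck(prefill, hw))
print("decode: ", decode, "FLOP/byte ->", bottleneck(decode, hw))
```

Under these assumptions prefill lands at 2048 FLOP/byte and decode at 1 FLOP/byte, against a ridge point of 200 FLOP/byte for the made-up chip. Prefill is limited by compute and decode by how fast weights can be streamed, which is the asymmetry the split pipeline is built around. Batching many decode requests raises decode intensity, so real deployments sit somewhere between the two extremes.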

The competitive consequence: Microsoft Azure has Maia + NVIDIA. Google Cloud has TPU + NVIDIA. AWS has Trainium + Cerebras + NVIDIA. The Cerebras path is the differentiator: neither of the other two clouds has a wafer-scale option at production scale. If the AWS-Cerebras price per token undercuts the all-NVIDIA path for typical inference workloads by even 20%, that pricing pressure compounds across every Bedrock customer's monthly bill. NVIDIA's data-center revenue, which is concentrated in hyperscaler customers (covered in yesterday's PM cycle), is now exposed to a real per-token alternative at the largest cloud. That's the kind of structural change that takes a couple of quarters to show up in NVIDIA's customer-mix numbers.
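The 20% scenario is easy to put into numbers. The sketch below is a back-of-the-envelope calculation, not a pricing model: the token volume, the base price per million tokens, and the share of traffic routed to the cheaper path are all hypothetical inputs, and no actual Bedrock pricing is implied.

```python
def monthly_bill(tokens: float, price_per_million: float) -> float:
    """Cost in dollars for a month's tokens at a flat per-million price."""
    return tokens / 1_000_000 * price_per_million


def blended_bill(tokens: float, base_price: float,
                 discount: float, routed_share: float) -> float:
    """Monthly cost when `routed_share` of tokens uses a path priced
    `discount` below `base_price` and the rest stays at `base_price`."""
    routed = tokens * routed_share
    cheaper_price = base_price * (1.0 - discount)
    return (monthly_bill(routed, cheaper_price)
            + monthly_bill(tokens - routed, base_price))


# Hypothetical customer: 5 billion tokens a month at $2.00 per million.
tokens, base_price = 5e9, 2.00

baseline = monthly_bill(tokens, base_price)
all_routed = blended_bill(tokens, base_price, discount=0.20, routed_share=1.0)
half_routed = blended_bill(tokens, base_price, discount=0.20, routed_share=0.5)

print(f"baseline:    ${baseline:,.0f}")
print(f"all routed:  ${all_routed:,.0f}")
print(f"half routed: ${half_routed:,.0f}")
```

With these made-up inputs the bill falls from about $10,000 to about $8,000 if all traffic takes the cheaper path, and to about $9,000 if half does. The saving scales linearly with both the discount and the routed share, so the aggregate effect depends on how much Bedrock traffic actually moves.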

See our analysis →

Network World — OpenAI turns to Cerebras in a mega deal to scale AI inference infrastructure → · eWeek — Cerebras targets 33B IPO challenging Nvidia → · MLQ — AI Chips and Accelerators →