Vladislav Zaimov stands at the intersection of traditional telecommunications and the rapidly evolving world of cloud-native infrastructure. With a career dedicated to securing vulnerable networks and optimizing enterprise-grade communications, he has witnessed the industry’s turbulent transition from proprietary black boxes to the promised land of the cloud. As a specialist in risk management, Zaimov offers a grounded perspective on why major players like AWS are currently recalibrating their multi-billion dollar strategies. His insights bridge the gap between high-level architectural shifts and the gritty, physical realities of radio frequency processing.
This conversation explores the recent strategic retreat of hyperscalers from custom radio hardware and the financial implications of a shrinking RAN market. We delve into the technical battle between integrated accelerators and general-purpose CPUs, the massive impact of Nvidia’s billion-dollar entry into the space, and the persistent hurdles that keep radio workloads out of the public cloud.
AWS recently moved away from Graviton3 hardware to focus on the CaaS layer and AI in the RAN. What technical hurdles in Layer 1 processing drove this shift, and how does a CaaS-first strategy change the way operators deploy AI-driven radio functions?
The pivot away from Graviton3 wasn’t just a business decision; it was a surrender to the sheer physics of Layer 1 processing. Layer 1, the physical layer, is the most demanding slice of the RAN stack, particularly when it comes to Forward Error Correction (FEC). While AWS initially bet that its Arm-based Graviton3 could handle these resource-hungry tasks using high core counts and SVE2 vector processing, the real-world performance simply couldn’t match dedicated hardware. By shifting to a CaaS-first strategy built on its Elastic Kubernetes Service (EKS), AWS is essentially admitting that the underlying hardware should be a commodity, whether it comes from Intel, Dell, or others, while AWS provides the intelligent software “glue.” That lets operators bolt AI solutions directly onto the container layer and treat radio functions like flexible IT workloads rather than rigid, purpose-built appliances. It’s a move toward a software-defined future that quietly shreds the old hardware-centric blueprints in favor of algorithmic agility.
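To make the “flexible IT workload” idea concrete, here is a minimal sketch of deploying a hypothetical containerized distributed unit (DU) to an EKS cluster using the official Kubernetes Python client. The image name, node-selector label, and accelerator resource key are illustrative assumptions, not AWS or vendor APIs.

```python
# Sketch: a containerized RAN distributed unit (DU) deployed as an ordinary
# Kubernetes workload. Image name, labels, and the FEC device-plugin
# resource key are hypothetical placeholders.
from kubernetes import client, config

config.load_kube_config()  # assumes kubeconfig already points at the EKS cluster

du = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="du-layer1"),
    spec=client.V1DeploymentSpec(
        replicas=1,
        selector=client.V1LabelSelector(match_labels={"app": "du-layer1"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "du-layer1"}),
            spec=client.V1PodSpec(
                # Pin the pod to nodes that actually carry an FEC accelerator.
                node_selector={"ran.example.com/fec-accelerator": "true"},
                containers=[
                    client.V1Container(
                        name="du",
                        image="registry.example.com/ran/du:1.0",  # hypothetical image
                        resources=client.V1ResourceRequirements(
                            # Guaranteed CPUs plus a device-plugin resource for the
                            # accelerator; the resource key is an assumed name.
                            requests={"cpu": "16", "memory": "32Gi",
                                      "intel.com/intel_fec_acc100": "1"},
                            limits={"cpu": "16", "memory": "32Gi",
                                    "intel.com/intel_fec_acc100": "1"},
                        ),
                    )
                ],
            ),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace="ran", body=du)
```

In this model the radio function is just another pod with unusual scheduling constraints, which is precisely the “software glue” role a CaaS-first hyperscaler wants to own.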
The RAN market saw revenues plunge by $10 billion over just two years as 5G investment cooled. Given this contraction, what financial risks do hyperscalers face when designing custom silicon, and what benchmarks must new hardware meet to compete with established platforms?
The financial landscape for custom silicon has become incredibly treacherous, with Omdia reporting a drop from $45 billion in 2022 to a stagnant $35 billion by 2024. When you see $10 billion evaporate from the market in such a short window, the massive R&D costs required to develop a chip like Graviton3 for a specific niche like RAN become almost impossible to justify. Hyperscalers face the “valley of death” where their custom designs might fall between major waves of operator investment, leaving them with sophisticated silicon and no buyers. To compete with the x86 dominance of Intel, any new hardware must not only match the core count and vector-processing capabilities—like Intel’s AVX-512—but also provide a specialized FEC accelerator that doesn’t drain the main CPU’s power. Without meeting these strict benchmarks for power efficiency and latency-sensitive processing, custom chips are doomed to remain experimental curiosities rather than carrier-grade staples.
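A rough way to see that “valley of death” in numbers is a back-of-envelope break-even calculation for a custom RAN chip against the shrinking market described above. Every figure except the Omdia market sizes is an assumed placeholder.

```python
# Back-of-envelope: how many sockets a custom RAN chip must sell to recover
# its R&D spend. All figures below are illustrative assumptions.

nre_cost = 500e6          # assumed non-recurring engineering cost for a custom SoC
margin_per_unit = 800.0   # assumed gross margin per socket sold ($)

break_even_units = nre_cost / margin_per_unit
print(f"Break-even volume: {break_even_units:,.0f} sockets")

# Set that against the shrinking market (Omdia figures from the text):
ran_market_2022 = 45e9
ran_market_2024 = 35e9
# Assume virtualized/cloud RAN silicon is a single-digit slice of total RAN spend.
addressable_share = 0.05
addressable_2024 = ran_market_2024 * addressable_share
print(f"Assumed addressable silicon spend, 2024: ${addressable_2024 / 1e9:.1f}B")
```

Under these assumptions a chip needs 625,000 sockets to break even inside an addressable pool of under $2 billion a year, which is exactly why a contraction of $10 billion makes niche silicon so hard to justify.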
Intel utilizes integrated vRAN Boost accelerators, while others try to rely on high-core-count CPUs. Can you detail the performance trade-offs of skipping a discrete accelerator, and what infrastructure changes are needed to handle demanding Layer 1 tasks?
Skipping a discrete accelerator is like trying to run a marathon while carrying a heavy backpack; you might have the “muscles,” or core count, to do it, but you’ll burn out far faster than someone using specialized gear. Intel’s vRAN Boost is an integrated FEC accelerator that offloads the most grueling calculations, whereas a general-purpose CPU, even one with a high core count, must burn a significant share of its cycles on FEC alone. To handle Layer 1 without that specialized help, an operator would need to significantly over-provision their infrastructure, leading to massive power consumption and heat issues at the edge of the network. We’ve seen Nokia take a different route by offloading all Layer 1 functions to custom Marvell chips, which highlights the industry consensus that “general purpose” has its limits. The infrastructure change required is a total rethink of the server rack: moving away from simply dropping a general-purpose appliance like AWS Outposts at the site, and toward hybrid configurations where the CPU handles Layers 2 and 3 while a dedicated card manages the brutal mathematics of the radio wave.
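A toy capacity model makes the trade-off visible. The per-bit decode cost, clock rate, and cell throughput below are assumed placeholders, not benchmarks, but they show how FEC alone can consume half a server unless it is offloaded.

```python
# Toy capacity model: share of a server's CPU burned on forward error
# correction (FEC) when Layer 1 runs entirely in software versus with a
# discrete accelerator. Every constant is an assumed placeholder.

CORES = 32                    # cores on the hypothetical server
CYCLES_PER_CORE = 2.5e9       # sustained cycles per second per core
LDPC_CYCLES_PER_BIT = 10      # assumed software LDPC decode cost, cycles/bit
CELL_THROUGHPUT_BPS = 4e9     # aggregate bits/second needing FEC treatment

def cores_for_fec(offload_fraction: float) -> float:
    """Cores consumed by FEC when a given fraction is offloaded to an accelerator."""
    software_bits = CELL_THROUGHPUT_BPS * (1.0 - offload_fraction)
    return software_bits * LDPC_CYCLES_PER_BIT / CYCLES_PER_CORE

for offload in (0.0, 0.9, 1.0):
    used = cores_for_fec(offload)
    print(f"offload={offload:.0%}: {used:.1f} of {CORES} cores on FEC "
          f"({used / CORES:.0%} of the server)")
```

Under these assumptions, running FEC purely in software eats 16 of 32 cores before a single packet is scheduled, which is the over-provisioning problem in miniature.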
Nokia is making a massive pivot toward Nvidia’s GPU-based architecture. How does this $1 billion investment affect the competition between Arm and x86, and what are the practical implications for operators trying to maintain vendor neutrality?
Nokia’s $1 billion bet on Nvidia is a seismic shift that complicates the traditional Arm versus x86 rivalry by introducing a third, dominant power: the GPU. By designing future 5G and 6G software to run on Nvidia’s mixture of GPUs and Arm-based CPUs, Nokia is essentially creating a new ecosystem that favors AI-heavy workloads over traditional signal processing. This pivot puts immense pressure on both Intel and traditional Arm chipmakers to prove they can handle the emerging “agentic AI” demands of modern networks. For operators, this is a double-edged sword; while it offers incredible performance for AI-driven radio, it threatens the very “vendor neutrality” that Open RAN was supposed to provide. If the network becomes dependent on a specific GPU architecture to function efficiently, the dream of swapping hardware like Lego bricks becomes much harder to realize, as the software becomes tightly coupled with Nvidia’s proprietary libraries.
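The coupling risk can be illustrated with a small sketch: if Layer 1 code is written against one thin, vendor-neutral interface, the accelerator stays swappable; if it calls a vendor’s libraries directly throughout the stack, it does not. The class and method names below are illustrative, not any real vendor SDK.

```python
# Sketch of the "vendor neutrality" problem: confine vendor coupling to one
# seam so the rest of the Layer 1 pipeline never names a specific chip.
from abc import ABC, abstractmethod

class FecAccelerator(ABC):
    """Minimal contract the RAN software codes against."""

    @abstractmethod
    def decode_ldpc(self, soft_bits: bytes) -> bytes: ...

class IntelVranBoost(FecAccelerator):
    def decode_ldpc(self, soft_bits: bytes) -> bytes:
        # Would enqueue the job on the integrated accelerator via its driver.
        raise NotImplementedError("placeholder for the vendor driver call")

class NvidiaGpuFec(FecAccelerator):
    def decode_ldpc(self, soft_bits: bytes) -> bytes:
        # Would launch a GPU kernel via the vendor's Layer 1 libraries.
        raise NotImplementedError("placeholder for the vendor library call")

def process_transport_block(accel: FecAccelerator, soft_bits: bytes) -> bytes:
    # The pipeline stays vendor-agnostic; only the injected object changes.
    return accel.decode_ldpc(soft_bits)
```

The Lego-brick dream survives only as long as that seam does; once performance tuning pulls vendor calls above it, the operator is locked in.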
Many operators use cloud servers for core functions but stay away from the public cloud for the RAN. What specific barriers prevent this migration, and are there examples of how difficult it is to achieve carrier-grade reliability?
The barriers to moving RAN to the public cloud are a mix of cold physics and rigid regulation. Unlike core network functions, which can tolerate a bit of latency, the RAN requires millisecond-level precision that the public cloud struggles to guarantee consistently across vast distances. We saw a stark example of this with EchoStar-owned Dish Network, which attempted to build a cloud-based open RAN across the United States, only to see losses escalate to the point where it had to decommission parts of the network and sell spectrum licenses. Even major players like Telefónica Germany, who are comfortable using AWS Outposts for their core network, refuse to touch the public cloud for their radio workloads due to performance and sovereignty concerns. There is a visceral fear among operators that a software glitch or a latency spike in a distant data center could silence thousands of cell sites simultaneously, a risk that “carrier-grade” tradition simply won’t tolerate.
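The physics side of that barrier is easy to quantify with a speed-of-light budget check. The round-trip budget, processing time, and fiber propagation figure below are rough, illustrative assumptions, but they show why RAN servers must sit near the cell site.

```python
# Why the RAN can't ride a distant public-cloud region: a speed-of-light
# budget check. All three constants are rough, illustrative assumptions.

FIBER_KM_PER_MS = 200.0   # light in fiber covers roughly 200 km per millisecond
ROUND_TRIP_BUDGET_MS = 3.0  # assumed HARQ-style processing + transport budget
PROCESSING_MS = 2.0         # assumed time consumed by L1/L2 processing itself

transport_budget_ms = ROUND_TRIP_BUDGET_MS - PROCESSING_MS
max_one_way_km = (transport_budget_ms / 2.0) * FIBER_KM_PER_MS
print(f"Max cell-site-to-server distance: ~{max_one_way_km:.0f} km")
# ~100 km under these assumptions: the server must sit at the network edge,
# not in a regional cloud data center hundreds of kilometers away.
```

No amount of software cleverness buys back propagation delay, which is why the core can live in a distant region while the radio cannot.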
What is your forecast for cloud RAN?
I believe the future of cloud RAN lies in a “hybrid-intelligent” model rather than a total migration to the public cloud. We will see a permanent split where the heavy lifting of Layer 1 remains on specialized, accelerated hardware at the edge, while the orchestration and AI-driven optimization move to the CaaS layers managed by hyperscalers. The $10 billion market contraction has been a wake-up call, signaling that the industry isn’t ready for a pure “off-the-shelf” CPU approach just yet. Expect to see more partnerships like the Nokia-Nvidia tie-up, where the “cloud” part of the RAN is less about where the server sits and more about how the software is managed and updated. Ultimately, the successful operators will be those who stop trying to force the radio into a generic box and instead embrace a specialized, accelerated architecture that treats the radio wave with the respect its complexity deserves.
