Amazon’s AI Chips Win Over Uber, Marking Latest Major Customer Adoption
While Uber once ran its own data centers, it is now expanding its AWS contract to run ride‑sharing features on Amazon’s Graviton CPUs and to trial the new Trainium 3 AI chip, TechCrunch reports.
Key Facts
- Key company: Amazon
- Also mentioned: Google, Nvidia, Oracle
Amazon’s expansion of Uber’s AWS contract signals a decisive shift in the ride‑hailing firm’s compute strategy, moving critical workloads from on‑premises data centers onto Amazon’s ARM‑based Graviton CPUs and the newly announced Trainium 3 AI accelerator. According to TechCrunch, the deal “will particularly expand its use of AWS’s Graviton (a low‑power, ARM‑based server CPU) and start a new trial testing Trainium 3, AWS’s Nvidia competitor AI chip.” Graviton’s 64‑bit ARM architecture offers a lower TDP and higher core density than traditional x86 processors, which aligns with Uber’s need to run latency‑sensitive microservices for real‑time ride matching and pricing. The Trainium 3 trial, meanwhile, is intended to offload inference for the demand‑forecasting models that power surge pricing and driver‑allocation algorithms, giving Uber a purpose‑built silicon path that bypasses third‑party GPUs.
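To ground the Graviton half of the deal, the sketch below shows one common way to provision an ARM‑based Graviton instance with Python’s boto3 SDK. It is illustrative only: the region, the Amazon Linux AMI filter, and the `c7g.xlarge` (Graviton3) instance type are assumptions, not details from the TechCrunch report or Uber’s actual deployment.

```python
# Minimal sketch: launching a Graviton (arm64) EC2 instance with boto3.
# Region, AMI filter, and instance type are illustrative assumptions.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Graviton instances require AMIs built for the arm64 architecture,
# so filter Amazon's own images down to recent available arm64 builds.
images = ec2.describe_images(
    Owners=["amazon"],
    Filters=[
        {"Name": "architecture", "Values": ["arm64"]},
        {"Name": "name", "Values": ["al2023-ami-2023*"]},
        {"Name": "state", "Values": ["available"]},
    ],
)
latest = max(images["Images"], key=lambda img: img["CreationDate"])

# c7g.* is a Graviton3 compute-optimized family; a latency-sensitive
# microservice fleet would normally sit behind an autoscaling group
# rather than a one-off launch like this.
resp = ec2.run_instances(
    ImageId=latest["ImageId"],
    InstanceType="c7g.xlarge",
    MinCount=1,
    MaxCount=1,
)
print(resp["Instances"][0]["InstanceId"])
```

Because arm64 and x86_64 AMIs are distinct artifacts, container images and native dependencies must also be rebuilt for ARM, which is the practical cost of the consolidation described above.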
The move is noteworthy because it cuts against Uber’s 2023 migration to Oracle Cloud Infrastructure (OCI) and Google Cloud Platform (GCP), where the company “took on the dual challenge of shifting massive workloads and introducing Arm‑powered compute instances into a previously x86‑dominated environment,” as Uber explained in its December blog post. At the time, Uber highlighted its use of Ampere‑based ARM chips in Oracle’s cloud, a choice shaped by Oracle’s one‑third ownership stake in Ampere, the startup founded by Renee James. TechCrunch notes that “Oracle sold its stake for a handsome $2.7 billion pre‑tax gain” after SoftBank’s acquisition of Ampere, and that Oracle has since pivoted toward buying chips, primarily Nvidia GPUs, for its data‑center strategy. By shifting to AWS, Uber sidesteps both Oracle’s and Google’s competing silicon roadmaps and aligns with Amazon’s vertically integrated hardware stack.
From a technical perspective, Graviton’s design leverages recent ARM Neoverse cores, which pair a high‑throughput memory subsystem with wide SIMD support via the Scalable Vector Extension (SVE), ARM’s counterpart to x86 AVX. This architecture lets Uber consolidate multiple microservice containers onto a single physical host, reducing inter‑node network hops and improving cache locality for request‑routing pipelines. Trainium 3, the third generation of Amazon’s custom AI accelerator line, builds on the matrix‑multiply units introduced in the original Trainium, reportedly offering up to 2× higher throughput per watt for transformer‑based models. According to the TechCrunch report, the trial will evaluate Trainium 3’s ability to handle “ride‑sharing features” that rely on deep‑learning inference, suggesting that Uber plans to replace GPU‑based inference clusters with purpose‑built ASICs for cost and latency gains.
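The report does not say which toolchain Uber would use, but the standard path for running PyTorch models on Trainium today is the AWS Neuron SDK’s torch‑neuronx package. The sketch below compiles a toy stand‑in for a demand‑forecasting model with `torch_neuronx.trace`; the model, shapes, and file name are hypothetical, executing it requires a Trainium (trn‑family) instance with the Neuron runtime installed, and Trainium 3 tooling specifics may differ.

```python
# Sketch: ahead-of-time compiling a PyTorch model for Trainium NeuronCores.
# Assumed workflow (AWS Neuron SDK / torch-neuronx), not confirmed by the
# report; the tiny model below is a hypothetical stand-in, not Uber's.
import torch
import torch.nn as nn
import torch_neuronx  # ships with the AWS Neuron SDK on trn instances


class TinyForecaster(nn.Module):
    """Toy regression head standing in for a demand-forecasting model."""

    def __init__(self, n_features: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


model = TinyForecaster().eval()
example = torch.rand(8, 16)  # batch of 8 feature vectors

# Compile the traced graph for the NeuronCore accelerators; the result
# is a TorchScript module that can be saved and reloaded for serving.
traced = torch_neuronx.trace(model, example)
torch.jit.save(traced, "forecaster_neuron.pt")

served = torch.jit.load("forecaster_neuron.pt")
print(served(example).shape)  # torch.Size([8, 1])
```

Whether such a port actually beats GPU inference on latency and cost per request is exactly what a trial like Uber’s would have to measure.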
Strategically, the partnership underscores Amazon’s broader ambition to challenge Nvidia’s dominance in the AI accelerator market, though TechCrunch tempers that reading: the arrangement “is a bit less about a long‑term threat to Nvidia than it is a thorough thumbing of the nose by Amazon at AWS’s cloud competitors,” chiefly Google and Oracle. By offering a unified stack, with Graviton for general compute and Trainium 3 for AI inference, AWS can present a compelling value proposition to enterprises that have traditionally relied on heterogeneous hardware environments. Uber’s adoption serves as a high‑visibility case study that could pull other data‑intensive workloads, such as fraud detection and route optimization, onto Amazon’s silicon.
The broader industry context is shaped by the tangled relationships among chipmakers, cloud providers, and AI model developers. TechCrunch outlines how “Oracle, Softbank, and Nvidia are also part of OpenAI’s orbit of circular deals” that fund massive data‑center build‑outs, while Amazon simultaneously courts AI‑heavy customers with its own in‑house silicon. Uber’s shift therefore reflects not only a technical preference for ARM‑based efficiency and custom AI acceleration but also a strategic alignment with a cloud vendor that is aggressively expanding its hardware portfolio to compete with the entrenched GPU ecosystem. If the Trainium 3 trial demonstrates measurable improvements in inference latency and cost, Uber may move from trial to full‑scale deployment, further cementing AWS’s position as a one‑stop shop for both general‑purpose and AI‑specific workloads.
Sources
TechCrunch’s reporting on the Uber‑AWS expansion; Uber’s December engineering blog post on its Oracle and Google Cloud migration.