Apple is accelerating its artificial intelligence (AI) ambitions by integrating AWS’s Trainium2 chips into its AI pretraining processes.
During the AWS re:Invent conference on December 4, Benoit Dupin, Apple’s senior director of machine learning and AI, discussed the company’s early evaluations of AWS hardware. “In early stages of evaluating Trainium2, we expect early numbers up to 50% improvement in efficiency with pretraining,” Dupin told the audience during a brief on-stage appearance.
The partnership underscores a deepening relationship between Apple and AWS, with the tech giant relying on Trainium2-powered UltraServers to scale its AI operations while optimizing efficiency and costs.
Apple and AWS Are Long-Term Partners
Apple’s use of AWS infrastructure is not new. The company has relied on AWS for more than a decade, and in recent years has used AWS chips such as Graviton and Inferentia to support key products and services, including Siri, Apple Maps, and Apple Music.
AWS’s ability to support large-scale AI workloads has made it a critical partner for Apple. As Dupin put it, “We have a strong relationship, and the infrastructure is both reliable and able to serve our customers worldwide.”
The addition of Trainium2 to Apple’s AI toolkit reflects both companies’ commitment to pushing the boundaries of AI efficiency and scalability.
Trainium2 Chips and UltraServers: Meeting the Needs of Modern AI
AWS launched its Trainium2 chips and Trn2 UltraServers yesterday, marking a major milestone in AI hardware development. Trainium2 chips deliver up to 20.8 petaflops of dense FP8 compute per instance and are designed to handle the increasing computational demands of trillion-parameter AI models.
The Trn2 UltraServers, featuring 64 Trainium2 chips, achieve up to 83.2 petaflops of dense FP8 performance. This is made possible by AWS’s proprietary NeuronLink interconnect, which ensures low-latency, high-bandwidth communication across distributed systems.
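As a rough sanity check, the UltraServer figure scales linearly from the per-instance number. The sketch below assumes 16 Trainium2 chips per Trn2 instance (a detail not stated in the article) and that an UltraServer is four such instances linked together:

```python
# Back-of-the-envelope check of the Trainium2 figures quoted above.
# Assumption (not in the article): a Trn2 instance contains 16 chips,
# so a 64-chip UltraServer is four instances linked via NeuronLink.
CHIPS_PER_INSTANCE = 16          # assumed Trn2 instance size
INSTANCE_PETAFLOPS = 20.8        # dense FP8 per instance, from the article
CHIPS_PER_ULTRASERVER = 64       # from the article

per_chip = INSTANCE_PETAFLOPS / CHIPS_PER_INSTANCE       # ~1.3 PF per chip
ultraserver = per_chip * CHIPS_PER_ULTRASERVER           # ~83.2 PF total
print(f"{per_chip:.1f} PF per chip, {ultraserver:.1f} PF per UltraServer")
```

Under those assumptions, the 83.2-petaflop UltraServer number falls directly out of the 20.8-petaflop per-instance figure, with NeuronLink providing the interconnect that makes the four instances behave as one system.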
David Brown, AWS’s Vice President of Compute and Networking, emphasized the transformative potential of Trainium2: “Trainium2 is the highest performing AWS chip created to date. And with models approaching trillions of parameters, we knew customers would need a novel approach to train and run those massive models.”
AWS demonstrated Trainium2’s capabilities by running Meta’s Llama 405B model, achieving three times higher token-generation throughput than competing offerings from other cloud providers. This breakthrough addresses a critical need for faster text generation, summarization, and real-time inference.
Apple Intelligence and the Multi-Cloud Approach
Apple’s AI strategy revolves around its generative AI platform, Apple Intelligence, which powers features like natural language processing in Siri, advanced notification summaries, and creative tools like emoji generation.
Apple Intelligence operates on a hybrid model, utilizing on-device computations via its M-series chips for privacy and efficiency, while relying on cloud infrastructure for complex workloads.
This multi-cloud approach includes both AWS and Google Cloud. Earlier this year, Apple confirmed its use of Google TPU chips for training components of Apple Intelligence. This diversified strategy allows Apple to optimize specific workloads based on the strengths of each platform.
With Trainium2, AWS provides a cost-effective alternative to Nvidia GPUs, enabling Apple to scale its AI operations without compromising performance.
Project Rainier: AWS’s Collaboration with Anthropic
AWS’s broader AI ambitions include Project Rainier, a partnership with Anthropic to develop one of the world’s largest AI compute clusters. Featuring thousands of Trainium2 chips, Project Rainier is designed to deliver unprecedented scalability for generative AI.
Anthropic, the company behind the Claude 3.5 Sonnet language model, plans to use the cluster to scale its model training fivefold. AWS’s investment in Anthropic, which now totals $8 billion, underscores its commitment to fostering innovation in AI infrastructure.
By supporting both Apple and Anthropic, AWS demonstrates its ability to cater to diverse AI workloads, from pretraining to real-time inference.
AWS Trainium3 and the Future of AI Hardware
AWS is already looking ahead to its next-generation chip, Trainium3, slated for release in late 2025. Built on a three-nanometer process, Trainium3 promises a fourfold performance improvement over Trainium2. This development will enable even larger AI models and faster training times, reinforcing AWS’s position as a leader in AI hardware.
The Ultracluster, a supercomputer based on Trainium3, will further enhance AWS’s capabilities. AWS describes it as the world’s largest AI compute cluster, capable of handling trillion-parameter models with unmatched efficiency.
These advancements reflect AWS’s strategic vision to challenge Nvidia’s dominance in the AI hardware market, offering enterprises like Apple and Anthropic cost-effective, scalable solutions tailored to their needs.
The Industry Shift Toward Custom Silicon
Apple’s adoption of AWS Trainium2 chips is part of a broader industry trend toward custom silicon for AI workloads. Companies are increasingly moving away from traditional GPU-based solutions in favor of hardware designed specifically for AI applications.
AWS’s integrated approach, combining hardware like Trainium2 with tools such as the Neuron SDK, positions it as a viable alternative to Nvidia. By investing in custom silicon, Apple and AWS are driving innovation in AI infrastructure, paving the way for more efficient and scalable solutions.