
AWS Launches Trainium2 AI Chips for LLMs; Trainium3 Set for 2025

AWS introduces Trainium3 chips, promising a fourfold performance boost, alongside the launch of Trainium2-powered UltraServers.


AWS has taken a significant leap in AI infrastructure by unveiling its second-generation Trainium2 chips and introducing Trn2 UltraServers, designed to push the boundaries of large language model (LLM) performance.

Announced at the re:Invent conference, these developments position AWS as a formidable player in the rapidly evolving AI landscape. AWS also previewed Trainium3, a next-generation chip promising a fourfold performance increase, slated for release in late 2025.

David Brown, Vice President of Compute and Networking at AWS, described the significance of these innovations:
 
“Trainium2 is the highest performing AWS chip created to date. And with models approaching trillions of parameters, we knew customers would need a novel approach to train and run those massive models. The new Trn2 UltraServers offer the fastest training and inference performance on AWS for the world’s largest models.”

Trainium2: Unlocking Unprecedented Compute Power

Trainium2 chips are engineered to meet the rising computational demands of modern AI models, offering up to 20.8 petaflops of dense FP8 compute per instance.

Each EC2 Trn2 instance integrates 16 Trainium2 chips connected via AWS’s proprietary NeuronLink interconnect, which ensures low-latency, high-bandwidth communication. This architecture allows for seamless scalability, critical for training and deploying expansive LLMs.
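As a quick sanity check on the figures above, the per-chip dense FP8 throughput implied by the instance spec can be worked out directly (simple arithmetic on the article's numbers; AWS does not publish this as a standalone per-chip figure here):

```python
# Per-chip dense FP8 compute implied by the Trn2 instance figure:
# 20.8 petaflops across 16 Trainium2 chips (both numbers from the article).
instance_pflops = 20.8
chips_per_instance = 16

per_chip_pflops = instance_pflops / chips_per_instance
print(per_chip_pflops)  # 1.3 petaflops of dense FP8 per chip
```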

AWS showcased Trainium2’s capabilities with Meta’s Llama 405B model, which achieved “three times higher token-generation throughput compared to other available offerings by major cloud providers,” according to the company.

This enhancement significantly accelerates tasks like text generation, summarization, and real-time inference, addressing the demands of businesses relying on generative AI.
 

UltraServers: Scaling Beyond Limits

For enterprises tackling trillion-parameter models, AWS introduced Trn2 UltraServers, which combine 64 Trainium2 chips to deliver up to 83.2 petaflops of sparse FP8 performance.

The servers enable faster training times and real-time inference for ultra-large AI models, representing a critical advancement for businesses aiming to deploy highly complex systems.
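The scaling step from a 16-chip Trn2 instance to a 64-chip UltraServer can be checked the same way. Note that the article quotes the instance figure in dense FP8 and the UltraServer figure in sparse FP8, so the two are not directly comparable:

```python
# UltraServer scaling, using the article's figures:
# 64 Trainium2 chips per UltraServer, 83.2 petaflops of sparse FP8 total.
chips_per_instance = 16
chips_per_ultraserver = 64
ultraserver_sparse_pflops = 83.2

# An UltraServer spans the equivalent of four 16-chip instances.
print(chips_per_ultraserver // chips_per_instance)   # 4

# Sparse FP8 throughput per chip implied by the UltraServer figure.
print(ultraserver_sparse_pflops / chips_per_ultraserver)  # 1.3
```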

Gadi Hutt, Senior Director at Annapurna Labs, explained the UltraServers’ capabilities:
“Next, we break that [16-chip] boundary and provide 64 chips in the UltraServer, and that is for extremely large models. So if you have a 7 billion-parameter model, that used to be large, but not anymore—or an extremely large model, let’s call it 200 billion or 400 billion. You want to serve at the fastest latency possible. So, you use the UltraServer.”

Collaboration with Anthropic: Building the Largest AI Cluster

AWS has partnered with Anthropic to develop Project Rainier, a compute cluster that will feature hundreds of thousands of Trainium2 chips.

AWS describes it as five times more powerful than Anthropic’s current systems and as the world’s largest AI compute cluster reported to date. This collaboration underscores AWS’s commitment to empowering its partners with cutting-edge technology.

Anthropic, known for its Claude 3.5 Sonnet LLM, relies on AWS infrastructure to maintain a competitive edge against rivals like OpenAI and Google. AWS recently doubled its investment in Anthropic to $8 billion, reinforcing its strategic focus on generative AI.

Trainium3: Shaping the Future of AI Compute

AWS also announced Trainium3, its upcoming chip built on a three-nanometer process, which AWS says will deliver four times the performance of Trn2-based UltraServers. Expected to launch in late 2025, Trainium3 will enable even faster training and inference for next-generation AI models, further solidifying AWS’s position in AI compute.
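Taking AWS’s fourfold claim at face value and applying it to the Trn2 UltraServer figure gives a rough projection. This is arithmetic on the stated claim, not an AWS-published Trainium3 specification:

```python
# Projected UltraServer-class throughput if Trainium3 delivers the claimed
# 4x over Trn2 UltraServers (83.2 petaflops sparse FP8, per the article).
trn2_ultraserver_pflops = 83.2
trainium3_factor = 4

print(trn2_ultraserver_pflops * trainium3_factor)  # 332.8 petaflops
```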

David Brown outlined AWS’s vision for the new chip: “With our third-generation Trainium3 chips, we will enable customers to build bigger models faster and deliver superior real-time performance when deploying them.”

AWS’s strategy with Trainium3 aims to challenge Nvidia’s dominance in the AI hardware market. While Nvidia’s upcoming Blackwell GPUs boast up to 720 petaflops of FP8 performance, AWS’s integrated Trainium solutions offer a cost-effective and scalable alternative tailored for enterprise-scale AI workloads.

Ecosystem Support: Tools for Seamless Integration

To complement its hardware innovations, AWS provides the Neuron SDK, a development toolkit optimized for frameworks like PyTorch and JAX.

The SDK includes tools for distributed training and inference, enabling developers to leverage Trainium chips without extensive reconfiguration.

AWS also offers pre-configured Deep Learning AMIs, ensuring that developers can rapidly deploy their AI applications.

Markus Kasanmascheff
Markus has been covering the tech industry for more than 15 years. He holds a Master’s degree in International Economics and is the founder and managing editor of Winbuzzer.com.
