To meet the growing demand for specialized computing capabilities, Amazon is pushing deeper into AI model training and inference. With a new generation of custom chips designed to improve computational efficiency and make large-model development more practical, the company is challenging an arena traditionally dominated by GPUs.
The tech giant recently unveiled its latest custom chip, the AWS Trainium2, at its annual re:Invent conference. According to Amazon, this second-generation chip delivers four times the training performance and twice the energy efficiency of its predecessor. Trainium2 will be available in EC2 Trn2 instances in clusters of 16 chips in the AWS cloud, and can scale up to 100,000 chips in AWS' EC2 UltraCluster product.
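For teams already on AWS, provisioning Trainium capacity is likely to follow the familiar EC2 request flow. Below is a minimal boto3 sketch; since Amazon has not yet announced Trn2 instance type names or availability, the instance type string and AMI ID are placeholders, not real identifiers.

```python
import boto3

# Hypothetical sketch: Amazon has not published Trn2 instance type names,
# so "trn2.48xlarge" and the AMI ID below are placeholders only.
ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder: a Neuron-enabled Deep Learning AMI
    InstanceType="trn2.48xlarge",     # placeholder: assumed Trn2 instance type name
    MinCount=1,
    MaxCount=1,
)
print(response["Instances"][0]["InstanceId"])
```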
In terms of raw compute, Amazon says a cluster of 100,000 Trainium2 chips can deliver 65 exaflops, which works out to roughly 650 teraflops per chip, comfortably ahead of Google's custom AI training chips circa 2017. Per-chip comparisons come with caveats, but on paper that is substantial capability for training and deploying AI models.
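The per-chip number follows directly from Amazon's cluster figures; a quick sanity check of the arithmetic:

```python
# Back-of-envelope check of the per-chip figure implied by Amazon's cluster claim.
cluster_exaflops = 65        # claimed aggregate compute
num_chips = 100_000

cluster_teraflops = cluster_exaflops * 1_000_000  # 1 exaflop = 10^6 teraflops
per_chip_teraflops = cluster_teraflops / num_chips
print(per_chip_teraflops)  # 650.0 teraflops per chip
```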
Amazon posits that a cluster of this size can cut the training time of a 300-billion-parameter large language model from months down to weeks. A model's parameters, the values learned from its training data, essentially encode its skill at a task such as generating text or code.
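Amazon has not published the assumptions behind that claim, but the widely used heuristic that training costs roughly 6 × parameters × tokens floating-point operations lets us sanity-check the order of magnitude. The token count and utilization below are our assumptions, not Amazon's numbers:

```python
# Rough training-time estimate using the common ~6 * N * D FLOPs heuristic.
# Only the parameter count and cluster figure come from Amazon's claims.
params = 300e9           # 300B-parameter model (Amazon's example)
tokens = 2e12            # assumed training set of ~2 trillion tokens
cluster_flops = 65e18    # 65 exaflops, per Amazon's cluster figure
utilization = 0.3        # assumed realistic hardware utilization

total_flops = 6 * params * tokens
seconds = total_flops / (cluster_flops * utilization)
print(f"~{seconds / 86_400:.1f} days")  # ~2.1 days at these assumptions
```

At these optimistic assumptions, a full 100,000-chip cluster finishes in days; smaller reservations and lower real-world utilization stretch that toward the weeks-to-months range Amazon describes.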
David Brown, VP of AWS compute and networking, underscored the significance of the new chip at the unveiling, pointing to its role in meeting customer workloads and the surging interest in generative AI. He touted Trainium2's ability to speed up ML model training, reduce overall costs, and improve energy efficiency, but did not disclose when Trainium2 instances will be available to AWS customers.
Alongside Trainium2, Amazon introduced the fourth generation of its Arm-based Graviton line, the Graviton4. A general-purpose processor, deliberately distinct from Amazon's dedicated inference chip, Inferentia, the Graviton4 offers 30% better compute performance, 50% more cores, and 75% more memory bandwidth than Graviton3.
Ramping up data security, Graviton4 also encrypts all of its physical hardware interfaces, a measure intended to better protect AI training workloads and data for customers with stringent encryption requirements.
Brown further described Graviton4 as the most powerful and energy-efficient chip Amazon has ever built, suited to a broad range of workloads, and credited the company's practice of designing chips around real customer workloads for the sophistication of its cloud infrastructure.
Graviton4 will debut in Amazon EC2 R8g instances, which are available in preview today, with general availability planned in the coming months.
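Once R8g reaches a given account, the new types should be discoverable through the standard EC2 APIs. A minimal boto3 check, assuming the instance type names follow the existing Graviton naming pattern (e.g. r8g.large, mirroring today's Graviton3-based r7g types):

```python
import boto3

# Query EC2 for R8g instance-type details. "r8g.large" assumes the new
# generation follows the naming of the current "r7g" Graviton3 types.
ec2 = boto3.client("ec2", region_name="us-east-1")

resp = ec2.describe_instance_types(InstanceTypes=["r8g.large"])
for itype in resp["InstanceTypes"]:
    print(itype["InstanceType"],
          itype["VCpuInfo"]["DefaultVCpus"],
          itype["MemoryInfo"]["SizeInMiB"])
```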
No-code platforms such as AppMaster are increasingly integrating AI features to streamline development. Improved AI hardware like the Trainium2 and Graviton4 chips could help such platforms deliver AI-based solutions more effectively and efficiently, a point especially relevant for platforms offering no-code design tools, where AI drives automation and design optimization.