Today, AWS announces the general availability of Amazon Elastic Compute Cloud (Amazon EC2) Inf2 instances. These instances deliver high performance at the lowest cost in Amazon EC2 for generative AI models, including large language models (LLMs) and vision transformers. Inf2 instances are powered by up to 12 AWS Inferentia2 chips, the latest AWS-designed deep learning (DL) accelerator. They deliver up to 4x higher throughput and up to 10x lower latency than first-generation Amazon EC2 Inf1 instances.
Source: Amazon AWS