SageMaker announces ml.inf2 and ml.trn1 instances for model deployment

We are excited to announce the availability of the ml.inf2 and ml.trn1 families of instances on Amazon SageMaker for deploying machine learning (ML) models for real-time and asynchronous inference. You can use these instances on SageMaker to achieve high performance at low cost for generative artificial intelligence (AI) models, including large language models (LLMs) and vision transformers. In addition, you can use SageMaker Inference Recommender to run load tests and evaluate the price-performance benefits of deploying your model on these instances.
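
As a rough illustration of what selecting one of these instance types can look like, the sketch below builds the request for SageMaker's CreateEndpointConfig API as a plain dictionary. It is not taken from the announcement: the helper name build_endpoint_config, the endpoint and model names, the S3 URI, and the choice of ml.inf2.xlarge as the default size are all assumptions made for illustration. The model is assumed to already exist in SageMaker and to be packaged for these accelerators.

```python
from typing import Optional


def build_endpoint_config(
    config_name: str,
    model_name: str,
    instance_type: str = "ml.inf2.xlarge",  # assumed default size, for illustration
    instance_count: int = 1,
    async_output_s3_uri: Optional[str] = None,
) -> dict:
    """Build keyword arguments for a CreateEndpointConfig request.

    Hypothetical helper: it only assembles a dictionary and makes no AWS calls.
    """
    # Restrict this sketch to the two instance families named in the announcement.
    if not instance_type.startswith(("ml.inf2.", "ml.trn1.")):
        raise ValueError(f"expected an ml.inf2 or ml.trn1 instance, got {instance_type!r}")

    config = {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                "InstanceType": instance_type,
                "InitialInstanceCount": instance_count,
            }
        ],
    }

    # Supplying an S3 output location makes this an asynchronous inference config;
    # leaving it out yields a real-time endpoint config.
    if async_output_s3_uri is not None:
        config["AsyncInferenceConfig"] = {
            "OutputConfig": {"S3OutputPath": async_output_s3_uri}
        }
    return config


# Example with placeholder names (not real resources).
realtime_cfg = build_endpoint_config("demo-inf2-config", "demo-llm-model")
async_cfg = build_endpoint_config(
    "demo-trn1-config",
    "demo-llm-model",
    instance_type="ml.trn1.2xlarge",
    async_output_s3_uri="s3://example-bucket/async-results/",
)
```

With AWS credentials configured, a dictionary like this could be passed to boto3 as boto3.client("sagemaker").create_endpoint_config(**realtime_cfg), followed by a create_endpoint call that references the same configuration name. Keeping the request construction separate from the API call makes it easy to inspect or test the configuration before anything is provisioned.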

Source: Amazon AWS