Announcing Provisioned Concurrency for Amazon SageMaker Serverless Inference

Today, we are excited to announce the general availability of Provisioned Concurrency for Amazon SageMaker Serverless Inference. Provisioned Concurrency lets you deploy models on serverless endpoints with predictable performance and high scalability. You can add Provisioned Concurrency to your serverless endpoints, and for the predefined amount of Provisioned Concurrency, SageMaker keeps the endpoint warm and ready to respond to requests instantaneously. Provisioned Concurrency is ideal for customers with predictable, low-throughput traffic.
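
As a rough illustration of adding Provisioned Concurrency to a serverless endpoint, the sketch below builds a request for the boto3 SageMaker `create_endpoint_config` call, where `ProvisionedConcurrency` sits alongside `MemorySizeInMB` and `MaxConcurrency` in the variant's `ServerlessConfig`. The helper names, the config and model names, and the numeric defaults are hypothetical and chosen only for illustration. The check that the provisioned amount does not exceed the maximum reflects our understanding of the service constraint, so confirm it against the current SageMaker documentation.

```python
def build_serverless_endpoint_config(
    config_name,
    model_name,
    memory_mb=2048,
    max_concurrency=10,
    provisioned_concurrency=5,
):
    """Build keyword arguments for SageMaker's create_endpoint_config.

    All argument values here are illustrative assumptions, not recommendations.
    """
    # Assumption: provisioned concurrency may not exceed max concurrency.
    if provisioned_concurrency > max_concurrency:
        raise ValueError(
            "provisioned_concurrency must be <= max_concurrency"
        )
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                "ServerlessConfig": {
                    "MemorySizeInMB": memory_mb,
                    "MaxConcurrency": max_concurrency,
                    # Amount of concurrency SageMaker keeps warm.
                    "ProvisionedConcurrency": provisioned_concurrency,
                },
            }
        ],
    }


def create_endpoint_config(request):
    """Send the request to SageMaker (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    client = boto3.client("sagemaker")
    return client.create_endpoint_config(**request)
```

A call such as `create_endpoint_config(build_serverless_endpoint_config("my-config", "my-model"))` would then register the configuration, after which an endpoint can be created from it. Keeping the request builder separate from the API call makes the configuration easy to inspect and validate before anything is sent to AWS.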

Source: Amazon AWS