Attach multiple Elastic Inference accelerators to a single EC2 instance
You can now attach multiple Amazon Elastic Inference accelerators to a single Amazon EC2 instance. With this capability, a single EC2 instance in an Auto Scaling group can run inference for multiple models. By attaching multiple accelerators to one instance, you can avoid deploying multiple Auto Scaling groups of CPU or GPU instances for inference and lower your operating costs.
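As a rough illustration of what attaching several accelerators might look like in practice, the sketch below builds the request parameters for an EC2 RunInstances call using the `ElasticInferenceAccelerators` field, which takes a list of `Type` and `Count` entries. The helper name `build_run_instances_params`, the AMI ID, and the chosen instance and accelerator types are assumptions made for illustration only. A real launch would also need the networking and IAM setup that Elastic Inference requires, which is omitted here.

```python
from typing import Any, Dict, List


def build_run_instances_params(
    image_id: str,
    instance_type: str,
    accelerators: Dict[str, int],
) -> Dict[str, Any]:
    """Build keyword arguments for an EC2 RunInstances request.

    `accelerators` maps an accelerator type (for example "eia1.medium")
    to the number of accelerators of that type to attach.
    This helper is hypothetical and only assembles a dictionary.
    """
    specs: List[Dict[str, Any]] = []
    for acc_type, count in sorted(accelerators.items()):
        if count < 1:
            raise ValueError(f"count for {acc_type} must be at least 1")
        specs.append({"Type": acc_type, "Count": count})

    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        "ElasticInferenceAccelerators": specs,
    }


params = build_run_instances_params(
    image_id="ami-EXAMPLE",  # placeholder, not a real AMI ID
    instance_type="c5.large",
    accelerators={"eia1.medium": 2, "eia1.large": 1},
)

# With boto3 installed and credentials configured, the request could be sent as:
#   import boto3
#   boto3.client("ec2").run_instances(**params)
```

Keeping the parameter construction separate from the API call makes it possible to inspect or validate the accelerator list before anything is launched.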
Source: Amazon AWS