F5 this week said it’s working with Intel to offer customers a way to develop and securely deliver AI-based inference models and workloads.
Specifically, the companies will combine the security and traffic-management capabilities of F5’s NGINX Plus suite with Intel’s distribution of the OpenVINO toolkit and Intel’s infrastructure processing units (IPUs). The package will offer customers protection, scalability, and performance for advanced AI inference development, the vendors said.
NGINX Plus is F5’s application security suite that includes a software load balancer, content cache, web server, API gateway, and microservices proxy designed to protect distributed web and mobile applications.
OpenVINO is an open-source toolkit that accelerates AI inference and lets developers use AI models trained in popular frameworks such as TensorFlow, PyTorch, and ONNX. The OpenVINO model server uses Docker container technology and can be deployed in a clustered environment to handle high inference loads and scale as needed.
“OpenVINO model server supports remote inference, enabling clients to perform inference on models deployed on remote servers,” according to Intel. “This feature is useful for distributed applications or scenarios where AI inference needs to be performed on powerful servers while the client device has limited resources.”
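The snippet below sketches that remote-inference pattern with Intel’s ovmsclient Python package. The server address, model name, and input tensor name are placeholders for illustration, and the sketch assumes a model server is already running (for example, one started from the openvino/model_server Docker image):

```python
# Minimal sketch of remote inference against an OpenVINO model server,
# using the ovmsclient package (pip install ovmsclient).
# Assumes a model server is already running and serving a model named
# "resnet" on localhost:9000 -- the address, model name, and input
# tensor name below are placeholder assumptions.
import numpy as np
from ovmsclient import make_grpc_client

# Connect to the model server's gRPC endpoint.
client = make_grpc_client("localhost:9000")

# Build a dummy batch matching the model's expected input shape.
batch = np.zeros((1, 3, 224, 224), dtype=np.float32)

# Inference runs on the remote server, which is the scenario Intel
# describes for clients with limited local resources.
outputs = client.predict(inputs={"0": batch}, model_name="resnet")
print(outputs)
```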
Intel IPUs are hardware accelerators that offload tasks such as packet processing, traffic shaping, and virtual switching from the server CPU.
The integrated F5/Intel offering, which is available now, will be particularly beneficial for edge applications such as video analytics and IoT, where low latency and high performance are crucial, wrote Kunal Anand, chief technology officer at F5, in a blog post about the technology.
“F5 NGINX Plus works as a reverse proxy, offering traffic management and protection for AI model servers,” Anand wrote. “With high-availability configurations and active health checks, NGINX Plus can ensure requests from apps, workflows, or users reach an operational OpenVINO model server.”
It also enables the use of HTTPS and mTLS certificates to encrypt communications between the user application and the model server without slowing performance, Anand added.
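As a rough illustration of that setup, an NGINX Plus configuration along these lines would front a pool of OpenVINO model servers with active health checks, TLS, and optional mTLS. The upstream addresses, certificate paths, and health-check URI below are assumptions for the sketch, not values published by F5 or Intel:

```nginx
# Illustrative NGINX Plus reverse-proxy configuration for OpenVINO
# model servers. Upstream addresses, certificate paths, and the
# health-check URI are placeholders.
upstream ovms_backends {
    zone ovms_backends 64k;          # shared memory zone, required for active health checks
    server 10.0.0.11:8000;           # OpenVINO model server REST endpoints
    server 10.0.0.12:8000;
}

server {
    listen 443 ssl;

    # TLS between clients and NGINX Plus.
    ssl_certificate     /etc/nginx/certs/server.crt;
    ssl_certificate_key /etc/nginx/certs/server.key;

    # Optional mTLS: require and verify client certificates.
    ssl_client_certificate /etc/nginx/certs/ca.crt;
    ssl_verify_client on;

    location / {
        proxy_pass http://ovms_backends;

        # NGINX Plus active health check; only upstream servers that
        # pass keep receiving traffic.
        health_check uri=/v2/health/ready;
    }
}
```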
With OpenVINO, developers first convert models and can then further optimize and compress them for faster responses, according to Anand. “… the AI model is ready to be deployed by embedding the OpenVINO runtime into their application to make it AI capable. Developers can deploy their AI-infused application via a lightweight container in a data center, in the cloud, or at the edge on a variety of hardware architectures,” he wrote.
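In OpenVINO’s Python API, that convert-then-embed workflow looks roughly like the sketch below; the ONNX file name, target device, and input shape are placeholder assumptions:

```python
# Rough sketch of the convert-then-embed workflow Anand describes,
# using OpenVINO's Python API (pip install openvino). The model file
# name, device, and input shape are placeholders for illustration.
import numpy as np
import openvino as ov

# Step 1: convert a model trained in another framework (here, an
# ONNX export) into OpenVINO's intermediate representation.
model = ov.convert_model("model.onnx")
ov.save_model(model, "model.xml")   # also writes model.bin; weights compress to FP16 by default

# Step 2: embed the OpenVINO runtime in the application -- compile
# the model for the target device and run inference locally.
core = ov.Core()
compiled = core.compile_model("model.xml", device_name="CPU")

batch = np.zeros((1, 3, 224, 224), dtype=np.float32)
result = compiled(batch)            # returns a dict-like of output tensors
print(result)
```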
Integrating an Intel IPU with NGINX Plus creates a security air gap between NGINX Plus and the OpenVINO servers, according to Intel. This extra layer of security protects against potential shared vulnerabilities to help safeguard sensitive data in the AI model, the vendors stated.
Intel IPUs are compatible with the Dell PowerEdge R760 server with Intel Xeon processors. Using an Intel IPU with a Dell PowerEdge R760 rack server can increase performance for both OpenVINO model servers and F5 NGINX Plus, according to Anand.
“Running NGINX Plus on the Intel IPU provides performance and scalability thanks to the Intel IPU’s hardware accelerators,” Anand wrote. “This combination also leaves CPU resources available for the AI model servers.”
Source: Network World