
Nvidia has introduced DGX Cloud Lepton, an AI-focused cloud platform that makes it easier for AI factories to rent out their GPU capacity to developers who need high-performance compute anywhere in the world.
In announcing the service, Alexis Bjorlin, vice president of DGX Cloud at Nvidia, compared Lepton to a ridesharing app like Uber or Lyft, but rather than connecting riders to drivers, it connects developers to GPUs.
“DGX Cloud Lepton provides a modern marketplace that connects developers to GPU compute, and not just locally. It connects them to a global availability across clouds and across regions,” she said on a conference call with journalists.
The platform is currently in early access, but CoreWeave, Crusoe, Firmus, Foxconn, GMI Cloud, Lambda, Nscale, SoftBank, and Yotta have already agreed to make “tens of thousands of GPUs” available to customers.
Developers can utilize GPU compute capacity in specific regions for both on-demand and long-term computing, supporting strategic and sovereign AI operational requirements. Nvidia expects leading cloud service providers and GPU marketplaces to also participate in the DGX Cloud Lepton marketplace.
The platform uses the Nvidia AI software stack, including NIM and NeMo microservices, Nvidia Blueprints, and Nvidia Cloud Functions, to accelerate and simplify the development and deployment of AI applications.
Lepton offers a unified experience across development, training, and inference, and lets developers deploy AI applications across multi-cloud and hybrid environments with minimal operational burden, using integrated services for inference, testing, and training workloads.
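As a rough illustration of what that developer experience looks like, NIM inference microservices expose an OpenAI-compatible HTTP API, so a developer pointing at GPU capacity rented through a marketplace like Lepton could, in principle, call a deployed model with a few lines of Python. The endpoint URL, API key, and model name below are placeholders, not details from Nvidia's announcement.

```python
# Hypothetical sketch: calling a NIM inference microservice running on rented GPU
# capacity. NIM exposes an OpenAI-compatible API; the endpoint, key, and model
# name here are illustrative placeholders only.
from openai import OpenAI

client = OpenAI(
    base_url="https://example-lepton-endpoint/v1",  # placeholder endpoint for a deployed NIM
    api_key="YOUR_API_KEY",                         # placeholder credential
)

response = client.chat.completions.create(
    model="meta/llama-3.1-8b-instruct",  # example NIM model identifier
    messages=[{"role": "user", "content": "Summarize what DGX Cloud Lepton does."}],
    max_tokens=200,
)

print(response.choices[0].message.content)
```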
Nvidia acquired the Lepton technology when it purchased Lepton AI just last month. Lepton AI was already offering GPU rental services at the time of the acquisition, and Nvidia has turned the technology around very quickly to offer it as its own service.
Lepton AI operated by leasing AI clusters from cloud service providers and reselling that capacity to smaller clients, effectively acting as a middleman for an AI-as-a-service model. It provided not only the computing resources but also features like autoscaling and error handling.
Source: Network World