
Companies looking for additional compute capacity for their AI initiatives at a price they can afford now have another option. Compute Exchange offers what it calls the world’s first auction-based exchange for AI compute resources.
Via the exchange, companies with excess compute capacity can connect directly with companies that need that capacity. The would-be buyers then bid on configurations that suit their unique usage needs. The company held its first public auction Tuesday, and will host a second next Monday, Mar. 3.
“We think this is really going to help bring transparency, efficiency, and neutrality to the computing markets,” Simeon Bochev, Compute Exchange co-founder and CEO, told Network World.
GPU compute situation ‘pretty dire’, says analyst
That help is urgently needed. Obtaining GPU resources for AI processing has become increasingly challenging as supply constraints have tightened.
“The GPU compute situation is pretty dire,” said Matt Kimball, VP and principal analyst for data center compute and storage at Moor Insights & Strategy. “This is driven by what most view as a single supplier (Nvidia) selling GPUs before they can even be made to a market that has an insatiable thirst.”
Buyers also struggle with lengthy procurement cycles, fixed pricing, and long-term contracts with cloud providers who often have minimum spend requirements.
“It’s capacity-constrained and it’s expensive to the point where it’s quite difficult for folks that are not in the highest echelons to get easy access to compute,” Bochev said.
Democratizing access to compute
Compute Exchange aims to be a neutral, self-regulated exchange with an auction-based system where prices are driven by market demand. Buyers pay only for what they need, and can resell unused capacity.
“Compute Exchange is certainly unique,” said Kimball, pointing out that its competitors are either traditional cloud service providers (CSPs) such as AWS or Microsoft Azure, or specialized CSPs like CoreWeave and Lambda Labs.
Ilya Matveev, US Territory Manager at Gcore, a provider on the platform, added that the exchange gives buyers “more flexibility, transparency, and cost efficiency, ensuring they get the best deal based on real-time market conditions. By eliminating fixed pricing constraints, buyers can scale their compute needs dynamically.”
The compute economy of the future
Founded in 2024, Compute Exchange has held two private auctions so far, working with a dozen providers including Gcore, Nebius and Voltage Park.
The platform is straightforward to use. After creating an account, buyers specify their configuration needs. The criteria could be general (the lowest price on an A100) or specific (‘I want an A100 in a particular region of the United States with a certain amount of memory and storage that meets certain SLAs’). The bid is listed on Compute Exchange as a legally binding ask, and users know immediately if they’re being matched to providers.
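The bid workflow described above — a buyer stating either a general or a tightly constrained ask and being matched against provider offers — can be sketched as a small data model. This is purely illustrative: the field names, prices, and matching rule here are assumptions, not Compute Exchange's actual API or auction logic.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Bid:
    """Hypothetical bid spec; field names are illustrative only."""
    gpu_model: str                        # e.g. "A100", "H100", "H200"
    max_price_per_gpu_hour: float         # the binding ask price
    region: Optional[str] = None          # None = any region (a "general" bid)
    min_memory_gb: Optional[int] = None   # None = no memory constraint

def matches(bid: Bid, offer: dict) -> bool:
    """Toy matching rule: an offer satisfies a bid if every stated
    constraint holds; unstated constraints are treated as wildcards."""
    if offer["gpu_model"] != bid.gpu_model:
        return False
    if offer["price_per_gpu_hour"] > bid.max_price_per_gpu_hour:
        return False
    if bid.region is not None and offer["region"] != bid.region:
        return False
    if bid.min_memory_gb is not None and offer["memory_gb"] < bid.min_memory_gb:
        return False
    return True

# A general bid (lowest-price A100, anywhere) vs. a specific one
general_bid = Bid(gpu_model="A100", max_price_per_gpu_hour=2.00)
specific_bid = Bid(gpu_model="A100", max_price_per_gpu_hour=2.00,
                   region="us-west", min_memory_gb=80)
offer = {"gpu_model": "A100", "price_per_gpu_hour": 1.75,
         "region": "us-east", "memory_gb": 80}
print(matches(general_bid, offer))   # True: any cheap-enough A100 matches
print(matches(specific_bid, offer))  # False: wrong region
```

The general bid matches because its unstated constraints act as wildcards, while the specific bid fails on its region requirement — mirroring how a loosely specified ask reaches more of the market.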
Under normal circumstances, buyers may spend weeks analyzing different providers and their terms of service, but Compute Exchange has one service agreement that every provider has agreed to, eliminating that step.
Tuesday’s session featured six auctions of A100, H100 and H200 infrastructure on single servers or clusters of servers (more than eight GPUs). An open bid period will be followed by a no-cancel period that begins at 10:00 a.m. on Wednesday, giving buyers a two-hour window to “price improve.”
At that point, Bochev explained, “you can adjust your price up so that you could match within the prevailing market conditions.”
Once the auction closes, successful buyers are connected with providers and, “within a very short period of time,” are able to use their compute, said Bochev.
“When we take all of this together — market neutrality, market efficiency, standardization, reducing opacity, and making it super easy to list exactly what you need — it is a very compelling driver of the compute economy of the future,” said Bochev.
Provisioning to stay competitive
To succeed in AI initiatives, however, it’s also important to size infrastructure to accommodate the workload, explained Forrester senior analyst Alvin Nguyen. If models can’t access the compute they need, “it is possible that the AI workload will not be able to be processed,” he said.
To avoid this issue, Nguyen said, enterprise buyers should match their AI processing goals to resources they can reasonably acquire, analyze various options to augment on-site hardware, and develop AI workloads with less stringent data and latency requirements.
In addition, Matveev noted, organizations should assess their workload requirements based on historical performance data, and set clear key performance indicators (KPIs) for resource usage to properly size their asks. By monitoring usage trends and benchmarking against industry standards, companies can avoid wasting resources.
“Regularly reviewing this data and implementing automated scaling mechanisms can help avoid over- or under-provisioning,” he said, “ensuring that cloud resources are used efficiently.”
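The review-and-scale loop Matveev describes — tracking utilization against KPI thresholds and adjusting capacity accordingly — can be sketched as a simple rule. The thresholds and data below are fabricated for illustration; real autoscaling policies would be tuned to the organization's own benchmarks.

```python
def scaling_decision(utilizations: list[float],
                     low: float = 0.30, high: float = 0.85) -> str:
    """Toy right-sizing rule: compare average GPU utilization over a
    review window against assumed KPI thresholds (low/high are illustrative)."""
    avg = sum(utilizations) / len(utilizations)
    if avg > high:
        return "scale up"    # sustained saturation: workloads risk going unprocessed
    if avg < low:
        return "scale down"  # sustained idleness: paying for unused capacity
    return "hold"

# A week of daily average GPU utilization readings (example data)
week = [0.92, 0.88, 0.90, 0.95, 0.87, 0.91, 0.89]
print(scaling_decision(week))  # -> scale up
```

Running this over each review window approximates the “regularly reviewing this data” step: capacity changes are triggered only by sustained trends, not single spikes, which helps avoid both over- and under-provisioning.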
Source: Network World