
When the next generation of the IBM mainframe, the z17, makes its anticipated summer debut, it will be outfitted with multiple technologies aimed at making Big Iron the ultimate AI server.
Specifically, the next iterations of IBM’s z and LinuxONE mainframes will feature the Telum II processor, the Spyre AI accelerator, new operating systems, and many software additions and features designed for customers who want to deploy AI inferencing and other applications with high performance requirements.
IBM Telum II has greater memory and cache capacity than the previous generation, and it integrates a new data processing unit (DPU) specialized for I/O acceleration along with enhanced on-chip AI acceleration capabilities. Telum II has eight high-performance cores running at 5.5GHz, according to IBM, and a 40% increase in on-chip cache capacity, with virtual L3 and virtual L4 growing to 360MB and 2.88GB, respectively.
The Spyre Accelerator will feature 1TB of memory and 32 AI accelerator cores that will share a similar architecture to the AI accelerator integrated into the Telum II chip, according to IBM. Multiple IBM Spyre Accelerators can be connected to bring a substantial increase in the amount of available acceleration, IBM says.
The mainframe today is already a workhorse delivering unparalleled transaction throughput and scalability, and these same types of qualities will be brought to the AI era, Meredith Stowell, vice president of ecosystems for Z and LinuxONE, told Network World.
“Telum and Spyre, in particular, are going to take the mainframe to the next level,” Stowell said. Not only will clients be able to continue to accelerate traditional AI, but also “it’s going to enable large language models and large language model AI to come to the platform to create what we call a multi-model AI,” Stowell said. “That’s where you do that combination of both traditional AI and LLM at the same time, with the resulting high performance, but also significant increases in overall accuracy of your AI applications.”
“It will truly change what customers are able to do with AI,” Stowell said.
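To make that “multi-model” idea concrete, here is a minimal illustrative sketch in Python of how a transaction-scoring pipeline might combine the two: a fast traditional classifier screens every transaction, and an LLM is consulted only for borderline cases. The function names, threshold, and toy logic are hypothetical, not IBM’s implementation.

```python
# Hypothetical sketch of "multi-model" AI scoring: a fast traditional
# classifier screens every transaction, and an LLM is consulted only
# for borderline cases. All names and logic here are illustrative.

def classical_score(txn: dict) -> float:
    """Stand-in for a low-latency traditional model (e.g., a
    gradient-boosted fraud classifier) scoring in-transaction."""
    return min(txn["amount"] / 10_000, 1.0)  # toy heuristic, 0.0-1.0

def llm_verdict(txn: dict) -> str:
    """Stand-in for an LLM call that weighs unstructured context
    such as memo text; a real system would invoke a served model."""
    return "fraud" if "wire to unknown" in txn.get("memo", "") else "legitimate"

def score_transaction(txn: dict) -> str:
    risk = classical_score(txn)
    # Only ambiguous scores pay the latency cost of an LLM pass.
    if 0.4 <= risk <= 0.6:
        return llm_verdict(txn)
    return "fraud" if risk > 0.6 else "legitimate"

print(score_transaction({"amount": 5_000, "memo": "wire to unknown acct"}))
```

The design point is the one Stowell describes: the traditional model keeps per-transaction latency low, while the LLM lifts accuracy on the cases the classical model cannot decide.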
IBM’s mainframe processors
The next generation of processors is expected to continue a long history of generation-to-generation improvements, IBM stated in a new white paper on AI and the mainframe.
“They are projected to clock in at 5.5 GHz and include ten 36 MB level 2 caches. They’ll feature built-in low-latency data processing for accelerated I/O as well as a completely redesigned cache and chip-interconnection infrastructure for more on-chip cache and compute capacity,” IBM wrote.
Today’s mainframes also have extensions and accelerators that integrate with the core systems. These specialized add-ons are designed to enable the adoption of technologies such as Java, cloud and AI by accelerating computing paradigms that are essential for high-volume, low-latency transaction processing, IBM wrote.
“The next crop of AI accelerators are expected to be significantly enhanced—with each accelerator designed to deliver 4 times more compute power, reaching 24 trillion operations per second (TOPS),” IBM wrote. “The I/O and cache improvements will enable even faster processing and analysis of large amounts of data and consolidation of workloads running across multiple servers, for savings in data center space and power costs. And the new accelerators will provide increased capacity to enable additional transaction clock time to perform enhanced in-transaction AI inferencing.”
In addition, the next generation of the accelerator architecture is expected to be more efficient for AI tasks. “Unlike standard CPUs, the chip architecture will have a simpler layout, designed to send data directly from one compute engine to the next, and use a range of lower-precision numeric formats. These enhancements are expected to make running AI models more energy efficient and far less memory intensive. As a result, mainframe users can leverage much more complex AI models and perform AI inferencing at a greater scale than is possible today,” IBM stated.
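The memory savings from lower-precision formats are easy to quantify. The sketch below is generic quantization arithmetic, not a description of Spyre’s internal formats: FP32 weights quantized to INT8 take a quarter of the space, at the cost of a small rounding error.

```python
# Back-of-the-envelope illustration of why lower-precision numeric
# formats cut memory use: quantizing FP32 weights to INT8 shrinks
# the footprint 4x. Generic quantization math, not IBM-specific.
import numpy as np

weights_fp32 = np.random.randn(1_000_000).astype(np.float32)

# Symmetric linear quantization: map the FP32 range onto int8.
scale = np.abs(weights_fp32).max() / 127.0
weights_int8 = np.clip(np.round(weights_fp32 / scale), -128, 127).astype(np.int8)

print(f"FP32: {weights_fp32.nbytes / 1e6:.1f} MB")  # ~4.0 MB
print(f"INT8: {weights_int8.nbytes / 1e6:.1f} MB")  # ~1.0 MB

# Dequantize to gauge the accuracy cost of the narrower format.
restored = weights_int8.astype(np.float32) * scale
print(f"max abs error: {np.abs(weights_fp32 - restored).max():.4f}")
```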
Targeting AI inferencing and hybrid clouds
Training AI models has been the focus of many business AI use cases to this point. But the mainframe today can handle the growing inferencing requirements of organizations’ AI initiatives, and it will be even better suited to them in the future.
“A lot of our clients were having challenges in productionizing AI, because they weren’t able to score as many transactions as they wanted, because of latency, because they didn’t have the processing power. But that’s where the Telum comes in with the capacity and power to accelerate for inferencing,” Stowell said.
The other part of the mainframe inferencing story is the Big Iron’s capacity to support hybrid clouds.
“Organizations will have different places where they may want to actually train AI. Customer data scientists may want to build that model out in a cloud, maybe in a public cloud, private cloud. Wherever it is they want to build it, they can then also train it there. Then they can actually just bring that model over onto the mainframe and immediately run it with the inferencing,” Stowell said.
“So customers don’t necessarily have to build and train all in the same place. They can build and train and then bring it back onto the platform to productionize it,” Stowell said.
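A common way to realize that “build and train anywhere, infer on the platform” flow is a portable model format such as ONNX, which IBM’s Z tooling can compile for Telum acceleration. The sketch below is a generic, non-IBM-specific illustration using scikit-learn, skl2onnx, and onnxruntime.

```python
# Generic sketch of "train anywhere, infer on the platform": train a
# model off-platform, export it to the portable ONNX format, then run
# inference from the serialized artifact. Not IBM-specific tooling.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
import onnxruntime as rt

# 1. Build and train wherever the data scientists prefer.
X, y = load_iris(return_X_y=True)
model = RandomForestClassifier().fit(X, y)

# 2. Export to ONNX -- the artifact that moves between environments.
onnx_model = convert_sklearn(
    model, initial_types=[("input", FloatTensorType([None, 4]))]
)

# 3. On the inference platform, load the artifact and score.
session = rt.InferenceSession(
    onnx_model.SerializeToString(), providers=["CPUExecutionProvider"]
)
preds = session.run(None, {"input": X[:3].astype(np.float32)})[0]
print(preds)
```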
Yet another important mainframe capability is its support for multimodal AI development: it can pull audio, video, text, sensor data, imaging or other inputs into one or multiple locations, making training models more effective and the inferencing side of the AI environment that much more sophisticated and intelligent, Stowell said.
IBM in February updated its Granite family of large language models (LLMs) with Granite 3.2, adding and improving a number of multimodal AI capabilities. Other key software packages, such as its watsonx development environment, are routinely enhanced with new AI features.
Perhaps not surprisingly, IBM’s own research recently found that enterprises were further along in deploying AI applications on the Big Iron than might be expected: 78% of IT executives surveyed said their organizations are either piloting projects or operationalizing initiatives that incorporate AI technology.
Kyndryl, too, last September reported that AI and generative AI promise to transform the mainframe environment by delivering insights into complex unstructured data and augmenting human action with advances in speed, efficiency and error reduction. Generative AI also has the potential to illuminate the inner workings of monolithic applications, Kyndryl stated.
Source: Network World