IBM targets mainframe customers with prebuilt AI training modules

IBM is rolling out a portfolio of enterprise data sets aimed at jumpstarting AI development for its Z and LinuxONE mainframe customers.

IBM’s Synthetic Data Sets are designed to help with training or fine-tuning AI models quickly, enhancing predictive models, and validating truthful models, IBM said. The family of data sets, expected to be available at the end of February, includes modules for payment cards, banking and money laundering, and homeowners insurance.

Synthetic Data Sets consist of downloadable CSV and DDL files with pre-curated attributes needed for specific IBM Z and IBM LinuxONE use cases. The standard formats make them familiar to work with and compatible with everything from databases to spreadsheets to hardware platforms to standard AI tools, IBM said.

If a customer has an existing model or LLM, synthetic data provides additional data that is rich, labeled, and diverse to fine-tune the AI model. If a client does not have a model, the Synthetic Data Sets are designed to offer quick and privacy-compliant training data to create models from scratch, the vendor stated.

Customers can deploy models on IBM Z and IBM LinuxONE with AI Toolkit for IBM Z and IBM LinuxONE, Cloud Pak for Data on Z, or Machine Learning for z/OS, wrote Elpida Tzortzatos, an IBM Fellow and Z architect, and Tina Tarquinio, IBM vice president, in a blog about the news. They can “perform inference on IBM z16 and IBM LinuxONE 4, leveraging hardware acceleration investments and data gravity to dramatically enhance AI inferencing speed and scale,” the authors wrote.

“In addition, customers can enhance predictive AI models and fine-tune LLMs with additional rich and broad data, leading to significant cost savings in areas such as fraud detection and money laundering prevention,” the authors wrote.

As an example, Tzortzatos and Tarquinio wrote about money laundering, which “often goes undetected in real data, as criminals attempt to move illicit funds to conceal their origins. This frequently involves crossing bank and national boundaries, producing complex transaction patterns.”

With IBM Synthetic Data Sets for Core Banking and Money Laundering, every transaction is identified either as money laundering or not, “spanning the entire banking ecosystem, incorporating global transactions, and even including cash transactions which are typically unavailable in real banking data,” Tzortzatos and Tarquinio wrote. “This rich dataset with known ground truth enables data scientists to validate their models, and create robust AML models, thereby reducing risk and saving costs for organizations. Moreover, reducing false positives saves countless hours of labor spent investigating flagged instances.”
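
To illustrate what validating a model against known ground truth can look like in practice, here is a minimal Python sketch. It is not IBM's tooling or schema: the column names (`transaction_id`, `amount`, `is_laundering`), the sample rows, and the threshold-based `toy_model` are all hypothetical stand-ins chosen for illustration. The sketch compares a model's flags against the labels in a CSV and counts false positives, the figure the authors tie to wasted investigation time.

```python
import csv
import io

# Hypothetical sample data; the column names are illustrative assumptions,
# not the actual layout of IBM's Synthetic Data Sets.
SAMPLE_CSV = """transaction_id,amount,is_laundering
t1,120.00,0
t2,9800.00,1
t3,45.10,0
t4,15000.00,1
t5,9900.00,0
"""


def toy_model(row):
    """Stand-in for a real AML model: flag any transaction of 9,000 or more."""
    return float(row["amount"]) >= 9000.0


def evaluate(csv_text, predict):
    """Compare predictions with ground-truth labels and return basic metrics."""
    tp = fp = tn = fn = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        actual = row["is_laundering"] == "1"
        flagged = predict(row)
        if flagged and actual:
            tp += 1  # correctly flagged laundering
        elif flagged and not actual:
            fp += 1  # false positive: legitimate transaction flagged
        elif not flagged and actual:
            fn += 1  # missed laundering
        else:
            tn += 1  # correctly ignored
    return {
        "tp": tp,
        "fp": fp,
        "tn": tn,
        "fn": fn,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
    }


metrics = evaluate(SAMPLE_CSV, toy_model)
print(metrics)
```

Because every row carries a label, the false positive count can be measured directly rather than estimated, which is the benefit the authors attribute to fully labeled data.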

Source: Network World