At NVIDIA GTC 2023, NVIDIA unveiled notable updates to its suite of AI software for accelerated computing. The updates reduce costs in several areas, such as data science workloads with NVIDIA RAPIDS, inference serving with NVIDIA Triton, and AI imaging and computer vision with CV-CUDA, among others.
To keep up with the newest SDK advancements from NVIDIA, watch the GTC keynote from CEO Jensen Huang.
NVIDIA RAPIDS Accelerator for Apache Spark
NVIDIA RAPIDS Accelerator for Apache Spark is now available in the NVIDIA AI Enterprise 3.1 software suite. Speed up data processing and analytics, as well as model training, with Apache Spark 3 without code changes, while lowering infrastructure costs.
Highlights:
- Integration with major platforms: Google Cloud Platform (GCP) Dataproc, Amazon EMR, Databricks on Azure and AWS, and Cloudera
- The Accelerated Spark Analysis Tool makes cost-saving predictions and recommends optimized GPU parameters to maximize the speedup of your workload
- With NVIDIA AI Enterprise, take advantage of guaranteed response times, priority security notifications, and data science experts from NVIDIA
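Because the accelerator ships as a Spark plugin, enabling it is a configuration change rather than a code change. As a minimal sketch (the application name and query are illustrative, and the RAPIDS Accelerator jar must already be on the cluster per the RAPIDS docs), a PySpark session might enable the plugin like this:

```python
from pyspark.sql import SparkSession

# Enable the RAPIDS Accelerator via configuration only; the Spark code
# below is ordinary PySpark and is unchanged by GPU acceleration.
spark = (
    SparkSession.builder
    .appName("rapids-accelerated-etl")  # illustrative name
    .config("spark.plugins", "com.nvidia.spark.SQLPlugin")
    .config("spark.rapids.sql.enabled", "true")
    .getOrCreate()
)

# Supported operators in this query run on the GPU automatically.
df = spark.range(0, 100_000_000).selectExpr("id % 1000 AS key", "id AS value")
df.groupBy("key").count().show(5)
```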
Apply today for a free consultation to evaluate your Spark workloads for GPU acceleration and learn how to configure your cluster for an average 4x speedup.
Add this GTC session to your calendar:
- Accelerate Spark with RAPIDS for Cost Savings
NVIDIA RAPIDS
Vector search is becoming an increasingly important step in use cases such as large language models, recommender systems, and computer vision. At GTC 2023, NVIDIA announced that RAPIDS RAFT, the toolkit providing accelerated, composable ML building blocks, can now power vector search.
By integrating RAPIDS RAFT, vector databases and search engines can now deliver significantly faster performance for tasks such as building indexes, loading data, and executing many different query types.
Highlights:
- RAFT accelerates vector search use cases by offering accelerated Exact and Approximate Nearest Neighbor primitives on GPUs
- RAFT-powered index-building time is up to 95x faster and queries per second are up to 3x faster than CPU implementations
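For a rough sketch of what GPU-accelerated approximate nearest neighbor search looks like with RAFT's Python bindings (the module and parameter names follow the pylibraft documentation at the time of writing; treat the specifics as assumptions):

```python
import cupy as cp
from pylibraft.neighbors import ivf_pq

# Random dataset and queries on the GPU
dataset = cp.random.random_sample((100_000, 128), dtype=cp.float32)
queries = cp.random.random_sample((1_000, 128), dtype=cp.float32)

# Build an IVF-PQ approximate nearest neighbor index on the GPU
index = ivf_pq.build(ivf_pq.IndexParams(n_lists=1024), dataset)

# Search: top-10 neighbors per query
distances, neighbors = ivf_pq.search(
    ivf_pq.SearchParams(n_probes=32), index, queries, k=10
)
print(cp.asarray(neighbors)[:3])
```

Increasing `n_probes` trades query throughput for recall, which is the usual tuning knob for IVF-style indexes.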
NVIDIA is already working with FAISS, Milvus, and Redis to bring improved vector search performance to their users by building on RAFT. Milvus’ GPU-powered backend optimized with RAFT will be available soon.
For more information about RAPIDS RAFT vector search capabilities and everything else it can provide, see the RAPIDS RAFT User’s Guide and /rapidsai/raft GitHub repo.
Add these GTC sessions to your calendar:
- Improving Dense Text Retrieval Accuracy with Approximate Nearest Neighbor Search
- Graph-Based, GPU-Optimized Approximate Nearest Neighbor Search Library for Both Batch Processing and Online Services
- Accelerate Data Science Workloads in Python with RAPIDS
CV-CUDA
With an open beta coming in April 2023, CV-CUDA is a new open-source library for building GPU-accelerated pre- and post-processing pipelines for AI computer vision at cloud scale.
Highlights:
- 30+ computer vision operators with C/C++ and Python APIs to accelerate object detection, segmentation, and classification workflows
- Support for batching of variable-shape images
- Zero-copy integration with TensorFlow and PyTorch using DLPack and CUDA array interfaces
- Single-line pip installation and availability on PyPI
- NVIDIA Triton Inference Server example using CV-CUDA, TensorRT, and VPF for video encoding and decoding
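As an illustration of the zero-copy PyTorch interoperability listed above, a pre-processing step might look like the following. This is a sketch based on the beta's published samples; the exact function names, such as cvcuda.as_tensor and cvcuda.resize, should be checked against the /CV-CUDA repo:

```python
import torch
import cvcuda

# A batch of 8 RGB frames on the GPU, NHWC layout
frames = torch.rand(8, 720, 1280, 3, device="cuda", dtype=torch.float32)

# Zero-copy view of the PyTorch tensor via the CUDA array interface
src = cvcuda.as_tensor(frames, "NHWC")

# GPU-accelerated resize to the model's input resolution
dst = cvcuda.resize(src, (8, 224, 224, 3), cvcuda.Interp.LINEAR)

# Back to PyTorch, still zero-copy on the GPU
out = torch.as_tensor(dst.cuda(), device="cuda")
print(out.shape)  # torch.Size([8, 224, 224, 3])
```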
For more information, see the /CV-CUDA GitHub repo.
Add these GTC sessions to your calendar:
- Overcoming Pre- and Post-Processing Bottlenecks in AI-Based Imaging and Computer Vision Pipelines
- Building AI-Based HD Maps for Autonomous Vehicles
- Connect with the Experts: GPU-Accelerated Data Processing with NVIDIA Libraries
- Advancing AI Applications with Custom GPU-Powered Plug-ins for NVIDIA DeepStream
NVIDIA cuLitho
cuLitho, a software library for computational lithography, speeds up the largest workload in semiconductor manufacturing by 40x on NVIDIA Hopper GPUs.
As the semiconductor industry continues to push the state of the art for fabrication technology, it is increasingly facing challenges due to the limits of physics. Optical proximity correction (OPC) and other computational lithography methods are required to create masks that compensate for these challenges. The application of these complex methods has become the industry’s largest compute workload.
NVIDIA cuLitho is a library of optimized tools and algorithms that GPU-accelerates computational lithography and the semiconductor manufacturing process by orders of magnitude over current CPU-based methods.
Highlights:
- Reducing the time to produce a mask from 2 weeks to an overnight 8-hour run
- Streamlining the data center: 1/8 the space, 1/4 the cost, and 1/9 the power
- Enabling new lithography solutions, such as curvilinear OPC and High-NA EUV
For more information and partner quotes, see NVIDIA cuLitho.
Add these GTC sessions to your calendar:
- Accelerating Computational Lithography: Enabling our Electronic Future
- AI for Microelectronics Design
NVIDIA Triton
Key updates to NVIDIA Triton Inference Server, open-source inference-serving software, bring fast and scalable AI to every application in production. Over 66 features were added in the last year.
Software updates:
- PyTriton, a simple interface for running NVIDIA Triton in native Python code, enabling rapid prototyping and easy migration of Python-based models (see the sketch after this list)
- Support for model ensembles and concurrent model analysis in Model Analyzer
- PaddlePaddle support and integration with PaddlePaddle FastDeploy
- FasterTransformer backend with support for BERT, Hugging Face BLOOM, and FP8 in GPT
- NVIDIA Triton management service (early access) for the automated and resource-efficient orchestration of models for inference at scale
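To make the PyTriton item concrete, here is a minimal sketch of serving a Python function with it. The model name, tensor names, and doubling logic are illustrative; the imports follow PyTriton's published examples:

```python
import numpy as np
from pytriton.decorators import batch
from pytriton.model_config import ModelConfig, Tensor
from pytriton.triton import Triton

@batch
def infer_fn(input_1):
    # Toy model: double the input batch
    return {"output_1": input_1 * 2}

with Triton() as triton:
    triton.bind(
        model_name="doubler",  # illustrative name
        infer_func=infer_fn,
        inputs=[Tensor(name="input_1", dtype=np.float32, shape=(-1,))],
        outputs=[Tensor(name="output_1", dtype=np.float32, shape=(-1,))],
        config=ModelConfig(max_batch_size=64),
    )
    triton.serve()  # exposes standard Triton HTTP/gRPC endpoints
```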
Kick-start your inference journey with short-term access in NVIDIA LaunchPad without setting up your own environment.
Get started with NVIDIA Triton and get enterprise-grade support.
Add these GTC sessions to your calendar:
- Taking AI Models to Production: Accelerated Inference with Triton Inference Server
- Efficient Inference of Extremely Large Transformer Models
- Connect with the Experts: Accelerating and Deploying Deep Learning Models to Production
NVIDIA TensorRT
NVIDIA also announced updates to TensorRT, an SDK for high-performance deep learning inference that pairs an optimizer with a runtime to deliver low latency and high throughput for inference applications.
New features:
- Performance optimizations for generative AI diffusion and transformer models
- Enhanced hardware compatibility to build and run on different GPU architectures (NVIDIA Ampere architecture and later)
- Version compatibility so that you can build and run on different TensorRT versions from TensorRT 8.6 and later
- Multi-GPU, multi-node inference for GPT-3 models in early access
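The hardware- and version-compatibility features above are opt-in settings on the builder configuration. Here is a minimal sketch of building a compatible engine from an ONNX model (the file paths are placeholders; the enum and flag names follow the TensorRT 8.6 Python API):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:  # placeholder model path
    assert parser.parse(f.read())

config = builder.create_builder_config()
# Run the same engine on Ampere-and-later GPUs ...
config.hardware_compatibility_level = trt.HardwareCompatibilityLevel.AMPERE_PLUS
# ... and with later TensorRT runtimes (8.6+)
config.set_flag(trt.BuilderFlag.VERSION_COMPATIBLE)

engine = builder.build_serialized_network(network, config)
with open("model.engine", "wb") as f:
    f.write(engine)
```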
Kick-start your inference journey with short-term access in NVIDIA LaunchPad without setting up your own environment.
Get started with TensorRT and get enterprise-grade support.
Add these GTC sessions to your calendar:
- TensorRT 8.6: Hardware & Version Compatibility
- Exploring Next Generation Methods for Optimizing PyTorch models for Inference with Torch-TensorRT
- Accelerating Transformer-Based Encoder-Decoder Language Models for Sequence-to-Sequence Tasks
NVIDIA TAO Toolkit
With the latest updates to TAO Toolkit, you can use the power and efficiency of transfer learning to achieve state-of-the-art accuracy and production-class throughput on any platform. This low-code AI toolkit accelerates vision AI model development for all skill levels, from beginners to expert data scientists.
Highlights:
- New state-of-the-art vision transformers for image classification, object detection, and segmentation tasks
- AI-assisted annotation tool for auto-generated segmentation masks
- ONNX model export that enables TAO models to be deployed on any device, including GPUs, CPUs, and MCUs
- Increased AI transparency and explainability by offering TAO as open source
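Because exported models are standard ONNX, they can be served with any ONNX-compatible runtime. As a hypothetical example (the model file name and input shape are assumptions, not taken from TAO), inference with ONNX Runtime looks like this:

```python
import numpy as np
import onnxruntime as ort

# Load a TAO-exported ONNX model; falls back to CPU if no GPU is present.
session = ort.InferenceSession(
    "tao_model.onnx",  # hypothetical file name
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)

input_name = session.get_inputs()[0].name
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)  # assumed NCHW shape
outputs = session.run(None, {input_name: dummy})
print(outputs[0].shape)
```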
For more information, see Access the Latest in Vision AI Model Development Workflows with NVIDIA TAO Toolkit 5.0. Begin customizing your AI models with TAO Toolkit and try it on LaunchPad.
Add these GTC sessions to your calendar:
- AI Models Made Simple Using TAO
- Running TAO Toolkit API in NetsPresso for Effortless Vision AI Model Development and Optimization
- Solving Computer Vision Grand Challenges in One-Click
NVIDIA DeepStream
NVIDIA released the latest version of DeepStream, which adds a new runtime that enables new capabilities and unlocks use cases requiring tight scheduling. Existing DeepStream developers continue to benefit from hardware-accelerated plug-ins while unlocking smart automation and Industry 5.0 use cases.
Updates:
- New accelerated extensions
- New runtime with advanced scheduling options
- Updated accelerated plug-ins
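DeepStream pipelines are built from those GStreamer plug-ins. As a minimal sketch of a single-stream detection pipeline in Python (the video file and inference config paths are placeholders; the element names are DeepStream's standard plug-ins, and the sink assumes an attached display):

```python
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# Decode a file, batch it, run a primary detector, draw boxes, render.
pipeline = Gst.parse_launch(
    "filesrc location=sample_720p.h264 ! h264parse ! nvv4l2decoder ! "
    "m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! "
    "nvinfer config-file-path=config_infer_primary.txt ! "
    "nvvideoconvert ! nvdsosd ! nveglglessink"
)

pipeline.set_state(Gst.State.PLAYING)
bus = pipeline.get_bus()
bus.timed_pop_filtered(Gst.CLOCK_TIME_NONE,
                       Gst.MessageType.EOS | Gst.MessageType.ERROR)
pipeline.set_state(Gst.State.NULL)
```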
For more information, see Get Started with the NVIDIA DeepStream SDK. Try it on LaunchPad.
Add these GTC sessions to your calendar:
- An Intro into NVIDIA DeepStream and AI-streaming Software Tools
- Advancing AI Applications with Custom GPU-Powered Plug-ins for NVIDIA DeepStream
NVIDIA Quantum
NVIDIA announced the latest version of the NVIDIA Quantum platform for accelerating quantum computing simulation, hybrid quantum-classical algorithm development, and hybrid system deployment.
cuQuantum enables the quantum computing ecosystem to solve problems at the scale of future quantum advantage, enabling the development of algorithms and the design and validation of quantum hardware.
cuQuantum highlights:
- Multi-node, multi-GPU support in the DGX cuQuantum Appliance
- Support for approximate tensor network methods
- Continued momentum in cuQuantum adoption, including by CSPs and industrial quantum groups
NVIDIA also unveiled the general availability of NVIDIA CUDA Quantum, an open, QPU-agnostic platform for hybrid quantum-classical computing. This hybrid, quantum-classical programming model is interoperable with today’s most important scientific computing applications, enabling a massive new class of domain scientists and researchers to program quantum computers.
CUDA Quantum highlights:
- Single-source C++ and Python implementations as well as a compiler toolchain for hybrid systems and a standard library of quantum algorithmic primitives
- QPU-agnostic, partnering with quantum hardware companies across a broad range of qubit modalities
- Delivering up to a 300x speedup over a leading Pythonic framework also running on an NVIDIA A100 GPU
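For a flavor of the programming model, here is a minimal sketch that builds and samples a Bell-pair kernel with CUDA Quantum's Python kernel builder (names follow the early CUDA Quantum examples; treat the specifics as assumptions):

```python
import cudaq

# Build a two-qubit Bell-pair kernel with the kernel builder API
kernel = cudaq.make_kernel()
qubits = kernel.qalloc(2)
kernel.h(qubits[0])              # put qubit 0 into superposition
kernel.cx(qubits[0], qubits[1])  # entangle qubits 0 and 1
kernel.mz(qubits)                # measure in the Z basis

# Sample on the current target (GPU-accelerated simulator by default)
counts = cudaq.sample(kernel, shots_count=1000)
print(counts)  # expect roughly half 00 and half 11
```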
At GTC 2023, NVIDIA and Quantum Machines announced DGX Quantum, a partnership that brings together the world’s most powerful accelerated-computing platform with the world’s most advanced quantum controllers. Quantum Machines and NVIDIA will advance the field with a first-of-its-kind architecture for high-performance and low-latency, quantum-classical computing.
DGX Quantum highlights:
- A reference architecture featuring a PCIe-connected OPX+ and NVIDIA Grace Hopper system, scalable with the size of the QPU
- CUDA Quantum integration with QUA and the Quantum Machines stack, featuring proof-of-concept benchmarks
- Announcement of the QCC as the first DGX Quantum deployment, in Q4 2023
For more information, see the NVIDIA CUDA Quantum page.
Add these GTC sessions to your calendar:
- Defining the Quantum-Accelerated Supercomputer
- Inside QODA, the Quantum Optimized Device Architecture
NVIDIA Modulus
NVIDIA Modulus, the platform for developing physics-informed machine learning (physics-ML) models, now includes the data-driven neural operator family of architectures for training global-scale weather prediction models, such as FourCastNet. Modulus builds on the NVIDIA AI software stack to deliver the performance and scaling needed for both AI research and production deployment at industrial scale.
NVIDIA Modulus is available with expanded capabilities to cover different domains and both data-driven and physics-driven approaches. It can solve problems in a broad range of applications from computational fluid dynamics (CFD) and structural analysis to electromagnetics.
It’s available as open source under the permissive Apache 2.0 license.
Along with recipes for developing physics-ML models for reference applications, Modulus is now free to use, develop, and contribute to, whatever your field. It includes open-source repositories that suit different workflows, from native PyTorch developers using modulus-launch to engineers who think in terms of symbolic PDEs using modulus-sym.
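For readers new to physics-ML, the core idea is to penalize a network for violating the governing equations, with derivatives computed by automatic differentiation. The following is a conceptual PyTorch sketch of that idea; it illustrates the technique only and is not the Modulus API, and the 1D Poisson equation and source term are arbitrary examples:

```python
import torch

# Small network mapping x -> u(x)
net = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)

x = torch.rand(256, 1, requires_grad=True)  # collocation points
u = net(x)

# u'(x) and u''(x) via autograd, keeping the graph for backprop
du = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
d2u = torch.autograd.grad(du, x, torch.ones_like(du), create_graph=True)[0]

# Physics residual for u''(x) = f(x) with an arbitrary source term
f = torch.sin(x)
pde_loss = ((d2u - f) ** 2).mean()
pde_loss.backward()  # gradients push the network toward satisfying the PDE
print(float(pde_loss))
```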
Download Modulus source code from the /NVIDIA/modulus GitHub repo.
For more information about Modulus Open Source, see Physics-Informed Machine Learning Platform NVIDIA Modulus Is Now Open Source.
Add this GTC session to your calendar:
- Earth-2 and Digital Technologies for Net Zero
Source: NVIDIA