Improve Guidance and Performance Visualization with the New Nsight Compute

Learn more about new features and ways to improve system performance using Nsight Compute 2022.2

NVIDIA Nsight Compute is an interactive kernel profiler for CUDA applications. It provides detailed performance metrics and API debugging through a user interface and a command-line tool. Nsight Compute 2022.2 includes features to expand the supported environments and workflows for CUDA kernel profiling and optimization. 

Download now.

The following outlines the feature highlights of Nsight Compute 2022.2.

NVIDIA OptiX acceleration structure viewer

With the new NVIDIA OptiX acceleration structure viewer, users can inspect the structures they build before launching a ray-tracing pipeline. Acceleration structures describe a rendered scene’s geometries for ray-tracing intersection calculations. Users create these acceleration structures, and OptiX translates them to internal data structures. Writing this description is error-prone, and when something goes wrong it can be difficult to understand why the rendered result is not as expected or what is limiting performance.

With this new feature, users can navigate through their acceleration structures in a 3D visualizer and view the parameters used during their creation, such as build flags, triangle mesh vertices, and AABB (axis-aligned bounding box) coordinates. The viewer helps identify overlaps or inefficient hierarchies that result in subpar ray-tracing performance.
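To make the idea of "overlaps" concrete, here is a minimal Python sketch that reports pairs of overlapping AABBs. It is not part of Nsight Compute or OptiX and does not reflect how the viewer works internally; the AABB class, the function names, and the example boxes are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class AABB:
    """Axis-aligned bounding box given by its min and max corners (x, y, z)."""
    lo: tuple
    hi: tuple


def overlap_volume(a: AABB, b: AABB) -> float:
    """Volume of the intersection of two boxes; 0.0 if disjoint or only touching."""
    volume = 1.0
    for axis in range(3):
        extent = min(a.hi[axis], b.hi[axis]) - max(a.lo[axis], b.lo[axis])
        if extent <= 0.0:
            return 0.0
        volume *= extent
    return volume


def overlapping_pairs(boxes):
    """Return (i, j, volume) for every pair of boxes with a non-zero overlap."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(boxes), 2):
        v = overlap_volume(a, b)
        if v > 0.0:
            pairs.append((i, j, v))
    return pairs


# Made-up example boxes, for illustration only.
boxes = [
    AABB((0.0, 0.0, 0.0), (2.0, 2.0, 2.0)),
    AABB((1.0, 1.0, 1.0), (3.0, 3.0, 3.0)),  # overlaps box 0 in a 1x1x1 region
    AABB((5.0, 5.0, 5.0), (6.0, 6.0, 6.0)),  # disjoint from both
]
print(overlapping_pairs(boxes))
```

Heavily overlapping boxes tend to force a ray to test more candidates, which is why spotting them visually can point at a performance problem. The pairwise check here is quadratic in the number of boxes and is meant for reasoning about small examples, not for large scenes.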

Figure 1. Nsight Compute acceleration structure viewer with 3D scene navigation

Issues detection per kernel

The latest version adds a new “Issues Detected” column to the summary page so users can sort all profiled kernels by the number of performance issues detected. This gives users guidance on where to focus their efforts across multiple results (kernel profiles). If users are unsure which kernel to optimize first, a long-running kernel with a high number of detected issues is a good starting point.
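The "long-running kernel with many detected issues" heuristic can be sketched in a few lines of Python. This is only one possible ordering, under the assumption that duration matters first and issue count breaks ties; the KernelResult class, the kernel names, and the numbers are hypothetical placeholders, not output from Nsight Compute.

```python
from dataclasses import dataclass


@dataclass
class KernelResult:
    name: str             # hypothetical kernel name
    duration_us: float    # kernel duration in microseconds
    issues_detected: int  # value shown in the "Issues Detected" column


def rank_for_optimization(results):
    """Keep kernels with at least one issue, longest-running first.

    Ties in duration are broken by the higher issue count.
    """
    flagged = [r for r in results if r.issues_detected > 0]
    return sorted(
        flagged,
        key=lambda r: (r.duration_us, r.issues_detected),
        reverse=True,
    )


# Made-up values, for illustration only.
results = [
    KernelResult("reduce_sum", 120.0, 1),
    KernelResult("matmul_tiled", 950.0, 4),
    KernelResult("copy_halo", 15.0, 5),
    KernelResult("normalize", 950.0, 2),
    KernelResult("fill_zero", 400.0, 0),
]

for r in rank_for_optimization(results):
    print(f"{r.name}: {r.duration_us} us, {r.issues_detected} issues")
```

Sorting by duration first reflects that fixing issues in a short kernel rarely changes total run time much, even when its issue count is high. Other weightings are equally defensible.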

Figure 2. Issues detected column in summary page identifies kernels with the most performance issues

Additional improvements

The metric grouping and selection options on the source page have been improved to make them easier to use. Additionally, this release adds support for running the Nsight Compute user interface on Arm SBSA and L4T-based platforms, so users can profile without needing remote connections or separate host machines for the user interface.

Check out the sessions below, released at NVIDIA GTC 2022, showcasing Nsight tool capabilities, support for Jetson Orin, and more.

  • How To Understand and Optimize Shared Memory Accesses using Nsight Compute
  • What, Where, and Why? – Use CUDA Developer Tools to Detect, Locate, and Explain Bugs and Bottlenecks 
  • Orin Developer Tools: The Next Frontier

Nsight Compute Resources

  • Learn more and download
  • Documentation
  • Developer forums
  • Additional videos and blog posts

Source: NVIDIA