Even Faster and More Scalable UMAP on the GPU with RAPIDS cuML

Workflow diagram for disaggregating data into clusters and running in batches and merging.

UMAP is a popular dimension reduction algorithm used in fields like bioinformatics, NLP topic modeling, and ML preprocessing. It works by creating a k-nearest…

UMAP is a popular dimension reduction algorithm used in fields like bioinformatics, NLP topic modeling, and ML preprocessing. It works by creating a k-nearest neighbors (k-NN) graph, which is known in literature as an all-neighbors graph, to build a fuzzy topological representation of the data, which is used to embed high-dimensional data into lower dimensions. RAPIDS cuML already contained…

Source

Source:: NVIDIA