RAPIDS 23.10 Release

RAPIDS Goes CPU/GPU (zero code change!), XGBoost 2.0, Improved Vector Search, and more

Nick Becker

The RAPIDS 23.10 release takes a huge step forward in breaking down barriers to bringing accelerated computing to the data science community. RAPIDS now enables a zero code change CPU/GPU user experience for dataframes, graph analytics, and machine learning. This release also introduces XGBoost 2.0 and makes substantial improvements to accelerated vector search and text processing for LLMs.

Table of Contents

  • RAPIDS Goes CPU/GPU
  • XGBoost 2.0
  • Accelerated Vector Search and Text Processing

RAPIDS Goes CPU/GPU

On November 8th, NVIDIA hosted the AI and Data Science Virtual Summit, bringing together experts from across the community to discuss how accelerated computing is advancing data science.

At the Summit, we announced major enhancements to RAPIDS cuDF, cuGraph, and cuML that bring a unified, zero code change CPU/GPU experience to dataframe, graph analytics, and machine learning workflows. Below, we summarize these key enhancements. For those interested in learning more, we encourage you to register for free for the Summit to watch the session replays and stay tuned for in-depth blogs in the future.

cudf.pandas

Pandas is the quintessential tool for data scientists today. With 9.5 million users, it’s the most popular library in Python for working with tabular data. But it becomes slow as dataset sizes grow into the gigabytes.

At the Summit, we announced that cuDF’s new pandas accelerator mode (cudf.pandas) solves this problem, bringing the speed of cuDF to every pandas workflow with zero code change required. It works with most third-party libraries that operate on pandas objects and will accelerate pandas operations within these libraries, too. Just load cudf.pandas to accelerate your workflow on the GPU, with automatic CPU fallback if needed.

This new mode is available in the standard cuDF package. To accelerate IPython or Jupyter Notebooks, use the magic:

%load_ext cudf.pandas
import pandas as pd

To accelerate a Python script, use the Python module flag on the command line:

python -m cudf.pandas script.py

cudf.pandas is designed to accelerate workflows where pandas struggles with performance, so the 5 GB scale DuckDB/H2O.ai Database-like Ops Benchmark perfectly illustrates the impact of GPU-accelerating your pandas code. The benchmark requires executing a variety of common merge and groupby operations.

By just “flipping the switch”, we can turn minutes of processing into just one or two seconds.
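To make the zero code change idea concrete, here is a minimal sketch of the kind of merge and groupby code the benchmark exercises. The data, column names, and the revenue_by_region helper are invented for illustration and are not taken from the benchmark itself. The file only imports pandas, so the same script could be run with python script.py on the CPU or with python -m cudf.pandas script.py to request GPU acceleration. The try/except around the import is only there so the sketch still loads on a machine without pandas.

```python
# script.py: ordinary pandas code; nothing in this file mentions cuDF.
try:
    import pandas as pd
except ImportError:  # lets the sketch load where pandas is not installed
    pd = None


def revenue_by_region(orders, customers):
    """Join orders to customers, then total the order amounts per region."""
    merged = orders.merge(customers, on="customer_id", how="inner")
    return merged.groupby("region")["amount"].sum().to_dict()


def main():
    # Hypothetical toy tables standing in for a multi-gigabyte dataset.
    orders = pd.DataFrame(
        {"customer_id": [1, 2, 1, 3], "amount": [10.0, 20.0, 5.0, 7.5]}
    )
    customers = pd.DataFrame(
        {"customer_id": [1, 2, 3], "region": ["east", "west", "east"]}
    )
    print(revenue_by_region(orders, customers))


if __name__ == "__main__" and pd is not None:
    main()
```

On data this small the GPU brings no benefit; the point is only that the script is identical in both modes.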

You can learn more about this benchmark in the cuDF documentation. And you can learn more about how cudf.pandas works at rapids.ai/cudf-pandas and test-drive the introductory notebook in a free GPU-enabled notebook environment on Google Colab.

nx-cuGraph

With more than 27 million downloads per month, NetworkX is the go-to library for graph analytics in Python thanks to its ease of use, wide selection of algorithms, and fantastic community.

But as datasets and graph problem sizes grow, the performance of NetworkX’s pure Python implementation becomes a significant hurdle, forcing users to abandon their favorite library or wait potentially hours for results. Over the past year, we’ve been collaborating with the NetworkX community to develop backend dispatching capabilities that can address these challenges.

We’re excited to share that cuGraph can now be used as a backend for NetworkX through our nx-cugraph package, enabling NetworkX users to GPU-accelerate their workflows with zero code change.

Just set an environment variable and your workflow will use cuGraph if it’s available and the algorithm is supported, falling back to standard CPU-based NetworkX otherwise.

export NETWORKX_AUTOMATIC_BACKENDS=cugraph
python my_nx_app.py

That’s it! Your NetworkX code will now use the GPU for all supported algorithms, enabling you to access up to 600x speedups when processing graphs like the US patent citation network dataset with 3.7 million nodes and 16.5 million edges.
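The "use the accelerated backend if it supports the call, otherwise fall back" behavior described above can be pictured with a small, library-free sketch. This is a conceptual illustration only: with_fallback, fast_degree, and reference_degree are hypothetical names, and the code is not how NetworkX or nx-cugraph implement dispatching.

```python
from collections import Counter


def with_fallback(accelerated, reference):
    """Build a callable that prefers `accelerated` and falls back to `reference`."""

    def call(*args, **kwargs):
        if accelerated is not None:
            try:
                result = accelerated(*args, **kwargs)
                call.last_backend = "accelerated"
                return result
            except NotImplementedError:
                pass  # the fast backend does not support this input
        call.last_backend = "reference"
        return reference(*args, **kwargs)

    call.last_backend = None
    return call


def reference_degree(edges):
    """Pure-Python node degrees; accepts any hashable node labels."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return degree


def fast_degree(edges):
    """Stand-in for an accelerated backend that only handles integer node ids."""
    if not all(isinstance(node, int) for edge in edges for node in edge):
        raise NotImplementedError("only integer node ids are supported")
    return dict(Counter(node for edge in edges for node in edge))


degree = with_fallback(fast_degree, reference_degree)

print(degree([(0, 1), (1, 2)]), degree.last_backend)  # handled by the fast path
print(degree([("a", "b")]), degree.last_backend)      # falls back to the reference
```

The caller sees one function and one result type either way, which is what lets existing code run unchanged.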

nx-cugraph currently includes support for three algorithms and we aim to support 12 algorithms in the 23.12 release.

You can install nx-cugraph using either conda or pip:

pip install nx-cugraph-cu11 --extra-index-url https://pypi.nvidia.com
conda install -c rapidsai -c conda-forge -c nvidia nx-cugraph

To learn more about nx-cugraph and this benchmark, please visit this in-depth blog.

cuML CPU/GPU

In this release, we’ve significantly expanded cuML’s CPU/GPU capabilities. The majority of cuML estimators now support both GPU-based and CPU-based execution, with zero code change required to switch between them. A subset of estimators even supports exporting models across hardware, enabling you to train on one type of hardware and run inference on another.

Now, you can prototype using cuML in your workflows even on systems without access to a GPU. When you’re ready, take the same code and run it on a GPU-enabled system to tap into the power of accelerated computing:

import cuml # no change is needed even for the import!
from cuml.manifold.umap import UMAP

X, y = …

# define the cuml UMAP model and use fit_transform function to obtain the
# low dimensional representation of the input dataset
embeddings = UMAP(
    n_neighbors=10, min_dist=0.01, init="random"
)

transformed_embeddings = embeddings.fit_transform(X)

To get started prototyping with cuML on CPU-only machines, you can install via conda:

conda install -c rapidsai -c nvidia -c conda-forge cuml-cpu=23.10

To learn more about these capabilities, please visit the cuML on GPU and CPU documentation.

XGBoost 2.0

In partnership with the XGBoost community, we released XGBoost 2.0 in September! This is a huge milestone for the project and a testament to the incredible group of contributors and users driving the project forward over the past few years. Today, XGBoost is downloaded more than 2.5 million times per week and is used across nearly every industry.

XGBoost 2.0 is chock full of huge improvements to both performance and user experience, but we’ll spotlight several below.

Unified GPU interface with a single device parameter

Using a GPU for different tasks within XGBoost historically required setting a variety of parameters and values, such as tree_method (gpu_hist), predictor (gpu_predictor), gpu_id, and more. Now, all of these capabilities are configurable with a single, simple device parameter that controls CPU or GPU execution.
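As a rough before-and-after sketch, the two helpers below build the parameter dictionaries for each style. The helper names are hypothetical, and the values ("hist", "cpu", "cuda:<ordinal>") reflect our reading of the XGBoost 2.0 documentation, so check them against the version you have installed.

```python
def legacy_gpu_params(gpu_id=0):
    """Parameters typically combined for GPU training before XGBoost 2.0."""
    return {
        "tree_method": "gpu_hist",
        "predictor": "gpu_predictor",
        "gpu_id": gpu_id,
    }


def unified_params(use_gpu, ordinal=0):
    """XGBoost 2.0 style: one `device` value selects CPU or GPU execution."""
    return {
        "tree_method": "hist",
        "device": f"cuda:{ordinal}" if use_gpu else "cpu",
    }


params = unified_params(use_gpu=True)
# With XGBoost installed, this dict would be passed straight to training, e.g.
#   booster = xgboost.train(params, dtrain)
print(params)
```

Switching between CPU and GPU then becomes a one-value change rather than a coordinated edit of several parameters.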

Quantile Regression

XGBoost now supports quantile regression, a popular technique used for probabilistic forecasting scenarios in which you care about parts of the distribution beyond just the conditional mean. This may be particularly valuable for use cases in which outcomes in the tails of the distribution are particularly impactful or important, such as in supply chain forecasting.

The quantile loss (also known as pinball loss) is supported on both CPUs and GPUs.
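For readers new to the pinball loss, the sketch below computes it in plain Python. For a target quantile q, each observation contributes max(q * (y - y_hat), (q - 1) * (y - y_hat)), so under-prediction and over-prediction are penalized asymmetrically. This is an illustration of the formula rather than XGBoost's implementation, and the demand figures are made up. In XGBoost 2.0 the objective is, as far as we know, selected with reg:quantileerror together with a quantile_alpha parameter.

```python
def pinball_loss(y_true, y_pred, quantile):
    """Mean pinball (quantile) loss for a target quantile in (0, 1)."""
    if not 0.0 < quantile < 1.0:
        raise ValueError("quantile must lie strictly between 0 and 1")
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for actual, predicted in zip(y_true, y_pred):
        error = actual - predicted
        # Under-prediction (error > 0) costs quantile * error;
        # over-prediction (error < 0) costs (1 - quantile) * |error|.
        total += max(quantile * error, (quantile - 1.0) * error)
    return total / len(y_true)


# Hypothetical demand figures: forecast 10 units too low, then 10 units too high.
demand = [100.0, 120.0, 90.0]
under = pinball_loss(demand, [d - 10.0 for d in demand], quantile=0.9)
over = pinball_loss(demand, [d + 10.0 for d in demand], quantile=0.9)
print(under, over)  # at the 0.9 quantile, under-forecasting is penalized more
```

That asymmetry is what makes the loss useful when one tail matters more than the other, as in the supply chain example above.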

PySpark Interface

The official XGBoost PySpark interface is now much more mature and ready for wider use! With support for GPU-based training and predictions, improved logging, performance optimizations, computing SHAP-based feature contributions, and more, we’re excited to see what the Apache Spark community creates!

To learn more, visit the XGBoost Python documentation.

Accelerated Vector Search and Text Processing

Accelerated Vector Search with RAFT

The 23.10 release brings substantial performance and functional enhancements to CAGRA, the GPU-accelerated, graph-based approximate nearest neighbors technique with world-class performance for large batch queries, single queries, and graph construction time.

Pre-Filtering

Pre-filtering allows removing irrelevant records before querying our vector database index to ensure we return high quality results. As the team at Pinecone noted, “Pre-filtering is excellent for returning relevant results, but significantly slows-down our search…”

With support for pre-filtering now available, CAGRA can address these CPU-based performance challenges and enable returning high quality results.
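To show why filtering before the search matters for result quality, here is a tiny brute-force sketch on the CPU. It is not the CAGRA or RAFT API; the function names, vectors, and metadata are all hypothetical. It compares filtering first with filtering only after the top k have been chosen.

```python
import math


def search_prefiltered(query, vectors, metadata, keep, k):
    """Filter first, then rank only the surviving vectors by distance."""
    allowed = [i for i, meta in enumerate(metadata) if keep(meta)]
    allowed.sort(key=lambda i: math.dist(query, vectors[i]))
    return allowed[:k]


def search_postfiltered(query, vectors, metadata, keep, k):
    """Rank everything, take the top k, then drop records that fail the filter."""
    ranked = sorted(range(len(vectors)), key=lambda i: math.dist(query, vectors[i]))
    return [i for i in ranked[:k] if keep(metadata[i])]


def is_english(meta):
    return meta["lang"] == "en"


vectors = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.0, 5.0), (6.0, 6.0)]
metadata = [
    {"lang": "fr"}, {"lang": "fr"}, {"lang": "fr"},
    {"lang": "en"}, {"lang": "en"},
]
query = (0.0, 0.0)

print(search_prefiltered(query, vectors, metadata, is_english, k=2))
print(search_postfiltered(query, vectors, metadata, is_english, k=2))
```

With these toy inputs the pre-filtered search returns two matching records, while the post-filtered search returns none because the two nearest vectors both fail the filter.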

Nearest Neighbor Descent

Building vector indexes is a computationally challenging problem well suited to GPUs, and nearest neighbor descent (nn-descent) is a state-of-the-art technique for iteratively constructing a k-nearest neighbors graph. Our graph-based CAGRA algorithm can also use nn-descent to build its initial graph, speeding up that construction step by more than 10x compared to using IVF-PQ.
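The core idea behind nearest neighbor descent is that a neighbor of a neighbor is likely to be a neighbor too. The sketch below is a heavily simplified, single-threaded CPU version of that idea, written only to illustrate the iteration. It is not RAFT's implementation, and it omits the sampling and bookkeeping that the published algorithm uses to stay efficient. The function and variable names are our own.

```python
import math
import random


def nn_descent(points, k, max_iters=10, seed=0):
    """Approximate k-nearest-neighbor graph via a simplified nn-descent loop."""
    rng = random.Random(seed)
    n = len(points)
    # neighbors[v] maps neighbor index -> distance and always holds k entries.
    neighbors = []
    for v in range(n):
        others = rng.sample([u for u in range(n) if u != v], k)
        neighbors.append({u: math.dist(points[v], points[u]) for u in others})

    def try_insert(v, u):
        """Replace v's worst neighbor with u if u is strictly closer."""
        if u == v or u in neighbors[v]:
            return False
        d = math.dist(points[v], points[u])
        worst = max(neighbors[v], key=neighbors[v].get)
        if d < neighbors[v][worst]:
            del neighbors[v][worst]
            neighbors[v][u] = d
            return True
        return False

    for _ in range(max_iters):
        # Snapshot of forward plus reverse neighbors for every point.
        joined = [set(neighbors[v]) for v in range(n)]
        for v in range(n):
            for u in neighbors[v]:
                joined[u].add(v)
        updates = 0
        for v in range(n):
            local = sorted(joined[v])
            # Local join: introduce every pair of v's neighbors to each other.
            for i, a in enumerate(local):
                for b in local[i + 1:]:
                    updates += try_insert(a, b)
                    updates += try_insert(b, a)
        if updates == 0:
            break  # converged: no neighbor list changed in this pass
    return [sorted(nb, key=nb.get) for nb in neighbors]


points = [(float(i),) for i in range(12)]
graph = nn_descent(points, k=2)
print(graph[5])  # approximate nearest neighbors of the point at x = 5
```

Each accepted update replaces a neighbor with a strictly closer one, so the graph can only improve from its random starting point. The result is approximate and is not guaranteed to match the exact nearest neighbors.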

Faster JSON parsing with cuDF and Dask

Training large language models can require processing terabytes of text data, often representing essentially the entire internet! cuDF and Dask have emerged as a great combination for efficiently processing the documents in these training pipelines.

In the 23.10 release, we’ve made algorithmic enhancements to the cuDF JSON reader that improve performance for reading the types of files common in large language model training pipelines. In our benchmarks, we now observe end-to-end read throughput of up to 70% of the theoretical limit with H100 GPUs reading from local NVMe drives.

If you’re using cuDF as part of your LLM training pipeline, you can expect to immediately see performance gains.

At the AI and Data Science Summit, we put forward a commitment to meeting data scientists where they are today. The RAPIDS 23.10 release is a major milestone in bringing accelerated computing into the day-to-day workflows of data scientists and engineers.

Pandas and NetworkX users can now GPU-accelerate their code with zero code change. cuML now provides a large suite of CPU/GPU capabilities. XGBoost 2.0 dramatically simplifies using GPUs. And we’re continuing to improve vector search and text processing to empower emerging technologies like LLMs.

To get started using RAPIDS, visit the Quick Start Guide.
