cuDF pandas Accelerator Mode
ACCELERATING PANDAS WITH ZERO CODE CHANGE
Zero Code Change Acceleration
Write your code with the full flexibility of pandas. Just load cudf.pandas to accelerate on the GPU, with automatic CPU fallback if needed.
One Code Path, Differing Hardware
Develop, test, and run in production with a single code path, regardless of hardware.
Third-Party Library Compatible
pandas accelerator mode is compatible with most third-party libraries that operate on pandas objects — it will even accelerate pandas operations within these libraries.
Accelerates the Entire pandas Ecosystem
Bringing the speed of cuDF to pandas users and the ecosystem built for them. With GPUs, you can keep using pandas as your data grows.
Bringing the Speed of cuDF to Every pandas User
How to Use It
This new mode is available in the standard cuDF package. To accelerate IPython or Jupyter Notebooks, use the magic:
%load_ext cudf.pandas
import pandas as pd
...
To accelerate a Python script, use the Python module flag on the command line:
python -m cudf.pandas script.py
Or, explicitly enable cudf.pandas via import if you can't use command line flags:
import cudf.pandas
cudf.pandas.install()
import pandas as pd
...
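If the same script must also run on machines where cuDF is not installed, the explicit import can be guarded. The following is a minimal sketch rather than an official recipe: the helper name enable_accelerator is hypothetical, and it assumes that a missing cuDF installation surfaces as an ImportError.

```python
import importlib


def enable_accelerator():
    """Try to turn on cudf.pandas.

    Returns True if the accelerator was installed, or False if the
    cudf package cannot be imported on this machine.
    (Hypothetical helper, not part of cuDF.)
    """
    try:
        accelerator = importlib.import_module("cudf.pandas")
    except ImportError:
        # cuDF is not available, so plain pandas will be used as-is.
        return False
    accelerator.install()
    return True


accelerated = enable_accelerator()
print("cudf.pandas active:", accelerated)

# Import pandas only after this point, mirroring the order shown above
# (install first, then "import pandas as pd").
```

The rest of the script stays the same in both cases, which keeps the single code path described earlier.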
150x Faster, Zero Code Change
* Standard DuckDB Data Benchmark (5 GB)
GPU: NVIDIA Grace Hopper, CPU: Intel® Xeon® Platinum 8480C
pandas v2.2, RAPIDS cuDF v23.10
How It Works
Under the Hood
When cudf.pandas is enabled, importing pandas (or any of its submodules) loads a magic module instead of “regular” pandas. This magic module contains proxy types and proxy functions:
In [1]: %load_ext cudf.pandas
In [2]: import pandas as pd
In [3]: pd
Out[3]: <module 'pandas' (ModuleAccelerator(fast=cudf, slow=pandas))>
Operations on proxy types and functions execute on the GPU where possible and on the CPU otherwise, synchronizing under the hood as needed. This applies to pandas operations in both your code and in third-party libraries you may be using.
All cudf.pandas objects are a proxy to either a GPU (cuDF) or CPU (pandas) object at any given time. Operations are first attempted on the GPU (copying from CPU if necessary). If that fails, the operation is attempted on the CPU (copying from GPU if necessary).
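The try-fast-then-fall-back behavior described above can be illustrated with a toy model. This is not how cudf.pandas is implemented: FastBackend, SlowBackend, and Proxy are hypothetical stand-ins for the GPU library, the CPU library, and a proxy object, and "unsupported on the GPU" is modeled as a NotImplementedError.

```python
class FastBackend:
    """Stand-in for the GPU library; it supports only some operations."""

    def total(self, values):
        return sum(values)

    def describe(self, values):
        raise NotImplementedError("describe is not available on the fast path")


class SlowBackend:
    """Stand-in for the CPU library; it supports every operation."""

    def total(self, values):
        return sum(values)

    def describe(self, values):
        return {"min": min(values), "max": max(values)}


class Proxy:
    """Wraps data that lives on exactly one side at any given time."""

    def __init__(self, values):
        self.values = list(values)
        self.location = "slow"  # data starts on the CPU side
        self.trace = []         # records copies and where each operation ran

    def _move_to(self, target):
        # Copy only when the data is not already where it is needed.
        if self.location != target:
            self.trace.append(f"copy {self.location}->{target}")
            self.location = target

    def call(self, op):
        try:
            # First attempt: the fast path, copying over if necessary.
            self._move_to("fast")
            result = getattr(FastBackend(), op)(self.values)
            self.trace.append(f"{op}: fast")
        except NotImplementedError:
            # Fallback: copy back and run the same operation on the slow path.
            self._move_to("slow")
            result = getattr(SlowBackend(), op)(self.values)
            self.trace.append(f"{op}: slow")
        return result


p = Proxy([3, 1, 2])
total = p.call("total")        # supported on the fast path
summary = p.call("describe")   # falls back to the slow path
print(p.trace)
```

In this sketch the trace shows one copy in each direction: the data moves to the fast side for the first operation and back to the slow side only when an operation is unsupported there.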
When using cudf.pandas, cuDF's pandas compatibility mode is automatically enabled, ensuring consistency with pandas-specific semantics like default sort ordering.
Execution Flow
Try Now on Colab
Take cuDF's new pandas accelerator mode for a test drive in a free GPU-enabled notebook environment, using your Google account, by launching it on Colab.
Prefer to use your own GPU? Get started with 10 Minutes to cudf.pandas, or check out our full collection of RAPIDS community notebooks.
cuDF's pandas accelerator mode is part of the cuDF package and works smoothly with all RAPIDS libraries. Visit the RAPIDS Quick Start to get started with any RAPIDS library on your favorite platform.
Learn More
cuDF pandas accelerator mode is now Generally Available (GA) and ready for wide use. You can learn more through the documentation and the release blog.
Prefer conference presentations to documentation? Get all the details about how cudf.pandas works, how we test it, and why we developed it at the AI and Data Science Virtual Summit.
Want to contribute or share feedback? Reach out on GitHub.