Revise the cuML docs for 25.10 by csadorf · Pull Request #7228 · rapidsai/cuml

csadorf · 2025-09-16T19:50:15Z

Overview

Major revision of the cuML introduction and user guide documentation, as well as the cuml.accel example notebooks.

Key Changes

Documentation Overhaul

Complete revision of main pages:
- index.rst: Complete revision with improved structure, mention of key performance metrics, quick start guide, and feature highlights
- cuml_intro.rst: Major restructuring around three core principles with detailed explanations and code examples
- user_guide.rst: Add reference to cuml.accel zero-code-change acceleration to avoid confusion on overview page
- estimator_intro.ipynb: Major revision of the estimator introduction user guide
- pickling_cuml_models.ipynb: Major revision of the serialization user guide, including documentation of as_sklearn/from_sklearn
- FIL.rst: Major revision of the FIL documentation page
Expanded cuml.accel example notebooks:
- getting_started.ipynb (481 lines): Added comprehensive guide covering classification, clustering, and dimensionality reduction with real-world datasets based on the Kaggle notebook
- profiling.ipynb (384 lines): Detailed profiling and debugging guide with function and line profiler examples
- plot_kmeans_digits.ipynb: Updated title for consistency

Code Changes

Profiler styling support: Added CUML_ACCEL_PROFILER_STYLE environment variable to control profiler appearance in different environments (essential for dark mode documentation rendering)
Configuration updates: Updated conf.py to override default cuml.accel profiler style
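
For illustration, here is a minimal sketch of how an environment variable like this can feed a rich style, modeled on the `Style.parse(os.getenv("CUML_ACCEL_PROFILER_STYLE", ""))` line in this PR's diff. The helper name `profiler_base_style`, the `"white on black"` value, and the sample text are assumptions made for the example. They are not taken from the cuML code base, and the style handling in the merged version may differ.

```python
import os

from rich.style import Style
from rich.text import Text


def profiler_base_style(environ=None):
    """Return the rich Style named by CUML_ACCEL_PROFILER_STYLE, if any.

    Hypothetical helper for illustration; not part of cuML.
    """
    environ = os.environ if environ is None else environ
    # An unset or empty variable parses to rich's null style (no overrides).
    return Style.parse(environ.get("CUML_ACCEL_PROFILER_STYLE", ""))


# Assumed value that a docs build might set for dark-mode rendering.
dark = profiler_base_style({"CUML_ACCEL_PROFILER_STYLE": "white on black"})

# With nothing set, the result is the null style, so defaults are untouched.
default = profiler_base_style({})

# Apply the resolved style to some renderable text (illustrative only).
sample = Text("Function    GPU calls    CPU calls", style=dark)
```

Passing a mapping instead of reading `os.environ` directly keeps the sketch easy to test. A value that rich cannot parse raises `rich.errors.StyleSyntaxError`, so a real implementation might want to catch that and fall back to the default.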

copy-pr-bot · 2025-09-16T19:50:18Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

csadorf · 2025-09-16T19:57:23Z

/ok to test 26e948f

csadorf · 2025-09-17T17:26:11Z

/ok to test 2ca5ab2

csadorf · 2025-09-17T17:54:21Z

/ok to test 1f11be6

csadorf · 2025-09-17T17:56:03Z

/ok to test b3dcbf7

csadorf · 2025-09-17T18:41:43Z

/ok to test db63511

review-notebook-app · 2025-09-17T21:53:56Z

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

csadorf · 2025-09-19T22:15:50Z

/ok to test 94315c3

betatim

First pass. Generally looks good.

Did you mean to change the profiler stuff, or is that leftover code?

csadorf · 2025-09-26T14:29:22Z

> Did you mean to change the profiler stuff, or is that leftover code?

Yes, I'm motivating this in the PR description.

trivialfis

The new doc looks very nice!

A few comments:

- You can use syntax like :py:meth:`xgboost.ForestInference.apply` to add a link to the API reference.
- Not sure why ForestInference.load needs an is_classifier for XGB models; it should be possible to infer this from the objective function used.

viclafargue

Great work! Just have a few suggestions for the getting started cuml-accel notebook.

viclafargue · 2025-09-29T08:41:19Z

+      "source": [
+        "kde = KernelDensity(kernel='gaussian', bandwidth=0.5)\n",
+        "kde.fit(X)\n"
+      ]


I don't know if the example proves anything to the user here since the heavy lifting is done in the background. Maybe showing that GPU inputs (like CuPy arrays) are correctly ingested might be better.

True, I've adjusted the example in ae8804a5a014834096a37bf2edc4e361a42ee185.

betatim · 2025-09-29T12:36:31Z

Thanks for all the fixes Simon!

jcrist

A few comments, but mostly LGTM!

jcrist · 2025-09-29T21:37:14Z

@@ -0,0 +1,384 @@
+{


Can you comment on why you think we need an example notebook for this? From my read, this seems fairly duplicative of the existing docs on profiling and logging (and much of this notebook feels directly lifted from there).

I'd rather avoid duplicating content in multiple places. Having multiple pages on the same thing means updates need to happen in multiple places, and also doesn't leave a canonically clear page to refer users to. IMO we should drop this notebook entirely.

We can certainly work on de-duplication, but there is value in having the same content presented in different ways. And I think it's perfectly fine if that means that there is some duplication. The notebooks are not only rendered as part of our docs, but can also be downloaded and directly executed. Having fully-functional examples (i.e. tutorial-style content) is different from an overview guide or reference documentation, because they serve different purposes.

To be clear, I am not claiming that the current duplication or distribution of content is optimal, but I do not consider full deduplication a critical factor in documentation.

jcrist · 2025-09-29T21:38:48Z


        console = Console()

+        base_style = Style.parse(os.getenv("CUML_ACCEL_PROFILER_STYLE", ""))


In my local testing the profiler renders fine in a dark and light theme as is (at least using the default themes that ship with jupyter). That said, if our docs dark theme doesn't work for it I'd rather adjust our default theme to work better across all environments than special case this.

If we decide to keep the profiling notebook, mind if I push up a fix that adjusts the style handling here?

Yes, I'd prefer to keep the notebook. ~~Feel free to create a PR into mine with an adjustment to the default theme, but I'd prefer if you did not directly push to this branch.~~

Edit: I changed my mind on this, feel free to push directly to this branch.

I've pushed an update. This required two changes:

- We now use a simpler color scheme for highlighting code in the line profiler. This palette works much better across light and dark themes, and generally mirrors the one used by Jupyter.
- We squash a CSS tweak added by pydata-sphinx-theme that sets a background on HTML output cells in dark mode notebooks. This doesn't seem beneficial in any of our example notebooks, and was the main source of things not rendering nicely. Both the sklearn rich reprs and our profiler output cells now look much nicer in the rendered notebooks.

I've inspected the outputs of these changes in both light and dark terminals, light and dark notebooks, and all rendered notebooks in our docs in both light and dark mode. I think this change is strictly beneficial.

Looks perfect in the preview. Awesome!

csadorf

@viclafargue Thanks for the feedback. I've addressed your comments in 1ed88ad.

csadorf · 2025-09-30T01:22:11Z

+      "source": [
+        "kde = KernelDensity(kernel='gaussian', bandwidth=0.5)\n",
+        "kde.fit(X)\n"
+      ]


True, I've adjusted the example in ae8804a5a014834096a37bf2edc4e361a42ee185.

csadorf · 2025-09-30T01:31:31Z

> The new doc looks very nice!

Thanks! :)

> A few comments:
>
> You can use syntax like :py:meth:`xgboost.ForestInference.apply` to add a link to the API reference.

Are you referring to a specific instance where we are not doing that or is that just a general suggestion?

> Not sure why ForestInference.load needs an is_classifier for XGB models; it should be possible to infer this from the objective function used.

Good question, but as of right now it is needed. Maybe something that we can improve in a future API revision? CC @hcho3

jcrist

csadorf · 2025-09-30T21:00:13Z

/merge

github-actions Bot assigned csadorf Sep 16, 2025

csadorf added the doc and non-breaking labels Sep 16, 2025

csadorf changed the title ~~Expand the cuML docs landing page.~~ [DO NOT MERGE] Improve the cuML docs Sep 16, 2025

csadorf force-pushed the docs/issue-7096 branch from 26e948f to 2ca5ab2 Compare September 17, 2025 17:25

csadorf force-pushed the docs/issue-7096 branch from 1f11be6 to b3dcbf7 Compare September 17, 2025 17:55

csadorf force-pushed the docs/issue-7096 branch from b3dcbf7 to db63511 Compare September 17, 2025 18:41

csadorf force-pushed the docs/issue-7096 branch from 6b054f8 to 2568644 Compare September 19, 2025 21:31

betatim reviewed Sep 22, 2025

Comment thread docs/source/cuml_intro.rst

Comment thread docs/source/cuml_intro.rst

This was linked to issues Sep 24, 2025

Add docs page on sklearn interop #7206

Closed

Improve the cuml.accel documentation #7096

Closed

csadorf changed the title ~~[DO NOT MERGE] Improve the cuML docs~~ Improve the cuML docs Sep 24, 2025

csadorf added 10 commits September 24, 2025 14:30

Expand the cuML docs landing page.

dddf841

revert this: reduce PR CI to what's needed for docs build

326197a

remove open-source note

43294ef

expand on system reqs

fdd6355

minor fixups on the cuml intro page

07f5103

improve the intro page

1fef01b

minor fixups to the notebooks

59f5f90

use more consistent naming scheme and imports in the notebooks

d3bfac7

comment out line that is supposed to be commented out

3d8e2e3

apply black formatting to notebooks where sensible

9bed046

betatim reviewed Sep 26, 2025

csadorf added 13 commits September 26, 2025 09:43

Revise intro paragraphs on landing page.

a596369

Improve generic code comment on intro page

69ef1b9

Revise the intro api example code.

817c28b

Revise note on api compatibility on intro page.

ef77e33

Improve the "be fast" intro section.

61b3176

Revise the estimator intro notebook.

10a20b2

Revise the pickling notebook.

cb1e540

update pickling notebook metadata and make executable

0189847

Further revise the pickling notebook

0b8ba7c

Refactor code examples in estimator intro notebook.

0184eaa

Improve clarity and consistency, replacing variable assignments with direct values and standardizing prediction variable names.

explain n_parts choice

c455296

do not persist dask input data

6169f17

Revise the FIL landing page.

e885a51

trivialfis reviewed Sep 28, 2025

viclafargue reviewed Sep 29, 2025

csadorf mentioned this pull request Sep 29, 2025

[CI] scikit-learn integration test segfault (flaky) #7274

Closed

jcrist reviewed Sep 29, 2025

csadorf added 2 commits September 29, 2025 16:57

Merge branch 'branch-25.10' into docs/issue-7096

2717132

Improve the cuml.accel getting started notebook.

1ed88ad

csadorf commented Sep 30, 2025

jcrist added 2 commits September 30, 2025 11:56

Improve default profiler theme

9e8c458

The new theme works well in dark and light environments in both consoles and notebooks.

Tweak css

5e390ab

This overrides an override set by pydata-sphinx-theme to avoid adding a background to html outputs in dark theme.

jcrist approved these changes Sep 30, 2025

rapids-bot Bot merged commit e5adc43 into rapidsai:branch-25.10 Sep 30, 2025
101 checks passed

csadorf deleted the docs/issue-7096 branch October 1, 2025 00:48

