Add YAML-based benchmark definitions for cuML regression runs #7980
dantegd wants to merge 10 commits into rapidsai:main from
Conversation
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here.
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the CodeRabbit settings.
📝 Walkthrough

Adds YAML-driven benchmark manifests and end-to-end config handling: new CLI args (…)

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed

❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@python/cuml/cuml/benchmark/config.py`:
- Around line 593-607: The algorithm filter in _apply_algorithm_filter currently
does exact matches, so inputs like "logisticregression" that differ only in case
fail to match; make matching case-insensitive by normalizing both sides (use
str.casefold() for robustness) — build wanted = set(s.casefold() for s in
algorithm_filter) and test e["algorithm"].casefold() in wanted when filtering
benchmark_entries, and keep the existing BenchmarkConfigError behavior (you may
include the original algorithm_filter value in the error text if desired). A
sketch appears after this list.
- Around line 577-590: The _apply_profile_selection function currently returns
an empty list when a selected_profile filters out all benchmark_entries; update
it to raise an exception instead of silently returning nothing: after computing
include_tags and the filtered list (using benchmark_entries, profiles,
selected_profile, include_tags), if the resulting list is empty raise a
ValueError (or a custom exception) with a clear message that includes the
selected_profile and the include_tags so CI fails fast on mistagged/overly
narrow profiles.
- Around line 379-390: The numeric validators currently accept bools because
bool is a subclass of int; update the checks to explicitly reject booleans
before the isinstance checks: for the `version` validation (around
symbol/version check), modify the guard to first check `isinstance(value, bool)`
and raise BenchmarkConfigError if True, then continue with the existing
`isinstance(..., int)` or `isinstance(..., (int, float))` checks; do the same in
the loop that validates `n_reps` and `random_state` and the `test_split` check,
and also update `_normalize_int_list()` and `_normalize_shapes()` to reject
bools (check `isinstance(item, bool)` and raise) before allowing int/float
normalization so boolean values like True/False are not treated as 1/0.
In `@python/cuml/cuml/benchmark/run_benchmarks.py`:
- Around line 390-392: The bug is that run_benchmark's explicit_options default
(empty set) causes "--skip-gpu/--skip-cpu" to be treated as explicitly absent
and thus override args.skip_gpu/args.skip_cpu; change run_benchmark signature to
default explicit_options to None and treat None as "no explicit overrides" so
the code falls back to args.skip_gpu/args.skip_cpu. Update the checks that
compute allow_gpu_runs and allow_cpu_runs (and any other "--skip-*" membership
tests) to first check if explicit_options is None (use args.skip_* flags) or
else consult the provided explicit_options set; reference symbols:
run_benchmark, explicit_options, args.skip_gpu, args.skip_cpu, allow_gpu_runs,
allow_cpu_runs.
- Around line 275-278: In both _validate_benchmark_inputs and _validate_args add
the stacklevel=2 kwarg to the warnings.warn calls so the warning points to the
caller rather than the helper; locate the warnings.warn invocation in
_validate_benchmark_inputs (the message about "--input-type... Switching to
'numpy'") and the analogous warnings.warn in _validate_args and pass
stacklevel=2 to each call.
In `@wiki/BENCHMARK.md`:
- Around line 51-59: The docs show running run_benchmarks.py with a YAML config
but omit installing PyYAML; update the standalone prerequisites to list PyYAML
as a required package and instruct users to install it (e.g., via pip) before
running the example so that python/cuml/cuml/benchmark/config.py can call
yaml.safe_load() successfully; reference run_benchmarks.py and the
yaml.safe_load() usage in config.py so readers know why PyYAML is required.
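For illustration, a minimal sketch of the case-insensitive filter described in the first comment; `_apply_algorithm_filter` and `BenchmarkConfigError` are names taken from the review, but the exact signatures and error message here are assumptions:

```python
class BenchmarkConfigError(Exception):
    """Stub for the config error type named in the review."""


def _apply_algorithm_filter(benchmark_entries, algorithm_filter):
    """Keep entries whose algorithm matches the filter, ignoring case."""
    # casefold() is more aggressive than lower() and handles Unicode edge cases
    wanted = {s.casefold() for s in algorithm_filter}
    filtered = [
        e for e in benchmark_entries if e["algorithm"].casefold() in wanted
    ]
    if not filtered:
        # Echo the user's original input in the error for easier debugging
        raise BenchmarkConfigError(
            f"No benchmarks matched algorithm filter: {sorted(algorithm_filter)}"
        )
    return filtered
```

With this, "LogisticRegression" and "logisticregression" select the same entries, and an empty result still fails loudly.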
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 85ab12c0-638b-461b-86f7-e112e1fbab5e
📒 Files selected for processing (9)
- python/cuml/cuml/benchmark/cli.py
- python/cuml/cuml/benchmark/config.py
- python/cuml/cuml/benchmark/configs/single_gpu.yaml
- python/cuml/cuml/benchmark/configs/test.yaml
- python/cuml/cuml/benchmark/run_benchmarks.py
- python/cuml/cuml/benchmark/runners.py
- python/cuml/tests/test_benchmark_config.py
- python/cuml/tests/test_benchmark_runners.py
- wiki/BENCHMARK.md
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@python/cuml/cuml/benchmark/config.py`:
- Around line 209-211: The current check only verifies type via
_is_int_value(version), so any integer (e.g., 2) is accepted; change the
validation around version, _is_int_value, and BenchmarkConfigError to reject
unsupported manifest versions by enforcing the allowed set (e.g., only 1) —
after confirming version is an int, verify it is in the supported_versions
list/tuple and, if not, raise BenchmarkConfigError with a clear message listing
supported versions so future schema bumps fail fast. A sketch covering this and
the test_split range check appears after this list.
- Around line 487-491: The resolver currently only checks that test_split is
numeric; update load_and_resolve_config (where the block uses
_is_numeric_value(entry.get("test_split"))) to also validate that the numeric
value is within [0.0, 1.0] and raise BenchmarkConfigError (using benchmark_name)
if it is out of range; locate the check around the existing _is_numeric_value
call and replace/extend it so it first ensures numeric and then verifies 0.0 <=
float(entry.get("test_split")) <= 1.0, raising a clear BenchmarkConfigError if
the range check fails.
In `@wiki/BENCHMARK.md`:
- Around line 245-249: The bullets imply mutually exclusive tiers but profile
selection in config.py matches any shared tag, so update the wording for the
'default', 'extended', and 'nightly' profiles to state they are cumulative:
'extended' includes workloads from both 'default-profile' and 'extended-profile'
(i.e., default + extended row-count tiers) and 'nightly' includes workloads from
'default-profile', 'extended-profile', and 'nightly-profile' (i.e., all three
tiers); mention that selection matches any shared tag so profiles aggregate
rather than replace earlier tiers.
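A minimal sketch combining the two validation fixes above, reusing the `BenchmarkConfigError` stub from the earlier sketch; `SUPPORTED_VERSIONS` and the helper names are assumptions for illustration, not the PR's actual code:

```python
SUPPORTED_VERSIONS = (1,)  # hypothetical allow-list; extend on schema bumps


def _validate_version(version):
    # bool is a subclass of int in Python, so reject it explicitly first
    if isinstance(version, bool) or not isinstance(version, int):
        raise BenchmarkConfigError(f"version must be an integer, got {version!r}")
    if version not in SUPPORTED_VERSIONS:
        raise BenchmarkConfigError(
            f"Unsupported manifest version {version}; "
            f"supported versions: {list(SUPPORTED_VERSIONS)}"
        )


def _validate_test_split(entry, benchmark_name):
    value = entry.get("test_split")
    if value is None:
        return  # optional field
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise BenchmarkConfigError(f"{benchmark_name}: test_split must be numeric")
    if not 0.0 <= float(value) <= 1.0:
        raise BenchmarkConfigError(
            f"{benchmark_name}: test_split must be within [0.0, 1.0], got {value}"
        )
```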
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: fd348e6e-9351-4c34-9e37-0ea75bb5bdee
📒 Files selected for processing (5)
- conda/recipes/cuml/recipe.yaml
- python/cuml/cuml/benchmark/config.py
- python/cuml/cuml/benchmark/run_benchmarks.py
- python/cuml/tests/test_benchmark_config.py
- wiki/BENCHMARK.md
✅ Files skipped from review due to trivial changes (1)
- conda/recipes/cuml/recipe.yaml
🧹 Nitpick comments (1)
python/cuml/cuml/benchmark/config.py (1)
824-831: Consider adding `strict=True` to `zip()` for defensive programming. Since Python 3.10, `zip()` has an optional `strict` keyword argument. While `grid_keys` and the unpacked `combo` tuple are guaranteed to have matching lengths (since both derive from the same `grid` dict and `itertools.product` maintains length consistency), adding `strict=True` provides defensive protection against potential bugs elsewhere. Given that cuML requires Python 3.11+, the parameter is fully available.

Suggested improvement:

  for combo in combinations:
-     combo_dict = dict(zip(grid_keys, combo))
+     combo_dict = dict(zip(grid_keys, combo, strict=True))
      result.append({**fixed, **combo_dict})

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@python/cuml/cuml/benchmark/config.py` around lines 824 - 831, The zip call in the combination builder should be made strict to guard against mismatched lengths: update the dict creation that uses dict(zip(grid_keys, combo)) to pass strict=True (i.e., dict(zip(grid_keys, combo, strict=True))) so combo and grid_keys must match; locate this change around the variables grid_keys, combinations, combo, and combo_dict in the combination generation block and add strict=True to the zip invocation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 3b7d7a84-2d24-4cec-8787-1d5c38ad0673
📒 Files selected for processing (4)
- python/cuml/cuml/benchmark/config.py
- python/cuml/cuml/benchmark/configs/single_gpu.yaml
- python/cuml/tests/test_benchmark_config.py
- wiki/BENCHMARK.md
✅ Files skipped from review due to trivial changes (1)
- python/cuml/cuml/benchmark/configs/single_gpu.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
- wiki/BENCHMARK.md
🧹 Nitpick comments (1)
python/cuml/cuml/benchmark/config.py (1)
648-653: Consider validating tag types in `_named_tags`. If `explicit_tags` is a non-list value (e.g., a bare string from YAML like `tags: foo`), line 652 creates `[name, explicit_tags]` without validation. While `_validate_tag_list` catches this elsewhere, this function could produce unexpected results if called with unvalidated input. This is minor since the validation pipeline should catch malformed tags before reaching this point. (A sketch follows below.)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@python/cuml/cuml/benchmark/config.py` around lines 648 - 653, The helper _named_tags currently appends non-list explicit_tags blindly which can produce unexpected types; update _named_tags to validate the type when explicit_tags is not None and not a list: if explicit_tags is a str return [name, explicit_tags], otherwise raise a clear TypeError/ValueError (or coerce to str only if intended) so callers like _validate_tag_list no longer receive malformed tag values; reference _named_tags and _validate_tag_list to locate and enforce this validation.
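The sketch referenced above: only `_named_tags` and the bare-string failure mode come from the comment, while the list-based return shape is an assumption:

```python
def _named_tags(name, explicit_tags):
    """Combine a benchmark's name with its explicit tags, validating the type."""
    if explicit_tags is None:
        return [name]
    if isinstance(explicit_tags, str):
        # A bare YAML scalar like `tags: foo` arrives as a plain string
        return [name, explicit_tags]
    if isinstance(explicit_tags, list):
        return [name, *explicit_tags]
    raise TypeError(
        "tags must be a string or a list of strings, "
        f"got {type(explicit_tags).__name__}"
    )
```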
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: ab3c6499-d2df-43b9-befc-d6be930ac399
📒 Files selected for processing (7)
- python/cuml/cuml/benchmark/cli.py
- python/cuml/cuml/benchmark/config.py
- python/cuml/cuml/benchmark/configs/single_gpu.yaml
- python/cuml/cuml/benchmark/configs/test.yaml
- python/cuml/cuml/benchmark/run_benchmarks.py
- python/cuml/tests/test_benchmark_config.py
- wiki/BENCHMARK.md
✅ Files skipped from review due to trivial changes (2)
- python/cuml/cuml/benchmark/configs/test.yaml
- python/cuml/cuml/benchmark/configs/single_gpu.yaml
🧹 Nitpick comments (1)
python/cuml/cuml/benchmark/run_benchmarks.py (1)
460: Consider adding `strict=True` to `zip()` for defensive coding. Both lists are derived from the same source, so they should always have matching lengths. However, adding `strict=True` would catch any future bugs that might cause length mismatches.

Suggested fix:

- for entry, selected_backends in zip(benchmark_entries, entry_backends):
+ for entry, selected_backends in zip(benchmark_entries, entry_backends, strict=True):

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@python/cuml/cuml/benchmark/run_benchmarks.py` at line 460, The zip over benchmark_entries and entry_backends may silently truncate if their lengths diverge; update the iteration that currently reads "for entry, selected_backends in zip(benchmark_entries, entry_backends):" to use strict pairing by passing strict=True to zip so mismatched lengths raise an error (i.e., change to zip(..., strict=True)); target the loop in run_benchmarks.py where benchmark_entries and entry_backends are zipped to implement this defensive check.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 353290e0-bf39-4a2c-84ed-85a5b599a305
📒 Files selected for processing (2)
- python/cuml/cuml/benchmark/run_benchmarks.py
- python/cuml/tests/test_benchmark_config.py
)

# Run CPU benchmark
if run_cpu and algo_pair.has_cpu():
I ran this on my DGX Spark - smoke test passed but single_gpu failed.
- cuml: 26.06.00
- cupy: 14.0.1
- numpy: 2.4.3
- sklearn: 1.8.0
- scipy: 1.16.3
- CUDA toolkit: 13.0
- GPU: NVIDIA GB10, sm_121 (Blackwell, aarch64)
quoting Claude:
The bug: single_gpu.yaml sets input_type: cupy, which means all benchmark data is generated as CuPy (GPU) arrays. When the runner compared cuML against the sklearn CPU baseline, it passed those CuPy arrays directly to sklearn. CuPy 13.x removed implicit NumPy conversion, so sklearn's internal numpy.asarray(cupy_array) call raises TypeError instead of silently converting.
The fix: One line at the top of the CPU benchmark block in runners.py — convert data to NumPy before passing it to anything CPU-side:
cpu_data = datagen._convert_to_numpy(data)
_convert_to_numpy already existed and handles tuples recursively, so it covers the setup, run, and accuracy sections all at once.
# Run CPU benchmark
if run_cpu and algo_pair.has_cpu():
+ cpu_data = datagen._convert_to_numpy(data)
setup_override = algo_pair.setup_cpu(
- data, **param_overrides, **cpu_param_overrides
+ cpu_data, **param_overrides, **cpu_param_overrides
)
cpu_timer = BenchmarkTimer(self.n_reps)
for rep in cpu_timer.benchmark_runs():
cpu_model = algo_pair.run_cpu(
- data,
+ cpu_data,
**setup_override,
)
cpu_elapsed = np.min(cpu_timer.timings)
if algo_pair.accuracy_function:
if algo_pair.cpu_data_prep_hook is not None:
- X_test, y_test = algo_pair.cpu_data_prep_hook(data[2:])
+ X_test, y_test = algo_pair.cpu_data_prep_hook(cpu_data[2:])
else:
- X_test, y_test = data[2:]
+ X_test, y_test = cpu_data[2:]
if hasattr(cpu_model, "predict"):
y_pred_cpu = cpu_model.predict(X_test)
else:
- numba >=0.60.0,<0.65.0
- numba-cuda >=0.22.2,<0.29.0
- numpy >=1.23,<3.0
- pyyaml
Is this benchmark CLI user-facing? Do we want to always require the dependencies for it to run be installed (like you added pyyaml here to the recipe)? Or should we instead error at runtime nicely and flag the extra dependency needed? I might vote for the latter since I suspect most users won't ever use this functionality.
Also, I suspect we might need a similar change in dependencies.yaml.
Agreed. This is user-facing, but we should not add pyyaml to our standard run dependencies.
try:
    import yaml
except ImportError as exc:  # pragma: no cover - dependency issue
    raise RuntimeError(
Nit: this should be an ImportError, not a RuntimeError.
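A minimal sketch of what both suggestions could look like together: failing lazily at the point of use with an `ImportError` that names the missing optional dependency (the helper name and message text are illustrative, not the PR's code):

```python
def _require_yaml():
    """Import PyYAML on first use instead of requiring it at install time."""
    try:
        import yaml
    except ImportError as exc:  # pragma: no cover - dependency issue
        raise ImportError(
            "PyYAML is required to load YAML benchmark manifests; "
            "install it with `pip install pyyaml` or `conda install pyyaml`."
        ) from exc
    return yaml


# Call sites then do e.g.:
# yaml = _require_yaml()
# manifest = yaml.safe_load(manifest_text)
```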
try:
    from cuml.benchmark import algorithms
except ImportError:
    if not any("cuml/benchmark" in p for p in sys.path):
        raise
    import algorithms  # noqa: E402
Is this to hack around local execution instead of the packaged execution? If so, I'm against this - the executing version should always be the built version of the package, and not what's in the source tree.
We want to be able to run the benchmark suite without having cuML installed at all, to generate the sklearn baseline.
import algorithms  # noqa: E402


TOP_LEVEL_KEYS = {"version", "suite", "profiles", "defaults", "benchmarks"}
No strong thoughts on the schema, but I would advise we use something like pydantic (or my library msgspec) to manage these rather than manually writing out the parsing and validation logic. It's easier to manage, easier to read, and would drastically reduce the amount of code added in this PR.
If we opt to make the deps for running the benchmarks not required, then any additional dep should be fine here. I'd vote for pydantic since it's ubiquitous and I'm not having to support it :).
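For comparison, a rough pydantic v2 sketch of a schema matching `TOP_LEVEL_KEYS`; the field types and constraints are guesses for illustration, not the PR's actual schema:

```python
from pydantic import BaseModel, ConfigDict, Field


class BenchmarkEntry(BaseModel):
    model_config = ConfigDict(extra="forbid")  # reject unknown keys

    algorithm: str
    tags: list[str] = Field(default_factory=list)
    n_reps: int | None = None
    test_split: float | None = Field(default=None, ge=0.0, le=1.0)


class Manifest(BaseModel):
    model_config = ConfigDict(extra="forbid")

    version: int
    suite: str
    profiles: dict[str, list[str]] = Field(default_factory=dict)
    defaults: dict = Field(default_factory=dict)
    benchmarks: list[BenchmarkEntry]


# Wrong types, out-of-range values, and unknown keys all raise ValidationError:
# manifest = Manifest.model_validate(yaml.safe_load(manifest_text))
```

Type coercion, range checks, and readable error messages then come for free, which is the code-reduction the comment is pointing at.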
try:
    import yaml
except ImportError as exc:  # pragma: no cover - dependency issue
- except ImportError as exc:  # pragma: no cover - dependency issue
+ except ModuleNotFoundError as exc:  # pragma: no cover - dependency issue
- numba >=0.60.0,<0.65.0
- numba-cuda >=0.22.2,<0.29.0
- numpy >=1.23,<3.0
- pyyaml
I don't think we should add pyyaml to our runtime dependencies.
This PR adds YAML-based benchmark definitions to describe and run cuML benchmarks. It introduces manifest loading and validation, default and profile resolution, support for explicit shape pairs and parameter grids, and the CLI/config plumbing needed to run benchmarks directly from a YAML file. It also adds the first manifests for a tiny smoke suite and a single-GPU suite, plus test coverage and docs for the new flow.
The point of this PR is to put a clean declarative layer in front of the existing benchmark harness so we can define suites in one place instead of rebuilding them through large CLI invocations. Follow-up PRs can then build on top of this foundation: richer result metadata and output schemas, better timing statistics and warmup handling, stronger reproducibility controls, baseline comparison tooling, and potentially CI/nightly workflows.
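To make the declarative layer concrete, here is a hypothetical manifest in the shape the top-level keys suggest (`version`, `suite`, `profiles`, `defaults`, `benchmarks`); the field values, profile structure, and tag names are illustrative, not copied from the shipped configs:

```python
import yaml

manifest = yaml.safe_load("""
version: 1
suite: example
profiles:
  default:
    include_tags: [default-profile]
  nightly:
    include_tags: [default-profile, extended-profile, nightly-profile]
defaults:
  n_reps: 3
  input_type: numpy
benchmarks:
  - algorithm: LogisticRegression
    tags: [default-profile]
    test_split: 0.2
  - algorithm: KMeans
    tags: [nightly-profile]
""")

assert set(manifest) == {"version", "suite", "profiles", "defaults", "benchmarks"}
```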