
Remove deprecation warning in TargetEncoder #7892

Merged
rapids-bot[bot] merged 2 commits into rapidsai:release/26.04 from jcrist:remove-targetencoder-depr on Mar 13, 2026

Conversation

@jcrist (Member) commented Mar 13, 2026

This removes the deprecated 1D output handling in TargetEncoder.

Fixes #7890.

jcrist self-assigned this Mar 13, 2026
jcrist requested a review from a team as a code owner March 13, 2026 20:33
jcrist requested a review from betatim March 13, 2026 20:33
jcrist added the improvement (Improvement / enhancement to an existing function) and non-breaking (Non-breaking change) labels Mar 13, 2026
github-actions bot added the Cython / Python (Cython or Python issue) label Mar 13, 2026
jcrist requested a review from csadorf March 13, 2026 20:33

coderabbitai Bot commented Mar 13, 2026

📝 Walkthrough

Summary by CodeRabbit

  • Bug Fixes

    • Removed a spurious deprecation warning about 1D outputs in combination mode.
  • Refactor

    • TargetEncoder now consistently returns 2D arrays with shape (n_samples, 1), simplifying output across fit/transform paths.
  • Documentation

    • Updated example and docstring to reflect the new consistent 2D output behavior.

Walkthrough

Removed a module-level deprecation warning from TargetEncoder and made output handling consistently return 2D numpy arrays; tests updated to request output_type="numpy" and expect column-vector outputs.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Core TargetEncoder Implementation | `python/cuml/cuml/preprocessing/TargetEncoder.py` | Removed module-level flag and one-time deprecation/FutureWarning for 1D outputs. Simplified `_impute_and_sort` to always return a 2D array with shape (n_samples, 1). Updated class docstring/example to reflect uniform 2D output. |
| Test Updates | `python/cuml/tests/test_target_encoder.py` | Updated test instantiations to pass `output_type="numpy"` and adjusted expected results to column vectors (using `[:, None]`) across fit/transform/fit_transform and single-/multi-column scenarios. |
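
The shape change summarized above can be illustrated with plain NumPy. This is a minimal sketch, not cuML's implementation: the values and the names `encoded_1d`, `encoded_2d`, and `features` are made up for illustration, and only the `(n_samples, 1)` shape and the `[:, None]` idiom come from the summary.

```python
import numpy as np

# Hypothetical encoded values for four samples (illustrative only).
encoded_1d = np.array([0.0, 1.0, 0.5, 0.5])

# Two equivalent ways to turn a 1D vector into a column vector.
encoded_2d = encoded_1d[:, None]
assert encoded_2d.shape == (4, 1)
assert np.array_equal(encoded_2d, encoded_1d.reshape(-1, 1))

# A 2D column stacks directly next to other feature columns.
features = np.arange(8, dtype=float).reshape(4, 2)
combined = np.hstack([features, encoded_2d])
print(combined.shape)  # (4, 3)
```

A consistent column-vector shape means callers do not need to special-case the single-column result before stacking it with other features.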

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Suggested reviewers

  • divyegala
  • csadorf
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 44.44% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Remove deprecation warning in TargetEncoder' clearly and directly summarizes the main change in the pull request, which is the removal of deprecated 1D output handling. |
| Description check | ✅ Passed | The description 'This removes the deprecated 1D output handling in TargetEncoder. Fixes #7890.' is directly related to the changes shown in the code modifications and properly references the linked issue. |
| Linked Issues check | ✅ Passed | The PR successfully removes the deprecated 1D output handling by simplifying the `_impute_and_sort` method to always reshape to 2D, eliminating the deprecation warning and associated logic, which meets the primary objective of issue #7890. |
| Out of Scope Changes check | ✅ Passed | The changes are narrowly scoped to removing deprecated code and updating tests to use the new output behavior; all modifications align with the stated objective of eliminating the deprecation warning. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@python/cuml/tests/test_target_encoder.py`:
- Line 91: Replace the unsafe `eval` call used to look up a CuPy statistical
function. Instead of `eval(f"cp.{stat}")(y).item()`, use `getattr(cp, stat)` to
look up the attribute at runtime and call it with `y`, i.e.
`getattr(cp, stat)(y).item()`. Update the expression assigned to `answer`
accordingly; the arguments and the `.item()` call stay the same, so behavior
remains identical.
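
The suggested replacement can be sketched as follows. NumPy stands in for CuPy so the snippet runs without a GPU (an assumption for illustration; the review comment concerns `cp`), and the values of `stat` and `y` are hypothetical.

```python
import numpy as np

# Hypothetical inputs; in the test under review these come from the test itself.
stat = "mean"
y = np.array([1.0, 2.0, 3.0, 6.0])

# Pattern flagged in the review: evaluates an arbitrary string as code.
# answer = eval(f"np.{stat}")(y).item()

# Safer equivalent: look the function up by name, then call it.
answer = getattr(np, stat)(y).item()
print(answer)  # 3.0
```

With `getattr`, an unknown statistic name raises `AttributeError` rather than executing whatever the string happens to contain.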

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 3cc7c9c and 00022ea.

📒 Files selected for processing (2)
  • python/cuml/cuml/preprocessing/TargetEncoder.py
  • python/cuml/tests/test_target_encoder.py

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
python/cuml/tests/test_target_encoder.py (2)

284-289: Use NumPy for expected values when output_type="numpy" is set.

Line 288 currently builds ans with CuPy, which works but is a bit inconsistent with the explicit host-output expectation.

Small consistency tweak
-    ans = cp.asarray([0, 1, 0.5, 0.5])[:, None]
+    ans = np.array([0, 1, 0.5, 0.5])[:, None]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/cuml/tests/test_target_encoder.py` around lines 284 - 289, The
expected array for the test should be created with NumPy rather than CuPy when
TargetEncoder is instantiated with output_type="numpy"; change the construction
of ans in the test_target_encoder.py snippet (where t_enc =
TargetEncoder(output_type="numpy"), t_enc.fit(X, y), train_encoded =
t_enc.transform(X)) to build ans using numpy (e.g., numpy.asarray) so the
comparison via array_equal compares host arrays consistently with the transform
output.
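
Shape matters as much as the array library when building expected values: NumPy's `np.array_equal` compares shapes as well as elements. A small sketch, where the made-up `train_encoded` array stands in for a host-side transform result (no cuML call is made, and the `array_equal` helper used by the test suite may behave differently):

```python
import numpy as np

# Stand-in for a host-side transform result with shape (n_samples, 1).
train_encoded = np.array([[0.0], [1.0], [0.5], [0.5]])

ans_1d = np.array([0, 1, 0.5, 0.5])
ans_2d = ans_1d[:, None]

print(np.array_equal(train_encoded, ans_1d))  # False: shapes (4, 1) and (4,) differ
print(np.array_equal(train_encoded, ans_2d))  # True
```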

29-317: Consider centralizing the temporary output_type="numpy" setup.

This pattern is repeated many times; a tiny helper (or fixture) would make the eventual #7893 revert a one-liner and reduce churn in this file.

Refactor sketch
+def _te_numpy(**kwargs):
+    return TargetEncoder(output_type="numpy", **kwargs)
...
-    encoder = TargetEncoder(output_type="numpy")
+    encoder = _te_numpy()
...
-    encoder = TargetEncoder(stat="median", output_type="numpy")
+    encoder = _te_numpy(stat="median")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/cuml/tests/test_target_encoder.py` around lines 29 - 317, Many tests
repeatedly pass TargetEncoder(output_type="numpy"); create a small helper or
pytest fixture (e.g., numpy_encoder or target_encoder_numpy) that returns a
TargetEncoder configured with output_type="numpy" and replace direct
constructions in tests like test_targetencoder_fit_transform,
test_targetencoder_transform, test_targetencoder_multi_column,
test_targetencoder_newly_encountered, test_one_category,
test_targetencoder_pandas, test_targetencoder_numpy, test_targetencoder_cupy,
test_targetencoder_smooth, test_targetencoder_customized_fold_id,
test_targetencoder_var, test_transform_with_index, and test_targetencoder_median
with calls to that helper/fixture to centralize the configuration and make
future changes (like reverting `#7893`) a one-line update.

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 00022ea and 9cfa78c.

📒 Files selected for processing (1)
  • python/cuml/tests/test_target_encoder.py

@jcrist (Member, Author) commented Mar 13, 2026

/merge

rapids-bot bot merged commit 4f00f71 into rapidsai:release/26.04 on Mar 13, 2026
91 checks passed

Labels

  • Cython / Python: Cython or Python issue
  • improvement: Improvement / enhancement to an existing function
  • non-breaking: Non-breaking change

3 participants