Don't call to_output on a cupy array#7044
Don't call to_output on a cupy array#7044rapids-bot[bot] merged 5 commits intorapidsai:branch-25.10from
Conversation
|
|
||
| def _transform_one(transformer, X, y, weight, **fit_params): | ||
| res = transformer.transform(X).to_output('cupy') | ||
| res = transformer.transform(X) |
There was a problem hiding this comment.
If this is the right fix, where should I put the appropriate unit test?
There was a problem hiding this comment.
Tests for ColumnTransformer are in cuml/tests/test_compose.py, that's where I'd add a test for this.
jcrist
left a comment
There was a problem hiding this comment.
Thanks for the PR! I think we can do a simpler fix here (as documented below), but we definitely should do some follow-up work to make cuml.preprocessing estimators better follow our type reflecting conventions.
| res = transformer.transform(X).to_output('cupy') | ||
| res = transformer.transform(X) | ||
|
|
||
| if isinstance(res, cpu_np.ndarray): |
There was a problem hiding this comment.
There are two issues here:
OneHotEncoderdoesn't follow our type reflection conventions (I think much ofcuml.preprocessingalso has this issue).ColumnTransformerneeds to be more robust to consuming disparate output types from transformers (if nothing else, user defined transformers may return a variety of types).
I think the best immediate fix for this issue is something like:
with cuml.using_output_type("cupy"):
return transformer.transform(X)
This should ensure that any properly implemented cuml transformer returns a cupy array (and not a CumlArray or something else). This is already done for _fit_transform_one below, but not for _transform_one here.
The consumer of this output then handles possibly sparse outputs so we don't want to error here if the return type is somethign else. So the branching structure here in _transform_one isn't needed.
There was a problem hiding this comment.
Thanks! Do you mind filing an issue for the follow-up work?
|
|
||
| def _transform_one(transformer, X, y, weight, **fit_params): | ||
| res = transformer.transform(X).to_output('cupy') | ||
| res = transformer.transform(X) |
There was a problem hiding this comment.
Tests for ColumnTransformer are in cuml/tests/test_compose.py, that's where I'd add a test for this.
|
Hi, original submitter of #7039 here. Given that the last two commits were simply merging from |
6e6bfdb to
9a1e368
Compare
jcrist
left a comment
There was a problem hiding this comment.
Apologies for dropping this earlier. I've pushed a small test improvement, this LGTM!
|
/merge |
|
Blocked by #7230 . |
|
@Matt711 I am very sorry that it took us so long to be able to accept and merge your contribution. Thank you very much for your patience! |
AFAIK
cuml/python/cuml/cuml/_thirdparty/sklearn/preprocessing/_column_transformer.py
Lines 292 to 297 in 242829b
cuml/python/cuml/cuml/preprocessing/encoders.py
Lines 443 to 447 in 242829b
transformer.transform(X)returns acupyarray, so we shouldn't need to callto_output('cupy')on it.