Convert np.dtype(str) to np.dtype(object) for cuDF by mroeschke · Pull Request #21354 · rapidsai/cudf

mroeschke · 2026-02-05T22:21:41Z

Description

cuML needed to work around a bug in rapidsai/cuml#7762, probably caused by #21281, where we were allowing np.dtype("str") through to our column logic. Generally, pandas doesn't support this type and converts it to np.dtype(object) to represent strings instead, which is what (IMO) cuDF should do too.

I have historically thought that cuDF should disallow the object type because it can mean a "PyObject" type in pandas, which we don't support. Now I'm starting to go backwards and think maybe cuDF should always just interpret it as string. For the "PyObject" cases in pandas, we can maybe just document this as an expected difference when using cudf.pandas.
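
For illustration only, the conversion described above could look roughly like the sketch below. It uses NumPy alone; `normalize_str_dtype` is a hypothetical helper name and is not taken from the cuDF code base or from this PR's diff. It assumes that "np.dtype(str)" means any NumPy unicode dtype (kind "U").

```python
import numpy as np


def normalize_str_dtype(dtype):
    """Hypothetical helper (not cuDF's API): map NumPy unicode dtypes to object."""
    dtype = np.dtype(dtype)
    # kind "U" covers np.dtype(str) as well as fixed-width variants like "<U10"
    if dtype.kind == "U":
        return np.dtype(object)
    # every other dtype is passed through unchanged
    return dtype


print(normalize_str_dtype(str))      # unicode dtype mapped to object
print(normalize_str_dtype("int64"))  # numeric dtype left as-is
```

Checking `dtype.kind` rather than comparing against `np.dtype(str)` directly is a design choice in this sketch so that fixed-width unicode dtypes are treated the same way.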

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

mroeschke · 2026-02-10T18:36:04Z

I think we need a different approach here. I think we need to only convert str to object when constructing a column, but keep this in astype, so closing.

mroeschke added 3 commits February 5, 2026 21:39

Convert np.dtype(str) to np.dtype(object) for cuDF

cac040c

Align DatetimeColumn.as_string_column to pandas behavior, fix tests

2ad4e3b

Add fixture

1beef66

mroeschke self-assigned this Feb 5, 2026

mroeschke requested a review from a team as a code owner February 5, 2026 22:21

mroeschke added the bug (Something isn't working) and Python (Affects Python cuDF API.) labels Feb 5, 2026

mroeschke requested a review from galipremsagar February 5, 2026 22:21

mroeschke added the non-breaking (Non-breaking change) label Feb 5, 2026

mroeschke requested a review from Matt711 February 5, 2026 22:21

github-project-automation Bot added this to cuDF Python Feb 5, 2026

GPUtester moved this to In Progress in cuDF Python Feb 5, 2026

mroeschke added 2 commits February 5, 2026 22:27

Merge remote-tracking branch 'upstream/main' into ref/cudf/np_str

91b432e

Adjust more unit test for enabling pandas_compat behavior

a72a6ad

mroeschke closed this Feb 10, 2026

github-project-automation Bot moved this from In Progress to Done in cuDF Python Feb 10, 2026

mroeschke deleted the ref/cudf/np_str branch February 10, 2026 18:36

mroeschke wants to merge 5 commits into rapidsai:main from mroeschke:ref/cudf/np_str