
Convert np.dtype(str) to np.dtype(object) for cuDF #21354

Closed
mroeschke wants to merge 5 commits into rapidsai:main from mroeschke:ref/cudf/np_str

Conversation

@mroeschke
Contributor

Description

cuML needed to work around a bug in rapidsai/cuml#7762, probably caused by #21281, where we were allowing np.dtype("str") through to our column logic. Generally, pandas doesn't support this type and converts it to np.dtype(object) to represent strings instead, which is what (IMO) cuDF should do too.

I have historically thought that cuDF should disallow the object type because it can mean a "PyObject" type in pandas, which we don't support. Now I'm starting to reverse course and think maybe cuDF should always just interpret it as string. For the "PyObject" cases in pandas, we can maybe just document this as an expected difference when using cudf.pandas.
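For illustration only, the conversion described above can be sketched with plain NumPy. This is not the diff from this PR: `normalize_string_dtype` is a hypothetical helper name, and the rule it applies (map any NumPy unicode dtype to object, leave everything else alone) is an assumption about what "convert np.dtype(str) to np.dtype(object)" would mean in practice.

```python
import numpy as np


def normalize_string_dtype(dtype):
    """Hypothetical helper (not part of cuDF): map NumPy unicode dtypes to object.

    NumPy reports kind "U" for np.dtype(str) and for fixed-width
    unicode dtypes such as "U5"; all other dtypes pass through unchanged.
    """
    dtype = np.dtype(dtype)
    if dtype.kind == "U":
        return np.dtype(object)
    return dtype


# np.dtype(str) is a NumPy unicode dtype, so it is mapped to object.
print(normalize_string_dtype(str))
# Non-string dtypes are returned as-is.
print(normalize_string_dtype("int64"))
```

A check on `dtype.kind` rather than equality with `np.dtype(str)` also covers fixed-width unicode dtypes, which may or may not be the intended scope.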

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@mroeschke mroeschke self-assigned this Feb 5, 2026
@mroeschke mroeschke requested a review from a team as a code owner February 5, 2026 22:21
@mroeschke mroeschke added the bug (Something isn't working) and Python (Affects Python cuDF API.) labels Feb 5, 2026
@mroeschke mroeschke added the non-breaking (Non-breaking change) label Feb 5, 2026
@mroeschke mroeschke requested a review from Matt711 February 5, 2026 22:21
@GPUtester GPUtester moved this to In Progress in cuDF Python Feb 5, 2026
@mroeschke
Contributor Author

I think we need a different approach here. I think we need to only convert str to object when constructing a column but keep this in astype, so closing.

@mroeschke mroeschke closed this Feb 10, 2026
@github-project-automation github-project-automation Bot moved this from In Progress to Done in cuDF Python Feb 10, 2026
@mroeschke mroeschke deleted the ref/cudf/np_str branch February 10, 2026 18:36

Labels

  • bug: Something isn't working
  • non-breaking: Non-breaking change
  • Python: Affects Python cuDF API.

Projects

Archived in project


2 participants