Fix type casting in Series.__setitem__ #11904
rapids-bot[bot] merged 10 commits into rapidsai:branch-22.12
To mimic pandas, we must upcast a column to the numpy result_type of the column itself and the input value dtype. This previously occurred in all relevant cases except when the index provided to __setitem__ was a single integer (originally introduced in rapidsai#2442). Closes rapidsai#11901.
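For readers unfamiliar with the promotion rule, here is a minimal sketch of the idea using plain NumPy arrays rather than cudf columns. The helper name `set_with_upcast` is hypothetical and is not the code in this PR; it only illustrates upcasting to `np.result_type` of the existing dtype and the incoming value's dtype before assigning at a single integer index.

```python
import numpy as np


def set_with_upcast(values: np.ndarray, index: int, value) -> np.ndarray:
    """Return a copy of `values` with `values[index] = value`, upcasting if needed.

    Hypothetical helper for illustration only (not cudf's implementation).
    """
    # dtype that NumPy infers for the incoming scalar
    value_dtype = np.asarray(value).dtype
    # common dtype able to hold both the existing data and the new value
    target = np.result_type(values.dtype, value_dtype)
    # astype copies by default, so the input array is left untouched
    out = values.astype(target)
    out[index] = value
    return out


ints = np.array([1, 2, 3], dtype=np.int64)
upcast = set_with_upcast(ints, 1, 1.5)
print(upcast.dtype)  # float64
```

Without the promotion step, assigning `1.5` into an `int64` array in NumPy silently truncates the value to `1`, which is the kind of mismatch with pandas that the single-integer index path had.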
|
I have marked this as non-breaking. Technically it changes user-facing behaviour, so one could argue that it is a breaking change (although I hope no one was relying on the previous behaviour). |
|
rerun tests |
Only string and non-decimal numeric columns should try to upcast.
|
Fixed the test failure 🤞 |
Codecov Report
Base: 87.40% // Head: 88.06%
Additional details and impacted files
@@               Coverage Diff                @@
##           branch-22.12   #11904      +/-   ##
================================================
+ Coverage         87.40%   88.06%   +0.65%
================================================
  Files               133      135       +2
  Lines             21833    22003     +170
================================================
+ Hits              19084    19376     +292
+ Misses             2749     2627     -122
If the desired dtype is the same as ours, we can just return ourselves. Surely no-one is relying on foo.astype(foo.dtype) being a copy.
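A rough sketch of the short-circuit being described, using a toy `Column` wrapper around a NumPy array. The class is hypothetical and far simpler than cudf's column types; it only shows the "return ourselves when the dtype already matches" idea.

```python
import numpy as np


class Column:
    """Toy stand-in for a column type (hypothetical, for illustration only)."""

    def __init__(self, data):
        self.data = np.asarray(data)

    @property
    def dtype(self) -> np.dtype:
        return self.data.dtype

    def astype(self, dtype) -> "Column":
        dtype = np.dtype(dtype)
        if dtype == self.dtype:
            # Same dtype requested: skip the cast and the copy entirely
            return self
        return Column(self.data.astype(dtype))


col = Column(np.array([1, 2, 3], dtype=np.int64))
print(col.astype("int64") is col)    # True
print(col.astype("float64") is col)  # False
```

The trade-off is the one raised in the comment: callers that mutate the result of `astype` would now also mutate the original when the dtypes match. NumPy exposes the same choice explicitly through `ndarray.astype(dtype, copy=False)`.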
|
To step back a bit, I think we need to figure out how much of pandas' behaviour we are going to mimic here. This PR contains approximately the minimal set of changes needed to allow setting a single entry in a numeric column with a value, but there are other things that will still not work correctly. I could try to collate all of those and fix them in this PR, but it is probably better to collate the missing cases, decide which we want to support, and then fix them in a follow-up. |
|
@bdice any strong opinion one way or the other about how much of this thread I unravel in this PR as opposed to doing the smaller thing here and then going round again? My +ε preference is to do the minimal thing here (i.e. not changing behaviour over and above enabling |
|
@wence- I'm happy to make incremental changes. No need to fix everything at once. |
|
@bdice I think this is now good to go if you have time for a final look. |
bdice left a comment
LGTM. Thanks for the iterations on this and for follow-up issue documentation.
Co-authored-by: Bradley Dice <[email protected]>
|
@gpucibot merge |
|
Waiting on #12067 |
|
rerun tests |
Description
To mimic pandas, we must upcast a column to the numpy result_type of the column itself and the input value dtype. This previously occurred in all relevant cases except when the index provided to __setitem__ was a single integer (originally introduced in #2442). Closes #11901.