feat: Enhance array_slice functionality to support ListView and LargeListView types
#18432
base: main
cc @brancz
Very nice refactor! The SlicePlan and extraction of functions makes the rest of the code a lot more readable.
    let field = match array.data_type() {
        ListView(field) | LargeListView(field) => Arc::clone(field),
        other => {
            return Err(internal_datafusion_err!(
                "array_slice got unexpected data type: {}",
                other
            ));
        }
    };
Suggest moving this to before the slice to avoid unnecessary work
Nice catch!
        );
    }

    #[test]
Nice tests! This PR for reverse of FixedSizeList also added sqllogictest cases: https://github.com/apache/datafusion/pull/16423/files#diff-317c67cc9ce87268e4ccec1cb75316eed82f99ae2ffc226874e5897913ffa4c8
It doesn't look like that is currently possible for ListView on arrow-rs 57, as it hits this path, which did not have a branch for ListView at the time of release:
https://github.com/apache/arrow-rs/blob/062d766a9c3070d191d1a1fd0baca01b9d13994f/arrow-schema/src/datatype_parse.rs#L70-L96 (but it was added recently: apache/arrow-rs#8649).
Perhaps we can add it to the SLTs and expect it to fail on the cast (leaving the expected results as comments), so that when we bump to an arrow version with that fix we will automatically know.
Good idea!
Added
I like the introduction of SlicePlan
        }
    }

    fn adjusted_from_index<O: OffsetSizeTrait>(index: i64, len: O) -> Result<Option<O>>
Looks like this + adjusted_to_index are pulled out from their inner function general_array_slice verbatim without changes 👍
(Just a note for myself so I know there isn't an actual diff on the function)
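For readers unfamiliar with what a "from" index adjustment involves, here is a minimal sketch. It is not the code from this PR: the name `normalize_from_index`, the use of plain `i64` instead of `OffsetSizeTrait`, and the exact clamping rules (1-based indexes, negatives counting from the end, 0 treated like 1) are assumptions made for illustration.

```rust
/// Hypothetical helper, not DataFusion's `adjusted_from_index`.
/// Maps a 1-based, possibly negative index onto a 0-based position in a
/// list of length `len`. Returns `None` when the position is past the end.
fn normalize_from_index(index: i64, len: i64) -> Option<i64> {
    let zero_based = if index < 0 {
        // Assumed rule: negative indexes count from the end and are
        // clamped to the start of the list when they reach too far back.
        (index + len).max(0)
    } else {
        // Assumed rule: index 0 is treated like index 1 (first element).
        (index - 1).max(0)
    };
    if zero_based < len {
        Some(zero_based)
    } else {
        None
    }
}
```

Under these assumptions, `normalize_from_index(-1, 5)` points at the last element, while an index beyond the list length yields `None`, which a caller could map to an empty slice.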
    let field = match array.data_type() {
        ListView(field) | LargeListView(field) => Arc::clone(field),
        other => {
            return internal_err!("array_slice got unexpected data type: {}", other);
        }
    };
Suggested change:
-   let field = match array.data_type() {
-       ListView(field) | LargeListView(field) => Arc::clone(field),
-       other => {
-           return internal_err!("array_slice got unexpected data type: {}", other);
-       }
-   };
+   let field = match array.data_type() {
+       ListView(field) | LargeListView(field) => Arc::clone(field),
+       _ => unreachable!()
+   };
Given that array is most definitely a GenericListViewArray.
Though I wonder why we handle this differently from how it's done in general_array_slice 🤔
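The trade-off between the two forms can be sketched in isolation. The snippet below is only an illustration: `DataType` here is a stand-in enum, not arrow's `DataType`, and both function names are hypothetical. The first variant reports an unexpected type as a recoverable error; the second encodes the assumption that the caller has already checked the type and panics if that assumption is ever violated.

```rust
// Stand-in for arrow's DataType, reduced to the variants needed here.
#[derive(Debug)]
enum DataType {
    List,
    ListView,
    LargeListView,
}

/// Hypothetical: returns an error the caller can propagate.
fn view_kind_checked(dt: &DataType) -> Result<&'static str, String> {
    match dt {
        DataType::ListView => Ok("ListView"),
        DataType::LargeListView => Ok("LargeListView"),
        other => Err(format!("array_slice got unexpected data type: {other:?}")),
    }
}

/// Hypothetical: panics if the invariant "already a list view" is broken.
fn view_kind_unchecked(dt: &DataType) -> &'static str {
    match dt {
        DataType::ListView => "ListView",
        DataType::LargeListView => "LargeListView",
        _ => unreachable!("caller already verified a list-view type"),
    }
}
```

The checked form costs a few extra lines but keeps a violated invariant from turning into a panic; the unchecked form is shorter and documents that the branch should never run.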
    len,
    from_array.value(row_index),
    to_array.value(row_index),
    stride.map(|s| s.value(row_index)),
This doesn't check for stride nullability
Done
        return Ok(SlicePlan::Empty);
    };

    let stride_value = stride_raw.unwrap_or(1);
Null strides are assumed to be 1 by default? Is this intended?
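To make the question concrete: an omitted stride argument and a NULL stride value are different cases, and `unwrap_or(1)` on a single `Option` cannot tell them apart. The sketch below shows one possible policy, not the behavior of this PR: the `Stride` enum and `effective_stride` are hypothetical names, and mapping a NULL stride to a NULL result is an assumption.

```rust
/// Hypothetical three-way view of the stride argument for one row.
#[derive(Debug, PartialEq)]
enum Stride {
    /// The caller did not pass a stride argument at all.
    Omitted,
    /// A stride argument was passed, but its value for this row is NULL.
    Null,
    /// A concrete stride value.
    Value(i64),
}

/// Returns the stride to use, or `None` if the row's result should be NULL
/// (assumed policy: only an omitted stride defaults to 1).
fn effective_stride(stride: Stride) -> Option<i64> {
    match stride {
        Stride::Omitted => Some(1),
        Stride::Null => None,
        Stride::Value(v) => Some(v),
    }
}
```

If defaulting NULL strides to 1 is intended instead, the `Stride::Null` arm would return `Some(1)`, and it would be worth stating that choice explicitly in a comment or test.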
Force-pushed from defabd2 to 9dee50f
Which issue does this PR close?
Rationale for this change
array_slice accepts ListView/LargeListView inputs.
What changes are included in this PR?
ListView/LargeListView arrays directly.
SlicePlan.
Are these changes tested?
Yes
Are there any user-facing changes?
Yes. array_slice now accepts ListView and LargeListView arrays without requiring an explicit cast.