Skip to content

Conversation

@guojidan
Copy link
Contributor

@guojidan guojidan commented Mar 8, 2024

Which issue does this PR close?

Closes #9322.

Rationale for this change

What changes are included in this PR?

move make_array array_append array_prepend array_concat function to datafusion-functions-array crate

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added sql SQL Planner logical-expr Logical plan and expressions physical-expr Changes to the physical-expr crates optimizer Optimizer rules labels Mar 8, 2024
@guojidan
Copy link
Contributor Author

guojidan commented Mar 8, 2024

because #9343 pr lasting for too long,difficult to rebase or merge, so I open a new pr

Copy link
Contributor

@jayzhan211 jayzhan211 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take a quick glance and I think it is ready to merge.

There are 2 things that I'm trying to figure out to improve further

  1. Make sure we have Scalar that we need make_scalar_function_with_hints, since I think we should only get Array for array functions, if we really have Scalar, it seems something wrong to me.
  2. Simplify cfg feature flag, I would like to change OperatorToFunction to OperatorToArrayFunciton and see whether we can import the entire rule (file) optionally.

They don't need to be solved in this PR, but I think they are possible to be improved

chrono = { workspace = true }
datafusion-common = { workspace = true, default-features = true }
datafusion-expr = { workspace = true }
datafusion-functions-array = { workspace = true, optional = true }
Copy link
Contributor

@jayzhan211 jayzhan211 Mar 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alamb I think having optional importing is fine. Is there any other downside to this?

@jayzhan211
Copy link
Contributor

let me rebase and fix the conflicts

Signed-off-by: jayzhan211 <[email protected]>
@jayzhan211
Copy link
Contributor

  1. Make sure we have Scalar that we need make_scalar_function_with_hints, since I think we should only get Array for array functions, if we really have Scalar, it seems something wrong to me.

I think we need make_scalar_function_with_hints, Scalar::List are represented in ColumnValue::Scalar, and others are represented in ColumnValue::Array. Although we don't do the optimized Scalar path for array functions, to be consistent with the definition of ColumnValue, it may be nice be convert it to ColumnValue::Scalar if we got Scalar::List.

Copy link
Member

@Weijun-H Weijun-H left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. thanks @guojidan

Comment on lines 404 to 406
array element, // arg name
"appends an element to the end of an array.", // doc
array_append_udf // internal function name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
array element, // arg name
"appends an element to the end of an array.", // doc
array_append_udf // internal function name
array element,
"appends an element to the end of an array.",
array_append_udf

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that we can get rid of these comments since the structure is apparent.

Comment on lines 458 to 460
element array, // arg name
"Prepends an element to the beginning of an array.", // doc
array_prepend_udf // internal function name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
element array, // arg name
"Prepends an element to the beginning of an array.", // doc
array_prepend_udf // internal function name
element array,
"Prepends an element to the beginning of an array.",
array_prepend_udf

Comment on lines 512 to 513
"Concatenates arrays.", // doc
array_concat_udf // internal function name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"Concatenates arrays.", // doc
array_concat_udf // internal function name
"Concatenates arrays.",
array_concat_udf

Comment on lines 590 to 591
"Returns an Arrow array using the specified input expressions.", // doc
make_array_udf // internal function name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"Returns an Arrow array using the specified input expressions.", // doc
make_array_udf // internal function name
"Returns an Arrow array using the specified input expressions.",
make_array_udf

@jayzhan211
Copy link
Contributor

jayzhan211 commented Mar 9, 2024

make_scalar_function_with_hints

but maybe we should call another name, it is not the same as make_scalar_function_with_hints in physical expr.

@jayzhan211
Copy link
Contributor

@Weijun-H I think we can merge this and improve others on follow up PR

@jayzhan211
Copy link
Contributor

but I have no idea why the test is running endless

Signed-off-by: jayzhan211 <[email protected]>
@jayzhan211
Copy link
Contributor

I want to split them to different files to avoid crazy conflict

@jayzhan211
Copy link
Contributor

jayzhan211 commented Mar 9, 2024

Upd2: I found out the reason and solution.
Upd: select array_ndims([[[[[[[[[[[[[[[[[[[[[1]]]]]]]]]]]]]]]]]]]]]); is running super slow

It seems super slow to run array.slt with this PR

2031259 has been canceled after 6hr, so the issue might be the change in this PR.

@jayzhan211
Copy link
Contributor

@guojidan Besides fixing conflicts. I split those functions to their own files, and rename to make_scalar_funciton (there should be a better name).

let me merge this PR first, and file another PR for any other changes.

Thanks @guojidan and @Weijun-H

@jayzhan211 jayzhan211 merged commit 88187d4 into apache:main Mar 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

logical-expr Logical plan and expressions optimizer Optimizer rules physical-expr Changes to the physical-expr crates sql SQL Planner

Projects

None yet

Development

Successfully merging this pull request may close these issues.

move make_array array_append array_prepend array_concat function to datafusion-functions-array crate

3 participants