Move code to fold Stable functions like now() from Simplifier to ConstEvaluator
#1176
let mut const_evaluator = utils::ConstEvaluator::new(execution_props);

match plan {
LogicalPlan::Filter { predicate, input } => Ok(LogicalPlan::Filter {
There is no reason to special-case LogicalPlan::Filter, since the predicate is already handled by LogicalPlan::expressions. If you look carefully, this branch also never calls rewrite with the const_evaluator. I totally missed this in #1153 and found it while updating tests in this PR.
Expr::Not(inner)
}
}
// convert now() --> the time in `ExecutionProps`
The point of this PR is to remove this code (it is now handled by ConstEvaluator)
}
}

#[test]
This is covered by the tests for ConstEvaluator.
let expected = "Filter: TimestampNanosecond(1599566400000000000) < CAST(totimestamp(Utf8(\"2020-09-08T12:05:00+00:00\")) AS Int64) + Int32(50000)\
// Note that constant folder runs and folds the entire
// expression down to a single constant (true)
let expected = "Filter: Boolean(true)\
The filter expression has been totally simplified 🎉
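To make the effect concrete, here is a minimal sketch of why replacing now() inside the constant evaluator lets a whole predicate collapse. It does not use DataFusion's real types: Expr, ExecutionProps, and fold below are toy stand-ins invented for illustration, and timestamps are modeled as plain i64 nanoseconds.

```rust
/// Toy stand-in for the per-query properties (hypothetical, not DataFusion's struct).
struct ExecutionProps {
    query_start_nanos: i64,
}

/// Toy expression tree (hypothetical, far smaller than DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Int64(i64),
    Boolean(bool),
    Column(String),
    Now,
    Add(Box<Expr>, Box<Expr>),
    Lt(Box<Expr>, Box<Expr>),
}

/// Fold constants bottom-up. `Now` becomes a literal taken from `props`,
/// so any parent whose children are all literals can be folded as well.
fn fold(expr: Expr, props: &ExecutionProps) -> Expr {
    match expr {
        Expr::Now => Expr::Int64(props.query_start_nanos),
        Expr::Add(l, r) => match (fold(*l, props), fold(*r, props)) {
            // Only fold when the addition does not overflow.
            (Expr::Int64(a), Expr::Int64(b)) if a.checked_add(b).is_some() => {
                Expr::Int64(a + b)
            }
            (l, r) => Expr::Add(Box::new(l), Box::new(r)),
        },
        Expr::Lt(l, r) => match (fold(*l, props), fold(*r, props)) {
            (Expr::Int64(a), Expr::Int64(b)) => Expr::Boolean(a < b),
            (l, r) => Expr::Lt(Box::new(l), Box::new(r)),
        },
        // Literals and column references are left untouched.
        other => other,
    }
}
```

In this toy model, a predicate of the form now() < literal + 50000 folds all the way to a Boolean literal, while a predicate that also references a column is only partially folded: the now() side becomes a literal and the comparison itself stays in place.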
let plan = LogicalPlanBuilder::from(table_scan)
.filter(
now_expr()
cast_to_int64_expr(now_expr())
It will be pretty awesome when we get #194, so that casting is no longer needed to do timestamp arithmetic.
// To evaluate stable functions, need ExecutionProps, see
// Simplifier for code that does that.
Volatility::Stable => false,
// Values for functions such as now() are taken from ExecutionProps
Here is the actual change that allows the const evaluator to replace now() with a constant.
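The rule this change touches can be sketched roughly as follows. The enum mirrors the three volatility classes named in the diff, but the helper can_evaluate_at_plan_time is a hypothetical name used only for this illustration, not the function in the PR. The assumption is that a Stable function returns the same value for the duration of a query, so its value can be fixed once from ExecutionProps.

```rust
/// Toy copy of the three volatility classes (illustration only).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Volatility {
    Immutable,
    Stable,
    Volatile,
}

/// Hypothetical helper: may a function with this volatility be
/// replaced by a constant while the plan is being optimized?
fn can_evaluate_at_plan_time(volatility: Volatility) -> bool {
    match volatility {
        // Same inputs always give the same output.
        Volatility::Immutable => true,
        // Same output within one query, e.g. now(); the value is
        // assumed to come from the per-query execution properties.
        Volatility::Stable => true,
        // May differ on every call, e.g. random(); never folded.
        Volatility::Volatile => false,
    }
}
```

Before this PR the Stable arm returned false, which is why now() had to be rewritten separately in the Simplifier.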
75d0b21 to 8c1d856
This one is now ready for review @rdettai @houqp and @Dandandan
rdettai left a comment
Looks great! thanks Andrew 😃
Which issue does this PR close?
Resolves #1175
Rationale for this change
This is a follow-on suggestion from @rdettai on #1153: https://github.com/apache/arrow-datafusion/pull/1153/files#r735437280
Namely, the substitution of now() for the current value is better described as "constant evaluation" than as "algebraic simplification", so it should be done in the ConstEvaluator code.
It also has the very nice property that expressions that include now() can be more completely evaluated (will comment inline).
What changes are included in this PR?
Move code to fold Stable functions like now() from Simplifier to ConstEvaluator
Are there any user-facing changes?
MOAR constant folding!