-
Notifications
You must be signed in to change notification settings - Fork 1.4k
fix: Fix partition filters with timestamp value #12368
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Fix partition filters with timestamp value #12368
Conversation
✅ Deploy Preview for meta-velox canceled.
|
|
@rui-mo Please take a look. Thanks. |
rui-mo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @kecookier, thanks for your fix! Is this change proposed to align the behaviour with Spark, and could you please provide more details on the issue you have met?
| const std::unordered_map<std::string, std::shared_ptr<HiveColumnHandle>>& | ||
| partitionKeysHandle); | ||
| partitionKeysHandle, | ||
| bool asLocalTime = true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need the default value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rui-mo testFilters is also called by PositionalDeleteFileReader (https://github.com/facebookincubator/velox/blob/main/velox/connectors/hive/iceberg/PositionalDeleteFileReader.cpp#L109). I'm not sure whether this reader also needs this feature, so I set a default value. If this reader also needs it, I'll modify it accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gluten with Iceberg may also need to align with Spark, so remove this default value.
| VELOX_USER_FAIL("{}", status.message()); | ||
| }); | ||
| t.toGMT(Timestamp::defaultTimezone()); | ||
| partitionValueStr = "'" + t.toString() + "'"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely sure why this piece needs to be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely sure why this piece needs to be removed.
This code was introduced by https://github.com/facebookincubator/velox/pull/11754/files#diff-c62590f29a2bf22e50e276f2665e0a7666968880de73a6b4db745cbc0cfde6feL202 and is only called by TableScanTest.partitionedTableTimestampKey. To clarify the code, I moved these logics into TableScanTest.partitionedTableTimestampKey from L2043 to L2085.
|
@rui-mo Yes, This PR is attend to align with Spark, when I run all UT of After comunicate with jiake offline, I kown PR #11957 will replace this commit. When I check this PR, I found it not contain the modify of HiveConnectorUtil.cpp. Yes, this PR is intended to align with Spark. When I run all unit tests for After communicating with Jiake offline, I learned that PR #11957 will replace this commit. However, when I checked this PR, I found that it does not contain modifications to |
|
@rui-mo All comments done, could you please review it again? |
rui-mo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
| {})) { | ||
| {}, | ||
| hiveConfig_->readTimestampPartitionValueAsLocalTime( | ||
| connectorQueryCtx_->sessionProperties()))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice if we could add test for this change on the iceberg read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kecookier are you planning to address this comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rui-mo @majetideepak I am not familiar with Iceberg now, and I need some time to add this test case. This may be done in another PR. Perhaps this PR can be merged first?
|
@majetideepak Could you help review this? |
|
@majetideepak Can you help merge this PR? |
|
@kecookier I missed adding the ready-to-merge label which I did now. Meta should import and merge it. |
|
@kagamiori has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
|
Hi @kecookier, could you please rebase this PR onto the latest main? That is needed for merging this PR. Thanks! |
a158170 to
94a6862
Compare
Hi @kagamiori, I've rebased the PR onto the latest main. Please let me know if you need anything else. Thanks. |
|
@kagamiori has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
|
Hi @kecookier, there is an internal build failure, possibly due to passing non-constant-literal string to fmt::format in TableScanTest.cpp. Could you take a look? |
|
@kagamiori Thanks for the review, I have addressed this issue. |
|
@kagamiori has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
|
Hi @kecookier, I updated the imported code but still see the same build error. I think the problem is that fmt::format() cannot accept a variable at the first argument (but rather expect a constant literal, e.g., "abc{}"). |
Hi @kagamiori, what's your build environment? I tested this in Clang 15 on macOS. |
Hi @kecookier, I think we're using clang-17 on linux internally according to the error message above. |
|
@kagamiori I built with clang-17 in an Ubuntu Docker container, and it succeeded. Although I couldn't reproduce this build error, I also tried to make some fixes based on your suggestions. I deleted the helper function expect(). Could you help trigger your internal build? |
|
@kagamiori has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
|
Hi @kecookier, thank you for updating! The internal build now gets another error with sql() that is similar. I think this is also due to passing a variable as the first argument to fmt::format. Could we avoid that too if possible? |
|
Hi @kagamiori, I've removed the helper function sql() as suggested. Please take a look. |
7b0ee08 to
0f72f61
Compare
|
@kagamiori has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
|
@kagamiori merged this pull request in e8dd265. |
) Summary: When using partition filters with timestamp value, use configuration to control whether to interpret it as local time or UTC. Follow-up for facebookincubator#11957 Pull Request resolved: facebookincubator#12368 Reviewed By: kKPulla Differential Revision: D71047729 Pulled By: kagamiori fbshipit-source-id: 471011591f5c8ff70507d49e425b9749a06b02d4
When using partition filters with timestamp value, use configuration
to control whether to interpret it as local time or UTC.
Follow-up for #11957