CX-39170: DF51 / Arrow 57 upgrade #420
Draft
avantgardnerio wants to merge 10 commits into 51_base
- Point arrow/parquet deps at CX fork (rev 7d5c1c973)
- Relax object_store version (>=0.12.4, <0.13)
- CI cleanup: remove push trigger, delete unused workflows, trim feature checks
- Update PR template

Co-Authored-By: Claude Opus 4.6 <[email protected]>
joroKr21 reviewed on Apr 29, 2026
  ] }
  apache-avro = { version = "0.20", default-features = false }
- arrow = { version = "57.0.0", features = [
+ arrow = { git = "https://github.com/Coralogix/arrow-rs.git", rev = "7d5c1c973", features = [
We don't need to do this, right? We override the arrow version in DQE anyway.
Author
Then CI doesn't test DataFusion against our fork changes, which is the purpose of this PR. Does anything in Coralogix use DataFusion other than DQE?
Combines three fork-only commits from v49:
- Hook for doing distributed CollectLeft joins (#269/apache#12523)
- Add JoinContext with JoinLeftData to TaskContext in HashJoinExec (#300)
- Make HASH_JOIN_SEED public (fork-only)

Adds SharedJoinState/SharedJoinStateImpl trait for distributed probe coordination, JoinContext for sharing build-side state via TaskContext, contains_hash on JoinHashMapType, and converts process_unmatched_build_batch to async for shared state polling.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
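The probe coordination described in this commit can be sketched roughly as follows. This is a minimal, std-only illustration and not the fork's actual API: the `SharedJoinState` trait here is hypothetical, the method names `mark_matched` and `probe_finished` are invented for the example, and build-side rows are plain indices rather than Arrow batches. It shows one plausible shape of the idea: every probe task reports the build rows it matched, and only the last task to finish receives the unmatched build rows.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Hypothetical trait (names are assumptions, not the fork's API):
/// coordinates several probe tasks that share one build side.
pub trait SharedJoinState: Send + Sync {
    /// Record build-side row indices that a probe task matched.
    fn mark_matched(&self, build_rows: &[usize]);
    /// Signal that one probe task finished. Returns the unmatched build
    /// rows if this was the last outstanding task, otherwise `None`.
    fn probe_finished(&self) -> Option<Vec<usize>>;
}

struct Inner {
    matched: HashSet<usize>,
    remaining_probes: usize,
}

/// Simple in-process implementation guarded by a mutex.
pub struct SharedJoinStateImpl {
    build_row_count: usize,
    inner: Mutex<Inner>,
}

impl SharedJoinStateImpl {
    pub fn new(build_row_count: usize, probe_tasks: usize) -> Self {
        Self {
            build_row_count,
            inner: Mutex::new(Inner {
                matched: HashSet::new(),
                remaining_probes: probe_tasks,
            }),
        }
    }
}

impl SharedJoinState for SharedJoinStateImpl {
    fn mark_matched(&self, build_rows: &[usize]) {
        let mut inner = self.inner.lock().unwrap();
        inner.matched.extend(build_rows.iter().copied());
    }

    fn probe_finished(&self) -> Option<Vec<usize>> {
        let mut inner = self.inner.lock().unwrap();
        if inner.remaining_probes == 0 {
            return None; // no outstanding probe tasks to account for
        }
        inner.remaining_probes -= 1;
        if inner.remaining_probes > 0 {
            return None; // other probe tasks are still running
        }
        // Last task: every build row that nobody matched is returned once.
        let unmatched = (0..self.build_row_count)
            .filter(|row| !inner.matched.contains(row))
            .collect();
        Some(unmatched)
    }
}
```

With four build rows and two probe tasks, the first `probe_finished` call yields `None`, and the second yields only the rows that neither task matched. A distributed variant would presumably replace the mutex with remote polling, which is consistent with the commit making `process_unmatched_build_batch` async.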
Force-pushed from 512a189 to 6575a30
* ignore writer shutdown error
* cargo check

---

[Cherry-pick summary: v46→v47]
Source commit: eaf5520 (Ignore writer shutdown error (#271))
Strategy: cherry-picked cleanly
Upstream PR: fork-only
Test coverage: insufficient (no dedicated unit test for this error path; behaviour is a runtime edge case)
Tests: cargo nextest run -p datafusion-datasource passed

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
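As a rough illustration of what "ignore writer shutdown error" can look like, here is a simplified synchronous sketch. The `Sink` trait, `write_all_and_close`, and `FlakySink` are hypothetical names made up for this example; they are not the types used in datafusion-datasource, and the commit does not state its rationale, so the comment in the code is an assumption.

```rust
use std::io;

/// Hypothetical writer interface, a synchronous stand-in for illustration.
pub trait Sink {
    fn write(&mut self, data: &[u8]) -> io::Result<()>;
    fn shutdown(&mut self) -> io::Result<()>;
}

/// Writes all chunks, propagating write errors, then shuts the writer
/// down and deliberately discards any shutdown error.
pub fn write_all_and_close<S: Sink>(sink: &mut S, chunks: &[&[u8]]) -> io::Result<usize> {
    let mut written = 0;
    for chunk in chunks {
        sink.write(chunk)?;
        written += chunk.len();
    }
    // Assumption: by now all data has been handed to the writer, so a
    // failed shutdown is ignored instead of failing the whole operation.
    let _ = sink.shutdown();
    Ok(written)
}

/// Test double whose shutdown always fails.
pub struct FlakySink {
    pub buf: Vec<u8>,
}

impl Sink for FlakySink {
    fn write(&mut self, data: &[u8]) -> io::Result<()> {
        self.buf.extend_from_slice(data);
        Ok(())
    }

    fn shutdown(&mut self) -> io::Result<()> {
        Err(io::Error::new(io::ErrorKind::BrokenPipe, "already closed"))
    }
}
```

The test double makes the missing coverage noted in the cherry-pick summary concrete: a unit test for this path would need a writer whose shutdown fails after successful writes.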
---

[Cherry-pick summary: v46→v47]
Source commit: 4fff23e (Disable grouping set in CSE (fork only))
Strategy: cherry-picked cleanly
Upstream PR: fork-only
Test coverage: insufficient (no dedicated test for this early-return path; the change prevents a panic/incorrect optimization with GroupingSet expressions)
Tests: cargo nextest run -p datafusion-optimizer passed (579 tests)

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
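The early-return path mentioned in this summary can be illustrated with a toy common-subexpression pass. Everything here is an assumption for the sake of the example: the `Expr` enum is a three-variant stand-in for a real logical expression type, and `common_subexprs` is a naive quadratic search, not the optimizer's actual algorithm. The only point is the guard at the top, which skips the rewrite whenever a grouping set appears.

```rust
/// Toy expression tree; a stand-in for a real logical expression type.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    Add(Box<Expr>, Box<Expr>),
    GroupingSet(Vec<Expr>),
}

/// True if the expression is, or contains, a grouping set.
fn contains_grouping_set(expr: &Expr) -> bool {
    match expr {
        Expr::GroupingSet(_) => true,
        Expr::Add(l, r) => contains_grouping_set(l) || contains_grouping_set(r),
        Expr::Column(_) => false,
    }
}

/// Collect every non-leaf subexpression, including `expr` itself.
fn collect_subexprs(expr: &Expr, out: &mut Vec<Expr>) {
    if let Expr::Add(l, r) = expr {
        out.push(expr.clone());
        collect_subexprs(l, out);
        collect_subexprs(r, out);
    }
}

/// Returns subexpressions that occur more than once, in first-seen order.
pub fn common_subexprs(exprs: &[Expr]) -> Vec<Expr> {
    // Early return: do nothing when any expression has a grouping set.
    if exprs.iter().any(contains_grouping_set) {
        return Vec::new();
    }
    let mut all = Vec::new();
    for expr in exprs {
        collect_subexprs(expr, &mut all);
    }
    let mut common = Vec::new();
    for (i, expr) in all.iter().enumerate() {
        let repeated = all[i + 1..].contains(expr);
        if repeated && !common.contains(expr) {
            common.push(expr.clone());
        }
    }
    common
}
```

A dedicated test for the real change would follow the same pattern: one input where a repeated expression is found, and one where the same repetition is left alone because a grouping set is present.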
apache#20063) v53

Extends v49 cherry-pick a296c12 (decode-only) with the encode-side fix: seed DictionaryTracker via schema_to_bytes_with_dictionary_tracker before encoded_batch, so IPC has dict IDs for nested dictionary arrays.

Adapted from upstream apache#20063 (which targets arrow 57's new encode API; we retain the arrow-56 encoded_batch API and just add the seed call).
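Why seeding has to happen before encoding can be shown with a toy model. This is not arrow's implementation: `Field`, `ToyTracker`, `seed_from_schema`, and `encode_dict_ids` are all hypothetical names, and dictionary IDs are keyed by a dotted field path purely for illustration. The sketch only demonstrates the ordering constraint the commit describes: an encoder that looks IDs up in a tracker fails on a nested dictionary field unless the tracker has first walked the schema.

```rust
use std::collections::HashMap;

/// Toy schema field; `Dict` marks a dictionary-encoded column.
pub enum Field {
    Plain(String),
    Dict(String),
    Struct(String, Vec<Field>),
}

/// Toy tracker mapping a dotted field path to its dictionary ID.
#[derive(Default)]
pub struct ToyTracker {
    ids: HashMap<String, i64>,
}

impl ToyTracker {
    /// Walk the whole schema, nested fields included, and assign IDs.
    pub fn seed_from_schema(&mut self, fields: &[Field], prefix: &str) {
        for field in fields {
            match field {
                Field::Plain(_) => {}
                Field::Dict(name) => {
                    let next = self.ids.len() as i64;
                    self.ids.entry(format!("{prefix}{name}")).or_insert(next);
                }
                Field::Struct(name, children) => {
                    self.seed_from_schema(children, &format!("{prefix}{name}."));
                }
            }
        }
    }
}

/// Toy encoder: returns the dictionary IDs it would write, in field
/// order, and fails when a dictionary field was never registered.
pub fn encode_dict_ids(
    fields: &[Field],
    prefix: &str,
    tracker: &ToyTracker,
) -> Result<Vec<i64>, String> {
    let mut out = Vec::new();
    for field in fields {
        match field {
            Field::Plain(_) => {}
            Field::Dict(name) => {
                let path = format!("{prefix}{name}");
                let id = tracker
                    .ids
                    .get(&path)
                    .ok_or_else(|| format!("no dict ID for {path}"))?;
                out.push(*id);
            }
            Field::Struct(name, children) => {
                let nested = format!("{prefix}{name}.");
                out.extend(encode_dict_ids(children, &nested, tracker)?);
            }
        }
    }
    Ok(out)
}
```

In this model, encoding a struct with a dictionary child against an empty tracker returns an error, while the same call succeeds after `seed_from_schema`, which mirrors adding the seed call ahead of `encoded_batch`.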
Force-pushed from 5717991 to 532c45a
Also makes topk module public for downstream access to TopKDynamicFilters.
Summary
Test plan
🤖 Generated with Claude Code