Materialize tables in the experimental Parquet reader #19308

Merged
rapids-bot[bot] merged 39 commits into rapidsai:branch-25.08 from mhaseeb123:fea/materialize-hybrid-scan-columns
Jul 24, 2025

Conversation

@mhaseeb123
Member

@mhaseeb123 mhaseeb123 commented Jul 8, 2025

Description

Contributes to #17896. Completes #18011.

This PR implements table materialization functions in the experimental Parquet reader. The experimental reader now derives from the base Parquet reader and reimplements only the functions it needs, reusing the base functions wherever possible.

Most of the functions reimplemented by the experimental reader are nearly identical to their base-reader counterparts; the differences are noted in the code comments.
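
The structure described above can be illustrated with a minimal sketch. It is not the libcudf code: the class names `base_reader` and `experimental_reader`, the step names, and the use of virtual functions are all assumptions made for illustration, and the actual PR may dispatch differently.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the base Parquet reader (not the libcudf class).
class base_reader {
 public:
  virtual ~base_reader() = default;

  // Shared driver: derived readers reuse this function unchanged.
  std::vector<std::string> read()
  {
    std::vector<std::string> steps;
    steps.push_back(select_columns());
    steps.push_back(decode_pages());
    return steps;
  }

 protected:
  // Default implementations of the individual steps.
  virtual std::string select_columns() { return "base: select columns"; }
  virtual std::string decode_pages() { return "base: decode pages"; }
};

// Hypothetical experimental reader: reimplements only the step that differs.
class experimental_reader : public base_reader {
 protected:
  // Differs from the base reader: this step is specialized here, while
  // select_columns() and read() are inherited as-is.
  std::string decode_pages() override { return "experimental: decode pages"; }
};
```

Calling `read()` on an `experimental_reader` runs the inherited driver and the inherited `select_columns()`, and picks up only the reimplemented `decode_pages()`. This keeps the code that needs separate maintenance limited to the steps that genuinely differ.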

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@mhaseeb123 mhaseeb123 self-assigned this Jul 8, 2025
@mhaseeb123 mhaseeb123 added the DO NOT MERGE Hold off on merging; see PR for details label Jul 8, 2025
@copy-pr-bot

copy-pr-bot bot commented Jul 8, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@github-actions github-actions bot added libcudf Affects libcudf (C++/CUDA) code. CMake CMake build issue labels Jul 8, 2025
@mhaseeb123 mhaseeb123 changed the title from "Fea/materialize hybrid scan columns" to "🚧 Materialize tables in the experimental Parquet reader" Jul 8, 2025
@mhaseeb123 mhaseeb123 added 2 - In Progress Currently a work in progress cuIO cuIO issue labels Jul 8, 2025
@mhaseeb123 mhaseeb123 added feature request New feature or request non-breaking Non-breaking change and removed DO NOT MERGE Hold off on merging; see PR for details labels Jul 10, 2025
@mhaseeb123 mhaseeb123 requested a review from vuule July 22, 2025 18:24
@mhaseeb123 mhaseeb123 added 4 - Needs Review Waiting for reviewer to review or respond and removed 3 - Ready for Review Ready for review by team labels Jul 22, 2025
@mhaseeb123
Member Author

/ok to test

@copy-pr-bot

copy-pr-bot bot commented Jul 22, 2025

/ok to test

@mhaseeb123, there was an error processing your request: E1

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/

@mhaseeb123
Member Author

/ok to test 317d9e7

@mhaseeb123
Member Author

/ok to test 334c3ff

@mhaseeb123 mhaseeb123 moved this from Burndown to Slip in libcudf Jul 23, 2025
@mhaseeb123 mhaseeb123 added the DO NOT MERGE Hold off on merging; see PR for details label Jul 23, 2025
@mhaseeb123 mhaseeb123 requested a review from ttnghia July 23, 2025 20:01
@mhaseeb123 mhaseeb123 removed the DO NOT MERGE Hold off on merging; see PR for details label Jul 23, 2025
@mhaseeb123
Member Author

mhaseeb123 commented Jul 23, 2025

@ttnghia @vuule I made some changes to this PR, mainly replacing thrust::host_vector<bool> with std::vector<bool>: we can't use Thrust host vectors across the C++/CUDA boundary due to ABI incompatibility issues (for instance, in the examples). See #19469 and #swrapids-cpp for more info.
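
As a rough illustration of that change, the sketch below shows a host-side function whose signature uses only `std::vector<bool>`, so a plain C++ caller can build the argument without including Thrust headers. The function name `count_selected_rows` and the idea that the vector acts as a row mask are assumptions made for this example; they are not taken from the PR.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: counts the entries set to true in a host-side mask.
// Only standard library types appear in the signature, so the same
// declaration can be used from C++ and CUDA translation units alike.
std::size_t count_selected_rows(std::vector<bool> const& row_mask)
{
  return static_cast<std::size_t>(std::count(row_mask.begin(), row_mask.end(), true));
}
```

Note that `std::vector<bool>` is a bit-packed specialization, so it does not expose a contiguous `bool*` buffer; code that needs one has to copy the values into a different container first.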

@mhaseeb123 mhaseeb123 moved this from Slip to Burndown in libcudf Jul 23, 2025
@mhaseeb123
Member Author

/ok to test ed3f925

@mhaseeb123 mhaseeb123 added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 4 - Needs Review Waiting for reviewer to review or respond labels Jul 23, 2025
@mhaseeb123
Member Author

/merge

@rapids-bot rapids-bot bot merged commit 4a3d433 into rapidsai:branch-25.08 Jul 24, 2025
98 checks passed
@mhaseeb123 mhaseeb123 deleted the fea/materialize-hybrid-scan-columns branch July 24, 2025 01:04
@mhaseeb123 mhaseeb123 moved this from Burndown to Landed in libcudf Jul 24, 2025
@GregoryKimball GregoryKimball removed this from libcudf Sep 3, 2025

Labels

  • 5 - Ready to Merge: Testing and reviews complete, ready to merge
  • CMake: CMake build issue
  • cuIO: cuIO issue
  • feature request: New feature or request
  • libcudf: Affects libcudf (C++/CUDA) code.
  • non-breaking: Non-breaking change

4 participants