
High level interface for experimental PQ reader and implementation of metadata APIs #18480

Merged
rapids-bot[bot] merged 60 commits into rapidsai:branch-25.06 from mhaseeb123:fea/hybrid-scan-metadata-apis
Apr 30, 2025



@mhaseeb123 mhaseeb123 commented Apr 11, 2025

Description

Contributes to #17896. Part of #18011.

This PR adds the high-level interface (APIs) for a new experimental Parquet reader optimized for highly selective (hybrid scan) queries. It also implements the reader's basic metadata-related APIs, such as reading the file footer and the PageIndex.
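
For readers unfamiliar with what "reading the file footer" involves, here is a minimal sketch based only on the Parquet file layout, not on this PR's code. An unencrypted Parquet file ends with the serialized footer metadata, a 4-byte little-endian footer length, and the magic bytes PAR1. The function name `locate_footer` and the in-memory `bytes` input are assumptions made for illustration; they are not part of the libcudf API.

```python
import struct
from typing import Tuple

PARQUET_MAGIC = b"PAR1"  # magic bytes at the start and end of an unencrypted file


def locate_footer(file_bytes: bytes) -> Tuple[int, int]:
    """Return (offset, length) of the serialized footer metadata.

    Hypothetical helper for illustration; not a cudf function.
    """
    # Smallest possible file: leading magic + footer length + trailing magic.
    if len(file_bytes) < 12:
        raise ValueError("too short to be a Parquet file")
    if file_bytes[:4] != PARQUET_MAGIC or file_bytes[-4:] != PARQUET_MAGIC:
        raise ValueError("missing PAR1 magic bytes")
    # The 4 bytes before the trailing magic hold the footer length (little-endian).
    (footer_len,) = struct.unpack("<I", file_bytes[-8:-4])
    footer_start = len(file_bytes) - 8 - footer_len
    if footer_start < 4:
        raise ValueError("footer length exceeds file size")
    return footer_start, footer_len
```

A caller would then decode the bytes in that range as Thrift-encoded file metadata. Decoding is omitted here because it depends on the Thrift schema.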

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.


copy-pr-bot bot commented Apr 11, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.


@github-actions bot added the libcudf and CMake labels Apr 11, 2025
@mhaseeb123 added the 3 - Ready for Review, cuIO, feature request, and non-breaking labels Apr 11, 2025
@mhaseeb123 mhaseeb123 marked this pull request as ready for review April 11, 2025 00:25
@mhaseeb123 mhaseeb123 requested review from a team as code owners April 11, 2025 00:25
@mhaseeb123 mhaseeb123 marked this pull request as draft April 11, 2025 00:25
@mhaseeb123 mhaseeb123 marked this pull request as ready for review April 11, 2025 18:00

copy-pr-bot bot commented Apr 29, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.


@mhaseeb123 mhaseeb123 requested a review from nvdbaranec April 29, 2025 21:52
@mhaseeb123 (Member Author) commented

/ok to test 03eae85

mhaseeb123 added a commit to mhaseeb123/cudf that referenced this pull request Apr 30, 2025
@nvdbaranec (Contributor) left a comment


Good docs. This feels very complicated to use, but we can shake out any issues when integrating with Spark.

@mhaseeb123 (Member Author) commented

/merge

@rapids-bot rapids-bot bot merged commit 33444ba into rapidsai:branch-25.06 Apr 30, 2025
112 checks passed
@mhaseeb123 mhaseeb123 deleted the fea/hybrid-scan-metadata-apis branch April 30, 2025 18:51
@GregoryKimball GregoryKimball moved this to Landed in libcudf May 5, 2025
@GregoryKimball GregoryKimball removed this from libcudf Jul 11, 2025

Labels

4 - Needs Review, CMake, cuIO, feature request, libcudf, non-breaking


6 participants