@adriangb
Contributor

Steps towards #14993

@github-actions github-actions bot added the datasource Changes to the datasource crate label Oct 22, 2025
@adriangb adriangb force-pushed the use-table-schema-filescanconfig branch from a2348e5 to 212a032 Compare October 22, 2025 20:35
@adriangb adriangb force-pushed the use-table-schema-filescanconfig branch from 212a032 to 52b2dd4 Compare October 22, 2025 20:37
@Weijun-H Weijun-H added the api change Changes the API exposed to users of the crate label Oct 24, 2025
@Weijun-H Weijun-H added this pull request to the merge queue Oct 24, 2025
Merged via the queue into apache:main with commit 665a552 Oct 24, 2025
28 checks passed
@adriangb adriangb deleted the use-table-schema-filescanconfig branch October 24, 2025 13:35
adriangb added a commit to pydantic/datafusion that referenced this pull request Oct 24, 2025
@adriangb
Contributor Author

Followup to add to upgrade guide: #18269

adriangb added a commit to pydantic/datafusion that referenced this pull request Oct 27, 2025
github-merge-queue bot pushed a commit that referenced this pull request Oct 27, 2025
## Which issue does this PR close?

- Related to #14993

## Rationale for this change

To enable expression pushdown to file sources, we need to plumb
expressions through the `FileScanConfig` layer. Currently,
`FileScanConfig` only tracks column indices for projection, which limits
us to simple and naive column selection.

This PR begins the expression pushdown implementation by having
`FileScanConfig` own a list of `ProjectionExpr`s instead of column
indices. This allows file sources to eventually receive the actual
expressions being projected and optimize based on them.
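
To illustrate the difference between index-based and expression-based projection, here is a minimal, self-contained sketch. It does not use DataFusion's real API: `Expr`, the simplified `ProjectionExpr` struct, `projection_from_indices`, `project_row`, and `required_columns` are all hypothetical stand-ins, and rows are modelled as plain `i64` slices rather than Arrow record batches.

```rust
// Hypothetical, simplified stand-ins. These are NOT DataFusion's real types.

/// A tiny expression language over the columns of a file schema.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    /// Reference to a column by its index in the file schema.
    Column(usize),
    /// Integer literal.
    Literal(i64),
    /// Sum of two sub-expressions.
    Add(Box<Expr>, Box<Expr>),
}

impl Expr {
    /// Evaluate the expression against a single row of the file.
    pub fn evaluate(&self, row: &[i64]) -> i64 {
        match self {
            Expr::Column(i) => row[*i],
            Expr::Literal(v) => *v,
            Expr::Add(l, r) => l.evaluate(row) + r.evaluate(row),
        }
    }

    /// Append every column index referenced by this expression to `out`.
    fn collect_columns(&self, out: &mut Vec<usize>) {
        match self {
            Expr::Column(i) => out.push(*i),
            Expr::Literal(_) => {}
            Expr::Add(l, r) => {
                l.collect_columns(out);
                r.collect_columns(out);
            }
        }
    }
}

/// An expression plus its output column name (assumed shape, for illustration).
#[derive(Debug, Clone, PartialEq)]
pub struct ProjectionExpr {
    pub expr: Expr,
    pub alias: String,
}

/// An index-based projection is the special case where every expression
/// is a bare column reference named after the schema field.
pub fn projection_from_indices(indices: &[usize], schema: &[&str]) -> Vec<ProjectionExpr> {
    indices
        .iter()
        .map(|&i| ProjectionExpr {
            expr: Expr::Column(i),
            alias: schema[i].to_string(),
        })
        .collect()
}

/// Apply a projection to one row, producing one value per projected expression.
pub fn project_row(projection: &[ProjectionExpr], row: &[i64]) -> Vec<i64> {
    projection.iter().map(|p| p.expr.evaluate(row)).collect()
}

/// Sorted, de-duplicated file columns a source would need to read
/// in order to evaluate the whole projection.
pub fn required_columns(projection: &[ProjectionExpr]) -> Vec<usize> {
    let mut cols = Vec::new();
    for p in projection {
        p.expr.collect_columns(&mut cols);
    }
    cols.sort_unstable();
    cols.dedup();
    cols
}
```

In this sketch an index list such as `[2, 0]` converts losslessly into expressions, while something like `a + 1` has no index-only representation. That is the limitation the rationale above describes, under the stated simplifications.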


## Notes about this PR
- The first commit is based on #18231.
- To avoid a very large diff and a harder review, I've decided to break
  #14993 into 2 tasks:
  - Have the `DataSource` (`FileScanConfig`) actually hold projection
    expressions (this PR)
  - Flow the projection expressions from `DataSourceExec` all the way to
    the `FileSource`

---------

Co-authored-by: Adrian Garcia Badaracco <[email protected]>
tobixdev pushed a commit to tobixdev/datafusion that referenced this pull request Nov 2, 2025
tobixdev pushed a commit to tobixdev/datafusion that referenced this pull request Nov 2, 2025
codetyri0n pushed a commit to codetyri0n/datafusion that referenced this pull request Nov 11, 2025
EeshanBembi pushed a commit to EeshanBembi/datafusion that referenced this pull request Nov 24, 2025