Commit 9724c8d

add upgrading entry
1 parent 8833c30 commit 9724c8d

2 files changed

+116
-1
lines changed

datafusion/datasource/src/file_scan_config.rs

Lines changed: 1 addition & 1 deletion
@@ -697,7 +697,7 @@ impl DataSource for FileScanConfig {
         &self,
         projection: &ProjectionExprs,
     ) -> Result<Option<Arc<dyn DataSource>>> {
-        match self.file_source.try_pushdown_projection(&projection)? {
+        match self.file_source.try_pushdown_projection(projection)? {
             Some(new_source) => {
                 let mut new_file_scan_config = self.clone();
                 new_file_scan_config.file_source = new_source;

docs/source/library-user-guide/upgrading.md

Lines changed: 115 additions & 0 deletions
@@ -202,6 +202,121 @@ Additionally, the FFI structure for Scalar UDF's no longer contains a
202202
`return_type` call. This code was not used since the `ForeignScalarUDF`
203203
struct implements the `return_field_from_args` instead.
204204

205+
### Projection handling moved from FileScanConfig to FileSource
206+
207+
Projection handling has been moved from `FileScanConfig` into `FileSource` implementations. This enables format-specific projection pushdown (e.g., Parquet can push down struct field access, Vortex can push down computed expressions into un-decoded data).
208+
209+
**Who is affected:**
210+
211+
- Users who maintain custom `FileSource` implementations
212+
- Users who use `FileScanConfigBuilder::with_projection_indices` directly
213+
214+
**Breaking changes:**
215+
216+
1. **`FileSource::with_projection` replaced with `try_pushdown_projection`:**
217+
218+
The `with_projection(&self, config: &FileScanConfig) -> Arc<dyn FileSource>` method has been removed and replaced with `try_pushdown_projection(&self, projection: &ProjectionExprs) -> Result<Option<Arc<dyn FileSource>>>`.
219+
220+
2. **`FileScanConfig.projection_exprs` field removed:**
221+
222+
Projections are now stored in the `FileSource` directly, not in `FileScanConfig`.
223+
Various public helper methods that access projection information have been removed from `FileScanConfig`.
224+
225+
3. **`FileScanConfigBuilder::with_projection_indices` now returns `Result<Self>`:**
226+
227+
This method now returns an error if the projection pushdown fails.
228+
229+
4. **`FileSource::create_file_opener` now returns `Result<Arc<dyn FileOpener>>`:**
230+
231+
Previously, this method returned `Arc<dyn FileOpener>` directly.
232+
Any `FileSource` implementation that may fail to create a `FileOpener` should now return an appropriate error.
233+
234+
5. **`DataSource::try_swapping_with_projection` signature changed:**
235+
236+
The projection parameter's type changed from `&[ProjectionExpr]` to `&ProjectionExprs`.
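
The new `Result<Option<...>>` return type of `try_pushdown_projection` has three outcomes that callers need to distinguish: an error, a declined pushdown (`None`), and a new source that has absorbed the projection (`Some`). The following is a minimal, self-contained sketch of that contract. All types in it (`ProjectionExprs` as a list of strings, the two-method `FileSource` trait, `ColumnOnlySource`, and the `pushdown_applied` helper) are hypothetical stand-ins written for illustration and are not the DataFusion definitions.

```rust
use std::sync::Arc;

// Stand-in error handling; DataFusion uses its own `Result` and error type.
type Result<T> = std::result::Result<T, String>;

// Hypothetical stand-in: a projection is modeled as a list of expression strings.
#[derive(Clone, Debug, PartialEq)]
struct ProjectionExprs(Vec<String>);

// Hypothetical, reduced version of the trait with only the two relevant methods.
trait FileSource {
    fn try_pushdown_projection(
        &self,
        projection: &ProjectionExprs,
    ) -> Result<Option<Arc<dyn FileSource>>>;

    fn projection(&self) -> Option<&ProjectionExprs>;
}

// Hypothetical source that can only push down plain column references.
struct ColumnOnlySource {
    columns: Vec<String>,
    projection: Option<ProjectionExprs>,
}

impl FileSource for ColumnOnlySource {
    fn try_pushdown_projection(
        &self,
        projection: &ProjectionExprs,
    ) -> Result<Option<Arc<dyn FileSource>>> {
        // Err: the projection itself is malformed (illustrative failure condition).
        if projection.0.iter().any(|expr| expr.is_empty()) {
            return Err("malformed projection: empty expression".to_string());
        }
        // Ok(None): something other than a known column was requested, so decline.
        if !projection.0.iter().all(|expr| self.columns.contains(expr)) {
            return Ok(None);
        }
        // Ok(Some(_)): return a new source that stores the projection.
        Ok(Some(Arc::new(ColumnOnlySource {
            columns: self.columns.clone(),
            projection: Some(projection.clone()),
        })))
    }

    fn projection(&self) -> Option<&ProjectionExprs> {
        self.projection.as_ref()
    }
}

// Caller side: `?` propagates the error, the match separates the other two cases.
fn pushdown_applied(source: &dyn FileSource, projection: &ProjectionExprs) -> Result<bool> {
    match source.try_pushdown_projection(projection)? {
        Some(new_source) => Ok(new_source.projection() == Some(projection)),
        None => Ok(false),
    }
}

fn main() {
    let source = ColumnOnlySource {
        columns: vec!["a".to_string(), "b".to_string()],
        projection: None,
    };
    let plain = ProjectionExprs(vec!["a".to_string()]);
    let computed = ProjectionExprs(vec!["a + 1".to_string()]);
    println!("{:?}", pushdown_applied(&source, &plain));
    println!("{:?}", pushdown_applied(&source, &computed));
}
```

Note that the caller passes `projection` as is, without an extra `&`, since it is already a reference.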
237+
238+
**Migration guide:**
239+
240+
If you have a custom `FileSource` implementation:
241+
242+
**Before:**
243+
244+
```rust,ignore
245+
impl FileSource for MyCustomSource {
246+
fn with_projection(&self, config: &FileScanConfig) -> Arc<dyn FileSource> {
247+
// Apply projection from config
248+
Arc::new(Self { /* ... */ })
249+
}
250+
251+
fn create_file_opener(
252+
&self,
253+
object_store: Arc<dyn ObjectStore>,
254+
base_config: &FileScanConfig,
255+
partition: usize,
256+
) -> Arc<dyn FileOpener> {
257+
Arc::new(MyOpener { /* ... */ })
258+
}
259+
}
260+
```
261+
262+
**After:**
263+
264+
```rust,ignore
265+
impl FileSource for MyCustomSource {
266+
fn try_pushdown_projection(
267+
&self,
268+
projection: &ProjectionExprs,
269+
) -> Result<Option<Arc<dyn FileSource>>> {
270+
// Return None if projection cannot be pushed down
271+
// Return Some(new_source) with projection applied if it can
272+
Ok(Some(Arc::new(Self {
273+
projection: Some(projection.clone()),
274+
/* ... */
275+
})))
276+
}
277+
278+
fn projection(&self) -> Option<&ProjectionExprs> {
279+
self.projection.as_ref()
280+
}
281+
282+
fn create_file_opener(
283+
&self,
284+
object_store: Arc<dyn ObjectStore>,
285+
base_config: &FileScanConfig,
286+
partition: usize,
287+
) -> Result<Arc<dyn FileOpener>> {
288+
Ok(Arc::new(MyOpener { /* ... */ }))
289+
}
290+
}
291+
```
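
Because `create_file_opener` now returns a `Result`, a source that is missing required state can report that as an error instead of panicking. The sketch below illustrates the idea with hypothetical stand-ins only: the `FileOpener` trait here has a single made-up `batch_size` method, `MyCustomSource` holds an optional batch size, and the method takes just a partition index rather than the real argument list.

```rust
use std::sync::Arc;

// Stand-in error handling; DataFusion uses its own `Result` and error type.
type Result<T> = std::result::Result<T, String>;

// Hypothetical stand-in for the opener trait, reduced to one illustrative method.
trait FileOpener {
    fn batch_size(&self) -> usize;
}

struct MyOpener {
    batch_size: usize,
}

impl FileOpener for MyOpener {
    fn batch_size(&self) -> usize {
        self.batch_size
    }
}

// Hypothetical source whose batch size may not have been configured yet.
struct MyCustomSource {
    batch_size: Option<usize>,
}

impl MyCustomSource {
    // Simplified signature (assumption): the real method takes more arguments.
    fn create_file_opener(&self, partition: usize) -> Result<Arc<dyn FileOpener>> {
        // Surface missing configuration as an error rather than unwrapping.
        let batch_size = self
            .batch_size
            .ok_or_else(|| format!("batch size not set for partition {}", partition))?;
        Ok(Arc::new(MyOpener { batch_size }))
    }
}

fn main() {
    let configured = MyCustomSource { batch_size: Some(1024) };
    let unconfigured = MyCustomSource { batch_size: None };
    for (name, source) in [("configured", configured), ("unconfigured", unconfigured)] {
        match source.create_file_opener(0) {
            Ok(opener) => println!("{}: opener with batch size {}", name, opener.batch_size()),
            Err(e) => println!("{}: error: {}", name, e),
        }
    }
}
```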
292+
293+
We recommend you look at [#18627](https://github.com/apache/datafusion/pull/18627)
294+
which introduced these changes, for more examples of how the various built-in file sources handle this.
295+
296+
We have added [`SplitProjection`](https://docs.rs/datafusion-datasource/latest/datafusion_datasource/projection/struct.SplitProjection.html) and [`ProjectionOpener`](https://docs.rs/datafusion-datasource/latest/datafusion_datasource/projection/struct.ProjectionOpener.html) helpers to make it easier to handle projections in your `FileSource` implementations.
297+
298+
For file sources that can only handle simple column selections (not computed expressions), use the `SplitProjection` and `ProjectionOpener` helpers to split the projection into pushdownable and non-pushdownable parts:
299+
300+
```rust,ignore
301+
use datafusion_datasource::projection::{SplitProjection, ProjectionOpener};
302+
303+
// In try_pushdown_projection:
304+
let split = SplitProjection::new(projection, self.table_schema())?;
305+
// Use split.file_projection() for what to push down to the file format
306+
// The ProjectionOpener wrapper will handle the rest
307+
```
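
To make the idea of splitting a projection more concrete, here is a conceptual sketch. It is not the `SplitProjection` implementation and does not use its API: the `Expr` enum, the `Split` struct, and `split_projection` are hypothetical names invented for this illustration. The sketch collects the distinct columns that must be read from the file, then rewrites each expression to refer to positions within that reduced column list, so the remaining work can be applied after the file is decoded.

```rust
// Hypothetical expression type: either a plain column or `column + literal`.
#[derive(Clone, Debug, PartialEq)]
enum Expr {
    Column(usize),
    Add(usize, i64),
}

// Hypothetical result of the split.
#[derive(Debug, PartialEq)]
struct Split {
    // Sorted, de-duplicated table column indices to read from the file.
    file_columns: Vec<usize>,
    // The original expressions, rewritten to index into `file_columns`.
    remapped: Vec<Expr>,
}

fn split_projection(exprs: &[Expr]) -> Split {
    // Step 1: gather every column referenced by any expression.
    let mut file_columns: Vec<usize> = exprs
        .iter()
        .map(|e| match e {
            Expr::Column(i) | Expr::Add(i, _) => *i,
        })
        .collect();
    file_columns.sort_unstable();
    file_columns.dedup();

    // Step 2: rewrite each expression against the reduced column list.
    let position = |col: usize| {
        file_columns
            .iter()
            .position(|c| *c == col)
            .expect("column was collected in step 1")
    };
    let remapped = exprs
        .iter()
        .map(|e| match e {
            Expr::Column(i) => Expr::Column(position(*i)),
            Expr::Add(i, lit) => Expr::Add(position(*i), *lit),
        })
        .collect();

    Split { file_columns, remapped }
}

fn main() {
    let exprs = vec![Expr::Column(3), Expr::Add(1, 10), Expr::Column(1)];
    let split = split_projection(&exprs);
    println!("read from file: {:?}", split.file_columns);
    println!("apply afterwards: {:?}", split.remapped);
}
```

In this model the file format only ever sees a plain list of column indices, which is the situation the helpers are described as targeting.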
308+
309+
**For `FileScanConfigBuilder` users:**
310+
311+
```diff
312+
let config = FileScanConfigBuilder::new(url, source)
313+
- .with_projection_indices(Some(vec![0, 2, 3]))
314+
+ .with_projection_indices(Some(vec![0, 2, 3]))?
315+
.build();
316+
```
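
Adding `?` only compiles when the enclosing function itself returns a compatible `Result`. The sketch below shows that pattern with a hypothetical stand-in builder: `ScanConfigBuilder`, `ScanConfig`, and the out-of-bounds check are assumptions made for illustration, and the real builder may fail for different reasons.

```rust
// Stand-in error handling; DataFusion uses its own `Result` and error type.
type Result<T> = std::result::Result<T, String>;

// Hypothetical stand-in for the built configuration.
#[derive(Debug, PartialEq)]
struct ScanConfig {
    projection: Option<Vec<usize>>,
}

// Hypothetical stand-in builder with one fallible method.
struct ScanConfigBuilder {
    num_columns: usize,
    projection: Option<Vec<usize>>,
}

impl ScanConfigBuilder {
    fn new(num_columns: usize) -> Self {
        Self { num_columns, projection: None }
    }

    // Fallible builder step: returns `Result<Self>` so the chain can continue with `?`.
    fn with_projection_indices(mut self, indices: Option<Vec<usize>>) -> Result<Self> {
        if let Some(idx) = &indices {
            // Illustrative failure condition (assumption): an index past the last column.
            if let Some(bad) = idx.iter().find(|i| **i >= self.num_columns) {
                return Err(format!(
                    "projection index {} out of bounds for {} columns",
                    bad, self.num_columns
                ));
            }
        }
        self.projection = indices;
        Ok(self)
    }

    fn build(self) -> ScanConfig {
        ScanConfig { projection: self.projection }
    }
}

// The caller must return `Result` for `?` to be usable inside the chain.
fn make_config(num_columns: usize, indices: Vec<usize>) -> Result<ScanConfig> {
    let config = ScanConfigBuilder::new(num_columns)
        .with_projection_indices(Some(indices))?
        .build();
    Ok(config)
}

fn main() {
    println!("{:?}", make_config(4, vec![0, 2, 3]));
    println!("{:?}", make_config(3, vec![0, 2, 3]));
}
```

If the calling function cannot return a `Result`, the error has to be handled explicitly at the call site, for example with a `match`.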
317+
318+
**Handling projections in `FileSource`:**
319+
205320
## DataFusion `51.0.0`
206321

207322
### `arrow` / `parquet` updated to 57.0.0
