feat(python): Add extra_columns parameter to scan_parquet #22699
ritchie46 merged 2 commits into pola-rs:main from
@nameexhaustion @ion-elgreco Should this extra_columns parameter for scan_parquet be surfaced in scan_delta? At the moment I get exceptions when I try to load a multi-file delta table where columns have been added over time. I'd like to be able to enable the extra_columns=True argument in scan_delta.
Looks like a bug to me. Providing the schema beforehand should give the reader enough information to figure out what to do with missing or extra columns.
This PR addresses the case where columns are removed rather than added. The case where columns are added is already handled by
This will be internally enabled by default in the future for
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #22699      +/-   ##
==========================================
- Coverage   81.01%   80.97%   -0.05%
==========================================
  Files        1671     1675       +4
  Lines      236925   237101     +176
  Branches     2792     2792
==========================================
+ Hits       191956   191998      +42
- Misses      44299    44433     +134
  Partials      670      670
Supersedes #22695
Changes:
- Add extra_columns to scan_parquet
- extra_columns_policy is now added under UnifiedScanArgs
- ScanOptions Python class to consolidate input parsing of shared scan options (i.e. those in UnifiedScanArgs).
Fixes:
- select(all()) and allow_missing_columns=True #22218
- extra_columns parameter
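To illustrate the distinction discussed in this thread between missing columns and extra columns, here is a minimal pure-Python sketch of how a reader might reconcile one file's columns with a schema supplied up front. It is not the Polars implementation: the helper `project_to_schema` is hypothetical, and the policy names "raise" and "ignore" are assumptions made for the example rather than values quoted from this PR.

```python
from __future__ import annotations

from typing import Literal

# Assumed policy names for illustration only.
ExtraColumnsPolicy = Literal["raise", "ignore"]


def project_to_schema(
    file_columns: list[str],
    schema: list[str],
    extra_columns: ExtraColumnsPolicy = "raise",
    allow_missing_columns: bool = False,
) -> list[str | None]:
    """Hypothetical helper: build a per-file read plan against a fixed schema.

    For each schema column, the plan holds the file column to read,
    or None when the column is absent and should be filled with nulls.
    """
    # Columns present in the file but unknown to the schema.
    extra = [name for name in file_columns if name not in schema]
    if extra and extra_columns == "raise":
        raise ValueError(f"extra columns in file: {extra}")

    present = set(file_columns)
    plan: list[str | None] = []
    for name in schema:
        if name in present:
            plan.append(name)
        elif allow_missing_columns:
            plan.append(None)  # absent in this file: fill with nulls
        else:
            raise ValueError(f"missing column in file: {name}")
    return plan


schema = ["a", "b"]
old_file = ["a"]            # written before "b" existed
new_file = ["a", "b", "c"]  # carries a column the schema does not list

plan_old = project_to_schema(old_file, schema, allow_missing_columns=True)
plan_new = project_to_schema(new_file, schema, extra_columns="ignore")
```

In this sketch the two situations are controlled independently: a file that lacks a schema column is governed by `allow_missing_columns`, while a file that carries columns outside the schema is governed by `extra_columns`.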