Merged
Changes from 2 commits
1 change: 1 addition & 0 deletions Cargo.lock

10 changes: 10 additions & 0 deletions datafusion/datasource-parquet/src/opener.rs
@@ -33,6 +33,7 @@ use arrow::datatypes::{FieldRef, SchemaRef, TimeUnit};
use arrow::error::ArrowError;
use datafusion_common::{exec_err, DataFusionError, Result};
use datafusion_datasource::PartitionedFile;
use datafusion_physical_expr::simplifier::PhysicalExprSimplifier;
use datafusion_physical_expr::PhysicalExprSchemaRewriter;
use datafusion_physical_expr_common::physical_expr::{
is_dynamic_physical_expr, PhysicalExpr,
@@ -233,7 +234,16 @@ impl FileOpener for ParquetOpener {
)
.rewrite(p)
.map_err(ArrowError::from)
.map(|p| {
// After rewriting to the file schema, further simplifications may be possible.
// For example, if `'a' = col_that_is_missing` becomes `'a' = NULL` that can then be simplified to `FALSE`
// and we can avoid doing any more work on the file (bloom filters, loading the page index, etc.).
PhysicalExprSimplifier::new(&physical_file_schema)
.simplify(p)
.map_err(ArrowError::from)
})
})
.transpose()?
.transpose()?;

// Build predicates for this specific file
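
The comment in the hunk above describes a two-step effect: rewriting the predicate against the file schema can turn a missing column into `NULL`, and a follow-up simplification pass can then fold the comparison to `FALSE`, so the file can be skipped. The following is a minimal, self-contained sketch of that idea in plain Rust. It does not use DataFusion's types or APIs; `Expr`, `rewrite_missing`, `simplify`, and `prune_file` are hypothetical names introduced only for illustration, and the folding rule assumes a filter context, where a `NULL` comparison result selects no rows.

```rust
use std::collections::HashSet;

/// Toy predicate expression. A hypothetical stand-in, not DataFusion's `PhysicalExpr`.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Str(String),
    Bool(bool),
    Null,
    Eq(Box<Expr>, Box<Expr>),
}

/// Step 1 (illustrative): replace columns that the file does not contain with NULL.
fn rewrite_missing(expr: Expr, file_columns: &HashSet<&str>) -> Expr {
    match expr {
        Expr::Column(name) if !file_columns.contains(name.as_str()) => Expr::Null,
        Expr::Eq(l, r) => Expr::Eq(
            Box::new(rewrite_missing(*l, file_columns)),
            Box::new(rewrite_missing(*r, file_columns)),
        ),
        other => other,
    }
}

/// Step 2 (illustrative): fold `x = NULL` to FALSE.
/// Strictly, SQL evaluates `x = NULL` to NULL; inside a filter that
/// selects no rows, so this toy treats it as FALSE.
fn simplify(expr: Expr) -> Expr {
    match expr {
        Expr::Eq(l, r) => {
            let (l, r) = (simplify(*l), simplify(*r));
            if l == Expr::Null || r == Expr::Null {
                Expr::Bool(false)
            } else {
                Expr::Eq(Box::new(l), Box::new(r))
            }
        }
        other => other,
    }
}

/// A file can be skipped entirely when its predicate simplifies to FALSE.
fn prune_file(predicate: Expr, file_columns: &HashSet<&str>) -> bool {
    simplify(rewrite_missing(predicate, file_columns)) == Expr::Bool(false)
}

fn main() {
    // The file only has columns `a` and `b`.
    let file_columns = HashSet::from(["a", "b"]);

    // 'a' = col_that_is_missing
    let predicate = Expr::Eq(
        Box::new(Expr::Str("a".to_string())),
        Box::new(Expr::Column("col_that_is_missing".to_string())),
    );

    if prune_file(predicate, &file_columns) {
        println!("predicate is always false for this file; skipping it");
    }
}
```

In this sketch the rewrite alone leaves `'a' = NULL` in place, which a reader of the predicate would still have to evaluate; only the second pass exposes that no row can match, which is the reason the hunk chains the simplifier after the schema rewrite.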