Commit afdc003

[physical-plan]: remove deprecated spill_record_batch_by_size (apache#23029)
## Which issue does this PR close?

- part of apache#23080

## Rationale for this change

`datafusion_physical_plan::spill::spill_record_batch_by_size` was deprecated in DataFusion `46.0.0` in favor of `SpillManager::spill_record_batch_by_size`. The [API health policy deprecation guidelines](https://datafusion.apache.org/contributor-guide/api-health.html#deprecation-guidelines) say deprecated methods remain for 6 major versions or 6 months, whichever is longer. This API has exceeded that window, so this removes the deprecated wrapper.

## What changes are included in this PR?

- Removes the deprecated `datafusion_physical_plan::spill::spill_record_batch_by_size` function.

## Are these changes tested?

By CI

## Are there any user-facing changes?

Yes. This removes a public Rust API that was deprecated in DataFusion `46.0.0`. Downstream users should migrate to `datafusion_physical_plan::spill::SpillManager::spill_record_batch_by_size`.

This is an API change and should be labeled `api-change`.
1 parent 6792fa9 commit afdc003

2 files changed

Lines changed: 9 additions & 30 deletions

File tree

  • datafusion/physical-plan/src/spill
  • docs/source/library-user-guide/upgrading

datafusion/physical-plan/src/spill/mod.rs

Lines changed: 1 addition & 30 deletions
@@ -30,7 +30,7 @@ pub use spill_manager::SpillManager;
 
 use std::fs::File;
 use std::io::BufReader;
-use std::path::{Path, PathBuf};
+use std::path::Path;
 use std::pin::Pin;
 use std::sync::Arc;
 use std::task::{Context, Poll};
@@ -245,35 +245,6 @@ impl RecordBatchStream for SpillReaderStream {
     }
 }
 
-/// Spill the `RecordBatch` to disk as smaller batches
-/// split by `batch_size_rows`
-#[deprecated(
-    since = "46.0.0",
-    note = "This method is deprecated. Use `SpillManager::spill_record_batch_by_size` instead."
-)]
-#[expect(clippy::needless_pass_by_value)]
-pub fn spill_record_batch_by_size(
-    batch: &RecordBatch,
-    path: PathBuf,
-    schema: SchemaRef,
-    batch_size_rows: usize,
-) -> Result<()> {
-    let mut offset = 0;
-    let total_rows = batch.num_rows();
-    let mut writer =
-        IPCStreamWriter::new(&path, schema.as_ref(), SpillCompression::Uncompressed)?;
-
-    while offset < total_rows {
-        let length = std::cmp::min(total_rows - offset, batch_size_rows);
-        let batch = batch.slice(offset, length);
-        offset += batch.num_rows();
-        writer.write(&batch)?;
-    }
-    writer.finish()?;
-
-    Ok(())
-}
-
 /// Write in Arrow IPC Stream format to a file.
 ///
 /// Stream format is used for spill because it supports dictionary replacement, and the random
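
For readers migrating away from the removed wrapper, its splitting loop can be illustrated without any Arrow types. The sketch below is a std-only illustration, not DataFusion code: `slice_ranges` is a hypothetical helper that returns the `(offset, length)` pairs the removed loop would have passed to `batch.slice`. The non-zero assertion is an assumption added for this sketch; the removed function had no such check, and with a zero `batch_size_rows` its `offset` would not appear to advance.

```rust
/// Compute the `(offset, length)` pairs that the removed helper's loop would
/// pass to `batch.slice` for a batch containing `total_rows` rows.
///
/// Hypothetical helper for illustration only; it is not part of DataFusion.
pub fn slice_ranges(total_rows: usize, batch_size_rows: usize) -> Vec<(usize, usize)> {
    // Guard added for this sketch (not present in the removed function):
    // a zero row limit would produce zero-length slices and never advance.
    assert!(batch_size_rows > 0, "batch_size_rows must be non-zero");

    let mut ranges = Vec::new();
    let mut offset = 0;
    while offset < total_rows {
        // The last slice may be shorter than `batch_size_rows`.
        let length = std::cmp::min(total_rows - offset, batch_size_rows);
        ranges.push((offset, length));
        offset += length;
    }
    ranges
}
```

Under these assumptions, a 10-row batch with a 4-row limit is split into slices of 4, 4, and 2 rows, and an empty batch produces no slices at all.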

docs/source/library-user-guide/upgrading/55.0.0.md

Lines changed: 8 additions & 0 deletions
@@ -30,6 +30,14 @@ to the main branch and are awaiting release in this version.
 `datafusion_common::config::Dialect::AVAILABLE` has been removed. Use
 `Dialect::available()` instead.
 
+### `spill_record_batch_by_size` removed
+
+`datafusion_physical_plan::spill::spill_record_batch_by_size` has been removed.
+This function was deprecated in DataFusion `46.0.0`.
+
+Use `datafusion_physical_plan::spill::SpillManager::spill_record_batch_by_size`
+instead.
+
 ### Decimal scalar formatting uses human-readable values
 
 Decimal scalar literals in `EXPLAIN` output, expression display strings, and
