feat: Reuse existing file instead of reopening during shuffle write#2577

Closed
zuston wants to merge 3 commits into apache:main from zuston:fixfile

Conversation

zuston (Member) commented Oct 15, 2025

Which issue does this PR close?

Closes #.

Rationale for this change

Reuse the already-open spill file handle during shuffle write instead of reopening the file by path, which avoids the cost of an extra open system call.

What changes are included in this PR?

How are these changes tested?

codecov-commenter commented Oct 15, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 42.19%. Comparing base (f09f8af) to head (3ec2047).
⚠️ Report is 614 commits behind head on main.

Additional details and impacted files
@@              Coverage Diff              @@
##               main    #2577       +/-   ##
=============================================
- Coverage     56.12%   42.19%   -13.94%     
- Complexity      976     1093      +117     
=============================================
  Files           119      146       +27     
  Lines         11743    13747     +2004     
  Branches       2251     2353      +102     
=============================================
- Hits           6591     5800      -791     
- Misses         4012     6978     +2966     
+ Partials       1140      969      -171     

  if let Some(spill_data) = self.partition_writers[i].spill_file.as_ref() {
-     let mut spill_file =
-         BufReader::new(File::open(spill_data.temp_file.path()).map_err(to_df_err)?);
+     let mut spill_file = BufReader::new(&spill_data.file);
Member

spill_data.file is opened in write-only mode, so reading from it through BufReader will fail:

let spill_data = OpenOptions::new()
    .write(true)
    .create(true)
    .truncate(true)
    .open(spill_file.path())
    .map_err(|e| {
        DataFusionError::Execution(format!("Error occurred while spilling {e}"))
    })?;
self.spill_file = Some(SpillFile {
    temp_file: spill_file,
    file: spill_data,
});
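
For illustration, here is a minimal sketch of what reusing one handle for both writing and reading would require, using only the Rust standard library. The helper `spill_and_read_back` and its `readable` flag are hypothetical and not part of this PR. It assumes the handle must be opened with `.read(true)` in addition to `.write(true)`, and that the cursor has to be rewound before reading because the preceding write leaves it at the end of the file.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, BufReader, Read, Seek, SeekFrom, Write};
use std::path::Path;

/// Hypothetical helper (not from the PR): writes `payload` to a spill file
/// and reads it back through the same handle instead of reopening by path.
/// `readable` controls whether the handle is also opened for reading.
fn spill_and_read_back(path: &Path, payload: &[u8], readable: bool) -> io::Result<Vec<u8>> {
    let mut file: File = OpenOptions::new()
        .read(readable) // a write-only handle cannot be read later
        .write(true)
        .create(true)
        .truncate(true)
        .open(path)?;

    file.write_all(payload)?;

    // The write left the cursor at the end of the file; rewind before reading.
    file.seek(SeekFrom::Start(0))?;

    // `Read` is implemented for `&File`, so the reader can borrow the handle.
    let mut reader = BufReader::new(&file);
    let mut out = Vec::new();
    reader.read_to_end(&mut out)?;
    Ok(out)
}
```

Calling the helper with `readable` set to `true` returns the bytes that were written, while `false` reproduces the problem pointed out in this review: the open and the write succeed, but the read returns an error.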

Member Author

Nice catch

wForget (Member) commented Oct 15, 2025

Also, could you provide more information about the Rationale for this change?

zuston (Member Author) commented Oct 16, 2025

> Also, could you provide more information about the Rationale for this change?

updated.

@zuston zuston marked this pull request as draft October 16, 2025 02:49
@github-actions

Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.

@github-actions github-actions Bot added the Stale label Feb 12, 2026
@github-actions github-actions Bot closed this Feb 20, 2026