Skip to content

Conversation

@Weijun-H
Copy link
Member

@Weijun-H Weijun-H commented Oct 31, 2025

Which issue does this PR close?

Rationale for this change

Without a SQL-level reset, clients that SET DataFusion options have to rebuild the session to recover defaults

What changes are included in this PR?

  • Extended the config macros/traits so every namespace knows how to restore default values
  • Added the ResetVariable

Are these changes tested?

Yes

Are there any user-facing changes?

Yes, SQL clients (including the CLI) can issue RESET

@github-actions github-actions bot added sql SQL Planner logical-expr Logical plan and expressions core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) common Related to common crate labels Oct 31, 2025
Copy link
Contributor

@Omega359 Omega359 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for submitting this PR, I've left a few comments

self.return_empty_dataframe()
}

async fn reset_variable(&self, stmt: ResetVariable) -> Result<DataFrame> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems odd to always return an empty dataframe in this function. Is that really necessary vs Result<()> ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}

/// Reset a configuration option back to its default value
pub fn reset(&mut self, key: &str) -> Result<()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think overall it might be better to add a reset fn to ConfigField and update the config_namespace! macro rather than implement it like this. Just my opinion..

@Omega359
Copy link
Contributor

One additional thing - I didn't see anything in this PR for the unparser. It's possible that the set/unset isn't supported (and/or shouldn't be supported) there but could you take a look and see?

@Weijun-H
Copy link
Member Author

One additional thing - I didn't see anything in this PR for the unparser. It's possible that the set/unset isn't supported (and/or shouldn't be supported) there but could you take a look and see?

@Omega359 Thanks for flagging it. I double-checked the unparser: today it simply funnels LogicalPlan::Statement through the query path, so SET/RESET plans aren’t rendered back to SQL. Supporting them would also need upstream sqlparser changes, as the AST has no native RESET node. Leaving it out of this PR was intentional given those constraints.

@Weijun-H Weijun-H force-pushed the 18384-reset-config branch 2 times, most recently from be08bf3 to 645e40b Compare October 31, 2025 22:54
@Weijun-H Weijun-H requested a review from Omega359 November 1, 2025 18:40
@Omega359
Copy link
Contributor

Omega359 commented Nov 1, 2025

Thanks for continuing to work on this - I'll try and do an in-depth review tomorrow for this @Weijun-H.

Copy link
Contributor

@Omega359 Omega359 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good after a quick review. My biggest concern is the ability to reset whole sections of config - I am uncertain we want to support that.

}
_ => _config_err!(
"Config value \"{}\" not found on ConfigFileEncryptionProperties",
"Config value \"{}\" not found on ConfigFileDecryptionProperties",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch

SET datafusion.runtime.memory_limit = '1M'

statement ok
RESET datafusion.runtime.memory_limit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is the show to verify it was indeed reset?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that the SHOW statement does not display runtime settings.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file #18452

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a hack way to test, we can ensure the below query succeed after resetting.

> set datafusion.runtime.memory_limit = '1K';
0 row(s) fetched.
Elapsed 0.001 seconds.

> explain analyze select * from generate_series(1000) as t1(v1) order by v1;
Not enough memory to continue external sort. Consider increasing the memory limit, or decreasing sort_spill_reservation_bytes
caused by
Resources exhausted: Additional allocation failed for ExternalSorterMerge[0] with top memory consumers (across reservations) as:
  DataFusion-Cli#89(can spill: false) consumed 0.0 B, peak 0.0 B,
  ExternalSorter[0]#90(can spill: true) consumed 0.0 B, peak 0.0 B,
  ExternalSorterMerge[0]#91(can spill: false) consumed 0.0 B, peak 0.0 B.
Error: Failed to allocate additional 10.0 MB for ExternalSorterMerge[0] with 0.0 B already allocated for this reservation - 1024.0 B remain available for the total pool

match section {
"catalog" => {
if rem.is_empty() {
self.catalog = CatalogOptions::default();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am uncertain that this should be supported to be honest. I think that a full configuration key should be required for reset.

}

fn reset(&mut self, key: &str) -> $crate::error::Result<()> {
if key.is_empty() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems off. Why error if there is a key but don't if there isn't? Seems backwards?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For scalar config fields the caller already strips off the key for the field before invoking reset. At that point key only contains a nested suffix (if any).

  • key.is_empty() → we’re exactly at the scalar value, so resetting means “restore to default”, which we do.
  • non-empty key → someone tried to reset something below a scalar (e.g. foo.bar where foo is a scalar); that’s invalid, so we emit the helpful error.

Ok(())
} else {
$crate::error::_config_err!(
"Config value \"{}\" not found on {}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error message doesn't make sense in this context. We also should have a test to cover this error

@2010YOUY01 2010YOUY01 self-requested a review November 3, 2025 04:19
}
"max_temp_directory_size" => {
// default is 100 GB
const DEFAULT_MAX_TEMP_DIRECTORY_SIZE: u64 = 100 * 1024 * 1024 * 1024;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this const in disk_manager.rs, could we not just use that? Might need to change to pub(crate) though...

"metadata_cache_limit" => {
// default is 50 MB
const DEFAULT_METADATA_CACHE_LIMIT: usize = 50 * 1024 * 1024;
builder = builder.with_metadata_cache_limit(DEFAULT_METADATA_CACHE_LIMIT);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here in cache_manager.rs

@Omega359
Copy link
Contributor

Omega359 commented Nov 3, 2025

Small suggestions, but otherwise LGTM, thanks!

Copy link
Contributor

@2010YOUY01 2010YOUY01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on it! I haven't review the implementations yet, I'll continue soon.

SET datafusion.runtime.memory_limit = '1M'

statement ok
RESET datafusion.runtime.memory_limit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a hack way to test, we can ensure the below query succeed after resetting.

> set datafusion.runtime.memory_limit = '1K';
0 row(s) fetched.
Elapsed 0.001 seconds.

> explain analyze select * from generate_series(1000) as t1(v1) order by v1;
Not enough memory to continue external sort. Consider increasing the memory limit, or decreasing sort_spill_reservation_bytes
caused by
Resources exhausted: Additional allocation failed for ExternalSorterMerge[0] with top memory consumers (across reservations) as:
  DataFusion-Cli#89(can spill: false) consumed 0.0 B, peak 0.0 B,
  ExternalSorter[0]#90(can spill: true) consumed 0.0 B, peak 0.0 B,
  ExternalSorterMerge[0]#91(can spill: false) consumed 0.0 B, peak 0.0 B.
Error: Failed to allocate additional 10.0 MB for ExternalSorterMerge[0] with 0.0 B already allocated for this reservation - 1024.0 B remain available for the total pool


statement error Arrow error: Parser error: Invalid timezone "Asia/Taipei2": failed to parse timezone
SELECT '2000-01-01T00:00:00'::TIMESTAMP::TIMESTAMPTZ

Copy link
Contributor

@2010YOUY01 2010YOUY01 Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add some test coverage for invalid resets like
reset dataexplosion.execution.batch_size, reset datafusion.exec.batch_size, reset datafusion.execution.batches_size and datafusion.execution.batch_size.bar

@github-actions github-actions bot added the execution Related to the execution crate label Nov 4, 2025
@Omega359
Copy link
Contributor

Omega359 commented Nov 5, 2025

Looks like some tests are failing in CI

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

common Related to common crate core Core DataFusion crate execution Related to the execution crate logical-expr Logical plan and expressions sql SQL Planner sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support resetting config options to defaults

3 participants