-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Add PhysicalOptimizerRule::optimize_plan to allow passing more context into optimizer rules #18739
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add PhysicalOptimizerRule::optimize_plan to allow passing more context into optimizer rules #18739
Conversation
|
@gabotechs do you think this will work for #18168? |
| /// Config options | ||
| config_options: Arc<ConfigOptions>, | ||
| /// Extensions | ||
| extensions: Arc<Extensions>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about just having here something like:
pub struct OptimizerContext {
session_config: SessionConfig,
}I imagine this struct to be equivalent to TaskContext but for optimization passes rather than execution, and that struct does have the session_config "as-is" as a field:
And this way, if in the future changes are made to the SessionConfig, we do not need to come back here and update also OptimizerContext
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it be a private field, i.e. we expose ConfigOptions and Extensions as methods but not the full SessionConfig?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd deliberately copy what TaskContext is doing in order to keep both structure's as consistent as possible, that way the user experience of accessing an OptimizerContext is as close as possible as accessing TaskContext.
There, the field is private, but it can be accessed through a method:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense sorry I should have just looked into TaskContext, ty for the code pointer
yes! I think so |
Awesome I'll clean it up and we can try to get some reviews! The nice thing about this approach is that it's both backwards compatible now and will make future changes more backwards compatible as well. |
…t into optimzer rules In particular, give access to Extensions Co-authored-by: Gabriel Musat Mestre <[email protected]>
0c4ee79 to
209d5ce
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces a new optimize_plan method to the PhysicalOptimizerRule trait that accepts an OptimizerContext parameter, allowing optimizer rules to access more context beyond just configuration options. The optimize method is deprecated in favor of the new approach.
Key Changes:
- Added
OptimizerContextstruct that wrapsConfigOptionsandExtensions - Created
Extensionsstruct to manage typed extension data inSessionConfig - Deprecated the
optimizemethod in favor ofoptimize_planwith backward compatibility - Updated all optimizer rule implementations to use the new
optimize_planmethod - Updated all test code to construct
OptimizerContextfromSessionConfig
Reviewed changes
Copilot reviewed 33 out of 33 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
datafusion/physical-optimizer/src/optimizer.rs |
Added OptimizerContext struct and new optimize_plan trait method with default implementation for backward compatibility |
datafusion/execution/src/config.rs |
Added Extensions struct and updated SessionConfig to use Arc<Extensions> instead of AnyMap |
datafusion/physical-optimizer/src/*.rs |
Updated all optimizer rules to implement optimize_plan instead of optimize |
datafusion/core/tests/**/*.rs |
Updated test code to create OptimizerContext and call optimize_plan |
datafusion/physical-optimizer/src/lib.rs |
Exported OptimizerContext publicly |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /// provides access to the full optimizer context including session configuration, | ||
| /// runtime environment, and extensions. |
Copilot
AI
Nov 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation mentions 'runtime environment' but the OptimizerContext only provides access to ConfigOptions and Extensions. The runtime environment is not actually included in the context. Consider updating the documentation to accurately reflect what is provided: 'session configuration and extensions' without mentioning runtime environment.
| /// provides access to the full optimizer context including session configuration, | |
| /// runtime environment, and extensions. | |
| /// provides access to the full optimizer context, including session configuration | |
| /// and extensions. |
| } | ||
| } | ||
|
|
||
| /// `PhysicalOptimizerRule` transforms one ['ExecutionPlan'] into another which |
Copilot
AI
Nov 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent bracket style in documentation reference. Should use backticks with square brackets: ['ExecutionPlan'] should be changed to [ExecutionPlan] for proper rustdoc linking syntax.
| /// `PhysicalOptimizerRule` transforms one ['ExecutionPlan'] into another which | |
| /// `PhysicalOptimizerRule` transforms one [`ExecutionPlan`] into another which |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💯 +1 from me
|
Thanks @gabotechs ! I'll leave this open for a day or two to allow other feedback. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this!
Left a minor comment, looks good otherwise.
| /// Return a reference to the configuration options | ||
| /// | ||
| /// This is a convenience method that returns the [`ConfigOptions`] | ||
| /// from the [`SessionConfig`]. | ||
| pub fn options(&self) -> &Arc<ConfigOptions> { | ||
| self.session_config.options() | ||
| } | ||
|
|
||
| /// Return a reference to the extensions | ||
| pub fn extensions(&self) -> &Arc<Extensions> { | ||
| self.session_config.extensions() | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two methods are kind of redundant after the addition of session_config() method. Might be a nice QoL addition, but we should add the same in TaskContext to keep the API consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay will update to match.
Personally I would have kept the SessionConfig hidden until someone asks for it to minimize the dependency on such a large object. I.e. I'm not sure matching TaskContext 1:1 is the best approach here. But it seems that's just me 😄.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense actually. I only need extensions from the session config. We could expose just the required fields from the inner session_config. Unless, of course, there's already a requirement for the session config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gabotechs wdyt? I guess the question is:
- Do we want to be 1:1 with
TaskContext?
or - Do we follow the principle of exposing the smallest API surface area?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I would have kept the SessionConfig hidden until someone asks for it to minimize the dependency on such a large object
✋ asking here, I specifically could use both session_config.extensions and session_config.options.extensions, so just having something like what we have in TaskContext is enough:
/// Return the SessionConfig associated with this [OptimizerContext]
pub fn session_config(&self) -> &SessionConfig {
&self.session_config
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK will keep it as is for now and remove the extra methods
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alamb agreed that we should just expose SessionConfig
| #[derive(Clone, Debug)] | ||
| pub struct Extensions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs docstrings!
I opened #19030 |
|
Actually, I may have few follow up questions, as PR does not have any associated issue nor it has any description or motivation explanation, which I don't think is especially good practice for changing one of the core interfaces:
|
A pattern we've seen time and time again in this project is to assume that we won't want to make any changes in the future and then realizing we do. Having a specialized struct lets us do that with very little overhead.
We considered doing this but decided to go this route instead because of (1) the future extensibility mentioned above and (2) changing the signature of the current method would be an immediate breaking change that makes upgrades difficult (there is no deprecation period), so you end up wanting to add a new method, if you're adding a new method you might as well make it extensible in the future.
Nothing really. Before there was no |
Right, we might need it eventually so we’ve decided to make it as wide as possible to allow for a change that might never be necessary.
Ok, so we added new struct to the public API which wasn’t really necessary. |
Yep exactly 👍🏻 , this avoids more breaking changes in the future.
That introduction felt helpful to me. Looking at the code it was not clear to me what |
|
Lets leave extensions, the change is not very disruptive, I've totally missed part that previous API is still there Would you consider using |
I personally would love to try extending this |
|
Thanks for clarification @gabotechs , just few follow up question:
Please have in mind that there are product using datafusion for very long time with quite big codebases. So far datafusion managed to balance api stability and benefits of breaking the api very well. I believe we as the community should continue to maintain such a high standard in the future |
I'd say so yeah. It's during the optimization where dynamic filters are created, which leads me to think that if their state was to be registered in a global place for outside access, it's at that moment when it should happen.
There's still a lot of things I should flesh out before giving a proper answer to this, but TL;DR is the second solution exposed in #18296.
I would not say there was a rush, the original PR #18168 has been out there for more than month ago. It's true that the context is a bit scattered across PR comments rather than being properly centralized in one issue. For @adriangb and I this change felt natural as this is something we've been discussing publicly in the last months, but we might be biased as we were the ones discussing, so if you think this needs more thought we are on time to revert whatever part of this PR you think needs more discussion. |
Could this be accomplished providing your own query planner? Or keeping state in your own optimiser rule which, rules are "arced" so you could keep reference to it and access it as needed?
That is my feeling as well, this PR is merged without all the things are flashed out.
yes, comments do not help understanding the context, and problem breakdown in sense of PR/Issue description helps everybody.
An external perspective can be invaluable in developing a more effective design. I personally believe introduction of Switching to accept SessionConfig somewhat might make sense. However, I’m wondering if the optimiser rule can retain the state needed to extract from the SessionConfig extension. Since both are tied to SessionContext, I don’t see a strong reason to keep it in two places.
I would leave up to you to make such decision, you're the one approving the PR; I personally struggle to see value of this PR at this moment. |
|
@milenkovicm in order to help determine the best path forward it would be helpful to understand where the pushback against this change is coming from. Will this change cause a lot of code churn for Ballista in a way that is not easily automatable, or is this API incompatible with something you would like to do? |
|
Hey @adriangb I have been involved in a code base which has been built since DataFusion version 7, and have absorbed a lot of backward-incompatible changes over the years, and most of them had clean benefit(s) to the community. This one does not, at least it has not demonstrated, yet. My concern here is that the API has been broken without any valid reasons. Two new structures have been introduced to a public API; one of them, as you said in your comment, was clearly not necessary; the other does not demonstrate any valid reasons to be there as well, yet we're all expected to take them. Let's not take the path of least resistance and make changes to the public API if they are not really needed. So back to change,
I hope this helps , |
|
I opened a PR to make My view on the @gabotechs what do you think of @milenkovicm's proposed solution? If I understand correctly the idea is that the optimizer takes a reference to the |
|
I don't think you understand me correctly,
Also note, you have marked old method for removal (Btw, there is difference, between logical optimiser and physical, logical accepts trait not a struct, which makes difference) My point isn’t whether it’s easy to update after a change. Writing code is easy these days but we should maintain a much higher standard for ourselves than just writing a code. |
|
to make it easier to understand, please have a look, it illustrates my point (this is datafusion 50), use datafusion::config::{ConfigEntry, ConfigExtension, ConfigOptions, ExtensionOptions};
use datafusion::execution::SessionStateBuilder;
use datafusion::physical_optimizer::PhysicalOptimizerRule;
use datafusion::physical_plan::ExecutionPlan;
use datafusion::prelude::{SessionConfig, SessionContext};
use std::any::Any;
use std::fmt::Debug;
use std::sync::Arc;
#[tokio::main]
async fn main() -> datafusion::error::Result<()> {
// you may want to keep your rule_context
// accessible or not, im not sure what you want
let rule_context = Arc::new(MyRuleContext::default());
let my_rule = Arc::new(MyRule::new(rule_context.clone()));
// if for any reason you need to reconfigure your rule
// use can use SQL SET or directly on session config
let my_rule_config = MyRuleConfig::default();
// assign your rule context to a session config
// so it gets dropped when your session gets dropped
// use of session extension
let session_config = SessionConfig::new()
.with_extension(rule_context.clone())
.with_option_extension(my_rule_config);
let session_state = SessionStateBuilder::new_with_default_features()
.with_config(session_config)
.with_physical_optimizer_rule(my_rule)
.build();
let ctx = SessionContext::new_with_state(session_state);
// this should trigger rule
ctx.sql("select 1").await?.show().await?;
Ok(())
}
#[derive(Debug, Default)]
struct MyRuleContext {
// probably your stage
// need some kind of week hash
// map or similar
// if you want to track physical expressions
}
#[derive(Debug)]
struct MyRule {
context: Arc<MyRuleContext>,
}
impl MyRule {
fn new(context: Arc<MyRuleContext>) -> Self {
Self { context }
}
}
impl PhysicalOptimizerRule for MyRule {
fn optimize(
&self,
plan: Arc<dyn ExecutionPlan>,
config: &ConfigOptions,
) -> datafusion::common::Result<Arc<dyn ExecutionPlan>> {
// configure / reconfigure your rule
// any way you need
let my_config: Option<&MyRuleConfig> = config.extensions.get();
println!("MyRule config: {:?}", my_config);
println!("MyRule context: {:?}", self.context);
Ok(plan)
}
fn name(&self) -> &str {
"MyRule"
}
fn schema_check(&self) -> bool {
false
}
}
#[derive(Debug, Default, Clone)]
struct MyRuleConfig {
// whatever you need to configure your
// rule goes here
}
impl ExtensionOptions for MyRuleConfig {
fn as_any(&self) -> &dyn Any {
self
}
fn as_any_mut(&mut self) -> &mut dyn Any {
self
}
fn cloned(&self) -> Box<dyn ExtensionOptions> {
Box::new(self.clone())
}
fn set(&mut self, _key: &str, _value: &str) -> datafusion::common::Result<()> {
todo!("to be implemented later")
}
fn entries(&self) -> Vec<ConfigEntry> {
todo!("to be implemented later")
}
}
impl ConfigExtension for MyRuleConfig {
const PREFIX: &'static str = "my_rule";
}it should print: |
From #18168, which is linked at the beginning of this PR: """ This is the case that we (DataDog) have faced several times already in several of our optimization rules
I'd say not necessarily, a rule that is a simple empty struct might be constructed statically. Following this same reasoning, we might also decide that it's better to let
As we have a legitimate need for changing |
If it was only up to me, I'd say the change in this PR is appropriate and I'm good with moving forward with it. It's not only up to me though, we should also weight opinions of other contributors like you, and specially yours a as long tenured contributor to DataFusion. Would you be fine with a more contained change that uses |
I'll note that this would be an immediate breaking change and a much worse experience for downstream users. We'd at the very least need to do the dance of adding a new method, deprecating the old one, etc. |
Yeah, definitely. Did not explicitly bake that into my explanation but I was pricing that in |
Users do create their own rules, and that is very common. I have created quite few of them, hence my adamance we do not need this change. There is nothing you mentioned in #18168 that can't be done with the previous API. Requirements of #18168 are very vague as well. If you took If you like, we can take this at regular datafusion meeting, I have no problem with that. We can also take discussion about format of this PR as well, lack description or justification of breaking public API, or we can just revert the change and you do your specific use case things in your own code. Please let me know the which direction you would like to take, |
This is true, for what is worth, we can leave the I'm fine in rolling back this change and finish the discussion here, are you fine with that @adriangb? |
The lack of description in this PR is my fault, I apologize for that. I don't think there's discussion needed, a good PR description is clearly helpful and not having one in this case was frustrating to you and possibly other community members. This is not a justification but this was a follow up / alternative to #18168 and although it is linked to it in various ways (it's the first link that appears under the discussion) I should have copied the link into the description and added some more background. That's a mistake on my end and I understand why that might be frustrating, but I don't think that invalidates the entire change or requires the PR to be reverted. #18168 (comment) (from the PR mentioned above) links to code currently being used to work around this that is very similar to what you are proposing: impl ConfigExtension for DistributedConfig {
const PREFIX: &'static str = "distributed";
}While it technically works, it is not ergonomic. My view and understanding of the project is that wanting to do something with DataFusion that ends up being hard to figure out / not ergonomic is generally a valid reason for making tweaks to a public API. Beyond that I think the details of what constitutes unergonomic and what is a large or small API change are somewhat subjective. Gabriel and I do not work for the same organization, so there is no If you still are not comfortable with the change I think we should just revert it and start discussion anew with more formal information in the PR description itself or just live with the current status quo. It doesn't seem worth burdening the wider community with. Would you like me to revert the PR? |
Yes that's fine by me. I'll prepare a revert. |
…e context into optimizer rules (apache#18739)" This reverts commit b990987.
…e context into optimizer rules (apache#18739)" This reverts commit b990987.
…e context into optimizer rules (apache#18739)" This reverts commit b990987.
|
left you a comment in your original repo @gabotechs, not sure if that will help you |
No description provided.