[SPARK-37638][PYTHON] Use existing active Spark session instead of SparkSession.getOrCreate in pandas API on Spark #34893
Changes from all commits:
@@ -464,20 +464,12 @@ def is_testing() -> bool:
     return "SPARK_TESTING" in os.environ


-def default_session(conf: Optional[Dict[str, Any]] = None) -> SparkSession:
-    if conf is None:
-        conf = dict()
+def default_session() -> SparkSession:
+    spark = SparkSession.getActiveSession()
+    if spark is not None:
+        return spark
+
     builder = SparkSession.builder.appName("pandas-on-Spark")
-    for key, value in conf.items():
-        builder = builder.config(key, value)
-    # Currently, pandas-on-Spark is dependent on such join due to 'compute.ops_on_diff_frames'
-    # configuration. This is needed with Spark 3.0+.
-    builder.config("spark.sql.analyzer.failAmbiguousSelfJoin", False)
Member (Author):
In fact, we fixed this bug in the master branch, so we don't need to set this anymore.
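For context, `compute.ops_on_diff_frames` is the pandas-on-Spark option that allows operations combining two different DataFrames by joining them on their indexes; that internally generated join is why `spark.sql.analyzer.failAmbiguousSelfJoin` used to be disabled here. Below is a minimal sketch of the kind of cross-frame operation involved (the frame contents are made up):

```python
import pyspark.pandas as ps

# Allow operations that combine two *different* pandas-on-Spark frames;
# such operations are implemented with a join on the index.
ps.set_option("compute.ops_on_diff_frames", True)

psdf1 = ps.DataFrame({"a": [1, 2, 3]})
psdf2 = ps.DataFrame({"a": [10, 20, 30]})

# This addition triggers the internal join between the two frames.
print((psdf1["a"] + psdf2["a"]).to_pandas())

ps.reset_option("compute.ops_on_diff_frames")
```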
-
-    if is_testing():
-        builder.config("spark.executor.allowSparkContext", False)
Member (Author):
And this will be set separately when the SparkContext is created, not here.

Member (Author):
And in fact this was set when we run tests with …

Member:
Actually, this was added for our test to check that our code doesn't create …
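For illustration, `spark.executor.allowSparkContext` is the Spark configuration that controls whether a SparkContext may be created on executors. The comments above say the setting now belongs where the SparkContext itself is created (for example in test setup), not in `default_session`. A hedged sketch of that pattern, with a made-up master and app name:

```python
from pyspark import SparkConf, SparkContext

# Illustrative test-style setup: apply the config when the SparkContext
# itself is created rather than inside default_session().
conf = SparkConf().set("spark.executor.allowSparkContext", "false")
sc = SparkContext(master="local[2]", appName="allow-sc-demo", conf=conf)

print(sc.getConf().get("spark.executor.allowSparkContext"))  # -> "false"
sc.stop()
```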
     return builder.getOrCreate()
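The net effect of the diff: `default_session` now prefers an already-active session, so configuration the user applied before touching the pandas API on Spark is respected instead of being shadowed by a freshly built session. A small usage sketch (the app name and config value are made up):

```python
from pyspark.sql import SparkSession
import pyspark.pandas as ps

# A user-created session with custom configuration becomes the active session.
spark = (
    SparkSession.builder
    .appName("my-app")
    .config("spark.sql.shuffle.partitions", "8")
    .getOrCreate()
)

# With this change, pandas-on-Spark reuses the active session above instead
# of building a new one, so the custom configuration also applies here.
psdf = ps.range(10)
print(psdf.head(3))
```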
Review thread on the removed `conf` argument:

"I'm not sure we can remove the `conf` argument here? I guess we should show a deprecation warning if it's not `None` for now and remove it in the future?"

"I think this isn't an API that's documented, so it should be fine."

"I'd just leave it to you."
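If backward compatibility were a concern, the deprecation path the reviewer floats could look roughly like the sketch below. This is an assumption-laden illustration, not what the PR merged (the PR removes the argument outright):

```python
import warnings
from typing import Any, Dict, Optional

from pyspark.sql import SparkSession


def default_session(conf: Optional[Dict[str, Any]] = None) -> SparkSession:
    # Hypothetical transition step: keep accepting `conf`, but warn that it
    # is deprecated and ignored in favor of the active session's config.
    if conf is not None:
        warnings.warn(
            "The `conf` argument is deprecated and ignored; configure the "
            "SparkSession before it becomes active instead.",
            FutureWarning,
        )
    spark = SparkSession.getActiveSession()
    if spark is not None:
        return spark
    return SparkSession.builder.appName("pandas-on-Spark").getOrCreate()
```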