
Commit 8d64cb4

itholic authored and HyukjinKwon committed
[SPARK-46360][PYTHON] Enhance error message debugging with new getMessage API
### What changes were proposed in this pull request?

This PR proposes to introduce `getMessage` to provide a standardized way for users to obtain a concise and clear error message.

### Why are the changes needed?

Previously, extracting a simple and informative error message in PySpark was not straightforward. The internal `ErrorClassesReader.get_error_message` method was often used, but for JVM-originated errors not defined in `error_classes.py`, obtaining a succinct error message was challenging.

The new `getMessage` API harmonizes error message retrieval across PySpark, leveraging existing JVM implementations to ensure consistency and clarity in the messages presented to the users.

### Does this PR introduce _any_ user-facing change?

Yes, this PR introduces a `getMessage` for directly accessing simplified error messages in PySpark.

- **Before**: No official API for simplified error messages; excessive details in the error output:

  ```python
  from pyspark.sql.utils import AnalysisException

  try:
      spark.sql("""SELECT a""")
  except AnalysisException as e:
      str(e)
      # "[UNRESOLVED_COLUMN.WITHOUT_SUGGESTION] A column, variable, or function parameter with name `a` cannot be resolved. SQLSTATE: 42703; line 1 pos 7;\n'Project ['a]\n+- OneRowRelation\n"
  ```

- **After**: The `getMessage` API provides streamlined, user-friendly error messages:

  ```python
  from pyspark.sql.utils import AnalysisException

  try:
      spark.sql("""SELECT a""")
  except AnalysisException as e:
      e.getMessage()
      # '[UNRESOLVED_COLUMN.WITHOUT_SUGGESTION] A column, variable, or function parameter with name `a` cannot be resolved. SQLSTATE: 42703'
  ```

### How was this patch tested?

Added UTs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #44292 from itholic/getMessage.

Authored-by: Haejoon Lee <haejoon.lee@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
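Code that must run on PySpark versions both with and without `getMessage` could wrap the before/after behavior described above in a small helper. The following is a minimal sketch under that assumption: `concise_message`, `OldStyleError`, and `NewStyleError` are hypothetical names used only for illustration and are not part of PySpark, and the stand-in classes exist so the sketch runs without a Spark session.

```python
def concise_message(exc: Exception) -> str:
    """Return exc.getMessage() when it exists and is non-empty, else str(exc)."""
    get_message = getattr(exc, "getMessage", None)
    if callable(get_message):
        message = get_message()
        if message:
            return message
    # Fall back to the verbose string form when no concise message is available.
    return str(exc)


# Hypothetical stand-ins (not PySpark classes) so the sketch is runnable on its own.
class OldStyleError(Exception):
    """Mimics an exception from a version that lacks getMessage."""


class NewStyleError(Exception):
    """Mimics an exception that exposes getMessage."""

    def getMessage(self) -> str:
        return "[SOME_ERROR_CLASS] concise text"


old_result = concise_message(OldStyleError("verbose text with plan"))
new_result = concise_message(NewStyleError("verbose text with plan"))
```

With a real session, the same helper would be called as `concise_message(e)` inside the `except AnalysisException as e:` block shown in the commit message.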
1 parent 2a49fee commit 8d64cb4

3 files changed

Lines changed: 44 additions & 1 deletion


python/pyspark/errors/exceptions/base.py

Lines changed: 18 additions & 1 deletion
@@ -60,6 +60,7 @@ def getErrorClass(self) -> Optional[str]:
 
         See Also
         --------
+        :meth:`PySparkException.getMessage`
         :meth:`PySparkException.getMessageParameters`
         :meth:`PySparkException.getSqlState`
         """
@@ -74,6 +75,7 @@ def getMessageParameters(self) -> Optional[Dict[str, str]]:
         See Also
         --------
         :meth:`PySparkException.getErrorClass`
+        :meth:`PySparkException.getMessage`
         :meth:`PySparkException.getSqlState`
         """
         return self._message_parameters
@@ -89,13 +91,28 @@ def getSqlState(self) -> Optional[str]:
         See Also
         --------
         :meth:`PySparkException.getErrorClass`
+        :meth:`PySparkException.getMessage`
         :meth:`PySparkException.getMessageParameters`
         """
         return None
 
+    def getMessage(self) -> str:
+        """
+        Returns full error message.
+
+        .. versionadded:: 4.0.0
+
+        See Also
+        --------
+        :meth:`PySparkException.getErrorClass`
+        :meth:`PySparkException.getMessageParameters`
+        :meth:`PySparkException.getSqlState`
+        """
+        return f"[{self.getErrorClass()}] {self._message}"
+
     def __str__(self) -> str:
         if self.getErrorClass() is not None:
-            return f"[{self.getErrorClass()}] {self._message}"
+            return self.getMessage()
         else:
             return self._message
 
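The relationship between `getMessage` and `__str__` in this file can be illustrated with a stripped-down stand-in. This is only a sketch: `SketchException` and its constructor are hypothetical and do not reproduce the real `PySparkException` signature; only the two method bodies are copied from the diff. It suggests that `__str__` keeps its `None` guard, while a direct `getMessage()` call on an instance without an error class would format `None` into the prefix.

```python
from typing import Optional


class SketchException(Exception):
    """Hypothetical stand-in mirroring the getMessage/__str__ logic in the diff."""

    def __init__(self, message: str, error_class: Optional[str] = None):
        super().__init__(message)
        self._message = message
        self._error_class = error_class

    def getErrorClass(self) -> Optional[str]:
        return self._error_class

    def getMessage(self) -> str:
        # Same expression as the added method: there is no None check here.
        return f"[{self.getErrorClass()}] {self._message}"

    def __str__(self) -> str:
        # __str__ only delegates to getMessage when an error class exists.
        if self.getErrorClass() is not None:
            return self.getMessage()
        else:
            return self._message


with_class = SketchException("boom", error_class="SOME_ERROR_CLASS")
without_class = SketchException("boom")
```

In this sketch, `str(with_class)` and `with_class.getMessage()` agree, whereas `str(without_class)` is the bare message and `without_class.getMessage()` carries a `[None]` prefix.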

python/pyspark/errors/exceptions/captured.py

Lines changed: 18 additions & 0 deletions
@@ -118,6 +118,24 @@ def getSqlState(self) -> Optional[str]:
         else:
             return None
 
+    def getMessage(self) -> str:
+        assert SparkContext._gateway is not None
+        gw = SparkContext._gateway
+
+        if self._origin is not None and is_instance_of(
+            gw, self._origin, "org.apache.spark.SparkThrowable"
+        ):
+            error_class = self._origin.getErrorClass()
+            message_parameters = self._origin.getMessageParameters()
+
+            error_message = gw.jvm.org.apache.spark.SparkThrowableHelper.getMessage(
+                error_class, message_parameters
+            )
+
+            return error_message
+        else:
+            return ""
+
 
 def convert_exception(e: Py4JJavaError) -> CapturedException:
     assert e is not None
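The branch structure of the captured `getMessage` above can be sketched without Py4J. Everything named here is an assumption for illustration: `FakeThrowable` stands in for a JVM object implementing `SparkThrowable`, `isinstance` stands in for the `is_instance_of` gateway check, and `fake_render` is a placeholder for the JVM-side `SparkThrowableHelper.getMessage` call, whose real output format is not reproduced.

```python
from typing import Callable, Dict, Optional


class FakeThrowable:
    """Hypothetical stand-in for a JVM object implementing SparkThrowable."""

    def __init__(self, error_class: str, params: Dict[str, str]):
        self._error_class = error_class
        self._params = params

    def getErrorClass(self) -> str:
        return self._error_class

    def getMessageParameters(self) -> Dict[str, str]:
        return self._params


def captured_get_message(
    origin: Optional[object],
    render: Callable[[str, Dict[str, str]], str],
) -> str:
    """Mirror the two branches of the added method, minus the Py4J gateway."""
    # isinstance replaces the is_instance_of(gw, origin, "...SparkThrowable") check.
    if origin is not None and isinstance(origin, FakeThrowable):
        return render(origin.getErrorClass(), origin.getMessageParameters())
    # Any other origin yields an empty string, as in the diff's else branch.
    return ""


def fake_render(error_class: str, params: Dict[str, str]) -> str:
    # Placeholder formatting only; the real message comes from the JVM helper.
    return f"[{error_class}] {sorted(params.items())}"


rendered = captured_get_message(FakeThrowable("X", {"a": "b"}), fake_render)
empty = captured_get_message(None, fake_render)
```

Because the non-`SparkThrowable` branch returns an empty string, callers that always need some text might prefer an expression such as `e.getMessage() or str(e)`.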

python/pyspark/sql/tests/test_utils.py

Lines changed: 8 additions & 0 deletions
@@ -1750,13 +1750,21 @@ def test_get_error_class_state(self):
             self.assertEqual(e.getErrorClass(), "UNRESOLVED_COLUMN.WITHOUT_SUGGESTION")
             self.assertEqual(e.getSqlState(), "42703")
             self.assertEqual(e.getMessageParameters(), {"objectName": "`a`"})
+            self.assertEqual(
+                e.getMessage(),
+                (
+                    "[UNRESOLVED_COLUMN.WITHOUT_SUGGESTION] A column, variable, or function "
+                    "parameter with name `a` cannot be resolved. SQLSTATE: 42703"
+                ),
+            )
 
         try:
             self.spark.sql("""SELECT assert_true(FALSE)""")
         except AnalysisException as e:
             self.assertIsNone(e.getErrorClass())
             self.assertIsNone(e.getSqlState())
             self.assertEqual(e.getMessageParameters(), {})
+            self.assertEqual(e.getMessage(), "")
 
 
 if __name__ == "__main__":
