Commit ccb0a59
[SPARK-23446][PYTHON] Explicitly check supported types in toPandas
## What changes were proposed in this pull request?
This PR explicitly specifies and checks the types supported in `toPandas`. Previously this was a hole. For example, binary type support is not finished on the Python side yet, but it is currently allowed, as shown below:
```python
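# With Arrow disabled, the binary value comes back as a list of byte values.
spark.conf.set("spark.sql.execution.arrow.enabled", "false")
df = spark.createDataFrame([[bytearray(b"a")]])
df.toPandas()
# With Arrow enabled, the same value is rendered as `a` instead.
spark.conf.set("spark.sql.execution.arrow.enabled", "true")
df.toPandas()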
```
```
     _1
0  [97]
  _1
0  a
```
This should be disallowed. I think the same applies to nested timestamps.
I also added a nicer note about `spark.sql.execution.arrow.enabled` to the error message.
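As a rough illustration of the idea (not the code in this patch), an explicit check can be sketched as an allow-list that rejects anything not named in it, with the configuration key mentioned in the resulting error. Everything below is hypothetical: the names `SUPPORTED_TYPE_NAMES`, `check_supported`, and `convert_with_note`, the contents of the allow-list, the wording of the note, and the use of plain `(column, type_name)` pairs in place of a real Spark schema.

```python
# Illustrative allow-list only; the set actually used by the patch may differ.
SUPPORTED_TYPE_NAMES = frozenset([
    "boolean", "int", "bigint", "float", "double", "string", "date", "timestamp",
])

# Hypothetical wording; only the configuration key comes from the commit message.
ARROW_NOTE = (
    "Note: this conversion was attempted because "
    "'spark.sql.execution.arrow.enabled' is set to true."
)


def check_supported(schema):
    """Raise TypeError for the first column whose type is not allow-listed.

    `schema` is a list of (column_name, type_name) pairs, a stand-in for a
    real Spark schema.
    """
    for column, type_name in schema:
        if type_name not in SUPPORTED_TYPE_NAMES:
            raise TypeError(
                "Unsupported type in conversion to pandas: %s (column %s)"
                % (type_name, column)
            )


def convert_with_note(schema, convert):
    """Call `convert` only if the type check passes; add the note on failure."""
    try:
        check_supported(schema)
        return convert()
    except Exception as e:
        raise RuntimeError("%s\n%s" % (e, ARROW_NOTE))


# Usage: a binary column and a nested timestamp are both rejected up front.
for schema in ([("_1", "binary")], [("ts", "array<timestamp>")]):
    try:
        convert_with_note(schema, lambda: None)
    except RuntimeError as e:
        print(e)
```

Failing on anything outside the allow-list, instead of listing known-bad types, means a type whose support is unfinished is rejected by default.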
## How was this patch tested?
Manually tested and tests added in `python/pyspark/sql/tests.py`.
Author: hyukjinkwon <[email protected]>
Closes #20625 from HyukjinKwon/pandas_convertion_supported_type.
(cherry picked from commit c5857e4)
Signed-off-by: gatorsmile <[email protected]>
1 parent 75bb19a commit ccb0a59
2 files changed: 17 additions, 7 deletions.