Commit f44ead8
[SPARK-21538][SQL] Attribute resolution inconsistency in the Dataset API
## What changes were proposed in this pull request?
This PR contains a tiny update that removes an attribute resolution inconsistency in the Dataset API. The following example is taken from the ticket description:
```
// spark-shell already has these in scope; a standalone application needs them.
import org.apache.spark.sql.functions.col
import spark.implicits._

spark.range(1).withColumnRenamed("id", "x").sort(col("id")) // works
spark.range(1).withColumnRenamed("id", "x").sort($"id") // works
spark.range(1).withColumnRenamed("id", "x").sort('id) // works
spark.range(1).withColumnRenamed("id", "x").sort("id") // fails with:
// org.apache.spark.sql.AnalysisException: Cannot resolve column name "id" among (x);
```
The `AnalysisException` above occurs because the last overload calls `Dataset.apply()` to convert the strings into columns, which resolves each attribute eagerly against the renamed Dataset, whose only column is `x`. To make the overloaded methods consistent, this PR constructs the columns directly and defers resolution to analysis.
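The difference between eager and deferred resolution can be illustrated with a small, self-contained toy model. This is only a sketch: `Plan`, `ColumnRef`, `Unresolved`, `Resolved` and `ResolutionSketch` are hypothetical names invented for illustration, not Spark classes. It also assumes, as the three working cases suggest, that the later analysis step can still see the pre-rename column of the child plan.

```scala
// Toy model only: none of these types exist in Spark.
final case class Plan(output: Seq[String], child: Option[Plan] = None)

sealed trait ColumnRef
final case class Unresolved(name: String) extends ColumnRef
final case class Resolved(name: String) extends ColumnRef

object ResolutionSketch {
  private def failure(name: String, plan: Plan): String =
    s"""Cannot resolve column name "$name" among (${plan.output.mkString(", ")});"""

  // Eager: the name is checked against the plan's current output immediately,
  // which mirrors the failing string-based overload.
  def eager(plan: Plan, name: String): Either[String, ColumnRef] =
    if (plan.output.contains(name)) Right(Resolved(name))
    else Left(failure(name, plan))

  // Deferred: only wrap the name; nothing is looked up yet.
  def deferred(name: String): ColumnRef = Unresolved(name)

  // Later analysis (assumption): may also consult the child's output.
  def analyze(plan: Plan, ref: ColumnRef): Either[String, ColumnRef] = ref match {
    case r: Resolved => Right(r)
    case Unresolved(name) =>
      val visible = plan.output ++ plan.child.toSeq.flatMap(_.output)
      if (visible.contains(name)) Right(Resolved(name))
      else Left(failure(name, plan))
  }
}
```

With `val renamed = Plan(Seq("x"), Some(Plan(Seq("id"))))`, the call `ResolutionSketch.eager(renamed, "id")` returns a `Left` carrying the error message, while `ResolutionSketch.analyze(renamed, ResolutionSketch.deferred("id"))` returns `Right(Resolved("id"))`.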
Author: aokolnychyi <anton.okolnychyi@sap.com>
Closes #18740 from aokolnychyi/spark-21538.
1 parent 9f5647d commit f44ead8
2 files changed
Lines changed: 14 additions & 1 deletion
File tree
- sql/core/src
- main/scala/org/apache/spark/sql
- test/scala/org/apache/spark/sql
Lines changed: 1 addition & 1 deletion (line 1111 replaced; diff content not captured)
Lines changed: 13 additions & 0 deletions
New lines 1307-1319 added after original line 1306 (diff content not captured)