Commit 3f6a2bb
[SPARK-17515] CollectLimit.execute() should perform per-partition limits
## What changes were proposed in this pull request?
CollectLimit.execute() incorrectly omits per-partition limits, leading to performance regressions in case this case is hit (which should not happen in normal operation, but can occur in some cases (see #15068 for one example).
## How was this patch tested?
Regression test in SQLQuerySuite that asserts the number of records scanned from the input RDD.
Author: Josh Rosen <[email protected]>
Closes #15070 from JoshRosen/SPARK-17515.1 parent 46f5c20 commit 3f6a2bb
File tree
2 files changed
+11
-1
lines changed- sql/core/src
- main/scala/org/apache/spark/sql/execution
- test/scala/org/apache/spark/sql
2 files changed
+11
-1
lines changedLines changed: 2 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
39 | 39 | | |
40 | 40 | | |
41 | 41 | | |
| 42 | + | |
42 | 43 | | |
43 | 44 | | |
44 | | - | |
| 45 | + | |
45 | 46 | | |
46 | 47 | | |
47 | 48 | | |
| |||
Lines changed: 9 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2661 | 2661 | | |
2662 | 2662 | | |
2663 | 2663 | | |
| 2664 | + | |
| 2665 | + | |
| 2666 | + | |
| 2667 | + | |
| 2668 | + | |
| 2669 | + | |
| 2670 | + | |
| 2671 | + | |
| 2672 | + | |
2664 | 2673 | | |
0 commit comments