Commit dbf0b50
[SPARK-35560][SQL] Remove redundant subexpression evaluation in nested subexpressions
### What changes were proposed in this pull request?
This patch improves subexpression evaluation under whole-stage codegen when subexpressions are nested.
### Why are the changes needed?
When subexpressions are nested, whole-stage codegen's subexpression elimination evaluates the inner subexpression redundantly: once on its own and again inside each enclosing subexpression. We should avoid this. For example, given two subexpressions:
1. `simpleUDF($"id")`
2. `functions.length(simpleUDF($"id"))`
We should evaluate `simpleUDF($"id")` only once, i.e.:
```java
subExpr1 = simpleUDF($"id");
subExpr2 = functions.length(subExpr1);
```
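To make the saving concrete, here is a minimal standalone sketch (not Spark code) that counts how often a stand-in UDF runs under each strategy. The class `NestedSubexprSketch`, the counter `udfCalls`, and the body of `simpleUdf` are hypothetical names chosen for illustration only.

```java
// Hypothetical illustration of the idea; this is not Spark's implementation.
class NestedSubexprSketch {
    // Counts how often the stand-in UDF is invoked.
    static int udfCalls = 0;

    // Stand-in for simpleUDF($"id"); the real UDF body is an assumption here.
    static String simpleUdf(long id) {
        udfCalls++;
        return String.valueOf(id);
    }

    // Before: each subexpression is evaluated on its own, so the UDF runs twice.
    static int evalIndependently(long id) {
        udfCalls = 0;
        String subExpr1 = simpleUdf(id);
        int subExpr2 = simpleUdf(id).length();
        return udfCalls;
    }

    // After: the outer subexpression reuses the inner result, so the UDF runs once.
    static int evalWithReuse(long id) {
        udfCalls = 0;
        String subExpr1 = simpleUdf(id);
        int subExpr2 = subExpr1.length();
        return udfCalls;
    }

    public static void main(String[] args) {
        System.out.println("independent: " + evalIndependently(12345L) + " UDF call(s)");
        System.out.println("with reuse:  " + evalWithReuse(12345L) + " UDF call(s)");
    }
}
```

The difference matters most when the inner expression is expensive or has observable side effects, as a UDF may.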
Snippets of the generated code:
Before:
```java
/* 040 */ private int project_subExpr_1(long project_expr_0_0) {
/* 041 */ boolean project_isNull_6 = false;
/* 042 */ UTF8String project_value_6 = null;
/* 043 */ if (!false) {
/* 044 */ project_value_6 = UTF8String.fromString(String.valueOf(project_expr_0_0));
/* 045 */ }
/* 046 */
/* 047 */ Object project_arg_1 = null;
/* 048 */ if (project_isNull_6) {
/* 049 */ project_arg_1 = ((scala.Function1[]) references[3] /* converters */)[0].apply(null);
/* 050 */ } else {
/* 051 */ project_arg_1 = ((scala.Function1[]) references[3] /* converters */)[0].apply(project_value_6);
/* 052 */ }
/* 053 */
/* 054 */ UTF8String project_result_1 = null;
/* 055 */ try {
/* 056 */ project_result_1 = (UTF8String)((scala.Function1[]) references[3] /* converters */)[1].apply(((scala.Function1) references[4] /* udf */).apply(project_arg_1)
);
/* 057 */ } catch (Throwable e) {
/* 058 */ throw QueryExecutionErrors.failedExecuteUserDefinedFunctionError(
/* 059 */ "DataFrameSuite$$Lambda$6418/1507986601", "string", "string", e);
/* 060 */ }
/* 061 */
/* 062 */ boolean project_isNull_5 = project_result_1 == null;
/* 063 */ UTF8String project_value_5 = null;
/* 064 */ if (!project_isNull_5) {
/* 065 */ project_value_5 = project_result_1;
/* 066 */ }
/* 067 */ boolean project_isNull_4 = project_isNull_5;
/* 068 */ int project_value_4 = -1;
/* 069 */
/* 070 */ if (!project_isNull_5) {
/* 071 */ project_value_4 = (project_value_5).numChars();
/* 072 */ }
/* 073 */ project_subExprIsNull_1 = project_isNull_4;
/* 074 */ return project_value_4;
/* 075 */ }
...
/* 149 */ private UTF8String project_subExpr_0(long project_expr_0_0) {
/* 150 */ boolean project_isNull_2 = false;
/* 151 */ UTF8String project_value_2 = null;
/* 152 */ if (!false) {
/* 153 */ project_value_2 = UTF8String.fromString(String.valueOf(project_expr_0_0));
/* 154 */ }
/* 155 */
/* 156 */ Object project_arg_0 = null;
/* 157 */ if (project_isNull_2) {
/* 158 */ project_arg_0 = ((scala.Function1[]) references[1] /* converters */)[0].apply(null);
/* 159 */ } else {
/* 160 */ project_arg_0 = ((scala.Function1[]) references[1] /* converters */)[0].apply(project_value_2);
/* 161 */ }
/* 162 */
/* 163 */ UTF8String project_result_0 = null;
/* 164 */ try {
/* 165 */ project_result_0 = (UTF8String)((scala.Function1[]) references[1] /* converters */)[1].apply(((scala.Function1) references[2] /* udf */).apply(project_arg_0)
);
/* 166 */ } catch (Throwable e) {
/* 167 */ throw QueryExecutionErrors.failedExecuteUserDefinedFunctionError(
/* 168 */ "DataFrameSuite$$Lambda$6418/1507986601", "string", "string", e);
/* 169 */ }
/* 170 */
/* 171 */ boolean project_isNull_1 = project_result_0 == null;
/* 172 */ UTF8String project_value_1 = null;
/* 173 */ if (!project_isNull_1) {
/* 174 */ project_value_1 = project_result_0;
/* 175 */ }
/* 176 */ project_subExprIsNull_0 = project_isNull_1;
/* 177 */ return project_value_1;
/* 178 */ }
```
After:
```java
/* 041 */ private void project_subExpr_1(long project_expr_0_0) {
/* 042 */ boolean project_isNull_8 = project_subExprIsNull_0;
/* 043 */ int project_value_8 = -1;
/* 044 */
/* 045 */ if (!project_subExprIsNull_0) {
/* 046 */ project_value_8 = (project_mutableStateArray_0[0]).numChars();
/* 047 */ }
/* 048 */ project_subExprIsNull_1 = project_isNull_8;
/* 049 */ project_subExprValue_0 = project_value_8;
/* 050 */ }
/* 056 */
...
/* 123 */
/* 124 */ private void project_subExpr_0(long project_expr_0_0) {
/* 125 */ boolean project_isNull_6 = false;
/* 126 */ UTF8String project_value_6 = null;
/* 127 */ if (!false) {
/* 128 */ project_value_6 = UTF8String.fromString(String.valueOf(project_expr_0_0));
/* 129 */ }
/* 130 */
/* 131 */ Object project_arg_1 = null;
/* 132 */ if (project_isNull_6) {
/* 133 */ project_arg_1 = ((scala.Function1[]) references[3] /* converters */)[0].apply(null);
/* 134 */ } else {
/* 135 */ project_arg_1 = ((scala.Function1[]) references[3] /* converters */)[0].apply(project_value_6);
/* 136 */ }
/* 137 */
/* 138 */ UTF8String project_result_1 = null;
/* 139 */ try {
/* 140 */ project_result_1 = (UTF8String)((scala.Function1[]) references[3] /* converters */)[1].apply(((scala.Function1) references[4] /* udf */).apply(project_arg_1)
);
/* 141 */ } catch (Throwable e) {
/* 142 */ throw QueryExecutionErrors.failedExecuteUserDefinedFunctionError(
/* 143 */ "DataFrameSuite$$Lambda$6430/2004847941", "string", "string", e);
/* 144 */ }
/* 145 */
/* 146 */ boolean project_isNull_5 = project_result_1 == null;
/* 147 */ UTF8String project_value_5 = null;
/* 148 */ if (!project_isNull_5) {
/* 149 */ project_value_5 = project_result_1;
/* 150 */ }
/* 151 */ project_subExprIsNull_0 = project_isNull_5;
/* 152 */ project_mutableStateArray_0[0] = project_value_5;
/* 153 */ }
```
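The pattern in the "After" snippet can be reduced to a small hand-written sketch: each subexpression function returns `void`, stores its result in member state, and the outer function reads the inner one's state instead of re-running the UDF. This is a simplified illustration, not the generated code. The class `SubexprStateSketch`, the identity `udf`, and the use of `String.length()` in place of `UTF8String.numChars()` are assumptions.

```java
// Simplified, hypothetical analogue of the "After" generated code; not Spark code.
class SubexprStateSketch {
    // State written by subExpr0 (plays the role of project_subExprIsNull_0
    // and project_mutableStateArray_0[0]).
    private boolean subExprIsNull0;
    private String subExprValue0;

    // State written by subExpr1 (plays the role of project_subExprIsNull_1
    // and project_subExprValue_0).
    private boolean subExprIsNull1;
    private int subExprValue1;

    // Counts UDF invocations so the reuse is observable.
    int udfCalls = 0;

    // Hypothetical identity UDF standing in for the user function.
    private String udf(String s) {
        udfCalls++;
        return s;
    }

    // Inner subexpression: runs the UDF and stores the result in member state.
    private void subExpr0(long id) {
        String result = udf(String.valueOf(id));
        subExprIsNull0 = (result == null);
        subExprValue0 = result;
    }

    // Outer subexpression: reads subExpr0's state instead of calling the UDF again.
    private void subExpr1(long id) {
        subExprIsNull1 = subExprIsNull0;
        subExprValue1 = -1;
        if (!subExprIsNull0) {
            subExprValue1 = subExprValue0.length();
        }
    }

    // Evaluates both subexpressions for one row; the inner one must run first.
    int processRow(long id) {
        subExpr0(id);
        subExpr1(id);
        return subExprIsNull1 ? -1 : subExprValue1;
    }

    public static void main(String[] args) {
        SubexprStateSketch sketch = new SubexprStateSketch();
        System.out.println("length = " + sketch.processRow(12345L)
            + ", UDF calls = " + sketch.udfCalls);
    }
}
```

Because the outer function depends on state written by the inner one, the order of evaluation matters in this sketch: `subExpr0` has to run before `subExpr1` for each row.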
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Unit test.
Closes #32699 from viirya/improve-subexpr.
Authored-by: Liang-Chi Hsieh <[email protected]>
Signed-off-by: Liang-Chi Hsieh <[email protected]>
1 parent 9d0d4ed commit dbf0b50
File tree
2 files changed, +55 -19 lines changed
- sql
  - catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen
  - core/src/test/scala/org/apache/spark/sql

Lines changed: 30 additions & 19 deletions
Lines changed: 25 additions & 0 deletions