[SPARK-25850][SQL] Make the split threshold for the code generated function configurable #22847
Conversation
…thod configurable
@cloud-fan @dongjoon-hyun @gengliangwang Kindly help review.
.internal()
.doc("The maximum source code length of a single Java function by codegen. When the " +
  "generated Java function source code exceeds this threshold, it will be split into " +
  "multiple small functions, each function length is spark.sql.codegen.methodSplitThreshold." +
The phrase "each function length is spark.sql.codegen.methodSplitThreshold" is not true: the method size is always larger than the threshold. cc @kiszk any idea about the naming and description of this config?
IMHO, spark.sql.codegen.methodSplitThreshold can be used, but the description should be changed. For example,
The threshold of source code length without comment of a single Java function by codegen to be split. When the generated Java function source code exceeds this threshold, it will be split into multiple small functions. ...
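For concreteness, here is a rough sketch (mine, not from the thread) of how that revised wording would read in the SQLConf entry; the builder calls match those visible in the diff, and the 1024 default mirrors the previously hardcoded value:

```scala
// Sketch only: config entry in SQLConf.scala with the suggested description.
val CODEGEN_METHOD_SPLIT_THRESHOLD = buildConf("spark.sql.codegen.methodSplitThreshold")
  .internal()
  .doc("The threshold of source code length without comment of a single Java function " +
    "by codegen to be split. When the generated Java function source code exceeds this " +
    "threshold, it will be split into multiple small functions.")
  .intConf
  .createWithDefault(1024)
```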
Test build #98080 has finished for PR 22847 at commit
| "generated Java function source code exceeds this threshold, it will be split into " + | ||
| "multiple small functions, each function length is spark.sql.codegen.methodSplitThreshold." + | ||
| " A function's bytecode should not go beyond 8KB, otherwise it will not be JITted, should " + | ||
| "also not be too small, or we will have many function calls. We can't know how many " + |
it will not be JITted; it also should not be too small, otherwise there will be many function calls.
Test build #98129 has finished for PR 22847 at commit
Test build #98130 has finished for PR 22847 at commit
Test build #98132 has finished for PR 22847 at commit
| "be generated, so use the code length as metric. A function's bytecode should not go " + | ||
| "beyond 8KB, otherwise it will not be JITted; it also should not be too small, otherwise " + | ||
| "there will be many function calls.") | ||
| .intConf |
According to the description, it seems we had better have a checkValue here. Could you recommend reasonable min/max values, @kiszk?
To be more accurate, I think I should add "When running on HotSpot, a function's bytecode should not go beyond 8KB...".
Yep. That could be max.
1000 is conservative. But there is no single recommendation, since the bytecode size depends on the content (e.g. pushing the constant 0 takes 1 byte of bytecode, while pushing 9 takes 2).
@kiszk agree, 1000 might not be the best; see my benchmark on the wide table, where 2048 is better.
================================================================================================
projection on wide table
================================================================================================
Java HotSpot(TM) 64-Bit Server VM 1.8.0_162-b12 on Mac OS X 10.13.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
projection on wide table: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
split threshold 10 8464 / 8737 0.1 8072.0 1.0X
split threshold 100 5959 / 6251 0.2 5683.4 1.4X
split threshold 1024 3202 / 3248 0.3 3053.2 2.6X
split threshold 2048 3009 / 3097 0.3 2869.2 2.8X
split threshold 4096 3414 / 3458 0.3 3256.1 2.5X
split threshold 8196 4095 / 4112 0.3 3905.5 2.1X
split threshold 65536 28800 / 29705 0.0 27465.8 0.3X
Could you try some very long alias names or complex expressions? You will get different numbers, right?
Seems like long alias names have no influence.
[info] Java HotSpot(TM) 64-Bit Server VM 1.8.0_162-b12 on Mac OS X 10.13.6
[info] Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
[info] projection on wide table: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------
[info] split threshold 10 6512 / 6736 0.2 6210.4 1.0X
[info] split threshold 100 5730 / 6329 0.2 5464.9 1.1X
[info] split threshold 1024 3119 / 3184 0.3 2974.6 2.1X
[info] split threshold 2048 2981 / 3100 0.4 2842.9 2.2X
[info] split threshold 4096 3289 / 3379 0.3 3136.6 2.0X
[info] split threshold 8196 4307 / 4338 0.2 4108.0 1.5X
[info] split threshold 65536 29147 / 30212 0.0 27797.0 0.2X
No averylongprefixrepeatedmultipletimes appears in the expression codegen:
/* 047 */ private void createExternalRow_0_8(InternalRow i, Object[] values_0) {
/* 048 */
/* 049 */ // input[80, bigint, false]
/* 050 */ long value_81 = i.getLong(80);
/* 051 */ if (false) {
/* 052 */ values_0[80] = null;
/* 053 */ } else {
/* 054 */ values_0[80] = value_81;
/* 055 */ }
/* 056 */
/* 057 */ // input[81, bigint, false]
/* 058 */ long value_82 = i.getLong(81);
/* 059 */ if (false) {
/* 060 */ values_0[81] = null;
/* 061 */ } else {
/* 062 */ values_0[81] = value_82;
/* 063 */ }
/* 064 */
/* 065 */ // input[82, bigint, false]
/* 066 */ long value_83 = i.getLong(82);
/* 067 */ if (false) {
/* 068 */ values_0[82] = null;
/* 069 */ } else {
/* 070 */ values_0[82] = value_83;
/* 071 */ }
/* 072 */
...
My benchmark:
object WideTableBenchmark extends SqlBasedBenchmark {
override def runBenchmarkSuite(mainArgs: Array[String]): Unit = {
runBenchmark("projection on wide table") {
val N = 1 << 20
val df = spark.range(N)
val columns = (0 until 400).map{ i => s"id as averylongprefixrepeatedmultipletimes_id$i"}
val benchmark = new Benchmark("projection on wide table", N, output = output)
Seq("10", "100", "1024", "2048", "4096", "8196", "65536").foreach { n =>
benchmark.addCase(s"split threshold $n", numIters = 5) { iter =>
withSQLConf("spark.testing.codegen.splitThreshold" -> n) {
df.selectExpr(columns: _*).foreach(identity(_))
}
}
}
benchmark.run()
}
}
}
Will keep benchmarking with complex expressions.
The "freshNamePrefix" prefix is only applied in whole-stage codegen,
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala#L87
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala#L169
It doesn't take any effect in non-whole-stage codegen.
If you intend to stress test expression codegen but don't see the prefix being prepended, you're probably not adding it in the right place. Where did you add it?
Oh I see, you're using the column name...that's not the right place to put the "prefix". Column names are almost never carried over to the generated code in the current framework (the only exception is the lambda variable name).
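One way to double-check this claim (my suggestion, not part of the thread): dump the generated code with Spark's debug helpers and grep for the alias; it should not appear.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.debug._  // provides debugCodegen() on Dataset

val spark = SparkSession.builder().master("local[*]").appName("codegen-peek").getOrCreate()

// The alias below does not survive into the generated Java source; variable
// names there come from CodegenContext.freshName, not from column names.
val df = spark.range(10).selectExpr("id as averylongprefixrepeatedmultipletimes_id0")
df.debugCodegen()  // prints the generated code of each whole-stage-codegen subtree
```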
@rednaxelafx the wide table benchmark I used has 400 columns, so whole-stage codegen is disabled by default (the column count exceeds spark.sql.codegen.maxFields).
.internal()
.doc("The threshold of source code length without comment of a single Java function by " +
  "codegen to be split. When the generated Java function source code exceeds this threshold" +
  ", it will be split into multiple small functions. We can't know how many bytecode will " +
Not a big deal, but I would avoid abbreviations in documentation: can't -> cannot.
Test build #98182 has finished for PR 22847 at commit
retest this please
Test build #98190 has finished for PR 22847 at commit
| "bytecode should not go beyond 8KB, otherwise it will not be JITted; it also should not " + | ||
| "be too small, otherwise there will be many function calls.") | ||
| .intConf | ||
| .createWithDefault(1024) |
let's add a checkValue to make sure the value is positive. We can figure out a lower and upper bound later.
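A minimal sketch of that check (the error message is just an example); lower/upper bounds could be tightened in a follow-up:

```scala
// Sketch only: the same entry with a positivity check via checkValue.
val CODEGEN_METHOD_SPLIT_THRESHOLD = buildConf("spark.sql.codegen.methodSplitThreshold")
  .internal()
  .doc("...")  // description as in the diff above
  .intConf
  .checkValue(threshold => threshold > 0, "The threshold must be a positive integer.")
  .createWithDefault(1024)
```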
@cloud-fan @dongjoon-hyun @kiszk I just added a check that the value is positive; maybe we need another PR to figure out a better value later, if it is not easy to decide now.
dongjoon-hyun left a comment
Thanks! For me, it looks enough to proceed for now.
Test build #98288 has finished for PR 22847 at commit
.internal()
.doc("The threshold of source code length without comment of a single Java function by " +
  "codegen to be split. When the generated Java function source code exceeds this threshold" +
  ", it will be split into multiple small functions. We cannot know how many bytecode will " +
The threshold of source-code splitting in the codegen. When the number of characters in a single JAVA function (without comment) exceeds the threshold, the function will be automatically split to multiple smaller ones.
val CODEGEN_METHOD_SPLIT_THRESHOLD = buildConf("spark.sql.codegen.methodSplitThreshold")
  .internal()
  .doc("The threshold of source-code splitting in the codegen. When the number of characters " +
    "in a single JAVA function (without comment) exceeds the threshold, the function will be " +
nit: JAVA -> Java
Test build #98295 has finished for PR 22847 at commit
retest this please
Test build #98298 has finished for PR 22847 at commit
Test build #98301 has finished for PR 22847 at commit
Using source code length will have to be a very coarse-grained, "fuzzy" heuristic. It's not meant to be accurate, so just pick some number that makes sense. Tidbit: before this PR, since the …

Side notes for those curious: performance-wise, it might be very interesting to note that, running on HotSpot, sometimes it is better to split into more but smaller methods, and sometimes it is better to split into fewer but larger methods. The theory is this: if a method is really hot, we'd like it to eventually be inlined by the JIT to reduce method invocation overhead and get better optimization. In HotSpot, hot methods will eventually be compiled by C2, which uses multiple bytecode-size-based thresholds to determine whether a callee is fit for inlining (the values are the defaults for C2 on x86). Don't be confused by the name, though. So the deal is: on HotSpot, if you want to save bytecode size by …

So, if a method is compute-intensive and is small enough, it is better to strive to keep the split-off methods within the thresholds mentioned above.
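For readers who want the concrete numbers on their own JVM (my addition, not part of the review): the C2 inlining thresholds referred to here are ordinary HotSpot product flags, e.g. MaxInlineSize (35 bytes of bytecode by default) and FreqInlineSize (325 on x86), and methods whose bytecode exceeds roughly 8000 bytes are not JIT-compiled at all. They can be inspected at runtime:

```scala
import java.lang.management.ManagementFactory
import com.sun.management.HotSpotDiagnosticMXBean

// Print the inlining-related HotSpot flags of the running JVM.
object InlineThresholds {
  def main(args: Array[String]): Unit = {
    val bean = ManagementFactory.getPlatformMXBean(classOf[HotSpotDiagnosticMXBean])
    Seq("MaxInlineSize", "FreqInlineSize", "DontCompileHugeMethods").foreach { flag =>
      println(s"$flag = ${bean.getVMOption(flag).getValue}")
    }
  }
}
```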
Can / should we apply the same threshold conf to … ?
@rednaxelafx ah good point! It's hardcoded as 1024 too, and it's also doing method splitting. Let's apply the config there too.
Just in case people wonder, the following is the hack patch that I used for stress testing code splitting before this PR:

--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
@@ -647,11 +647,13 @@ class CodegenContext(val useStreamlining: Boolean) {
* Returns a term name that is unique within this instance of a `CodegenContext`.
*/
def freshName(name: String): String = synchronized {
- val fullName = if (freshNamePrefix == "") {
+ // hack: intentionally add a very long prefix (length=300 characters) to
+ // trigger code splitting more frequently
+ val fullName = ("averylongprefix" * 20) + (if (freshNamePrefix == "") {
name
} else {
s"${freshNamePrefix}_$name"
- }
+ })
if (freshNameIds.contains(fullName)) {
val id = freshNameIds(fullName)
freshNameIds(fullName) = id + 1

Of course, now with this PR, we can simply set the split threshold to a very low value (e.g. …).
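For example (my sketch, assuming it runs inside a suite or benchmark that already provides spark and the withSQLConf helper, like the WideTableBenchmark above):

```scala
// Force very aggressive splitting through the new config instead of patching freshName.
// The threshold value and the 400-column projection are illustrative only.
val columns = (0 until 400).map(i => s"id as c$i")
withSQLConf("spark.sql.codegen.methodSplitThreshold" -> "10") {
  spark.range(1 << 20).selectExpr(columns: _*).foreach(identity(_))
}
```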
I used the WideTableBenchmark to test this configuration.

Scenario: …

Perf results: …

Complete benchmark source: WideTableBenchmark.scala

And I have another finding: … Not sure if it is a known issue?
@cloud-fan @gatorsmile How about merging this PR first? Then we can discuss those performance issues in another PR.
did you address #22847 (comment)?
@cloud-fan @rednaxelafx I missed that! Please help review.
Test build #98384 has finished for PR 22847 at commit
retest this please
Test build #98386 has finished for PR 22847 at commit
thanks, merging to master!
…nction configurable

## What changes were proposed in this pull request?

As per the discussion in [apache#22823](https://github.com/apache/spark/pull/22823/files#r228400706), add a new configuration to make the split threshold for the code generated function configurable. When the generated Java function source code exceeds `spark.sql.codegen.methodSplitThreshold`, it will be split into multiple small functions.

## How was this patch tested?

manual tests

Closes apache#22847 from yucai/splitThreshold.

Authored-by: yucai <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
What changes were proposed in this pull request?
As per the discussion in #22823, add a new configuration to make the split threshold for the code generated function configurable.
When the generated Java function source code exceeds spark.sql.codegen.methodSplitThreshold, it will be split into multiple small functions.

How was this patch tested?
manual tests
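For completeness, a minimal usage sketch of the new knob (my example; the 2048 value is only the sweet spot from the wide-table benchmark above, not an official recommendation):

```scala
import org.apache.spark.sql.SparkSession

object MethodSplitThresholdDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("method-split-threshold-demo")
      .master("local[*]")
      .getOrCreate()

    // Internal config introduced by this PR; the default is 1024 characters of source.
    spark.conf.set("spark.sql.codegen.methodSplitThreshold", "2048")

    // A wide projection similar to the benchmark; the generated code is split
    // once a single function's source exceeds the configured threshold.
    val columns = (0 until 400).map(i => s"id as c$i")
    spark.range(1 << 16).selectExpr(columns: _*).foreach(identity(_))

    spark.stop()
  }
}
```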