216 changes: 122 additions & 94 deletions sql/core/benchmarks/WideSchemaBenchmark-results.txt
@@ -1,117 +1,145 @@
Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14 on Mac OS X 10.11.6
Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz
================================================================================================
parsing large select expressions
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
parsing large select: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
1 select expressions 2 / 4 0.0 2050147.0 1.0X
100 select expressions 6 / 7 0.0 6123412.0 0.3X
2500 select expressions 135 / 141 0.0 134623148.0 0.0X
1 select expressions 2 / 4 0.0 1934953.0 1.0X
100 select expressions 4 / 5 0.0 3659399.0 0.5X
2500 select expressions 68 / 76 0.0 68278937.0 0.0X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14 on Mac OS X 10.11.6
Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz

================================================================================================
many column field read and write
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
many column field r/w: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
1 cols x 100000 rows (read in-mem) 16 / 18 6.3 158.6 1.0X
1 cols x 100000 rows (exec in-mem) 17 / 19 6.0 166.7 1.0X
1 cols x 100000 rows (read parquet) 24 / 26 4.3 235.1 0.7X
1 cols x 100000 rows (write parquet) 81 / 85 1.2 811.3 0.2X
100 cols x 1000 rows (read in-mem) 17 / 19 6.0 166.2 1.0X
100 cols x 1000 rows (exec in-mem) 25 / 27 4.0 249.2 0.6X
100 cols x 1000 rows (read parquet) 23 / 25 4.4 226.0 0.7X
100 cols x 1000 rows (write parquet) 83 / 87 1.2 831.0 0.2X
2500 cols x 40 rows (read in-mem) 132 / 137 0.8 1322.9 0.1X
2500 cols x 40 rows (exec in-mem) 326 / 330 0.3 3260.6 0.0X
2500 cols x 40 rows (read parquet) 831 / 839 0.1 8305.8 0.0X
2500 cols x 40 rows (write parquet) 237 / 245 0.4 2372.6 0.1X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14 on Mac OS X 10.11.6
Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz
1 cols x 100000 rows (read in-mem) 22 / 25 4.6 219.4 1.0X
1 cols x 100000 rows (exec in-mem) 22 / 28 4.5 223.8 1.0X
1 cols x 100000 rows (read parquet) 45 / 49 2.2 449.6 0.5X
1 cols x 100000 rows (write parquet) 204 / 223 0.5 2044.4 0.1X
Member

@dongjoon-hyun dongjoon-hyun Oct 18, 2018

Given the difference in the ratio, this might be a slight regression in the Parquet writer since Spark 2.1.0 (SPARK-17335).

cc @cloud-fan, @gatorsmile, and @rdblue

Contributor

I have no idea how this happens. Can you create a JIRA ticket to investigate this regression?

Member Author

This may be a Parquet issue. I found that binary write performance is slightly worse after upgrading to Parquet 1.10.0: apache/parquet-java#505. I will verify it later.

Member

The following EC2 result shows a ratio consistent with Spark 2.1.0. The result on Mac seems to be unstable for some unknown reason, as in #22501 (comment).

1 cols x 100000 rows (read parquet)             61 /   70          1.6         610.2       0.6X
1 cols x 100000 rows (write parquet)           209 /  233          0.5        2086.1       0.2X

Contributor

@dongjoon-hyun, so you are saying that it doesn't appear that there is a performance regression, right?

Member

For this part, right, @rdblue. I guess so.
After merging the EC2 result into @wangyum's PR, I'll compare the numbers one by one again.
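
For reference, the ratio discussed in this thread can be reproduced from the `Per Row(ns)` column. This is a minimal sketch, assuming that `Relative` is the per-row time of the first row in a table (here `read in-mem`) divided by the per-row time of the given row; the printed values in the tables are consistent with that reading. The function name `relative` is hypothetical.

```python
# Per Row(ns) values copied from the "1 cols x 100000 rows" rows in this diff.
# Assumption: Relative = baseline per-row time / case per-row time,
# where the baseline is the first row of the table (read in-mem).


def relative(baseline_ns: float, case_ns: float) -> float:
    """Return the speed of a case relative to the baseline (1.0 = same speed)."""
    return baseline_ns / case_ns


# (read in-mem, write parquet) from the rows that report 0.2X
ratio_02 = relative(158.6, 811.3)
# (read in-mem, write parquet) from the rows that report 0.1X
ratio_01 = relative(219.4, 2044.4)

print(f"{ratio_02:.2f}X vs {ratio_01:.2f}X")
```

Under this assumption the two write-parquet ratios differ by roughly a factor of two, which is the difference the first comment points at.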

100 cols x 1000 rows (read in-mem) 26 / 28 3.9 255.8 0.9X
100 cols x 1000 rows (exec in-mem) 32 / 35 3.1 319.3 0.7X
100 cols x 1000 rows (read parquet) 45 / 52 2.2 445.9 0.5X
100 cols x 1000 rows (write parquet) 275 / 536 0.4 2746.1 0.1X
2500 cols x 40 rows (read in-mem) 261 / 434 0.4 2607.3 0.1X
2500 cols x 40 rows (exec in-mem) 624 / 701 0.2 6240.5 0.0X
2500 cols x 40 rows (read parquet) 196 / 301 0.5 1963.4 0.1X
2500 cols x 40 rows (write parquet) 687 / 1049 0.1 6870.6 0.0X
Member

@dongjoon-hyun dongjoon-hyun Oct 18, 2018

The gap between best and average is too high in line 32 and line 33.
I'll try to run this on EC2, too.

Member

FYI, this large gap disappeared in the EC2 result.
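
One way to spot such gaps is to compute the Avg/Best ratio for each row. The following is a rough sketch, not part of the benchmark itself: the sample rows are copied from the diff above, while the regular expression, the helper `noisy_rows`, and the 1.5 cutoff are all assumptions chosen for illustration.

```python
import re

# A few result rows copied from the diff above.
ROWS = [
    "100 cols x 1000 rows (read parquet) 45 / 52 2.2 445.9 0.5X",
    "100 cols x 1000 rows (write parquet) 275 / 536 0.4 2746.1 0.1X",
    "2500 cols x 40 rows (read in-mem) 261 / 434 0.4 2607.3 0.1X",
    "2500 cols x 40 rows (exec in-mem) 624 / 701 0.2 6240.5 0.0X",
]

# Matches "<name> <best> / <avg> <rate> <per-row> <relative>X".
ROW_RE = re.compile(r"^(.*?)\s+(\d+)\s*/\s*(\d+)\s+[\d.]+\s+[\d.]+\s+[\d.]+X$")

THRESHOLD = 1.5  # hypothetical cutoff for "the gap is too high"


def noisy_rows(rows, threshold=THRESHOLD):
    """Return (name, avg/best) for rows whose average is far above the best run."""
    flagged = []
    for row in rows:
        match = ROW_RE.match(row)
        if match is None:
            continue  # not a result row (header, separator, blank line)
        name = match.group(1)
        best, avg = int(match.group(2)), int(match.group(3))
        ratio = avg / best
        if ratio > threshold:
            flagged.append((name, round(ratio, 2)))
    return flagged


for name, ratio in noisy_rows(ROWS):
    print(f"{name}: avg/best = {ratio}")
```

With the sample rows, only the `275 / 536` and `261 / 434` rows exceed the cutoff; the other two stay close to 1.1.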



================================================================================================
wide shallowly nested struct field read and write
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
wide shallowly nested struct field r/w: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
1 wide x 100000 rows (read in-mem) 15 / 17 6.6 151.0 1.0X
1 wide x 100000 rows (exec in-mem) 20 / 22 5.1 196.6 0.8X
1 wide x 100000 rows (read parquet) 59 / 63 1.7 592.8 0.3X
1 wide x 100000 rows (write parquet) 81 / 87 1.2 814.6 0.2X
100 wide x 1000 rows (read in-mem) 21 / 25 4.8 208.7 0.7X
100 wide x 1000 rows (exec in-mem) 72 / 81 1.4 718.5 0.2X
100 wide x 1000 rows (read parquet) 75 / 85 1.3 752.6 0.2X
100 wide x 1000 rows (write parquet) 88 / 95 1.1 876.7 0.2X
2500 wide x 40 rows (read in-mem) 28 / 34 3.5 282.2 0.5X
2500 wide x 40 rows (exec in-mem) 1269 / 1284 0.1 12688.1 0.0X
2500 wide x 40 rows (read parquet) 549 / 578 0.2 5493.4 0.0X
2500 wide x 40 rows (write parquet) 96 / 104 1.0 959.1 0.2X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14 on Mac OS X 10.11.6
Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz
1 wide x 100000 rows (read in-mem) 23 / 42 4.4 226.2 1.0X
1 wide x 100000 rows (exec in-mem) 29 / 53 3.5 288.5 0.8X
1 wide x 100000 rows (read parquet) 93 / 102 1.1 928.2 0.2X
1 wide x 100000 rows (write parquet) 201 / 222 0.5 2009.6 0.1X
100 wide x 1000 rows (read in-mem) 42 / 55 2.4 421.8 0.5X
100 wide x 1000 rows (exec in-mem) 55 / 113 1.8 547.0 0.4X
100 wide x 1000 rows (read parquet) 139 / 263 0.7 1390.6 0.2X
100 wide x 1000 rows (write parquet) 245 / 338 0.4 2450.9 0.1X
2500 wide x 40 rows (read in-mem) 51 / 72 2.0 511.7 0.4X
2500 wide x 40 rows (exec in-mem) 265 / 303 0.4 2654.8 0.1X
2500 wide x 40 rows (read parquet) 1285 / 1339 0.1 12845.1 0.0X
2500 wide x 40 rows (write parquet) 238 / 262 0.4 2378.8 0.1X


================================================================================================
deeply nested struct field read and write
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
deeply nested struct field r/w: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
1 deep x 100000 rows (read in-mem) 14 / 16 7.0 143.8 1.0X
1 deep x 100000 rows (exec in-mem) 17 / 19 5.9 169.7 0.8X
1 deep x 100000 rows (read parquet) 33 / 35 3.1 327.0 0.4X
1 deep x 100000 rows (write parquet) 79 / 84 1.3 786.9 0.2X
100 deep x 1000 rows (read in-mem) 21 / 24 4.7 211.3 0.7X
100 deep x 1000 rows (exec in-mem) 221 / 235 0.5 2214.5 0.1X
100 deep x 1000 rows (read parquet) 1928 / 1952 0.1 19277.1 0.0X
100 deep x 1000 rows (write parquet) 91 / 96 1.1 909.5 0.2X
250 deep x 400 rows (read in-mem) 57 / 61 1.8 567.1 0.3X
250 deep x 400 rows (exec in-mem) 1329 / 1385 0.1 13291.8 0.0X
250 deep x 400 rows (read parquet) 36563 / 36750 0.0 365630.2 0.0X
250 deep x 400 rows (write parquet) 126 / 130 0.8 1262.0 0.1X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14 on Mac OS X 10.11.6
Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz
1 deep x 100000 rows (read in-mem) 20 / 24 5.1 197.9 1.0X
1 deep x 100000 rows (exec in-mem) 23 / 28 4.4 227.8 0.9X
1 deep x 100000 rows (read parquet) 50 / 58 2.0 500.1 0.4X
1 deep x 100000 rows (write parquet) 195 / 219 0.5 1945.1 0.1X
100 deep x 1000 rows (read in-mem) 39 / 57 2.5 393.1 0.5X
100 deep x 1000 rows (exec in-mem) 480 / 556 0.2 4795.7 0.0X
100 deep x 1000 rows (read parquet) 7943 / 7950 0.0 79427.5 0.0X
Member

Hmm, @wangyum, is this 4 times slower than before?

cc @dbtsai
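
As a quick check of the factor, the best times of the two `read parquet` rows for the deep schemas can be divided directly. This is a small sketch that assumes the rows with the smaller times are the "before" values and the rows with the larger times are the ones under review; the helper name `slowdown` is hypothetical.

```python
# Best Time(ms) values copied from the "deeply nested struct" rows in this diff.
# Assumption: the first value of each pair is the earlier result,
# the second is the result under review.
PAIRS = {
    "100 deep x 1000 rows (read parquet)": (1928, 7943),
    "250 deep x 400 rows (read parquet)": (36563, 121815),
}


def slowdown(before_ms: float, after_ms: float) -> float:
    """Return how many times slower the later run is than the earlier one."""
    return after_ms / before_ms


for name, (before_ms, after_ms) in PAIRS.items():
    print(f"{name}: {slowdown(before_ms, after_ms):.2f}x slower")
```

Under that assumption the factor is about 4.1x for the 100-deep case and about 3.3x for the 250-deep case.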

100 deep x 1000 rows (write parquet) 227 / 245 0.4 2267.6 0.1X
250 deep x 400 rows (read in-mem) 150 / 168 0.7 1500.1 0.1X
250 deep x 400 rows (exec in-mem) 2925 / 2979 0.0 29247.3 0.0X
250 deep x 400 rows (read parquet) 121815 / 128302 0.0 1218145.9 0.0X
250 deep x 400 rows (write parquet) 335 / 362 0.3 3351.9 0.1X


================================================================================================
bushy struct field read and write
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
bushy struct field r/w: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
1 x 1 deep x 100000 rows (read in-mem) 13 / 15 7.8 127.7 1.0X
1 x 1 deep x 100000 rows (exec in-mem) 15 / 17 6.6 151.5 0.8X
1 x 1 deep x 100000 rows (read parquet) 20 / 23 5.0 198.3 0.6X
1 x 1 deep x 100000 rows (write parquet) 77 / 82 1.3 770.4 0.2X
128 x 8 deep x 1000 rows (read in-mem) 12 / 14 8.2 122.5 1.0X
128 x 8 deep x 1000 rows (exec in-mem) 124 / 140 0.8 1241.2 0.1X
128 x 8 deep x 1000 rows (read parquet) 69 / 74 1.4 693.9 0.2X
128 x 8 deep x 1000 rows (write parquet) 78 / 83 1.3 777.7 0.2X
1024 x 11 deep x 100 rows (read in-mem) 25 / 29 4.1 246.1 0.5X
1024 x 11 deep x 100 rows (exec in-mem) 1197 / 1223 0.1 11974.6 0.0X
1024 x 11 deep x 100 rows (read parquet) 426 / 433 0.2 4263.7 0.0X
1024 x 11 deep x 100 rows (write parquet) 91 / 98 1.1 913.5 0.1X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14 on Mac OS X 10.11.6
Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz
1 x 1 deep x 100000 rows (read in-mem) 23 / 27 4.4 229.0 1.0X
1 x 1 deep x 100000 rows (exec in-mem) 25 / 30 4.0 249.3 0.9X
1 x 1 deep x 100000 rows (read parquet) 35 / 40 2.8 351.1 0.7X
1 x 1 deep x 100000 rows (write parquet) 193 / 213 0.5 1929.8 0.1X
128 x 8 deep x 1000 rows (read in-mem) 18 / 21 5.6 179.2 1.3X
128 x 8 deep x 1000 rows (exec in-mem) 54 / 61 1.8 544.4 0.4X
128 x 8 deep x 1000 rows (read parquet) 195 / 212 0.5 1950.2 0.1X
128 x 8 deep x 1000 rows (write parquet) 195 / 203 0.5 1952.2 0.1X
1024 x 11 deep x 100 rows (read in-mem) 47 / 51 2.1 468.4 0.5X
1024 x 11 deep x 100 rows (exec in-mem) 210 / 219 0.5 2102.0 0.1X
1024 x 11 deep x 100 rows (read parquet) 1332 / 1367 0.1 13323.4 0.0X
1024 x 11 deep x 100 rows (write parquet) 223 / 241 0.4 2230.3 0.1X


================================================================================================
wide array field read and write
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
wide array field r/w: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
1 wide x 100000 rows (read in-mem) 14 / 16 7.0 143.2 1.0X
1 wide x 100000 rows (exec in-mem) 17 / 19 5.9 170.9 0.8X
1 wide x 100000 rows (read parquet) 43 / 46 2.3 434.1 0.3X
1 wide x 100000 rows (write parquet) 78 / 83 1.3 777.6 0.2X
100 wide x 1000 rows (read in-mem) 11 / 13 9.0 111.5 1.3X
100 wide x 1000 rows (exec in-mem) 13 / 15 7.8 128.3 1.1X
100 wide x 1000 rows (read parquet) 24 / 27 4.1 245.0 0.6X
100 wide x 1000 rows (write parquet) 74 / 80 1.4 740.5 0.2X
2500 wide x 40 rows (read in-mem) 11 / 13 9.1 109.5 1.3X
2500 wide x 40 rows (exec in-mem) 13 / 15 7.7 129.4 1.1X
2500 wide x 40 rows (read parquet) 24 / 26 4.1 241.3 0.6X
2500 wide x 40 rows (write parquet) 75 / 81 1.3 751.8 0.2X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14 on Mac OS X 10.11.6
Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz
1 wide x 100000 rows (read in-mem) 19 / 21 5.3 188.9 1.0X
1 wide x 100000 rows (exec in-mem) 23 / 29 4.3 232.0 0.8X
1 wide x 100000 rows (read parquet) 59 / 65 1.7 588.8 0.3X
1 wide x 100000 rows (write parquet) 200 / 217 0.5 1998.0 0.1X
100 wide x 1000 rows (read in-mem) 16 / 18 6.2 162.5 1.2X
100 wide x 1000 rows (exec in-mem) 19 / 21 5.4 185.2 1.0X
100 wide x 1000 rows (read parquet) 42 / 45 2.4 415.6 0.5X
100 wide x 1000 rows (write parquet) 193 / 216 0.5 1928.5 0.1X
2500 wide x 40 rows (read in-mem) 16 / 19 6.2 162.4 1.2X
2500 wide x 40 rows (exec in-mem) 18 / 21 5.4 184.0 1.0X
2500 wide x 40 rows (read parquet) 40 / 44 2.5 398.7 0.5X
2500 wide x 40 rows (write parquet) 194 / 211 0.5 1943.6 0.1X


================================================================================================
wide map field read and write
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
wide map field r/w: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
1 wide x 100000 rows (read in-mem) 16 / 18 6.2 162.6 1.0X
1 wide x 100000 rows (exec in-mem) 21 / 23 4.8 208.2 0.8X
1 wide x 100000 rows (read parquet) 54 / 59 1.8 543.6 0.3X
1 wide x 100000 rows (write parquet) 80 / 86 1.2 804.5 0.2X
100 wide x 1000 rows (read in-mem) 11 / 13 8.7 114.5 1.4X
100 wide x 1000 rows (exec in-mem) 14 / 16 7.0 143.5 1.1X
100 wide x 1000 rows (read parquet) 30 / 32 3.3 300.4 0.5X
100 wide x 1000 rows (write parquet) 75 / 80 1.3 749.9 0.2X
2500 wide x 40 rows (read in-mem) 13 / 15 7.8 128.1 1.3X
2500 wide x 40 rows (exec in-mem) 15 / 18 6.5 153.6 1.1X
2500 wide x 40 rows (read parquet) 30 / 33 3.3 304.4 0.5X
2500 wide x 40 rows (write parquet) 77 / 83 1.3 768.5 0.2X
1 wide x 100000 rows (read in-mem) 17 / 20 6.0 165.5 1.0X
1 wide x 100000 rows (exec in-mem) 21 / 25 4.7 214.3 0.8X
1 wide x 100000 rows (read parquet) 79 / 105 1.3 785.8 0.2X
1 wide x 100000 rows (write parquet) 196 / 240 0.5 1957.0 0.1X
100 wide x 1000 rows (read in-mem) 12 / 13 8.6 115.7 1.4X
100 wide x 1000 rows (exec in-mem) 15 / 17 6.8 147.8 1.1X
100 wide x 1000 rows (read parquet) 46 / 52 2.2 460.9 0.4X
100 wide x 1000 rows (write parquet) 184 / 202 0.5 1843.1 0.1X
2500 wide x 40 rows (read in-mem) 13 / 15 7.4 134.7 1.2X
2500 wide x 40 rows (exec in-mem) 17 / 19 6.0 167.5 1.0X
2500 wide x 40 rows (read parquet) 46 / 51 2.2 461.0 0.4X
2500 wide x 40 rows (write parquet) 189 / 206 0.5 1887.0 0.1X

