[SPARK-25492][TEST] Refactor WideSchemaBenchmark to use main method #22501
Changes from 5 commits
f56b732
7b2cb55
e6f39f3
085226c
82e2367
64e5ede
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,117 +1,145 @@ | ||
| Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14 on Mac OS X 10.11.6 | ||
| Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz | ||
| ================================================================================================ | ||
| parsing large select expressions | ||
| ================================================================================================ | ||
|
|
||
| Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6 | ||
| Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz | ||
| parsing large select: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative | ||
| ------------------------------------------------------------------------------------------------ | ||
| 1 select expressions 2 / 4 0.0 2050147.0 1.0X | ||
| 100 select expressions 6 / 7 0.0 6123412.0 0.3X | ||
| 2500 select expressions 135 / 141 0.0 134623148.0 0.0X | ||
| 1 select expressions 2 / 4 0.0 1934953.0 1.0X | ||
| 100 select expressions 4 / 5 0.0 3659399.0 0.5X | ||
| 2500 select expressions 68 / 76 0.0 68278937.0 0.0X | ||
|
|
||
| Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14 on Mac OS X 10.11.6 | ||
| Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz | ||
|
|
||
| ================================================================================================ | ||
| many column field read and write | ||
| ================================================================================================ | ||
|
|
||
| Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6 | ||
| Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz | ||
| many column field r/w: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative | ||
| ------------------------------------------------------------------------------------------------ | ||
| 1 cols x 100000 rows (read in-mem) 16 / 18 6.3 158.6 1.0X | ||
| 1 cols x 100000 rows (exec in-mem) 17 / 19 6.0 166.7 1.0X | ||
| 1 cols x 100000 rows (read parquet) 24 / 26 4.3 235.1 0.7X | ||
| 1 cols x 100000 rows (write parquet) 81 / 85 1.2 811.3 0.2X | ||
| 100 cols x 1000 rows (read in-mem) 17 / 19 6.0 166.2 1.0X | ||
| 100 cols x 1000 rows (exec in-mem) 25 / 27 4.0 249.2 0.6X | ||
| 100 cols x 1000 rows (read parquet) 23 / 25 4.4 226.0 0.7X | ||
| 100 cols x 1000 rows (write parquet) 83 / 87 1.2 831.0 0.2X | ||
| 2500 cols x 40 rows (read in-mem) 132 / 137 0.8 1322.9 0.1X | ||
| 2500 cols x 40 rows (exec in-mem) 326 / 330 0.3 3260.6 0.0X | ||
| 2500 cols x 40 rows (read parquet) 831 / 839 0.1 8305.8 0.0X | ||
| 2500 cols x 40 rows (write parquet) 237 / 245 0.4 2372.6 0.1X | ||
|
|
||
| Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14 on Mac OS X 10.11.6 | ||
| Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz | ||
| 1 cols x 100000 rows (read in-mem) 22 / 25 4.6 219.4 1.0X | ||
| 1 cols x 100000 rows (exec in-mem) 22 / 28 4.5 223.8 1.0X | ||
| 1 cols x 100000 rows (read parquet) 45 / 49 2.2 449.6 0.5X | ||
| 1 cols x 100000 rows (write parquet) 204 / 223 0.5 2044.4 0.1X | ||
| 100 cols x 1000 rows (read in-mem) 26 / 28 3.9 255.8 0.9X | ||
| 100 cols x 1000 rows (exec in-mem) 32 / 35 3.1 319.3 0.7X | ||
| 100 cols x 1000 rows (read parquet) 45 / 52 2.2 445.9 0.5X | ||
| 100 cols x 1000 rows (write parquet) 275 / 536 0.4 2746.1 0.1X | ||
| 2500 cols x 40 rows (read in-mem) 261 / 434 0.4 2607.3 0.1X | ||
| 2500 cols x 40 rows (exec in-mem) 624 / 701 0.2 6240.5 0.0X | ||
| 2500 cols x 40 rows (read parquet) 196 / 301 0.5 1963.4 0.1X | ||
| 2500 cols x 40 rows (write parquet) 687 / 1049 0.1 6870.6 0.0X | ||
|
||
|
|
||
|
|
||
| ================================================================================================ | ||
| wide shallowly nested struct field read and write | ||
| ================================================================================================ | ||
|
|
||
| Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6 | ||
| Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz | ||
| wide shallowly nested struct field r/w: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative | ||
| ------------------------------------------------------------------------------------------------ | ||
| 1 wide x 100000 rows (read in-mem) 15 / 17 6.6 151.0 1.0X | ||
| 1 wide x 100000 rows (exec in-mem) 20 / 22 5.1 196.6 0.8X | ||
| 1 wide x 100000 rows (read parquet) 59 / 63 1.7 592.8 0.3X | ||
| 1 wide x 100000 rows (write parquet) 81 / 87 1.2 814.6 0.2X | ||
| 100 wide x 1000 rows (read in-mem) 21 / 25 4.8 208.7 0.7X | ||
| 100 wide x 1000 rows (exec in-mem) 72 / 81 1.4 718.5 0.2X | ||
| 100 wide x 1000 rows (read parquet) 75 / 85 1.3 752.6 0.2X | ||
| 100 wide x 1000 rows (write parquet) 88 / 95 1.1 876.7 0.2X | ||
| 2500 wide x 40 rows (read in-mem) 28 / 34 3.5 282.2 0.5X | ||
| 2500 wide x 40 rows (exec in-mem) 1269 / 1284 0.1 12688.1 0.0X | ||
| 2500 wide x 40 rows (read parquet) 549 / 578 0.2 5493.4 0.0X | ||
| 2500 wide x 40 rows (write parquet) 96 / 104 1.0 959.1 0.2X | ||
|
|
||
| Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14 on Mac OS X 10.11.6 | ||
| Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz | ||
| 1 wide x 100000 rows (read in-mem) 23 / 42 4.4 226.2 1.0X | ||
| 1 wide x 100000 rows (exec in-mem) 29 / 53 3.5 288.5 0.8X | ||
| 1 wide x 100000 rows (read parquet) 93 / 102 1.1 928.2 0.2X | ||
| 1 wide x 100000 rows (write parquet) 201 / 222 0.5 2009.6 0.1X | ||
| 100 wide x 1000 rows (read in-mem) 42 / 55 2.4 421.8 0.5X | ||
| 100 wide x 1000 rows (exec in-mem) 55 / 113 1.8 547.0 0.4X | ||
| 100 wide x 1000 rows (read parquet) 139 / 263 0.7 1390.6 0.2X | ||
| 100 wide x 1000 rows (write parquet) 245 / 338 0.4 2450.9 0.1X | ||
| 2500 wide x 40 rows (read in-mem) 51 / 72 2.0 511.7 0.4X | ||
| 2500 wide x 40 rows (exec in-mem) 265 / 303 0.4 2654.8 0.1X | ||
| 2500 wide x 40 rows (read parquet) 1285 / 1339 0.1 12845.1 0.0X | ||
| 2500 wide x 40 rows (write parquet) 238 / 262 0.4 2378.8 0.1X | ||
|
|
||
|
|
||
| ================================================================================================ | ||
| deeply nested struct field read and write | ||
| ================================================================================================ | ||
|
|
||
| Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6 | ||
| Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz | ||
| deeply nested struct field r/w: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative | ||
| ------------------------------------------------------------------------------------------------ | ||
| 1 deep x 100000 rows (read in-mem) 14 / 16 7.0 143.8 1.0X | ||
| 1 deep x 100000 rows (exec in-mem) 17 / 19 5.9 169.7 0.8X | ||
| 1 deep x 100000 rows (read parquet) 33 / 35 3.1 327.0 0.4X | ||
| 1 deep x 100000 rows (write parquet) 79 / 84 1.3 786.9 0.2X | ||
| 100 deep x 1000 rows (read in-mem) 21 / 24 4.7 211.3 0.7X | ||
| 100 deep x 1000 rows (exec in-mem) 221 / 235 0.5 2214.5 0.1X | ||
| 100 deep x 1000 rows (read parquet) 1928 / 1952 0.1 19277.1 0.0X | ||
| 100 deep x 1000 rows (write parquet) 91 / 96 1.1 909.5 0.2X | ||
| 250 deep x 400 rows (read in-mem) 57 / 61 1.8 567.1 0.3X | ||
| 250 deep x 400 rows (exec in-mem) 1329 / 1385 0.1 13291.8 0.0X | ||
| 250 deep x 400 rows (read parquet) 36563 / 36750 0.0 365630.2 0.0X | ||
| 250 deep x 400 rows (write parquet) 126 / 130 0.8 1262.0 0.1X | ||
|
|
||
| Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14 on Mac OS X 10.11.6 | ||
| Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz | ||
| 1 deep x 100000 rows (read in-mem) 20 / 24 5.1 197.9 1.0X | ||
| 1 deep x 100000 rows (exec in-mem) 23 / 28 4.4 227.8 0.9X | ||
| 1 deep x 100000 rows (read parquet) 50 / 58 2.0 500.1 0.4X | ||
| 1 deep x 100000 rows (write parquet) 195 / 219 0.5 1945.1 0.1X | ||
| 100 deep x 1000 rows (read in-mem) 39 / 57 2.5 393.1 0.5X | ||
| 100 deep x 1000 rows (exec in-mem) 480 / 556 0.2 4795.7 0.0X | ||
| 100 deep x 1000 rows (read parquet) 7943 / 7950 0.0 79427.5 0.0X | ||
| 100 deep x 1000 rows (write parquet) 227 / 245 0.4 2267.6 0.1X | ||
| 250 deep x 400 rows (read in-mem) 150 / 168 0.7 1500.1 0.1X | ||
| 250 deep x 400 rows (exec in-mem) 2925 / 2979 0.0 29247.3 0.0X | ||
| 250 deep x 400 rows (read parquet) 121815 / 128302 0.0 1218145.9 0.0X | ||
| 250 deep x 400 rows (write parquet) 335 / 362 0.3 3351.9 0.1X | ||
|
|
||
|
|
||
| ================================================================================================ | ||
| bushy struct field read and write | ||
| ================================================================================================ | ||
|
|
||
| Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6 | ||
| Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz | ||
| bushy struct field r/w: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative | ||
| ------------------------------------------------------------------------------------------------ | ||
| 1 x 1 deep x 100000 rows (read in-mem) 13 / 15 7.8 127.7 1.0X | ||
| 1 x 1 deep x 100000 rows (exec in-mem) 15 / 17 6.6 151.5 0.8X | ||
| 1 x 1 deep x 100000 rows (read parquet) 20 / 23 5.0 198.3 0.6X | ||
| 1 x 1 deep x 100000 rows (write parquet) 77 / 82 1.3 770.4 0.2X | ||
| 128 x 8 deep x 1000 rows (read in-mem) 12 / 14 8.2 122.5 1.0X | ||
| 128 x 8 deep x 1000 rows (exec in-mem) 124 / 140 0.8 1241.2 0.1X | ||
| 128 x 8 deep x 1000 rows (read parquet) 69 / 74 1.4 693.9 0.2X | ||
| 128 x 8 deep x 1000 rows (write parquet) 78 / 83 1.3 777.7 0.2X | ||
| 1024 x 11 deep x 100 rows (read in-mem) 25 / 29 4.1 246.1 0.5X | ||
| 1024 x 11 deep x 100 rows (exec in-mem) 1197 / 1223 0.1 11974.6 0.0X | ||
| 1024 x 11 deep x 100 rows (read parquet) 426 / 433 0.2 4263.7 0.0X | ||
| 1024 x 11 deep x 100 rows (write parquet) 91 / 98 1.1 913.5 0.1X | ||
|
|
||
| Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14 on Mac OS X 10.11.6 | ||
| Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz | ||
| 1 x 1 deep x 100000 rows (read in-mem) 23 / 27 4.4 229.0 1.0X | ||
| 1 x 1 deep x 100000 rows (exec in-mem) 25 / 30 4.0 249.3 0.9X | ||
| 1 x 1 deep x 100000 rows (read parquet) 35 / 40 2.8 351.1 0.7X | ||
| 1 x 1 deep x 100000 rows (write parquet) 193 / 213 0.5 1929.8 0.1X | ||
| 128 x 8 deep x 1000 rows (read in-mem) 18 / 21 5.6 179.2 1.3X | ||
| 128 x 8 deep x 1000 rows (exec in-mem) 54 / 61 1.8 544.4 0.4X | ||
| 128 x 8 deep x 1000 rows (read parquet) 195 / 212 0.5 1950.2 0.1X | ||
| 128 x 8 deep x 1000 rows (write parquet) 195 / 203 0.5 1952.2 0.1X | ||
| 1024 x 11 deep x 100 rows (read in-mem) 47 / 51 2.1 468.4 0.5X | ||
| 1024 x 11 deep x 100 rows (exec in-mem) 210 / 219 0.5 2102.0 0.1X | ||
| 1024 x 11 deep x 100 rows (read parquet) 1332 / 1367 0.1 13323.4 0.0X | ||
| 1024 x 11 deep x 100 rows (write parquet) 223 / 241 0.4 2230.3 0.1X | ||
|
|
||
|
|
||
| ================================================================================================ | ||
| wide array field read and write | ||
| ================================================================================================ | ||
|
|
||
| Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6 | ||
| Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz | ||
| wide array field r/w: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative | ||
| ------------------------------------------------------------------------------------------------ | ||
| 1 wide x 100000 rows (read in-mem) 14 / 16 7.0 143.2 1.0X | ||
| 1 wide x 100000 rows (exec in-mem) 17 / 19 5.9 170.9 0.8X | ||
| 1 wide x 100000 rows (read parquet) 43 / 46 2.3 434.1 0.3X | ||
| 1 wide x 100000 rows (write parquet) 78 / 83 1.3 777.6 0.2X | ||
| 100 wide x 1000 rows (read in-mem) 11 / 13 9.0 111.5 1.3X | ||
| 100 wide x 1000 rows (exec in-mem) 13 / 15 7.8 128.3 1.1X | ||
| 100 wide x 1000 rows (read parquet) 24 / 27 4.1 245.0 0.6X | ||
| 100 wide x 1000 rows (write parquet) 74 / 80 1.4 740.5 0.2X | ||
| 2500 wide x 40 rows (read in-mem) 11 / 13 9.1 109.5 1.3X | ||
| 2500 wide x 40 rows (exec in-mem) 13 / 15 7.7 129.4 1.1X | ||
| 2500 wide x 40 rows (read parquet) 24 / 26 4.1 241.3 0.6X | ||
| 2500 wide x 40 rows (write parquet) 75 / 81 1.3 751.8 0.2X | ||
|
|
||
| Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14 on Mac OS X 10.11.6 | ||
| Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz | ||
| 1 wide x 100000 rows (read in-mem) 19 / 21 5.3 188.9 1.0X | ||
| 1 wide x 100000 rows (exec in-mem) 23 / 29 4.3 232.0 0.8X | ||
| 1 wide x 100000 rows (read parquet) 59 / 65 1.7 588.8 0.3X | ||
| 1 wide x 100000 rows (write parquet) 200 / 217 0.5 1998.0 0.1X | ||
| 100 wide x 1000 rows (read in-mem) 16 / 18 6.2 162.5 1.2X | ||
| 100 wide x 1000 rows (exec in-mem) 19 / 21 5.4 185.2 1.0X | ||
| 100 wide x 1000 rows (read parquet) 42 / 45 2.4 415.6 0.5X | ||
| 100 wide x 1000 rows (write parquet) 193 / 216 0.5 1928.5 0.1X | ||
| 2500 wide x 40 rows (read in-mem) 16 / 19 6.2 162.4 1.2X | ||
| 2500 wide x 40 rows (exec in-mem) 18 / 21 5.4 184.0 1.0X | ||
| 2500 wide x 40 rows (read parquet) 40 / 44 2.5 398.7 0.5X | ||
| 2500 wide x 40 rows (write parquet) 194 / 211 0.5 1943.6 0.1X | ||
|
|
||
|
|
||
| ================================================================================================ | ||
| wide map field read and write | ||
| ================================================================================================ | ||
|
|
||
| Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6 | ||
| Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz | ||
| wide map field r/w: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative | ||
| ------------------------------------------------------------------------------------------------ | ||
| 1 wide x 100000 rows (read in-mem) 16 / 18 6.2 162.6 1.0X | ||
| 1 wide x 100000 rows (exec in-mem) 21 / 23 4.8 208.2 0.8X | ||
| 1 wide x 100000 rows (read parquet) 54 / 59 1.8 543.6 0.3X | ||
| 1 wide x 100000 rows (write parquet) 80 / 86 1.2 804.5 0.2X | ||
| 100 wide x 1000 rows (read in-mem) 11 / 13 8.7 114.5 1.4X | ||
| 100 wide x 1000 rows (exec in-mem) 14 / 16 7.0 143.5 1.1X | ||
| 100 wide x 1000 rows (read parquet) 30 / 32 3.3 300.4 0.5X | ||
| 100 wide x 1000 rows (write parquet) 75 / 80 1.3 749.9 0.2X | ||
| 2500 wide x 40 rows (read in-mem) 13 / 15 7.8 128.1 1.3X | ||
| 2500 wide x 40 rows (exec in-mem) 15 / 18 6.5 153.6 1.1X | ||
| 2500 wide x 40 rows (read parquet) 30 / 33 3.3 304.4 0.5X | ||
| 2500 wide x 40 rows (write parquet) 77 / 83 1.3 768.5 0.2X | ||
| 1 wide x 100000 rows (read in-mem) 17 / 20 6.0 165.5 1.0X | ||
| 1 wide x 100000 rows (exec in-mem) 21 / 25 4.7 214.3 0.8X | ||
| 1 wide x 100000 rows (read parquet) 79 / 105 1.3 785.8 0.2X | ||
| 1 wide x 100000 rows (write parquet) 196 / 240 0.5 1957.0 0.1X | ||
| 100 wide x 1000 rows (read in-mem) 12 / 13 8.6 115.7 1.4X | ||
| 100 wide x 1000 rows (exec in-mem) 15 / 17 6.8 147.8 1.1X | ||
| 100 wide x 1000 rows (read parquet) 46 / 52 2.2 460.9 0.4X | ||
| 100 wide x 1000 rows (write parquet) 184 / 202 0.5 1843.1 0.1X | ||
| 2500 wide x 40 rows (read in-mem) 13 / 15 7.4 134.7 1.2X | ||
| 2500 wide x 40 rows (exec in-mem) 17 / 19 6.0 167.5 1.0X | ||
| 2500 wide x 40 rows (read parquet) 46 / 51 2.2 461.0 0.4X | ||
| 2500 wide x 40 rows (write parquet) 189 / 206 0.5 1887.0 0.1X | ||
|
|
||
|
|
||
Given the difference in the Relative ratio, this might be a slight regression in the Parquet writer compared with Spark 2.1.0 (SPARK-17335).
cc @cloud-fan, @gatorsmile, and @rdblue
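For reference, the Relative column appears to be the first row's per-row time divided by each row's per-row time; that is an assumption inferred from the figures in the tables, not something stated in this thread. A minimal Python sketch using Per Row(ns) values copied from the "1 cols x 100000 rows" rows of the two "many column field r/w" result sets in this diff (the `relative` helper and the set labels are hypothetical names for illustration):

```python
def relative(baseline_ns: float, case_ns: float) -> float:
    """Baseline per-row time divided by this case's per-row time (higher is faster)."""
    return baseline_ns / case_ns


# Per Row(ns) for "1 cols x 100000 rows", copied from the two result sets in this diff.
# The sets are labeled by their baseline value rather than by machine or Spark version.
result_sets = {
    "set with 158.6 ns baseline": {"read in-mem": 158.6, "write parquet": 811.3},
    "set with 219.4 ns baseline": {"read in-mem": 219.4, "write parquet": 2044.4},
}

# Assumed formula: Relative = baseline Per Row(ns) / case Per Row(ns), shown to one decimal.
ratios = {
    name: round(relative(rows["read in-mem"], rows["write parquet"]), 1)
    for name, rows in result_sets.items()
}

for name, ratio in ratios.items():
    print(f"{name}: write parquet = {ratio}X")
```

Under that assumption the write-parquet ratio is 0.2X in one set and 0.1X in the other, which is the gap this comment refers to.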
I have no idea how this happens. Can you create a JIRA ticket to investigate this regression?
This may be a Parquet issue. I found that binary write performance is slightly worse after upgrading to Parquet 1.10.0: apache/parquet-java#505. I will verify it later.
The following EC2 result shows a ratio consistent with Spark 2.1.0. The results on Mac seem to be unstable for some unknown reason, as in #22501 (comment).
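One rough way to gauge how stable a run was from tables like the ones in this diff is the Avg/Best time ratio: values near 1.0 suggest the iterations were consistent, while larger values suggest noise. This is an illustrative heuristic only, not something the benchmark reports, and `spread` is a hypothetical helper. The Best/Avg pairs below are copied from the "many column field r/w" rows in this diff:

```python
def spread(best_ms: float, avg_ms: float) -> float:
    """Average time divided by best time; 1.0 means every iteration matched the best."""
    return avg_ms / best_ms


# (Best, Avg) in ms, copied from rows of the "many column field r/w" tables in this diff.
samples = {
    "100 cols x 1000 rows (write parquet), 83 / 87": (83, 87),
    "100 cols x 1000 rows (write parquet), 275 / 536": (275, 536),
    "2500 cols x 40 rows (read in-mem), 261 / 434": (261, 434),
}

spreads = {label: round(spread(best, avg), 2) for label, (best, avg) in samples.items()}

for label, value in spreads.items():
    print(f"{label}: avg/best = {value}")
```

In these samples, the rows with Avg far above Best (275 / 536 and 261 / 434) come from the same result set, which fits the observation that one set of numbers was noisy.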
@dongjoon-hyun, so you are saying that it doesn't appear that there is a performance regression, right?
For this part, yes, @rdblue. I guess so.
After merging the EC2 result into @wangyum's PR, I'll compare the numbers one by one again.