Commit de7ba3b
[SPARK-53631][CORE] Optimize memory and perf on SHS bootstrap
### What changes were proposed in this pull request?
Core ideas:
1. Change the log replay thread pool to have a bounded queue, and block task submission when the queue is full.
Currently, the log replay thread pool uses an unbounded queue. When there are a large number (e.g., millions) of event logs under `spark.history.fs.logDirectory`, all tasks are queued in the thread pool queue without blocking the scanning thread, and in the next schedule they are enqueued again ...
https://stackoverflow.com/questions/4521983/java-executorservice-that-blocks-on-submission-after-a-certain-queue-size
2. Move log compaction to a dedicated thread pool.
Replaying and compaction are different types of workloads; isolating them from each other could improve resource utilization.
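
To illustrate the first idea, here is a minimal sketch of blocking submission, along the lines of the Stack Overflow thread linked above. It is not the code added by this commit: the class name `BoundedSubmitter`, its methods, and the sizing rule (threads plus queue capacity) are hypothetical and chosen only for illustration. A semaphore caps the number of in-flight tasks, so the submitting thread waits instead of piling tasks onto an unbounded queue.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical sketch (not the class from this commit): limits the number of
 * in-flight tasks (running + queued) so that submit() blocks the caller once
 * the limit is reached.
 */
final class BoundedSubmitter {
  private final ExecutorService delegate;
  private final Semaphore permits;

  BoundedSubmitter(int threads, int queueCapacity) {
    // The delegate's own queue is unbounded, but the semaphore guarantees it
    // never holds more than queueCapacity waiting tasks.
    this.delegate = Executors.newFixedThreadPool(threads);
    this.permits = new Semaphore(threads + queueCapacity);
  }

  /** Blocks the calling thread until a slot is free, then hands the task over. */
  Future<?> submit(Runnable task) throws InterruptedException {
    permits.acquire();
    try {
      return delegate.submit(() -> {
        try {
          task.run();
        } finally {
          // Free the slot whether the task succeeded or failed.
          permits.release();
        }
      });
    } catch (RuntimeException e) {
      // The task was rejected (e.g. pool already shut down): give the slot back.
      permits.release();
      throw e;
    }
  }

  /** Number of free slots; equals threads + queueCapacity when idle. */
  int availablePermits() {
    return permits.availablePermits();
  }

  /** Stops accepting tasks and waits for the submitted ones to finish. */
  void shutdownAndWait() throws InterruptedException {
    delegate.shutdown();
    delegate.awaitTermination(1, TimeUnit.MINUTES);
  }
}
```

With something like `new BoundedSubmitter(64, 64)`, a scanning loop that calls `submit` for every event log would pause once 128 tasks are pending, which keeps the backlog (and its memory footprint) bounded. Following the second idea, compaction work would go to a separate, smaller instance so that it does not compete with replay tasks for the same threads.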
### Why are the changes needed?
Improve performance and reduce memory usage when the SHS bootstraps with an empty KV cache and there are a large number of event logs.
### Does this PR introduce _any_ user-facing change?
No functionality changes, but it introduces a new config, `spark.history.fs.numCompactThreads`.
### How was this patch tested?
Tested on an internal cluster, starting SHS with an empty `spark.history.store.path` and ~650k event logs under `spark.history.fs.logDirectory`. The related configs are:
```
spark.history.fs.cleaner.maxNum 650000
spark.history.fs.logDirectory hdfs://foo/spark2-history
spark.history.fs.update.interval 5s
spark.history.provider org.apache.spark.deploy.history.FsHistoryProvider
spark.history.store.maxDiskUsage 100GB
spark.history.store.path /foo/bar/historyStore
spark.history.fs.numReplayThreads 64
spark.history.fs.numCompactThreads 4
spark.history.store.hybridStore.enabled true
spark.history.store.hybridStore.maxMemoryUsage 16g
spark.history.store.hybridStore.diskBackend ROCKSDB
```
- `spark.history.store.path` is configured to an HDD path
- we disable `spark.eventLog.rolling.enabled`, so the compaction thread pool (`spark.history.fs.numCompactThreads`) has no heavy work

Bootstrap is much faster than before, and metrics show better CPU utilization and lower memory usage.
<img width="2546" height="480" alt="bf51f797a11527ce82036669f96cf50b" src="https://github.com/user-attachments/assets/4db521b0-cf1c-4b93-a06d-27fdaf1ccec4" />
(before vs. after, the 3rd figure is "memory used")
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #52382 from pan3793/SPARK-53631.
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
1 parent 06f7ad2 commit de7ba3b
7 files changed
Lines changed: 218 additions & 11 deletions
File tree
- common/utils-java/src/main/java/org/apache/spark/internal
- core/src
  - main/scala/org/apache/spark
    - deploy/history
    - internal/config
    - util
  - test/scala/org/apache/spark/util
- docs
Lines changed: 1 addition & 0 deletions
Lines changed: 28 additions & 11 deletions
Lines changed: 6 additions & 0 deletions
Lines changed: 124 additions & 0 deletions
Lines changed: 17 additions & 0 deletions
Lines changed: 34 additions & 0 deletions