-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-15391] [SQL] manage the temporary memory of timsort #13318
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
cc @ericl |
|
Can we add a unit test for this behavior? |
|
FYI "This PR also change the loadFactor of BytesToBytesMap to 0.5 (it was 0.75)" this is a pretty low load factor. |
|
It was 0.70 (corrected), it's 30% lower after this patch. For the simplest aggregate (one integer key and one integer value), the key-value pair need 40 bytes, the pointer array need 22+ bytes, becomes 32 bytes after the patch, it will need 15% more memory. Before this patch, TimSort still allocate some memory from heap (could OOM), so the difference will be even smaller. |
| newArray.getBaseObject(), | ||
| newArray.getBaseOffset(), | ||
| array.size() * (8 / memoryAllocationFactor) | ||
| pos * 8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should keep the memoryAllocationFactor variable, to make it more clear we are overallocating on purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on that.
Is there any particular reason for removing the memoryAllocationFactor? Without this we might highly underutilize memory for TimSort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory, memoryAllocationFactor should be 2 for radix sort and 1.5 for tim sort. For simplicity, I'd like to trade a little bit memory efficiency for simplicity. memoryAllocationFactor is only used in hasSpaceForAnotherRecord(), if you feel strong on this tradeoff, I could update hasSpaceForAnotherRecord() to use 1.5 for timsort.
For here, we don't need memoryAllocationFactor actually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also use 1/1.5 in BytesToBytesMap as load factor when radix sort could not be used? I feel that's too complicated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, it will be better to use the memoryAllocationFactor. Because otherwise, we might see unnecessary spilling due to memory under utilization. If we change the memory allocation factor from 2 to 1.5 for TimSort, that is a significant amount of memory considering the size of the pointer array can be in the order of GBs for large workload.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
|
Test build #59350 has finished for PR 13318 at commit
|
|
Thank you for making this fix 👍 |
|
Test build #59405 has finished for PR 13318 at commit
|
|
Test build #59418 has finished for PR 13318 at commit
|
|
Test build #3021 has finished for PR 13318 at commit
|
|
Test build #3022 has finished for PR 13318 at commit
|
|
Test build #59432 has finished for PR 13318 at commit
|
| * How many records could be inserted, because part of the array should be left for sorting. | ||
| */ | ||
| private int pos = 0; | ||
| private int capacity = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: usableCapacity?
|
Test build #59465 has finished for PR 13318 at commit
|
|
Test build #59670 has finished for PR 13318 at commit
|
|
@ericl Comments addressed, could you take another look? |
|
|
||
| private int calcCapacity() { | ||
| // Radix sort requires same amount of used memory as buffer, Tim sort requires | ||
| // half of the used memory as buffer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we also update the comments above while instantiating array to better indicate the contents of the array (and talk about the additional buffer that's needed for sorting)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by the way, out of curiosity, what's the worst case scenario in TimSort that requires 0.5x more buffer space?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sameeragarwal I think you already had a test case for the worst case (there are two ordered part in the array, the shortest part will be copied into a temporary buffer, os it need 0.5 buffer space), it's doced in TimSort.
|
Looks great overall, just few questions and a couple of minor comments. |
|
LGTM |
|
Test build #59949 has finished for PR 13318 at commit
|
## What changes were proposed in this pull request? Currently, the memory for temporary buffer used by TimSort is always allocated as on-heap without bookkeeping, it could cause OOM both in on-heap and off-heap mode. This PR will try to manage that by preallocate it together with the pointer array, same with RadixSort. It both works for on-heap and off-heap mode. This PR also change the loadFactor of BytesToBytesMap to 0.5 (it was 0.70), it enables use to radix sort also makes sure that we have enough memory for timsort. ## How was this patch tested? Existing tests. Author: Davies Liu <[email protected]> Closes #13318 from davies/fix_timsort. (cherry picked from commit 3074f57) Signed-off-by: Davies Liu <[email protected]>
What changes were proposed in this pull request?
Currently, the memory for temporary buffer used by TimSort is always allocated as on-heap without bookkeeping, it could cause OOM both in on-heap and off-heap mode.
This PR will try to manage that by preallocate it together with the pointer array, same with RadixSort. It both works for on-heap and off-heap mode.
This PR also change the loadFactor of BytesToBytesMap to 0.5 (it was 0.70), it enables use to radix sort also makes sure that we have enough memory for timsort.
How was this patch tested?
Existing tests.