-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-19839][Core]release longArray in BytesToBytesMap #17180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Probably one for @JoshRosen |
|
Test build #74036 has finished for PR 17180 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: i would do this after the memory is cleaned up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line seems to explain the behavior of current code (before applying this PR). It would be good to explain the behavior of the new code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, what does this short-circuiting achieve?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK, spill is called when other consumers want to allocate in-memory pages but can't get any memory. If we are able to release ample memory by free'ing up the longArray, then the purpose of spill is achieved. Hence the short circuit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC we can only free this pointer array if we spill its related data pages. Is this short circuit safe to do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cloud-fan, I was wondering this, too: I think that this is okay because we're in a destructive iterator over the map, so we can just walk through the data pages without needing to worry about hash-style lookups. If this is the case, though, then we should just destroy the pointer array earlier (as I commented downthread on the PR).
|
If we truly don't need the |
|
Hi @zhzhan is this PR active? if so, would you answer or address the review comment? |
|
per review comments, release the longArray on destructive iterator creation. |
|
Test build #79918 has finished for PR 17180 at commit
|
|
The test failure us caused by call method on the map after We should remove this test case as it does not follow the restriction. Please let me know the feedback. Stack trace of the failure. |
|
Is it better to fix this test instead of remove it? |
|
Will fix the unit test. |
|
Test build #79989 has finished for PR 17180 at commit
|
|
retest it please. |
|
retest this please |
|
Test build #80048 has finished for PR 17180 at commit
|
|
LGTM Wait for @JoshRosen for final sign off. |
|
This seems fine to me. That said, the updated test case is a bit confusing,
but I don't think the original test was too clear to begin with. The
original test was using the `iterator()` method to make an assertion about
the internal state of the map, then was checking whether the pattern of
getting a buffer from the map, updating it, then getting it again from the
map would reflect the update. After your update to the test, the comment
doesn't quite align with the test anymore. The right way to fix this
involves splitting up the affected test into two separate tests. We can do
that in a followup, though, since I think that we still have sufficient
code coverage via other tests and this PR has already been under review for
months now so it'd be better to get it merged and move on.
…On Sat, Jul 29, 2017 at 6:19 PM Xiao Li ***@***.***> wrote:
LGTM Wait for @JoshRosen <https://github.com/joshrosen> for final sign
off.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#17180 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AADGPLGZbKkiLGK6qsrunDEpRj_Vhs0mks5sS9ojgaJpZM4MUoYt>
.
|
|
Thanks! Merging to master. @zhzhan Could you address the comments about the test case in the follow-up PR? Thanks! |
What changes were proposed in this pull request?
When BytesToBytesMap spills, its longArray should be released. Otherwise, it may not released until the task complete. This array may take a significant amount of memory, which cannot be used by later operator, such as UnsafeShuffleExternalSorter, resulting in more frequent spill in sorter. This patch release the array as destructive iterator will not use this array anymore.
How was this patch tested?
Manual test in production