Fix issue in decommit_ephemeral_segment_pages (segment case). #59989
We assume that we can use half the free list space in gen 0 for new allocation. If that is too optimistic, we may allocate into decommitted memory and crash in the allocator. That is because there is a race condition between the allocating thread and the decommitting thread - we decided to avoid that by making sure we would never decommit memory that we may allocate in gen 0.
There are two reasons why assuming we can use half the free list space for new allocations may be too optimistic:
- if we allocate large objects in gen 0, we may not have free spaces of the necessary size available.
- when background GC goes into background_ephemeral_sweep, it deletes and rebuilds the free list for gen 0. A thread trying to allocate during that time may see a completely empty free list.
We used to do this correctly by not taking the free list space in gen 0 into account at all. When we made changes for regions, that seemed unnecessarily pessimistic, but in changing the logic we failed to honor the invariants the allocator relies on.
The fix essentially goes back to the older logic where the free list space in gen 0 is not taken into account. We can do better than this, but it's more complicated.
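To make the difference between the two estimates concrete, here is a minimal sketch in C++. It is not the code from this PR: `Gen0State`, its fields, and both function names are hypothetical, and the model reduces the ephemeral segment to byte offsets. It only illustrates why counting on free list space yields a lower decommit target than ignoring it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical, simplified model of gen 0 in the ephemeral segment.
// None of these names come from the runtime sources.
struct Gen0State {
    std::size_t allocated_end;   // offset where allocated objects end
    std::size_t committed_end;   // offset where committed memory ends
    std::size_t budget;          // bytes gen 0 may allocate before the next GC
    std::size_t free_list_space; // bytes currently on the gen 0 free list
};

// Optimistic estimate: assume half the free list space can absorb new
// allocations, so less memory past allocated_end has to stay committed.
std::size_t decommit_target_optimistic(const Gen0State& s) {
    assert(s.allocated_end <= s.committed_end);
    std::size_t from_free_list = std::min(s.free_list_space / 2, s.budget);
    std::size_t needed_at_end = s.budget - from_free_list;
    return std::min(s.allocated_end + needed_at_end, s.committed_end);
}

// Conservative estimate: ignore the free list entirely, so the whole
// budget stays committed past allocated_end (capped at what is committed).
std::size_t decommit_target_conservative(const Gen0State& s) {
    assert(s.allocated_end <= s.committed_end);
    return std::min(s.allocated_end + s.budget, s.committed_end);
}
```

With `Gen0State{1000, 10000, 4000, 6000}`, the optimistic target is 2000 and the conservative one is 5000. If the free list turns out to be unusable (large objects, or a rebuild during background_ephemeral_sweep), an allocating thread in this model may need the full 4000 bytes past offset 1000, which only the conservative target keeps committed.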
|
|
I am wondering if we have considered the
This is just my suspicion, I haven't been able to reproduce that yet. |
|
@cshung that is indeed a problem. since |
|
I think we already hit this issue and fixed it (PR #41441): the fix is to immediately turn off gradual_decommit_in_progress_p when the server GC threads start running. Code from gc_heap::gc_thread_function():
I think that on server GC, set_allocations_for_no_gc is always called on a GC thread, so we should be fine. Or am I overlooking something? |
|
here's the scenario I'm thinking about - when we are in a NoGC region, we could trigger a GC, the special thing we do is we adjust the budget after we've done the logic to decide whether we want to gradual commit or not. in |
|
oh actually NM.... I was wrong - I forgot that we do not go into the code path that adjusts |
|
I looked at the code some more and concluded that you are right, @Maoni0 - the code is in fact correct regarding the potential problem raised by @cshung. To summarize, the reasons are as follows:
|
Maoni0
left a comment
LGTM and we'll need to port this back to 6.0
cshung
left a comment
Looks good to me!
|
/backport to release/6.0 |
|
Started backporting to release/6.0: https://github.com/dotnet/runtime/actions/runs/1320244166 |