Skip to content

Fix debugger hangs due to runtime deadlock by using patch DJI#123651

Merged
steveisok merged 4 commits intomainfrom
copilot/fix-debugger-deadlock-issue
Feb 5, 2026
Merged

Fix debugger hangs due to runtime deadlock by using patch DJI#123651
steveisok merged 4 commits intomainfrom
copilot/fix-debugger-deadlock-issue

Conversation

Copy link
Contributor

Copilot AI commented Jan 27, 2026

Description

This PR implements a targeted fix for the debugger deadlock issue where the debugger hangs while waiting for a deadlocked debuggee process. The deadlock occurs when DebuggerController::BindPatch calls GetJitInfo while holding the DebuggerController lock. GetJitInfo can trigger HashMap lookups that perform GC mode transitions, leading to lock ordering issues.

The fix modifies BindPatch to check if the patch already has a DebuggerJitInfo (DJI) stored on it via patch->HasDJI(). When the DJI is available, it uses patch->GetDJI() instead of calling the problematic GetJitInfo method. This approach is cleaner than adding a new parameter since the patch already stores the DJI.

Changes Made

  • Updated BindPatch implementation to check patch->HasDJI() before calling GetJitInfo
  • When DJI exists on the patch, use patch->GetDJI() to avoid the deadlock-prone GetJitInfo call
  • Maintains backward compatibility - patches without DJI continue to call GetJitInfo as before

The implementation uses a ternary operator for concise, idiomatic C++ code:

DebuggerJitInfo *info = patch->HasDJI() ? patch->GetDJI() : g_pDebugger->GetJitInfo(pMD, (const BYTE *)startAddr);

Customer Impact

Without this fix, customers can experience debugger hangs when debugging applications with ReadyToRun methods under specific timing conditions. This occurs when:

  1. A breakpoint is set on a ReadyToRun method that hasn't executed yet
  2. One thread is stepping through code while another thread first executes the ReadyToRun method with the breakpoint

The hang requires the debugger and IDE to be force-closed, resulting in loss of debugging session state and developer productivity.

Regression

This is not a regression from the most recent release. The root cause has existed for a while, but timing changes in the code may have impacted the likelihood of hitting the race condition.

Testing

  • Full CoreCLR build completed successfully with no errors or warnings
  • Code review passed with no issues identified
  • No security vulnerabilities introduced

The changes are minimal (1 file changed, 3 insertions, 1 deletion) and surgical to reduce regression risk for potential backporting to prior .NET releases.

Risk

Low Risk. The changes are minimal and highly targeted:

  • No API changes - the BindPatch signature remains unchanged
  • The logic flow remains unchanged - we're just avoiding an unnecessary lookup when the DJI is already available on the patch
  • The patch already stores the DJI via AddPatchForMethodDef, so we're using existing data
  • Full build validation confirms no compilation issues
  • The fix addresses the specific deadlock scenario without affecting other debugger functionality
  • Maintains backward compatibility - patches without DJI still call GetJitInfo as before
Original prompt

This section details on the original issue you should resolve

<issue_title>Debugger hangs due to runtime deadlock. HashMap takes ThreadStore out of order</issue_title>
<issue_description>### Description

The debugger is blocked in mscordbi code at:

      [Inline Frame] mscordbi.dll!SafeWaitForSingleObject(CordbProcess *) Line 257  C++
>     mscordbi.dll!CordbProcess::StopInternal(unsigned long dwTimeout, VMPTR_Base<AppDomain,void> pAppDomainToken) Line 3611  C++
      [Inline Frame] mscordbi.dll!StopContinueHolder::Init(CordbProcess *) Line 208 C++
      mscordbi.dll!CordbModule::GetFunctionFromToken(unsigned int token, ICorDebugFunction * * ppFunction) Line 1387    C++

mscordbi is waiting on the debuggee to respond which it won't do because it is deadlocked.

Thread 0x9314 in the debuggee (trying to acquire DebuggerController lock):

      [Inline Frame] coreclr.dll!CrstBase::AcquireLock(CrstBase *) Line 174   C++
      [Inline Frame] coreclr.dll!FunctionBase<CrstBase *,&CrstBase::AcquireLock,&CrstBase::ReleaseLock>::DoAcquire() Line 694 C++
      [Inline Frame] coreclr.dll!BaseHolder<CrstBase *,FunctionBase<CrstBase *,&CrstBase::AcquireLock,&CrstBase::ReleaseLock>,0,&CompareDefault<CrstBase *>>::Acquire() Line 266    C++
      [Inline Frame] coreclr.dll!BaseHolder<CrstBase *,FunctionBase<CrstBase *,&CrstBase::AcquireLock,&CrstBase::ReleaseLock>,0,&CompareDefault<CrstBase *>>::{ctor}(CrstBase *) Line 233 C++
      [Inline Frame] coreclr.dll!Holder<CrstBase *,&CrstBase::AcquireLock,&CrstBase::ReleaseLock,0,&CompareDefault<CrstBase *>,1>::{ctor}(CrstBase *) Line 729    C++
>     coreclr.dll!DebuggerController::DispatchPatchOrSingleStep(Thread * thread, _CONTEXT * context, const unsigned char * address, SCAN_TRIGGER which, DebuggerSteppingInfo * pDebuggerSteppingInfo) Line 3063   C++
      coreclr.dll!DebuggerController::DispatchNativeException(_EXCEPTION_RECORD * pException, _CONTEXT * pContext, unsigned long dwCode, Thread * pCurThread, DebuggerSteppingInfo * pDebuggerSteppingInfo) Line 4618   C++
      coreclr.dll!Debugger::FirstChanceNativeException(_EXCEPTION_RECORD * exception, _CONTEXT * context, unsigned long code, Thread * thread, int fIsVEH) Line 5470    C++
      coreclr.dll!IsDebuggerFault(_EXCEPTION_RECORD * pExceptionRecord, _CONTEXT * pContext, unsigned long exceptionCode, Thread * pThread) Line 5811 C++

And the DebuggerController lock is held by thread 0x1E38 which is here:

     [Inline Frame] coreclr.dll!CLREventBase::Wait(unsigned long) Line 412   C++
     coreclr.dll!Thread::WaitSuspendEventsHelper() Line 4458     C++
     coreclr.dll!Thread::RareDisablePreemptiveGC() Line 2185     C++
     [Inline Frame] coreclr.dll!Thread::DisablePreemptiveGC() Line 1288      C++
     [Inline Frame] coreclr.dll!GCHolderBase::EnterInternalCoop_HackNoThread(bool) Line 4619   C++
     [Inline Frame] coreclr.dll!GCCoopHackNoThread::{ctor}(bool) Line 4834   C++
     [Inline Frame] coreclr.dll!HashMap::LookupValue(unsigned __int64) Line 552    C++
     [Inline Frame] coreclr.dll!PtrHashMap::LookupValue(unsigned __int64 key, void *) Line 607 C++
     [Inline Frame] coreclr.dll!ReadyToRunInfo::GetMethodDescForEntryPointInNativeImage(unsigned __int64) Line 376     C++
     [Inline Frame] coreclr.dll!ReadyToRunInfo::GetMethodDescForEntryPoint(unsigned __int64 entryPoint) Line 104 C++
     coreclr.dll!ReadyToRunJitManager::JitCodeToMethodInfo(RangeSection * pRangeSection, unsigned __int64 currentPC, MethodDesc * * ppMethodDesc, EECodeInfo * pCodeInfo) Line 6451      C++
     coreclr.dll!EECodeInfo::Init(unsigned __int64 codeAddress, ExecutionManager::ScanFlag scanFlag) Line 15025  C++
     [Inline Frame] coreclr.dll!EECodeInfo::{ctor}(unsigned __int64) Line 2877     C++
     [Inline Frame] coreclr.dll!ExecutionManager::GetCodeStartAddress(unsigned __int64) Line 4980    C++
     coreclr.dll!EEDbgInterfaceImpl::GetNativeCodeStartAddress(unsigned __int64 address) Line 416    C++
     coreclr.dll!Debugger::GetJitInfoWorker(MethodDesc * fd, const unsigned char * pbAddr, DebuggerMethodInfo * * pMethInfo) Line 2711   C++
     [Inline Frame] coreclr.dll!Debugger::GetJitInfo(MethodDesc *) Line 2650 C++
     coreclr.dll!DebuggerController::BindPatch(DebuggerControllerPatch * patch, MethodDesc * pMD, const unsigned char *) Line 1432 C++

     coreclr.dll!DebuggerController::AddBindAndActivatePatchForMethodDesc(MethodDesc * fd, DebuggerJitInfo * dji, unsigned __int64 nativeOffset, DebuggerPatchKind kind, FramePointer fp, AppDomain * pAppDomain) Line 2242  C++
     coreclr.dll!DebuggerController::AddBindAndActivateILReplicaPatch(DebuggerControllerPatch * primary, DebuggerJitInfo * dji) Line 2006      C++
     coreclr.dll!Debugger::MapPatchToDJI(DebuggerControllerPatch * dcp, DebuggerJitInfo * djiTo) Line 4963 C++
     coreclr.dll!Debugger::MapAndBindFunctionPatches(DebuggerJitInfo * djiNew, MethodDesc * fd, const unsigned char *) Line 4879   C++
     coreclr.dll!Debugger::JIT...


✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

@noahfalk
Copy link
Member

@copilot - it looks like you had an error earlier and no files have changed. Can you please try again?

Modified BindPatch to accept an optional DebuggerJitInfo parameter.
When DJI is provided by the caller, use it instead of calling GetJitInfo,
which can trigger a deadlock through HashMap lookups with GC mode transitions.

Updated two call sites that already have the DJI available:
- AddBindAndActivatePatchForMethodDesc
- MapPatchToDJI

This is a targeted fix to avoid the deadlock described in issue where
DebuggerController lock is held while calling GetJitInfo, which can
try to acquire locks in the wrong order.

Co-authored-by: noahfalk <[email protected]>
Copy link
Contributor Author

Copilot AI commented Jan 27, 2026

@copilot - it looks like you had an error earlier and no files have changed. Can you please try again?

I've implemented the targeted fix. The changes pass DJI to BindPatch to avoid the deadlock-causing GetJitInfo call. Build completed successfully with no errors. Commit: 2c6edbe

@AaronRobinsonMSFT
Copy link
Member

@copilot Instead of passing in a new DebuggerJitInfo argument, you should use the one that is on the patch itself. It can retrieved by calling GetDJI() on the DebuggerControllerPatch. You can check if the DebuggerJitInfo field is valid by calling HasDJI().

Copilot AI and others added 2 commits February 4, 2026 00:26
Per code review feedback, simplified the fix to use the DJI already
stored on the patch rather than passing it as a new parameter.

This achieves the same goal of avoiding the deadlock-causing GetJitInfo
call while being cleaner and more aligned with existing code patterns.

Co-authored-by: AaronRobinsonMSFT <[email protected]>
Per code review feedback, use a ternary operator instead of
if-else block for more concise and idiomatic C++ code.

Co-authored-by: AaronRobinsonMSFT <[email protected]>
Copy link
Contributor Author

Copilot AI commented Feb 4, 2026

@copilot Instead of passing in a new DebuggerJitInfo argument, you should use the one that is on the patch itself. It can retrieved by calling GetDJI() on the DebuggerControllerPatch. You can ch...

Updated to use patch->GetDJI() instead of passing DJI as a parameter. The implementation now checks patch->HasDJI() and uses patch->GetDJI() when available. Commits: 581a7bb, 6a6a4d6

Copilot AI changed the title Fix debugger hangs due to runtime deadlock by passing DJI to BindPatch Fix debugger hangs due to runtime deadlock by using patch DJI Feb 4, 2026
Copy link
Member

@noahfalk noahfalk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@steveisok steveisok marked this pull request as ready for review February 4, 2026 13:35
Copilot AI review requested due to automatic review settings February 4, 2026 13:35
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a debugger deadlock issue where the runtime hangs when DebuggerController::BindPatch calls GetJitInfo while holding the DebuggerController lock. The GetJitInfo method can trigger HashMap lookups that perform GC mode transitions, leading to lock ordering violations and deadlocks.

Changes:

  • Modified BindPatch to check if the patch already has a DebuggerJitInfo (DJI) stored via patch->HasDJI() and use patch->GetDJI() instead of calling the deadlock-prone GetJitInfo method
  • Added explanatory comments documenting the deadlock issue and why the optimization is necessary
  • Maintained backward compatibility by falling back to GetJitInfo when no DJI is available on the patch

Copy link
Member

@max-charlamb max-charlamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally and looks good to me

@steveisok steveisok enabled auto-merge (squash) February 5, 2026 00:58
@steveisok
Copy link
Member

/ba-g Known issues #123796 #122874

@steveisok steveisok merged commit 2e76447 into main Feb 5, 2026
104 of 107 checks passed
@steveisok
Copy link
Member

/backport to release/10.0

@steveisok steveisok deleted the copilot/fix-debugger-deadlock-issue branch February 5, 2026 01:00
@github-actions
Copy link
Contributor

github-actions bot commented Feb 5, 2026

Started backporting to release/10.0 (link to workflow run)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Debugger hangs due to runtime deadlock. HashMap takes ThreadStore out of order

5 participants