@pvelesko pvelesko commented May 5, 2024

  • Fix an issue where the Level Zero event collector did not recycle all events.
  • Refactor CHIPEventLevel0::wait() to get rid of race conditions reported by valgrind. The Level Zero API states that calls to zeEventHostSynchronize and zeEventQuery are thread safe, but valgrind still reports race conditions.
  • Implement a global shared mutex, ApiMtx, which every HIP API call locks. This prevents multiple HIP commands from executing at the same time. It is a pretty coarse lock that we can relax over time; performance is affected only for multithreaded HIP applications, of which I haven't seen any yet.
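
For readers unfamiliar with the pattern, here is a minimal sketch of a coarse global API lock. It is not the code from this PR: only the name ApiMtx comes from the description above. The function `fakeHipApiCall`, the counter `CommandsIssued`, the helper `runConcurrentCalls`, and the choice of a plain `std::mutex` are assumptions made for illustration.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Stand-in for the global lock called ApiMtx in the description.
// Assumption: a plain std::mutex; the real type may differ.
static std::mutex ApiMtx;

// Hypothetical shared runtime state that API calls mutate.
static long CommandsIssued = 0;

// Hypothetical API entry point. It holds the global lock for its whole
// duration, so no two API calls can run at the same time.
void fakeHipApiCall() {
  std::lock_guard<std::mutex> Lock(ApiMtx);
  ++CommandsIssued;
}

// Hypothetical helper: calls the entry point from several threads and
// returns the final value of the shared counter.
long runConcurrentCalls(int NumThreads, int CallsPerThread) {
  CommandsIssued = 0; // no worker threads exist yet, so this is safe
  std::vector<std::thread> Threads;
  for (int T = 0; T < NumThreads; ++T) {
    Threads.emplace_back([CallsPerThread] {
      for (int I = 0; I < CallsPerThread; ++I)
        fakeHipApiCall();
    });
  }
  for (auto &Th : Threads)
    Th.join();
  return CommandsIssued; // all workers have joined, so this read is safe
}
```

Because every entry point serializes on the same lock, single-threaded callers only pay for an uncontended lock and unlock per call, while multithreaded callers lose API-level concurrency. That is the trade-off the description accepts.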

@pvelesko pvelesko force-pushed the thread-safety branch 2 times, most recently from 67ec73e to 54e85b3 May 5, 2024 22:36
@pvelesko pvelesko changed the title from "EventPool - cleanup" to "Level Zero - Fix OOM & Improve Thread Safety" May 5, 2024
@pvelesko pvelesko marked this pull request as ready for review May 5, 2024 22:38
@pvelesko pvelesko requested a review from linehill May 5, 2024 22:46
@pvelesko pvelesko requested a review from linehill May 7, 2024 09:58
@pvelesko pvelesko merged commit 994afd4 into main May 8, 2024

3 participants