Skip to content

Unset Python Endpoint close callback upon destruction#206

Merged
rapids-bot[bot] merged 13 commits intorapidsai:branch-0.37from
pentschev:python-remove-close-callback-upon-ep-destruction
Mar 14, 2024
Merged

Unset Python Endpoint close callback upon destruction#206
rapids-bot[bot] merged 13 commits intorapidsai:branch-0.37from
pentschev:python-remove-close-callback-upon-ep-destruction

Conversation

@pentschev
Copy link
Member

In some cases, particularly with distributed-ucxx, the C++ Endpoint outlives the Python object that owns a close callback registered with set_close_callback(), making it an invalid reference and causing segfaults. To prevent that it is necessary to remove the close callback, preventing it from being called after it's not valid anymore.

In some cases, particularly with `distributed-ucxx`, the C++ Endpoint
outlives the Python object that owns a close callback registered with
`set_close_callback()`, making it an invalid reference and causing
segfaults. To prevent that it is necessary to remove the close callback,
preventing it from being called after it's not valid anymore.
@pentschev pentschev requested a review from a team as a code owner March 13, 2024 11:44


@gen_test()
@gen_test(timeout=60)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With some local experimentation it looks like teardown failures are in essence a gen_test timeout, but it doesn't show up as such. I tried increasing the timeout here but one of the tests nevertheless failed. Locally though I see this test completing in under 10 seconds or in 60 seconds with this change even though it ultimately passes after the full 60 seconds. With that I'm trying to understand whether this is a problem with UCXX or if this is something with the Distributed test suite (e.g., gen_test) as I would expect it to fail every time if it times out.

Copy link
Contributor

@wence- wence- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor tidying suggestions

pentschev and others added 3 commits March 14, 2024 13:30
Co-authored-by: Lawrence Mitchell <[email protected]>
…n-ep-destruction' into python-remove-close-callback-upon-ep-destruction
@pentschev
Copy link
Member Author

I just reverted the commit increasing timeout for test_transpose as that doesn't have any effect. If there are no objections I'll merge it shortly after CI passes.

@pentschev
Copy link
Member Author

Thanks for taking the time for reviewing @wence- !

@pentschev
Copy link
Member Author

/merge

@rapids-bot rapids-bot bot merged commit ea7d18b into rapidsai:branch-0.37 Mar 14, 2024
@pentschev pentschev deleted the python-remove-close-callback-upon-ep-destruction branch March 15, 2024 14:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants