Skip to content

Conversation

@ngoldbaum
Copy link

What does this PR do?

Adds a test based on the script I used to find pytorch/pytorch#158071.

The test serializes a dict of tensors in many threads simultaneously. Currently this is enough to segfault pytorch on the free-threaded build - no mutation necessary.

IMO having a multithreaded stress test like this is worth doing, especially looking torward adding support for the free-threaded build.

Copy link
Contributor

@Narsil Narsil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice addition indeed. Can you remove the parametrize everywhere and instead just use simple code everywhere ?

Just duplicate the tests for numpy and torch, it's perfectly fine

@ngoldbaum
Copy link
Author

Much simpler now, thanks for the code review.

@Narsil
Copy link
Contributor

Narsil commented Aug 8, 2025

Great, it's working and I can reproduce the segfault. Let's probably wait for it to be fixed in torch before merging this, no ?

@Narsil
Copy link
Contributor

Narsil commented Aug 8, 2025

Clippy errors are fixed on main.

@ngoldbaum
Copy link
Author

ngoldbaum commented Aug 8, 2025

IMO it's still worth running this test on the GIL-enabled build and for NumPy. I'll skip the test if the GIL is disabled for now and add a link to the PyTorch issue.

@ngoldbaum ngoldbaum force-pushed the multithreaded-tests-2 branch from 43a28b7 to 671f6e6 Compare August 8, 2025 15:21
@Narsil
Copy link
Contributor

Narsil commented Aug 11, 2025

I don't like skipping a test because it fails. The point of tests is to let them fail when something is wrong. If we simply skip them when something is working, then nothing will ever check that finally they are working again and we should switch them.

That's why I suggest we simply postpone this. The fact that the PR is open, means it will be revisited at some point.

@ngoldbaum
Copy link
Author

The fact that the PR is open, means it will be revisited at some point.

Sounds good, I've added a note to myself to revisit this once the PyTorch issue is fixed upstream and we can at least set up CI with a nightly build.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants