@Flamefire Flamefire commented May 20, 2025

(created using eb --new-pr)

This fixes 2 blocking issues:

  1. Test counts can be off when unittest.subTest is used. E.g.
<?xml version="1.0"?>
<testsuites>
  <testsuite name="pytest" errors="0" failures="1" skipped="0" tests="2" time="13.590" timestamp="2025-05-08T02:18:36.745086" hostname="c68">
    <testcase classname="MiscTests" name="test_pytree_tree_leaves" time="13.493" file="dynamo/test_misc.py">
      <failure message="torch._dynamo.exc.Unsupported: 'skip function isclass in file [snip]">Traceback (most recent call last):
[snip]</failure>
      <system-out>inline_call [snip]
</system-out>
    </testcase>
  </testsuite>
</testsuites>

--> The suite declares tests=2 but contains only 1 <testcase> element.

  2. A single test might have multiple <skipped> elements:
    <testcase classname="TestNestedTensorOpInfoCPU" name="test_compile_backward_xlogy_cpu_float32" time="9.881" file="test_nestedtensor.py">
      <skipped type="pytest.skip" message="Skipped!">PyTorch/2.7.0/foss-2024a/pytorch-v2.7.0/test/test_nestedtensor.py:8798: Skipped!</skipped>
      <skipped type="pytest.skip" message="Skipped!">PyTorch/2.7.0/foss-2024a/pytorch-v2.7.0/test/test_nestedtensor.py:8798: Skipped!</skipped>
    </testcase>
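
Both cases above can be handled by deriving the counts from the <testcase> elements themselves instead of trusting the suite attributes. The following is a minimal sketch of that idea, not the code from this PR: the helper name `count_suite_results` and the sample XML are made up for illustration, and only the standard-library `xml.etree.ElementTree` module is assumed.

```python
import xml.etree.ElementTree as ET


def count_suite_results(suite):
    """Count outcomes of a <testsuite> element from its <testcase> children.

    Hypothetical helper for illustration only; not part of the easyblock.
    """
    counts = {"tests": 0, "failures": 0, "errors": 0, "skipped": 0}
    for case in suite.iter("testcase"):
        counts["tests"] += 1
        # Count each outcome at most once per test case,
        # even if the tag (e.g. <skipped>) appears several times.
        for key, tag in (("failures", "failure"), ("errors", "error"), ("skipped", "skipped")):
            if case.find(tag) is not None:
                counts[key] += 1
    return counts


# Synthetic sample: the suite claims tests="3" but holds 2 <testcase> elements,
# and one test case carries two <skipped> elements.
SAMPLE = """<?xml version="1.0"?>
<testsuites>
  <testsuite name="pytest" errors="0" failures="1" skipped="1" tests="3">
    <testcase classname="A" name="test_fail"><failure message="boom">trace</failure></testcase>
    <testcase classname="B" name="test_skip">
      <skipped type="pytest.skip" message="Skipped!">first</skipped>
      <skipped type="pytest.skip" message="Skipped!">second</skipped>
    </testcase>
  </testsuite>
</testsuites>
"""

suite = ET.fromstring(SAMPLE).find("testsuite")
counts = count_suite_results(suite)
declared_tests = int(suite.get("tests"))
```

Here `counts["tests"]` reflects the <testcase> elements actually present, so it can differ from `declared_tests` without that difference having to be treated as an error.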

Additionally, I added a try-except in parse_test_result_file to report the failing file on error. It is therefore best to view the diff with whitespace changes ignored: apart from added comments, the only change in that function is that the check if len(test_cases) != num_tests: was removed.
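
As a rough illustration of reporting the failing file on error, a parser call can be wrapped so that the exception message names the file. This is a sketch under assumptions, not the actual parse_test_result_file: the function name `load_result_file` and the choice of `ValueError` are hypothetical.

```python
import xml.etree.ElementTree as ET


def load_result_file(path):
    """Parse a JUnit XML file and name the file in any parse error.

    Hypothetical stand-in for illustration; the real function differs.
    """
    try:
        return ET.parse(path).getroot()
    except (ET.ParseError, OSError) as err:
        # Re-raise with the file name so the offending report is easy to find
        raise ValueError(f"Failed to parse test result file {path}: {err}") from err
```

Chaining with `from err` keeps the original parser message and traceback available alongside the file name.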

lexming commented May 20, 2025

Thanks for the quick PR, testing it...

@boegel boegel added the bug fix label May 21, 2025
@boegel boegel added this to the next release (5.1.0) milestone May 21, 2025
boegel commented May 21, 2025

@Flamefire For which PyTorch versions is this "blocking"?

Flamefire commented

At least 2.6+, but IIRC I've seen it for 2.3 too on one occasion.

boegel commented May 21, 2025

Test report by @boegel

Overview of tested easyconfigs (in order)

  • SUCCESS PyTorch-2.1.2-foss-2023a.eb

Build succeeded (with --ignore-test-failure) for 1 out of 1 (1 easyconfigs in total)
node3505.doduo.os - Linux RHEL 9.4, x86_64, AMD EPYC 7552 48-Core Processor (zen2), Python 3.9.18
See https://gist.github.com/boegel/5a64b169d935099ba7a1b7d2c7f1aab7 for a full test report.

@lexming lexming left a comment

LGTM

lexming commented Jun 16, 2025

Merging, thanks @Flamefire !

@lexming lexming merged commit bacc8b3 into easybuilders:develop Jun 16, 2025
17 checks passed
@Flamefire Flamefire deleted the 20250520130609_new_pr_pytorch branch June 16, 2025 14:03