Conversation

@MFraters
Member

@MFraters MFraters commented May 9, 2024

macOS 14 is the latest release, and it is also an ARM version, so it would be good to see how well the tester works on it.

@MFraters MFraters added the testing enhancement Requests for improvements to the testing of the GWB label May 9, 2024
@MFraters MFraters force-pushed the add_macos14_tester branch 2 times, most recently from 0477611 to 6ae0556 Compare May 9, 2024 14:05
@MFraters
Member Author

MFraters commented May 9, 2024

16 of the 132 tests have issues on macOS 14 ARM. It seems to me that most of them show either small differences in the output numbers, or a small difference in a computed position, where round-off error causes a point to fall on the other side of a line. It may be possible to make some of the tests less sensitive to these round-off errors, but I am not sure that is the case for all of them. Another option is to disable these specific tests on ARM testers.

@gassmoeller @tjhei opinions?
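
For illustration, here is a minimal sketch of the failure mode described above, and of one way a check could be made less sensitive to it. It is not taken from the GWB sources: the function `side_of_line`, the tolerance value, and the sample coordinates are all hypothetical. Two positions that differ only by a tiny perturbation land on opposite sides of a line, and a small tolerance band turns both into an "on the line" result.

```python
def side_of_line(p, a, b, tol=0.0):
    """Return +1 or -1 for the side of the line a->b on which p lies,
    or 0 if p is within `tol` of the line (measured by the cross product)."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    if abs(cross) <= tol:
        return 0
    return 1 if cross > 0 else -1


# Hypothetical line from (0, 0) to (1, 1).
a, b = (0.0, 0.0), (1.0, 1.0)

# The "same" point as it might be computed on two platforms: the values
# differ only by a tiny perturbation standing in for round-off error.
p_platform_1 = (0.5, 0.5 + 1e-12)
p_platform_2 = (0.5, 0.5 - 1e-12)

# Without a tolerance the two results disagree.
print(side_of_line(p_platform_1, a, b), side_of_line(p_platform_2, a, b))

# With a small (assumed) tolerance both points are treated as on the line.
print(side_of_line(p_platform_1, a, b, tol=1e-9),
      side_of_line(p_platform_2, a, b, tol=1e-9))
```

Whether a tolerance like this is appropriate depends on the test: it hides round-off noise, but it would also hide a real error smaller than `tol`.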

@github-actions

github-actions bot commented May 9, 2024

| Benchmark | Main | Feature | Difference (99.9% CI) |
| --- | --- | --- | --- |
| Slab interpolation simple none | 1.119 ± 0.010 (s=434) | 1.132 ± 0.011 (s=369) | +1.0% .. +1.4% |
| Slab interpolation curved simple none | 1.119 ± 0.011 (s=428) | 1.135 ± 0.014 (s=373) | +1.2% .. +1.7% |
| Spherical slab interpolation simple none | 1.114 ± 0.010 (s=383) | 1.125 ± 0.010 (s=423) | +0.8% .. +1.2% |
| Slab interpolation simple curved CMS | 1.166 ± 0.010 (s=386) | 1.181 ± 0.014 (s=384) | +1.0% .. +1.5% |
| Spherical slab interpolation simple CMS | 1.444 ± 0.012 (s=317) | 1.456 ± 0.014 (s=306) | +0.5% .. +1.0% |
| Spherical fault interpolation simple none | 1.102 ± 0.009 (s=396) | 1.113 ± 0.011 (s=419) | +0.8% .. +1.2% |
| Cartesian min max surface | 2.529 ± 0.024 (s=181) | 2.562 ± 0.019 (s=175) | +1.0% .. +1.6% |
| Spherical min max surface | 7.211 ± 0.073 (s=73) | 7.313 ± 0.071 (s=54) | +0.8% .. +2.0% |

@gassmoeller
Contributor

Disabling tests for ARM seems a bit problematic: what if they are actual bugs that only surface on the ARM architecture? Making the tests less sensitive to round-off, or understanding why they fail, seems like the safer path (and may also have benefits for future x86 tests on new operating systems). It is more work, though, and maybe not the highest priority?

@tjhei
Contributor

tjhei commented May 17, 2024

Do you use numdiff for comparing test output right now like we do for ASPECT?

@tjhei
Contributor

tjhei commented May 17, 2024

I would also separate Linux and macOS testing.

@MFraters
Member Author

Thanks for the feedback. Yes, numdiff is being used.

I will take the longer route then in this case, although it may take a while before I have time to address this issue.
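
As a rough picture of what a tolerance-based comparison of test output involves, here is a minimal sketch in Python. It is not numdiff itself and not the GWB tester: the function name `lines_match` and the default tolerances are assumptions chosen for illustration. Numeric tokens are compared with a relative and an absolute tolerance, and everything else must match exactly.

```python
import math


def lines_match(expected, actual, rel_tol=1e-6, abs_tol=1e-12):
    """Compare two whitespace-separated lines token by token.

    Tokens that both parse as floats are compared within the given
    tolerances; any other tokens must be identical strings.
    """
    exp_tokens = expected.split()
    act_tokens = actual.split()
    if len(exp_tokens) != len(act_tokens):
        return False
    for exp, act in zip(exp_tokens, act_tokens):
        try:
            exp_value, act_value = float(exp), float(act)
        except ValueError:
            # At least one token is not a number: require an exact match.
            if exp != act:
                return False
            continue
        if not math.isclose(exp_value, act_value,
                            rel_tol=rel_tol, abs_tol=abs_tol):
            return False
    return True


# A difference in the last digits passes, a real difference does not.
print(lines_match("temperature 1600.0000001 0", "temperature 1600.0000002 0"))
print(lines_match("temperature 1600.0 0", "temperature 1601.0 0"))
```

Loosening `rel_tol` or `abs_tol` for a specific test is one way to absorb platform-dependent round-off without disabling the test, at the cost of also accepting real differences below the chosen threshold.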
