TST: add render testing, mark as xslow#39
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
🚀 New features to boost your workflow:
|
| # red should be the dominant channel | ||
| assert red.sum() > (green.sum() + blue.sum()) | ||
| # image size should be 50 x 50 x 4 channels | ||
| assert img_array.size == 10_000 |
There was a problem hiding this comment.
Based on our last call I was under the impression that neither ggmolvis nor MolecularNodes did render testing in GHA CI, but after investigating on my fork (tylerjereddy#1), it became clear that ggmolvis has a small number of render tests. They don't use composition, and no assertions are made about the correctness of their results, so they basically just sniff test that there is no crash.
From the durations output over there:
============================= slowest 10 durations =============================
93.19s call ggmolvis/tests/test_molecules.py::test_render[render_options1]
16.77s call ggmolvis/tests/test_molecules.py::test_render_scene
16.25s call ggmolvis/tests/test_molecules.py::test_render[render_options2]
16.05s call ggmolvis/tests/test_molecules.py::test_render[render_options0]
10.78s call ggmolvis/tests/test_render.py::test_render_basic_bg
<snip>
In that case I temporarily disabled the composition in my new test here so that it would run (i.e., temporarily turned off composite_bg_rgba).
If I check the durations locally on this branch via GGMOLVIS_XSLOW=1 python -m pytest -n 8 --durations=10
======================================================================================================================== slowest 10 durations ========================================================================================================================
3.65s call ggmolvis/tests/test_molecules.py::test_render[render_options1]
2.03s call ggmolvis/tests/test_render.py::test_render_basic_bg
1.84s call ggmolvis/tests/test_molecules.py::test_render_scene
1.68s call ggmolvis/tests/test_molecules.py::test_render[render_options0]
1.52s call ggmolvis/tests/test_molecules.py::test_render[render_options2]
0.36s call ggmolvis/tests/test_molecules.py::test_show_molecule_with_style[cartoon]
0.22s call ggmolvis/tests/test_molecules.py::test_show_molecule_with_style[preset_2]
<snip>
Also, unrelated, but looks like the exit segfault is indeed gone locally now with bpy 4.4.0 released 3 days ago.
|
The test failures related to |
The error is related to MN 4.2.12 not pinning the correct version of |
|
In our current testing setup, objects added during tests are not automatically removed from the scene afterward. To ensure a clean scene when testing pixel content, we should implement a mechanism to remove these objects. This may also help improve the rendering test speed. On the other note, currently, the tests are forced to pass (because of the segfault error), you may need to check if the test actually passes in the CI. |
|
Update: An update has been made to include a cleanup decorator function at: Use this decorator before any tests that add The rendering speed in the CI is clearly faster after introducing the cleanup (or solely because I reduced the rendering resolution). We probably should add xslow mark for movie rendering. ============================= slowest 50 durations ============================= |
* This adds the first test that actually makes assertions about the pixel content of `ggmolvis`-rendered images. It turns out that regular render calls do "ok" in GitHub actions, but if you use the compositor there seem to be issues. Since the test added here uses the compositor to set the background color, it is marked `xslow` so that the user/developer must opt in to run it. * This approach has the advantage that we can start marking assertions about the expected appearance of renders, but of course it has the drawback that we can only run these tests locally for now (or using one of the services that offers GPU CI runners, which some larger OSS projects do, but usually has a cost). I think it is still better than nothing given that we could at least bisect on newly-introduced problems locally even if it is unfortunately rather easy for bugs to slip past the CI this way. This is a common approach upstream for tests that are not CI tractable (i.e., incredibly slow, or require unreasonably large amounts of memory to reproduce the original problem). * For a bit of a dissection on composition causing crashes in GHA CI, see recent testing on my fork at: #1
* Switch to usage of `ggmv` fixture for the new `test_render_basic_bg()` test, because this is required for the `cleanup` decorator to work properly.
* Update the testing utility _clean_up() function to additionally use `read_homefile()` for a cleaner blender state between tests. This is similar to the approach that Brady uses upstream.
* It seems that using `open_mainfile()` rather than `read_homefile` in the `_clean_up()` function is more robust for our testsuite (doesn't cause failures for `test_show_trajectory_with_material` for example). One point of caution is that running locally with multiple threads with `pytest-xdist` does seem to run into issues with this approach, perhaps because of the reading of `bpy.data.filepath`.
bb05a06 to
a01013c
Compare
| ggmv._session.remove(artist.trajectory.uuid) | ||
| except Exception: | ||
| pass | ||
| bpy.ops.wm.open_mainfile(filepath=bpy.data.filepath) |
There was a problem hiding this comment.
This was added based on the discussion at gh-41, and seems to help isolate (from cross-test pollution) the scenes used in render tests locally. It is slightly different than the approach Brady uses upstream at https://github.com/BradyAJohnston/MolecularNodes/blob/main/tests/conftest.py#L16 with bpy.ops.wm.read_homefile(app_template=""). When I used the latter, I ran into a failure with test_show_trajectory_with_material. Probably some preservation of state is needed that Yuxuan understands and I don't?
One problem with this change is that, at least in local testing, this seems to reduce "thread safety" with pytest-xdist a bit. For example, the following incantation sometimes has issues parsing .blend files:
GGMOLVIS_XSLOW=1 python -m pytest -n 8 (to reproduce locally on this branch, it can help to increase -n to something like 32)
Details
``` =============================================================================================================================== ERRORS =============================================================================================================================== __________________________________________________________________________________________________________________ ERROR collecting ggmolvis/tests ___________________________________________________________________________________________________________________ ../spack/opt/spack/darwin-sonoma-m1/apple-clang-15.0.0/python-3.11.7-g6qcoukv4k4emoygj4zxqsig2zqs3wlj/lib/python3.11/importlib/__init__.py:126: in import_module return _bootstrap._gcd_import(name[level:], package, level) :1204: in _gcd_import ??? :1176: in _find_and_load ??? :1126: in _find_and_load_unlocked ??? :241: in _call_with_frames_removed ??? :1204: in _gcd_import ??? :1176: in _find_and_load ??? :1126: in _find_and_load_unlocked ??? :241: in _call_with_frames_removed ??? :1204: in _gcd_import ??? :1176: in _find_and_load ??? :1147: in _find_and_load_unlocked ??? :690: in _load_unlocked ??? :940: in exec_module ??? :241: in _call_with_frames_removed ??? ggmolvis/__init__.py:55: in bpy.ops.wm.open_mainfile(filepath=dest_path) ../../python_venvs/py_311_mda_dev/lib/python3.11/site-packages/bpy/4.4/scripts/modules/bpy/ops.py:109: in __call__ ret = _op_call(self.idname_py(), kw) E RuntimeError: Error: File format is not supported in file "/var/folders/5_/hm0ft57n6dn2ksgg2p0bx5h0000w2g/T/ggmolvis.blend" ```Spot checking the ubuntu-latest main CI job, I see the testsuite finishing with 38 passed, 1 skipped, 3 xfailed, 8 warnings in 51.18s, which suggests that the new xslow test is correctly being skipped in the CI.
That said, I'm not sure where we stand on the tradeoffs here if this reduces thread safety. Maybe we could use a lock/mutex somehow, or even just suggest avoiding pytest-xdist in the short term. IIRC I can even observe this type of failure without pytest-xdist usage, but it is much rarer.
I think the issue is a race condition in The patch below should fix this: --- a/ggmolvis/__init__.py
+++ b/ggmolvis/__init__.py
@@ -6,6 +6,7 @@ Molecular visualization with Blender
import os
import shutil
import tempfile
+import uuid
from importlib.metadata import version
import bpy
from bpy.app.handlers import frame_change_pre
@@ -45,7 +46,8 @@ base_name = 'ggmolvis.blend'
#count += 1
# we will save the current session to the temporary directory
-dest_dir = tempfile.gettempdir()
+dest_dir = f"{tempfile.gettempdir()}{uuid.uuid4()}"
+os.makedirs(dest_dir, exist_ok=True)
dest_path = os.path.join(dest_dir, base_name)
logger.debug(f"Blend file stored at {dest_path}") |
* Fix a thread safety issue in ggmolvis initialization. Co-authored-by: Yuxuan Zhuang <[email protected]>
|
@yuxuanzhuang thanks, I pulled in your patch (with you as co-author) and the thread safety issue with |
Co-authored-by: Yuxuan Zhuang <[email protected]>
|
@yuxuanzhuang the remaining CI failures are unrelated--is anything else needed here? I suppose an argument could be made to switch from |
yuxuanzhuang
left a comment
There was a problem hiding this comment.
Thanks! It also looks good!
This adds the first test that actually makes assertions about the pixel content of
ggmolvis-rendered images. It turns out that regular render calls do "ok" (slow, but works) in GitHub actions, but if you use the compositor there seem to be issues (hard crashing). Since the test added here uses the compositor to set the background color, it is markedxslowso that the user/developer must opt in to run it (and it doesn't run in CI, by default).This approach has the advantage that we can start marking assertions about the expected appearance of renders, but of course it has the drawback that we can only run these tests locally for now (or using one of the services that offers GPU CI runners, which some larger OSS projects do, but usually has a cost). I think it is still better than nothing given that we could at least bisect on newly-introduced problems locally even if it is unfortunately rather easy for bugs to slip past the CI this way. This is a common approach upstream for tests that are not CI tractable (i.e., incredibly slow, or require unreasonably large amounts of memory to reproduce the original problem).
For a bit of a dissection on composition causing crashes in GHA CI, see recent testing on my fork at:
TST: render test demo tylerjereddy/ggmolvis#1
PR Checklist