Skip to content

Comments

TST: add render testing, mark as xslow#39

Merged
yuxuanzhuang merged 6 commits intoyuxuanzhuang:mainfrom
tylerjereddy:treddy_render_test_demo_xslow
Mar 30, 2025
Merged

TST: add render testing, mark as xslow#39
yuxuanzhuang merged 6 commits intoyuxuanzhuang:mainfrom
tylerjereddy:treddy_render_test_demo_xslow

Conversation

@tylerjereddy
Copy link
Contributor

  • This adds the first test that actually makes assertions about the pixel content of ggmolvis-rendered images. It turns out that regular render calls do "ok" (slow, but works) in GitHub actions, but if you use the compositor there seem to be issues (hard crashing). Since the test added here uses the compositor to set the background color, it is marked xslow so that the user/developer must opt in to run it (and it doesn't run in CI, by default).

  • This approach has the advantage that we can start marking assertions about the expected appearance of renders, but of course it has the drawback that we can only run these tests locally for now (or using one of the services that offers GPU CI runners, which some larger OSS projects do, but usually has a cost). I think it is still better than nothing given that we could at least bisect on newly-introduced problems locally even if it is unfortunately rather easy for bugs to slip past the CI this way. This is a common approach upstream for tests that are not CI tractable (i.e., incredibly slow, or require unreasonably large amounts of memory to reproduce the original problem).

  • For a bit of a dissection on composition causing crashes in GHA CI, see recent testing on my fork at:
    TST: render test demo tylerjereddy/ggmolvis#1

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

@codecov
Copy link

codecov bot commented Mar 21, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 75.42%. Comparing base (ab278b7) to head (fffe15c).
Report is 2 commits behind head on main.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

# red should be the dominant channel
assert red.sum() > (green.sum() + blue.sum())
# image size should be 50 x 50 x 4 channels
assert img_array.size == 10_000
Copy link
Contributor Author

@tylerjereddy tylerjereddy Mar 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on our last call I was under the impression that neither ggmolvis nor MolecularNodes did render testing in GHA CI, but after investigating on my fork (tylerjereddy#1), it became clear that ggmolvis has a small number of render tests. They don't use composition, and no assertions are made about the correctness of their results, so they basically just sniff test that there is no crash.

From the durations output over there:

============================= slowest 10 durations =============================
93.19s call     ggmolvis/tests/test_molecules.py::test_render[render_options1]
16.77s call     ggmolvis/tests/test_molecules.py::test_render_scene
16.25s call     ggmolvis/tests/test_molecules.py::test_render[render_options2]
16.05s call     ggmolvis/tests/test_molecules.py::test_render[render_options0]
10.78s call     ggmolvis/tests/test_render.py::test_render_basic_bg
<snip>

In that case I temporarily disabled the composition in my new test here so that it would run (i.e., temporarily turned off composite_bg_rgba).

If I check the durations locally on this branch via GGMOLVIS_XSLOW=1 python -m pytest -n 8 --durations=10

======================================================================================================================== slowest 10 durations ========================================================================================================================
3.65s call     ggmolvis/tests/test_molecules.py::test_render[render_options1]
2.03s call     ggmolvis/tests/test_render.py::test_render_basic_bg
1.84s call     ggmolvis/tests/test_molecules.py::test_render_scene
1.68s call     ggmolvis/tests/test_molecules.py::test_render[render_options0]
1.52s call     ggmolvis/tests/test_molecules.py::test_render[render_options2]
0.36s call     ggmolvis/tests/test_molecules.py::test_show_molecule_with_style[cartoon]
0.22s call     ggmolvis/tests/test_molecules.py::test_show_molecule_with_style[preset_2]
<snip>

Also, unrelated, but looks like the exit segfault is indeed gone locally now with bpy 4.4.0 released 3 days ago.

@tylerjereddy
Copy link
Contributor Author

The test failures related to ImportError: cannot import name 'Domains' from 'databpy' (/Users/runner/miniconda3/envs/ggmolvis-test/lib/python3.11/site-packages/databpy/__init__.py) are clearly unrelated, and were already showing up in gh-36.

@yuxuanzhuang
Copy link
Owner

The test failures related to ImportError: cannot import name 'Domains' from 'databpy' (/Users/runner/miniconda3/envs/ggmolvis-test/lib/python3.11/site-packages/databpy/__init__.py) are clearly unrelated, and were already showing up in gh-36.

The error is related to MN 4.2.12 not pinning the correct version of databpy. For now, we will ignore backward compatibility issues and focus on supporting the latest versions of everything.

@yuxuanzhuang
Copy link
Owner

In our current testing setup, objects added during tests are not automatically removed from the scene afterward. To ensure a clean scene when testing pixel content, we should implement a mechanism to remove these objects. This may also help improve the rendering test speed.

On the other note, currently, the tests are forced to pass (because of the segfault error), you may need to check if the test actually passes in the CI.

@yuxuanzhuang
Copy link
Owner

yuxuanzhuang commented Mar 22, 2025

Update:

An update has been made to include a cleanup decorator function at:
https://github.com/yuxuanzhuang/ggmolvis/blob/ab278b754d313fb83dcf1dd2ff66f7b28e130545/ggmolvis/tests/utils.py#L5C5-L5C12

Use this decorator before any tests that add ggmolvis objects to Blender. For composition, an extra cleanup step is probably needed.

The rendering speed in the CI is clearly faster after introducing the cleanup (or solely because I reduced the rendering resolution). We probably should add xslow mark for movie rendering.

============================= slowest 50 durations =============================
31.17s call ggmolvis/tests/test_render.py::test_render_mp4[render_mp4_options1]
24.84s call ggmolvis/tests/test_render.py::test_render_mp4[render_mp4_options0]
4.61s call ggmolvis/tests/test_render.py::test_render_png[render_image_options1]
4.38s call ggmolvis/tests/test_render.py::test_render_png[render_image_options0]
3.67s call ggmolvis/tests/test_render.py::test_render_scene

* This adds the first test that actually makes assertions about
the pixel content of `ggmolvis`-rendered images. It turns out
that regular render calls do "ok" in GitHub actions, but if you
use the compositor there seem to be issues. Since the test added here
uses the compositor to set the background color, it is marked `xslow`
so that the user/developer must opt in to run it.

* This approach has the advantage that we can start marking assertions
about the expected appearance of renders, but of course it has the
drawback that we can only run these tests locally for now (or using
one of the services that offers GPU CI runners, which some larger OSS
projects do, but usually has a cost). I think it is still better than
nothing given that we could at least bisect on newly-introduced problems
locally even if it is unfortunately rather easy for bugs to slip past
the CI this way. This is a common approach upstream for tests that are
not CI tractable (i.e., incredibly slow, or require unreasonably large
amounts of memory to reproduce the original problem).

* For a bit of a dissection on composition causing crashes in GHA CI,
see recent testing on my fork at:
#1
* Switch to usage of `ggmv` fixture for the new
`test_render_basic_bg()` test, because this is
required for the `cleanup` decorator to work properly.
* Update the testing utility _clean_up() function
to additionally use `read_homefile()` for a cleaner
blender state between tests. This is similar to the
approach that Brady uses upstream.
* It seems that using `open_mainfile()` rather than
`read_homefile` in the `_clean_up()` function is more robust
for our testsuite (doesn't cause failures for
`test_show_trajectory_with_material` for example). One point
of caution is that running locally with multiple threads with
`pytest-xdist` does seem to run into issues with this approach,
perhaps because of the reading of `bpy.data.filepath`.
@tylerjereddy tylerjereddy force-pushed the treddy_render_test_demo_xslow branch from bb05a06 to a01013c Compare March 24, 2025 20:21
ggmv._session.remove(artist.trajectory.uuid)
except Exception:
pass
bpy.ops.wm.open_mainfile(filepath=bpy.data.filepath)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was added based on the discussion at gh-41, and seems to help isolate (from cross-test pollution) the scenes used in render tests locally. It is slightly different than the approach Brady uses upstream at https://github.com/BradyAJohnston/MolecularNodes/blob/main/tests/conftest.py#L16 with bpy.ops.wm.read_homefile(app_template=""). When I used the latter, I ran into a failure with test_show_trajectory_with_material. Probably some preservation of state is needed that Yuxuan understands and I don't?

One problem with this change is that, at least in local testing, this seems to reduce "thread safety" with pytest-xdist a bit. For example, the following incantation sometimes has issues parsing .blend files:

GGMOLVIS_XSLOW=1 python -m pytest -n 8 (to reproduce locally on this branch, it can help to increase -n to something like 32)

Details ``` =============================================================================================================================== ERRORS =============================================================================================================================== __________________________________________________________________________________________________________________ ERROR collecting ggmolvis/tests ___________________________________________________________________________________________________________________ ../spack/opt/spack/darwin-sonoma-m1/apple-clang-15.0.0/python-3.11.7-g6qcoukv4k4emoygj4zxqsig2zqs3wlj/lib/python3.11/importlib/__init__.py:126: in import_module return _bootstrap._gcd_import(name[level:], package, level) :1204: in _gcd_import ??? :1176: in _find_and_load ??? :1126: in _find_and_load_unlocked ??? :241: in _call_with_frames_removed ??? :1204: in _gcd_import ??? :1176: in _find_and_load ??? :1126: in _find_and_load_unlocked ??? :241: in _call_with_frames_removed ??? :1204: in _gcd_import ??? :1176: in _find_and_load ??? :1147: in _find_and_load_unlocked ??? :690: in _load_unlocked ??? :940: in exec_module ??? :241: in _call_with_frames_removed ??? ggmolvis/__init__.py:55: in bpy.ops.wm.open_mainfile(filepath=dest_path) ../../python_venvs/py_311_mda_dev/lib/python3.11/site-packages/bpy/4.4/scripts/modules/bpy/ops.py:109: in __call__ ret = _op_call(self.idname_py(), kw) E RuntimeError: Error: File format is not supported in file "/var/folders/5_/hm0ft57n6dn2ksgg2p0bx5h0000w2g/T/ggmolvis.blend" ```

Spot checking the ubuntu-latest main CI job, I see the testsuite finishing with 38 passed, 1 skipped, 3 xfailed, 8 warnings in 51.18s, which suggests that the new xslow test is correctly being skipped in the CI.

That said, I'm not sure where we stand on the tradeoffs here if this reduces thread safety. Maybe we could use a lock/mutex somehow, or even just suggest avoiding pytest-xdist in the short term. IIRC I can even observe this type of failure without pytest-xdist usage, but it is much rarer.

@yuxuanzhuang
Copy link
Owner

One problem with this change is that, at least in local testing, this seems to reduce "thread safety" with pytest-xdist a bit.

I think the issue is a race condition in __init__.py, where all processes are trying to copy the start file to the same fixed temporary location.

The patch below should fix this:

--- a/ggmolvis/__init__.py
+++ b/ggmolvis/__init__.py
@@ -6,6 +6,7 @@ Molecular visualization with Blender
 import os
 import shutil
 import tempfile
+import uuid
 from importlib.metadata import version
 import bpy
 from bpy.app.handlers import frame_change_pre
@@ -45,7 +46,8 @@ base_name = 'ggmolvis.blend'
 #count += 1

 # we will save the current session to the temporary directory
-dest_dir = tempfile.gettempdir()
+dest_dir = f"{tempfile.gettempdir()}{uuid.uuid4()}"
+os.makedirs(dest_dir, exist_ok=True)
 dest_path = os.path.join(dest_dir, base_name)
 logger.debug(f"Blend file stored at {dest_path}")

* Fix a thread safety issue in ggmolvis
initialization.

Co-authored-by: Yuxuan Zhuang <[email protected]>
@tylerjereddy
Copy link
Contributor Author

@yuxuanzhuang thanks, I pulled in your patch (with you as co-author) and the thread safety issue with pytest-xdist seems resolved locally

Co-authored-by: Yuxuan Zhuang <[email protected]>
@tylerjereddy
Copy link
Contributor Author

@yuxuanzhuang the remaining CI failures are unrelated--is anything else needed here?

I suppose an argument could be made to switch from molecule to trajectory (things are moving very fast in this project..), though not sure it is a big problem for now.

Copy link
Owner

@yuxuanzhuang yuxuanzhuang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! It also looks good!

@yuxuanzhuang yuxuanzhuang merged commit c1f91f9 into yuxuanzhuang:main Mar 30, 2025
9 of 14 checks passed
@tylerjereddy tylerjereddy mentioned this pull request Mar 30, 2025
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants