Bazel + Python unit testing: macro design, dep granularity, and pytest bootstrap patterns #2867
Replies: 2 comments
-
1. Should a unit test macro exist at all?
Yes — keep it, and keep it thin. The macro wires up exactly the ceremony every test author would otherwise duplicate: the pytest entry point, the pytest-mock dependency, and the default pytest.ini. The risk of an implicit contract is manageable: the macro is ~50 lines, has no conditional logic or plugin injection, and anyone who needs something it doesn't support falls back to raw py_test. Why not reuse score_py_pytest? Long-term, the right move might be to contribute a more configurable pytest entry point to rules_python itself.
2. Dep granularity vs. maintenance cost
The fine-grained split is the right call: it keeps unit tests atomic and the coverage numbers meaningful. For sustainability, the discipline has to stay cheap enough that authors follow it by default.
3. Pytest bootstrap pattern
The shared main.py that calls pytest.main() is a known workaround, and the current approach matches what score_py_pytest already does.
Bottom line
The PR's approach is sound across all three questions. The macro is justified as long as it stays thin.
-
We align with all three points raised here. We have documented the design decisions in a Decision Record following the Eclipse S-CORE DR convention.
The DR covers the macro design (a dedicated py_itf_unittest macro kept as a thin py_test wrapper), the dependency granularity strategy, and the pytest bootstrap pattern.
-
We are building a unit test infrastructure on top of rules_python and py_test for a project that also has a heavier integration test
framework (py_itf_test). A concrete example of what we have so far is in this PR: eclipse-score/itf#94
We'd like to hear how others approach three recurring design questions.
We introduced a py_itf_unittest macro (https://github.com/eclipse-score/itf/pull/94/files#diff-...) as a thin wrapper around py_test that
sets up pytest, injects pytest-mock, and provides a default pytest.ini. The alternative is to have test authors call py_test directly and
declare their pytest dep explicitly.
The macro adds convenience and consistency, but it hides what is actually happening and creates an implicit contract with macro consumers.
How do others draw this line? Is a macro justified, or does it create more problems than it solves as the project grows?
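For concreteness, such a thin wrapper has roughly the following shape. This is a minimal sketch, not the code from the PR: the `@pip` hub name and the `//bazel:pytest_main.py` / `//bazel:pytest.ini` labels are assumptions standing in for whatever the repository actually uses.

```starlark
# unittest.bzl -- sketch of a thin pytest wrapper around py_test.
# Assumed labels (adjust to your repo): //bazel:pytest_main.py must be
# exported from its package, //bazel:pytest.ini is the shared config,
# and @pip is the rules_python pip hub providing pytest and pytest-mock.
load("@rules_python//python:defs.bzl", "py_test")

def py_itf_unittest(name, srcs, deps = [], args = [], **kwargs):
    """Runs srcs under pytest with pytest-mock and a default pytest.ini."""
    py_test(
        name = name,
        srcs = srcs + ["//bazel:pytest_main.py"],
        main = "//bazel:pytest_main.py",
        deps = deps + [
            "@pip//pytest",
            "@pip//pytest_mock",
        ],
        data = ["//bazel:pytest.ini"],
        args = [
            # Point pytest at the shared config and at exactly the
            # declared test sources, nothing else.
            "-c",
            "$(rootpath //bazel:pytest.ini)",
        ] + ["$(rootpath %s)" % s for s in srcs] + args,
        **kwargs
    )
```

Keeping the wrapper fully declarative like this (no select(), no plugin injection) is what keeps the implicit contract small.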
For unit tests to be truly atomic, each test target should only depend on the specific module it exercises. In practice this means
splitting coarse Bazel targets. For example, we split //score/itf/plugins/qemu into :config (just config.py + pydantic) and :qemu (the
full plugin) — visible in https://github.com/eclipse-score/itf/pull/94/files. This keeps the coverage denominator honest and avoids
pulling unrelated code into the test sandbox.
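In BUILD terms the split looks roughly like this (a sketch of the shape, not the PR's exact file; `@pip` is again an assumed hub name):

```starlark
# score/itf/plugins/qemu/BUILD -- fine-grained targets for atomic tests.
load("@rules_python//python:defs.bzl", "py_library")

# Narrow target: just the pydantic config model. A unit test of the
# config depends on this alone, keeping the coverage denominator small.
py_library(
    name = "config",
    srcs = ["config.py"],
    deps = ["@pip//pydantic"],
)

# The full plugin layers on top of :config.
py_library(
    name = "qemu",
    srcs = glob(["*.py"], exclude = ["config.py"]),
    deps = [":config"],
)
```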
The tension is maintenance: as the codebase grows, keeping Bazel targets fine-grained requires discipline. What strategies do others use?
Do you enforce granularity via linting/visibility rules, or accept coarser targets and live with the coverage noise?
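One low-cost enforcement option is Bazel's own visibility system: make the coarse target invisible to unit test packages, so a test cannot silently pull in the whole plugin. A sketch, with a hypothetical //score/itf/tests/integration tree standing in for wherever integration tests actually live:

```starlark
py_library(
    name = "qemu",
    srcs = glob(["*.py"], exclude = ["config.py"]),
    deps = [":config"],
    # Hypothetical policy: only the integration-test tree may use the
    # coarse target; unit tests must depend on narrow targets like :config.
    visibility = ["//score/itf/tests/integration:__subpackages__"],
)
```

Where visibility alone is too blunt, a CI check built on bazel query can flag unit test targets whose transitive dependency closure exceeds an allowlist.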
Both our macro and score_py_pytest from @score_tooling use a shared main.py that calls pytest.main(args) as the py_test entry point. This
works, but it is a workaround — py_test expects a unittest-style entry point, not a pytest runner.
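For reference, the shared entry point is essentially this shape (a sketch of the pattern; the actual main.py in either repository may differ):

```python
# pytest_main.py -- generic pytest bootstrap used as the py_test main.
import sys

import pytest

if __name__ == "__main__":
    # Forward Bazel's argv (config flag, test file paths, any extra user
    # args) to pytest, and propagate pytest's exit code so Bazel records
    # pass/fail correctly.
    sys.exit(pytest.main(sys.argv[1:]))
```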
Is there a community-standard way to run pytest as a py_test target in Bazel? Should this come from rules_python directly? We are aware of
third-party solutions but curious whether there is an emerging standard or a direction from the rules_python maintainers.
For more context on our setup, see the full PR: eclipse-score/itf#94. Interested in how others have solved these.
Happy to share more details.