Skip to content

ci: parallelize coverage, stop running tests twice per PR#377

Merged
santoshkumarradha merged 2 commits intomainfrom
ci/coverage-dedupe
Apr 9, 2026
Merged

ci: parallelize coverage, stop running tests twice per PR#377
santoshkumarradha merged 2 commits intomainfrom
ci/coverage-dedupe

Conversation

@santoshkumarradha
Copy link
Copy Markdown
Member

Summary

  • Problem. The Coverage Summary required check runs ~5m34s on every PR, and most of that wall-clock is spent re-running test suites that the per-surface workflows (control-plane.yml, sdk-go.yml, sdk-python.yml, sdk-typescript.yml) already executed on the same commit. Tests were effectively running twice per PR, serially, in one job.
  • Fix. Restructure coverage.yml as a 5-entry parallel matrix — one per surface — plus an aggregator job that downloads the matrix artifacts and runs the existing gates unchanged. Delete the duplicate go test steps from control-plane.yml and sdk-go.yml.
  • Expected result. Coverage wall-clock drops from ~5m30s (serial, single job) to approximately the slowest single surface (~2m, usually control-plane), plus ~30s aggregator. control-plane.yml → linux-tests also drops since it no longer runs tests. Full test coverage is preserved — coverage-summary stays a required status check so any regression still blocks merge.

What changed

New files

  • scripts/coverage-surface.sh — single source of truth for "what commands run for surface X with coverage". Takes one arg (control-plane | sdk-go | sdk-python | sdk-typescript | web-ui), runs that surface's tests, writes all the expected filenames under test-reports/coverage/. Called by both the local runner and CI.
  • scripts/coverage-aggregate.py — extracted from the trailing Python block of the old coverage-summary.sh so aggregation can run independently after CI flattens per-surface artifacts into one directory.

Refactored

  • scripts/coverage-summary.sh — thin orchestrator that loops over all five surfaces calling coverage-surface.sh, then runs coverage-aggregate.py. Local behavior unchanged.
  • .github/workflows/coverage.yml — rewritten as `per-surface` matrix (5 parallel jobs, each running `coverage-surface.sh` and uploading a `coverage-` artifact) + `coverage-summary` aggregator job with `needs: per-surface`. Aggregator downloads all 5, flattens into `test-reports/coverage/`, runs `coverage-aggregate.py`, then the existing `coverage-gate.py` + `patch-coverage-gate.sh` + sticky PR comments + badge gist steps unchanged. Job name stays `coverage-summary` so branch protection rules targeting that check name continue to match with no reconfiguration.

Modified

  • .github/workflows/control-plane.yml — delete the `Run tests` step from `linux-tests`. Tests now live only in `coverage.yml`. `linux-tests` keeps building the binary (needed for `compile-matrix`'s `needs:` dependency) and linting.
  • .github/workflows/sdk-go.yml — delete the `Test` step from `build-and-test` for the same reason.

Intentionally unchanged

  • `sdk-python.yml` and `sdk-typescript.yml`. Their matrices test cross-version compatibility (Python 3.8-3.12, Node 18+20) independent of coverage measurement, which only runs on 3.11 / Node 20. Coverage still runs through `coverage.yml`.

Design notes

  • Why the job name is preserved. If the old check name `coverage-summary` is a required status check in branch protection, renaming it would soft-brick every PR. Keeping the aggregator job named exactly `coverage-summary` means zero branch protection changes are needed.
  • Single source of truth for commands. Both the local runner (`coverage-summary.sh`) and CI delegate to `coverage-surface.sh`, so local and CI runs cannot drift apart.
  • Artifact flattening. Per-surface artifacts each contain files nested under their own subdirectory after download. The aggregator flattens them with `find | cp` into `test-reports/coverage/` because the gate scripts (`coverage-gate.py`, `patch-coverage-gate.sh`) hardcode expected filenames in that single directory. Each surface writes uniquely prefixed filenames (e.g. `control-plane-cobertura.xml`, `sdk-go-cobertura.xml`, ...), so there are no collisions when flattened.
  • Byte-equivalent aggregation. Smoke-tested `coverage-aggregate.py` locally against the exact values in `coverage-baseline.json` — produces aggregate `89.01%`, matches the baseline, and `coverage-gate.py` accepts the output shape unchanged.

Test plan

  • `Coverage Summary / coverage (control-plane)` passes
  • `Coverage Summary / coverage (sdk-go)` passes
  • `Coverage Summary / coverage (sdk-python)` passes
  • `Coverage Summary / coverage (sdk-typescript)` passes
  • `Coverage Summary / coverage (web-ui)` passes
  • `Coverage Summary / coverage-summary` passes (this is the existing required check — name preserved)
  • Sticky `coverage-gate` PR comment posts with the same table format as before
  • Sticky `patch-coverage-gate` PR comment posts
  • `Control Plane CI / linux-tests` still passes without the `Run tests` step
  • `Go SDK CI / build-and-test` still passes without the `Test` step
  • Total coverage wall-clock on this PR is noticeably lower than the ~5m34s reference from PR fix: block revoked-tag agent calls + friendly empty logs state (TC-034, TC-035) #373
  • Run `./scripts/coverage-summary.sh` locally and confirm it still produces identical `test-reports/coverage/summary.json` shape

Validation reference

Baseline run used for comparison: PR #373 — `coverage-summary` = 5m34s (the single serialized job this PR splits up).

The Coverage Summary job was running every test suite in the repo a
second time, serially, in one ~5.5min job. Tests ran once in the
per-surface workflows (control-plane.yml, sdk-go.yml, ...) and then
again inside coverage-summary.sh just to produce coverage numbers.

This restructures coverage.yml as a 5-job parallel matrix (one per
surface) plus an aggregator that downloads their artifacts and runs
the existing coverage-gate.py + patch-coverage-gate.sh unchanged.

Changes:

* scripts/coverage-surface.sh (new)
  Single source of truth for "what commands run for surface X with
  coverage". Takes one arg (control-plane|sdk-go|sdk-python|
  sdk-typescript|web-ui), runs that surface's tests, writes all the
  expected filenames under test-reports/coverage/.

* scripts/coverage-aggregate.py (new)
  Extracted from the trailing Python block of coverage-summary.sh so
  the aggregation can run independently after CI flattens per-surface
  artifacts into one directory. Byte-equivalent output to the old
  inline block — verified against coverage-baseline.json.

* scripts/coverage-summary.sh (refactored)
  Now a thin orchestrator that calls coverage-surface.sh for each of
  the five surfaces then runs coverage-aggregate.py. Local behavior
  is unchanged.

* .github/workflows/coverage.yml (rewritten)
  Matrix of 5 parallel per-surface jobs, each running
  coverage-surface.sh and uploading a coverage-<surface> artifact.
  Aggregator job (needs: per-surface) downloads them, flattens into
  test-reports/coverage/, runs coverage-aggregate.py, then the
  existing coverage-gate.py + patch-coverage-gate.sh + sticky PR
  comments + badge gist steps, all unchanged. Job name stays
  `coverage-summary` so branch protection rules targeting that check
  continue to match without reconfiguration.

* .github/workflows/control-plane.yml
  Delete the `Run tests` step from linux-tests. Tests now run only in
  coverage.yml; linux-tests keeps building the binary (needed for
  compile-matrix's needs: dependency) and linting.

* .github/workflows/sdk-go.yml
  Delete the `Test` step from build-and-test for the same reason.

sdk-python.yml and sdk-typescript.yml are intentionally unchanged:
their matrices test cross-version compatibility (Python 3.8-3.12,
Node 18+20) which is independent of the coverage measurement
(3.11 / Node 20 only). The coverage workflow remains a required
status check, so a regression on any surface still blocks merge.
@santoshkumarradha santoshkumarradha requested review from a team and AbirAbbas as code owners April 9, 2026 18:38
`go tool cover -func=<coverprofile>` resolves package paths in the
coverprofile against the nearest go.mod. When it's invoked from the
repo root (no go.mod) it fails with:

  cover: no required module provides package <pkg>: go.mod file not
  found in current directory or any parent directory

The original coverage-summary.sh extract_go_total() took the module
dir as an arg and cd'd into it inside a subshell. I dropped that arg
when I extracted the function into coverage-surface.sh, so the
control-plane and sdk-go matrix jobs ran the `go test` with coverage
successfully but then failed at the total-extraction step.

Restore the two-argument form and cd into the module dir before
running `go tool cover -func`.
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 9, 2026

📊 Coverage gate

Thresholds from .coverage-gate.toml: per-surface ≥ 86%, aggregate ≥ 88%, max per-surface regression ≤ 1.0 pp, max aggregate regression ≤ 0.50 pp.

Surface Current Baseline Δ
control-plane 87.20% 87.30% ↓ -0.10 pp 🟡
sdk-go 90.70% 90.70% → +0.00 pp 🟢
sdk-python 93.63% 93.63% ↑ +0.00 pp 🟢
sdk-typescript 92.56% 92.56% → +0.00 pp 🟢
web-ui 90.02% 90.01% ↑ +0.01 pp 🟢
aggregate 88.98% 89.01% ↓ -0.03 pp 🟡

✅ Gate passed

No surface regressed past the allowed threshold and the aggregate stayed above the floor.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 9, 2026

📐 Patch coverage gate

Threshold: 80% on lines this PR touches vs origin/main (from .coverage-gate.toml:thresholds.min_patch).

Surface Touched lines Patch coverage Status
control-plane 0 ➖ no changes
sdk-go 0 ➖ no changes
sdk-python 0 ➖ no changes
sdk-typescript 0 ➖ no changes
web-ui 0 ➖ no changes

✅ Patch gate passed

Every surface whose lines were touched by this PR has patch coverage at or above the threshold.

@santoshkumarradha santoshkumarradha merged commit 051b967 into main Apr 9, 2026
23 checks passed
@santoshkumarradha santoshkumarradha deleted the ci/coverage-dedupe branch April 9, 2026 19:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant