Automatically run release validation tests in CI against multiple base systems #307
This is a first step / exercise for testing against multiple "skeleton" versions.
We exploit the fact that release disks are built with passwordless root (for debugging purposes).
Cannot reproduce it locally, but maybe it is "too fast" on CI
Discovered some flakiness in release-validation tests (mostly in CI), which seems to be fixed with
Trying to fix strange CI issues
For unclear reasons, and only on CI, the 2024.7.0-DISK version seems to take a very long time until controller GUI pages are fully rendered on the screen. Worse, the OCR results sometimes indicate that parts of two different controller pages are visible at the same time. A long sleep after navigate_to_status_page() seems to resolve this, but it is unreasonable to slow down all other versions because of it.
There's probably a video driver compatibility issue with QEMU in 2024.7.0, causing partial or very slow screen refreshes. The OCR text indicates that parts of both the Controller main (Information) page and the Status page are visible on the screen simultaneously, which should not be possible.
For example: https://github.com/dividat/playos/actions/runs/21176096534/job/60905229495#step:4:4611
Possible solutions:
I tried option 1 in d1d70d6 and it seems to be sufficient; it amounts to an extra ~5 second sleep. However, I don't think it's reasonable to penalize all other versions because of one flaky one, so for now I am applying option 2 in f01ad6f instead. If we discover the same issues for some skeleton, we can go back to option 1.
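To illustrate the trade-off above (a fixed sleep slows down every version), here is a minimal sketch of a polling wait that returns as soon as the OCR text shows only the target page. This is not what d1d70d6 or f01ad6f do. `wait_for_stable_page`, `get_screen_text`, and the marker strings are hypothetical names introduced for illustration; only `navigate_to_status_page()` comes from the discussion.

```python
import time
from typing import Callable, Iterable


def wait_for_stable_page(
    get_screen_text: Callable[[], str],
    expected: Iterable[str],
    forbidden: Iterable[str],
    timeout: float = 30.0,
    interval: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll OCR output until only the target page is visible.

    get_screen_text is a hypothetical callable returning the current OCR text.
    expected lists markers that must all be present (target page).
    forbidden lists markers that must all be absent (previous page).
    """
    expected = list(expected)
    forbidden = list(forbidden)
    deadline = time.monotonic() + timeout
    while True:
        text = get_screen_text()
        has_expected = all(marker in text for marker in expected)
        has_stale = any(marker in text for marker in forbidden)
        if has_expected and not has_stale:
            return text
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"screen not stable after {timeout}s; last OCR text: {text!r}"
            )
        sleep(interval)
```

With a helper like this, fast versions would return after the first poll, while a slow-refreshing one would only pay for the time it actually needs, up to `timeout`.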
Currently we run release-validation tests manually, but this is easy to forget and the team has no visibility into the outcomes.
Since they are "expensive" to run (take a long time), running them on each push/PR to main is maybe too extreme, but running them on each push to release/* is warranted since it indicates we intend to deploy the tree in the somewhat near future.

CI also allows us to easily run the release-validation tests against multiple base system images. Since we have not yet decided on the "archetypal skeletons", for now this just hard-codes base system versions to some "ancient" ones + the previous (latest released) version.
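The trigger and version-selection rules described above can be sketched in a few lines. This is only an illustration of the logic, not the actual CI configuration: both function names are hypothetical, and the caller supplies the version lists.

```python
from fnmatch import fnmatch


def should_run_release_validation(branch: str) -> bool:
    """Run the expensive tests only for pushes to release/* branches.

    Note: fnmatch's `*` also matches `/`, so nested names such as
    release/a/b match here as well.
    """
    return fnmatch(branch, "release/*")


def base_system_versions(ancient: list[str], previous_release: str) -> list[str]:
    """Return the hard-coded old versions plus the latest released one.

    The previous release is appended only if it is not already listed,
    so each base system is tested once.
    """
    versions = list(ancient)
    if previous_release not in versions:
        versions.append(previous_release)
    return versions
```

Each entry in the returned list would correspond to one release-validation run against that base system image.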
Manually started CI run: https://github.com/dividat/playos/actions/runs/21177566028 🟢