Github Action Debugging #4338

jmckulk · 2025-11-19T21:21:40Z

Checklist:

- Have you added an explanation of what your changes do and why you'd like them to be included?
- Have you updated or added documentation for the change, as applicable?
- Have you tested your changes on all related environments with successful results, as applicable?
- Have you added automated tests?

Type of Changes:

What is the current behavior (link to any open issues here)?

What is the new behavior (if this is a feature change)?

Breaking change (fix or feature that would cause existing functionality to change)

Other Information:

Prevents race conditions where pod is found but Postgres isn't ready yet.

Reduce PVC sizes from 1Gi to 256Mi and use Foreground deletion with explicit PVC cleanup waits to prevent disk exhaustion on GitHub-hosted runners.

Use env var instead of JMESPath expression in shell script.

Prevent script timeout (5s default) from killing 2m kubectl wait.

Only prefetch images actually used by the test: pgbackrest and postgres. Removes ~500MB of unused images (pgbouncer, pgadmin, exporter, upgrade).

- Add pgbouncer, exporter, upgrade, pgadmin images to prefetch
- Increase KUTTL timeout from 300s to 450s
- Increase prefetch timeout to 5m
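As a rough illustration of the readiness check mentioned above (waiting for Postgres before running psql), here is a minimal sketch of a retry helper. The names `retry_until` and `postgres_ready`, the `NAMESPACE` variable, and the `database` container name are assumptions for illustration; they are not taken from this PR's test scripts.

```shell
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: retry_until <attempts> <delay_seconds> <command> [args...]
retry_until() {
  _attempts="$1"
  _delay="$2"
  shift 2
  _i=1
  while [ "$_i" -le "$_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if [ "$_i" -lt "$_attempts" ]; then
      sleep "$_delay"
    fi
    _i=$((_i + 1))
  done
  return 1
}

# Assumed wrapper: asks Postgres inside the pod whether it accepts connections.
# NAMESPACE and the container name "database" are placeholders.
postgres_ready() {
  kubectl exec --namespace "$NAMESPACE" "$1" -c database -- pg_isready
}

# Example (not run here; PRIMARY_POD is a placeholder):
#   retry_until 30 2 postgres_ready "$PRIMARY_POD"
```

Finding the pod only proves it is scheduled. Polling `pg_isready` before the first `psql` call is one way to close the gap between "pod exists" and "Postgres accepts connections".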

jmckulk · 2025-11-24T19:29:42Z

The two biggest improvements here were reducing the number of prefetch images and splitting chainsaw and kuttl tests into separate jobs. I think we were hitting issues with disk pressure and these two changes help with that.

For the prefetch images change, we aren't testing GIS here, so there is no point in fetching the GIS images. In addition to removing these extra images, this change tries to ensure we actually use the images that we fetch. This does mean we might need to add images, specifically to chainsaw, as we test more functionality.

For the split change, this feels like a stopgap fix. We are really just giving the tests more space to run by splitting them up. We should continue to consider ways to reduce disk space usage. One option may be to refactor the chainsaw pgbackrest-restore test to only use one cluster.

Other changes in this PR are mostly either timeout increases or extra logging and checks. The change to reduce volume request size likely doesn't have much impact on actual disk usage but feels like a safe change.
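To make the disk pressure suspicion above easier to confirm in future runs, a job could log free space before and after the tests. The sketch below is only an assumption about how that might look; `avail_kb` and `require_free_kb` are hypothetical names and are not part of this PR.

```shell
# Print the available space, in kilobytes, on the filesystem holding a path.
# Relies on the POSIX output format of `df -P` (4th column is "Available").
avail_kb() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Fail when free space under a path is below a threshold given in KB.
require_free_kb() {
  _path="$1"
  _min="$2"
  _free="$(avail_kb "$_path")"
  if [ "$_free" -lt "$_min" ]; then
    echo "low disk: ${_free}KB free on ${_path}, need ${_min}KB" >&2
    return 1
  fi
  echo "disk ok: ${_free}KB free on ${_path}"
}

# Example: require roughly 2GiB free on the root filesystem.
#   require_free_kb / 2097152
```

A check like this would not fix disk exhaustion, but it would turn a vague test failure into an explicit message about free space.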

dsessler7 · 2025-11-24T22:00:19Z

.github/workflows/test.yaml

            --env 'RELATED_IMAGE_PGEXPORTER=registry.developers.crunchydata.com/crunchydata/crunchy-postgres-exporter:ubi9-0.17.1-2542' \
            --env 'RELATED_IMAGE_PGUPGRADE=registry.developers.crunchydata.com/crunchydata/crunchy-upgrade:ubi9-18.0-2542' \


I guess these (and PGBOUNCER) could be removed as well

I was thinking we'd leave the related images. That way, if an image isn't prefetched, it can still be pulled.

Or we could fail when we don't have all the images we need?
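The "fail when we don't have all the images" idea could be sketched as a comparison between the images the tests need and the images that were prefetched. This is a hypothetical helper, not something in the workflow; how the two lists are produced is left as an assumption.

```shell
# Print each required image that is absent from the available list.
# Both arguments are newline-separated lists of image references.
missing_images() {
  _required="$1"
  _available="$2"
  printf '%s\n' "$_required" | while IFS= read -r _img; do
    # Skip blank lines in the required list.
    [ -n "$_img" ] || continue
    # -F: fixed string, -x: match the whole line, -q: no output.
    if ! printf '%s\n' "$_available" | grep -Fxq -- "$_img"; then
      printf '%s\n' "$_img"
    fi
  done
}

# Example with image references quoted in this review thread:
required='registry.developers.crunchydata.com/crunchydata/crunchy-postgres-exporter:ubi9-0.17.1-2542
registry.developers.crunchydata.com/crunchydata/crunchy-upgrade:ubi9-18.0-2542'
available='registry.developers.crunchydata.com/crunchydata/crunchy-upgrade:ubi9-18.0-2542'

missing="$(missing_images "$required" "$available")"
if [ -n "$missing" ]; then
  echo "images not prefetched:"
  echo "$missing"
fi
```

In a real job the final `if` would likely `exit 1`, which trades the pull-on-demand fallback for an early, explicit failure.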

dsessler7 · 2025-11-24T22:01:24Z

.github/workflows/test.yaml

            --env 'RELATED_IMAGE_STANDALONE_PGADMIN=registry.developers.crunchydata.com/crunchydata/crunchy-pgadmin4:ubi9-9.8-2542' \
            --env 'RELATED_IMAGE_COLLECTOR=registry.developers.crunchydata.com/crunchydata/postgres-operator:ubi9-5.8.4-0' \


These can be removed too

dsessler7

Couple nitpicks, but LGTM

jmckulk changed the title ~~Fix naming for podlogs~~ Github Action Debugging Nov 19, 2025

jmckulk force-pushed the jmckulk/pgbr-chainsaw-test-debug branch 2 times, most recently from f0bcc54 to bff9d68 Compare November 21, 2025 16:39

sfc-gh-jmckulka added 17 commits November 24, 2025 11:30

Fix naming for podlogs

4f9d9a2

more checks and output

7327b58

remove unnecessary images

0c34eb1

use variable in template

e0f9a1c

add more logging

f9a2253

Increase timeout to 5m for backup/restore operations

eed9964

Increate timeout for image prefetch

a531ce8

Wait for postgres to be ready

910f94e

Add Postgres readiness checks before psql execs

e7906ee

Prevents race conditions where pod is found but Postgres isn't ready yet.

Add storage provisioning diagnostics for CI failures

0645472

test: reduce disk usage in pgbackrest-restore chainsaw test

872e8ad

Reduce PVC sizes from 1Gi to 256Mi and use Foreground deletion with explicit PVC cleanup waits to prevent disk exhaustion on GitHub-hosted runners.

test: fix shell syntax error in clone-cluster template

136aa04

Use env var instead of JMESPath expression in shell script.

test: add timeout to PVC deletion wait scripts

c6fee50

Prevent script timeout (5s default) from killing 2m kubectl wait.

test: remove unnecessary image prefetch for chainsaw tests

0566e11

Only prefetch images actually used by the test: pgbackrest and postgres. Removes ~500MB of unused images (pgbouncer, pgadmin, exporter, upgrade).

Split e2e-k3d into separate chainsaw and kuttl jobs

c257229

Fixup: remove extra debug

569abc5

test: add missing image prefetch and increase KUTTL timeout

aa19cad

- Add pgbouncer, exporter, upgrade, pgadmin images to prefetch
- Increase KUTTL timeout from 300s to 450s
- Increase prefetch timeout to 5m

jmckulk force-pushed the jmckulk/pgbr-chainsaw-test-debug branch from 5a2aee5 to aa19cad Compare November 24, 2025 16:32

sfc-gh-jmckulka added 4 commits November 24, 2025 12:22

Remove some debug logging

6cfdeeb

Revert timeout bump

e8b0ac4

Revert foreground deletion of pvc

a2d12f5

Restore background deletion policy for clone clusters

bd01f63

jmckulk force-pushed the jmckulk/pgbr-chainsaw-test-debug branch from bd01f63 to 63c4ecb Compare November 24, 2025 18:43

jmckulk marked this pull request as ready for review November 24, 2025 19:29

ValClarkson approved these changes Nov 24, 2025


dsessler7 reviewed Nov 24, 2025


dsessler7 approved these changes Nov 24, 2025


Remove more prefetch images from chainsaw action

51f6ea0

jmckulk force-pushed the jmckulk/pgbr-chainsaw-test-debug branch from 63c4ecb to 51f6ea0 Compare November 25, 2025 15:53

jmckulk enabled auto-merge (rebase) November 25, 2025 15:53

jmckulk merged commit f28e554 into CrunchyData:main Nov 25, 2025
20 checks passed

4 participants
