Bug description:
We recently found out that it takes a while for runnable to figure out the underlying object is completed. For instance:
conditions:
- lastTransitionTime: "2022-09-29T22:13:10Z"
message: ""
reason: Ready
status: "True"
type: RunTemplateReady
- lastTransitionTime: "2022-09-30T02:23:56Z"
message: ""
reason: SucceededCondition
status: "True"
type: StampedObjectCondition
- lastTransitionTime: "2022-09-30T02:23:56Z"
message: ""
reason: Ready
status: "True"
type: Ready
It took 4+ hours to transition from RunTemplateReady to SucceededCondition. The underlaying object was Tekton PipelineRun and it was succeeded in 26 seconds:
conditions:
- lastTransitionTime: "2022-09-29T22:13:36Z"
message: 'Tasks Completed: 1 (Failed: 0, Cancelled 0), Skipped: 0'
reason: Succeeded
status: "True"
type: Succeeded
Steps to reproduce:
Unfortunately I cannot have any clear path to reproduction.
Expected behavior:
Just after the PipelineRun gets its Succeeded condition set to True, Runnable should set its Ready condition to True.
Actual behavior:
Workload that stamped the Runnable was waiting for the output from Runanble and the Runnable didn't refresh its state for over 4 hours although the underlying PipelineRun resource completed almost instantaneously
Logs, k8s object dumps:
apiVersion: carto.run/v1alpha1
kind: Runnable
metadata:
creationTimestamp: "2022-09-29T22:13:08Z"
generation: 1
labels:
carto.run/cluster-template-name: sign-image-template-verify-dzyla-v0-0-3
carto.run/resource-name: sign-image
carto.run/supply-chain-name: supply-chain-verify-dzyla-v0-0-3
carto.run/template-kind: ClusterImageTemplate
carto.run/workload-name: foo-bar
carto.run/workload-namespace: default
name: 2b50086e-340b-41b6-8b53-e1ed40b761ed-store
namespace: default
ownerReferences:
- apiVersion: carto.run/v1alpha1
blockOwnerDeletion: true
controller: true
kind: Workload
name: foo-bar
uid: 2b50086e-340b-41b6-8b53-e1ed40b761ed
resourceVersion: "107440535"
uid: e759ce93-1f33-4c84-94e0-83418ef75401
spec:
inputs:
image: example.com:tag
url: example.com@sha:digest
retentionPolicy:
maxFailedRuns: 10
maxSuccessfulRuns: 10
runTemplateRef:
name: sign-image-tekton-pipelinerun-verify-dzyla-v0-0-3
serviceAccountName: verify-sa
status:
conditions:
- lastTransitionTime: "2022-09-29T22:13:10Z"
message: ""
reason: Ready
status: "True"
type: RunTemplateReady
- lastTransitionTime: "2022-09-30T02:23:56Z"
message: ""
reason: SucceededCondition
status: "True"
type: StampedObjectCondition
- lastTransitionTime: "2022-09-30T02:23:56Z"
message: ""
reason: Ready
status: "True"
type: Ready
observedGeneration: 1
outputs:
# outputs here
Versions:
- Infrastructure (kind, TKG, GKE etc.)
- cartographer version: v0.5.0
- k8s version: v1.21.6
- tekton-pipelines v0.40.1
Deployment info:
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
deployment.kubernetes.io/revision: "2"
creationTimestamp: "2022-05-09T17:32:18Z"
generation: 6
labels:
app.kubernetes.io/component: cartographer
app.kubernetes.io/name: cartographer-controller
app.kubernetes.io/version: v0.5.0
name: cartographer-controller
namespace: cartographer-system
resourceVersion: "107201119"
uid: 6214fba2-0089-46c3-95c8-4bc18ae9106f
spec:
progressDeadlineSeconds: 600
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
app: cartographer-controller
kapp.k14s.io/app: "1652117534801234013"
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25% [0/9023]
type: RollingUpdate
template:
metadata:
creationTimestamp: null
labels:
app: cartographer-controller
kapp.k14s.io/app: "1652117534801234013"
kapp.k14s.io/association: v1.c55856bac76106a4a463f38db29588aa
spec:
containers:
- args:
- -cert-dir=/cert
image: example.com/cache/projectcartographer/cartographer@sha256:c6e5bca54b39ec595f00f5646f809c9929322a603958805a2d826fd3cb2ee1f3
imagePullPolicy: IfNotPresent
name: cartographer-controller
resources:
limits:
cpu: "1"
memory: 1Gi
requests:
cpu: 500m
memory: 512Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- all
readOnlyRootFilesystem: true
runAsNonRoot: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /cert
name: cert
readOnly: true
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: cartographer-controller
serviceAccountName: cartographer-controller
terminationGracePeriodSeconds: 30
volumes:
- name: cert
secret:
defaultMode: 420
secretName: cartographer-webhook
status:
availableReplicas: 1
conditions:
- lastTransitionTime: "2022-05-09T17:32:18Z"
lastUpdateTime: "2022-08-25T18:15:56Z"
message: ReplicaSet "cartographer-controller-6dcfd6c4c" has successfully progressed.
reason: NewReplicaSetAvailable
status: "True"
type: Progressing
- lastTransitionTime: "2022-08-31T20:31:17Z"
lastUpdateTime: "2022-08-31T20:31:17Z"
message: Deployment has minimum availability.
reason: MinimumReplicasAvailable
status: "True"
type: Available
observedGeneration: 6
readyReplicas: 1
replicas: 1
updatedReplicas: 1
The cartographer was deployed as imgpkg bundle using a Carvel App CRD.
Additional context:
It happens randomly so it's hard to reproduce.
Bug description:
We recently found out that it takes a while for runnable to figure out the underlying object is completed. For instance:
It took 4+ hours to transition from
RunTemplateReadytoSucceededCondition. The underlaying object was Tekton PipelineRun and it was succeeded in 26 seconds:Steps to reproduce:
Unfortunately I cannot have any clear path to reproduction.
Expected behavior:
Just after the PipelineRun gets its Succeeded condition set to True, Runnable should set its Ready condition to True.
Actual behavior:
Workload that stamped the Runnable was waiting for the output from Runanble and the Runnable didn't refresh its state for over 4 hours although the underlying PipelineRun resource completed almost instantaneously
Logs, k8s object dumps:
Versions:
Deployment info:
The cartographer was deployed as imgpkg bundle using a Carvel App CRD.
Additional context:
It happens randomly so it's hard to reproduce.