[coordinator] Fix status reporting of out-of-order component updates#13119
Open
VihasMakwana wants to merge 5 commits into elastic:main from
760eaab to
d8ca1fb
Compare
Contributor
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)
c118dfe to
68abcc5
Compare
68abcc5 to
faa0fdf
Compare
swiatekm
reviewed
Mar 11, 2026
internal/pkg/agent/application/coordinator/coordinator_state.go
Outdated
Contributor
Author
Unfortunately, I haven't been able to reproduce this. I'm thinking of adding |
VihasMakwana
commented
Mar 11, 2026
- Component: component.Component{
-     ID: id,
- },
+ Component: comp.Component,
Contributor
Author
This one is needed because the state coordinator needs LastConfiguredAt to correctly handle state transitions.
Contributor
Author
Member
Thanks, agree this is a nice solution. I'll let Mikolaj do the approving.
Contributor
💛 Build succeeded, but was flaky
What does this PR do?
When transitioning from the otel runtime to the process runtime, if an otel component takes too long to stop, it will emit its Stopped state only after the timeout expires. By this time, the process runtime will have already reported a Starting state.

Upon receiving the Stopped state from the old runtime, we erroneously remove the new Starting state.

This PR fixes the flow by introducing a new LastCreatedAt variable for a component. We only process a state update when it comes either from the same instance of the component or from a newer instance.

Why is it important?
Buggy scenario:
1. c is created at time=0s.
2. The otel runtime reports a Starting state. We will report this state as it's the first state for this component.
3. The component is then recreated with process mode and time=1s. It also reports a Starting state for the given component.
4. The old otel component finally emits its Stopped state.
5. The coordinator receives the Stopped event and erroneously removes the new component from the status map.

After the PR:
1. c is created at startTime=0s.
2. The otel runtime reports a Starting state. We will report this state as it's the first state for this component.
3. The component is then recreated with process mode and startTime=1s. It also reports a Starting state for the given component.
4. The old otel component finally emits its Stopped state.
5. The coordinator receives the Stopped event but ignores it, since the stored startTime of the current component is later than that of the received event.

Checklist
- ./changelog/fragments using the changelog tool

How to test this PR locally
Related issues