fix(leader-election): exit after Leader status is lost #2236
|
okay, sorry, i should've been more clear. the leader status is never lost, and that's the problem. the controller holds on to it indefinitely (at you can keep this bit of code and then call |
|
I have just pushed an update. @acuteaura can you take a look? |
|
Hm, I hadn't considered that the process also calls Leaves you with a chance of killing the E2E suite again though, because this function can never cleanly exit now, so you may actually have to cherry-pick the changes from |
|
@Revolyssup Would this arguably unclean but more reliable fix work for you? |
|
Would be great if this fix could be merged. We are running on GKE with out-of-date updates. After each update, we have a fifty-fifty chance of needing a manual restart of the ingress-controller. |
|
Hello, Looking at the commit history, it feels like Apisix Ingress Controller is in maintenance mode. Is this a correct assessment or just deemed feature complete? |
|
this one's not correct. the correct way would be to revive #2152 and fix the e2e test. or at least cherry-pick run to have an error return value and hard exit when it's not nil so it doesn't just... give up when the server isn't available at boot. i wouldn't consider this project "dead" or "in maintenance mode", it's just very driven by individual contributors implementing what they need and some extra volunteers. if you or @wofr need this now and not someday, I'd suggest you PR it. |
|
This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 30 days if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the [email protected] list. Thank you for your contributions. |
|
This pull request/issue has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time. |
This PR is a subset of @acuteaura's PR #2152
The current problem (v1.8.2) is that whenever a controller loses its leader status, it does not exit gracefully and instead fails silently. To prevent this, an os.Exit call has been added so that the controller shuts itself down and relies on Kubernetes to bring it back up.
Type of change:
What this PR does / why we need it:
ingress-controller doesn't recover from failed sync
#1980
Pre-submission checklist: