fix: add tolerations to Cilium Operator for cloud provider initialization taint #1882
…tion taint
Adds toleration for node.cloudprovider.kubernetes.io/uninitialized taint to the Cilium Operator deployment via Helm values. This ensures the operator can schedule during initial cluster creation when nodes are temporarily tainted by the cloud controller manager initialization process.
The taint prevents normal pods from scheduling until the cloud provider completes initialization, but the operator needs to run during this period.
Adding this toleration is safe and backward compatible - it only affects scheduling during bootstrap and doesn't force scheduling on tainted nodes if others are available.
Fixes #1879
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
Summary of Changes
Hello @mysticaltech, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request addresses a critical scheduling issue for the Cilium Operator during Kubernetes cluster initialization. Previously, the operator would remain unscheduled due to the node.cloudprovider.kubernetes.io/uninitialized taint applied by cloud provider controllers. The core change involves adding a default toleration for this specific taint to the Cilium Operator's Helm values, allowing it to bypass the temporary taint and schedule successfully during the bootstrap phase. This fix is designed to be safe, backward-compatible, and minimally intrusive, ensuring the operator can run when needed without affecting existing clusters or custom configurations.
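For orientation, the taint in question usually looks like the following on a freshly registered node when the kubelet is started for an external cloud provider. This is an illustrative sketch, not output captured from this PR: the `"true"` value and `NoSchedule` effect are the upstream Kubernetes defaults, and the node name is hypothetical.

```yaml
# Illustrative Node excerpt (not taken from this PR).
apiVersion: v1
kind: Node
metadata:
  name: example-node   # hypothetical name
spec:
  taints:
    # Added at node registration; removed by the cloud controller
    # manager once it has initialized the node.
    - key: node.cloudprovider.kubernetes.io/uninitialized
      value: "true"
      effect: NoSchedule
```

Any pod without a matching toleration stays Pending while this taint is present on every node.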
Highlights
- Cilium Operator Scheduling Fix: Added a default toleration for the `node.cloudprovider.kubernetes.io/uninitialized` taint to the Cilium Operator's Helm values.
- Cluster Initialization Compatibility: This change ensures the Cilium Operator can schedule and function correctly during the early stages of Kubernetes cluster initialization, even when nodes are temporarily tainted by the cloud provider.
Pull Request Overview
This PR fixes a scheduling issue with the Cilium Operator during cluster initialization by adding tolerations for the cloud provider initialization taint. The fix ensures the operator can schedule on nodes that are temporarily tainted during the bootstrap process.
- Adds toleration for `node.cloudprovider.kubernetes.io/uninitialized` taint to Cilium Operator
- Uses the existing Helm chart pattern with `operator.tolerations` configuration
- Provides a safe, backward-compatible default that doesn't affect existing deployments
Code Review
This pull request introduces a toleration for the node.cloudprovider.kubernetes.io/uninitialized taint to the default Cilium Helm values. This is a well-thought-out fix that ensures the Cilium Operator can be scheduled during the initial cluster bootstrap phase, which is crucial for network readiness. The implementation is correct, using the standard operator.tolerations configuration for the Cilium chart. The change is safe, backward-compatible, and minimal. I have no further suggestions as the change is excellent.
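A minimal sketch of what the resulting Helm values could look like is shown below. Only the `operator.tolerations` key, the taint key, and `operator: Exists` come from the PR text; the `effect` field is an assumption, and the actual diff may differ.

```yaml
# Sketch of the Cilium Helm values fragment described in this PR.
operator:
  tolerations:
    - key: node.cloudprovider.kubernetes.io/uninitialized
      operator: Exists      # matches the taint key regardless of its value
      effect: NoSchedule    # assumption: not stated in the PR text; omitting
                            # this field would match every effect for the key
```

With `operator: Exists`, no `value` is specified, so the toleration applies whatever value the cloud provider assigns to the taint.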
Summary
- … `node.cloudprovider.kubernetes.io/uninitialized` taint to Cilium Operator
Issue
Fixes #1879
Problem
During initial cluster creation, the Cilium Operator pod remains unscheduled due to the `node.cloudprovider.kubernetes.io/uninitialized` taint applied by the cloud controller manager. This taint prevents normal pods from scheduling until cloud provider initialization completes, but the Cilium Operator needs to run during this bootstrap period.
Solution
Added default tolerations to the Cilium Operator via Helm values in the default `cilium_values` configuration. The toleration uses `operator: Exists` to handle any value the taint might have, ensuring robust scheduling across different environments.
Why This Approach
- … `cilium_values` …
- … `operator.tolerations` pattern without introducing special-case logic
Test Plan
- … `terraform fmt` to ensure proper formatting
- … `terraform plan` to verify no breaking changes
Notes
- … `cilium_values` are unaffected unless they adopt this toleration
🤖 Generated with Claude Code