
Fix targets registering twice on load balancers #1929

Merged: mysticaltech merged 1 commit into mysticaltech:fix/staging-review-2026-01-11 from mrjoe7:fix-double-target on Jan 11, 2026

Conversation

@mrjoe7 (Contributor) commented Oct 9, 2025

Fix for #1900

@gemini-code-assist (Bot) commented

Summary of Changes

Hello @mrjoe7, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an issue where load balancer targets were being registered redundantly. By refining the label selection logic for the hcloud_load_balancer_target resource, this change ensures that only the intended control plane nodes are targeted, resolving the duplicate registration problem and improving the accuracy of load balancer configurations.

Highlights

  • Load Balancer Target Registration: Modified the hcloud_load_balancer_target resource configuration to prevent targets from registering twice on load balancers by refining the label selection logic.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature               Command               Description
Code Review           /gemini review        Performs a code review for the current pull request in its current state.
Pull Request Summary  /gemini summary       Provides a summary of the current pull request in its current state.
Comment               @gemini-code-assist   Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help                  /gemini help          Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist (Bot) left a comment

Code Review

This pull request aims to fix an issue where load balancer targets are registered twice. The change modifies the label selector for the cluster load balancer target to only include control-plane nodes, preventing conflicts with the Hetzner Cloud Controller Manager (CCM) which manages agent node targets. While this is a good step, the fix is incomplete for cases where scheduling on control-plane nodes is enabled. I've suggested a more robust solution to conditionally add control-plane nodes as targets, which should fully resolve the duplication issue in all configurations.

Comment thread init.tf
Comment on lines 57 to 60
```diff
 [
-  # Generic label merge from control plane and agent namespaces with "or",
-  # resulting in: role in (control_plane_node,agent_node)
-  for key in keys(merge(local.labels_control_plane_node, local.labels_agent_node)) :
-  "${key} in (${
-    join(",", compact([
-      for labels in [local.labels_control_plane_node, local.labels_agent_node] :
-      try(labels[key], "")
-    ]))
-  })"
+  for key, value in local.labels_control_plane_node :
+  "${key}=${value}"
 ]
```
@gemini-code-assist (Bot):
Severity: high

This change correctly removes agent nodes from the Terraform-managed targets, fixing the 'registering twice' issue when CCM is active. However, it always adds control-plane nodes.

If var.allow_scheduling_on_control_plane is set to true, ingress controller pods can be scheduled on control-plane nodes. In this case, the Hetzner CCM will add them as targets, and this Terraform resource will also add them, re-introducing the 'registering twice' problem for control-plane nodes.

To make the fix more robust, we should only add control-plane nodes via Terraform when they are not expected to be managed by CCM. This happens when scheduling on them is disabled (the default behavior).

This suggestion makes the addition of control-plane nodes conditional on local.allow_scheduling_on_control_plane.

```hcl
    (
      local.allow_scheduling_on_control_plane
      ? []
      : [
        for key, value in local.labels_control_plane_node :
        "${key}=${value}"
      ]
    )
```
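Putting the suggestion in context, the full target-list expression might then read roughly as follows. This is a sketch reconstructed from the snippets quoted in this thread (`join(",", concat(...))` and `local.labels` appear in them); the exact module code may differ:

```hcl
label_selector = join(",", concat(
  # Always include the generic cluster labels.
  [for k, v in local.labels : "${k}=${v}"],
  # Add control-plane nodes only when the CCM is not expected to manage
  # them, i.e. when scheduling on control planes is disabled.
  (
    local.allow_scheduling_on_control_plane
    ? []
    : [
      for key, value in local.labels_control_plane_node :
      "${key}=${value}"
    ]
  )
))
```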

@mysticaltech (Owner):
@mrjoe7 FYI

@mrjoe7 (Author):
Control plane nodes are explicitly excluded from the load balancer here, so they won’t be registered twice.

@mysticaltech (Owner):
@mrjoe7 I see, but you did not address the specific concern. Sometimes we want the control planes to be included as targets when var.allow_scheduling_on_control_plane is true.

@mrjoe7 (Author):

@mysticaltech I was thinking about this, and my question is: should we even keep using the LB label selector?
To explain: membership in the LB Cloud Server group is controlled by node.kubernetes.io/exclude-from-external-load-balancers=true in the default_control_plane_labels. We then combine this with the label selector mechanism, effectively using two methods to achieve the same result, which leads to the LB registering nodes twice. Maybe we should rely solely on node labels to control the Hetzner LB targets?
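For context, the two overlapping mechanisms could be sketched roughly like this (the resource and selector values below are illustrative, not the module's actual code):

```hcl
# Mechanism 1: Terraform attaches targets directly via a Hetzner Cloud
# label selector matching labels on the servers themselves.
resource "hcloud_load_balancer_target" "cluster" {
  type             = "label_selector"
  load_balancer_id = hcloud_load_balancer.cluster.id # hypothetical resource name
  label_selector   = "role=control_plane_node"       # illustrative selector
}

# Mechanism 2: the Hetzner CCM registers every schedulable node as a target
# automatically, unless the node carries the well-known Kubernetes label:
#
#   node.kubernetes.io/exclude-from-external-load-balancers=true
#
# When both mechanisms match the same server, it ends up registered twice.
```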

@mysticaltech (Owner):
You have a point. I need to go back to how those label selectors came about in the codebase in the first place to really get to the bottom of it; this can be done by exploring the git history (with something like Claude Code). I will give it a shot ASAP, but don't hesitate to do the same.

@mrjoe7 (Author):

@mysticaltech have you had any success figuring out why it was designed this way?

@mysticaltech (Owner):

@codex review please, and let me know whether or not it is backward compatible.

@chatgpt-codex-connector (Bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread init.tf
Comment on lines 55 to 60
```diff
 label_selector = join(",", concat(
   [for k, v in local.labels : "${k}=${v}"],
   [
-    # Generic label merge from control plane and agent namespaces with "or",
-    # resulting in: role in (control_plane_node,agent_node)
-    for key in keys(merge(local.labels_control_plane_node, local.labels_agent_node)) :
-    "${key} in (${
-      join(",", compact([
-        for labels in [local.labels_control_plane_node, local.labels_agent_node] :
-        try(labels[key], "")
-      ]))
-    })"
+    for key, value in local.labels_control_plane_node :
+    "${key}=${value}"
   ]
```

P0: Cluster load balancer no longer targets agent nodes

The new label selector now appends only local.labels_control_plane_node, yielding role=control_plane_node and filtering out all agent nodes. Previously the selector matched both control plane and agent roles so the external load balancer attached to every worker node running ingress. After this change, multi-node clusters will register only control-plane servers and ingress traffic will bypass agents, which is a breaking change relative to prior behaviour.
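Concretely, assuming the role labels the comment refers to, the selector string sent to Hetzner changes roughly like this (illustrative values):

```hcl
# Before: one "in" expression matching both roles, so the LB targets
# control-plane and agent servers alike:
#   label_selector = "role in (control_plane_node,agent_node)"
#
# After: a plain equality match, so only control-plane servers remain:
#   label_selector = "role=control_plane_node"
```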

Useful? React with 👍 / 👎.

@mysticaltech (Owner):

@mrjoe7 What do you think?

@mrjoe7 (Author):

Yes, this is a change in behaviour, but I don't see any issue with it, as Kubernetes will automatically register the nodes on the load balancer. I have been testing this on my cluster for a month without any issues.

@mrjoe7 (Author):

But speaking of "no issues"… I tried changing the load balancer's network settings and noticed it dropped all the targets from the LB. The targets matched by a label were added back automatically, but the ones registered by the Kubernetes CCM were not. I waited maybe a minute before manually adding the missing targets. Maybe they would have eventually been re-added automatically, but that would have caused an unnecessary outage and I didn't want to wait.

So maybe we should keep the tags, but also add node.kubernetes.io/exclude-from-external-load-balancers=true to the worker nodes?

What do you think @mysticaltech ?

@mrjoe7 (Author):

I took the liberty to update my PR to match the proposal above.

Now my LB looks like this:
[screenshot of the load balancer targets]

@mysticaltech (Owner):

Can't merge this until it's all fixed, please.

@mrjoe7 force-pushed the fix-double-target branch 2 times, most recently from 78cca11 to d45cc7b on November 14, 2025 at 19:33
@mrjoe7 (Author) commented Nov 19, 2025

/gemini review

@gemini-code-assist (Bot) left a comment

Code Review

This pull request addresses an issue where load balancer targets were being registered twice. The fix involves adding the node.kubernetes.io/exclude-from-external-load-balancers=true label to agent nodes when a Hetzner Cloud Load Balancer is in use. This prevents the Cloud Controller Manager from automatically adding the nodes, which resolves the double registration problem. The implementation is correct, but I've pointed out a confusing inline comment that should be updated for better clarity and future maintainability.

Comment thread locals.tf (outdated)
@mrjoe7 (Author) commented Nov 19, 2025

/gemini review

@gemini-code-assist (Bot) left a comment

Code Review

This pull request aims to fix an issue with load balancer targets being registered twice by conditionally adding an exclusion label to agent nodes. However, the implemented logic appears to be inverted. It would exclude agent nodes from being load balancer targets precisely when they are needed for ingress traffic, which would likely break the feature. I've left a critical review comment with a suggested code change to correct this logic. I also recommend renaming a confusingly named local variable for better code clarity in a future change.

Comment thread locals.tf
@mrjoe7 (Author) commented Nov 19, 2025

/gemini review

@gemini-code-assist (Bot) left a comment

Code Review

This pull request addresses an issue where load balancer targets were being registered twice. The change modifies default_agent_labels in locals.tf to conditionally add the node.kubernetes.io/exclude-from-external-load-balancers=true label to agent nodes. This label is added when the cluster is configured to use an external Hetzner Load Balancer (i.e., when local.has_external_load_balancer is false).

My analysis suggests this is likely a workaround to prevent k3s's built-in servicelb from interfering with the Hetzner Cloud Controller Manager (CCM), which would explain the "registering twice" symptom. By adding this label, servicelb (if not fully disabled for some reason) would be instructed to ignore these agent nodes, leaving only the CCM to manage them as load balancer targets.

The implementation appears correct and the logic is consistent across different cluster configurations (external LB, klipper/servicelb, single-node). The code change is well-contained and should effectively resolve the described problem.
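Based on the review's description, the merged change can be sketched along these lines. The local and variable names, the label format, and the condition's exact shape are assumptions inferred from the review text, not the module's actual code:

```hcl
locals {
  # Per the review: add the exclusion label to agent nodes when
  # local.has_external_load_balancer is false, so that agents are not
  # registered both by the Terraform label selector and by the load
  # balancer controller.
  default_agent_labels = concat(
    local.has_external_load_balancer ? [] : [
      "node.kubernetes.io/exclude-from-external-load-balancers=true"
    ],
    var.extra_agent_labels # hypothetical input for user-supplied labels
  )
}
```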

@mysticaltech mysticaltech changed the base branch from master to fix/staging-review-2026-01-11 January 11, 2026 18:07
@mysticaltech mysticaltech merged commit 06194d6 into mysticaltech:fix/staging-review-2026-01-11 Jan 11, 2026