Consul clients cache Kubernetes Service IP addresses and don't retry DNS resolution on connection failures (new IPs) #4657

@ChrisNoSim

Description

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request. Searching for pre-existing feature requests helps us consolidate datapoints for identical requirements into a single place, thank you!
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment.

Overview of the Issue

Consul clients resolve the DNS names in the retry_join configuration only once during startup and cache the resulting IP addresses indefinitely. When Consul servers get new IP addresses (common in Kubernetes environments during node upgrades, pod restarts, and similar events), even newly restarted client pods keep attempting connections to the stale cached IP addresses and never retry DNS resolution, causing permanent connection failures.

This is a critical problem in cloud-native Kubernetes deployments, where pod IP changes are routine. When servers restart and receive new IPs, client pods that restart at the same time (or shortly afterwards) may still resolve stale DNS entries during startup and then get stuck in their init containers indefinitely, because the agent never re-resolves the names on retry, even though a fresh DNS lookup from the same pod already returns the new addresses.

Root Cause: Consul's retry_join mechanism performs DNS resolution only at startup and caches the resolved IPs permanently, without re-resolving on connection failures. This affects both long-running clients and newly started clients that may still pick up stale entries from intermediate DNS caching layers.
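
For illustration only, here is a minimal Go sketch of the resolve-once-and-cache pattern described above. This is not Consul's actual code; the hostnames and the Serf LAN port 8301 are simply taken from this report. The names are looked up a single time, and every retry dials the cached addresses:

package main

import (
	"fmt"
	"net"
	"time"
)

func main() {
	// Hostnames from the retry_join configuration in this report.
	retryJoin := []string{
		"consul-server-0.consul.example.com.",
		"consul-server-1.consul.example.com.",
		"consul-server-2.consul.example.com.",
	}

	// DNS is resolved once at startup and the answers are kept around.
	cached := map[string][]net.IP{}
	for _, host := range retryJoin {
		ips, err := net.LookupIP(host)
		if err != nil {
			fmt.Printf("initial lookup of %s failed: %v\n", host, err)
			continue
		}
		cached[host] = ips
	}

	// Every retry dials the cached IPs; the names are never resolved again,
	// so servers that came back with new pod IPs are never found.
	for attempt := 1; attempt <= 3; attempt++ {
		for host, ips := range cached {
			for _, ip := range ips {
				addr := net.JoinHostPort(ip.String(), "8301") // Serf LAN port
				conn, err := net.DialTimeout("tcp", addr, 2*time.Second)
				if err != nil {
					fmt.Printf("attempt %d: %s (%s): %v\n", attempt, host, addr, err)
					continue
				}
				conn.Close()
			}
		}
		time.Sleep(time.Second)
	}
}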

Reproduction Steps

  1. Deploy a Consul cluster using the official Helm chart with the following configuration:
global:
  name: consul
  datacenter: dc1
  domain: consul.example.com
  enabled: true
  logLevel: debug

server:
  enabled: true
  replicas: 3
  bootstrapExpect: 3
  extraConfig: |
    {
      "retry_join": [
        "consul-server-0.consul.example.com.",
        "consul-server-1.consul.example.com.",
        "consul-server-2.consul.example.com."
      ]
    }

client:
  enabled: true
  2. Observe initial startup: clients successfully resolve DNS and join the cluster:

    • consul-server-0.consul.example.com. → 10.1.2.10
    • consul-server-1.consul.example.com. → 10.1.2.11
    • consul-server-2.consul.example.com. → 10.1.2.12
  3. Trigger a server StatefulSet restart (simulating a node upgrade or maintenance):

kubectl rollout restart statefulset/consul-server -n consul
  4. New server pods get different IP addresses:

    • consul-server-0 → 10.1.5.20
    • consul-server-1 → 10.1.5.21
    • consul-server-2 → 10.1.5.22
  5. Restart the client pods (simulating a simultaneous restart during maintenance):

kubectl delete pods -l app=consul,component=client -n consul
  6. Verify that DNS resolution works correctly from the new client pods:
kubectl exec -it consul-client-xyz -n consul -- nslookup consul-server-0.consul.example.com.
# Returns: 10.1.5.20 (new correct IP)
  7. Issue: even freshly restarted client pods keep attempting connections to the old cached IP addresses (10.1.2.10, 10.1.2.11, 10.1.2.12) instead of the newly resolved IPs.

  8. Client pods remain stuck in their init containers indefinitely and never perform fresh DNS resolution during retry attempts.

Logs

Freshly Restarted Client Logs During Issue
[DEBUG] agent: Starting Consul agent (fresh restart)
[INFO]  agent: Consul agent running!
[DEBUG] agent: Retry join is supported for the following discovery methods: cluster_addr, aliyun, aws, azure, digitalocean, gce, k8s, linode, mdns, os, scaleway, triton, vsphere
[INFO]  agent: Joining cluster...
[DEBUG] agent: (LAN) joining: [consul-server-0.consul.example.com.:8301 consul-server-1.consul.example.com.:8301 consul-server-2.consul.example.com.:8301]

# Initial DNS resolution during startup - still resolving to OLD IPs
[DEBUG] agent: Resolved consul-server-0.consul.example.com.:8301 to 10.1.2.10:8301
[DEBUG] agent: Resolved consul-server-1.consul.example.com.:8301 to 10.1.2.11:8301
[DEBUG] agent: Resolved consul-server-2.consul.example.com.:8301 to 10.1.2.12:8301

# Connection attempts to cached OLD IPs (servers no longer exist at these addresses)
[ERROR] agent: failed to join: error="dial tcp 10.1.2.10:8301: connect: connection refused" address=10.1.2.10:8301
[ERROR] agent: failed to join: error="dial tcp 10.1.2.11:8301: connect: connection refused" address=10.1.2.11:8301
[ERROR] agent: failed to join: error="dial tcp 10.1.2.12:8301: connect: connection refused" address=10.1.2.12:8301
[WARN]  agent: Join failed: error="3 errors occurred:\n\t* dial tcp 10.1.2.10:8301: connection refused\n\t* dial tcp 10.1.2.11:8301: connection refused\n\t* dial tcp 10.1.2.12:8301: connection refused"

# Retry attempts - NO DNS re-resolution, continues with same cached IPs
[DEBUG] agent: (LAN) joining: [consul-server-0.consul.example.com.:8301 consul-server-1.consul.example.com.:8301 consul-server-2.consul.example.com.:8301]
[ERROR] agent: failed to join: error="dial tcp 10.1.2.10:8301: connect: connection refused" address=10.1.2.10:8301
[ERROR] agent: failed to join: error="dial tcp 10.1.2.11:8301: connect: connection refused" address=10.1.2.11:8301
[ERROR] agent: failed to join: error="dial tcp 10.1.2.12:8301: connect: connection refused" address=10.1.2.12:8301

# Pattern repeats indefinitely - DNS names shown in logs but IPs never re-resolved

DNS Verification Shows Correct Resolution Available
# Manual DNS lookup from same freshly restarted client pod shows correct new IPs:
$ kubectl exec -it consul-client-xyz -n consul -- nslookup consul-server-0.consul.example.com.
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      consul-server-0.consul.example.com.
Address 1: 10.1.5.20 consul-server-0.consul-server.consul.svc.cluster.local

$ kubectl exec -it consul-client-xyz -n consul -- nslookup consul-server-1.consul.example.com.
Name:      consul-server-1.consul.example.com.
Address 1: 10.1.5.21 consul-server-1.consul-server.consul.svc.cluster.local

$ kubectl exec -it consul-client-xyz -n consul -- nslookup consul-server-2.consul.example.com.
Name:      consul-server-2.consul.example.com.
Address 1: 10.1.5.22 consul-server-2.consul-server.consul.svc.cluster.local

# But client continues trying old cached IPs 10.1.2.x
# DNS resolution works perfectly - the issue is Consul's caching behavior

Expected behavior

  • Consul clients should retry DNS resolution when connection attempts fail (see the sketch after this list)
  • When retry_join contains DNS names, DNS lookups should be performed on each retry attempt or periodically
  • Clients should automatically discover new server IP addresses without manual intervention
  • Init containers should eventually succeed when servers become available at new IPs
  • DNS TTL settings should be respected for re-resolution timing
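
As a rough sketch of the first and third expectations above (fresh DNS resolution on every retry attempt), here is a minimal Go example that assumes nothing about Consul's internals; the hostnames and port 8301 come from this report, and the 30-second pause mirrors Consul's documented retry_interval default:

package main

import (
	"fmt"
	"net"
	"time"
)

// tryJoin performs a fresh DNS lookup for every hostname on every attempt
// instead of reusing a cached result, then dials the resolved addresses.
func tryJoin(hosts []string) bool {
	joined := false
	for _, host := range hosts {
		ips, err := net.LookupIP(host)
		if err != nil {
			fmt.Printf("lookup %s: %v\n", host, err)
			continue
		}
		for _, ip := range ips {
			addr := net.JoinHostPort(ip.String(), "8301")
			conn, err := net.DialTimeout("tcp", addr, 2*time.Second)
			if err != nil {
				fmt.Printf("dial %s (%s): %v\n", host, addr, err)
				continue
			}
			conn.Close()
			joined = true
		}
	}
	return joined
}

func main() {
	retryJoin := []string{
		"consul-server-0.consul.example.com.",
		"consul-server-1.consul.example.com.",
		"consul-server-2.consul.example.com.",
	}
	// Because DNS is re-resolved each time, the loop converges as soon as the
	// servers become reachable at their new pod IPs.
	for !tryJoin(retryJoin) {
		time.Sleep(30 * time.Second) // comparable to Consul's retry_interval default
	}
	fmt.Println("joined")
}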

Environment details

  • consul-k8s version: 1.5.5 (Chart version 1.5.5)
  • Kubernetes version: v1.28.x
  • Cloud Provider: Google Kubernetes Engine (GKE)
  • Networking CNI plugin: Default GKE networking

Complete values.yaml:

client:
  enabled: true
connectInject:
  enabled: false
dns:
  enabled: true
global:
  acls:
    bootstrapToken:
      secretKey: token
      secretName: consul-bootstrap
    manageSystemACLs: false
  datacenter: dc1
  domain: consul.example.com
  enabled: true
  gossipEncryption:
    autoGenerate: false
    secretKey: key
    secretName: consul-gossip
  logLevel: debug
  metrics:
    agentMetricsRetentionTime: 1h
    enableAgentMetrics: true
    enabled: true
  name: consul
  tls:
    caCert:
      secretKey: caCert
      secretName: consul-federation
    caKey:
      secretKey: caKey
      secretName: consul-federation
    enableAutoEncrypt: true
    enabled: true
    httpsOnly: false
    serverAdditionalDNSSANs:
    - consul.example.com.
    - consul-ui.example.com.
    - consul-server-join.example.com.
    - server.dc1.consul.example.com
    serverAdditionalIPSANs: []
    verify: true
server:
  bootstrapExpect: 3
  connect: false
  enabled: true
  extraConfig: |
    {
      "retry_join": [
        "consul-server-0.consul.example.com.",
        "consul-server-1.consul.example.com.",
        "consul-server-2.consul.example.com."
      ],
      "limits": {
        "http_max_conns_per_client": 1000,
        "rpc_max_conns_per_client": 1000
      }
    }
  replicas: 3
  resources:
    limits:
      cpu: 1000m
      memory: 4Gi
    requests:
      cpu: 1000m
      memory: 4Gi
  service:
    additionalSpec: |
      publishNotReadyAddresses: true
    annotations: |
      "external-dns.alpha.kubernetes.io/dns-zone": "internal"
      "external-dns.alpha.kubernetes.io/hostname": "consul.example.com."
      "external-dns.alpha.kubernetes.io/ttl": "60"
ui:
  enabled: true
  ingress:
    annotations: |
      'cert-manager.io/cluster-issuer': 'letsencrypt'
      'external-dns.alpha.kubernetes.io/dns-zone': 'external'
      'external-dns.alpha.kubernetes.io/hostname': 'consul-ui.example.com'
      'kubernetes.io/ingress.class': 'gce'
      'networking.gke.io/v1beta1.FrontendConfig': 'frontend-config-consul'
      'kubernetes.io/ingress.allow-http': 'false'
    enabled: true
    hosts:
    - host: consul-ui.example.com
      paths:
      - /
      - /*
    pathType: ImplementationSpecific
    tls:
    - hosts:
      - consul-ui.example.com
      secretName: consul-ingress-cert
  metrics:
    enabled: false
  service:
    annotations: |
      'beta.cloud.google.com/backend-config': '{"default":"consul-backend-config"}'
      'cloud.google.com/app-protocols': '{"https":"HTTPS", "http":"HTTP"}'
      'cloud.google.com/neg': '{"ingress":true}'
    type: ClusterIP

Additional Context

Impact on Production Operations:
This issue severely impacts routine Kubernetes operations:

  • GKE Node Upgrades - Automatic maintenance causes pod IP changes
  • Pod Evictions - Resource pressure or node draining
  • Rolling Updates - Server pod updates get new IPs
  • Cluster Autoscaling - Node scaling operations
  • StatefulSet Restarts - Maintenance operations requiring server restarts

Workarounds Attempted:

  • Using IP addresses instead of DNS names - defeats the purpose of service discovery
  • Manually restarting client pods - a temporary fix, but not sustainable in production

Suggested Solutions:

  1. Implement periodic DNS re-resolution in retry_join logic
  2. Add configuration option to control DNS cache TTL behavior
  3. Retry DNS resolution on connection failures
  4. Respect Kubernetes DNS TTL settings for service discovery (see the TTL sketch after this list)
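
For suggestions 2 and 4, here is a rough sketch of what TTL-aware re-resolution could look like. This is not an existing Consul feature; it uses the third-party github.com/miekg/dns library (an assumption for illustration), and the resolver address 10.96.0.10 is simply the kube-dns address from the nslookup output above:

package main

import (
	"fmt"
	"time"

	"github.com/miekg/dns"
)

// resolveWithTTL queries the A records for host and returns the addresses
// together with the smallest TTL the DNS server advertised for them.
func resolveWithTTL(resolver, host string) ([]string, time.Duration, error) {
	c := new(dns.Client)
	m := new(dns.Msg)
	m.SetQuestion(dns.Fqdn(host), dns.TypeA)

	r, _, err := c.Exchange(m, resolver)
	if err != nil {
		return nil, 0, err
	}

	var ips []string
	minTTL := time.Duration(0)
	for _, ans := range r.Answer {
		if a, ok := ans.(*dns.A); ok {
			ips = append(ips, a.A.String())
			ttl := time.Duration(a.Hdr.Ttl) * time.Second
			if minTTL == 0 || ttl < minTTL {
				minTTL = ttl
			}
		}
	}
	return ips, minTTL, nil
}

func main() {
	// kube-dns address taken from the nslookup output in this report.
	const resolver = "10.96.0.10:53"
	host := "consul-server-0.consul.example.com."

	for {
		ips, ttl, err := resolveWithTTL(resolver, host)
		if err != nil || len(ips) == 0 {
			fmt.Printf("lookup %s failed (%v), retrying shortly\n", host, err)
			time.Sleep(5 * time.Second)
			continue
		}
		fmt.Printf("%s -> %v (re-resolving after TTL %s)\n", host, ips, ttl)
		// Schedule the next lookup once the advertised TTL expires instead of
		// caching the answer forever.
		time.Sleep(ttl)
	}
}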

This issue makes Consul unsuitable for production Kubernetes environments where pod IP changes are normal operations.
