Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request. Searching for pre-existing feature requests helps us consolidate datapoints for identical requirements into a single place, thank you!
- Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.
- If you are interested in working on this issue or have submitted a pull request, please leave a comment.
Overview of the Issue
Consul clients resolve the DNS names in the retry_join configuration only once during startup and cache the resulting IP addresses indefinitely. When Consul servers get new IP addresses (common in Kubernetes environments during node upgrades, pod restarts, etc.), even newly restarted client pods keep attempting connections to the cached stale IP addresses and never retry DNS resolution, causing permanent connection failures.
This creates a critical issue in cloud-native Kubernetes deployments, where pod IP changes are routine operations. When servers restart and get new IPs, client pods that restart simultaneously (or shortly after) still end up with the old cached addresses and get stuck in their init containers indefinitely, even though they perform fresh DNS lookups during their startup process.
Root Cause: Consul's retry_join mechanism performs DNS resolution only at startup and caches resolved IPs permanently, without implementing DNS re-resolution on connection failures. This affects both existing clients and newly started clients that may still resolve stale DNS entries due to DNS caching layers.
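A quick way to observe the mismatch on an affected agent is to compare a fresh lookup from inside the pod with the addresses the agent keeps dialing in its logs. This is only a sketch; the pod name consul-client-xyz is a placeholder and the log wording may differ between Consul versions:
# Fresh lookup from inside the client pod returns the current server IP
kubectl exec -it consul-client-xyz -n consul -- nslookup consul-server-0.consul.example.com.
# The same agent's log still shows dial attempts against the previously resolved IP
kubectl logs consul-client-xyz -n consul | grep "failed to join"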
Reproduction Steps
- Deploy a Consul cluster using the official Helm chart with the following configuration (a consolidated command sketch for these steps follows the list):
global:
  name: consul
  datacenter: dc1
  domain: consul.example.com
  enabled: true
  logLevel: debug
server:
  enabled: true
  replicas: 3
  bootstrapExpect: 3
  extraConfig: |
    {
      "retry_join": [
        "consul-server-0.consul.example.com.",
        "consul-server-1.consul.example.com.",
        "consul-server-2.consul.example.com."
      ]
    }
client:
  enabled: true
- Observe initial startup - clients successfully resolve DNS and join the cluster:
consul-server-0.consul.example.com. → 10.1.2.10
consul-server-1.consul.example.com. → 10.1.2.11
consul-server-2.consul.example.com. → 10.1.2.12
- Trigger a server StatefulSet restart (simulating a node upgrade or maintenance):
kubectl rollout restart statefulset/consul-server -n consul
- New server pods get different IP addresses:
consul-server-0 → 10.1.5.20
consul-server-1 → 10.1.5.21
consul-server-2 → 10.1.5.22
- Restart the client pods (simulating a simultaneous restart during maintenance):
kubectl delete pods -l app=consul,component=client -n consul
- Verify that DNS resolution works correctly from the new client pods:
kubectl exec -it consul-client-xyz -n consul -- nslookup consul-server-0.consul.example.com.
# Returns: 10.1.5.20 (new correct IP)
- Issue: even freshly restarted client pods continue attempting connections to the old cached IP addresses (10.1.2.10, 10.1.2.11, 10.1.2.12) instead of the newly resolved IPs
- Client pods remain stuck in init containers indefinitely, never performing fresh DNS resolution during retry attempts
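The following is a consolidated sketch of the commands behind the steps above. The release name, namespace, and label selectors are assumptions based on the chart defaults used elsewhere in this report and may need adjusting:
# Install the chart with the values shown above
helm repo add hashicorp https://helm.releases.hashicorp.com
helm install consul hashicorp/consul --namespace consul --create-namespace --values values.yaml
# After the rollout restart: confirm the server pods came back with new IP addresses
kubectl get pods -n consul -l app=consul,component=server -o wide
# After deleting the client pods: the restarted clients stay stuck in their init phase
kubectl get pods -n consul -l app=consul,component=client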
Logs
Freshly Restarted Client Logs During Issue
[DEBUG] agent: Starting Consul agent (fresh restart)
[INFO] agent: Consul agent running!
[DEBUG] agent: Retry join is supported for the following discovery methods: cluster_addr, aliyun, aws, azure, digitalocean, gce, k8s, linode, mdns, os, scaleway, triton, vsphere
[INFO] agent: Joining cluster...
[DEBUG] agent: (LAN) joining: [consul-server-0.consul.example.com.:8301 consul-server-1.consul.example.com.:8301 consul-server-2.consul.example.com.:8301]
# Initial DNS resolution during startup - still resolving to OLD IPs
[DEBUG] agent: Resolved consul-server-0.consul.example.com.:8301 to 10.1.2.10:8301
[DEBUG] agent: Resolved consul-server-1.consul.example.com.:8301 to 10.1.2.11:8301
[DEBUG] agent: Resolved consul-server-2.consul.example.com.:8301 to 10.1.2.12:8301
# Connection attempts to cached OLD IPs (servers no longer exist at these addresses)
[ERROR] agent: failed to join: error="dial tcp 10.1.2.10:8301: connect: connection refused" address=10.1.2.10:8301
[ERROR] agent: failed to join: error="dial tcp 10.1.2.11:8301: connect: connection refused" address=10.1.2.11:8301
[ERROR] agent: failed to join: error="dial tcp 10.1.2.12:8301: connect: connection refused" address=10.1.2.12:8301
[WARN] agent: Join failed: error="3 errors occurred:\n\t* dial tcp 10.1.2.10:8301: connection refused\n\t* dial tcp 10.1.2.11:8301: connection refused\n\t* dial tcp 10.1.2.12:8301: connection refused"
# Retry attempts - NO DNS re-resolution, continues with same cached IPs
[DEBUG] agent: (LAN) joining: [consul-server-0.consul.example.com.:8301 consul-server-1.consul.example.com.:8301 consul-server-2.consul.example.com.:8301]
[ERROR] agent: failed to join: error="dial tcp 10.1.2.10:8301: connect: connection refused" address=10.1.2.10:8301
[ERROR] agent: failed to join: error="dial tcp 10.1.2.11:8301: connect: connection refused" address=10.1.2.11:8301
[ERROR] agent: failed to join: error="dial tcp 10.1.2.12:8301: connect: connection refused" address=10.1.2.12:8301
# Pattern repeats indefinitely - DNS names shown in logs but IPs never re-resolved
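For reference, the agent output above can be followed with something along these lines (pod and container names are placeholders; the exact log wording can vary between Consul versions):
kubectl logs -f consul-client-xyz -n consul -c consul | grep -E "joining|Resolved|failed to join"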
DNS Verification Shows Correct Resolution Available
# Manual DNS lookup from same freshly restarted client pod shows correct new IPs:
$ kubectl exec -it consul-client-xyz -n consul -- nslookup consul-server-0.consul.example.com.
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: consul-server-0.consul.example.com.
Address 1: 10.1.5.20 consul-server-0.consul-server.consul.svc.cluster.local
$ kubectl exec -it consul-client-xyz -n consul -- nslookup consul-server-1.consul.example.com.
Name: consul-server-1.consul.example.com.
Address 1: 10.1.5.21 consul-server-1.consul-server.consul.svc.cluster.local
$ kubectl exec -it consul-client-xyz -n consul -- nslookup consul-server-2.consul.example.com.
Name: consul-server-2.consul.example.com.
Address 1: 10.1.5.22 consul-server-2.consul-server.consul.svc.cluster.local
# But client continues trying old cached IPs 10.1.2.x
# DNS resolution works perfectly - the issue is Consul's caching behavior
Expected behavior
- Consul clients should retry DNS resolution when connection attempts fail
- When retry_join contains DNS names, DNS lookups should be performed on each retry attempt or periodically
- Clients should automatically discover new server IP addresses without manual intervention
- Init containers should eventually succeed when servers become available at new IPs
- DNS TTL settings should be respected for re-resolution timing
Environment details
- consul-k8s version: 1.5.5 (Chart version 1.5.5)
- Kubernetes version: v1.28.x
- Cloud Provider: Google Kubernetes Engine (GKE)
- Networking CNI plugin: Default GKE networking
Complete values.yaml:
client:
  enabled: true
connectInject:
  enabled: false
dns:
  enabled: true
global:
  acls:
    bootstrapToken:
      secretKey: token
      secretName: consul-bootstrap
    manageSystemACLs: false
  datacenter: dc1
  domain: consul.example.com
  enabled: true
  gossipEncryption:
    autoGenerate: false
    secretKey: key
    secretName: consul-gossip
  logLevel: debug
  metrics:
    agentMetricsRetentionTime: 1h
    enableAgentMetrics: true
    enabled: true
  name: consul
  tls:
    caCert:
      secretKey: caCert
      secretName: consul-federation
    caKey:
      secretKey: caKey
      secretName: consul-federation
    enableAutoEncrypt: true
    enabled: true
    httpsOnly: false
    serverAdditionalDNSSANs:
      - consul.example.com.
      - consul-ui.example.com.
      - consul-server-join.example.com.
      - server.dc1.consul.example.com
    serverAdditionalIPSANs: []
    verify: true
server:
  bootstrapExpect: 3
  connect: false
  enabled: true
  extraConfig: |
    {
      "retry_join": [
        "consul-server-0.consul.example.com.",
        "consul-server-1.consul.example.com.",
        "consul-server-2.consul.example.com."
      ],
      "limits": {
        "http_max_conns_per_client": 1000,
        "rpc_max_conns_per_client": 1000
      }
    }
  replicas: 3
  resources:
    limits:
      cpu: 1000m
      memory: 4Gi
    requests:
      cpu: 1000m
      memory: 4Gi
  service:
    additionalSpec: |
      publishNotReadyAddresses: true
    annotations: |
      "external-dns.alpha.kubernetes.io/dns-zone": "internal"
      "external-dns.alpha.kubernetes.io/hostname": "consul.example.com."
      "external-dns.alpha.kubernetes.io/ttl": "60"
ui:
  enabled: true
  ingress:
    annotations: |
      'cert-manager.io/cluster-issuer': 'letsencrypt'
      'external-dns.alpha.kubernetes.io/dns-zone': 'external'
      'external-dns.alpha.kubernetes.io/hostname': 'consul-ui.example.com'
      'kubernetes.io/ingress.class': 'gce'
      'networking.gke.io/v1beta1.FrontendConfig': 'frontend-config-consul'
      'kubernetes.io/ingress.allow-http': 'false'
    enabled: true
    hosts:
      - host: consul-ui.example.com
        paths:
          - /
          - /*
    pathType: ImplementationSpecific
    tls:
      - hosts:
          - consul-ui.example.com
        secretName: consul-ingress-cert
  metrics:
    enabled: false
  service:
    annotations: |
      'beta.cloud.google.com/backend-config': '{"default":"consul-backend-config"}'
      'cloud.google.com/app-protocols': '{"https":"HTTPS", "http":"HTTP"}'
      'cloud.google.com/neg': '{"ingress":true}'
    type: ClusterIP
Additional Context
Impact on Production Operations:
This issue severely impacts routine Kubernetes operations:
- ✅ GKE Node Upgrades - Automatic maintenance causes pod IP changes
- ✅ Pod Evictions - Resource pressure or node draining
- ✅ Rolling Updates - Server pod updates get new IPs
- ✅ Cluster Autoscaling - Node scaling operations
- ✅ StatefulSet Restarts - Maintenance operations requiring server restarts
Workarounds Attempted:
- Using IP addresses instead of DNS names - defeats the purpose of service discovery
- Manual client pod restarts once the servers are reachable again - a temporary fix (sketched below) but not sustainable for production
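For reference, the manual workaround currently amounts to waiting for the servers to come back and then deleting the stuck clients so their next startup resolves the names fresh. This is a sketch using the chart's label selectors; namespace and timeout are assumptions:
# Wait for the servers to be Ready at their new IPs, then force the clients to start over
kubectl wait --for=condition=Ready pod -l app=consul,component=server -n consul --timeout=300s
kubectl delete pods -l app=consul,component=client -n consul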
Suggested Solutions:
- Implement periodic DNS re-resolution in the retry_join logic
- Add a configuration option to control DNS cache TTL behavior
- Retry DNS resolution on connection failures
- Respect Kubernetes DNS TTL settings for service discovery (see the dig example below)
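For the TTL-related suggestions, the TTL a re-resolving client could honor is already visible on the records themselves. A dig query from any host or pod that can resolve the internal zone shows it in the second column; the output line below is illustrative only, with the 60s value taken from the external-dns TTL annotation in the values above:
dig +noall +answer consul-server-0.consul.example.com.
# consul-server-0.consul.example.com. 60 IN A 10.1.5.20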
This issue makes Consul unsuitable for production Kubernetes environments where pod IP changes are normal operations.