This repository was archived by the owner on May 16, 2023. It is now read-only.
[elasticsearch] Readiness probe is failing again with 8.0.0-SNAPSHOT and default config #1443
Closed
Description
Chart version: 8.0.0-SNAPSHOT
Kubernetes version: all
Kubernetes provider: all
Helm Version: all
helm get release output:
NAME: es
LAST DEPLOYED: Fri Nov 5 17:27:14 2021
NAMESPACE: default
STATUS: deployed
REVISION: 1
USER-SUPPLIED VALUES:
null
COMPUTED VALUES:
antiAffinity: hard
antiAffinityTopologyKey: kubernetes.io/hostname
clusterHealthCheckParams: wait_for_status=green&timeout=1s
clusterName: elasticsearch
enableServiceLinks: true
envFrom: []
esConfig: {}
esJavaOpts: ""
esMajorVersion: ""
extraContainers: []
extraEnvs: []
extraInitContainers: []
extraVolumeMounts: []
extraVolumes: []
fsGroup: ""
fullnameOverride: ""
healthNameOverride: ""
hostAliases: []
httpPort: 9200
image: docker.elastic.co/elasticsearch/elasticsearch
imagePullPolicy: IfNotPresent
imagePullSecrets: []
imageTag: 8.0.0-SNAPSHOT
ingress:
  annotations: {}
  className: nginx
  enabled: false
  hosts:
  - host: chart-example.local
    paths:
    - path: /
  pathtype: ImplementationSpecific
  tls: []
initResources: {}
keystore: []
labels: {}
lifecycle: {}
masterService: ""
maxUnavailable: 1
minimumMasterNodes: 2
nameOverride: ""
networkHost: 0.0.0.0
networkPolicy:
  http:
    enabled: false
  transport:
    enabled: false
nodeAffinity: {}
nodeGroup: master
nodeSelector: {}
persistence:
  annotations: {}
  enabled: true
  labels:
    enabled: false
podAnnotations: {}
podManagementPolicy: Parallel
podSecurityContext:
  fsGroup: 1000
  runAsUser: 1000
podSecurityPolicy:
  create: false
  name: ""
  spec:
    fsGroup:
      rule: RunAsAny
    privileged: true
    runAsUser:
      rule: RunAsAny
    seLinux:
      rule: RunAsAny
    supplementalGroups:
      rule: RunAsAny
    volumes:
    - secret
    - configMap
    - persistentVolumeClaim
    - emptyDir
priorityClassName: ""
protocol: http
rbac:
  automountToken: true
  create: false
  serviceAccountAnnotations: {}
  serviceAccountName: ""
readinessProbe:
  failureThreshold: 3
  initialDelaySeconds: 10
  periodSeconds: 10
  successThreshold: 3
  timeoutSeconds: 5
replicas: 3
resources:
  limits:
    cpu: 1000m
    memory: 2Gi
  requests:
    cpu: 1000m
    memory: 2Gi
roles:
- master
- data
- data_content
- data_hot
- data_warm
- data_cold
- ingest
- ml
- remote_cluster_client
- transform
schedulerName: ""
secret:
  enabled: true
  password: ""
secretMounts: []
securityContext:
  capabilities:
    drop:
    - ALL
  runAsNonRoot: true
  runAsUser: 1000
service:
  annotations: {}
  enabled: true
  externalTrafficPolicy: ""
  httpPortName: http
  labels: {}
  labelsHeadless: {}
  loadBalancerIP: ""
  loadBalancerSourceRanges: []
  nodePort: ""
  transportPortName: transport
  type: ClusterIP
sysctlInitContainer:
  enabled: true
sysctlVmMaxMapCount: 262144
terminationGracePeriod: 120
tests:
  enabled: true
tolerations: []
transportPort: 9300
updateStrategy: RollingUpdate
volumeClaimTemplate:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi
HOOKS:
---
# Source: elasticsearch/templates/test/test-elasticsearch-health.yaml
apiVersion: v1
kind: Pod
metadata:
  name: "es-qzanl-test"
  annotations:
    "helm.sh/hook": test
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  securityContext:
    fsGroup: 1000
    runAsUser: 1000
  containers:
  - name: "es-novrx-test"
    image: "docker.elastic.co/elasticsearch/elasticsearch:8.0.0-SNAPSHOT"
    imagePullPolicy: "IfNotPresent"
    command:
    - "sh"
    - "-c"
    - |
      #!/usr/bin/env bash -e
      curl -XGET --fail 'elasticsearch-master:9200/_cluster/health?wait_for_status=green&timeout=1s'
  restartPolicy: Never
MANIFEST:
---
# Source: elasticsearch/templates/poddisruptionbudget.yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: "elasticsearch-master-pdb"
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: "elasticsearch-master"
---
# Source: elasticsearch/templates/secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: elasticsearch-master-credentials
  labels:
    heritage: "Helm"
    release: "es"
    chart: "elasticsearch"
    app: "elasticsearch-master"
type: Opaque
data:
  username: ZWxhc3RpYw==
  password: "dDFIa1VCTkxIRTE0VkdyRQ=="
---
# Source: elasticsearch/templates/service.yaml
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch-master
  labels:
    heritage: "Helm"
    release: "es"
    chart: "elasticsearch"
    app: "elasticsearch-master"
  annotations:
    {}
spec:
  type: ClusterIP
  selector:
    release: "es"
    chart: "elasticsearch"
    app: "elasticsearch-master"
  ports:
  - name: http
    protocol: TCP
    port: 9200
  - name: transport
    protocol: TCP
    port: 9300
---
# Source: elasticsearch/templates/service.yaml
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch-master-headless
  labels:
    heritage: "Helm"
    release: "es"
    chart: "elasticsearch"
    app: "elasticsearch-master"
  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
spec:
  clusterIP: None # This is needed for statefulset hostnames like elasticsearch-0 to resolve
  # Create endpoints also if the related pod isn't ready
  publishNotReadyAddresses: true
  selector:
    app: "elasticsearch-master"
  ports:
  - name: http
    port: 9200
  - name: transport
    port: 9300
---
# Source: elasticsearch/templates/statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch-master
  labels:
    heritage: "Helm"
    release: "es"
    chart: "elasticsearch"
    app: "elasticsearch-master"
  annotations:
    esMajorVersion: "8"
spec:
  serviceName: elasticsearch-master-headless
  selector:
    matchLabels:
      app: "elasticsearch-master"
  replicas: 3
  podManagementPolicy: Parallel
  updateStrategy:
    type: RollingUpdate
  volumeClaimTemplates:
  - metadata:
      name: elasticsearch-master
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 30Gi
  template:
    metadata:
      name: "elasticsearch-master"
      labels:
        release: "es"
        chart: "elasticsearch"
        app: "elasticsearch-master"
      annotations:
    spec:
      securityContext:
        fsGroup: 1000
        runAsUser: 1000
      automountServiceAccountToken: true
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - "elasticsearch-master"
            topologyKey: kubernetes.io/hostname
      terminationGracePeriodSeconds: 120
      volumes:
      enableServiceLinks: true
      initContainers:
      - name: configure-sysctl
        securityContext:
          runAsUser: 0
          privileged: true
        image: "docker.elastic.co/elasticsearch/elasticsearch:8.0.0-SNAPSHOT"
        imagePullPolicy: "IfNotPresent"
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        resources:
          {}
      containers:
      - name: "elasticsearch"
        securityContext:
          capabilities:
            drop:
            - ALL
          runAsNonRoot: true
          runAsUser: 1000
        image: "docker.elastic.co/elasticsearch/elasticsearch:8.0.0-SNAPSHOT"
        imagePullPolicy: "IfNotPresent"
        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - |
              #!/usr/bin/env bash -e
              # Exit if ELASTIC_PASSWORD is unset
              if [ -z "${ELASTIC_PASSWORD}" ]; then
                echo "ELASTIC_PASSWORD variable is missing, exiting"
                exit 1
              fi
              # If the node is starting up wait for the cluster to be ready (request params: "wait_for_status=green&timeout=1s" )
              # Once it has started only check that the node itself is responding
              START_FILE=/tmp/.es_start_file
              # Disable nss cache to avoid filling dentry cache when calling curl
              # This is required with Elasticsearch Docker using nss < 3.52
              export NSS_SDB_USE_CACHE=no
              http () {
                local path="${1}"
                local args="${2}"
                set -- -XGET -s
                if [ "$args" != "" ]; then
                  set -- "$@" $args
                fi
                set -- "$@" -u "elastic:${ELASTIC_PASSWORD}"
                curl --output /dev/null -k "$@" "http://127.0.0.1:9200${path}"
              }
              if [ -f "${START_FILE}" ]; then
                echo 'Elasticsearch is already running, let us check the node is healthy'
                HTTP_CODE=$(http "/" "-w %{http_code}")
                RC=$?
                if [[ ${RC} -ne 0 ]]; then
                  echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' \${BASIC_AUTH} http://127.0.0.1:9200/ failed with RC ${RC}"
                  exit ${RC}
                fi
                # ready if HTTP code 200, 503 is tolerable if ES version is 6.x
                if [[ ${HTTP_CODE} == "200" ]]; then
                  exit 0
                elif [[ ${HTTP_CODE} == "503" && "8" == "6" ]]; then
                  exit 0
                else
                  echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' \${BASIC_AUTH} http://127.0.0.1:9200/ failed with HTTP code ${HTTP_CODE}"
                  exit 1
                fi
              else
                echo 'Waiting for elasticsearch cluster to become ready (request params: "wait_for_status=green&timeout=1s" )'
                if http "/_cluster/health?wait_for_status=green&timeout=1s" "--fail" ; then
                  touch ${START_FILE}
                  exit 0
                else
                  echo 'Cluster is not yet ready (request params: "wait_for_status=green&timeout=1s" )'
                  exit 1
                fi
              fi
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 3
          timeoutSeconds: 5
        ports:
        - name: http
          containerPort: 9200
        - name: transport
          containerPort: 9300
        resources:
          limits:
            cpu: 1000m
            memory: 2Gi
          requests:
            cpu: 1000m
            memory: 2Gi
        env:
        - name: node.name
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: cluster.initial_master_nodes
          value: "elasticsearch-master-0,elasticsearch-master-1,elasticsearch-master-2,"
        - name: node.roles
          value: "master,data,data_content,data_hot,data_warm,data_cold,ingest,ml,remote_cluster_client,transform,"
        - name: discovery.seed_hosts
          value: "elasticsearch-master-headless"
        - name: cluster.name
          value: "elasticsearch"
        - name: network.host
          value: "0.0.0.0"
        - name: ELASTIC_PASSWORD
          valueFrom:
            secretKeyRef:
              name: elasticsearch-master-credentials
              key: password
        volumeMounts:
        - name: "elasticsearch-master"
          mountPath: /usr/share/elasticsearch/data
NOTES:
1. Watch all cluster members come up.
  $ kubectl get pods --namespace=default -l app=elasticsearch-master -w
2. Retrieve elastic user's password.
  $ kubectl get secrets --namespace=default elasticsearch-master-credentials -ojsonpath='{.data.password}' | base64 -d
3. Test cluster health using Helm test.
  $ helm --namespace=default test es
Describe the bug:
When using 8.0.0-SNAPSHOT with the default values (default Elasticsearch config, security not explicitly configured), the Elasticsearch chart fails to deploy: the pods never reach the ready state because the readiness probe keeps failing:
$ kubectl describe pod elasticsearch-master-0
...
Normal Started 116s (x3 over 3m37s) kubelet Started container elasticsearch
Warning Unhealthy 76s (x11 over 3m26s) kubelet Readiness probe failed: Waiting for elasticsearch cluster to become ready (request params: "wait_for_status=green&timeout=1s" )
Cluster is not yet ready (request params: "wait_for_status=green&timeout=1s" )
Warning BackOff 72s (x2 over 2m8s) kubelet Back-off restarting failed container
This seems to be related to the new behavior where Elasticsearch generates TLS certificates by default: elastic/elasticsearch#77231
cc @elastic/es-delivery @jkakavas @mgreau @nkammah
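If that is indeed the cause, the bootstrap check message quoted under "Provide logs" below offers two ways out. The quicker workaround is its second option, disabling security explicitly. A minimal, untested values sketch, assuming the chart still renders esConfig into elasticsearch.yml as in previous releases:

esConfig:
  elasticsearch.yml: |
    xpack.security.enabled: false

(The first option, enabling transport TLS, is sketched after the logs below.)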
Steps to reproduce:
- Try deploy Elasticsearch chart with default values from master branch
$ cd helm-charts
$ git checkout master
$ helm install es ./elasticsearch
- That's all
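To confirm the failure mode, the probe events and the Elasticsearch output can be pulled with standard kubectl commands (this is how the output quoted above and below was gathered):
  $ kubectl describe pod elasticsearch-master-0
  $ kubectl logs elasticsearch-master-0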
Expected behavior: The readiness probe should succeed.
Provide logs and/or server output (if relevant):
Elasticsearch logs
ERROR: [1] bootstrap checks failed. You must address the points described in the following [1] lines before starting Elasticsearch.
bootstrap check failure [1] of [1]: Transport SSL must be enabled if security is enabled. Please set [xpack.security.transport.ssl.enabled] to [true] or disable security by setting [xpack.security.enabled] to [false]
ERROR: Elasticsearch did not exit normally - check the logs at /usr/share/elasticsearch/logs/elasticsearch.log
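The bootstrap check's first option would be to enable transport TLS instead of disabling security. A hedged elasticsearch.yml sketch, assuming a PKCS#12 certificate has been generated with elasticsearch-certutil and mounted into the config directory (e.g. via the chart's secretMounts; the certs/elastic-certificates.p12 path is illustrative, not something the chart provides by default):

xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: certs/elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: certs/elastic-certificates.p12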
Any additional context: