Merged
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -19,7 +19,9 @@ All notable changes to this project will be documented in this file.
- Add Listener support ([#17]).
- Make the environment variables `OPENSEARCH_HOME` and `OPENSEARCH_PATH_CONF` overridable, so that
images can be used which have a different directory structure than the Stackable image ([#18]).
- Add Prometheus labels and annotations to role-group services ([#26]).

[#10]: https://github.com/stackabletech/opensearch-operator/pull/10
[#17]: https://github.com/stackabletech/opensearch-operator/pull/17
[#18]: https://github.com/stackabletech/opensearch-operator/pull/18
[#26]: https://github.com/stackabletech/opensearch-operator/pull/26
151 changes: 151 additions & 0 deletions docs/modules/opensearch/pages/usage-guide/monitoring.adoc
@@ -0,0 +1,151 @@
= Monitoring
:description: Use Prometheus to monitor OpenSearch

OpenSearch clusters can be monitored with Prometheus; see also the general xref:operators:monitoring.adoc[] page.
The Prometheus metrics are exposed on the HTTP port 9200 at the path `/_prometheus/metrics`.

The role-group Services carry the corresponding Prometheus labels and annotations:

[source,yaml]
----
---
apiVersion: v1
kind: Service
metadata:
  name: opensearch-nodes-default-headless
  labels:
    prometheus.io/scrape: "true"
  annotations:
    prometheus.io/path: /_prometheus/metrics
    prometheus.io/port: "9200"
    prometheus.io/scheme: https
    prometheus.io/scrape: "true"
----
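
As a rough illustration of how such annotations could be derived, the following Rust sketch builds the four annotation values shown above from a TLS flag and a port. The function name `prometheus_annotations` is hypothetical, and the rule that the scheme follows the TLS setting of the HTTP port is an assumption based on this change, not the operator's confirmed implementation.

```rust
use std::collections::BTreeMap;

/// Builds the Prometheus annotations for a role-group Service.
/// Hypothetical helper; assumes the scheme is `https` exactly when TLS is
/// enabled on the HTTP port.
fn prometheus_annotations(tls_enabled: bool, http_port: u16) -> BTreeMap<String, String> {
    let scheme = if tls_enabled { "https" } else { "http" };

    // A BTreeMap keeps the keys sorted, so the generated object is deterministic.
    BTreeMap::from([
        ("prometheus.io/path".to_owned(), "/_prometheus/metrics".to_owned()),
        ("prometheus.io/port".to_owned(), http_port.to_string()),
        ("prometheus.io/scheme".to_owned(), scheme.to_owned()),
        ("prometheus.io/scrape".to_owned(), "true".to_owned()),
    ])
}
```

Calling `prometheus_annotations(true, 9200)` yields the same key/value pairs as the `annotations` block above.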

If authentication is enabled in the OpenSearch security plugin, then the metrics endpoint is also secured.
To make the metrics accessible to all clients, including Prometheus, enable anonymous authentication and grant the role of the anonymous user access to the monitoring statistics:

[source,yaml]
----
---
apiVersion: v1
kind: Secret
metadata:
  name: opensearch-security-config
stringData:
  config.yml: |
    ---
    _meta:
      type: config
      config_version: 2
    config:
      dynamic:
        authc:
          basic_internal_auth_domain:
            description: Authenticate via HTTP Basic against internal users database
            http_enabled: true
            transport_enabled: true
            order: 1
            http_authenticator:
              type: basic
              challenge: false # <1>
            authentication_backend:
              type: intern
        authz: {}
        http:
          anonymous_auth_enabled: true # <2>
  roles.yml: |
    ---
    _meta:
      type: roles
      config_version: 2
    monitoring: # <3>
      reserved: true
      cluster_permissions:
        - cluster:monitor/health
        - cluster:monitor/nodes/info
        - cluster:monitor/nodes/stats
        - cluster:monitor/prometheus/metrics
        - cluster:monitor/state
      index_permissions:
        - index_patterns:
            - "*"
          allowed_actions:
            - indices:monitor/health
            - indices:monitor/stats
  roles_mapping.yml: |
    ---
    _meta:
      type: rolesmapping
      config_version: 2
    monitoring: # <4>
      backend_roles:
        - opendistro_security_anonymous_backendrole
----
<1> If anonymous authentication is enabled, then all defined HTTP authenticators are non-challenging.
<2> Enable https://docs.opensearch.org/latest/security/access-control/anonymous-authentication/[anonymous authentication].
<3> Create a role `monitoring` with the permissions required for the Prometheus endpoint.
<4> Map the role `monitoring` to the backend role `opendistro_security_anonymous_backendrole`, which is assigned to the anonymous user.

If you use the https://prometheus-operator.dev/[Prometheus Operator] to install Prometheus, then you can define a https://prometheus-operator.dev/docs/api-reference/api/#monitoring.coreos.com/v1.ServiceMonitor[ServiceMonitor] to collect the metrics:

[source,yaml]
----
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: stackable-opensearch
  labels:
    release: prometheus-stack # <1>
spec:
  selector:
    matchLabels: # <2>
      prometheus.io/scrape: "true"
  endpoints:
    - relabelings:
        - sourceLabels: # <3>
            - __meta_kubernetes_service_annotation_prometheus_io_scheme
          action: replace
          targetLabel: __scheme__
          regex: (https?)
        - sourceLabels: # <4>
            - __meta_kubernetes_service_annotation_prometheus_io_path
          action: replace
          targetLabel: __metrics_path__
          regex: (.+)
        - sourceLabels: # <5>
            - __meta_kubernetes_pod_name
            - __meta_kubernetes_service_name
            - __meta_kubernetes_namespace
            - __meta_kubernetes_service_annotation_prometheus_io_port
          action: replace
          targetLabel: __address__
          regex: (.+);(.+);(.+);(\d+)
          replacement: $1.$2.$3.svc.cluster.local:$4
      tlsConfig: # <6>
        ca:
          configMap:
            name: truststore
            key: ca.crt
---
apiVersion: secrets.stackable.tech/v1alpha1
kind: TrustStore
metadata:
  name: truststore
spec:
  secretClassName: tls
  format: tls-pem
----
<1> The `release` label must match the Helm release name.
This Helm release was installed with `helm install prometheus-stack oci://ghcr.io/prometheus-community/charts/kube-prometheus-stack ...`.
<2> Label selector to select the Kubernetes `Endpoints` objects to scrape metrics from.
The Endpoints inherit the labels from their Service.
<3> Use the scheme (`http` or `https`) from the Service annotation `prometheus.io/scheme`.
<4> Use the path (`/_prometheus/metrics`) from the Service annotation `prometheus.io/path`.
These values could also be hard-coded in the ServiceMonitor, but taking them from the annotations set by the operator keeps the ServiceMonitor correct if they change in the future.
<5> Use the FQDN instead of the IP address because the IP address is not contained in the certificate.
The FQDN is constructed from the pod name, service name, namespace and the HTTP port provided in the Service annotation `prometheus.io/port`, e.g. `opensearch-nodes-default-0.opensearch-nodes-default-headless.my-namespace.svc.cluster.local:9200`.
<6> If TLS is used and the CA is not already provided to Prometheus in another way, then it can be taken from a xref:secret-operator:truststore.adoc[] ConfigMap.
The TrustStore ConfigMap is updated whenever the CA is rotated.
Prometheus then picks up the new CA certificate.
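
To make the `__address__` relabeling in callout <5> concrete, here is a small Rust sketch of what the rule computes. It is a simplified model, not Prometheus code: Prometheus joins the source label values with `;` and applies the regular expression, whereas this hypothetical `scrape_address` function checks the same conditions by hand (three non-empty names and a numeric port).

```rust
/// Models the `replace` relabeling for `__address__`:
/// regex `(.+);(.+);(.+);(\d+)` with replacement `$1.$2.$3.svc.cluster.local:$4`.
/// Returns `None` where the regex would not match, which is the case in which
/// Prometheus leaves the target label unchanged.
/// Hypothetical helper for illustration only.
fn scrape_address(pod: &str, service: &str, namespace: &str, port: &str) -> Option<String> {
    // `(.+)` requires at least one character in each of the first three groups.
    let names_ok = [pod, service, namespace].iter().all(|s| !s.is_empty());
    // `(\d+)` requires one or more ASCII digits.
    let port_ok = !port.is_empty() && port.chars().all(|c| c.is_ascii_digit());

    if names_ok && port_ok {
        Some(format!("{pod}.{service}.{namespace}.svc.cluster.local:{port}"))
    } else {
        None
    }
}
```

With the pod `opensearch-nodes-default-0`, the Service `opensearch-nodes-default-headless`, the namespace `my-namespace` and the port `9200`, the function returns the FQDN given in callout <5>.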
1 change: 1 addition & 0 deletions docs/modules/opensearch/partials/nav.adoc
@@ -6,6 +6,7 @@
** xref:opensearch:usage-guide/listenerclass.adoc[]
** xref:opensearch:usage-guide/storage-resource-configuration.adoc[]
** xref:opensearch:usage-guide/configuration-environment-overrides.adoc[]
** xref:opensearch:usage-guide/monitoring.adoc[]
** xref:opensearch:usage-guide/operations/index.adoc[]
*** xref:opensearch:usage-guide/operations/cluster-operations.adoc[]
*** xref:opensearch:usage-guide/operations/pod-placement.adoc[]
65 changes: 63 additions & 2 deletions rust/operator-binary/src/controller/build/node_config.rs
@@ -1,3 +1,5 @@
use std::str::FromStr;

use serde_json::{Value, json};
use stackable_operator::builder::pod::container::FieldPathEnvVar;

@@ -88,7 +90,12 @@ impl NodeConfig {
     }

     /// static for the cluster
-    pub fn static_opensearch_config(&self) -> String {
+    pub fn static_opensearch_config_file(&self) -> String {
+        Self::to_yaml(self.static_opensearch_config())
+    }
+
+    /// static for the cluster
+    pub fn static_opensearch_config(&self) -> serde_json::Map<String, Value> {
         let mut config = serde_json::Map::new();

         config.insert(
@@ -124,7 +131,24 @@ impl NodeConfig {
         // Ensure a deterministic result
         config.sort_keys();

-        Self::to_yaml(config)
+        config
+    }
+
+    pub fn tls_on_http_port_enabled(&self) -> bool {
+        self.static_opensearch_config()
+            .get("plugins.security.ssl.http.enabled")
+            .and_then(Self::value_as_bool)
+            == Some(true)
+    }
+
+    pub fn value_as_bool(value: &Value) -> Option<bool> {
+        value.as_bool().or(
+            // OpenSearch parses the strings "true" and "false" as boolean, see
+            // https://github.com/opensearch-project/OpenSearch/blob/3.1.0/libs/common/src/main/java/org/opensearch/common/Booleans.java#L45-L84
+            value
+                .as_str()
+                .and_then(|value| FromStr::from_str(value).ok()),
+        )
     }

     /// different for every node
@@ -262,6 +286,43 @@ mod tests {
        framework::{ClusterName, ProductVersion, role_utils::GenericProductSpecificCommonConfig},
    };

    #[test]
    pub fn test_value_as_bool() {
        // boolean
        assert_eq!(Some(true), NodeConfig::value_as_bool(&Value::Bool(true)));
        assert_eq!(Some(false), NodeConfig::value_as_bool(&Value::Bool(false)));

        // valid strings
        assert_eq!(
            Some(true),
            NodeConfig::value_as_bool(&Value::String("true".to_owned()))
        );
        assert_eq!(
            Some(false),
            NodeConfig::value_as_bool(&Value::String("false".to_owned()))
        );

        // invalid strings
        assert_eq!(
            None,
            NodeConfig::value_as_bool(&Value::String("True".to_owned()))
        );

        // invalid types
        assert_eq!(None, NodeConfig::value_as_bool(&Value::Null));
        assert_eq!(
            None,
            NodeConfig::value_as_bool(&Value::Number(
                serde_json::Number::from_i128(1).expect("should be a valid number")
            ))
        );
        assert_eq!(None, NodeConfig::value_as_bool(&Value::Array(vec![])));
        assert_eq!(
            None,
            NodeConfig::value_as_bool(&Value::Object(serde_json::Map::new()))
        );
    }

    #[test]
    pub fn test_environment_variables() {
        let image: ProductImage = serde_json::from_str(r#"{"productVersion": "3.0.0"}"#)
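
The lenient boolean handling in `value_as_bool` and its use in `tls_on_http_port_enabled` can be tried out without `serde_json`. The sketch below is a standard-library-only approximation: `ConfigValue` is a hypothetical stand-in for `serde_json::Value` reduced to three variants, and the free functions mirror the methods in this diff rather than reproduce them.

```rust
use std::collections::BTreeMap;
use std::str::FromStr;

/// Hypothetical stand-in for `serde_json::Value`, reduced to three variants.
#[derive(Debug, Clone, PartialEq)]
enum ConfigValue {
    Bool(bool),
    Text(String),
    Null,
}

/// Accepts a real boolean or the exact strings "true" and "false".
fn value_as_bool(value: &ConfigValue) -> Option<bool> {
    match value {
        ConfigValue::Bool(b) => Some(*b),
        // `bool::from_str` is case-sensitive, so "True" or "1" are rejected.
        ConfigValue::Text(s) => bool::from_str(s).ok(),
        ConfigValue::Null => None,
    }
}

/// TLS counts as enabled only if the key is present and resolves to `true`.
fn tls_on_http_port_enabled(config: &BTreeMap<String, ConfigValue>) -> bool {
    config
        .get("plugins.security.ssl.http.enabled")
        .and_then(value_as_bool)
        == Some(true)
}
```

Comparing against `Some(true)` means that a missing key, a non-boolean value, and an explicit `false` all count as "TLS disabled".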
2 changes: 1 addition & 1 deletion rust/operator-binary/src/controller/build/role_builder.rs
@@ -51,7 +51,7 @@ impl<'a> RoleBuilder<'a> {

     // TODO Only one builder function which calls the other ones?

-    pub fn role_group_builders(&self) -> Vec<RoleGroupBuilder> {
+    pub fn role_group_builders(&self) -> Vec<RoleGroupBuilder<'_>> {
         self.cluster
             .role_group_configs
             .iter()