// Module included in the following assemblies: // // * monitoring/monitoring.adoc [id="configuring-etcd-monitoring_{context}"] = Configuring etcd monitoring If the `etcd` service does not run correctly, successful operation of the whole {product-title} cluster is in danger. Therefore, it is reasonable to configure monitoring of `etcd`. .Procedure . Verify that the monitoring stack is running: + [subs="quotes"] ---- $ oc -n openshift-monitoring get pods NAME READY STATUS RESTARTS AGE alertmanager-main-0 3/3 Running 0 34m alertmanager-main-1 3/3 Running 0 33m alertmanager-main-2 3/3 Running 0 33m cluster-monitoring-operator-67b8797d79-sphxj 1/1 Running 0 36m grafana-c66997f-pxrf7 2/2 Running 0 37s kube-state-metrics-7449d589bc-rt4mq 3/3 Running 0 33m node-exporter-5tt4f 2/2 Running 0 33m node-exporter-b2mrp 2/2 Running 0 33m node-exporter-fd52p 2/2 Running 0 33m node-exporter-hfqgv 2/2 Running 0 33m prometheus-k8s-0 4/4 Running 1 35m prometheus-k8s-1 0/4 ContainerCreating 0 21s prometheus-operator-6c9fddd47f-9jfgk 1/1 Running 0 36m ---- . Open the configuration file for the cluster monitoring stack: + [subs="quotes"] ---- $ oc -n openshift-monitoring edit configmap cluster-monitoring-config ---- . Under `config.yaml: |+`, add the `etcd` section. + .. If you run `etcd` in static pods on your control plane nodes (also known as master nodes), you can specify the `etcd` nodes using the selector: + [subs="quotes"] ---- ... data: config.yaml: |+ ... *etcd: targets: selector: openshift.io/component: etcd openshift.io/control-plane: "true"* ---- + .. If you run `etcd` on separate hosts, you must specify the nodes using IP addresses: + [subs="quotes"] ---- ... data: config.yaml: |+ ... *etcd: targets: ips: - "127.0.0.1" - "127.0.0.2" - "127.0.0.3"* ---- + If `etcd` nodes IP addresses change, you must update this list. . Verify that the `etcd` service monitor is now running: + [subs="quotes"] ---- $ oc -n openshift-monitoring get servicemonitor NAME AGE alertmanager 35m *etcd 1m* kube-apiserver 36m kube-controllers 36m kube-state-metrics 34m kubelet 36m node-exporter 34m prometheus 36m prometheus-operator 37m ---- + It might take up to a minute for the `etcd` service monitor to start. . Now you can navigate to the Web interface to see more information about status of `etcd` monitoring: + .. To get the URL, run: + [subs="quotes"] ---- $ oc -n openshift-monitoring get routes NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD ... prometheus-k8s prometheus-k8s-openshift-monitoring.apps.msvistun.origin-gce.dev.openshift.com prometheus-k8s web reencrypt None ---- + .. Using `https`, navigate to the URL listed for `prometheus-k8s`. Log in. . Ensure the user belongs to the `cluster-monitoring-view` role. This role provides access to viewing cluster monitoring UIs. For example, to add user `developer` to `cluster-monitoring-view`, run: $ oc adm policy add-cluster-role-to-user cluster-monitoring-view developer + . In the Web interface, log in as the user belonging to the `cluster-monitoring-view` role. . Click *Status*, then *Targets*. If you see an `etcd` entry, `etcd` is being monitored. + image::etcd-no-certificate.png[] While `etcd` is being monitored, Prometheus is not yet able to authenticate against `etcd`, and so cannot gather metrics. To configure Prometheus authentication against `etcd`: . Copy the `/etc/etcd/ca/ca.crt` and `/etc/etcd/ca/ca.key` credentials files from the control plane node to the local machine: + [subs="quotes"] ---- $ ssh -i gcp-dev/ssh-privatekey cloud-user@35.237.54.213 ... ---- . Create the `openssl.cnf` file with these contents: + ---- [ req ] req_extensions = v3_req distinguished_name = req_distinguished_name [ req_distinguished_name ] [ v3_req ] basicConstraints = CA:FALSE keyUsage = nonRepudiation, keyEncipherment, digitalSignature extendedKeyUsage=serverAuth, clientAuth ---- . Generate the `etcd.key` private key file: + [subs="quotes"] ---- $ openssl genrsa -out etcd.key 2048 ---- . Generate the `etcd.csr` certificate signing request file: + [subs="quotes"] ---- $ openssl req -new -key etcd.key -out etcd.csr -subj "/CN=etcd" -config openssl.cnf ---- . Generate the `etcd.crt` certificate file: + [subs="quotes"] ---- $ openssl x509 -req -in etcd.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out etcd.crt -days 365 -extensions v3_req -extfile openssl.cnf ---- . Put the credentials into format used by {product-title}: + ---- cat <<-EOF > etcd-cert-secret.yaml apiVersion: v1 data: etcd-client-ca.crt: "$(cat ca.crt | base64 --wrap=0)" etcd-client.crt: "$(cat etcd.crt | base64 --wrap=0)" etcd-client.key: "$(cat etcd.key | base64 --wrap=0)" kind: Secret metadata: name: kube-etcd-client-certs namespace: openshift-monitoring type: Opaque EOF ---- + This creates the *_etcd-cert-secret.yaml_* file . Apply the credentials file to the cluster: ---- $ oc apply -f etcd-cert-secret.yaml ---- . Visit the "Targets" page of the Web interface again. Verify that `etcd` is now being correctly monitored. It might take several minutes for changes to take effect. + image::etcd-monitoring-working.png[]