
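We already know how to define a custom ServiceMonitor object, but what about a custom alerting rule? The Alerts page of the Prometheus Dashboard already lists many alerting rules. These rules all come from the https://github.com/kubernetes-monitoring/kubernetes-mixin project, and they were installed and configured when we deployed with Prometheus Operator.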

Configuring PrometheusRule

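But where do these alerts come from, and how should they notify us? With the manual setup used earlier, we specified the AlertManager instances and the alerting rules files directly in the Prometheus configuration file. How does this work now that everything is deployed through the Operator? The AlertManager configuration can be viewed on the Config page of the Prometheus Dashboard: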

alerting:
  alert_relabel_configs:
    - separator: ;
      regex: prometheus_replica
      replacement: $1
      action: labeldrop
  alertmanagers:
    - kubernetes_sd_configs:
        - role: endpoints
          namespaces:
            names:
              - monitoring
      scheme: http
      path_prefix: /
      timeout: 10s
      api_version: v1
      relabel_configs:
        - source_labels: [__meta_kubernetes_service_name]
          separator: ;
          regex: alertmanager-main
          replacement: $1
          action: keep
        - source_labels: [__meta_kubernetes_endpoint_port_name]
          separator: ;
          regex: web
          replacement: $1
          action: keep
rule_files:
  - /etc/prometheus/rules/prometheus-k8s-rulefiles-0/*.yaml

The alertmanagers section shows that the instances are found through Kubernetes service discovery with role endpoints, keeping only the endpoints whose Service name is alertmanager-main and whose port name is web. Let's look at the alertmanager-main Service:

$ kubectl describe svc alertmanager-main -n monitoring
Name:                     alertmanager-main
Namespace:                monitoring
Labels:                   alertmanager=main
Annotations:              kubectl.kubernetes.io/last-applied-configuration:
                            {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"alertmanager":"main"},"name":"alertmanager-main","namespace":"...
Selector:                 alertmanager=main,app=alertmanager
Type:                     NodePort
IP:                       10.106.211.33
Port:                     web  9093/TCP
TargetPort:               web/TCP
NodePort:                 web  31742/TCP
Endpoints:                10.244.3.119:9093,10.244.4.112:9093,10.244.8.164:9093
Session Affinity:         ClientIP
External Traffic Policy:  Cluster
Events:                   <none>
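
As a rough illustration of how the two keep rules in the relabel_configs section narrow down the discovered endpoints, here is a minimal Python sketch. It is not Prometheus code: the target dictionaries, the other-service entry and the metrics port name are hypothetical examples, and only the two label names and their regexes are taken from the configuration shown earlier.

```python
import re

# The two "keep" rules from the generated Prometheus configuration.
KEEP_RULES = [
    ("__meta_kubernetes_service_name", "alertmanager-main"),
    ("__meta_kubernetes_endpoint_port_name", "web"),
]


def keep(labels: dict, source_label: str, regex: str) -> bool:
    """Return True if the label value fully matches the regex.

    Prometheus anchors relabel regexes, so fullmatch is used here.
    A missing label is treated as an empty string.
    """
    return re.fullmatch(regex, labels.get(source_label, "")) is not None


def select_alertmanagers(targets: list) -> list:
    """Keep only the targets that pass every keep rule."""
    return [t for t in targets if all(keep(t, s, r) for s, r in KEEP_RULES)]


# Hypothetical discovered endpoints (only the first address appears in the article).
TARGETS = [
    {"__address__": "10.244.3.119:9093",
     "__meta_kubernetes_service_name": "alertmanager-main",
     "__meta_kubernetes_endpoint_port_name": "web"},
    {"__address__": "10.244.3.119:9094",
     "__meta_kubernetes_service_name": "alertmanager-main",
     "__meta_kubernetes_endpoint_port_name": "metrics"},
    {"__address__": "10.244.5.20:8080",
     "__meta_kubernetes_service_name": "other-service",
     "__meta_kubernetes_endpoint_port_name": "web"},
]

for target in select_alertmanagers(TARGETS):
    print(target["__address__"])
```

Only an endpoint that satisfies both rules survives, which is why both the Service name and the port name have to line up.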

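In the kubectl describe output, the Service name is indeed alertmanager-main and the port is named web, which satisfies the rules above, so Prometheus and AlertManager are correctly wired together. The corresponding alerting rule files are all the YAML files under the /etc/prometheus/rules/prometheus-k8s-rulefiles-0/ directory. We can enter the Prometheus Pod to verify that the directory contains YAML files: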

$ kubectl exec -it prometheus-k8s-0 -n monitoring -- /bin/sh
Defaulting container name to prometheus.
Use 'kubectl describe pod/prometheus-k8s-0 -n monitoring' to see all of the containers in this pod.
/prometheus $ ls /etc/prometheus/rules/prometheus-k8s-rulefiles-0/
monitoring-prometheus-k8s-rules.yaml
/prometheus $ cat /etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml
groups:
- name: k8s.rules
  rules:
  - expr: |
      sum(rate(container_cpu_usage_seconds_total{job="kubelet", image!="", container_name!=""}[5m])) by (namespace)
    record: namespace:container_cpu_usage_seconds_total:sum_rate
......

This YAML file is simply the content of the PrometheusRule we created earlier:

$ cat prometheus-rules.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: prometheus-k8s-rules
  namespace: monitoring
spec:
  groups:
  - name: node-exporter.rules
    rules:
    - expr: |
        count without (cpu) (
          count without (mode) (
            node_cpu_seconds_total{job="node-exporter"}
          )
        )
      record: instance:node_num_cpu:sum
    - expr: |
......

The PrometheusRule here is named prometheus-k8s-rules in the monitoring namespace, so we can guess that creating a PrometheusRule resource automatically generates a corresponding <namespace>-<name>.yaml file under the prometheus-k8s-rulefiles-0 directory shown above. To define a custom alert in the future, we therefore only need to define a PrometheusRule resource. How does Prometheus recognize this PrometheusRule? The answer is in the prometheus resource we created, which has an important ruleSelector attribute. It is the filter used to select rules, and here it requires PrometheusRule resources to carry the labels prometheus=k8s and role=alert-rules:

ruleSelector:
  matchLabels:
    prometheus: k8s
    role: alert-rules
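
The selection and naming behaviour described above can be sketched in a few lines of Python. This is only an illustration of matchLabels semantics and of the <namespace>-<name>.yaml naming guess, not the Operator's actual implementation; the function names and the extra team label are hypothetical.

```python
def matches(match_labels: dict, labels: dict) -> bool:
    """matchLabels semantics: every selector pair must be present on the object.

    Extra labels on the object are allowed.
    """
    return all(labels.get(key) == value for key, value in match_labels.items())


def rule_file_name(namespace: str, name: str) -> str:
    """File name guessed in the article: <namespace>-<name>.yaml."""
    return f"{namespace}-{name}.yaml"


RULE_SELECTOR = {"prometheus": "k8s", "role": "alert-rules"}

# A PrometheusRule with both required labels (plus a hypothetical extra one).
selected = {"prometheus": "k8s", "role": "alert-rules", "team": "infra"}
# A PrometheusRule missing the role label is ignored by the selector.
ignored = {"prometheus": "k8s"}

print(matches(RULE_SELECTOR, selected))
print(matches(RULE_SELECTOR, ignored))
print(rule_file_name("monitoring", "prometheus-k8s-rules"))
```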

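So to define a custom alerting rule, we only need to create a PrometheusRule object with the labels prometheus=k8s and role=alert-rules. As an example, let's add an alert for etcd availability. An etcd cluster remains available as long as more than half of its members are healthy, so the rule below fires when, for an odd-sized cluster, so many members are down that losing one more would break quorum. Create the file prometheus-etcdRules.yaml: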

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: etcd-rules
  namespace: monitoring
spec:
  groups:
    - name: etcd
      rules:
        - alert: EtcdClusterUnavailable
          annotations:
            summary: etcd cluster small
            description: If one more etcd peer goes down the cluster will be unavailable
          expr: |
            count(up{job="etcd"} == 0) > (count(up{job="etcd"}) / 2 - 1)
          for: 3m
          labels:
            severity: critical
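
To check the arithmetic of the expr above, the following Python sketch mirrors the comparison for small clusters. It is a worked example only, assuming the usual etcd quorum of floor(n/2) + 1 healthy members; the function names are hypothetical.

```python
def alert_fires(total: int, down: int) -> bool:
    """Mirror of: count(up == 0) > (count(up) / 2 - 1)."""
    if down == 0:
        # count() over an empty result returns no sample, so nothing fires.
        return False
    return down > total / 2 - 1


def has_quorum(total: int, down: int) -> bool:
    """etcd stays available while at least floor(n/2) + 1 members are healthy."""
    return (total - down) >= total // 2 + 1


# Print a small table for the common odd cluster sizes.
for total in (3, 5):
    for down in range(total + 1):
        print(f"members={total} down={down} "
              f"alert={alert_fires(total, down)} quorum={has_quorum(total, down)}")
```

For a 3-member cluster the alert already fires with one member down, while quorum is still intact, which matches the description in the rule: one more failure would make the cluster unavailable.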

Note that the labels must include at least prometheus=k8s and role=alert-rules. After creating the object, wait a moment and then check the rules directory inside the container again:

$ kubectl apply -f prometheus-etcdRules.yaml
prometheusrule.monitoring.coreos.com/etcd-rules created
$ kubectl exec -it prometheus-k8s-0 -n monitoring -- /bin/sh
Defaulting container name to prometheus.
Use 'kubectl describe pod/prometheus-k8s-0 -n monitoring' to see all of the containers in this pod.
/prometheus $ ls /etc/prometheus/rules/prometheus-k8s-rulefiles-0/
monitoring-etcd-rules.yaml            monitoring-prometheus-k8s-rules.yaml

The rule file we created has been injected into the rulefiles directory, which confirms the guess above. The new alerting rule now also shows up on the Alerts page of the Prometheus Dashboard.

Configuring Alerting

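We now know how to add an alerting rule, but how are these alerts actually sent? In the earlier lessons we configured the various alert receivers through the AlertManager configuration file. Now that the component is created through the alertmanager resource provided by the Operator, how do we change its configuration?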

First, open the Status page of the Alertmanager UI to view the current AlertManager configuration.

This configuration actually comes from a Secret named alertmanager-main-generated, which Prometheus-Operator creates automatically:

$ kubectl get secret alertmanager-main-generated -n monitoring -o json | jq -r '.data."alertmanager.yaml"' | base64 --decode
"global":
  "resolve_timeout": "5m"
"inhibit_rules":
- "equal":
  - "namespace"
  - "alertname"
  "source_match":
    "severity": "critical"
  "target_match_re":
    "severity": "warning|info"
- "equal":
  - "namespace"
  - "alertname"
  "source_match":
    "severity": "warning"
  "target_match_re":
    "severity": "info"
"receivers":
- "name": "Default"
- "name": "Watchdog"
- "name": "Critical"
"route":
  "group_by":
  - "namespace"
  "group_interval": "5m"
  "group_wait": "30s"
  "receiver": "Default"
  "repeat_interval": "12h"
  "routes":
  - "match":
      "alertname": "Watchdog"
    "receiver": "Watchdog"
  - "match":
      "severity": "critical"
    "receiver": "Critical"

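The content matches the configuration shown on the Status page, so to add our own receiver we could edit this file directly. However, the content stored in the Secret is base64-encoded, which makes manual edits very inconvenient.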
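
To make the inconvenience concrete, the sketch below walks through the round trip a manual edit would involve: decode the stored value, change it, and encode it again. It uses only Python's standard base64 module. The helper names are hypothetical, the sample text is a small fragment of the configuration shown above, and the change of resolve_timeout to 10m is an arbitrary example edit.

```python
import base64


def decode_field(encoded: str) -> str:
    """Decode one base64-encoded value from a Secret's data map."""
    return base64.b64decode(encoded).decode("utf-8")


def encode_field(text: str) -> str:
    """Encode plain text back into the form stored in a Secret."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


# A fragment of alertmanager.yaml as it would look once decoded.
original = '"global":\n  "resolve_timeout": "5m"\n'

stored = encode_field(original)              # what the Secret actually holds
edited = decode_field(stored).replace('"5m"', '"10m"')  # hypothetical edit
restored = encode_field(edited)              # must be re-encoded before saving

print(stored)
print(decode_field(restored))
```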

AlertmanagerConfig

For this reason Prometheus-Operator added an AlertmanagerConfig CRD. As an example, let's send all alerts for the Critical receiver to DingTalk.

First, deploy a simple DingTalk webhook handler in the monitoring namespace. This was covered in the earlier Alertmanager chapter, so it is not repeated here.

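Then create a resource of type AlertmanagerConfig. The meaning of each field can be looked up with kubectl explain alertmanagerconfig or in the online documentation (https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/user-guides/alerting.md). The resource looks like this: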

# alertmanager-config.yaml
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: dinghook
  namespace: monitoring
  labels:
    alertmanagerConfig: example
spec:
  receivers:
    - name: Critical
      webhookConfigs:
        - url: http://dingtalk-hook:5000
          sendResolved: true
  route:
    groupBy: ["namespace"]
    groupWait: 30s
    groupInterval: 5m
    repeatInterval: 12h
    receiver: Critical
    routes:
      - receiver: Critical
        match:
          severity: critical

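Creating the configuration above is not enough for it to take effect, though. The AlertmanagerConfig needs a label, and the Alertmanager resource must select it through that label. Here we added the label alertmanagerConfig: example, so we now update the Alertmanager object and add an alertmanagerConfigSelector attribute that matches the AlertmanagerConfig resource: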

# alertmanager-alertmanager.yaml
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  labels:
    alertmanager: main
  name: main
  namespace: monitoring
spec:
  image: quay.io/prometheus/alertmanager:v0.21.0
  nodeSelector:
    kubernetes.io/os: linux
  replicas: 3
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: alertmanager-main
  version: v0.21.0
  configSecret:
  alertmanagerConfigSelector: # match the labels of the AlertmanagerConfig
    matchLabels:
      alertmanagerConfig: example

Now apply both resources:

kubectl apply -f alertmanager-config.yaml
kubectl apply -f alertmanager-alertmanager.yaml

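Once the update completes, the default configuration is merged with the configuration we created. We can check the content of the generated Secret again, or look directly at the configuration in the Alertmanager web UI.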

The Receiver named Critical that we defined in the AlertmanagerConfig is renamed to monitoring-dinghook-Critical in the final generated configuration, following the format <namespace>-<name>-<receiver name>.
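
The renaming can be pictured with a short Python sketch. It only illustrates the <namespace>-<name>-<receiver name> format described above and is not the Operator's code. The function name is hypothetical, and the sketch assumes that receiver references inside the route tree are renamed the same way so that they keep pointing at the renamed receivers.

```python
def prefix_receiver_names(namespace: str, name: str, spec: dict) -> dict:
    """Return a copy of an AlertmanagerConfig-like spec with prefixed receiver names."""
    prefix = f"{namespace}-{name}-"

    def rename_route(route: dict) -> dict:
        # Rename this route's receiver, then recurse into any child routes.
        renamed = dict(route)
        if "receiver" in renamed:
            renamed["receiver"] = prefix + renamed["receiver"]
        if "routes" in renamed:
            renamed["routes"] = [rename_route(child) for child in renamed["routes"]]
        return renamed

    return {
        "receivers": [{**r, "name": prefix + r["name"]} for r in spec.get("receivers", [])],
        "route": rename_route(spec.get("route", {})),
    }


# A trimmed-down version of the dinghook AlertmanagerConfig spec.
spec = {
    "receivers": [{"name": "Critical"}],
    "route": {
        "receiver": "Critical",
        "routes": [{"receiver": "Critical", "match": {"severity": "critical"}}],
    },
}

result = prefix_receiver_names("monitoring", "dinghook", spec)
print(result["receivers"][0]["name"])
print(result["route"]["routes"][0]["receiver"])
```

Prefixing with the namespace and object name keeps receivers from different AlertmanagerConfig objects from colliding when they are merged into one configuration.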
