Deploying Prometheus and Grafana on Kubernetes

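[!TIP]
Deploy Prometheus and Grafana on Kubernetes
and expose them externally through an Ingress.

Installation guide for the Ingress Nginx Controller:
(https://janrs.com/2023/02/k8s%e9%83%a8%e7%bd%b2ingress-controller/)

Please credit the source when reposting: https://janrs.com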


Deploy Prometheus and Grafana on Kubernetes, with NFS mounted for persistent storage.

1. Set up the NFS service

Follow this tutorial: (https://janrs.com/2023/02/k8s%e9%83%a8%e7%bd%b2nfs/)

2. Create the namespace

Create it:


kubectl create ns monitoring

Check the result:

kubectl get ns

Output:

NAME              STATUS   AGE
default           Active   23h
ingress-nginx     Active   175m
kube-node-lease   Active   23h
kube-public       Active   23h
kube-system       Active   23h
kuboard           Active   23h
monitoring        Active   4s
nfs               Active   23h
web-nginx         Active   144m

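2-1. Create the image pull secret

Create the secret:

[!NOTE]
Change the --docker-password value to your own password (here it is read from the PASSWORD shell variable), and replace the registry server, username and email with your own.

kubectl --namespace monitoring create secret docker-registry aliimagesecret --docker-server=registry.cn-shenzhen.aliyuncs.com --docker-username=yjy86868@163.com --docker-password="${PASSWORD}" --docker-email=yjy86868@163.com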

3. Create the PVC

Create monitoring-grafana-pvc.yaml:

vim monitoring-grafana-pvc.yaml

Add the following YAML:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-pvc
  namespace: monitoring
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: nfs-storage

Apply it:

kubectl apply -f monitoring-grafana-pvc.yaml

Check the result:

kubectl get pvc -n monitoring

A STATUS of Bound means the claim was provisioned successfully:

NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
grafana-pvc   Bound    pvc-2d0399ca-b20b-4df0-b8f9-a4674410627e   10Gi       RWX            nfs-storage    27s
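
Provisioning through NFS can take a few seconds, so instead of re-running the command by hand it can be convenient to block until the claim is bound. The following is only a sketch: wait_pvc_bound is a hypothetical helper name, and it assumes kubectl 1.23 or newer, where kubectl wait accepts a --for=jsonpath=... condition.

```shell
# wait_pvc_bound NAMESPACE PVC [TIMEOUT]
# Blocks until the claim's .status.phase is Bound,
# or fails once TIMEOUT (default 60s) has elapsed.
wait_pvc_bound() {
  kubectl wait --for=jsonpath='{.status.phase}'=Bound \
    "pvc/$2" --namespace "$1" --timeout="${3:-60s}"
}

# Usage: wait_pvc_bound monitoring grafana-pvc
```

If the command times out, kubectl describe pvc grafana-pvc -n monitoring usually shows why the provisioner did not create a volume.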

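4. Deploy Prometheus

[!NOTE]
Some images referenced by the official manifests cannot be pulled from mainland China, so they have to be swapped for mirrored copies by hand.
The official manifests also do not mount NFS storage, so they need a few changes before deployment.

4-1. Download kube-prometheus

Clone the repository: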

git clone https://github.com/coreos/kube-prometheus.git

[h4]4-1-1. Modify grafana-deployment.yaml[/h4]

Change the Grafana Deployment so that it mounts the PVC created earlier.

Open grafana-deployment.yaml:

cd kube-prometheus/manifests/ &&
vim grafana-deployment.yaml

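Find the grafana-storage volume, which is defined as an emptyDir.

Replace the official emptyDir with a persistentVolumeClaim

that references the PVC created above, grafana-pvc, then save the file.

[!NOTE]
grafana-pvc is the name given to the PVC when it was created above.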

...
      securityContext:
        fsGroup: 65534
        runAsNonRoot: true
        runAsUser: 65534
      serviceAccountName: grafana
      volumes:
      # Changed here: the original emptyDir volume is replaced by the PVC
      #- emptyDir: {}
      - name: nfs-storage
        persistentVolumeClaim:
          claimName: grafana-pvc
      - name: grafana-datasources
...

Then update the volume mount so that it uses the new volume:

...
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: true
        volumeMounts:
        - mountPath: /var/lib/grafana
          # Changed here: must match the volume name defined above (nfs-storage)
          name: nfs-storage
          readOnly: false
        - mountPath: /etc/grafana/provisioning/datasources
          name: grafana-datasources
          readOnly: false
        - mountPath: /etc/grafana/provisioning/dashboards
          name: grafana-dashboards
          readOnly: false
        - mountPath: /tmp
          name: tmp-plugins
...

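[h4]4-1-2. Persist prometheus-k8s data[/h4]

[!NOTE]
Any warnings printed along the way can be ignored.

prometheus-server scrapes data from each endpoint and stores it locally. It is created through the Prometheus custom resource (CRD).

Creating the Prometheus custom resource starts a StatefulSet, which is prometheus-server. No persistent storage is configured by default.

Edit prometheus-prometheus.yaml:

cd kube-prometheus/manifests/ &&
vim prometheus-prometheus.yaml

Change it as shown below:

[!NOTE]
The value of storageClassName is the name of the NFS StorageClass created earlier.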

  enableFeatures: []
  externalLabels: {}
  imagePullSecrets:
    - name: aliimagesecret
  image: registry.cn-shenzhen.aliyuncs.com/yjy_k8s/k8s-prometheus-prometheus:v2.38.0
  #image: quay.io/prometheus/prometheus:v2.38.0
  # Add the storage section here
  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: nfs-storage
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
  nodeSelector:
    kubernetes.io/os: linux
  podMetadata:
    labels:
      app.kubernetes.io/component: prometheus
      app.kubernetes.io/instance: k8s
      app.kubernetes.io/name: prometheus
      app.kubernetes.io/part-of: kube-prometheus
      app.kubernetes.io/version: 2.38.0

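[h4]4-1-3. Set the retention period[/h4]

Set how long Prometheus keeps its metrics data. With Prometheus Operator this is the retention field of the Prometheus custom resource, which the operator passes to Prometheus as --storage.tsdb.retention.time. An entry in the operator Deployment's own args does not change it.

Add retention to prometheus-prometheus.yaml, at the same level as storage:

  # Add this line: keep metrics for 180 days (the operator default is 24h)
  retention: 180d

The operator Deployment itself only needs the mirrored images and the pull secret:

cd manifests &&
vim prometheusOperator-deployment.yaml

The modified part looks like this:

    spec:
      automountServiceAccountToken: true
      imagePullSecrets:
        - name: aliimagesecret
      containers:
      - args:
        - --kubelet-service=kube-system/kubelet
        - --prometheus-config-reloader=registry.cn-shenzhen.aliyuncs.com/yjy_k8s/k8s-prometheus-prometheus-config-reloader:v0.58.0
        #- --prometheus-config-reloader=quay.io/prometheus-operator/prometheus-config-reloader:v0.58.0
        image: registry.cn-shenzhen.aliyuncs.com/yjy_k8s/k8s-prometheus-prometheus-operator:v0.58.0
        #image: quay.io/prometheus-operator/prometheus-operator:v0.58.0
        name: prometheus-operator
        ports:
        - containerPort: 8080
          name: http
        resources:
          limits:
            cpu: 200m
            memory: 200Mi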

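[h4]4-1-4. Check the required images[/h4]
[h5]a) Image addresses to replace[/h5]

Some images cannot be pulled from mainland China and have to be replaced with mirrored copies.

Enter the directory:

cd kube-prometheus/manifests

List every image that is used:

grep -riE 'quay.io|k8s.gcr|grafana/|image:' *

Output: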

alertmanager-alertmanager.yaml:  image: quay.io/prometheus/alertmanager:v0.24.0
blackboxExporter-deployment.yaml:        image: quay.io/prometheus/blackbox-exporter:v0.22.0
blackboxExporter-deployment.yaml:        image: jimmidyson/configmap-reload:v0.5.0
blackboxExporter-deployment.yaml:        image: quay.io/brancz/kube-rbac-proxy:v0.13.0
grafana-deployment.yaml:        image: grafana/grafana:9.1.1
grafana-prometheusRule.yaml:        runbook_url: https://runbooks.prometheus-operator.dev/runbooks/grafana/grafanarequestsfailing
kubeStateMetrics-deployment.yaml:        image: k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.6.0
kubeStateMetrics-deployment.yaml:        image: quay.io/brancz/kube-rbac-proxy:v0.13.0
kubeStateMetrics-deployment.yaml:        image: quay.io/brancz/kube-rbac-proxy:v0.13.0
nodeExporter-daemonset.yaml:        image: quay.io/prometheus/node-exporter:v1.3.1
nodeExporter-daemonset.yaml:        image: quay.io/brancz/kube-rbac-proxy:v0.13.0
prometheusAdapter-deployment.yaml:        image: k8s.gcr.io/prometheus-adapter/prometheus-adapter:v0.9.1
prometheusOperator-deployment.yaml:        - --prometheus-config-reloader=quay.io/prometheus-operator/prometheus-config-reloader:v0.58.0
prometheusOperator-deployment.yaml:        image: quay.io/prometheus-operator/prometheus-operator:v0.58.0
prometheusOperator-deployment.yaml:        image: quay.io/brancz/kube-rbac-proxy:v0.13.0
prometheus-prometheus.yaml:  image: quay.io/prometheus/prometheus:v2.38.0

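[h5]b) Mirror the images[/h5]

Build a copy of every image listed above in the Aliyun container registry.

[!NOTE]
Some images appear more than once in the listing because several YAML files use them.
I use a private Aliyun registry: every image listed above is built there first and then swapped in.

[h5]c) Replace the images[/h5]

In each file reported by the search above, replace the image reference with the one built on Aliyun.

[!NOTE]
If you also use a private registry, remember to add the pull secret (imagePullSecrets) as well.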
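
Editing every file by hand is error-prone, so the replacement can be scripted. The sketch below is one possible way to do it with sed; replace_image is a hypothetical helper name, and it assumes the image references contain no |, & or backslash characters. The mapping in the usage line is the Prometheus image pair already shown in step 4-1-2; the mirrored names for the other images depend on how you named them in your own registry.

```shell
# replace_image FILE OLD NEW
# Rewrites every occurrence of the image reference OLD in FILE with NEW.
replace_image() {
  file=$1 old=$2 new=$3
  # Escape regex metacharacters (mainly the dots) in the old reference.
  old_re=$(printf '%s' "$old" | sed 's/[][\.*^$]/\\&/g')
  # -i.bak works with both GNU and BSD sed; the backup is removed on success.
  sed -i.bak "s|$old_re|$new|g" "$file" && rm -f "$file.bak"
}

# Usage:
# replace_image prometheus-prometheus.yaml \
#   quay.io/prometheus/prometheus:v2.38.0 \
#   registry.cn-shenzhen.aliyuncs.com/yjy_k8s/k8s-prometheus-prometheus:v2.38.0
```

Running git diff in the kube-prometheus checkout afterwards shows exactly which lines were changed.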

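4-2. Install

Install following the official instructions:

kubectl create -f manifests/setup

After this step, run the following command to check that the CRDs are ready:

until kubectl get servicemonitors --all-namespaces ; do date; sleep 1; echo ""; done

The following output means the cluster is ready for the rest of the installation:

No resources found

Run the installation:

kubectl create -f manifests/

Then list the pods:

kubectl get pods -n monitoring

All pods Running, as below, means the installation succeeded: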

NAME                                   READY   STATUS    RESTARTS   AGE
alertmanager-main-0                    2/2     Running   0          35m
alertmanager-main-1                    2/2     Running   0          35m
alertmanager-main-2                    2/2     Running   0          35m
blackbox-exporter-67976f746b-x45vl     3/3     Running   0          35m
grafana-98b487d5f-mtd8q                1/1     Running   0          35m
kube-state-metrics-7fbb67d4df-cwhf7    3/3     Running   0          35m
node-exporter-7zckp                    2/2     Running   0          35m
node-exporter-fz6k8                    2/2     Running   0          35m
node-exporter-lls7g                    2/2     Running   0          35m
prometheus-adapter-7f5d756f48-pm4nb    1/1     Running   0          35m
prometheus-adapter-7f5d756f48-tlhrg    1/1     Running   0          35m
prometheus-k8s-0                       2/2     Running   0          35m
prometheus-k8s-1                       2/2     Running   0          35m
prometheus-operator-84576d8b79-2r4ss   2/2     Running   0          35m
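
With this many pods it is easy to miss one that is still starting. As a small sketch, the table can be piped through a filter that prints only the pods that are not fully ready. pods_ready is a hypothetical helper name; it assumes the default kubectl get pods column layout shown above (NAME, READY, STATUS, ...).

```shell
# pods_ready
# Reads a `kubectl get pods` table on stdin. Prints the name of every pod that
# is not Running with all containers ready; exits 0 only if there are none.
pods_ready() {
  awk 'NR > 1 {
         split($2, r, "/")
         if ($3 != "Running" || r[1] != r[2]) { print $1; bad = 1 }
       }
       END { exit bad + 0 }'
}

# Usage: kubectl get pods -n monitoring | pods_ready && echo "all pods ready"
```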

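4-3. Configure NetworkPolicy

[!NOTE]
Kubernetes network policies operate at the pod level; keep this in mind.
Which pods a rule applies to is controlled through podSelector.
This tutorial uses Calico.

The official manifests ship with network isolation by default, so ingress nginx cannot reach the services.

The policies have to be adjusted by hand so that the ingress nginx controller is allowed in.

[h4]4-3-1. Look up the Ingress Nginx pod labels[/h4]

List the ingress-nginx pods:

kubectl get pods -n ingress-nginx

Output: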

NAME                                   READY   STATUS      RESTARTS   AGE
ingress-nginx-admission-create-hxh9h   0/1     Completed   0          17h
ingress-nginx-admission-patch-5hh89    0/1     Completed   0          17h
ingress-nginx-controller-g8wdx         1/1     Running     0          17h
nginx-errors-5c6dd76c59-xnb4b          1/1     Running     0          17h

Show the labels:

kubectl get pod ingress-nginx-controller-g8wdx --show-labels -n ingress-nginx

Output:

NAME                             READY   STATUS    RESTARTS   AGE   LABELS
ingress-nginx-controller-g8wdx   1/1     Running   0          17h   app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx,controller-revision-hash=6b75b78ffb,pod-template-generation=1

[!NOTE]
The pod carries several labels. For network rules, app.kubernetes.io/name=ingress-nginx is the one normally used to select it.

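[h4]4-3-2. Configure the Prometheus NetworkPolicy[/h4]

Open prometheus-networkPolicy.yaml:

vim prometheus-networkPolicy.yaml

Add a network rule.
Add an ingress rule whose podSelector matches the ingress-nginx label found above. Leave the policy's own spec.podSelector unchanged: it selects the Prometheus pods that the policy protects, and a second app.kubernetes.io/name key there would stop the policy from matching them.

...
  - from:
    - podSelector:
        matchLabels:
          app.kubernetes.io/name: grafana
    ports:
    - port: 9090
      protocol: TCP
  # Add this rule: allow the ingress-nginx controller pods to reach port 9090.
  # kubernetes.io/metadata.name is set automatically on namespaces (Kubernetes 1.21+).
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: ingress-nginx
      podSelector:
        matchLabels:
          app.kubernetes.io/name: ingress-nginx
    ports:
    - port: 9090
      protocol: TCP
  podSelector:
    matchLabels:
      app.kubernetes.io/component: prometheus
      app.kubernetes.io/instance: k8s
      app.kubernetes.io/name: prometheus
      app.kubernetes.io/part-of: kube-prometheus
  policyTypes:
  - Egress
  - Ingress
...

Apply the updated policy:

kubectl apply -f prometheus-networkPolicy.yaml

Output:

networkpolicy.networking.k8s.io/prometheus-k8s configured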

networkpolicy.networking.k8s.io/prometheus-k8s configured

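[h4]4-3-3. Configure the Grafana NetworkPolicy[/h4]

Open grafana-networkPolicy.yaml:

vim grafana-networkPolicy.yaml

Add a network rule.
Add an ingress rule that lets the ingress-nginx pods (label found above) reach Grafana on port 3000. Leave spec.podSelector unchanged: it selects the Grafana pods that the policy protects. The existing entries may differ slightly in your copy of the manifests; only the marked rule is new.

...
  - from:
    - podSelector:
        matchLabels:
          app.kubernetes.io/name: prometheus
    ports:
    - port: 3000
      protocol: TCP
  # Add this rule: allow the ingress-nginx controller pods to reach port 3000.
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: ingress-nginx
      podSelector:
        matchLabels:
          app.kubernetes.io/name: ingress-nginx
    ports:
    - port: 3000
      protocol: TCP
  podSelector:
    matchLabels:
      app.kubernetes.io/component: grafana
      app.kubernetes.io/name: grafana
      app.kubernetes.io/part-of: kube-prometheus
  policyTypes:
  - Egress
  - Ingress
...

Apply the updated policy:

kubectl apply -f grafana-networkPolicy.yaml

Output:

networkpolicy.networking.k8s.io/grafana configured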

networkpolicy.networking.k8s.io/grafana configured

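4-4. Create the Ingress

With the network policies in place, the Ingress resources can be created to expose both services.

Create monitoring-ingress.yaml:

vim monitoring-ingress.yaml

Add the following YAML:

[!NOTE]
The hostnames below are my own. To use your own domain, change the - host values.
How to set up domain access for Ingress Nginx is covered in the tutorial linked at the top of this post.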

---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    k8s.eip.work/workload: grafana
    k8s.kuboard.cn/workload: grafana
  labels:
    app: grafana
  name: grafana
  namespace: monitoring
spec:
  ingressClassName: 'nginx'
  rules:
    - host: k8s-grafana.janrs.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: grafana
                port:
                  number: 3000
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    k8s.kuboard.cn/workload: prometheus-k8s
  labels:
    app: prometheus
    prometheus: k8s
  name: prometheus-k8s
  namespace: monitoring
spec:
  ingressClassName: 'nginx'
  rules:
    - host: k8s-prom.janrs.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: prometheus-k8s
                port:
                  number: 9090

Apply it:

kubectl apply -f monitoring-ingress.yaml

Check the result:

kubectl get ingress -n monitoring

Output:

NAME             CLASS   HOSTS                   ADDRESS          PORTS   AGE
grafana          nginx   k8s-grafana.janrs.com   172.31.235.118   80      3h31m
prometheus-k8s   nginx   k8s-prom.janrs.com      172.31.235.118   80      3h31m
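
Before DNS for the hostnames is in place, the routing can be checked from any machine that reaches the ADDRESS shown above by sending the hostname in the Host header. This is only a sketch: check_ingress is a hypothetical helper name, and it assumes the controller serves plain HTTP on port 80 at that address. /api/health and /-/healthy are the standard unauthenticated health endpoints of Grafana and Prometheus.

```shell
# check_ingress HOST ADDRESS PATH
# Requests PATH through the ingress controller at ADDRESS, using HOST as the
# virtual host, and prints only the HTTP status code.
check_ingress() {
  curl -s -o /dev/null -w '%{http_code}\n' -H "Host: $1" "http://$2$3"
}

# Usage:
# check_ingress k8s-grafana.janrs.com 172.31.235.118 /api/health
# check_ingress k8s-prom.janrs.com 172.31.235.118 /-/healthy
```

A 200 means the request passed the Ingress and the NetworkPolicy. A gateway timeout (504) from nginx usually points back at the NetworkPolicy step.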

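Access by domain name
Open the hostnames you configured in a browser.

[Screenshot: Grafana login page]

[Screenshot: Prometheus page]

4-5. Add a Dashboard

In the left sidebar, go to Dashboards -> Import

Enter the ID of a commonly used dashboard: 8919

Then import it.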

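4-6. Deploy Node Exporter on every node

[!NOTE]
Node Exporter only reports on the node it runs on, so it has to run on every node you want to monitor.
Here this is done with node affinity, which selects nodes by label; see the Kubernetes documentation for details.
The affinity rule below only matches nodes labelled node-label-node-exporter=true, so label every node you want monitored before applying it.
Dashboard 8919 added above will then show resource usage for each node.

Open nodeExporter-daemonset.yaml:

vim nodeExporter-daemonset.yaml

Switch it to node affinity.

Comment out the nodeSelector and replace it with a node affinity rule: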

...
          runAsGroup: 65532
          runAsNonRoot: true
          runAsUser: 65532
      hostNetwork: true
      hostPID: true
      # Comment out or delete the nodeSelector
      #nodeSelector:
        #node-label-prometheus: 'true'
        #kubernetes.io/os: linux
      # Add the node affinity rule instead
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: node-label-node-exporter
                operator: In
                values:
                - 'true'
      priorityClassName: system-cluster-critical
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
      serviceAccountName: node-exporter
...
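
The affinity rule above matches only nodes that carry the label node-label-node-exporter=true, so the nodes need that label for any node-exporter pod to be scheduled. A minimal sketch for labelling several nodes in one go follows; label_nodes is a hypothetical helper name, and the node names in the usage line are the worker names that appear in this post's output.

```shell
# label_nodes LABEL NODE...
# Applies LABEL (key=value) to each named node, overwriting an existing value.
label_nodes() {
  label=$1
  shift
  for node in "$@"; do
    kubectl label node "$node" "$label" --overwrite
  done
}

# Usage:
# label_nodes node-label-node-exporter=true k8s-node01 k8s-node02 k8s-node03
```

kubectl get nodes -L node-label-node-exporter then shows which nodes carry the label.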

Apply it:

kubectl apply -f nodeExporter-daemonset.yaml

Check the result:

kubectl get pods -n monitoring -o wide | grep node-exporter

Output:

[!NOTE]
I have 3 worker nodes, so only three pods are listed.

node-exporter-2q2rw                    2/2     Running   0          15m   172.16.222.231   k8s-node02   <none>           <none>
node-exporter-jprr2                    2/2     Running   0          14m   172.16.222.233   k8s-node03   <none>           <none>
node-exporter-pj7gn                    2/2     Running   0          15m   172.16.222.230   k8s-node01   <none>           <none>

[Screenshot: Grafana monitoring dashboard]

Uninstalling Prometheus

kubectl delete --ignore-not-found=true -f manifests/ -f manifests/setup

That completes the deployment of Prometheus and Grafana on Kubernetes.