上一篇文章中,我们讲到如何安装 kube dashboard 对我们 k8s 环境的对象运行状况进行观测,今天我们将讲解如何使用 kube-prometheus-stack 对其进行监控。
kube-prometheus-stack 安装
这里我们直接使用 helm 安装,文档地址参考 https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack。
主要命令如下:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm show values prometheus-community/kube-prometheus-stack > ./values.yaml # 更改默认配置,比如抓取、告警、以及持久卷等。
helm upgrade --install kube-prometheus-stack prometheus-community/kube-prometheus-stack -f values.yaml -n monitoring
备注: 如果遇到 x509的问题,可以使用参数 –insecure-skip-tls-verify 忽略证书校验。
k8s 服务运行状态
当命令运行成功后,我们已经在 monitoring 命令空间下部署了 kube-prometheus-stack,当我们执行 kubectl get all -n monitoring
,可以看到所有的相关对象运行情况如下:
# pods
NAME READY STATUS RESTARTS AGE
pod/alertmanager-kube-prometheus-stack-alertmanager-0 2/2 Running 2 (4h21m ago) 19h
pod/kube-prometheus-stack-grafana-65df74f4fb-7t4k7 3/3 Running 3 (4h21m ago) 19h
pod/kube-prometheus-stack-kube-state-metrics-57dc5579bd-4v97f 1/1 Running 2 (4h20m ago) 19h
pod/kube-prometheus-stack-operator-75b7d5b98d-nxkft 1/1 Running 2 (4h19m ago) 19h
pod/kube-prometheus-stack-prometheus-node-exporter-2545c 1/1 Running 1 (4h21m ago) 19h
pod/kube-prometheus-stack-prometheus-node-exporter-4twsb 1/1 Running 1 (4h21m ago) 19h
pod/kube-prometheus-stack-prometheus-node-exporter-z6bff 1/1 Running 1 (4h21m ago) 19h
pod/prometheus-kube-prometheus-stack-prometheus-0 2/2 Running 2 (4h21m ago) 19h
# services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 19h
service/kube-prometheus-stack-alertmanager ClusterIP 10.96.223.171 <none> 9093/TCP,8080/TCP 19h
service/kube-prometheus-stack-grafana ClusterIP 10.96.209.142 <none> 80/TCP 19h
service/kube-prometheus-stack-kube-state-metrics ClusterIP 10.96.173.162 <none> 8080/TCP 19h
service/kube-prometheus-stack-operator ClusterIP 10.96.156.60 <none> 443/TCP 19h
service/kube-prometheus-stack-prometheus ClusterIP 10.96.33.206 <none> 9090/TCP,8080/TCP 19h
service/kube-prometheus-stack-prometheus-node-exporter ClusterIP 10.96.49.229 <none> 9100/TCP 19h
service/prometheus-operated ClusterIP None <none> 9090/TCP 19h
# daemonset
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/kube-prometheus-stack-prometheus-node-exporter 3 3 3 3 3 kubernetes.io/os=linux 19h
# deployment
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/kube-prometheus-stack-grafana 1/1 1 1 19h
deployment.apps/kube-prometheus-stack-kube-state-metrics 1/1 1 1 19h
deployment.apps/kube-prometheus-stack-operator 1/1 1 1 19h
# replicaset
NAME DESIRED CURRENT READY AGE
replicaset.apps/kube-prometheus-stack-grafana-65df74f4fb 1 1 1 19h
replicaset.apps/kube-prometheus-stack-kube-state-metrics-57dc5579bd 1 1 1 19h
replicaset.apps/kube-prometheus-stack-operator-75b7d5b98d 1 1 1 19h
# statefulset
NAME READY AGE
statefulset.apps/alertmanager-kube-prometheus-stack-alertmanager 1/1 19h
statefulset.apps/prometheus-kube-prometheus-stack-prometheus 1/1 19h
通过最终部署的 pods 可以看到,默认 kube-prometheus-stack 配置一共包含:
- 1 个 Prometheus server
- 3 个 node exporter, 因为本地 kind 部署 k8s 集群有1个控制面节点,2个 Worker 节点
- 1 个 alertmanager
- 1 个 Grafana
- 1 个 kube-state-metrics 和 一个 kube-prometheus-stack-operator
查看 Grafana 和 Prometheus 控制台
因为默认配置没有启用 ingress,所以我们可以通过修改 service 的 type 为 NodePort 方式,然后使用 node 的IP+端口进行访问。
- 修改 grafana service 的配置
kubectl edit svc/kube-prometheus-stack-grafana -n monitoring
==>
spec:
ports:
port: 80
protocol: TCP
targetPort: 3000
nodePort: 30030 # 添加对应 Node节点访问端口
....
type: NodePort # 修改为 NodePort 类型
此时通过<控制面节点IP>:30030 即可进入 grafana 页面,默认密码为 admin/prom-operator。控制面节点IP>
- 修改 prometheus service 的配置
kubectl edit svc/kube-prometheus-stack-prometheus -n monitoring
同理修改为 NodePort 类型,并指定响应端口即可。
简单总结
到目前为止,我们就通过 kube-prometheus-stack 快速搭建了 Prometheus 监控环境,除了使用 heml 之外,我还可以使用 Prometheus Operator 的 manifests 进行安装,它默认是多实例部署。