How to Monitor a Kubernetes Cluster on Debian

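1. Prometheus + Grafana (the mainstream option)
Prometheus is the de facto standard for cloud-native monitoring and specializes in collecting time-series metrics. Grafana is a visualization tool that turns the data stored in Prometheus into dashboards. On Debian, both can be deployed quickly with Helm: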

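Add the Prometheus community Helm repository: helm repo add prometheus-community https://prometheus-community.github.io/helm-charts && helm repo update
Create a namespace for monitoring: kubectl create namespace monitoring
Install the kube-prometheus-stack chart (it bundles the Prometheus Operator, Prometheus, Alertmanager, Grafana, node-exporter, and kube-state-metrics): helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack -n monitoring
Access Grafana. The chart exposes it as a ClusterIP Service on port 80 (the Grafana container itself listens on 3000), so forward a local port with kubectl -n monitoring port-forward svc/kube-prometheus-stack-grafana 3000:80 and open http://localhost:3000. The user name is admin; the password is stored in the kube-prometheus-stack-grafana Secret (the chart's default has been prom-operator, not admin). Prometheus is already configured as a data source and a set of Kubernetes dashboards is preinstalled, so CPU, memory, and network metrics for nodes, Pods, and containers are available right away. Additional dashboards can be imported from grafana.com.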
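The Grafana access step can be sketched as two small shell helpers. This is a minimal sketch, assuming the release name kube-prometheus-stack and the namespace monitoring used above, which should make both the Grafana Secret and the Service kube-prometheus-stack-grafana; adjust the variables if your release is named differently.

```shell
#!/bin/sh
# Assumed names: release "kube-prometheus-stack" in namespace "monitoring",
# so the Grafana Secret and Service are both "kube-prometheus-stack-grafana".
NAMESPACE="${NAMESPACE:-monitoring}"
GRAFANA="${GRAFANA:-kube-prometheus-stack-grafana}"

# Print the Grafana admin password stored in the chart's Secret.
grafana_password() {
  kubectl -n "$NAMESPACE" get secret "$GRAFANA" \
    -o jsonpath='{.data.admin-password}' | base64 -d
}

# Forward local port 3000 to the Grafana Service (port 80 in the chart defaults).
# Runs in the foreground until interrupted.
grafana_forward() {
  kubectl -n "$NAMESPACE" port-forward "svc/$GRAFANA" 3000:80
}
```

After sourcing the file, run grafana_password to print the password, then grafana_forward and open http://localhost:3000.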

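2. Command-line tools (quick troubleshooting)

kubectl: the native Kubernetes command-line tool. A few simple commands are enough to check cluster state. Common commands:
- kubectl get nodes: show node status (Ready means healthy);
- kubectl get pods --all-namespaces: show Pod status in every namespace (Running means the Pod is running);
- kubectl top nodes: show node resource usage (requires Metrics Server);
- kubectl describe pod <pod-name> -n <namespace>: show Pod details such as events and container states.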
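For routine checks, the first two commands can be wrapped in filters that print only what needs attention. The sketch below is one possible approach, not an official tool; the function names are made up for illustration, and the column positions assume the default kubectl table output.

```shell
#!/bin/sh
# Print "<node> <status>" for every node whose STATUS column is not exactly
# "Ready" (cordoned nodes show "Ready,SchedulingDisabled" and are listed too).
not_ready_nodes() {
  kubectl get nodes --no-headers | awk '$2 != "Ready" { print $1, $2 }'
}

# Print "<namespace>/<pod> <status>" for Pods that are neither Running nor Completed.
# Columns with --all-namespaces: NAMESPACE NAME READY STATUS RESTARTS AGE
unhealthy_pods() {
  kubectl get pods --all-namespaces --no-headers |
    awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}
```

A Pod can be Running while its containers are not ready (READY shows 0/1), so an empty result from unhealthy_pods is a first signal, not proof of health; kubectl describe pod remains the next step for anything listed.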

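3. Visual tools (an at-a-glance view of the cluster)

Kubernetes Dashboard: the official web UI, well suited to newcomers. Deployment steps:
- kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml
- This manifest installs the Dashboard into the kubernetes-dashboard namespace, not kube-system. Check the Service with kubectl -n kubernetes-dashboard get svc kubernetes-dashboard, then reach it through kubectl proxy or port forwarding (for example http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/). Log in with a ServiceAccount bearer token to browse cluster resources.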
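The proxy URL depends on the namespace the Dashboard was installed into, which is easy to get wrong. A minimal sketch of helpers for the URL and the login token follows; admin-user is a hypothetical ServiceAccount that must already exist with suitable RBAC, and kubectl create token needs kubectl 1.24 or newer.

```shell
#!/bin/sh
# Build the kubectl-proxy URL for a Dashboard Service served over HTTPS.
# $1 = namespace (default: kubernetes-dashboard), $2 = local proxy port (default: 8001)
dashboard_url() {
  ns="${1:-kubernetes-dashboard}"
  port="${2:-8001}"
  printf 'http://localhost:%s/api/v1/namespaces/%s/services/https:kubernetes-dashboard:/proxy/\n' \
    "$port" "$ns"
}

# Issue a short-lived login token for a ServiceAccount.
# $1 = namespace, $2 = ServiceAccount name ("admin-user" is a placeholder).
dashboard_token() {
  kubectl -n "${1:-kubernetes-dashboard}" create token "${2:-admin-user}"
}
```

With kubectl proxy running in another terminal, dashboard_url prints the address to open and dashboard_token prints the token to paste into the login form.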
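K9s: a terminal UI for users who prefer the command line. Install: curl -LO "https://github.com/derailed/k9s/releases/latest/download/k9s_$(uname -s)_$(dpkg --print-architecture).tar.gz" && tar xzvf k9s_*.tar.gz && sudo mv k9s /usr/local/bin. Recent release archives are named by architecture such as amd64 or arm64, which is what dpkg --print-architecture prints on Debian, whereas uname -m prints x86_64 or aarch64. After starting k9s, shortcuts such as d (describe) and l (logs) make it quick to inspect Pods, nodes, Deployments, and other resources.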
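On systems without dpkg, the architecture can be mapped by hand. The sketch below assumes that recent K9s releases name their archives k9s_<OS>_<arch>.tar.gz with amd64 and arm64 as architecture names; check the release page if the download fails, since the naming has changed between versions.

```shell
#!/bin/sh
# Map `uname -s` / `uname -m` output to the assumed K9s release archive name.
# $1 = OS name (e.g. Linux), $2 = machine name (e.g. x86_64)
k9s_asset_name() {
  case "$2" in
    x86_64) arch=amd64 ;;
    aarch64 | arm64) arch=arm64 ;;
    *) arch="$2" ;;   # pass anything else through unchanged
  esac
  printf 'k9s_%s_%s.tar.gz\n' "$1" "$arch"
}

# Download the archive for the current machine from the latest release.
download_k9s() {
  asset="$(k9s_asset_name "$(uname -s)" "$(uname -m)")"
  curl -fsSLO "https://github.com/derailed/k9s/releases/latest/download/$asset"
}
```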

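4. Node and container monitoring (basic metric collection)

cAdvisor: built into the kubelet; collects CPU, memory, disk, and network metrics for the containers on each node. No separate installation is needed. The kubelet does not serve cAdvisor on a port of its own (8080 is the default of a standalone cAdvisor); the metrics are exposed at https://<node-ip>:10250/metrics/cadvisor, which requires authentication, or through the API server with kubectl get --raw /api/v1/nodes/<node-name>/proxy/metrics/cadvisor. Prometheus can scrape the same endpoint.
node-exporter: collects node-level system metrics (CPU, memory, disk I/O, network traffic). It runs as a DaemonSet; kube-prometheus-stack already includes it, and a standalone install is available as the Helm chart prometheus-community/prometheus-node-exporter. Metrics are served at http://<node-ip>:9100/metrics and can be scraped by Prometheus.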
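Both endpoints return the Prometheus text exposition format, so a quick look without Prometheus is possible with standard tools. The helper below is a rough sketch with a made-up name (top_series); it only handles samples that carry a label set and relies on sort -g, which GNU sort on Debian provides.

```shell
#!/bin/sh
# Read Prometheus text format on stdin and print the highest-valued series
# of one metric as "<value> <labels>".
# $1 = metric name, $2 = number of series to print (default 5)
top_series() {
  awk -v name="$1" '
    # Keep lines that start with "<name>{" and find the end of the label set.
    index($0, name "{") == 1 && match($0, /^.*[}] /) {
      labels = substr($0, length(name) + 1, RLENGTH - 1 - length(name))
      # The value is the first token after the labels (a timestamp may follow).
      split(substr($0, RLENGTH + 1), parts, " ")
      print parts[1], labels
    }' | sort -gr | head -n "${2:-5}"
}
```

For example, kubectl get --raw /api/v1/nodes/<node-name>/proxy/metrics/cadvisor | top_series container_memory_working_set_bytes 10 lists the ten largest working sets on that node. cAdvisor reports Pod-level and container-level series side by side, so the values should be compared, not added up.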

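5. Cluster state monitoring (object state metrics)

kube-state-metrics: watches the Kubernetes API server and generates metrics about the state of cluster objects such as Deployments, Pods, Nodes, and Services (replica counts, Pod restart counts, node conditions). kube-prometheus-stack already includes it. For a standalone install, use the Helm chart prometheus-community/kube-state-metrics or the manifests in the examples/standard directory of the kubernetes/kube-state-metrics repository. Object metrics are served on port 8080 at /metrics (port 8081 carries the exporter's own telemetry). View them at http://<kube-state-metrics-ip>:8080/metrics, or let Prometheus scrape them to feed object state data to Grafana dashboards.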
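As an illustration of what these object metrics look like, the sketch below filters kube_pod_container_status_restarts_total, the restart counter that kube-state-metrics exports, and prints containers that restarted more than a given number of times. The function name restarting_pods is made up for this example.

```shell
#!/bin/sh
# Read kube-state-metrics output on stdin and print "<namespace>/<pod> <restarts>"
# for every container whose restart count is greater than $1 (default 0).
restarting_pods() {
  threshold="${1:-0}"
  awk -v min="$threshold" '
    index($0, "kube_pod_container_status_restarts_total{") == 1 {
      value = $NF
      if (value + 0 <= min + 0) next
      print label($0, "namespace") "/" label($0, "pod"), value
    }
    # Return the value of one label, or "" if the label is absent.
    function label(line, key,    pat, s) {
      pat = "[{,]" key "=\"[^\"]*\""
      if (!match(line, pat)) return ""
      s = substr(line, RSTART, RLENGTH)
      sub(/^[^"]*"/, "", s)   # drop everything up to the opening quote
      sub(/"$/, "", s)        # drop the closing quote
      return s
    }'
}
```

One way to feed it, assuming the kube-prometheus-stack release from section 1 (Service name kube-prometheus-stack-kube-state-metrics is an assumption derived from that release name): run kubectl -n monitoring port-forward svc/kube-prometheus-stack-kube-state-metrics 8080:8080 in one terminal, then curl -s http://localhost:8080/metrics | restarting_pods 3 in another.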

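6. Logs and events (additional troubleshooting aids)

Kubetail: merges the logs of several Pods into a single terminal stream, which makes the logs of a distributed application easier to follow. Install: curl -LO https://raw.githubusercontent.com/johanhaleby/kubetail/master/kubetail && chmod +x kubetail && sudo mv kubetail /usr/local/bin, then run kubetail <pod-prefix> to tail every Pod whose name starts with that prefix.
Kubewatch: watches Kubernetes events (Pod creation and deletion, node failures, and so on) and sends them to notification channels such as Slack or email. Deploy: kubectl apply -f https://raw.githubusercontent.com/bitnami-labs/kubewatch/master/deploy/kubewatch.yaml, then edit the ConfigMap to configure the notification channel (for example a Slack webhook). The bitnami-labs repository has been archived and the project is now maintained at robusta-dev/kubewatch, so check there for current manifests.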
