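The four services the monitoring stack needs

n * Node Exporter (collects host hardware and operating system metrics)
n * cAdvisor (collects metrics for the containers running on the host)
1 * Prometheus Server (the main Prometheus monitoring server)
1 * Grafana (dashboards for the data Prometheus collects)

Each monitored server only needs to run Node Exporter and cAdvisor.

Their addresses are then added to the targets in the prometheus.yml configuration file.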

Planning

At present there is only one host (192.168.0.108).

Node Exporter: port 9100

cAdvisor: port 8080

Prometheus: port 9090

Grafana: port 3000

Deployment

Start Node Exporter

https://github.com/prometheus/node_exporter/

docker run -d \
-v "/proc:/host/proc" \
-v "/sys:/host/sys" \
-v "/:/rootfs" \
-v "/etc/localtime:/etc/localtime" \
--net=host \
--name=node-exporter \
prom/node-exporter \
--path.procfs /host/proc \
--path.sysfs /host/sys \
--collector.filesystem.ignored-mount-points "^/(sys|proc|dev|host|etc)($|/)"

Start cAdvisor

https://github.com/google/cadvisor

docker run -d \
--volume=/:/rootfs:ro \
--volume=/var/run:/var/run:rw \
--volume=/sys:/sys:ro \
--volume=/var/lib/docker/:/var/lib/docker:ro \
--name=cadvisor \
--net=host \
-v "/etc/localtime:/etc/localtime" \
google/cadvisor:latest

Start Prometheus

Prometheus - Monitoring system & time series database

The Prometheus configuration file mainly lists the addresses to scrape: the Node Exporter and cAdvisor endpoints of every monitored machine.

# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "docker-cluster"
    static_configs:
      - targets: ["192.168.0.108:9100","192.168.0.108:8080"]
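As more hosts are added, the targets line of the docker-cluster job grows by two entries per host. The following is a minimal sketch of generating that line, assuming every host uses the ports from the plan (9100 for Node Exporter, 8080 for cAdvisor). The function name make_targets and the second host 192.168.0.109 are hypothetical and only serve as illustration.

```shell
#!/bin/sh
# Print a static_configs "targets" line covering Node Exporter (9100)
# and cAdvisor (8080) on every host given as an argument.
make_targets() {
  list=""
  for host in "$@"; do
    for port in 9100 8080; do
      # Prefix a comma only when the list already has an entry.
      list="${list:+$list,}\"$host:$port\""
    done
  done
  printf '      - targets: [%s]\n' "$list"
}

# 192.168.0.109 is a hypothetical second host, used only for illustration.
make_targets 192.168.0.108 192.168.0.109
```

Called with only 192.168.0.108, the function reproduces the targets line used in the configuration above.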

Start the container, making sure the configuration file is mounted at the right path

docker run -d \
-v /opt/sc/runner/prometheus.yml:/etc/prometheus/prometheus.yml \
-v "/etc/localtime:/etc/localtime" \
--name prometheus \
--net=host \
prom/prometheus
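Once the container is running, it helps to confirm that Prometheus can actually reach the configured targets. The sketch below queries the Prometheus HTTP API endpoint /api/v1/targets and counts targets per health state. It assumes curl is installed and that the JSON response is compact (no spaces after colons), which is why a plain grep is enough here; the function names and the PROM_URL variable are hypothetical.

```shell
#!/bin/sh
# Address of the Prometheus server from the plan; override via PROM_URL.
PROM_URL="${PROM_URL:-http://192.168.0.108:9090}"

# Read the JSON from /api/v1/targets on stdin and print one health
# state (for example "up" or "down") per line.
health_states() {
  grep -o '"health":"[a-z]*"' | cut -d'"' -f4
}

# Fetch the target list and summarise how many targets are in each state.
show_target_health() {
  curl -fsS "$PROM_URL/api/v1/targets" | health_states | sort | uniq -c
}

# Query the server only when asked to, e.g. `sh targets.sh run`.
if [ "${1:-}" = "run" ]; then show_target_health; fi
```

A target reported as down usually means the exporter on that host is not running or the address in prometheus.yml is wrong.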

Start Grafana

docker run -d \
-v "/etc/localtime:/etc/localtime" \
-e "GF_SERVER_ROOT_URL=http://grafana.server.name" \
-e "GF_SECURITY_ADMIN_PASSWORD=admin8888" \
--name grafana \
--net=host \
grafana/grafana

Username and password: admin / admin8888
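With all four containers started, a quick probe of each service confirms the deployment. This is a rough sketch that assumes curl is available and uses the host and ports from the plan; /metrics, /-/healthy and /api/health are the standard endpoints of the respective services, while the function names and the MONITOR_HOST variable are hypothetical.

```shell
#!/bin/sh
# Host from the plan; override via MONITOR_HOST.
MONITOR_HOST="${MONITOR_HOST:-192.168.0.108}"

# Print one "name url" pair per line for the four services.
endpoints() {
  cat <<EOF
node-exporter http://$MONITOR_HOST:9100/metrics
cadvisor http://$MONITOR_HOST:8080/metrics
prometheus http://$MONITOR_HOST:9090/-/healthy
grafana http://$MONITOR_HOST:3000/api/health
EOF
}

# Probe each endpoint and report OK or FAIL.
check_all() {
  endpoints | while read -r name url; do
    if curl -fsS -o /dev/null --max-time 5 "$url"; then
      echo "OK   $name"
    else
      echo "FAIL $name ($url)"
    fi
  done
}

# Run the probes only when asked to, e.g. `sh check.sh run`.
if [ "${1:-}" = "run" ]; then check_all; fi
```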

Configure Prometheus as a data source

Open Grafana in a browser at 192.168.0.108:3000.

Configuration - Data sources - Add data source - choose the Prometheus type - set the URL to http://192.168.0.108:9090; the other fields can keep their defaults.
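The same data source can also be created without the UI, through Grafana's HTTP API (POST /api/datasources). The sketch below assumes curl is installed and reuses the admin credentials and addresses from this post; the data source name "Prometheus", the function names and the two URL variables are illustrative choices, not something the UI steps prescribe.

```shell
#!/bin/sh
# Addresses from the plan; override via the environment if needed.
GRAFANA_URL="${GRAFANA_URL:-http://192.168.0.108:3000}"
PROM_URL="${PROM_URL:-http://192.168.0.108:9090}"

# Build the JSON body for Grafana's POST /api/datasources endpoint.
datasource_json() {
  printf '{"name":"Prometheus","type":"prometheus","url":"%s","access":"proxy","isDefault":true}\n' "$PROM_URL"
}

# Create the data source using the admin credentials from this post.
create_datasource() {
  datasource_json | curl -fsS -u admin:admin8888 \
    -H 'Content-Type: application/json' \
    -X POST --data @- "$GRAFANA_URL/api/datasources"
}

# Call the API only when asked to, e.g. `sh datasource.sh run`.
if [ "${1:-}" = "run" ]; then create_datasource; fi
```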

Import a dashboard

Search the site below for the dashboard you want to add.

Fixing the error when cAdvisor is started with Docker

Failed to start container manager: inotify_add_watch
/sys/fs/cgroup/cpuacct,cpu: no such file or directory

mount -o remount,rw '/sys/fs/cgroup'
ln -s /sys/fs/cgroup/cpu,cpuacct /sys/fs/cgroup/cpuacct,cpu

Then restart the container.
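Running ln -s a second time fails because the link already exists, so the symlink step can be wrapped in a small check. This is a hedged sketch: the function name ensure_cgroup_alias is hypothetical, and the cgroup directory is passed as a parameter so the logic can be tried on a scratch directory before touching /sys/fs/cgroup.

```shell
#!/bin/sh
# Create the cpuacct,cpu alias inside the given cgroup directory unless
# it already exists.
ensure_cgroup_alias() {
  dir="$1"
  if [ -e "$dir/cpuacct,cpu" ]; then
    echo "alias already present"
  elif [ -e "$dir/cpu,cpuacct" ]; then
    ln -s "$dir/cpu,cpuacct" "$dir/cpuacct,cpu" && echo "alias created"
  else
    echo "no cpu,cpuacct entry in $dir" >&2
    return 1
  fi
}

# On the real host (as root, after remounting /sys/fs/cgroup read-write):
#   ensure_cgroup_alias /sys/fs/cgroup
```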

A First Taste of Monitoring Docker with Prometheus and Grafana

http://www.tung7.com/实践出真知/初尝Prometheus与Grafana监控Docker.html

Author

Tung7

Posted on

August 4, 2021

Licensed under