Building an Exporter in Python and Integrating Prometheus and Grafana for Process Monitoring

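In modern software development and operations, monitoring is how you keep a system stable and react quickly when something goes wrong. Prometheus and Grafana are a widely used pairing for this: Prometheus collects and stores metrics, and Grafana visualizes them. This article shows how to write an Exporter in Python, have Prometheus scrape it, and display per-process metrics in Grafana.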

1. Environment Setup

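First, make sure Python, Prometheus, and Grafana are installed in your environment. The basic installation steps are:

Install Python: download the installer from the official Python website and install it. Make sure the version is Python 3.x.

Install Prometheus: download the package for your operating system from the Prometheus release page on GitHub, extract it, and configure it.

Install Grafana: download Grafana from the official Grafana website and install it.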

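2. Developing the Python Exporter

An Exporter is a program in the Prometheus ecosystem that exposes monitoring data over HTTP so that the Prometheus server can scrape it. We will use the Python prometheus_client library to write a simple Exporter that monitors system processes.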
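To make "exposing monitoring data" concrete, here is a minimal standard-library sketch of the text format a scrape returns. The helper names `render_gauge` and `escape_label_value` are hypothetical and exist only for illustration. In the real Exporter below, prometheus_client produces this output for you.

```python
from __future__ import annotations


def escape_label_value(value: str) -> str:
    """Escape backslash, double quote and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(name: str, help_text: str,
                 samples: list[tuple[dict[str, str], float]]) -> str:
    """Render one gauge family in the Prometheus text exposition format."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(
            f'{key}="{escape_label_value(val)}"' for key, val in labels.items()
        )
        # Omit the braces entirely when a sample has no labels
        series = f"{name}{{{label_str}}}" if label_str else name
        lines.append(f"{series} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_gauge("cpu_utilization", "CPU utilization",
                       [({"pid": "123", "name": "python"}, 1.5)]))
```

Each scrape is simply an HTTP GET that returns lines like these. Prometheus attaches its own timestamp when it stores the samples.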

Step 1: Install the third-party libraries

Install the prometheus_client, pyyaml, and psutil libraries in your Python environment:

pip install prometheus_client
pip install pyyaml
pip install psutil

Step 2: Write the Exporter script

Create a Python script that collects information about system processes and exposes it to Prometheus.

import time
from concurrent.futures import ThreadPoolExecutor

import psutil
import yaml
from prometheus_client import start_http_server, Gauge, Info

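# Read the YAML configuration file
def read_yaml(file_path):
    with open(file_path, 'r', encoding='utf-8') as file:
        try:
            return yaml.safe_load(file)
        except yaml.YAMLError as e:
            # Report the parse error, then re-raise so the script does not continue with data = None
            print(e)
            raise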

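# Collect metrics for one process
def print_process(pid):
    try:
        # Look up the process object by PID
        process = psutil.Process(pid)
        # cpu_percent(interval=1) blocks for one second while it samples CPU usage
        # Result: PID, process name, CPU utilization (%), resident memory (bytes), memory share (%)
        return [pid, process.name(), process.cpu_percent(interval=1), process.memory_info().rss, process.memory_percent()]
    except psutil.NoSuchProcess:
        # Also reached if the process exits while it is being sampled
        print(f"Process ID {pid} does not exist")
        time.sleep(1)
        return [-1, 'process not found', 0, 0, 0]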

# Load the configuration
yaml_file_path = 'config.yml'  # Replace with the path to your YAML file
data = read_yaml(yaml_file_path)
pid_list = data['pid_list']

# Exporter metadata and per-process metrics
subprocess_exporter_info = Info('subprocess_exporter_info', 'Basic information about the subprocess exporter')
subprocess_info = Gauge('subprocess_info', 'Subprocess information', ['pid', 'name'])
cpu_utilization = Gauge('cpu_utilization', 'CPU utilization (%)', ['pid', 'name'])
memory = Gauge('memory', 'Memory (MB)', ['pid', 'name'])
memory_usage_rate = Gauge('memory_usage_rate', 'Memory usage rate (%)', ['pid', 'name'])

# Set the static exporter information
subprocess_exporter_info.info({'version': '1.1.2', 'author': '岳罡', 'blog': 'https://www.cnblogs.com/test-gang'})

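# Sample one process and update its metrics
def process_request(pid):
    a = print_process(pid)
    pid_label, name_label = str(pid), str(a[1])
    # Calling labels() without set() still registers the series (its value stays 0)
    subprocess_info.labels(pid=pid_label, name=name_label)
    cpu_utilization.labels(pid=pid_label, name=name_label).set(a[2])
    # Convert resident memory from bytes to MB (1 MB = 1048576 bytes)
    memory.labels(pid=pid_label, name=name_label).set(a[3] / 1048576)
    memory_usage_rate.labels(pid=pid_label, name=name_label).set(a[4])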

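if __name__ == '__main__':
    # Start the HTTP server on the port set in config.yml (8000 in the sample config)
    start_http_server(data['config']['start_http_server'])
    # Reuse one pool of 4 worker threads instead of creating a new pool every round
    with ThreadPoolExecutor(max_workers=4) as executor:
        while True:
            # Sample every configured PID concurrently
            futures = [executor.submit(process_request, pid) for pid in pid_list]
            for future in futures:
                # exception() waits for the task to finish; report failures instead of dropping them silently
                error = future.exception()
                if error is not None:
                    print(f"Collection failed: {error}")
            time.sleep(4)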

Create a YAML file (config.yml, the path the script reads) that passes the process PIDs and the HTTP server port to the Python script.

config:

  start_http_server: 8000

pid_list: [21352, 123]
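Because the script indexes straight into the loaded configuration, a typo in config.yml only surfaces as a KeyError or TypeError at startup. The sketch below shows one way to check the parsed dictionary first. `validate_config` is a hypothetical helper, not part of the original script, and the rules it enforces (port range, positive integer PIDs) are assumptions about what a sensible config looks like.

```python
from __future__ import annotations


def validate_config(cfg: object) -> tuple[int, list[int]]:
    """Return (port, pid_list) from a parsed config, or raise ValueError."""
    if not isinstance(cfg, dict):
        raise ValueError("config.yml must contain a mapping at the top level")

    section = cfg.get("config")
    if not isinstance(section, dict):
        raise ValueError("missing 'config' section")

    port = section.get("start_http_server")
    # bool is a subclass of int, so reject it explicitly
    if not isinstance(port, int) or isinstance(port, bool) or not 1 <= port <= 65535:
        raise ValueError(f"invalid port: {port!r}")

    pids = cfg.get("pid_list")
    if not isinstance(pids, list) or not pids:
        raise ValueError("'pid_list' must be a non-empty list")
    for pid in pids:
        if not isinstance(pid, int) or isinstance(pid, bool) or pid <= 0:
            raise ValueError(f"invalid PID: {pid!r}")

    return port, pids


if __name__ == "__main__":
    sample = {"config": {"start_http_server": 8000}, "pid_list": [21352, 123]}
    print(validate_config(sample))
```

In the Exporter this could be called on the result of `read_yaml` before any metric is registered, so a bad file fails with a readable message.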

Step 3: Run the Exporter

Run the Python script above. It starts an HTTP server on port 8000 (the port set in config.yml) and waits for Prometheus to scrape it.

Gitee repository: https://gitee.com/qdyg/subprocess_exporter
GitHub repository: https://github.com/YueGang0725/subprocess_exporter
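Once the Exporter is running, a quick way to check what it exposes is to fetch the metrics page and parse it. The sketch below uses only the standard library. `parse_metrics` and `fetch_metrics` are hypothetical helpers, the URL assumes the sample port 8000, and the parser handles only simple sample lines (label values are kept in their escaped form), so treat it as a sanity check rather than a full parser.

```python
import re
from urllib.request import urlopen

# metric_name{label="value",...} value [timestamp]
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples, skipping comment lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or a # HELP / # TYPE comment
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def fetch_metrics(url="http://localhost:8000/metrics"):
    """Download the exposition text from a running Exporter (assumed URL)."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


# Usage, with the Exporter running:
#   for name, labels, value in parse_metrics(fetch_metrics()):
#       if name == "cpu_utilization":
#           print(labels["pid"], labels["name"], value)
```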

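3. Prometheus Configuration

Next, configure Prometheus to scrape data from our Exporter.

Step 1: Edit the Prometheus configuration file

Find the Prometheus configuration file (usually prometheus.yml) and add a job that scrapes our Exporter: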

scrape_configs:

  - job_name: 'process_exporter'

    scrape_interval: 15s

    static_configs:

      - targets: ['localhost:8000']

Step 2: Restart the Prometheus service

Save the configuration file and restart the Prometheus service so that it loads the new configuration.
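After the restart it is worth confirming that Prometheus actually picked up the new job. The sketch below reads the targets endpoint of the Prometheus HTTP API and reports the health of each target in one job. `job_health` and `fetch_targets` are hypothetical helper names, and the base URL assumes Prometheus is listening on localhost:9090.

```python
import json
from urllib.request import urlopen


def job_health(targets_response, job_name):
    """Map scrape URL -> health ('up', 'down', 'unknown') for one job's active targets."""
    active = targets_response.get("data", {}).get("activeTargets", [])
    return {
        target.get("scrapeUrl", ""): target.get("health", "unknown")
        for target in active
        if target.get("labels", {}).get("job") == job_name
    }


def fetch_targets(base_url="http://localhost:9090"):
    """Fetch /api/v1/targets from a running Prometheus server (assumed URL)."""
    with urlopen(f"{base_url}/api/v1/targets", timeout=5) as response:
        return json.load(response)


# Usage, with Prometheus running:
#   print(job_health(fetch_targets(), "process_exporter"))
```

An empty result suggests the job was not loaded (check prometheus.yml for indentation errors), while a "down" entry usually means the Exporter is not reachable on the configured address.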

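4. Grafana Configuration

Finally, configure a data source and a dashboard in Grafana to display the process metrics collected by Prometheus.

Step 1: Add the Prometheus data source

In Grafana, add a new data source, select Prometheus, and enter the URL of the Prometheus server (for example, http://localhost:9090).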
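If a panel later shows no data, it helps to run the same kind of query directly against Prometheus to see whether the problem is in Grafana or upstream. The sketch below builds an instant-query URL for the Prometheus HTTP API. `instant_query_url` is a hypothetical helper, and the base URL and the PID come from the sample values used in this article.

```python
from urllib.parse import urlencode


def instant_query_url(base_url, metric, pid):
    """Build a /api/v1/query URL selecting one metric for one PID label."""
    # PromQL selector, for example: cpu_utilization{pid="21352"}
    promql = f'{metric}{{pid="{pid}"}}'
    params = urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{params}"


if __name__ == "__main__":
    print(instant_query_url("http://localhost:9090", "cpu_utilization", 21352))
```

Opening the printed URL in a browser (or with curl) returns JSON. A non-empty `result` list means Prometheus has the series, and the same PromQL selector can be pasted into a Grafana panel.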

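Step 2: Import the dashboard

Import the dashboard template 进程性能详情.json ("Process Performance Details"), which is available in the repository:

Gitee repository: https://gitee.com/qdyg/subprocess_exporter
GitHub repository: https://github.com/YueGang0725/subprocess_exporter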

Step 3: Results