Preface

Prometheus is monitoring software, similar to Nagios.

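Installation

1. Download the prometheus-2.2.0.linux-amd64 archive from the official site, extract it, and run ./prometheus. The important part is the configuration file.

a. To hot-reload the configuration file remotely, start Prometheus with the --web.enable-lifecycle flag. The reload is then triggered with curl -X POST http://localhost:9090/-/reload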
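The same reload call can be made from a script. Below is a minimal sketch using only the Python standard library; the function names build_reload_request and reload_prometheus are my own, and it assumes Prometheus was started with --web.enable-lifecycle and is listening on localhost:9090 as above.

```python
import urllib.request


def build_reload_request(base_url: str = "http://localhost:9090") -> urllib.request.Request:
    """Build the POST request for the /-/reload endpoint (no network I/O here)."""
    url = base_url.rstrip("/") + "/-/reload"
    # An empty body plus method="POST" mirrors `curl -X POST`.
    return urllib.request.Request(url, data=b"", method="POST")


def reload_prometheus(base_url: str = "http://localhost:9090", timeout: float = 5.0) -> int:
    """Send the reload request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError on a non-2xx answer, for example
    when the lifecycle API is not enabled.
    """
    with urllib.request.urlopen(build_reload_request(base_url), timeout=timeout) as resp:
        return resp.status
```

Calling reload_prometheus() after editing prometheus.yml should have the same effect as the curl command.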

b. The key thing to understand is the prometheus.yml configuration file, which Prometheus loads at startup.

[root@vm-local1 prometheus-2.2.0.linux-amd64]# cat prometheus.yml
# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets: ["localhost:9093"]  # Alertmanager is a separate download; it listens on port 9093 by default

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
  - rules/mengyuan.rules     # sending alerts requires rules, so the rule file is declared here

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:    # scrape configuration: which hosts to scrape
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'  # job name
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['localhost:9090','localhost:9100']
        labels:
          group: 'zus'    # targets are the endpoints (clients) to scrape; I have two and put both in one group named zus
  - job_name: dj   # a second job
    static_configs:
      - targets: ['localhost:8000']  # my custom client written with Django

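Note:

localhost:9090 is the metrics endpoint that Prometheus itself exposes by default; port 9100 is node_exporter, a monitoring client provided by the Prometheus project.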

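2. Install the Prometheus client

Download the node_exporter-0.16.0-rc.1.linux-amd64 client from the official site, extract it, and run ./node_exporter. It listens on port 9100 by default.

3. Writing a custom client is simple: the response only has to use the data format shown below. I use Django here; any framework works as long as the format is correct.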

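from django.http import HttpResponse

def metrics(req):
    # One "name value" sample per line; the last line must also end with a newline.
    ss = "feiji 32" + "\n" + "caidian 31" + "\n"
    return HttpResponse(ss, content_type="text/plain")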
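To avoid building that string by hand, the body can be generated from a dict. This is only a sketch of the plain "name value" form used above; render_metrics is a name I made up, and it does not handle labels, HELP/TYPE lines, or escaping.

```python
def render_metrics(samples: dict[str, float]) -> str:
    """Render name/value pairs as one `name value` line each, newline-terminated."""
    lines = [f"{name} {value}" for name, value in samples.items()]
    return "\n".join(lines) + "\n"


body = render_metrics({"feiji": 32, "caidian": 31})
print(body, end="")
```

The returned string can be passed straight to HttpResponse in the view.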

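4. Write the rules/mengyuan.rules file. Rules are the prerequisite for sending alerts.

[root@vm-local1 rules]# cat mengyuan.rules
groups:
- name: zus
  rules:
  # Alert for any instance that is unreachable for >5 minutes.
  - alert: InstanceDown   # alert name, anything works
    expr: up == 0   # expression: up is 0 when the target cannot be scraped (for example the host is off); when true the alert triggers, and $value gives the value
    for: 5s         # if it is still 0 after 5s, the alert is sent to Alertmanager (the comment above and the description below say 5 minutes, but this setting is 5s)
    labels:
      severity: page  # a label attached to this kind of alert
    annotations:
      summary: "Instance {{ $labels.instance }} down dangqian  {{ $value }}"
      description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes."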
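How the for setting interacts with evaluation_interval is easier to see with a small simulation. The sketch below is a simplified model of my own, not Prometheus code: it assumes the rule is evaluated at the given times, becomes pending at the first evaluation where up == 0, and fires at the first later evaluation where the condition has held for at least the for duration.

```python
from typing import Optional


def first_firing_time(samples: list[tuple[float, float]], for_seconds: float) -> Optional[float]:
    """Return the evaluation time at which the alert would start firing, or None.

    samples: (evaluation_time_in_seconds, value_of_up) pairs in time order.
    """
    pending_since = None
    for t, up in samples:
        if up == 0:
            if pending_since is None:
                pending_since = t  # condition just became true: pending
            if t - pending_since >= for_seconds:
                return t  # held long enough: firing
        else:
            pending_since = None  # condition cleared: reset
    return None


# Evaluations every 15s, target goes down at t=15, for: 5s
print(first_firing_time([(0, 1), (15, 0), (30, 0), (45, 0)], 5))
```

Under this model, with evaluation_interval at 15s a for of 5s is only satisfied at the evaluation after the one that first saw the failure.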

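5. Now install Alertmanager

a. Download alertmanager-0.15.0-rc.1.linux-amd64 from the official site.

Again the important part is the configuration file. Create and edit it:

[root@vm-local1 alertmanager-0.15.0-rc.1.linux-amd64]# cat alertmanager.yml
route:
  receiver: mengyuan2  # default receiver, required; must match a `- name` under receivers
  group_wait: 1s  # wait 1s before sending the first notification for a new group
  group_interval: 1s # 1s interval between notifications for the same group
  repeat_interval: 1m  # wait 1m before resending a notification
  group_by: ["zus"]
  routes:      # routing: alerts whose labels match severity: page go to receiver mengyuan; if routes is omitted, the default mengyuan2 is used
  - receiver: mengyuan
    match:
      severity: page

receivers:
- name: 'mengyuan'
  webhook_configs:  # I use webhook_configs here; the alert information is sent to the url below (127.0.0.1:8000)
  - url: http://127.0.0.1:8000
    send_resolved: true
- name: 'mengyuan2'
  webhook_configs:
  - url: http://127.0.0.1:8000/2
    send_resolved: true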
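The routing decision in this file can be pictured with a few lines of Python. This is a simplified sketch of my own, covering only what the config above uses: one level of child routes with exact-match labels and a fallback to the default receiver. Nested routes, continue, and regex matchers are left out.

```python
def pick_receiver(labels: dict[str, str], route: dict) -> str:
    """Return the first child route whose `match` labels all agree, else the default receiver."""
    for child in route.get("routes", []):
        match = child.get("match", {})
        if all(labels.get(k) == v for k, v in match.items()):
            return child["receiver"]
    return route["receiver"]


# The route section from alertmanager.yml above, written as a dict.
route = {
    "receiver": "mengyuan2",
    "routes": [{"receiver": "mengyuan", "match": {"severity": "page"}}],
}

print(pick_receiver({"alertname": "InstanceDown", "severity": "page"}, route))
print(pick_receiver({"alertname": "Other", "severity": "warning"}, route))
```

The InstanceDown alert carries severity: page, so it goes to mengyuan; anything else falls through to mengyuan2.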

6. Receiving the alert messages in Django

Django's request.body receives JSON data that looks roughly like this:

{"receiver":"mengyuan","status":"resolved","alerts":[{"status":"resolved","labels":{"alertname":"InstanceDown","group":"zus","instance":"localhost:9100","job":"prometheus","severity":"page"},"annotations":{"description":"localhost:9100 of job prometheus has been down for more than 5 minutes.","summary":"Instance localhost:9100 down dangqian 0"},"startsAt":"2018-04-06T22:34:13.51281763+08:00","endsAt":"2018-04-06T23:07:43.514552824+08:00","generatorURL":"http://vm-local1:9090/graph?g0.expr=up+%3D%3D+0\u0026g0.tab=1"}],"groupLabels":{},"commonLabels":{"alertname":"InstanceDown","group":"zus","instance":"localhost:9100","job":"prometheus","severity":"page"},"commonAnnotations":{"description":"localhost:9100 of job prometheus has been down for more than 5 minutes.","summary":"Instance localhost:9100 down dangqian 0"},"externalURL":"http://vm-local1:9093","version":"4","groupKey":"{}/{severity=\"page\"}:{}"}

At this point I can use the received data to call a mail API or any other third-party alerting API.
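Before handing the data to a mail or chat API, it usually needs to be boiled down to a short message. A minimal sketch, assuming the payload has the shape shown above; summarize_alerts is a hypothetical helper, and the sample is a trimmed copy of that payload.

```python
import json


def summarize_alerts(body: bytes) -> list[str]:
    """Turn an Alertmanager webhook body into one short line per alert."""
    payload = json.loads(body)
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        summary = alert.get("annotations", {}).get("summary", "")
        lines.append(
            f"[{alert.get('status')}] {labels.get('alertname')} "
            f"{labels.get('instance')}: {summary}"
        )
    return lines


# Trimmed copy of the payload shown above, used only as sample input.
sample = json.dumps({
    "receiver": "mengyuan",
    "status": "resolved",
    "alerts": [{
        "status": "resolved",
        "labels": {"alertname": "InstanceDown", "instance": "localhost:9100"},
        "annotations": {"summary": "Instance localhost:9100 down dangqian 0"},
    }],
}).encode()

for line in summarize_alerts(sample):
    print(line)
```

In the Django view, summarize_alerts(request.body) would give the lines to send on.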

Summary:

I am just getting started myself; these are my notes.
