[Repost] Introduction to Linux monitoring and alerting
https://www.redhat.com/sysadmin/linux-monitoring-and-alerting The mail command is quite interesting.
Posted October 3, 2019 | by Ken Hess (Red Hat)

There are system administrators who love to do things themselves, and then there are those of us who must do things ourselves because budgets just don't allow for mega-purchases. Enterprise monitoring and alerting suites are for companies that have either large budgets or mission-critical applications, systems, or services that absolutely must be up 100% of the time. There are some open-source monitoring and alerting suites, but they require a dedicated system and a considerable amount of time to set up. Most also require agents to be installed on monitored endpoints, which takes approval and time to deploy. The quicker and easier solution is to create your own monitoring and alerting scripts and then schedule them via cron. The best part of localized (per-server) monitoring and alerting is that you can customize thresholds for each system and service, rather than having to live with a global configuration that might not meet your needs.
This article takes you through the process of creating a script that checks every five minutes for the Apache web server process, attempts to restart it if it's down, and then alerts you via email if it's down for more than 30 seconds and cannot be restarted.
Most processes have a process ID (PID) file under the /run directory when they are running, and many of those have their own separate directories that contain their corresponding PID files. In this example, the Apache web server (httpd) has a PID file: /run/httpd/httpd.pid.
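A PID file can outlive its process if the service crashes without cleaning up, so the existence of the file is only a proxy for "running". The following is a minimal sketch, not part of the original article, of a stricter check that also confirms the recorded process is alive. The function name pid_file_alive is hypothetical, and the sketch assumes it runs as root (as the monitoring script does), because kill -0 fails for processes owned by other users when run unprivileged.

```shell
#!/bin/bash
# Hypothetical helper (not from the original article): succeeds only if the
# PID file exists AND the process recorded in it is still running.
pid_file_alive() {
    local pid_file=$1
    local pid=""

    # No PID file at all: treat the service as down.
    [ -f "$pid_file" ] || return 1

    # Read the first line of the file. The return status of read is ignored
    # on purpose, because it is non-zero when the file lacks a trailing newline.
    read -r pid < "$pid_file"

    # Reject empty or non-numeric content.
    [[ $pid =~ ^[0-9]+$ ]] || return 1

    # Signal 0 delivers nothing; it only reports whether the process exists.
    kill -0 "$pid" 2>/dev/null
}
```

It could stand in for the plain file test, for example: if ! pid_file_alive /run/httpd/httpd.pid; then systemctl start httpd.service; fi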
I named this script apache.sh and placed it into root's home directory. Be sure to change permissions on the file to 750 (rwxr-x---) so that users outside the file's owner and group cannot execute or even read it, regardless of location:
$ sudo chmod 750 apache.sh
Note: If you don't have Apache installed, it doesn't matter, because you can replace the httpd.pid file pointed to in the script with any other PID file that works for your system.
There are many different ways to create such a script, but this is how I did it, and it works. I identified the PID file with the variable, FILE. I decided that rather than have an alert sent if the Apache web server was down, I would have the script attempt a service restart, and then check again. I repeated this process two more times, waiting for 10 seconds between checks. If the Apache service is still down and cannot be restarted after 30 seconds, then the script sends the system administrator team an email:
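#!/bin/bash
# PID file that httpd creates while it is running
FILE=/run/httpd/httpd.pid

# First check: try to start Apache if the PID file is missing
if ! [ -f "$FILE" ]; then
    systemctl start httpd.service
fi
sleep 10s

# Second check, 10 seconds later
if ! [ -f "$FILE" ]; then
    systemctl start httpd.service
fi
sleep 10s

# Third check, 20 seconds after the first
if ! [ -f "$FILE" ]; then
    systemctl start httpd.service
fi
sleep 10s

# Still no PID file after 30 seconds: alert the sysadmin team by email
if ! [ -f "$FILE" ]; then
    mail -s 'Apache is down' sysadmins@mydomain.com <<< 'Apache is down on SERVER1 and cannot be restarted'
fi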
You could just as easily send an SMS message to a team on-call mobile phone. This script checks for the non-existence of the httpd.pid file and then takes action if it's not found. If the file exists, then no action is taken. No one wants to receive emails or notices that a service is up every five minutes.
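The three repeated checks can also be written as a loop. The sketch below is one possible refactoring, not the author's script: the function name check_and_restart and its parameters are hypothetical. Unlike the script above, it returns as soon as the PID file is present instead of always sleeping for the full 30 seconds.

```shell
#!/bin/bash
# Hypothetical refactoring of the retry logic (names are illustrative).
# Usage: check_and_restart PID_FILE ATTEMPTS DELAY START_COMMAND [ARGS...]
# Returns 0 if the PID file is present (possibly after a restart), 1 otherwise.
check_and_restart() {
    local pid_file=$1 attempts=$2 delay=$3
    shift 3
    local i

    for ((i = 0; i < attempts; i++)); do
        # Service looks up: nothing to do.
        [ -f "$pid_file" ] && return 0
        # Service looks down: run the start command, then wait before rechecking.
        "$@"
        sleep "$delay"
    done

    # Final check after the last restart attempt.
    [ -f "$pid_file" ]
}

# Example call, commented out so that loading this file has no side effects.
# Using $(hostname) instead of a hard-coded server name is an assumption.
# if ! check_and_restart /run/httpd/httpd.pid 3 10 systemctl start httpd.service; then
#     mail -s 'Apache is down' sysadmins@mydomain.com <<< "Apache is down on $(hostname) and cannot be restarted"
# fi
```

Passing the start command as arguments keeps the function reusable for any other service that writes a PID file, which matches the earlier note about swapping in a different PID file.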
Once you've tested your script and are satisfied that it operates as desired, place this script into the root user's crontab:
$ sudo crontab -e
The entry I've made below runs the script every five minutes:
*/5 * * * * /root/apache.sh
This script is an example of a quick method for setting up a process monitor and alert on a local system. Yes, it's primitive and simple, but it works and it's free. It also doesn't require any budget discussions, nor does it require a maintenance window for agent installation. You'll also find that this script doesn't significantly impact performance on your system. These are all good things. And if you're an Ansible administrator, you could deliver this script to your entire fleet of systems without having to touch each one individually.