Although a lot of new tools have arrived since 2011, it's clear that older open source tools like Nagios, and Nagios alternatives like Zabbix and Icinga, still dominate the market, with 70% of the companies we spoke to still using these tools for their core monitoring & alerting.

Around 70% of the companies used more than one monitoring tool, with most using an average of two. Nagios/Graphite configurations were most common, with many also using New Relic. However, only two of the companies we spoke to actually paid for New Relic, with most of the companies using the free version as they found the paid version too expensive.

In the "other" category, there were a lot of different tools with no particular one standing out. Types of tools that fell into this category were SaaS monitoring tools such as Librato & Datadog, used by several smaller start-ups, or many older open source tools like Cacti or Munin. Some AWS users rely on CloudWatch, and there were even a few custom built solutions.

Graph 1: Percentage of companies with monitoring tools deployed.

If we look at tool usage versus the number of servers the companies manage (< 20 being new startup services, and all the way to > 1000 servers for the large online services), you can see that the proportion of older open source tools like Nagios, or paid on-premise tools goes up as the service gets larger, whereas the smaller, newer services are more likely to use developer focused tools like Graphite, LogStash and New Relic.

This makes sense, as many of the larger services are older (> 5 years old) so have legacy monitoring infrastructure, and also have the resources to hire a dedicated operations team who tend to bring in the tools their most familiar with, namely Nagios or Nagios alternatives. They also have more money to pay for monitoring tools like Splunk (which everyone would love to have if they could afford it) or AppDynamics.

The newer smaller services tend not to have any DevOps/Operations people in their company, so developers tend to use simpler-to-install SaaS monitoring tools, or tools that help them such as Graphite or LogStash. There seems to be a tipping point between 50-100 servers when the company has the resources to bring in a DevOps/Operations person or team and they start bringing in the infrastructure monitoring tools like Nagios to provide the coverage they need.

Graph 2: Tool Usage vs. Number of Servers Managed

Key Trends:

1.  Many people found the newer services lacked the flexibility of open source solutions with their ability to customize them to their requirements, and didn't like the idea of learning a proprietary system with its own plugin design and features. So they built their own "kit car".

2. While the services became larger, the trend was the move towards microservices, with different cross-functional development teams building, deploying and supporting their own parts of the service.

3. There are some simpler things that can be done to reduce spammy alerts with potential of predictive & more intelligent alerting using machine learning.

[excerpt from Outlyer]

Monitoring tools that everyone's currently using的更多相关文章

  1. PostgreSQL Performance Monitoring Tools

    PostgreSQL Performance Monitoring Tools https://github.com/CloudServer/postgresql-perf-tools This pa ...

  2. 4. Traffic monitoring tools (流量监控工具 10个)

    4. Traffic monitoring tools (流量监控工具 10个)EttercapNtop SolarWinds已经创建并销售了针对系统管理员的数十种专用工具. 安全相关工具包括许多网络 ...

  3. Top 10 Free Wireless Network hacking/monitoring tools for ethical hackers and businesses

    There are lots of free tools available online to get easy access to the WiFi networks intended to he ...

  4. Top 12 Best Free Network Monitoring Tools (12种免费网络监控工具)

    1) Fiddler Fiddler(几乎)是适用于任何平台和任何操作系统的最好的免费网络工具,并提供了一些广受欢迎的关键特性.如:性能测试.捕捉记录HTTP/HTTPs请求响应.进行web调试等很多 ...

  5. Java Monitoring&Troubleshooting Tools

    JDK Tools and Utilities Monitoring Tools You can use the following tools to monitor JVM performance ...

  6. troubleshooting tools in JDK 7--转载

    This chapter describes in detail the troubleshooting tools that are available in JDK 7. In addition, ...

  7. Java Performance Optimization Tools and Techniques for Turbocharged Apps--reference

    Java Performance Optimization by: Pierre-Hugues Charbonneau reference:http://refcardz.dzone.com/refc ...

  8. Flink监控:Monitoring Apache Flink Applications

    This post originally appeared on the Apache Flink blog. It was reproduced here under the Apache Lice ...

  9. MySQL Performance Tuning: Tips, Scripts and Tools

    With MySQL, common configuration mistakes can cause serious performance problems. In fact, if you mi ...

随机推荐

  1. 【codevs2011】最小距离之和 [LNOI2013](Floyd)

    题目网址:http://codevs.cn/problem/2011/ 题目大意:有一个图,每次删一条边(可以重复删),求每次删边之后所有点对的最短距离之和. 看了一眼题目,顿时发现了O(n^4)的暴 ...

  2. 排序算法(java版)

    一直想理解一下基本的排序算法,最近正好在搞java所以就一并了(为了便于理解,这儿都是以从小到大排序为目的) 冒泡排序 也就是比较连续的两个值,如果前面一个值大于后面一个值,则交换. 时间复杂度为O( ...

  3. [转]Android:改变Activity切换方式

    overridePendingTransition(enterAnim, exitAnim); Intent intent =new Intent(this,item2.class); startAc ...

  4. Windows下软件调试

    1. 视频: (1).VS下的C++调试方法.wnv (2).WinDbg高级调试技术.wmv (3).内存与句柄泄漏处理技巧.wmv 2. “WinDbg高级调试技巧” 中 [01:22]讲到“软件 ...

  5. 安全的 ActiveMQ

    本章知识点 ActiveMQ 鉴权 ActiveMQ 授权 怎么创建一个自定义安全插件 使用基于证书的安全保证 简介 安全地访问消息代理以及它的 destinations 是公众关注的焦点.因此,Ac ...

  6. hive从查询中获取数据插入到表或动态分区

    Hive的insert语句能够从查询语句中获取数据,并同时将数据Load到目标表中.现在假定有一个已有数据的表staged_employees(雇员信息全量表),所属国家cnty和所属州st是该表的两 ...

  7. 关于js序列化时间的方法

    var time = new Date(); var otime = getMyDate(time); //将毫秒转换成 年月日+时分秒 格式的 (1970-01-11 00:00:00) funct ...

  8. ps6-工具的基础使用

    1.图像的移动与对齐 ctrl+j:复制图层,然后再移动不损坏原来的图像. Ctrl+Z =返回键 Shift+单击最下方图层 选择全部 Alt+鼠标移动 复制并粘贴 2.规则选择工具组 shift键 ...

  9. 通过 objc_setAssociatedObject alert 和 button关联 及传值

    原文地址 http://blog.csdn.net/lengshengren/article/details/16886915 //唯一静态变量key static const char associ ...

  10. BZOJ - 4196 软件包管理器 (树链剖分+dfs序+线段树)

    题目链接 设白色结点为未安装的软件,黑色结点为已安装的软件,则: 安装软件i:输出结点i到根的路径上的白色结点的数量,并把结点i到根的路径染成黑色.复杂度$O(nlog^2n)$ 卸载软件i:输出结点 ...