A user-facing bug causes search results to be unavailable for your service. Someone suggests adding a prober to monitor the service and, if search results are unavailable, notify the team via bug report. Google has plenty of prober options available; if we pick one and use it we’re done, right?

Not necessarily. Just because monitoring could detect a bug does not mean it is the best, or only, solution. For any given bug, you should consider which mixture of monitoring and testing is appropriate. Monitoring and testing each have pros and cons, and solve slightly different problems.
Monitoring observes—and sometimes interacts with—user-facing production systems. Monitoring is useful for detecting:
  • Load issues: Real users accessing real services induces load on real servers. The only way to measure the effect of production traffic is to directly measure the servers themselves. Common measurements include QPS, RPC response times, memory usage, and disk usage.
  • Service unavailability: A service might become unavailable because (1) the service itself is down, or (2) the services on which it depends are unavailable. Monitoring end-user experiences (e.g., “Does web search return results?”) is a great way to detect and alert about this situation.
  • Unanticipated user behavior: Even the most well-designed test scenarios can fail to anticipate real user behavior. Monitoring can inform your quality strategy by observing and measuring real-world behavior.
  • Version incompatibility: Different binary versions may interact incompatibly in ways that are hard to detect without production data. Monitoring can detect unanticipated data inconsistencies.
  • Data changes: User-facing data can change over time, sometimes in bad ways. Monitoring can statistically characterize data, diff them against previous data state, and alert on outside-of-threshold changes.
Testing isolates components in a non-production environment and verifies components’ behavior. Since it occurs prior to release, it reduces the cost of fixing a bug. Testing is useful for ensuring:
  • Functional correctness: A hermetic unit test remains the best way to prove that a small piece of code logic fulfills its interface contract.
  • Inter-component compatibility: An integration test is an excellent way to ensure that two components (e.g., a client and a Stubby service) work together properly.
When considering which techniques to employ, review the list above to determine which ones are appropriate. Worried about an individual vendor’s ad inventory suddenly dropping? Monitor ad volume for each vendor! Unsure if the price2value() function handles currency conversions? Write a unit test! Not sure how often users actually log into your system? Monitor login events! A judicious mix of monitoring and testing will speed up development and ensure that fewer bugs reach end users.

Episode 388: Testing vs Monitoring的更多相关文章

  1. docker 镜像管理

    docker:/root# docker search centos NAME DESCRIPTION STARS OFFICIAL AUTOMATED centos The official bui ...

  2. docker 创建镜像

    docker:/root# docker search centos NAME DESCRIPTION STARS OFFICIAL AUTOMATED centos The official bui ...

  3. Docker镜像与仓库(一)

    Docker镜像与仓库(一) Docker镜像与仓库(一) 如何查找镜像? Docker Hub https://registry.hub.docker.com docker search [OPTI ...

  4. 前端性能核对表Checklist-2018

    前端性能核对表Checklist-2018 1. 计划与度量 Get Ready: Planning and Metrics ☐ Establish a performance culture. ☐ ...

  5. 热修复 DexPosed AOP Xposed MD

    Markdown版本笔记 我的GitHub首页 我的博客 我的微信 我的邮箱 MyAndroidBlogs baiqiantao baiqiantao bqt20094 baiqiantao@sina ...

  6. linux tcp调优

    Linux TCP Performance Tuning News Linux Performance Tuning Recommended Books Recommended Links Linux ...

  7. 20 个 OpenSSH 最佳安全实践

    来源:https://linux.cn/article-9394-1.html OpenSSH 是 SSH 协议的一个实现.一般通过 scp 或 sftp 用于远程登录.备份.远程文件传输等功能.SS ...

  8. dexposed框架Android在线热修复

    移动client应用相对于Webapp的最大一个问题每次出现bug,不能像web一样在server就完毕修复,不须要发版本号.紧急或者有安全漏洞的问题, 假设是Webapp你可能最多花个1,2个小时紧 ...

  9. Codeforces Testing Round #10 A. Forgotten Episode

    水题,注意数据范围 #include <iostream> using namespace std; int main(){ long long n,a; cin >> n; ...

随机推荐

  1. 【iScroll源码学习01】准备阶段

    前言 我们昨天初步了解了为什么会出现iScroll:[SPA]移动站点APP化研究之上中下页面的iScroll化(上),然后简单的写了一个demo来模拟iScroll,其中了解到了以下知识点: ① v ...

  2. arcgis破解的时候,不能启动license manager的问题

    1.防火墙没关:(非常重要) 2.win+R,调出控制台,输入services.msc.然后手动开启ArcGIS license manager服务,关闭其余类似erdas,matlab影响该服务的开 ...

  3. flume 集群安装

    ./pssh -h ./host/all.txt -P mkdir /usr/local/app ./pssh -h ./host/all.txt -P tar zxf /usr/local/soft ...

  4. linux下发布的执行文件崩溃的问题定位 心得一则

    C++ Release版本发布到客户处执行时,如果程序崩溃,有什么办法能够快速的确认程序的问题呢? 如果能gdb调试的话,比较简单了,可以使用gdb命令,类似如下: gdb ##set args ** ...

  5. ZeroMq安装包的生成【ubuntu10】

    生成方法添加源sudo add-apt-repository ppa:chris-lea/zeromqsudo add-apt-repository ppa:chris-lea/libpgmsudo ...

  6. AFNetwork2.0在报错1016,3840的解决方法及一些感悟

    最近在学习AFNetwork,非常好的网络框架,能节省很多时间.不过请求网络数据时报错1016,3840. 这两个错误网上解决方法很多,http://blog.csdn.net/huifeidexin ...

  7. linker command failed with exit code 1 (use -v to see invocation)解决办法

    [cpp] view plaincopy Undefined symbols for architecture i386:     "_OBJC_CLASS_$_FMDatabase&quo ...

  8. 【代码笔记】iOS-点击任何处,显示出红色的UIView

    一,效果图. 二,工程图. 三,代码. RootViewController.h #import <UIKit/UIKit.h> //头文件 #import "MoreView. ...

  9. Swift (if while)

    Swift 分支 if if后的括号可以省略 if后只能接bool值 if后的大括号不能省略 let num1 = 3.0 let num2 = 4.0 let bool : Bool = true ...

  10. Linux下怎么查看当前系统的版本

    Linux下怎么查看当前系统的版本:   uname -r 功能说明:uname用来获取电脑和操作系统的相关信息. 语 法:uname [-amnrsvpio][--help][--version] ...