The people using PostgreSQL and the Streaming Replication feature seem to ask many of the same questions:

1. How best to monitor Streaming Replication?

2. What is the best way to do that?

3. Are there alternatives, when monitoring on Standby, to using the pg_stat_replication view on Master?

4. How should I calculate replication lag-time, in seconds, minutes, etc.?

In light of these commonly asked questions, I thought a blog would help. The following are some methods I’ve found to be useful.

Monitoring is critical for large infrastructure deployments where you have Streaming Replication for:

1. Disaster recovery

2. Streaming Replication is for High Availability

3. Load balancing, when using Streaming Replication with Hot Standby

PostgreSQL has some building blocks for replication monitoring, and the following are some important functions and views which can be use for monitoring the replication:

1. pg_stat_replication view on master/primary server.

This view helps in monitoring the standby on Master. It gives you the following details:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
pid:              Process id of walsender process
usesysid:         OID of user which is used for Streaming replication.
usename:          Name of user which is used for Streaming replication
application_name: Application name connected to master
client_addr:      Address of standby/streaming replication
client_hostname:  Hostname of standby.
client_port:      TCP port number on which standby communicating with WAL sender
backend_start:    Start time when SR connected to Master.
state:            Current WAL sender state i.e streaming
sent_location:    Last transaction location sent to standby.
write_location:   Last transaction written on disk at standby
flush_location:   Last transaction flush on disk at standby.
replay_location:  Last transaction flush on disk at standby.
sync_priority:    Priority of standby server being chosen as synchronous standby
sync_state:       Sync State of standby (is it async or synchronous).

e.g.:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
postgres=# select * from pg_stat_replication ;
-[ RECORD 1 ]----+---------------------------------
pid              | 1114
usesysid         | 16384
usename          | repuser
application_name | walreceiver
client_addr      | 172.17.0.3
client_hostname  |
client_port      | 52444
backend_start    | 15-MAY-14 19:54:05.535695 -04:00
state            | streaming
sent_location    | 0/290044C0
write_location   | 0/290044C0
flush_location   | 0/290044C0
replay_location  | 0/290044C0
sync_priority    | 0
sync_state       | async

2. pg_is_in_recovery() : Function which tells whether standby is still in recovery mode or not.

e.g.

1
2
3
4
5
postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 t
(1 row)

3. pg_last_xlog_receive_location: Function which tells location of last transaction log which was streamed by Standby and also written on standby disk.

e.g.

1
2
3
4
5
postgres=# select pg_last_xlog_receive_location();
 pg_last_xlog_receive_location
-------------------------------
 0/29004560
(1 row)

4. pg_last_xlog_replay_location: Function which tells last transaction replayed during recovery process. e.g is given below:

1
2
3
4
5
postgres=# select pg_last_xlog_replay_location();
 pg_last_xlog_replay_location
------------------------------
 0/29004560
(1 row)

5. pg_last_xact_replay_timestamp: This function tells about the time stamp of last transaction which was replayed during recovery. Below is an example:

1
2
3
4
5
postgres=# select pg_last_xact_replay_timestamp();
  pg_last_xact_replay_timestamp
----------------------------------
 15-MAY-14 20:54:27.635591 -04:00
(1 row)

Above are some important functions/views, which are already available in PostgreSQL for monitoring the streaming replication.

So, the logical next question is, “What’s the right way to monitor the Hot Standby with Streaming Replication on Standby Server?”

If you have Hot Standby with Streaming Replication, the following are the points you should monitor:

1. Check if your Hot Standby is in recovery mode or not:

For this you can use pg_is_in_recovery() function.

2.Check whether Streaming Replication is working or not.

And easy way of doing this is checking the pg_stat_replication view on Master/Primary. This view gives information only on master if Streaming Replication is working.

3. Check If Streaming Replication is not working and Hot standby is recovering from archived WAL file.

For this, either the DBA can use the PostgreSQL Log file to monitor it or utilize the following functions provided in PostgreSQL 9.3:

1
2
pg_last_xlog_replay_location();
pg_last_xact_replay_timestamp();

4. Check how far off is the Standby from Master.

There are two ways to monitor lag for Standby.



   i. Lags in Bytes: For calculating lags in bytes, users can use thepg_stat_replication view on the master with the functionpg_xlog_location_diff function. Below is an example:

1
pg_xlog_location_diff(pg_stat_replication.sent_location, pg_stat_replication.replay_location)

which gives the lag in bytes.

  ii. Calculating lags in Seconds. The following is SQL, which most people uses to find the lag in seconds:

1
2
3
4
SELECT CASE WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location()
              THEN 0
            ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp())
       END AS log_delay;

Including the above into your repertoire can give you good monitoring for PostgreSQL.

I will in a future post include the script that can be used for monitoring the Hot Standby with PostgreSQL streaming replication.

Since 9.6 this is a lot easier as it introduced the function pg_blocking_pids() to find the sessions that are blocking another session.

So you can use something like this:

How to check blocking processes in postgresql

select pid,
usename,
pg_blocking_pids(pid) as blocked_by,
query as blocked_query
from pg_stat_activity
where cardinality(pg_blocking_pids(pid)) > 0;

14 - How to check replication status的更多相关文章

  1. java.lang.IllegalStateException: Failed to check the status of the service

    java.lang.IllegalStateException: Failed to check the status of the service com.pinyougou.sellergoods ...

  2. DTM initialization: failure during startup recovery, retry failed, check segment status (cdbtm.c:1603)

    安装greenplum集群出现以下错误: 20160315:13:49:16:025696 gpinitsystem:h95:jason-[INFO]:-Checking configuration ...

  3. check failed status == cudnn_status_success (4 vs. 0) cudnn_status_internal_error

    Check failed: error == cudaSuccess (30 vs. 0) unknown error  这个有可能是显存不足造成的,或者网络参数不对造成的 check failed ...

  4. 关于Failed to check the status of the service com.taotao.service.ItemService. No provider available fo

    原文:http://www.bubuko.com/infodetail-2250226.html 项目中用dubbo发生: Failed to check the status of the serv ...

  5. Check failed: status == CUBLAS_STATUS_SUCCESS (11 vs. 0) CUBLAS_STATUS_MAPPING_ERROR

    I0930 21:23:15.115576 30918 solver.cpp:281] Learning Rate Policy: multistepF0930 21:23:17.263314 310 ...

  6. CUDA报错: Cannot create Cublas handle. Cublas won't be available. 以及:Check failed: status == CUBLAS_STATUS_SUCCESS (1 vs. 0) CUBLAS_STATUS_NOT_INITIALIZED

    Error描述: aita@aita-Alienware-Area-51-R5:~/AITA2/daisida/ssd-github/caffe$ make runtest -j8 .build_re ...

  7. node js fcoin api 出现 api key check fail : {"status":1090,"msg":"Illegal API signature"}

    //主区://ft / btc 不支持市价 买入数量不能小于5个FT 买//ft / eth 支持市价 最小买入eth不能小于0.01 买//ft / usdt 支持市价 最小买入usdt不能小于10 ...

  8. dubbo Failed to check the status of the service com.user.service.UserService. No provider available for the service

    Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'u ...

  9. springboot 报错nested exception is java.lang.IllegalStateException: Failed to check the status of the service xxxService No provider available for the service

    spring: dubbo:#关闭所有服务的启动时检查:(没有提供者时报错) consumer: check: false timeout: 3000

随机推荐

  1. 整理一下C++语言中的头文件

    对于每一个像我一样的蒟蒻来说,C++最重要的东西就是头文件的使用了.由于初学,直到现在我打代码还是靠一些事先写好的的头文件,仍然不能做到使用自己需要的.最近看了几位大佬打代码,心中突然闪过要把自己冗长 ...

  2. centos禁止与开启ping设置

    禁止ping: echo 1 > /proc/sys/net/ipv4/icmp_echo_ignore_all 允许ping: echo 0 > /proc/sys/net/ipv4/i ...

  3. jQuery ajaxForm和 ajaxSubmit注意

    http://jquery.malsup.com/form/#file-upload 在使用jQuery异步上传时,需要注意在存在文件上传时,要将返回数据的contentType设置为text/htm ...

  4. jQuery第七章插件的编写和使用

    1.本章目标 编写jquery插件 2.插件 也称为扩展,是一种按照一定的规范的应用程序接口编写出来的程序 插件的目标是给已有的一系列函数做一个封装,以便在其他的地方复用,方便维护和开发效率 3.jq ...

  5. 百度地图api文档实现任意两点之间的最短路线规划

    两个点之间的路线是使用“Marker”点连接起来的,目前还没找到改变点颜色的方法,测试过使用setStyle没有效果. <html><head> <meta http-e ...

  6. Maven mvn install 本地jar添加到本地maven仓库中

    mvn install:install-file -DgroupId=alipay -DartifactId=taobao-sdk-java-auto -Dversion=1.0 -Dpackagin ...

  7. 利用MySQL游标进行计算排名

    SELECT a.id, a.nick_name, a.member_account, a.integral, () AS tRank #计算行号 FROM tzqc_raw_data AS a, ( ...

  8. CodeForces 528D Fuzzy Search 多项式 FFT

    原文链接http://www.cnblogs.com/zhouzhendong/p/8782849.html 题目传送门 - CodeForces 528D 题意 给你两个串$A,B(|A|\geq| ...

  9. LOJ.6160.[美团CodeM初赛 RoundA]二分图染色(容斥 组合)

    题目链接 \(Description\) 求在\(2n\)个点的完全二分图(两边各有\(n\)个点)上确定两组匹配,使得两个匹配没有交集的方案数. \(n\leq10^7\). \(Solution\ ...

  10. 解决:coursera 视频总是缓冲或者无法观看

    关于这个问题,网上有很多的答案,但是可能我是win10 最近才更新了的,网上的方法都不能完全解决,然后自己搜了哈,最后综合自己解决了.具体方法如下. 在开始菜单中打开运行命令,输入gpedit.msc ...