The people using PostgreSQL and the Streaming Replication feature seem to ask many of the same questions:

1. How best to monitor Streaming Replication?

2. What is the best way to do that?

3. Are there alternatives, when monitoring on Standby, to using the pg_stat_replication view on Master?

4. How should I calculate replication lag-time, in seconds, minutes, etc.?

In light of these commonly asked questions, I thought a blog would help. The following are some methods I’ve found to be useful.

Monitoring is critical for large infrastructure deployments where you have Streaming Replication for:

1. Disaster recovery

2. Streaming Replication is for High Availability

3. Load balancing, when using Streaming Replication with Hot Standby

PostgreSQL has some building blocks for replication monitoring, and the following are some important functions and views which can be use for monitoring the replication:

1. pg_stat_replication view on master/primary server.

This view helps in monitoring the standby on Master. It gives you the following details:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
pid:              Process id of walsender process
usesysid:         OID of user which is used for Streaming replication.
usename:          Name of user which is used for Streaming replication
application_name: Application name connected to master
client_addr:      Address of standby/streaming replication
client_hostname:  Hostname of standby.
client_port:      TCP port number on which standby communicating with WAL sender
backend_start:    Start time when SR connected to Master.
state:            Current WAL sender state i.e streaming
sent_location:    Last transaction location sent to standby.
write_location:   Last transaction written on disk at standby
flush_location:   Last transaction flush on disk at standby.
replay_location:  Last transaction flush on disk at standby.
sync_priority:    Priority of standby server being chosen as synchronous standby
sync_state:       Sync State of standby (is it async or synchronous).

e.g.:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
postgres=# select * from pg_stat_replication ;
-[ RECORD 1 ]----+---------------------------------
pid              | 1114
usesysid         | 16384
usename          | repuser
application_name | walreceiver
client_addr      | 172.17.0.3
client_hostname  |
client_port      | 52444
backend_start    | 15-MAY-14 19:54:05.535695 -04:00
state            | streaming
sent_location    | 0/290044C0
write_location   | 0/290044C0
flush_location   | 0/290044C0
replay_location  | 0/290044C0
sync_priority    | 0
sync_state       | async

2. pg_is_in_recovery() : Function which tells whether standby is still in recovery mode or not.

e.g.

1
2
3
4
5
postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 t
(1 row)

3. pg_last_xlog_receive_location: Function which tells location of last transaction log which was streamed by Standby and also written on standby disk.

e.g.

1
2
3
4
5
postgres=# select pg_last_xlog_receive_location();
 pg_last_xlog_receive_location
-------------------------------
 0/29004560
(1 row)

4. pg_last_xlog_replay_location: Function which tells last transaction replayed during recovery process. e.g is given below:

1
2
3
4
5
postgres=# select pg_last_xlog_replay_location();
 pg_last_xlog_replay_location
------------------------------
 0/29004560
(1 row)

5. pg_last_xact_replay_timestamp: This function tells about the time stamp of last transaction which was replayed during recovery. Below is an example:

1
2
3
4
5
postgres=# select pg_last_xact_replay_timestamp();
  pg_last_xact_replay_timestamp
----------------------------------
 15-MAY-14 20:54:27.635591 -04:00
(1 row)

Above are some important functions/views, which are already available in PostgreSQL for monitoring the streaming replication.

So, the logical next question is, “What’s the right way to monitor the Hot Standby with Streaming Replication on Standby Server?”

If you have Hot Standby with Streaming Replication, the following are the points you should monitor:

1. Check if your Hot Standby is in recovery mode or not:

For this you can use pg_is_in_recovery() function.

2.Check whether Streaming Replication is working or not.

And easy way of doing this is checking the pg_stat_replication view on Master/Primary. This view gives information only on master if Streaming Replication is working.

3. Check If Streaming Replication is not working and Hot standby is recovering from archived WAL file.

For this, either the DBA can use the PostgreSQL Log file to monitor it or utilize the following functions provided in PostgreSQL 9.3:

1
2
pg_last_xlog_replay_location();
pg_last_xact_replay_timestamp();

4. Check how far off is the Standby from Master.

There are two ways to monitor lag for Standby.



   i. Lags in Bytes: For calculating lags in bytes, users can use thepg_stat_replication view on the master with the functionpg_xlog_location_diff function. Below is an example:

1
pg_xlog_location_diff(pg_stat_replication.sent_location, pg_stat_replication.replay_location)

which gives the lag in bytes.

  ii. Calculating lags in Seconds. The following is SQL, which most people uses to find the lag in seconds:

1
2
3
4
SELECT CASE WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location()
              THEN 0
            ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp())
       END AS log_delay;

Including the above into your repertoire can give you good monitoring for PostgreSQL.

I will in a future post include the script that can be used for monitoring the Hot Standby with PostgreSQL streaming replication.

Since 9.6 this is a lot easier as it introduced the function pg_blocking_pids() to find the sessions that are blocking another session.

So you can use something like this:

How to check blocking processes in postgresql

select pid,
usename,
pg_blocking_pids(pid) as blocked_by,
query as blocked_query
from pg_stat_activity
where cardinality(pg_blocking_pids(pid)) > 0;

14 - How to check replication status的更多相关文章

  1. java.lang.IllegalStateException: Failed to check the status of the service

    java.lang.IllegalStateException: Failed to check the status of the service com.pinyougou.sellergoods ...

  2. DTM initialization: failure during startup recovery, retry failed, check segment status (cdbtm.c:1603)

    安装greenplum集群出现以下错误: 20160315:13:49:16:025696 gpinitsystem:h95:jason-[INFO]:-Checking configuration ...

  3. check failed status == cudnn_status_success (4 vs. 0) cudnn_status_internal_error

    Check failed: error == cudaSuccess (30 vs. 0) unknown error  这个有可能是显存不足造成的,或者网络参数不对造成的 check failed ...

  4. 关于Failed to check the status of the service com.taotao.service.ItemService. No provider available fo

    原文:http://www.bubuko.com/infodetail-2250226.html 项目中用dubbo发生: Failed to check the status of the serv ...

  5. Check failed: status == CUBLAS_STATUS_SUCCESS (11 vs. 0) CUBLAS_STATUS_MAPPING_ERROR

    I0930 21:23:15.115576 30918 solver.cpp:281] Learning Rate Policy: multistepF0930 21:23:17.263314 310 ...

  6. CUDA报错: Cannot create Cublas handle. Cublas won't be available. 以及:Check failed: status == CUBLAS_STATUS_SUCCESS (1 vs. 0) CUBLAS_STATUS_NOT_INITIALIZED

    Error描述: aita@aita-Alienware-Area-51-R5:~/AITA2/daisida/ssd-github/caffe$ make runtest -j8 .build_re ...

  7. node js fcoin api 出现 api key check fail : {"status":1090,"msg":"Illegal API signature"}

    //主区://ft / btc 不支持市价 买入数量不能小于5个FT 买//ft / eth 支持市价 最小买入eth不能小于0.01 买//ft / usdt 支持市价 最小买入usdt不能小于10 ...

  8. dubbo Failed to check the status of the service com.user.service.UserService. No provider available for the service

    Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'u ...

  9. springboot 报错nested exception is java.lang.IllegalStateException: Failed to check the status of the service xxxService No provider available for the service

    spring: dubbo:#关闭所有服务的启动时检查:(没有提供者时报错) consumer: check: false timeout: 3000

随机推荐

  1. spring事务源码分析结合mybatis源码(一)

    最近想提升,苦逼程序猿,想了想还是拿最熟悉,之前也一直想看但没看的spring源码来看吧,正好最近在弄事务这部分的东西,就看了下,同时写下随笔记录下,以备后查. spring tx源码分析 这里只分析 ...

  2. yum安装mysql

    安装 CentOS7默认数据库是mariadb,配置等用着不习惯,因此决定改成mysql,但是CentOS7的yum源中默认好像是没有mysql的.为了解决这个问题,我们要先下载mysql的repo源 ...

  3. Redis学习之二 数据类型和相关命令

    原文:https://www.cnblogs.com/lonelyxmas/p/9073928.html 如果还不懂安装的,请看 Windows环境下安装Redis Redis一共支持五种数据类型 1 ...

  4. 如何开启远程debug调试功能?

    远程debug步骤: 1.vi /usr/local/sa/tomcat-ui/bin/catalina.sh 最顶上加export JPDA_ADDRESS=12345 2.vi /usr/loca ...

  5. python正则表达式--编译正则表达式re.compile

    编译正则表达式-- re.compile 使用re的一般步骤是先将正则表达式的字符串形 式编译为pattern实例,然后使用pattern实例处理文本并获取匹配结果(一个Match实例(值为True) ...

  6. vue——loading组件

    <template> <div class="loading" :style="{height:loadingRadiusVal+'px',width: ...

  7. 【git】将本地项目上传到远程仓库

    飞机票 一. 首先你需要一个github账号,所有还没有的话先去注册吧! https://github.com/ 我们使用git需要先安装git工具,这里给出下载地址,下载后一路直接安装即可: htt ...

  8. C# .Net String字符串效率提高-字符串拼接

    字符串操作是编程中非常频繁的操作,特别是在拼接字符串的时候.下面来说说字符串拼接的效率提升. 1. 减少装箱 值类型与引用类型之间的转换存在装箱与拆箱操作:将值类型转换成引用类型的操作叫装箱,将引用类 ...

  9. windows无法安装msi文件

    命令提示符(管理员身份运行): 输入:msiexec /i e:\spark\scala-2.11.12.msi 其中e:\spark\scala-2.11.12.msi:就是安装文件的位置.

  10. kubernetes核心概念

    摘抄自:  https://www.cnblogs.com/zhenyuyaodidiao/p/6500720.html 1.基础架构 1.1 Master Master节点上面主要由四个模块组成:A ...