A Complete Record of a PostgreSQL Crash Experiment

健哥的数据花园, 2024-09-29 17:30:48 (original article)

Hone the gems of technology, practice the way of data, pursue outstanding value


[Author: 高健 @ 博客园, luckyjackgao@gmail.com]

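A customer reported that PostgreSQL suddenly consumes a large amount of resources when running certain compute-heavy batch jobs. I ran the investigation below and found that a crash really does occur. PostgreSQL needs a resource-control mechanism.

I am now considering whether OS-level limits are unavoidable.

The procedure was as follows: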

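Test environment:

Memory: about 1024 MB

postgresql.conf settings:

Defaults are used: checkpoint_segments = 3, shared_buffers = 32MB

This is deliberate: the goal is to see what happens with a large data volume and a small shared_buffers.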

Create the table (each record is about 1024 bytes):

postgres=# create table test01(id integer, val char(1024));

Insert a large amount of data into the table (about 2400 MB in total):

postgres=# insert into test01 values(generate_series(1,2457600),repeat( chr(int4(random()*26)+65),1024));
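As a quick sanity check on the 2400 MB figure, the payload can be recomputed from the row count and the char(1024) width. This is only a sketch in shell arithmetic: it counts the val column alone and ignores the integer id, tuple headers and page overhead, so the table on disk will be somewhat larger.

```shell
# Rough payload size of test01: rows x bytes per val column.
# Ignores the integer id column, tuple headers and page overhead.
rows=2457600
bytes_per_val=1024
total_bytes=$(( rows * bytes_per_val ))
total_mb=$(( total_bytes / 1024 / 1024 ))
echo "payload: ${total_mb} MB"
```

With 1024 MB of RAM, the payload alone is more than twice the physical memory of the test machine.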

The insert takes a while. In the meantime, use the ps command to check the memory usage of each PostgreSQL process:

[root@server ~]# ps aux | grep post

root        0.0  0.0     pts/    S    :   : su - postgres

postgres    0.0  0.0      pts/    S+   :   : -bash

postgres    0.0  0.2    pts/    S    :   : /usr/local/pgsql/bin/postgres -D /gao/data

postgres    0.4  3.0   ?        Ss   :   : postgres: writer process                  

postgres    0.2  0.1    ?        Ds   :   : postgres: wal writer process              

postgres    0.0  0.0    ?        Ss   :   : postgres: autovacuum launcher process     

postgres    0.0  0.0      ?        Ss   :   : postgres: stats collector process         

root        0.0  0.0     pts/    S    :   : su - postgres

postgres    0.0  0.0      pts/    S    :   : -bash

postgres    0.0  0.0      pts/    S+   :   : ./psql

postgres   14.8 80.2   ?      Ds   :   : postgres: postgres postgres [local] INSERT

root        0.0  0.0      pts/    S+   :   : grep post

The backend running the INSERT is consuming more than 80% of memory.
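When there are many processes, picking out the heavy ones by eye is tedious. The sketch below filters `ps aux` output by its %MEM column. It assumes the usual `ps aux` layout, in which %MEM is the fourth field, and `high_mem` is a hypothetical helper name chosen here for illustration.

```shell
# high_mem LIMIT: read `ps aux` output on stdin and print the header plus
# every line whose %MEM (4th column) is at or above LIMIT.
# high_mem is a hypothetical helper, not part of PostgreSQL or procps.
high_mem() {
  awk -v limit="$1" 'NR == 1 || $4 + 0 >= limit + 0'
}

# Example: show processes using at least 50% of memory.
ps aux | high_mem 50
```

Running this in a loop (for example with `watch`) during the insert would show the backend's share of memory growing over time.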

After a short wait, psql reports:

WARNING:  terminating connection because of crash of another server process

DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

HINT:  In a moment you should be able to reconnect to the database and repeat your command.

The connection to the server was lost. Attempting reset: Failed.

!>

The log at this point shows that the background writer (PID 3321) has been killed and all connections have been reset:

LOG: autovacuum launcher started

LOG: database system is ready to accept connections

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( second apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( second apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( second apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: checkpoints are occurring too frequently ( seconds apart)

HINT: Consider increasing the configuration parameter "checkpoint_segments".

LOG: background writer process (PID ) was terminated by signal : Killed

LOG: terminating any other active server processes

WARNING: terminating connection because of crash of another server process

DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

HINT: In a moment you should be able to reconnect to the database and repeat your command.

WARNING: terminating connection because of crash of another server process

DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

HINT: In a moment you should be able to reconnect to the database and repeat your command.

WARNING: terminating connection because of crash of another server process

DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

HINT: In a moment you should be able to reconnect to the database and repeat your command.

WARNING: terminating connection because of crash of another server process

DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

HINT: In a moment you should be able to reconnect to the database and repeat your command.

LOG: all server processes terminated; reinitializing

FATAL: the database system is in recovery mode

LOG: database system was interrupted; last known up at -- :: CST

LOG: database system was not properly shut down; automatic recovery in progress

LOG: consistent recovery state reached at /B7657BD0

LOG: redo starts at /B60FE2B8

LOG: unexpected pageaddr /B044C000 in log file , segment , offset

LOG: redo done at /B844B940

LOG: autovacuum launcher started

LOG: database system is ready to accept connections

All the server processes have been respawned:

[root@server ~]# ps aux | grep post

root        0.0  0.0     pts/    S    :   : su - postgres

postgres    0.0  0.0      pts/    S+   :   : -bash

postgres    0.0  0.5    pts/    S    :   : /usr/local/pgsql/bin/postgres -D /gao/data

root        0.0  0.0     pts/    S    :   : su - postgres

postgres    0.0  0.0      pts/    S    :   : -bash

postgres    0.0  0.0     pts/    S+   :   : ./psql

postgres    0.0  0.0     ?        Ss   :   : postgres: writer process

postgres    0.0  0.0     ?        Ss   :   : postgres: wal writer process

postgres    0.0  0.1    ?        Ss   :   : postgres: autovacuum launcher process

postgres    0.0  0.0      ?        Ss   :   : postgres: stats collector process

root        0.0  0.0      pts/    R+   :   : grep post

[root@server ~]#

Back in psql, the connection is no longer valid:

!> \

Invalid command \. Try \? for help.

!> \dt;

You are currently not connected to a database.

!>

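According to the answers I received from the community, an OS-level OOM (out of memory) condition occurred and the kernel killed a PostgreSQL server process. Per the log above, the process that was killed was the background writer rather than the postmaster; the postmaster then terminated the remaining backends and reinitialized.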
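One way to confirm an OOM kill is to search the kernel log. The following is a minimal sketch: the exact message wording varies between kernel versions, so the pattern is an assumption that covers common variants, and `oom_lines` is a hypothetical helper name.

```shell
# oom_lines: filter kernel log text on stdin for typical OOM-killer messages.
# The pattern is an assumption; wording differs between kernel versions.
oom_lines() {
  grep -i -E 'out of memory|oom-killer|killed process'
}

# Typical use (reading the kernel log may require root on some systems):
dmesg 2>/dev/null | oom_lines || echo "no OOM-killer messages found"
```

If a postgres PID appears in the matching lines around the time of the crash, that supports the OOM explanation.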

In short, this shows that without a limit on total resource consumption, a sudden burst of user activity can crash the server.
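As one possible OS-level measure on Linux, the PostgreSQL documentation suggests switching the kernel to strict memory accounting (`vm.overcommit_memory = 2`), so that an oversized allocation fails inside the requesting process instead of triggering the OOM killer. The sketch below only reports the current policy; `overcommit_mode` is a hypothetical helper, and whether strict accounting suits a given workload is something to evaluate on a test system first.

```shell
# overcommit_mode [FILE]: print the kernel overcommit policy with a short label.
# FILE defaults to the Linux procfs entry; overcommit_mode is a hypothetical helper.
overcommit_mode() {
  file=${1:-/proc/sys/vm/overcommit_memory}
  mode=$(cat "$file" 2>/dev/null) || { echo "unknown (cannot read $file)"; return 1; }
  case "$mode" in
    0) echo "0 (heuristic overcommit)" ;;
    1) echo "1 (always overcommit)" ;;
    2) echo "2 (strict accounting, no overcommit)" ;;
    *) echo "$mode (unrecognized)" ;;
  esac
}

overcommit_mode || true

# To switch to strict accounting (requires root):
#   sysctl -w vm.overcommit_memory=2
```

This does not cap what a single query may use; it changes how the system fails when memory runs out.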
