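Case description:

Figure: KingbaseES server process structure

KingbaseES uses a client/server model. For each client connection, the KingbaseES main process accepts the connection and creates a new server process (backend process) for it. That server process handles the client's database requests and exits when the connection is closed. In other words, whenever a client is connected to the database, a corresponding kingbase server process is serving it, as in the client query session shown later in this case.

Figure: the corresponding server process (backend process) at the operating-system level

When a client finishes its work and disconnects normally, the corresponding kingbase server process also ends. When a client exits abnormally, however, the kingbase server process on the database side may not end properly and keeps occupying database resources. This case describes in detail how to terminate a server process (backend process) manually. A backend process can be ended manually with a database tool or with the operating system's kill command, but the different methods affect the database differently.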

Applicable versions:

KingbaseES V8R3 / V8R6

System architecture:

I. Client access

[kingbase@node1 bin]$ ./ksql -h 192.168.8.201 -U system -W prod
Password:
ksql (V8.0)
Type "help" for help.

prod=# select count(*) from t1;
count
--------
100000
(1 row)

II. Backend process termination methods

1. Using pg_terminate_backend(pid)

Tip: the pg_terminate_backend() function works by sending a SIGTERM signal to the process.
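The tip above is why SIGTERM-based termination is comparatively gentle: a process can install a handler for SIGTERM and clean up before it exits. The following is a minimal shell sketch of that general mechanism, not KingbaseES code; the `worker` function and its message are hypothetical stand-ins for a backend process.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a backend process: it installs a SIGTERM
# handler, so it can clean up and choose its own exit status.
worker() {
    trap 'echo "worker: SIGTERM received, cleaning up"; exit 0' TERM
    while :; do
        sleep 1
    done
}

worker &                  # start the stand-in process in the background
worker_pid=$!
sleep 1                   # give it time to install the handler

kill -TERM "$worker_pid"  # the same signal the tip says pg_terminate_backend() sends

term_status=0
wait "$worker_pid" || term_status=$?
echo "worker exit status after SIGTERM: $term_status"
```

Because the handler runs and calls `exit 0`, the parent sees a clean exit rather than a death by signal. A supervising parent process can therefore treat it as a normal shutdown.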

# Query the backend process status
prod=# select * from sys_stat_activity where client_addr='192.168.8.200';
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | quer
y_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+---------+-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+------------+--------------
-----------------+-------------------------------+-----------------+------------+-------+-------------+--------------+--------------------------+----------------
16385 | prod | 17100 | 10 | system | kingbase_*&+_ | 192.168.8.200 | | 57476 | 2022-11-29 15:04:57.131987+08 | | 2022-11-29 15
:05:05.526379+08 | 2022-11-29 15:05:05.539018+08 | Client | ClientRead | idle | | | select count(*) from t1; | client backend
(1 row)

# Terminate the backend process by its pid
prod=# select pg_terminate_backend(17100);
pg_terminate_backend
----------------------
t
(1 row)

# The process has been terminated
prod=# select * from sys_stat_activity where client_addr='192.168.8.200';
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | wait
_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+---------+-----+----------+---------+------------------+-------------+-----------------+-------------+---------------+------------+-------------+--------------+-----
------------+------------+-------+-------------+--------------+-------+--------------
(0 rows)

# As shown below, the database processes are running normally and the backend process was terminated safely
[root@node2 sys_log]# ps -ef |grep kingbase
kingbase 13089 1 0 14:39 ? 00:00:00 /opt/Kingbase/ES/V8R6_054/KESRealPro/V008R006C005B0054/Server/bin/kingbase -D /db/kingbase/v8r6_054/data
kingbase 13090 13089 0 14:39 ? 00:00:00 kingbase: logger
kingbase 16474 13089 0 14:59 ? 00:00:00 kingbase: checkpointer
kingbase 16475 13089 0 14:59 ? 00:00:00 kingbase: background writer
kingbase 16476 13089 0 14:59 ? 00:00:00 kingbase: walwriter
kingbase 16477 13089 0 14:59 ? 00:00:00 kingbase: autovacuum launcher
kingbase 16478 13089 0 14:59 ? 00:00:00 kingbase: stats collector
kingbase 16479 13089 0 14:59 ? 00:00:00 kingbase: ksh writer
kingbase 16480 13089 0 14:59 ? 00:00:00 kingbase: ksh collector
kingbase 16481 13089 0 14:59 ? 00:00:00 kingbase: kwr collector
kingbase 16482 13089 0 14:59 ? 00:00:00 kingbase: logical replication launcher
kingbase 17100 13089 0 15:04 ? 00:00:00 kingbase: system prod 192.168.8.200(57476) idle
kingbase 17212 12416 0 15:05 pts/0 00:00:00 ./ksql -U system test
kingbase 17214 13089 0 15:05 ? 00:00:00 kingbase: system prod [local] idle

2. Using the operating system's kill pid

# Query the backend process status
prod=# select * from sys_stat_activity where client_addr='192.168.8.200';
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | quer
y_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+---------+-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+------------+--------------
-----------------+-------------------------------+-----------------+------------+-------+-------------+--------------+--------------------------+----------------
16385 | prod | 18424 | 10 | system | kingbase_*&+_ | 192.168.8.200 | | 57480 | 2022-11-29 15:15:26.813035+08 | | 2022-11-29 15
:15:28.912910+08 | 2022-11-29 15:15:28.922719+08 | Client | ClientRead | idle | | | select count(*) from t1; | client backend
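(1 row)

# Run kill pid at the operating-system level to end the process
[root@node2 sys_log]# kill 18424

# As shown below, the database processes are running normally and the backend process was killed safely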
[root@node2 sys_log]# ps -ef |grep kingbase
kingbase 13089 1 0 14:39 ? 00:00:00 /opt/Kingbase/ES/V8R6_054/KESRealPro/V008R006C005B0054/Server/bin/kingbase -D /db/kingbase/v8r6_054/data
kingbase 13090 13089 0 14:39 ? 00:00:00 kingbase: logger
kingbase 16474 13089 0 14:59 ? 00:00:00 kingbase: checkpointer
kingbase 16475 13089 0 14:59 ? 00:00:00 kingbase: background writer
kingbase 16476 13089 0 14:59 ? 00:00:00 kingbase: walwriter
kingbase 16477 13089 0 14:59 ? 00:00:00 kingbase: autovacuum launcher
kingbase 16478 13089 0 14:59 ? 00:00:00 kingbase: stats collector
kingbase 16479 13089 0 14:59 ? 00:00:00 kingbase: ksh writer
kingbase 16480 13089 0 14:59 ? 00:00:00 kingbase: ksh collector
kingbase 16481 13089 0 14:59 ? 00:00:00 kingbase: kwr collector
kingbase 16482 13089 0 14:59 ? 00:00:00 kingbase: logical replication launcher
kingbase 17212 12416 0 15:05 pts/0 00:00:00 ./ksql -U system test
kingbase 17214 13089 0 15:05 ? 00:00:00 kingbase: system prod [local] idle

# The database view no longer has a record for this backend process
prod=# select * from sys_stat_activity where client_addr='192.168.8.200';
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | wait
_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+---------+-----+----------+---------+------------------+-------------+-----------------+-------------+---------------+------------+-------------+--------------+-----
------------+------------+-------+-------------+--------------+-------+--------------
(0 rows)
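Running `kill` without a signal option sends SIGTERM (signal 15), which explains why this method behaves like `pg_terminate_backend()`. The sketch below checks this on a throwaway `sleep` process rather than on a database process. The variable names are illustrative only.

```shell
#!/usr/bin/env bash
# Throwaway process used only for illustration; do not experiment on a real backend.
sleep 30 &
victim_pid=$!

kill "$victim_pid"        # no signal given: the default is SIGTERM (15)

exit_status=0
wait "$victim_pid" || exit_status=$?

# A process ended by signal N is reported by the shell as status 128+N.
signal_number=$((exit_status - 128))
signal_name=$(kill -l "$signal_number")
echo "process ended by signal $signal_number ($signal_name)"
```

Unlike a process that installs its own handler, `sleep` takes the default action for SIGTERM. The shell therefore reports status 143, that is, 128 + 15.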

3. Using the operating system's kill -15 pid and the database's sys_ctl kill TERM pid

1) Operating system: kill -15 pid

# Query the backend process status
test=# select * from sys_stat_activity where client_addr='192.168.8.200';
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | quer
y_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+---------+-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+------------+--------------
-----------------+-------------------------------+-----------------+------------+-------+-------------+--------------+--------------------------+----------------
16385 | prod | 22955 | 10 | system | kingbase_*&+_ | 192.168.8.200 | | 57498 | 2022-11-29 15:44:29.993305+08 | | 2022-11-29 15
:44:32.090913+08 | 2022-11-29 15:44:32.100617+08 | Client | ClientRead | idle | | | select count(*) from t1; | client backend
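(1 row)

# Run kill -15 pid at the operating-system level to end the process
[kingbase@node2 bin]$ kill -15 22955

# As shown below, the database processes are running normally and the backend process was killed safely
[kingbase@node2 bin]$ ps -ef |grep kingbase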
kingbase 13089 1 0 14:39 ? 00:00:00 /opt/Kingbase/ES/V8R6_054/KESRealPro/V008R006C005B0054/Server/bin/kingbase -D /db/kingbase/v8r6_054/data
kingbase 22197 13089 0 15:38 ? 00:00:00 kingbase: checkpointer
kingbase 22198 13089 0 15:38 ? 00:00:00 kingbase: background writer
kingbase 22199 13089 0 15:38 ? 00:00:00 kingbase: walwriter
kingbase 22200 13089 0 15:38 ? 00:00:00 kingbase: autovacuum launcher
kingbase 22201 13089 0 15:38 ? 00:00:00 kingbase: stats collector
kingbase 22202 13089 0 15:38 ? 00:00:00 kingbase: ksh writer
kingbase 22203 13089 0 15:38 ? 00:00:00 kingbase: ksh collector
kingbase 22204 13089 0 15:38 ? 00:00:00 kingbase: kwr collector
kingbase 22205 13089 0 15:38 ? 00:00:00 kingbase: logical replication launcher
kingbase 22444 12416 0 15:40 pts/0 00:00:00 ./ksql -U system test
kingbase 22445 13089 0 15:40 ? 00:00:00 kingbase: system test [local] idle

# The database view no longer has a record for this backend process
test=# select * from sys_stat_activity where client_addr='192.168.8.200';
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | wait
_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+---------+-----+----------+---------+------------------+-------------+-----------------+-------------+---------------+------------+-------------+--------------+-----
------------+------------+-------+-------------+--------------+-------+--------------
(0 rows)

2) Database: sys_ctl kill TERM pid

# Query the backend process status
test=# select * from sys_stat_activity where client_addr='192.168.8.200';
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | quer
y_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+---------+-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+------------+--------------
-----------------+-------------------------------+-----------------+------------+-------+-------------+--------------+--------------------------+----------------
16385 | prod | 22443 | 10 | system | kingbase_*&+_ | 192.168.8.200 | | 57494 | 2022-11-29 15:40:42.804868+08 | | 2022-11-29 15
:40:44.972533+08 | 2022-11-29 15:40:44.985340+08 | Client | ClientRead | idle | | | select count(*) from t1; | client backend
(1 row)

# Run the database command to kill the process
[kingbase@node2 bin]$ ./sys_ctl kill TERM 22443

# As shown below, the database processes are running normally and the backend process was killed safely
[kingbase@node2 bin]$ ps -ef |grep kingbase
kingbase 13089 1 0 14:39 ? 00:00:00 /opt/Kingbase/ES/V8R6_054/KESRealPro/V008R006C005B0054/Server/bin/kingbase -D /db/kingbase/v8r6_054/data
kingbase 22197 13089 0 15:38 ? 00:00:00 kingbase: checkpointer
kingbase 22198 13089 0 15:38 ? 00:00:00 kingbase: background writer
kingbase 22199 13089 0 15:38 ? 00:00:00 kingbase: walwriter
kingbase 22200 13089 0 15:38 ? 00:00:00 kingbase: autovacuum launcher
kingbase 22201 13089 0 15:38 ? 00:00:00 kingbase: stats collector
kingbase 22202 13089 0 15:38 ? 00:00:00 kingbase: ksh writer
kingbase 22203 13089 0 15:38 ? 00:00:00 kingbase: ksh collector
kingbase 22204 13089 0 15:38 ? 00:00:00 kingbase: kwr collector
kingbase 22205 13089 0 15:38 ? 00:00:00 kingbase: logical replication launcher
kingbase 22444 12416 0 15:40 pts/0 00:00:00 ./ksql -U system test
kingbase 22445 13089 0 15:40 ? 00:00:00 kingbase: system test [local] idle

# The database view no longer has a record for this backend process
test=# select * from sys_stat_activity where client_addr='192.168.8.200';
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | wait
_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+---------+-----+----------+---------+------------------+-------------+-----------------+-------------+---------------+------------+-------------+--------------+-----
------------+------------+-------+-------------+--------------+-------+--------------
(0 rows)

4. Using the operating system's kill -3 pid and the database's sys_ctl kill QUIT pid

1) Operating system: kill -3 pid

# Query the backend process status
prod=# select * from sys_stat_activity where client_addr='192.168.8.200';
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | quer
y_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query
| backend_type
-------+---------+-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+------------+--------------
-----------------+-------------------------------+-----------------+------------+-------+-------------+--------------+-------------------------------------------------------
--------------+----------------
16385 | prod | 18666 | 10 | system | kingbase_*&+_ | 192.168.8.200 | | 57486 | 2022-11-29 15:17:34.726155+08 | | 2022-11-29 15
:17:34.736202+08 | 2022-11-29 15:17:34.740584+08 | Client | ClientRead | idle | | | select setting from pg_settings where name = 'enable_u
pper_colname' | client backend
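(1 row)

# Run kill -3 pid at the operating-system level to end the process
[root@node2 sys_log]# kill -3 18666

# As shown below, besides the targeted backend process, the other background auxiliary processes are also terminated and then restarted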
[root@node2 sys_log]# ps -ef |grep kingbase
kingbase 13089 1 0 14:39 ? 00:00:00 /opt/Kingbase/ES/V8R6_054/KESRealPro/V008R006C005B0054/Server/bin/kingbase -D /db/kingbase/v8r6_054/data
kingbase 13090 13089 0 14:39 ? 00:00:00 kingbase: logger
kingbase 17212 12416 0 15:05 pts/0 00:00:00 ./ksql -U system test
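kingbase 18795 13089 0 15:18 ? 00:00:00 kingbase: startup

# Check the client session output and the corresponding sys_log log (terminating the backend process this way also forces the database's auxiliary processes to restart)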
prod=# select * from sys_stat_activity where client_addr='192.168.8.200';
WARNING: terminating connection because of crash of another server process
DETAIL: The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
2022-11-29 15:18:06.741 CST [18666] WARNING: terminating connection because of crash of another server process
2022-11-29 15:18:06.741 CST [18666] DETAIL: The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-11-29 15:18:06.741 CST [18666] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-11-29 15:18:06.742 CST [13089] LOG: server process (PID 18666) exited with exit code 2
2022-11-29 15:18:06.742 CST [13089] DETAIL: Failed process was running: select setting from pg_settings where name = 'enable_upper_colname'
2022-11-29 15:18:06.742 CST [13089] LOG: terminating any other active server processes
2022-11-29 15:18:06.743 CST [17214] WARNING: terminating connection because of crash of another server process
2022-11-29 15:18:06.743 CST [17214] DETAIL: The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-11-29 15:18:06.743 CST [17214] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-11-29 15:18:06.744 CST [16477] WARNING: terminating connection because of crash of another server process
2022-11-29 15:18:06.744 CST [16477] DETAIL: The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-11-29 15:18:06.744 CST [16477] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-11-29 15:18:06.745 CST [13089] LOG: all server processes terminated; reinitializing
2022-11-29 15:18:06.823 CST [18795] LOG: database system was interrupted; last known up at 2022-11-29 14:59:17 CST
2022-11-29 15:19:29.897 CST [18795] LOG: database system was not properly shut down; automatic recovery in progress
2022-11-29 15:19:30.063 CST [18795] LOG: redo starts at 0/22935D8
2022-11-29 15:19:30.063 CST [18795] LOG: redo wal segment count 2
2022-11-29 15:19:30.063 CST [18795] LOG: invalid record length at 0/2293608: wanted 24, got 0
2022-11-29 15:19:30.063 CST [18795] LOG: redo done at 0/22935D8
2022-11-29 15:19:30.739 CST [13089] LOG: database system is ready to accept connections

# After the database service has restarted
[root@node2 sys_log]# ps -ef |grep kingbase
kingbase 13089 1 0 14:39 ? 00:00:00 /opt/Kingbase/ES/V8R6_054/KESRealPro/V008R006C005B0054/Server/bin/kingbase -D /db/kingbase/v8r6_054/data
kingbase 13090 13089 0 14:39 ? 00:00:00 kingbase: logger
kingbase 17212 12416 0 15:05 pts/0 00:00:00 ./ksql -U system test
kingbase 18953 13089 0 15:19 ? 00:00:00 kingbase: checkpointer
kingbase 18954 13089 0 15:19 ? 00:00:00 kingbase: background writer
kingbase 18955 13089 0 15:19 ? 00:00:00 kingbase: walwriter
kingbase 18957 13089 0 15:19 ? 00:00:00 kingbase: stats collector
kingbase 18958 13089 0 15:19 ? 00:00:00 kingbase: ksh writer
kingbase 18959 13089 0 15:19 ? 00:00:00 kingbase: ksh collector
kingbase 18960 13089 0 15:19 ? 00:00:00 kingbase: kwr collector

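---As shown above, using kill -3 pid to terminate a backend process forces the server to terminate all other sessions and run crash recovery, which puts the database at great risk.

2) Database: sys_ctl kill QUIT pid

# Query the backend process status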
test=# select * from sys_stat_activity where client_addr='192.168.8.200';
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | quer
y_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+---------+-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+------------+--------------
-----------------+-------------------------------+-----------------+------------+-------+-------------+--------------+--------------------------+----------------
16385 | prod | 21894 | 10 | system | kingbase_*&+_ | 192.168.8.200 | | 57490 | 2022-11-29 15:36:10.034020+08 | | 2022-11-29 15
:36:13.902728+08 | 2022-11-29 15:36:13.917841+08 | Client | ClientRead | idle | | | select count(*) from t1; | client backend
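(1 row)

# Run the database command sys_ctl kill QUIT to terminate the backend process
[kingbase@node2 bin]$ ./sys_ctl kill QUIT 21894

# As shown below, besides the targeted backend process, the other background auxiliary processes are also terminated and then restarted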
[kingbase@node2 bin]$ ps -ef |grep kingbase
kingbase 13089 1 0 14:39 ? 00:00:00 /opt/Kingbase/ES/V8R6_054/KESRealPro/V008R006C005B0054/Server/bin/kingbase -D /db/kingbase/v8r6_054/data
kingbase 13090 13089 0 14:39 ? 00:00:00 kingbase: logger
kingbase 21895 12416 0 15:36 pts/0 00:00:00 ./ksql -U system test
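kingbase 22071 13089 0 15:37 ? 00:00:00 kingbase: startup

# Check the client session output and the corresponding sys_log log (terminating the backend process this way also forces the database's auxiliary processes to restart)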
test=# select * from sys_stat_activity where client_addr='192.168.8.200';
WARNING: terminating connection because of crash of another server process
DETAIL: The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
2022-11-29 15:37:13.828 CST [21894] WARNING: terminating connection because of crash of another server process
2022-11-29 15:37:13.828 CST [21894] DETAIL: The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-11-29 15:37:13.828 CST [21894] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-11-29 15:37:13.829 CST [13089] LOG: server process (PID 21894) exited with exit code 2
2022-11-29 15:37:13.829 CST [13089] DETAIL: Failed process was running: select count(*) from t1;
2022-11-29 15:37:13.829 CST [13089] LOG: terminating any other active server processes
2022-11-29 15:37:13.830 CST [21896] WARNING: terminating connection because of crash of another server process
2022-11-29 15:37:13.830 CST [21896] DETAIL: The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-11-29 15:37:13.830 CST [21896] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-11-29 15:37:13.831 CST [18956] WARNING: terminating connection because of crash of another server process
2022-11-29 15:37:13.831 CST [18956] DETAIL: The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-11-29 15:37:13.831 CST [18956] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-11-29 15:37:13.833 CST [13089] LOG: all server processes terminated; reinitializing
2022-11-29 15:37:13.933 CST [22071] LOG: database system was interrupted; last known up at 2022-11-29 15:19:30 CST
2022-11-29 15:38:29.154 CST [22071] LOG: database system was not properly shut down; automatic recovery in progress
2022-11-29 15:38:29.232 CST [22071] LOG: redo starts at 0/2293680
2022-11-29 15:38:29.232 CST [22071] LOG: redo wal segment count 2
2022-11-29 15:38:29.232 CST [22071] LOG: invalid record length at 0/22936B0: wanted 24, got 0
2022-11-29 15:38:29.232 CST [22071] LOG: redo done at 0/2293680
2022-11-29 15:38:29.767 CST [13089] LOG: database system is ready to accept connections

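---As shown above, using sys_ctl kill QUIT pid to terminate a backend process forces the server to terminate all other sessions and run crash recovery, which puts the database at great risk.

5. Using the operating system's kill -9 pid

# Query the backend process status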
prod=# select * from sys_stat_activity where client_addr='192.168.8.200';
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start
| query_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query
| backend_type
-------+---------+-------+----------+---------+-------------------+---------------+-----------------+-------------+-------------------------------+--------------------------
-----+-------------------------------+-------------------------------+-----------------+---------------------+--------+-------------+--------------+-------------------------
---------+------------------------------
16385 | prod | 14114 | 10 | system | kingbase_*&+_ | 192.168.8.200 | | 57472 | 2022-11-29 14:48:06.372451+08 |
| 2022-11-29 14:50:16.618969+08 | 2022-11-29 14:50:16.627572+08 | Client | ClientRead | idle | | | select count(*) from t1;
| client backend
(1 row)

# Run kill -9 pid at the operating-system level to end the process
[root@node2 ~]# kill -9 14114

# As shown below, besides the targeted backend process, the other background auxiliary processes are also terminated and then restarted
[root@node2 ~]# ps -ef |grep kingbase
.......
kingbase 13089 1 0 14:39 ? 00:00:00 /opt/Kingbase/ES/V8R6_054/KESRealPro/V008R006C005B0054/Server/bin/kingbase -D /db/kingbase/v8r6_054/data
kingbase 13090 13089 0 14:39 ? 00:00:00 kingbase: logger
kingbase 14010 12416 0 14:47 pts/0 00:00:00 ./ksql -U system test
kingbase 15651 13089 0 14:57 ? 00:00:00 kingbase: startup

# Check the corresponding sys_log log (terminating the backend process this way also forces the database's auxiliary processes to restart)
2022-11-29 14:58:00.093 CST [13089] LOG: server process (PID 14114) was terminated by signal 9: Killed
2022-11-29 14:58:00.093 CST [13089] DETAIL: Failed process was running: select count(*) from t1;
2022-11-29 14:58:00.093 CST [13089] LOG: terminating any other active server processes
2022-11-29 14:58:00.093 CST [14602] WARNING: terminating connection because of crash of another server process
2022-11-29 14:58:00.093 CST [14602] DETAIL: The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-11-29 14:58:00.093 CST [14602] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-11-29 14:58:00.095 CST [13095] WARNING: terminating connection because of crash of another server process
2022-11-29 14:58:00.095 CST [13095] DETAIL: The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-11-29 14:58:00.095 CST [13095] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-11-29 14:58:00.384 CST [13089] LOG: all server processes terminated; reinitializing
2022-11-29 14:58:00.464 CST [15651] LOG: database system was interrupted; last known up at 2022-11-29 14:53:42 CST
2022-11-29 14:59:16.706 CST [15651] LOG: database system was not properly shut down; automatic recovery in progress
2022-11-29 14:59:16.806 CST [15651] LOG: redo starts at 0/2293488
2022-11-29 14:59:16.806 CST [15651] LOG: redo wal segment count 2
2022-11-29 14:59:16.806 CST [15651] LOG: invalid record length at 0/2293560: wanted 24, got 0
2022-11-29 14:59:16.806 CST [15651] LOG: redo done at 0/2293530
2022-11-29 14:59:17.563 CST [13089] LOG: database system is ready to accept connections

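---As shown above, using kill -9 pid to terminate a backend process forces the server to terminate all other sessions and run crash recovery, which puts the database at great risk.

III. Summary

An abnormal backend process in a KingbaseES database can be terminated manually, but pay attention to the risk that the chosen method poses to the database:

1) Relatively safe methods: pg_terminate_backend(pid); kill pid; kill -15 pid; sys_ctl kill TERM pid.
2) Unsafe methods: kill -3 pid; sys_ctl kill QUIT pid; kill -9 pid.

Note: never use kill -9. SIGKILL cannot be caught by a signal handler, so the OS stops the process immediately. When the Kingbase parent process detects that a child process has exited abnormally, it stops all other processes, releases shared memory, allocates shared memory again, and starts all processes back up. The effect is the same as a crash restart: startup has to replay redo, which takes time and can cause a service outage of several minutes. Unless that consequence is acceptable, do not use kill -9.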
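The summary above can be encoded as a small guard so that an unsafe signal is never sent by accident. This is only a sketch under stated assumptions: `classify_signal` and `terminate_backend` are hypothetical helper names, not KingbaseES tools, and the demonstration targets a throwaway `sleep` process instead of a real backend.

```shell
#!/usr/bin/env bash
# Hypothetical helper encoding the summary: SIGTERM-based methods are
# relatively safe; SIGQUIT and SIGKILL force a crash recovery.
classify_signal() {
    case "$1" in
        TERM|SIGTERM|15) echo "relatively-safe" ;;
        QUIT|SIGQUIT|3)  echo "unsafe" ;;
        KILL|SIGKILL|9)  echo "unsafe" ;;
        *)               echo "unknown" ;;
    esac
}

# Hypothetical wrapper: sends SIGTERM to the pid only when the requested
# signal is classified as relatively safe, and refuses otherwise.
terminate_backend() {
    local sig=$1
    local pid=$2
    if [ "$(classify_signal "$sig")" != "relatively-safe" ]; then
        echo "refusing to send $sig to $pid: may force a crash recovery" >&2
        return 1
    fi
    kill -s TERM "$pid"
}

# Demonstration on a throwaway process.
sleep 30 &
demo_pid=$!

refused=0
terminate_backend KILL "$demo_pid" || refused=$?   # rejected by the guard

sent=0
terminate_backend TERM "$demo_pid" || sent=$?      # allowed, sends SIGTERM

wait "$demo_pid" || true                           # reap the ended process
echo "KILL request status: $refused, TERM request status: $sent"
```

In practice the pid would come from a query on sys_stat_activity like the ones shown in this case. Inside the database, pg_terminate_backend(pid) remains the simplest of the relatively safe options.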
