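Case description:

Figure: KingbaseES server process structure

KingbaseES uses a client/server model. When the KingbaseES main process receives a client connection, it creates a new server process for that connection. KingbaseES uses these server processes (backend processes) to handle requests from clients connected to the database service: each one does the actual work for its client and exits when the connection is closed. Whenever a client connects to the database, a corresponding kingbase server process serves it, as in the following client query:

Figure: the corresponding server process (backend process) as seen from the operating system

When a client finishes and disconnects normally, the corresponding kingbase server process also ends. When a client exits abnormally, however, the kingbase server process on the database side is not ended properly and keeps holding database resources. This case describes in detail how to terminate a server process (backend process) manually. A backend process can be ended manually with database tools or with the operating system's kill command, but the different methods affect the database differently.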

Applicable versions:

KingbaseES V8R3/R6

System architecture:

I. Client access

[kingbase@node1 bin]$ ./ksql -h 192.168.8.201 -U system -W prod
Password:
ksql (V8.0)
Type "help" for help.

prod=# select count(*) from t1;
count
--------
100000
(1 row)

II. Backend process termination methods

1. pg_terminate_backend(pid)

Tip: the function pg_terminate_backend() actually sends a SIGTERM signal to the process.
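Because pg_terminate_backend() takes one pid at a time, several leftover sessions from the same client can be handled in a single statement by selecting the function over sys_stat_activity. The sketch below only builds and prints such a statement. `build_terminate_sql` is a hypothetical helper name, and filtering on `state = 'idle'` is an assumption about which sessions should be ended.

```shell
# Sketch only: build_terminate_sql is a hypothetical helper, not a KingbaseES tool.
# It prints a statement that terminates every idle backend of one client address.
build_terminate_sql() {
    client_addr="$1"
    # Accept only characters that can appear in an IPv4/IPv6 address,
    # so the value is safe to place inside the SQL string literal.
    case "$client_addr" in
        ""|*[!0-9A-Fa-f.:]*)
            echo "invalid client address: $client_addr" >&2
            return 1 ;;
    esac
    printf "select pid, pg_terminate_backend(pid) from sys_stat_activity where client_addr = '%s' and state = 'idle';\n" "$client_addr"
}

# Print the statement for review; nothing is sent to the database here.
build_terminate_sql 192.168.8.200
```

The printed statement can be reviewed and then run in a ksql session like the ones shown in this article.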

# Query the backend process status
prod=# select * from sys_stat_activity where client_addr='192.168.8.200';
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | quer
y_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+---------+-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+------------+--------------
-----------------+-------------------------------+-----------------+------------+-------+-------------+--------------+--------------------------+----------------
16385 | prod | 17100 | 10 | system | kingbase_*&+_ | 192.168.8.200 | | 57476 | 2022-11-29 15:04:57.131987+08 | | 2022-11-29 15
:05:05.526379+08 | 2022-11-29 15:05:05.539018+08 | Client | ClientRead | idle | | | select count(*) from t1; | client backend
(1 row)

# Terminate the backend process by its pid
prod=# select pg_terminate_backend(17100);
pg_terminate_backend
----------------------
t
(1 row)

# The process has been terminated
prod=# select * from sys_stat_activity where client_addr='192.168.8.200';
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | wait
_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+---------+-----+----------+---------+------------------+-------------+-----------------+-------------+---------------+------------+-------------+--------------+-----
------------+------------+-------+-------------+--------------+-------+--------------
(0 rows)

# As shown below, the database processes are normal and the backend process was terminated safely
[root@node2 sys_log]# ps -ef |grep kingbase
kingbase 13089 1 0 14:39 ? 00:00:00 /opt/Kingbase/ES/V8R6_054/KESRealPro/V008R006C005B0054/Server/bin/kingbase -D /db/kingbase/v8r6_054/data
kingbase 13090 13089 0 14:39 ? 00:00:00 kingbase: logger
kingbase 16474 13089 0 14:59 ? 00:00:00 kingbase: checkpointer
kingbase 16475 13089 0 14:59 ? 00:00:00 kingbase: background writer
kingbase 16476 13089 0 14:59 ? 00:00:00 kingbase: walwriter
kingbase 16477 13089 0 14:59 ? 00:00:00 kingbase: autovacuum launcher
kingbase 16478 13089 0 14:59 ? 00:00:00 kingbase: stats collector
kingbase 16479 13089 0 14:59 ? 00:00:00 kingbase: ksh writer
kingbase 16480 13089 0 14:59 ? 00:00:00 kingbase: ksh collector
kingbase 16481 13089 0 14:59 ? 00:00:00 kingbase: kwr collector
kingbase 16482 13089 0 14:59 ? 00:00:00 kingbase: logical replication launcher
kingbase 17100 13089 0 15:04 ? 00:00:00 kingbase: system prod 192.168.8.200(57476) idle
kingbase 17212 12416 0 15:05 pts/0 00:00:00 ./ksql -U system test
kingbase 17214 13089 0 15:05 ? 00:00:00 kingbase: system prod [local] idle
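In listings like the one above, a client backend can be recognized by its process title, which carries the user, database, client address and port. The sketch below extracts the PIDs for one client address. `backend_pids_for` is a hypothetical helper name, and the field positions assume the `ps -ef` column layout shown in this article.

```shell
# Sketch only: backend_pids_for is a hypothetical helper.
# Reads `ps -ef` lines on stdin and prints the PID of each kingbase backend
# whose process title shows the given client address.
backend_pids_for() {
    awk -v addr="$1" '
        # Field 8 starts the command; backends look like
        # "kingbase: <user> <database> <addr>(<port>) <state>".
        $8 == "kingbase:" && index($11, addr "(") == 1 { print $2 }
    '
}

# Example with two rows copied from the listing above.
printf '%s\n' \
  'kingbase 17100 13089 0 15:04 ? 00:00:00 kingbase: system prod 192.168.8.200(57476) idle' \
  'kingbase 17214 13089 0 15:05 ? 00:00:00 kingbase: system prod [local] idle' |
  backend_pids_for 192.168.8.200
```

On a live host the same function would be fed from `ps -ef` directly. Appending "(" to the address before matching keeps 192.168.8.20 from matching 192.168.8.200.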

2. OS kill pid

# Query the backend process status
prod=# select * from sys_stat_activity where client_addr='192.168.8.200';
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | quer
y_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+---------+-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+------------+--------------
-----------------+-------------------------------+-----------------+------------+-------+-------------+--------------+--------------------------+----------------
16385 | prod | 18424 | 10 | system | kingbase_*&+_ | 192.168.8.200 | | 57480 | 2022-11-29 15:15:26.813035+08 | | 2022-11-29 15
:15:28.912910+08 | 2022-11-29 15:15:28.922719+08 | Client | ClientRead | idle | | | select count(*) from t1; | client backend
(1 row)

# Run kill pid at the operating-system level to end the process
[root@node2 sys_log]# kill 18424

# As shown below, the database processes are normal and the backend process was killed safely
[root@node2 sys_log]# ps -ef |grep kingbase
kingbase 13089 1 0 14:39 ? 00:00:00 /opt/Kingbase/ES/V8R6_054/KESRealPro/V008R006C005B0054/Server/bin/kingbase -D /db/kingbase/v8r6_054/data
kingbase 13090 13089 0 14:39 ? 00:00:00 kingbase: logger
kingbase 16474 13089 0 14:59 ? 00:00:00 kingbase: checkpointer
kingbase 16475 13089 0 14:59 ? 00:00:00 kingbase: background writer
kingbase 16476 13089 0 14:59 ? 00:00:00 kingbase: walwriter
kingbase 16477 13089 0 14:59 ? 00:00:00 kingbase: autovacuum launcher
kingbase 16478 13089 0 14:59 ? 00:00:00 kingbase: stats collector
kingbase 16479 13089 0 14:59 ? 00:00:00 kingbase: ksh writer
kingbase 16480 13089 0 14:59 ? 00:00:00 kingbase: ksh collector
kingbase 16481 13089 0 14:59 ? 00:00:00 kingbase: kwr collector
kingbase 16482 13089 0 14:59 ? 00:00:00 kingbase: logical replication launcher
kingbase 17212 12416 0 15:05 pts/0 00:00:00 ./ksql -U system test
kingbase 17214 13089 0 15:05 ? 00:00:00 kingbase: system prod [local] idle

# The database view no longer has a record for this backend process
prod=# select * from sys_stat_activity where client_addr='192.168.8.200';
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | wait
_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+---------+-----+----------+---------+------------------+-------------+-----------------+-------------+---------------+------------+-------------+--------------+-----
------------+------------+-------+-------------+--------------+-------+--------------
(0 rows)

3. OS kill -15 pid and database sys_ctl kill TERM pid

1) OS kill -15 pid

# Query the backend process status
test=# select * from sys_stat_activity where client_addr='192.168.8.200';
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | quer
y_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+---------+-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+------------+--------------
-----------------+-------------------------------+-----------------+------------+-------+-------------+--------------+--------------------------+----------------
16385 | prod | 22955 | 10 | system | kingbase_*&+_ | 192.168.8.200 | | 57498 | 2022-11-29 15:44:29.993305+08 | | 2022-11-29 15
:44:32.090913+08 | 2022-11-29 15:44:32.100617+08 | Client | ClientRead | idle | | | select count(*) from t1; | client backend
(1 row)

# Run kill -15 pid at the operating-system level to end the process
[kingbase@node2 bin]$ kill -15 22955

# As shown below, the database processes are normal and the backend process was killed safely
[kingbase@node2 bin]$ ps -ef |grep kingbase
kingbase 13089 1 0 14:39 ? 00:00:00 /opt/Kingbase/ES/V8R6_054/KESRealPro/V008R006C005B0054/Server/bin/kingbase -D /db/kingbase/v8r6_054/data
kingbase 22197 13089 0 15:38 ? 00:00:00 kingbase: checkpointer
kingbase 22198 13089 0 15:38 ? 00:00:00 kingbase: background writer
kingbase 22199 13089 0 15:38 ? 00:00:00 kingbase: walwriter
kingbase 22200 13089 0 15:38 ? 00:00:00 kingbase: autovacuum launcher
kingbase 22201 13089 0 15:38 ? 00:00:00 kingbase: stats collector
kingbase 22202 13089 0 15:38 ? 00:00:00 kingbase: ksh writer
kingbase 22203 13089 0 15:38 ? 00:00:00 kingbase: ksh collector
kingbase 22204 13089 0 15:38 ? 00:00:00 kingbase: kwr collector
kingbase 22205 13089 0 15:38 ? 00:00:00 kingbase: logical replication launcher
kingbase 22444 12416 0 15:40 pts/0 00:00:00 ./ksql -U system test
kingbase 22445 13089 0 15:40 ? 00:00:00 kingbase: system test [local] idle

# The database view no longer has a record for this backend process
test=# select * from sys_stat_activity where client_addr='192.168.8.200';
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | wait
_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+---------+-----+----------+---------+------------------+-------------+-----------------+-------------+---------------+------------+-------------+--------------+-----
------------+------------+-------+-------------+--------------+-------+--------------
(0 rows)

2) Database sys_ctl kill TERM pid

# Query the backend process status
test=# select * from sys_stat_activity where client_addr='192.168.8.200';
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | quer
y_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+---------+-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+------------+--------------
-----------------+-------------------------------+-----------------+------------+-------+-------------+--------------+--------------------------+----------------
16385 | prod | 22443 | 10 | system | kingbase_*&+_ | 192.168.8.200 | | 57494 | 2022-11-29 15:40:42.804868+08 | | 2022-11-29 15
:40:44.972533+08 | 2022-11-29 15:40:44.985340+08 | Client | ClientRead | idle | | | select count(*) from t1; | client backend
(1 row)

# Run the database command to kill the process
[kingbase@node2 bin]$ ./sys_ctl kill TERM 22443

# As shown below, the database processes are normal and the backend process was killed safely
[kingbase@node2 bin]$ ps -ef |grep kingbase
kingbase 13089 1 0 14:39 ? 00:00:00 /opt/Kingbase/ES/V8R6_054/KESRealPro/V008R006C005B0054/Server/bin/kingbase -D /db/kingbase/v8r6_054/data
kingbase 22197 13089 0 15:38 ? 00:00:00 kingbase: checkpointer
kingbase 22198 13089 0 15:38 ? 00:00:00 kingbase: background writer
kingbase 22199 13089 0 15:38 ? 00:00:00 kingbase: walwriter
kingbase 22200 13089 0 15:38 ? 00:00:00 kingbase: autovacuum launcher
kingbase 22201 13089 0 15:38 ? 00:00:00 kingbase: stats collector
kingbase 22202 13089 0 15:38 ? 00:00:00 kingbase: ksh writer
kingbase 22203 13089 0 15:38 ? 00:00:00 kingbase: ksh collector
kingbase 22204 13089 0 15:38 ? 00:00:00 kingbase: kwr collector
kingbase 22205 13089 0 15:38 ? 00:00:00 kingbase: logical replication launcher
kingbase 22444 12416 0 15:40 pts/0 00:00:00 ./ksql -U system test
kingbase 22445 13089 0 15:40 ? 00:00:00 kingbase: system test [local] idle

# The database view no longer has a record for this backend process
test=# select * from sys_stat_activity where client_addr='192.168.8.200';
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | wait
_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+---------+-----+----------+---------+------------------+-------------+-----------------+-------------+---------------+------------+-------------+--------------+-----
------------+------------+-------+-------------+--------------+-------+--------------
(0 rows)

4. OS kill -3 pid and database sys_ctl kill QUIT pid

1) OS kill -3 pid

# Query the backend process status
prod=# select * from sys_stat_activity where client_addr='192.168.8.200';
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | quer
y_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query
| backend_type
-------+---------+-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+------------+--------------
-----------------+-------------------------------+-----------------+------------+-------+-------------+--------------+-------------------------------------------------------
--------------+----------------
16385 | prod | 18666 | 10 | system | kingbase_*&+_ | 192.168.8.200 | | 57486 | 2022-11-29 15:17:34.726155+08 | | 2022-11-29 15
:17:34.736202+08 | 2022-11-29 15:17:34.740584+08 | Client | ClientRead | idle | | | select setting from pg_settings where name = 'enable_u
pper_colname' | client backend
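(1 row)

# Run kill -3 pid at the operating-system level to end the process
[root@node2 sys_log]# kill -3 18666

# As shown below, besides the targeted backend process, the other background auxiliary processes were also terminated and restarted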
[root@node2 sys_log]# ps -ef |grep kingbase
kingbase 13089 1 0 14:39 ? 00:00:00 /opt/Kingbase/ES/V8R6_054/KESRealPro/V008R006C005B0054/Server/bin/kingbase -D /db/kingbase/v8r6_054/data
kingbase 13090 13089 0 14:39 ? 00:00:00 kingbase: logger
kingbase 17212 12416 0 15:05 pts/0 00:00:00 ./ksql -U system test
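kingbase 18795 13089 0 15:18 ? 00:00:00 kingbase: startup

# Check the client session and the corresponding sys_log (terminating the backend process causes the database's auxiliary processes to restart as well)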
prod=# select * from sys_stat_activity where client_addr='192.168.8.200';
WARNING: terminating connection because of crash of another server process
DETAIL: The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
2022-11-29 15:18:06.741 CST [18666] WARNING: terminating connection because of crash of another server process
2022-11-29 15:18:06.741 CST [18666] DETAIL: The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-11-29 15:18:06.741 CST [18666] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-11-29 15:18:06.742 CST [13089] LOG: server process (PID 18666) exited with exit code 2
2022-11-29 15:18:06.742 CST [13089] DETAIL: Failed process was running: select setting from pg_settings where name = 'enable_upper_colname'
2022-11-29 15:18:06.742 CST [13089] LOG: terminating any other active server processes
2022-11-29 15:18:06.743 CST [17214] WARNING: terminating connection because of crash of another server process
2022-11-29 15:18:06.743 CST [17214] DETAIL: The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-11-29 15:18:06.743 CST [17214] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-11-29 15:18:06.744 CST [16477] WARNING: terminating connection because of crash of another server process
2022-11-29 15:18:06.744 CST [16477] DETAIL: The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-11-29 15:18:06.744 CST [16477] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-11-29 15:18:06.745 CST [13089] LOG: all server processes terminated; reinitializing
2022-11-29 15:18:06.823 CST [18795] LOG: database system was interrupted; last known up at 2022-11-29 14:59:17 CST
2022-11-29 15:19:29.897 CST [18795] LOG: database system was not properly shut down; automatic recovery in progress
2022-11-29 15:19:30.063 CST [18795] LOG: redo starts at 0/22935D8
2022-11-29 15:19:30.063 CST [18795] LOG: redo wal segment count 2
2022-11-29 15:19:30.063 CST [18795] LOG: invalid record length at 0/2293608: wanted 24, got 0
2022-11-29 15:19:30.063 CST [18795] LOG: redo done at 0/22935D8
2022-11-29 15:19:30.739 CST [13089] LOG: database system is ready to accept connections

# After the database service has restarted
[root@node2 sys_log]# ps -ef |grep kingbase
kingbase 13089 1 0 14:39 ? 00:00:00 /opt/Kingbase/ES/V8R6_054/KESRealPro/V008R006C005B0054/Server/bin/kingbase -D /db/kingbase/v8r6_054/data
kingbase 13090 13089 0 14:39 ? 00:00:00 kingbase: logger
kingbase 17212 12416 0 15:05 pts/0 00:00:00 ./ksql -U system test
kingbase 18953 13089 0 15:19 ? 00:00:00 kingbase: checkpointer
kingbase 18954 13089 0 15:19 ? 00:00:00 kingbase: background writer
kingbase 18955 13089 0 15:19 ? 00:00:00 kingbase: walwriter
kingbase 18957 13089 0 15:19 ? 00:00:00 kingbase: stats collector
kingbase 18958 13089 0 15:19 ? 00:00:00 kingbase: ksh writer
kingbase 18959 13089 0 15:19 ? 00:00:00 kingbase: ksh collector
kingbase 18960 13089 0 15:19 ? 00:00:00 kingbase: kwr collector

--- As shown above, using kill -3 pid to terminate a backend process puts the database at great risk.

2) Database sys_ctl kill QUIT pid

# Query the backend process status
test=# select * from sys_stat_activity where client_addr='192.168.8.200';
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | quer
y_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+---------+-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+------------+--------------
-----------------+-------------------------------+-----------------+------------+-------+-------------+--------------+--------------------------+----------------
16385 | prod | 21894 | 10 | system | kingbase_*&+_ | 192.168.8.200 | | 57490 | 2022-11-29 15:36:10.034020+08 | | 2022-11-29 15
:36:13.902728+08 | 2022-11-29 15:36:13.917841+08 | Client | ClientRead | idle | | | select count(*) from t1; | client backend
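(1 row)

# Run the database command sys_ctl kill QUIT to terminate the backend process
[kingbase@node2 bin]$ ./sys_ctl kill QUIT 21894

# As shown below, besides the targeted backend process, the other background auxiliary processes were also terminated and restarted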
[kingbase@node2 bin]$ ps -ef |grep kingbase
kingbase 13089 1 0 14:39 ? 00:00:00 /opt/Kingbase/ES/V8R6_054/KESRealPro/V008R006C005B0054/Server/bin/kingbase -D /db/kingbase/v8r6_054/data
kingbase 13090 13089 0 14:39 ? 00:00:00 kingbase: logger
kingbase 21895 12416 0 15:36 pts/0 00:00:00 ./ksql -U system test
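kingbase 22071 13089 0 15:37 ? 00:00:00 kingbase: startup

# Check the client session and the corresponding sys_log (terminating the backend process causes the database's auxiliary processes to restart as well)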
test=# select * from sys_stat_activity where client_addr='192.168.8.200';
WARNING: terminating connection because of crash of another server process
DETAIL: The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
2022-11-29 15:37:13.828 CST [21894] WARNING: terminating connection because of crash of another server process
2022-11-29 15:37:13.828 CST [21894] DETAIL: The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-11-29 15:37:13.828 CST [21894] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-11-29 15:37:13.829 CST [13089] LOG: server process (PID 21894) exited with exit code 2
2022-11-29 15:37:13.829 CST [13089] DETAIL: Failed process was running: select count(*) from t1;
2022-11-29 15:37:13.829 CST [13089] LOG: terminating any other active server processes
2022-11-29 15:37:13.830 CST [21896] WARNING: terminating connection because of crash of another server process
2022-11-29 15:37:13.830 CST [21896] DETAIL: The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-11-29 15:37:13.830 CST [21896] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-11-29 15:37:13.831 CST [18956] WARNING: terminating connection because of crash of another server process
2022-11-29 15:37:13.831 CST [18956] DETAIL: The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-11-29 15:37:13.831 CST [18956] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-11-29 15:37:13.833 CST [13089] LOG: all server processes terminated; reinitializing
2022-11-29 15:37:13.933 CST [22071] LOG: database system was interrupted; last known up at 2022-11-29 15:19:30 CST
2022-11-29 15:38:29.154 CST [22071] LOG: database system was not properly shut down; automatic recovery in progress
2022-11-29 15:38:29.232 CST [22071] LOG: redo starts at 0/2293680
2022-11-29 15:38:29.232 CST [22071] LOG: redo wal segment count 2
2022-11-29 15:38:29.232 CST [22071] LOG: invalid record length at 0/22936B0: wanted 24, got 0
2022-11-29 15:38:29.232 CST [22071] LOG: redo done at 0/2293680
2022-11-29 15:38:29.767 CST [13089] LOG: database system is ready to accept connections

--- As shown above, using sys_ctl kill QUIT pid to terminate a backend process puts the database at great risk.

5. OS kill -9 pid

# Query the backend process status
prod=# select * from sys_stat_activity where client_addr='192.168.8.200';
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start
| query_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query
| backend_type
-------+---------+-------+----------+---------+-------------------+---------------+-----------------+-------------+-------------------------------+--------------------------
-----+-------------------------------+-------------------------------+-----------------+---------------------+--------+-------------+--------------+-------------------------
---------+------------------------------
16385 | prod | 14114 | 10 | system | kingbase_*&+_ | 192.168.8.200 | | 57472 | 2022-11-29 14:48:06.372451+08 |
| 2022-11-29 14:50:16.618969+08 | 2022-11-29 14:50:16.627572+08 | Client | ClientRead | idle | | | select count(*) from t1;
| client backend
(1 row)

# Run kill -9 pid at the operating-system level to end the process
[root@node2 ~]# kill -9 14114

# As shown below, besides the targeted backend process, the other background auxiliary processes were also terminated and restarted
[root@node2 ~]# ps -ef |grep kingbase
.......
kingbase 13089 1 0 14:39 ? 00:00:00 /opt/Kingbase/ES/V8R6_054/KESRealPro/V008R006C005B0054/Server/bin/kingbase -D /db/kingbase/v8r6_054/data
kingbase 13090 13089 0 14:39 ? 00:00:00 kingbase: logger
kingbase 14010 12416 0 14:47 pts/0 00:00:00 ./ksql -U system test
kingbase 15651 13089 0 14:57 ? 00:00:00 kingbase: startup

# Check the corresponding sys_log (terminating the backend process causes the database's auxiliary processes to restart as well)
2022-11-29 14:58:00.093 CST [13089] LOG: server process (PID 14114) was terminated by signal 9: Killed
2022-11-29 14:58:00.093 CST [13089] DETAIL: Failed process was running: select count(*) from t1;
2022-11-29 14:58:00.093 CST [13089] LOG: terminating any other active server processes
2022-11-29 14:58:00.093 CST [14602] WARNING: terminating connection because of crash of another server process
2022-11-29 14:58:00.093 CST [14602] DETAIL: The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-11-29 14:58:00.093 CST [14602] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-11-29 14:58:00.095 CST [13095] WARNING: terminating connection because of crash of another server process
2022-11-29 14:58:00.095 CST [13095] DETAIL: The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-11-29 14:58:00.095 CST [13095] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-11-29 14:58:00.384 CST [13089] LOG: all server processes terminated; reinitializing
2022-11-29 14:58:00.464 CST [15651] LOG: database system was interrupted; last known up at 2022-11-29 14:53:42 CST
2022-11-29 14:59:16.706 CST [15651] LOG: database system was not properly shut down; automatic recovery in progress
2022-11-29 14:59:16.806 CST [15651] LOG: redo starts at 0/2293488
2022-11-29 14:59:16.806 CST [15651] LOG: redo wal segment count 2
2022-11-29 14:59:16.806 CST [15651] LOG: invalid record length at 0/2293560: wanted 24, got 0
2022-11-29 14:59:16.806 CST [15651] LOG: redo done at 0/2293530
2022-11-29 14:59:17.563 CST [13089] LOG: database system is ready to accept connections

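--- As shown above, using kill -9 pid to terminate a backend process puts the database at great risk.

III. Summary

An abnormal backend process in a KingbaseES database can be terminated manually, but the method chosen determines how much risk the database is exposed to:

1) Relatively safe methods: pg_terminate_backend(pid); kill pid; kill -15 pid; sys_ctl kill TERM pid.
2) Unsafe methods: kill -3 pid; sys_ctl kill QUIT pid; kill -9 pid.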
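The two groups above can be encoded in a small guard so that an unsafe signal is never sent to a backend by accident. This is a minimal sketch: `classify_signal` and `safe_kill` are hypothetical names, and the classification simply mirrors this summary (a plain `kill pid` sends SIGTERM by default, so an empty signal counts as safe).

```shell
# Sketch only: both function names are hypothetical.
# Classify a signal name or number according to the summary above.
classify_signal() {
    case "$1" in
        TERM|15|"")    echo safe ;;     # plain `kill pid` sends SIGTERM
        QUIT|3|KILL|9) echo unsafe ;;   # these trigger crash recovery
        *)             echo unknown ;;
    esac
}

# Send SIGTERM to a backend pid, refusing anything not classified as safe.
safe_kill() {
    pid="$1"
    sig="${2:-TERM}"
    if [ "$(classify_signal "$sig")" != safe ]; then
        echo "refusing to send $sig to backend $pid" >&2
        return 1
    fi
    kill -s TERM "$pid"
}

# Show how each signal used in this article is classified.
for s in 15 TERM 3 QUIT 9; do
    printf '%s -> %s\n' "$s" "$(classify_signal "$s")"
done
```

Signals outside the two lists are reported as unknown and refused as well, which keeps the wrapper conservative.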

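Note: never use kill -9. SIGKILL cannot be handled by a signal handler, so the OS stops the process immediately. When the Kingbase parent process detects that a child process has exited abnormally, it stops all processes, releases shared memory, allocates shared memory again, and brings all processes back up. The effect is the same as a crash restart: startup needs time for redo, which can interrupt service for several minutes. (Do not use kill -9 unless the consequences are acceptable.)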
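The difference between a catchable SIGTERM and an uncatchable SIGKILL can be seen with a plain shell process standing in for a backend. This is only an analogy under stated assumptions: it does not reproduce KingbaseES's own signal handling, and `run_child` is a hypothetical name.

```shell
# Sketch only: a plain shell child stands in for a backend process.
# The child installs a SIGTERM handler, then loops until it is signalled.
run_child() {
    sig="$1"
    log=$(mktemp)
    sh -c 'trap "echo cleaned-up; exit 0" TERM
           echo ready
           while :; do sleep 1; done' > "$log" &
    pid=$!
    # Wait until the child has installed its handler and reported "ready".
    until grep -q ready "$log"; do sleep 1; done
    kill -s "$sig" "$pid"
    wait "$pid" 2>/dev/null
    cat "$log"
    rm -f "$log"
}

echo "--- TERM ---"; run_child TERM   # handler runs: prints ready, then cleaned-up
echo "--- KILL ---"; run_child KILL   # no handler runs: prints ready only
```

With TERM the child gets the chance to clean up before exiting. With KILL it simply disappears, which is the situation the parent process has to treat as an abnormal exit.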
