inux and UNIX® like operating systems commonly use signals to communicate between processes. The use of the command line kill is widely known. WebSphere Application Servers on Linux and UNIX by default respond to kill -3 by producing a javacore, and to kill -11 by creating s system core and exiting. There are in fact a lot of signals that may be sent and acted on.

In some cases, we determine that a signal has unexpectedly come to a WebSphere Application Server and we need to determine which process/user sent the signal. This is possible in most cases with strace command for kill -3, but kill -9 and kill -11 are not usually reported.

The strace utility is fairly universal and starting it with this line will generally find the source of kill -3 and so on:

strace -tt -o /tmp/traceit -p <pid> &

This results in volumes of output that do include the source of most signals:

strace -tt -o /tmp/traceit -p <pid> &
16:08:45.388961 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:08:45.389113 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=21398, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---

16:09:01.210200 --- SIGTTOU {si_signo=SIGTTOU, si_code=SI_USER, si_pid=829, si_uid=1000} ---

In case you do not recognize SIGTTOU   use kill -l to list signals on your environment:

kill -l

1) SIGHUP 2) SIGINT 3) SIGQUIT 4) SIGILL 5) SIGTRAP

6) SIGABRT 7) SIGBUS 8) SIGFPE 9) SIGKILL 10) SIGUSR1

11) SIGSEGV 12) SIGUSR2 13) SIGPIPE 14) SIGALRM 15) SIGTERM

16) SIGSTKFLT 17) SIGCHLD 18) SIGCONT 19) SIGSTOP 20) SIGTSTP

21) SIGTTIN 22) SIGTTOU 23) SIGURG 24) SIGXCPU 25) SIGXFSZ

26) SIGVTALRM 27) SIGPROF 28) SIGWINCH 29) SIGIO 30) SIGPWR

31) SIGSYS 34) SIGRTMIN 35) SIGRTMIN+1 36) SIGRTMIN+2 37) SIGRTMIN+3

38) SIGRTMIN+4 39) SIGRTMIN+5 40) SIGRTMIN+6 41) SIGRTMIN+7 42) SIGRTMIN+8

43) SIGRTMIN+9 44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12 47) SIGRTMIN+13

48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14 51) SIGRTMAX-13 52) SIGRTMAX-12

53) SIGRTMAX-11 54) SIGRTMAX-10 55) SIGRTMAX-9 56) SIGRTMAX-8 57) SIGRTMAX-7

58) SIGRTMAX-6 59) SIGRTMAX-5 60) SIGRTMAX-4 61) SIGRTMAX-3 62) SIGRTMAX-2

63) SIGRTMAX-1 64) SIGRTMAX

which may be a surprise if you never used anything but 3, 9, and 11.   kill -22 is SIGTHOU and the process id and userid of the sender are listed. Unfortunately, most of the time strace does not show kill -9 and kill -11 as they are not trapped and all you get is this line:

++++  killed by SIGKILL  +++

There are 2 available tools that are not usually installed and/or active on Linux but have so much functionality, they should be. These tools are included in the Linux repositories for the RHEL, SUSE, and Fedora distributions and are installed as any other software package would be using the usual Linux install tools. Since they are very functional at the system level, root or elevated access rights are needed. However, the install process is quite simple and the functionality is worthwhile.

AUDIT

Auditd is a daemon process or service that does as the name implies and produces audit logs of System level activities. It is installed from the usual repository as the audit package and then is configured in /etc/audit/auditd.conf and the rules are in /etc/audit/audit.rules.

Example entry for kill signal logging:

-a entry,always -F arch=b64 -S kill -k kill_signals

then the command: sevice auditd start

will log all signals in /ver/audit/audit.log with a key of kill_signals for searching by your favorite editor or you may use ausearch -k kill_signals

Of course, this example captures all signals and is quite verbose. The usual output will look like this:

time->Wed Jun  3 16:34:08 2015
type=SYSCALL msg=audit(1433363648.091:6342): arch=c000003e syscall=62 success=no exit=-3 a0=1e06 a1=0 a2=1e06 a3=fffffffffffffff0 items=0 ppid=10044 pid=10140 auid=500 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=2 comm=4174746163682041504920696E6974 exe="/opt/ibm/WebSphere/AppServer/java/jre/bin/java" subj=unconfined_u:unconfined_r:unconfined_java_t:s0-s0:c0.c1023 key="kill_signals"
----
time->Wed Jun  3 16:34:08 2015
type=OBJ_PID msg=audit(1433363648.130:6343): opid=27307 oauid=-1 ouid=0 oses=-1 obj=system_u:system_r:initrc_t:s0 ocomm="symcfgd"
type=SYSCALL msg=audit(1433363648.130:6343): arch=c000003e syscall=62 success=yes exit=0 a0=6aab a1=12 a2=f a3=50d items=0 ppid=1 pid=27214 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="sav-limitcpu" exe="/usr/bin/sav-limitcpu" subj=system_u:system_r:initrc_t:s0 key="kill_signals"
----
time->Wed Jun  3 16:34:08 2015

Stop the logging with service auditd stop command and see this link from RedHat for more information: How to use audit to monitor a specific SYSCALL

System Tap

This tool is relatively more complex and flexible than the audit tool. The tool provide probe and taps that are written in a script that is remarkably C like. It is similar to Dtrace on Solaris in that regard. It is also similar to Dtrace in that it offers a lot of probes to look at performance and memory as well as network activity. It too is easily installed (for example on RHEL yum install systemtap does it). Root access does seem to be required. Good news, it comes with a set of taps that will perform a comprehesive set of tracing. These live in /usr/share/systemtap. Root access is required or you may be a member of a group with the privileges.

The basic command:

stap sigkill.stp gets very verbose

even on lab systems while the same script can be filtered. An example to trace kill commands for a specific pid and a specific command:

stap sigkill.stp -x <pid> SIGKILL

which logs:

SIGKILL was sent to java (pid:<pid>) by bash uid:0

on testing on a command sent from the command line.

So you do need the script sigkill.stp which is created by RedHat and looks like this:

#! /usr/bin/env stap
# sigkill.stp
# Copyright (C) 2007 Red Hat, Inc., Eugene Teo <eteo@redhat.com>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License version 2 as
# published by the Free Software Foundation.
#
# /usr/share/systemtap/tapset/signal.stp:
# [...]
# probe signal.send = _signal.send.*
# {
#     sig=$sig
#     sig_name = _signal_name($sig)
#     sig_pid = task_pid(task)
#     pid_name = task_execname(task)
# [...]
probe signal.send {
  if (sig_name == "SIGKILL")
    printf("%s was sent to %s (pid:%d) by %s uid:%d\n",
           sig_name, pid_name, sig_pid, execname(), uid())
}

Here is a very useful link for System Tap. It shows some useful tools for tracking down most signals (strace) or all of them (audit and system tap):
Red Hat Enterprise Linux 6 SystemTap Beginners Guide Introduction to SystemTap

https://www.ibm.com/developerworks/community/blogs/aimsupport/entry/Finding_the_source_of_signals_on_Linux_with_strace_auditd_or_Systemtap?lang=en

Finding the source of signals on Linux with strace, auditd, or systemtap的更多相关文章

  1. Linux利器 strace [看出process呼叫哪個system call]

    Linux利器 strace strace常用来跟踪进程执行时的系统调用和所接收的信号. 在Linux世界,进程不能直接访问硬件设备,当进程需要访问硬件设备(比如读取磁盘文件,接收网络数据等等)时,必 ...

  2. linux神器strace

    man strace: strace - trace system calls and signals DESCRIPTION In the simplest case strace runs the ...

  3. linux神器 strace解析

    除了人格以外,人最大的损失,莫过于失掉自信心了. 前言 strace可以说是神器一般的存在了,对于研究代码调用,内核级调用.系统级调用有非常重要的作用.打算了一周了,只有原文,一直没有梳理,拖延症犯了 ...

  4. linux申请strace ,lstrace, ptrace, dtrace

    ltrace命令是用来跟踪进程调用库函数的情况. ltrace -hUsage: ltrace [option ...] [command [arg ...]]Trace library calls ...

  5. Linux 的 strace 命令

    https://linux.cn/article-3935-1.html http://www.cnblogs.com/ggjucheng/archive/2012/01/08/2316692.htm ...

  6. Linux调试工具strace和gdb常用命令小结

    strace和gdb是Linux环境下的两个常用调试工具,这里是个人在使用过程中对这两个工具常用参数的总结,留作日后查看使用. strace调试工具 strace工具用于跟踪进程执行时的系统调用和所接 ...

  7. linux下strace命令详解

    简介 strace常用来跟踪进程执行时的系统调用和所接收的信号. 在Linux世界,进程不能直接访问硬件设备,当进程需要访问硬件设备(比如读取磁盘文件,接收网络数据等等)时,必须由用户态模式切换至内核 ...

  8. 使用 Linux 的 strace 命令跟踪/调试程序的常用选项

    原文:http://linoxide.com/linux-command/linux-strace-command-examples/作者: Raghu 在调试的时候,strace能帮助你追踪到一个程 ...

  9. Linux利器strace

    strace常用来跟踪进程执行时的系统调用和所接收的信号. 在Linux世界,进程不能直接访问硬件设备,当进程需要访问硬件设备(比如读取磁盘文件,接收网络数据等等)时,必须由用户态模式切换至内核态模式 ...

随机推荐

  1. javascript实现代码高亮-wangHighLighter.js

    1. 引言 (先贴出wangHighLighter.js的github地址:https://github.com/wangfupeng1988/wangHighLighter注意,程序和使用说明的更新 ...

  2. C++版Hello World

    代码 #include <iostream> using namespace std; int main() { cout << ; } 开头那两句代码 暂时先记住吧 #inc ...

  3. Java NIO系列教程(九) ServerSocketChannel

    Java NIO中的 ServerSocketChannel 是一个可以监听新进来的TCP连接的通道, 就像标准IO中的ServerSocket一样.ServerSocketChannel类在 jav ...

  4. java虚拟机学习-触摸java常量池(13-1)

    java虚拟机学习-深入理解JVM(1) java虚拟机学习-慢慢琢磨JVM(2) java虚拟机学习-慢慢琢磨JVM(2-1)ClassLoader的工作机制 java虚拟机学习-JVM内存管理:深 ...

  5. elasticSearch6源码分析(5)gateway模块

    1.gateway概述 The local gateway module stores the cluster state and shard data across full cluster res ...

  6. dom操作------创建节点/插入节点

    <section> <div id="box" style="position: relative;"> <p id=" ...

  7. C#使用命令编译代码

    1.在路径%SystemRoot%\Microsoft.NET\Framework\vX.X.X(安装的.net版本号)下找到csc.exe,在cmd窗口cd到该路径下. ps(在该路径下有一个CSC ...

  8. Java集合类源码解析:LinkedHashMap

    前言 今天继续学习关于Map家族的另一个类 LinkedHashMap .先说明一下,LinkedHashMap 是继承于 HashMap 的,所以本文只针对 LinkedHashMap 的特性学习, ...

  9. java.lang.ClassCastException: java.lang.Short cannot be cast to java.lang.String(Short类型无法强转成String类型)

    有一行Java代码如下: String code1 = (String)qTable1.getValueAt(i, 0); 这是一个Java的图形界面获取表格中值的代码,其中qTable1.getVa ...

  10. js获取指定格式的时间字符串

    如下: // 对Date的扩展,将 Date 转化为指定格式的String // 月(M).日(d).小时(h).分(m).秒(s).季度(q) 可以用 1-2 个占位符, // 年(y)可以用 1- ...