Finding the source of signals on Linux with strace, auditd, or systemtap
inux and UNIX® like operating systems commonly use signals to communicate between processes. The use of the command line kill is widely known. WebSphere Application Servers on Linux and UNIX by default respond to kill -3 by producing a javacore, and to kill -11 by creating s system core and exiting. There are in fact a lot of signals that may be sent and acted on.
In some cases, we determine that a signal has unexpectedly come to a WebSphere Application Server and we need to determine which process/user sent the signal. This is possible in most cases with strace command for kill -3, but kill -9 and kill -11 are not usually reported.
The strace utility is fairly universal and starting it with this line will generally find the source of kill -3 and so on:
strace -tt -o /tmp/traceit -p <pid> &
This results in volumes of output that do include the source of most signals:
strace -tt -o /tmp/traceit -p <pid> &
16:08:45.388961 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:08:45.389113 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=21398, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
16:09:01.210200 --- SIGTTOU {si_signo=SIGTTOU, si_code=SI_USER, si_pid=829, si_uid=1000} ---
In case you do not recognize SIGTTOU use kill -l to list signals on your environment:
kill -l
1) SIGHUP 2) SIGINT 3) SIGQUIT 4) SIGILL 5) SIGTRAP
6) SIGABRT 7) SIGBUS 8) SIGFPE 9) SIGKILL 10) SIGUSR1
11) SIGSEGV 12) SIGUSR2 13) SIGPIPE 14) SIGALRM 15) SIGTERM
16) SIGSTKFLT 17) SIGCHLD 18) SIGCONT 19) SIGSTOP 20) SIGTSTP
21) SIGTTIN 22) SIGTTOU 23) SIGURG 24) SIGXCPU 25) SIGXFSZ
26) SIGVTALRM 27) SIGPROF 28) SIGWINCH 29) SIGIO 30) SIGPWR
31) SIGSYS 34) SIGRTMIN 35) SIGRTMIN+1 36) SIGRTMIN+2 37) SIGRTMIN+3
38) SIGRTMIN+4 39) SIGRTMIN+5 40) SIGRTMIN+6 41) SIGRTMIN+7 42) SIGRTMIN+8
43) SIGRTMIN+9 44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12 47) SIGRTMIN+13
48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14 51) SIGRTMAX-13 52) SIGRTMAX-12
53) SIGRTMAX-11 54) SIGRTMAX-10 55) SIGRTMAX-9 56) SIGRTMAX-8 57) SIGRTMAX-7
58) SIGRTMAX-6 59) SIGRTMAX-5 60) SIGRTMAX-4 61) SIGRTMAX-3 62) SIGRTMAX-2
63) SIGRTMAX-1 64) SIGRTMAX
which may be a surprise if you never used anything but 3, 9, and 11. kill -22 is SIGTHOU and the process id and userid of the sender are listed. Unfortunately, most of the time strace does not show kill -9 and kill -11 as they are not trapped and all you get is this line:
++++ killed by SIGKILL +++
There are 2 available tools that are not usually installed and/or active on Linux but have so much functionality, they should be. These tools are included in the Linux repositories for the RHEL, SUSE, and Fedora distributions and are installed as any other software package would be using the usual Linux install tools. Since they are very functional at the system level, root or elevated access rights are needed. However, the install process is quite simple and the functionality is worthwhile.
AUDIT
Auditd is a daemon process or service that does as the name implies and produces audit logs of System level activities. It is installed from the usual repository as the audit package and then is configured in /etc/audit/auditd.conf and the rules are in /etc/audit/audit.rules.
Example entry for kill signal logging:
-a entry,always -F arch=b64 -S kill -k kill_signals
then the command: sevice auditd start
will log all signals in /ver/audit/audit.log with a key of kill_signals for searching by your favorite editor or you may use ausearch -k kill_signals
Of course, this example captures all signals and is quite verbose. The usual output will look like this:
time->Wed Jun 3 16:34:08 2015
type=SYSCALL msg=audit(1433363648.091:6342): arch=c000003e syscall=62 success=no exit=-3 a0=1e06 a1=0 a2=1e06 a3=fffffffffffffff0 items=0 ppid=10044 pid=10140 auid=500 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=2 comm=4174746163682041504920696E6974 exe="/opt/ibm/WebSphere/AppServer/java/jre/bin/java" subj=unconfined_u:unconfined_r:unconfined_java_t:s0-s0:c0.c1023 key="kill_signals"
----
time->Wed Jun 3 16:34:08 2015
type=OBJ_PID msg=audit(1433363648.130:6343): opid=27307 oauid=-1 ouid=0 oses=-1 obj=system_u:system_r:initrc_t:s0 ocomm="symcfgd"
type=SYSCALL msg=audit(1433363648.130:6343): arch=c000003e syscall=62 success=yes exit=0 a0=6aab a1=12 a2=f a3=50d items=0 ppid=1 pid=27214 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="sav-limitcpu" exe="/usr/bin/sav-limitcpu" subj=system_u:system_r:initrc_t:s0 key="kill_signals"
----
time->Wed Jun 3 16:34:08 2015
Stop the logging with service auditd stop command and see this link from RedHat for more information: How to use audit to monitor a specific SYSCALL
System Tap
This tool is relatively more complex and flexible than the audit tool. The tool provide probe and taps that are written in a script that is remarkably C like. It is similar to Dtrace on Solaris in that regard. It is also similar to Dtrace in that it offers a lot of probes to look at performance and memory as well as network activity. It too is easily installed (for example on RHEL yum install systemtap does it). Root access does seem to be required. Good news, it comes with a set of taps that will perform a comprehesive set of tracing. These live in /usr/share/systemtap. Root access is required or you may be a member of a group with the privileges.
The basic command:
stap sigkill.stp gets very verbose
even on lab systems while the same script can be filtered. An example to trace kill commands for a specific pid and a specific command:
stap sigkill.stp -x <pid> SIGKILL
which logs:
SIGKILL was sent to java (pid:<pid>) by bash uid:0
on testing on a command sent from the command line.
So you do need the script sigkill.stp which is created by RedHat and looks like this:
#! /usr/bin/env stap
# sigkill.stp
# Copyright (C) 2007 Red Hat, Inc., Eugene Teo <eteo@redhat.com>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License version 2 as
# published by the Free Software Foundation.
#
# /usr/share/systemtap/tapset/signal.stp:
# [...]
# probe signal.send = _signal.send.*
# {
# sig=$sig
# sig_name = _signal_name($sig)
# sig_pid = task_pid(task)
# pid_name = task_execname(task)
# [...]
probe signal.send {
if (sig_name == "SIGKILL")
printf("%s was sent to %s (pid:%d) by %s uid:%d\n",
sig_name, pid_name, sig_pid, execname(), uid())
}
Here is a very useful link for System Tap. It shows some useful tools for tracking down most signals (strace) or all of them (audit and system tap):
Red Hat Enterprise Linux 6 SystemTap Beginners Guide Introduction to SystemTap
Finding the source of signals on Linux with strace, auditd, or systemtap的更多相关文章
- Linux利器 strace [看出process呼叫哪個system call]
Linux利器 strace strace常用来跟踪进程执行时的系统调用和所接收的信号. 在Linux世界,进程不能直接访问硬件设备,当进程需要访问硬件设备(比如读取磁盘文件,接收网络数据等等)时,必 ...
- linux神器strace
man strace: strace - trace system calls and signals DESCRIPTION In the simplest case strace runs the ...
- linux神器 strace解析
除了人格以外,人最大的损失,莫过于失掉自信心了. 前言 strace可以说是神器一般的存在了,对于研究代码调用,内核级调用.系统级调用有非常重要的作用.打算了一周了,只有原文,一直没有梳理,拖延症犯了 ...
- linux申请strace ,lstrace, ptrace, dtrace
ltrace命令是用来跟踪进程调用库函数的情况. ltrace -hUsage: ltrace [option ...] [command [arg ...]]Trace library calls ...
- Linux 的 strace 命令
https://linux.cn/article-3935-1.html http://www.cnblogs.com/ggjucheng/archive/2012/01/08/2316692.htm ...
- Linux调试工具strace和gdb常用命令小结
strace和gdb是Linux环境下的两个常用调试工具,这里是个人在使用过程中对这两个工具常用参数的总结,留作日后查看使用. strace调试工具 strace工具用于跟踪进程执行时的系统调用和所接 ...
- linux下strace命令详解
简介 strace常用来跟踪进程执行时的系统调用和所接收的信号. 在Linux世界,进程不能直接访问硬件设备,当进程需要访问硬件设备(比如读取磁盘文件,接收网络数据等等)时,必须由用户态模式切换至内核 ...
- 使用 Linux 的 strace 命令跟踪/调试程序的常用选项
原文:http://linoxide.com/linux-command/linux-strace-command-examples/作者: Raghu 在调试的时候,strace能帮助你追踪到一个程 ...
- Linux利器strace
strace常用来跟踪进程执行时的系统调用和所接收的信号. 在Linux世界,进程不能直接访问硬件设备,当进程需要访问硬件设备(比如读取磁盘文件,接收网络数据等等)时,必须由用户态模式切换至内核态模式 ...
随机推荐
- Java反射机制二 获取方法的返回值或参数的泛型信息
在使用反射机制时,我们经常需要知道方法的参数和返回值类型,很简单 ,下面上示例,示例中的两个方法非常相似 package deadLockThread; import java.lang.refle ...
- Spring总结 3.AOP(xml)
本随笔内容要点如下: 什么是AOP AOP术语解释 Spring中AOP的xml实现 一.什么是AOP AOP(Aspect Oriented Programming),即面向切面编程.那什么是面向切 ...
- Tomcat学习总结(10)——Tomcat多实例冗余部署
昨天在跟群友做技术交流的时候,了解到,有很多大公司都是采用了高可用的,分布式的,实例沉余1+台.但是在小公司的同学也很多,他们反映并不是所有公司都有那样的资源来供你调度.往往公司只会给你一台机器,因为 ...
- UVa 10129 Play on Words(并查集+欧拉路径)
题目链接: https://cn.vjudge.net/problem/UVA-10129 Some of the secret doors contain a very interesting wo ...
- UVA 227 Puzzle(基础字符串处理)
题目链接: https://cn.vjudge.net/problem/UVA-227 /* 问题 输入一个5*5的方格,其中有一些字母填充,还有一个空白位置,输入一连串 的指令,如果指令合法,能够得 ...
- 【MongoDB学习-在.NET中的简单操作】
1.新建MVC项目, 管理NuGet包,进入下载MongDB.net库文件 2.新增项目DAL数据访问层,引用以下库文件: 3.C# 访问MongoDB通用方法类: using MongoDB.Dri ...
- EWS 流通知订阅邮件
摘要 查找一些关于流通知订阅邮件的资料,这里整理一下. 核心代码块 using System; using System.Collections.Generic; using System.Linq; ...
- JS forEach()与map() 用法(转载)
JavaScript中的数组遍历forEach()与map()方法以及兼容写法 原理: 高级浏览器支持forEach方法语法:forEach和map都支持2个参数:一个是回调函数(item,ind ...
- 37.Linux驱动调试-根据oops的栈信息,确定函数调用过程
上章链接入口: http://www.cnblogs.com/lifexy/p/8006748.html 在上章里,我们分析了oops的PC值在哪个函数出错的 本章便通过栈信息来分析函数调用过程 1. ...
- Java开发中常用的设计模式(一)---工厂模式
一. 准备工作 1. 本文参考自 自己理解的工厂模式,希望对大家有所帮助 二. 开始 以汽车工厂为例,首先有个汽车类的接口 Car,里面有个开车的方法 drive(),然后有个宝马车的类 BMW 和 ...