#Locating System Performance Bottlenecks# strace & ltrace
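strace and ltrace trace system calls and library function calls, respectively. Timekeeping is a good illustration of the difference between the two.
On some operating systems, special cases such as the switch between standard time and daylight saving time are handled by the kernel or require manual intervention. UNIX is different: it provides a single system call that returns the number of seconds elapsed since 00:00:00 UTC on January 1, 1970. Any interpretation of that value, such as converting it to a human-readable date and time in the local time zone, is left to the user process.
The standard C library provides a number of routines that handle most cases, including details such as the various daylight saving time algorithms.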
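The split described above can be sketched in Python. This is only an illustration: `time.time()` stands in for the raw seconds-since-the-Epoch value, `interpret_epoch_seconds` is a hypothetical helper name, and the conversion to calendar fields happens entirely in user space.

```python
import time


def interpret_epoch_seconds(seconds):
    """Turn a raw seconds-since-the-Epoch value into readable forms (user-space work)."""
    fmt = "%Y-%m-%d %H:%M:%S"
    local_fields = time.localtime(seconds)  # applies the local time zone and DST rules
    return {
        "utc": time.strftime(fmt, time.gmtime(seconds)),
        "local": time.strftime(fmt, local_fields),
        "dst_in_effect": local_fields.tm_isdst > 0,
    }


if __name__ == "__main__":
    now = time.time()  # the raw value: just a count of seconds
    print(int(now), interpret_epoch_seconds(now))
```

The raw value carries no time zone or daylight saving information; everything human-readable is derived from it by library code.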
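An application can call either a system call or a library function, and many library functions in turn invoke system calls. This is shown in Figure 1-2.
Another difference between system calls and library functions is that system calls usually provide a minimal interface, whereas library functions often provide more elaborate functionality. We can see this in the difference between the sbrk system call and the malloc library function. We will see it again later when comparing the unbuffered I/O functions (see Chapter 3) with the standard I/O functions (see Chapter 5).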
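The "minimal interface versus richer library layer" idea can be illustrated with a small Python analogy. `CountingRaw` is a hypothetical stand-in for a file descriptor that counts how often its low-level `write` is invoked; `io.BufferedWriter` plays the role of the standard I/O library layered on top. No real system calls are counted here, so treat it as a sketch of the principle only.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical stand-in for a file descriptor: counts low-level writes."""

    def __init__(self):
        super().__init__()
        self.write_calls = 0
        self.data = bytearray()

    def writable(self):
        return True

    def write(self, chunk):
        self.write_calls += 1
        self.data += bytes(chunk)
        return len(chunk)


def unbuffered_calls(payload):
    """Write one byte at a time straight to the low-level interface."""
    raw = CountingRaw()
    for i in range(len(payload)):
        raw.write(payload[i:i + 1])
    return raw.write_calls


def buffered_calls(payload):
    """Write one byte at a time through a buffering library layer."""
    raw = CountingRaw()
    writer = io.BufferedWriter(raw, buffer_size=4096)
    for i in range(len(payload)):
        writer.write(payload[i:i + 1])
    writer.flush()  # push whatever is still buffered down to the raw layer
    return raw.write_calls


if __name__ == "__main__":
    payload = b"x" * 1000
    print(unbuffered_calls(payload), buffered_calls(payload))
```

With a payload smaller than the buffer, the buffered path reaches the low-level layer once, while the unbuffered path reaches it once per byte. This is the kind of difference that shows up as a long run of tiny read or write calls in an strace log.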
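The process control system calls (fork, exec, and wait) are usually invoked directly by the user's application (recall the basic shell in Program 1-5).
To simplify some common cases, however, the UNIX system also provides library functions such as system and popen. Section 8.12 shows an implementation of the system function built on the basic process control system calls, and Section 10.18 enhances that example to handle signals correctly.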
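A minimal sketch of how a system-like helper can be layered on fork, exec, and wait, written in Python for brevity. The name `run_via_shell` is hypothetical, `/bin/sh` is assumed to exist, and signal handling (the part the text says needs extra care) is deliberately left out.

```python
import os


def run_via_shell(command):
    """Run command through /bin/sh and return its exit status (no signal handling)."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the shell.
        try:
            os.execv("/bin/sh", ["sh", "-c", command])
        finally:
            # Reached only if execv failed; 127 mirrors the shell convention.
            os._exit(127)
    # Parent: wait for that specific child to finish.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # terminated by a signal


if __name__ == "__main__":
    print(run_via_shell("echo hello from the child"))
```

Running this under `strace -f` should show the process-creation, exec, and wait system calls that the helper hides from its caller.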
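To describe the UNIX system interface that most programmers use, we have to cover both the system calls and some of the library functions.
If we described only the sbrk system call, for example, we would skip the malloc library function that many applications use.
Except where the two must be distinguished, this text uses the term function to refer to both system calls and library functions.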
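In general, a user process cannot access kernel memory or call kernel functions directly. The CPU hardware enforces this (which is why the mode is called "protected mode"). System calls are the exception to this rule.
The process first fills the registers with the appropriate values and then executes a special instruction, which jumps to a predefined location in the kernel (this location is, of course, readable by user processes but not writable). On Intel CPUs this is implemented with interrupt 0x80.
The hardware knows that once you jump to this location you are no longer a user running in restricted mode but the operating system kernel, so you are allowed to do whatever you want.
The kernel location a process can jump to is called system_call. The procedure there checks the system call number, which tells the kernel which service the process is requesting. It then looks up the system call table (sys_call_table) to find the entry address of the kernel function to call.
Next it calls that function and, after it returns, performs a few system checks and finally returns to the process (or to another process, if this process's time slice has run out). If you want to read this code, it is in <kernel source directory>/kernel/entry.S, just after the line ENTRY(system_call).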
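The number-to-handler lookup can be modelled with a toy dispatcher. Everything here is hypothetical (the handler names, the numbering, the returned values); it is a user-space simulation of the table lookup, not kernel code. The one detail taken from real Linux behaviour is that an unknown number yields -ENOSYS.

```python
import errno


def toy_getpid():
    return 4242  # made-up process id


def toy_getuid():
    return 1000  # made-up user id


# Hypothetical numbering; real system call tables are architecture-specific.
TOY_SYS_CALL_TABLE = {
    0: toy_getpid,
    1: toy_getuid,
}


def toy_system_call(number, *args):
    """Look the number up in the table and call the matching handler."""
    handler = TOY_SYS_CALL_TABLE.get(number)
    if handler is None:
        return -errno.ENOSYS  # no such service
    return handler(*args)


if __name__ == "__main__":
    print(toy_system_call(0), toy_system_call(1), toy_system_call(99))
```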
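To avoid confusion with normal return values, a system call does not return the error code directly; it stores the code in a global variable named errno.
If a system call fails, you can read errno to find out what went wrong.
The error messages corresponding to the different errno values are defined in errno.h. You can also view them with the command "man 3 errno".
Note that errno is set only when a function fails. If no error occurs, the value of errno is undefined; it is not reset to 0. In addition, it is best to copy errno into another variable before handling it, because during error handling even a function like printf() can change errno if it fails.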
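The same discipline can be shown from Python, where a failed system call surfaces as an `OSError` whose `errno` attribute holds the code captured at the moment of failure. `probe` is a hypothetical helper; it copies the code into its own variable before doing anything else, and reports no code at all on success rather than pretending errno is 0.

```python
import errno
import os


def probe(path):
    """Try to open path read-only; return (ok, errno_value, message)."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        saved_errno = exc.errno  # keep our own copy before any further calls
        return False, saved_errno, os.strerror(saved_errno)
    os.close(fd)
    return True, None, ""  # success: there is no meaningful errno to report


if __name__ == "__main__":
    ok, code, message = probe("/foo/bar")
    if not ok:
        print(errno.errorcode.get(code, code), message)
```

The symbolic name printed here (for example ENOENT) is the same symbol strace appends after a failing call's return value.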
yum install strace ltrace
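Next I quote each command's description from its man page, because the man pages are already very well written. Reading English may be slower, but working through the official man page properly is better than reading third- or fourth-hand reposts from the web. The excerpts are included here not to pad the article but to push the reader to actually read the official English documentation.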
man strace
DESCRIPTION
In the simplest case strace runs the specified command until it exits. It intercepts and records the system
calls which are called by a process and the signals which are received by a process. The name of each system
call, its arguments and its return value are printed on standard error or to the file specified with the
-o option.
strace is a useful diagnostic, instructional, and debugging tool. System administrators, diagnosticians and
trouble-shooters will find it invaluable for solving problems with programs for which the source is not readily
available since they do not need to be recompiled in order to trace them. Students, hackers and the
overly-curious will find that a great deal can be learned about a system and its system calls by tracing even ordinary
programs. And programmers will find that since system calls and signals are events that happen at the
user/kernel interface, a close examination of this boundary is very useful for bug isolation, sanity checking
and attempting to capture race conditions.
Each line in the trace contains the system call name, followed by its arguments in parentheses and its return
value. An example from stracing the command ‘‘cat /dev/null’’ is:
open("/dev/null", O_RDONLY) = 3
Errors (typically a return value of -1) have the errno symbol and error string appended.
open("/foo/bar", O_RDONLY) = -1 ENOENT (No such file or directory)
Signals are printed as a signal symbol and a signal string. An excerpt from stracing and interrupting the
command ‘‘sleep 666’’ is:
sigsuspend([] <unfinished ...>
--- SIGINT (Interrupt) ---
+++ killed by SIGINT +++
If a system call is being executed and meanwhile another one is being called from a different thread/process
then strace will try to preserve the order of those events and mark the ongoing call as being unfinished. When
the call returns it will be marked as resumed.
[pid 28772] select(4, [3], NULL, NULL, NULL <unfinished ...>
[pid 28779] clock_gettime(CLOCK_REALTIME, {1130322148, 939977000}) = 0
[pid 28772] <... select resumed> ) = 1 (in [3])
Interruption of a (restartable) system call by a signal delivery is processed differently as kernel terminates
the system call and also arranges its immediate reexecution after the signal handler completes.
read(0, 0x7ffff72cf5cf, 1) = ? ERESTARTSYS (To be restarted)
--- SIGALRM (Alarm clock) @ 0 (0) ---
rt_sigreturn(0xe) = 0
read(0, ""..., 1) = 0
Arguments are printed in symbolic form with a passion. This example shows the shell performing ‘‘>>xyzzy’’
output redirection:
open("xyzzy", O_WRONLY|O_APPEND|O_CREAT, 0666) = 3
Here the three argument form of open is decoded by breaking down the flag argument into its three bitwise-OR
constituents and printing the mode value in octal by tradition. Where traditional or native usage differs from
ANSI or POSIX, the latter forms are preferred. In some cases, strace output has proven to be more readable
than the source.
Structure pointers are dereferenced and the members are displayed as appropriate. In all cases arguments are
formatted in the most C-like fashion possible. For example, the essence of the command ‘‘ls -l /dev/null’’ is
captured as:
lstat("/dev/null", {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 3), ...}) = 0
Notice how the ‘struct stat’ argument is dereferenced and how each member is displayed symbolically. In
particular, observe how the st_mode member is carefully decoded into a bitwise-OR of symbolic and numeric values.
Also notice in this example that the first argument to lstat is an input to the system call and the second
argument is an output. Since output arguments are not modified if the system call fails, arguments may not
always be dereferenced. For example, retrying the ‘‘ls -l’’ example with a non-existent file produces the
following line:
lstat("/foo/bar", 0xb004) = -1 ENOENT (No such file or directory)
In this case the porch light is on but nobody is home.
Character pointers are dereferenced and printed as C strings. Non-printing characters in strings are normally
represented by ordinary C escape codes. Only the first strsize (32 by default) bytes of strings are printed;
longer strings have an ellipsis appended following the closing quote. Here is a line from ‘‘ls -l’’ where the
getpwuid library routine is reading the password file:
read(3, "root::0:0:System Administrator:/"..., 1024) = 422
While structures are annotated using curly braces, simple pointers and arrays are printed using square brackets
with commas separating elements. Here is an example from the command ‘‘id’’ on a system with supplementary
group ids:
getgroups(32, [100, 0]) = 2
On the other hand, bit-sets are also shown using square brackets but set elements are separated only by a
space. Here is the shell preparing to execute an external command:
sigprocmask(SIG_BLOCK, [CHLD TTOU], []) = 0
Here the second argument is a bit-set of two signals, SIGCHLD and SIGTTOU. In some cases the bit-set is so
full that printing out the unset elements is more valuable. In that case, the bit-set is prefixed by a tilde
like this:
sigprocmask(SIG_UNBLOCK, ~[], NULL) = 0
Here the second argument represents the full set of all signals.
[root@localhost ~]# strace ls
execve("/bin/ls", ["ls"], [/* 41 vars */]) = 0
brk(0) = 0xe67000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f1176dfa000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=38574, ...}) = 0
mmap(NULL, 38574, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f1176df0000
close(3) = 0
open("/lib64/libselinux.so.1", O_RDONLY) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0PX`\312=\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=124624, ...}) = 0
mmap(0x3dca600000, 2221912, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3dca600000
mprotect(0x3dca61d000, 2093056, PROT_NONE) = 0
mmap(0x3dca81c000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1c000) = 0x3dca81c000
mmap(0x3dca81e000, 1880, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x3dca81e000
close(3) = 0
open("/lib64/librt.so.1", O_RDONLY) = 3
The output is long, so I have not copied all of it. Even this excerpt shows which system calls a simple ls command actually makes.
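Because every completed call follows the "name(arguments) = return value" layout described in the man page excerpt, the lines are easy to post-process. The sketch below is a rough regular-expression parser, not part of strace; `parse_strace_line` and its field names are hypothetical, and lines it does not recognise (signals, unfinished or resumed calls, pid prefixes) simply yield None.

```python
import re

# name(args) = ret [ERRNO] [(note)]
LINE_RE = re.compile(
    r"^(?P<name>\w+)\((?P<args>.*)\)\s+=\s+(?P<ret>\S+)"
    r"(?:\s+(?P<errno>[A-Z][A-Z0-9_]*))?"
    r"(?:\s+\((?P<note>.*)\))?\s*$"
)


def parse_strace_line(line):
    """Return a dict of fields for a completed call, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    samples = [
        'open("/etc/ld.so.cache", O_RDONLY) = 3',
        'access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)',
        "select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>",
    ]
    for sample in samples:
        print(parse_strace_line(sample))
```

Filtering the parsed records for a non-empty `errno` field is a quick way to list only the failing calls in a long trace.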
strace -e 'select' -p 18846
Process 18846 attached - interrupt to quit
select(0, NULL, NULL, NULL, {0, 165000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
Process 18846 detached
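The example above attaches to the running process 18846 with -p and shows only its select calls (-e 'select' is shorthand for -e trace=select). Filtering like this makes it easy to isolate the calls you are interested in.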
[root@localhost ~]# strace -o /tmp/a.txt -c hostname
localhost
[root@localhost ~]# cat /tmp/a.txt
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  -nan    0.000000           0         5           read
  -nan    0.000000           0         1           write
  -nan    0.000000           0         6           open
  -nan    0.000000           0         6           close
  -nan    0.000000           0         7           fstat
  -nan    0.000000           0        15           mmap
  -nan    0.000000           0         7           mprotect
  -nan    0.000000           0         2           munmap
  -nan    0.000000           0         3           brk
  -nan    0.000000           0         1         1 access
  -nan    0.000000           0         1           execve
  -nan    0.000000           0         1           uname
  -nan    0.000000           0         1           statfs
  -nan    0.000000           0         1           arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.000000                    57         1 total
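The -o option writes the trace output to a file, and -c summarizes the time spent, the number of calls, and the number of errors for each system call, which is also a very useful feature. In this run the measured time is zero for every call, which is why the % time column shows -nan.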
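When the summary gets long, it helps to load it into a script and sort it. The following is a small sketch that parses the column layout shown above; `parse_summary` and `busiest` are hypothetical helper names, and `SAMPLE` is an abridged copy of the rows from the hostname run. It assumes the layout seen here, where the errors column is simply absent when a call had no errors.

```python
SAMPLE = """\
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  -nan    0.000000           0         5           read
  -nan    0.000000           0        15           mmap
  -nan    0.000000           0         1         1 access
  -nan    0.000000           0         1           execve
------ ----------- ----------- --------- --------- ----------------
"""


def parse_summary(text):
    """Parse `strace -c` rows into dicts, skipping header, rulers and the total line."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] == "%" or fields[0].startswith("---"):
            continue
        if fields[-1] == "total":
            continue
        rows.append({
            "syscall": fields[-1],
            "seconds": float(fields[1]),
            "calls": int(fields[3]),
            # The errors column is present only when at least one call failed.
            "errors": int(fields[4]) if len(fields) == 6 else 0,
        })
    return rows


def busiest(rows, key="calls"):
    """Return rows sorted so the largest value of `key` comes first."""
    return sorted(rows, key=lambda row: row[key], reverse=True)


if __name__ == "__main__":
    for row in busiest(parse_summary(SAMPLE)):
        print(row)
```

Sorting by "seconds" instead of "calls" is the more useful view when the traced program runs long enough for the time columns to be non-zero.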
man ltrace
DESCRIPTION
ltrace is a program that simply runs the specified command until it exits. It intercepts and records the
dynamic library calls which are called by the executed process and the signals which are received by that
process. It can also intercept and print the system calls executed by the program. Its use is very similar to strace(1).
Its man page is short and simply says that its use is very similar to strace.
[root@localhost ~]# ltrace ls
(0, 0, 0, 0x7fc84b52bac0, 88) = 0x3dc8c21160
__libc_start_main(0x408480, 1, 0x7fff73282908, 0x412110, 0x412100 <unfinished ...>
strrchr("ls", '/') = NULL
setlocale(6, "") = "zh_CN.UTF-8"
bindtextdomain("coreutils", "/usr/share/locale") = "/usr/share/locale"
textdomain("coreutils") = "coreutils"
__cxa_atexit(0x40bb20, 0, 0, 0x736c6974756572, 0x3dc958fee8) = 0
isatty(1) = 1
getenv("QUOTING_STYLE") = NULL
getenv("LS_BLOCK_SIZE") = NULL
getenv("BLOCK_SIZE") = NULL
getenv("BLOCKSIZE") = NULL
getenv("POSIXLY_CORRECT") = NULL
getenv("BLOCK_SIZE") = NULL
getenv("COLUMNS") = NULL
ioctl(1, 21523, 0x7fff732827d0) = 0
getenv("TABSIZE") = NULL
getopt_long(1, 0x7fff73282908, "abcdfghiklmnopqrstuvw:xABCDFGHI:"..., 0x619040, 0x7fff732827e8) = -1
__errno_location() = 0x7fc84b5296a0
malloc(56) = 0x104d050
memcpy(0x104d050, "", 56) = 0x104d050
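As with strace, we start by testing with ls. The output shows many library function calls, such as strrchr, setlocale, and bindtextdomain; what each of them does can be looked up online or in its man page.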