how to use perf
Since I did't see here anything about perf which is a relatively new tool for profiling the kernel and user applications on Linux I decided to add this information.
First of all - this is a tutorial about Linux profiling with perf
You can use perf if your Linux Kernel is greater than 2.6.32 or oprofile if it is older. Both programs don't require from you to instrument your program (like gprof requires). However in order to get call graph correctly in perf you need to build you program with -fno-omit-frame-pointer. For example: g++ -fno-omit-frame-pointer -O2 main.cpp.
You can see "live" analysis of your application with perf top:
sudo perf top -p `pidof a.out` -K
Or you can record performance data of a running application and analyze them after that:
1) To record performance data:
perf record -p `pidof a.out`
or to record for 10 secs:
perf record -p `pidof a.out` sleep 10
or to record with call graph ()
perf record -g -p `pidof a.out`
2) To analyze the recorded data
perf report --stdio
perf report --stdio --sort=dso -g none
perf report --stdio -g none
perf report --stdio -g
Or you can record performace data of a application and analyze them after that just by launching the application in this way and waiting for it to exit:
perf record ./a.out
This is an example of profiling a test program
The test program is in file main.cpp (I will put main.cpp at the bottom of the message):
I compile it in this way:
g++ -m64 -fno-omit-frame-pointer -g main.cpp -L. -ltcmalloc_minimal -o my_test
I use libmalloc_minimial.so since it is compiled with -fno-omit-frame-pointer while libc malloc seems to be compiled without this option. Then I run my test program
./my_test 100000000
Then I record performance data of a running process:
perf record -g -p `pidof my_test` -o ./my_test.perf.data sleep 30
Then I analyze load per module:
perf report --stdio -g none --sort comm,dso -i ./my_test.perf.data
# Overhead Command Shared Object
# ........ ....... ............................
#
70.06% my_test my_test
and so on ...
Then call chains are analyzed:
perf report --stdio -g graph -i ./my_test.perf.data | c++filt
0.16% my_test [kernel.kallsyms] [k] _spin_lock
and so on ...
So at this point you know where your program spends time.
And this is main.cpp for the test:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
time_t f1(time_t time_value)
{
for (int j =0; j < 10; ++j) {
++time_value;
if (j%5 == 0) {
double *p = new double;
delete p;
}
}
return time_value;
}
time_t f2(time_t time_value)
{
for (int j =0; j < 40; ++j) {
++time_value;
}
time_value=f1(time_value);
return time_value;
}
time_t process_request(time_t time_value)
{
for (int j =0; j < 10; ++j) {
int *p = new int;
delete p;
for (int m =0; m < 10; ++m) {
++time_value;
}
}
for (int i =0; i < 10; ++i) {
time_value=f1(time_value);
time_value=f2(time_value);
}
return time_value;
}
int main(int argc, char* argv2[])
{
int number_loops = argc > 1 ? atoi(argv2[1]) : 1;
time_t time_value = time(0);
printf("number loops %d\n", number_loops);
printf("time_value: %d\n", time_value );
for (int i =0; i < number_loops; ++i) {
time_value = process_request(time_value);
}
printf("time_value: %ld\n", time_value );
return 0;
}
原文
http://stackoverflow.com/questions/1777556/alternatives-to-gprof#comment3480484_1779343
how to use perf的更多相关文章
- 系统级性能分析工具perf的介绍与使用
测试环境:Ubuntu16.04(在VMWare虚拟机使用perf top存在无法显示问题) Kernel:3.13.0-32 系统级性能优化通常包括两个阶段:性能剖析(performance pro ...
- 玩 perf
有一个进程happy在执行,另一个进程spy发送了一个信号把happy给杀死了 我怎么能通过perf抓到spy进程? happy进程一直执行 在spy进程中调用kill(happy's pid) ,发 ...
- linux perf - 性能测试和优化工具
Perf简介 Perf是Linux kernel自带的系统性能优化工具.虽然它的版本还只是0.0.2,Perf已经显现出它强大的实力,足以与目前Linux流行的OProfile相媲美了. Perf 的 ...
- Linux 性能优化工具 perf top
1. perf perf 是一个调查 Linux 中各种性能问题的有力工具. NAME perf - Performance analysis tools for Linux SYNOPSIS per ...
- 【转】Profiling application LLC cache misses under Linux using Perf Events
转自:http://ariasprado.name/2011/11/30/profiling-application-llc-cache-misses-under-linux-using-perf-e ...
- Linux/Android 性能优化工具 perf
/***************************************************************************** * Linux/Android 性能优化工 ...
- Linux下的内核测试工具——perf使用简介
Perf是Linux kernel自带的系统性能优化工具.Perf的优势在于与Linux Kernel的紧密结合,它可以最先应用到加入Kernel的new feature.pef可以用于查看热点函数, ...
- 系统级性能分析工具 — Perf
从2.6.31内核开始,linux内核自带了一个性能分析工具perf,能够进行函数级与指令级的热点查找. perf Performance analysis tools for Linux. Perf ...
- Linux Kernel ‘perf’ Utility 本地提权漏洞
漏洞名称: Linux Kernel ‘perf’ Utility 本地提权漏洞 CNNVD编号: CNNVD-201309-050 发布时间: 2013-09-09 更新时间: 2013-09-09 ...
- 使用perf生成Flame Graph(火焰图)
具体的步骤参见这里: <flame graph:图形化perf call stack数据的小工具> 使用SystemTap脚本制作火焰图,内存较少时,分配存储采样的数组可能失败,需 ...
随机推荐
- 000 Jquery的Hello World程序
1.介绍Jquery 2.简介 3.新建一个静态项目,并粘贴jquery库 4.程序 <!DOCTYPE html> <html> <head> <meta ...
- ZOJ Monthly, March 2018 题解
[题目链接] A. ZOJ 4004 - Easy Number Game 首先肯定是选择值最小的 $2*m$ 进行操作,这些数在操作的时候每次取一个最大的和最小的相乘是最优的. #include & ...
- 阿里云 rds python sdk不支持python3处理
阿里云文档中心的python版本aliyun-python-sdk-rds不支持python3处理 问题:默认情况下文档中心的python版本只支持python2,不兼容python3版本 需要稍微修 ...
- 十个问题带你了解和掌握java HashMap
十个问题带你了解和掌握java HashMap 一.前言 本篇内容是源于 " 由阿里巴巴Java开发规约HashMap条目引发的故事",并在此基础上加了自己的对HashMap更多的 ...
- MySQL服务器发生OOM的案例分析
[问题] 有一台MySQL5.6.21的服务器发生OOM,分析下来与多种因素有关 [分析过程] 1.服务器物理内存相对热点数据文件偏小,62G物理内存+8G的SWAP,数据文件大小约550G 触发OO ...
- leetcode 岛屿的个数 python
岛屿的个数 给定一个由 '1'(陆地)和 '0'(水)组成的的二维网格,计算岛屿的数量.一个岛被水包围,并且它是通过水平方向或垂直方向上相邻的陆地连接而成的.你可以假设网格的四个边均被水包 ...
- QT学习笔记9:QTableWidget的用法总结
最近用QT中表格用的比较多,使用的是QTableWidget这个控件,总结一下QTableWidget的一些相关函数. 1.将表格变为禁止编辑: tableWidget->setEditTrig ...
- [HDU2874]Connections between cities
思路:LCA裸题.本来是帮pechpo调错,结果自己写了半天… 设$dis_x$是点$x$到根结点距离,不难想到两点$u$.$v$之间最短距离等于$dis_u+dis_v-dis_{LCA(u,v)} ...
- BZOJ3779 : 重组病毒
一个点的感染时间为它到根路径上虚边数+1. 用Link-Cut Tree模拟虚实边切换,每次切换时等价于在一段或两段DFS序区间更新,线段树维护即可. 时间复杂度$O(n\log^2n)$. #inc ...
- AutoMapper实际项目运用
AutoMapper是对象到对象的映射工具.在完成映射规则之后,AutoMapper可以将源对象转换为目标对象. 配置AutoMapper映射规则 AutoMapper是基于约定的,因此在实用映射之前 ...