Xperf Basics: Recording a Trace (repost)
http://randomascii.wordpress.com/2011/08/18/xperf-basics-recording-a-trace/
This post is obsolete. For information on newer, easier, and better ways of recording xperf traces see Xperf Basics: Recording a Trace (the easy way).
The Windows Performance Toolkit, also known as xperf, is a powerful (and free!) set of tools from Microsoft that allows profiling of all aspects of a Windows computer by using ETW (Event Tracing for Windows). Whether your performance issues are caused by excess CPU usage, waiting on file I/O, or interactions with drivers and other software, xperf usually provides the information needed to diagnose what is going on.
In addition to using xperf to diagnose dozens of tricky performance problems in the software that I work on, I have used it to find (and both work around and report) performance problems in:
- VirtualAlloc
- PowerPoint
- Visual Studio (breakpoint related hangs, SQL access hangs, network access hangs, etc.)
- Windows Live Photo Gallery
But there is a problem. Xperf is difficult to learn, and the documentation is, well, imperfect. I hope to share some of what I have learned so that this valuable tool can be used by more developers, to make their software more awesome.
This post gives some resources on that most basic of problems, “how do I record a useful xperf trace that contains the information I’m likely to need?”
Recording a trace
Xperf is a command line tool with a bewildering array of options. Some of the things that you might need to specify when recording a trace are:
- What kernel providers (context switches, virtual allocs, sampling profiler, disk I/O) do you want to record?
- What events do you want call stacks for?
- How many memory buffers do you want, and what size do you want them to be?
For best results you should also record product-specific events from a user-mode provider. This requires that you learn:
- How to write a provider manifest
- How to ensure portability to non-Windows platforms (compile-time checks) and pre-Vista platforms (run-time checks)
And so on. It’s a lot to learn, and while the basics of recording a kernel trace with a couple of kernel providers are covered pretty well, it can be difficult to even find out what other options might be useful.
I have no intention of writing full documentation for xperf. Instead I am going to provide and explain the user-mode providers and the batch files that I use to record traces. That at least should get those interested in xperf off to a better start.
Step 1 – get xperf
Xperf is distributed as part of the Windows Software Development Kit. There are many valuable tools in there, but that’s the topic of another post. For now, run the installer for the latest Platform SDK (version 8.0 as of May 2012). Among its components you will find the Windows Performance Toolkit, which is the official name for xperf. Install it. You can do what you want with the rest of the Platform SDK. The appropriate version for your operating system (32-bit or 64-bit) will be installed; I found it hidden in “C:\Program Files\Windows Kits\8.0\Windows Performance Toolkit”. The redistributable packages for all Windows flavors, which make installing on other machines easier, should be found at “C:\Program Files (x86)\Windows Kits\8.0\Windows Performance Toolkit\Redistributables”. Make sure the install directory is in your path, then move on.
Step 2 – get the sample providers and batch files
You can download my sample user-mode ETW providers from ftp://ftp.cygnus-software.com/pub/MultiProvider.zip. I’ll wait.
When you unzip the file you’ll find a Visual Studio 2010 solution file. Build the debug or release configuration. You might want to poke around and look at the ReadMe.txt file, the provider manifest file (etwprovider.man) and the batch files.
Step 3 – record a trace
Now things start getting messy. I’m going to try to document all of the necessary steps and gotchas, but it’s hard to make it really foolproof.
If you are running 64-bit Windows then there is a registry key that you need to set. And then you need to reboot. The registry key tells Windows to keep information needed for stack walking in non-pageable memory. If you run the command below from an elevated command prompt (yes, it is all one line that is excessively wordwrapped) and then reboot then your call stacks will thank you.
REG ADD "HKLM\System\CurrentControlSet\Control\Session Manager\Memory Management" -v DisablePagingExecutive -d 0x1 -t REG_DWORD -f
In order to use user-mode providers you have to register them. The etwregister.bat file is for that purpose. Like most things xperf you need to open up an administrator command prompt. Navigate to the directory containing etwregister.bat and run it. It should be able to find the MultiProvider.exe that you built (which contains the providers) and etwprovider.man (which defines them) and register them with wevtutil.exe.
Now run etwrecord.bat. Normally I give it the name of a trace file to record to like “etwrecord.bat c:\temp\testtrace.etl”. If you don’t give it a name then it will manufacture one. Either way be careful not to have spaces in your path. At this point tracing has been started. Detailed information about the operation of your computer is being recorded. If you get error messages at this point then read them carefully. Make sure the etwregister.bat step worked properly. Make sure you are running from an administrator command prompt. Make sure you are running Windows 7 (ideal) or Windows Vista in a pinch. ETW tracing is very limited in XP and I have done zero testing on that platform.
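The no-spaces rule is easy to forget, so a small pre-flight check can catch it before tracing starts. The sketch below is a hypothetical Python helper, not part of the downloaded batch files. The name `pick_trace_path` and the timestamp-based default file name are assumptions, since the post does not say how etwrecord.bat manufactures its name.

```python
from datetime import datetime
from typing import Optional


def pick_trace_path(requested: Optional[str] = None,
                    default_dir: str = r"c:\temp",
                    now: Optional[datetime] = None) -> str:
    """Return a trace file path that is safe to hand to the batch files."""
    if requested is None:
        # Assumed naming scheme; etwrecord.bat may well use a different one.
        stamp = (now or datetime.now()).strftime("%Y-%m-%d_%H-%M-%S")
        requested = default_dir + "\\" + "trace_" + stamp + ".etl"
    if " " in requested:
        # The batch files do not handle spaces in paths, so fail early.
        raise ValueError("trace path must not contain spaces: %r" % requested)
    return requested


print(pick_trace_path(r"c:\temp\testtrace.etl"))
```

Rejecting the path up front is cheaper than discovering the problem after a trace has been recorded and the save step fails.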
Now run the MultiProvider.exe that you built way back in step 2. It will emit some ETW events that your trace is set up to record.
Once MultiProvider.exe has exited you should return to the command prompt and press the ‘any key’ to continue. At this point your trace will be saved to disk. Make sure your trace file name doesn’t contain any spaces because it’s very difficult to get batch files to handle that correctly – and mine don’t. Again, if anything goes wrong then you’ll have to examine the error messages and suggestions very carefully.
In order to be helpful the batch file will launch xperfview to view the trace. Since the batch file is running as administrator, and since xperfview doesn’t process traces while elevated, you will get a dialog asking whether to continue. Read the link above for details, or just click Yes to view the trace or No to cancel.

Step 4 – looking at the user events
To find the user-provider events – which are great for offering context when looking at the kernel trace data – go to the Generic Events graph in xperfview. It should be at the bottom. You can hover over the diamonds to get a summary of each event. Some of the events are random Windows events with minimal value so I usually use the ProviderIDs configuration dropdown to turn off all except for mine, which are creatively named Multi-Main, etc.

The blue diamonds are designed for worker thread events, and the two purple diamonds (your colours may vary) in the screen shot above are faked up input events. The green diamonds are faked up events that indicate the beginning of a frame, as in a video game. The brown diamonds are generic events. The whole purpose of having multiple providers is so that they will show up on different rows in this view. By carefully selecting when to use each provider you can make patterns (such as high and low frame rates, or high and low packet frequencies) visually obvious.
To get more details you need to select a range of time on the graph, right click, and request a summary table. Summary tables are a complex topic, but for now suffice it to say that by enabling and disabling columns to show the ones that you care about, by reordering columns (paying close attention to the gold bar – columns to the left of it are used for hierarchical grouping), and by careful sorting you can answer many questions. Effective use of pivot tables is an art form and is crucial to getting full value from xperf.
On the summary table below we can see, for instance, that the obviously important task “Busy work…” started 5.462635 seconds into the trace and ended 5.477991493 seconds into the trace. That information then helps us make sense of the other graphs on the main window because they all share a common timeline. Or at least it would if we were profiling something real. Use your imagination and pretend that those Begin/End markers are helping you identify when your AI code is running, or when map loading occurred, or whatever other significant event is taking too long and needs investigating.
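As a quick worked example, the two timestamps quoted for “Busy work…” give the duration of that task directly. The snippet is only an illustration of the arithmetic. It uses `Decimal` so the nanosecond digits are not disturbed by floating-point rounding, and the variable names are mine.

```python
from decimal import Decimal

# Timestamps (seconds into the trace) quoted from the summary table.
begin = Decimal("5.462635")
end = Decimal("5.477991493")

duration_s = end - begin
duration_ms = duration_s * 1000

print(f"Busy work took {duration_s} s ({duration_ms:.3f} ms)")
```

So the task ran for a little over 15 milliseconds, which is the window to zoom into on the CPU and disk graphs.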

Step 5 – customize it for your purposes
In order to integrate this into your projects and start squashing performance bugs you need to change some things.
- Copy etwprof.* and ETWProvider.man to one of your projects – the DLL or EXE that will contain the providers. You’ll need to copy over the custom build steps for ETWProvider.man and include ETWProviderGenerated.rc (created by those custom build steps) into your existing resource file.
- Modify ETWProvider.man and ETWRegister.bat to change the name of the DLL or EXE that will contain the providers
- Change all of the GUIDs and provider names in ETWProvider.man
- Make the same provider name changes in ETWCommonSettings.bat (so that tracing enables the correct providers) and etwprof.cpp (to adjust the run-time registering of the providers and the emitting of events)
- Run the updated ETWRegister.bat to register your providers – make sure this succeeds.
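Changing the GUIDs means generating a fresh one per provider. Any GUID generator will do. The sketch below uses Python's `uuid` module. The provider names are hypothetical replacements for the sample's Multi-* names, and the braces-plus-upper-case formatting is my assumption about what the manifest's guid attribute expects, so check it against the existing entries in ETWProvider.man.

```python
import uuid


def fresh_provider_guids(provider_names):
    """Map each provider name to a newly generated GUID string."""
    # Braces plus upper-case hex is the form assumed for the manifest.
    return {name: "{" + str(uuid.uuid4()).upper() + "}" for name in provider_names}


# Hypothetical replacement names for the sample's Multi-* providers.
names = ["MyGame-Main", "MyGame-FrameRate", "MyGame-Input", "MyGame-Worker"]
for name, guid in fresh_provider_guids(names).items():
    print(name, guid)
```

Whatever values you generate, the same name and GUID pairs have to appear consistently in the manifest, the settings batch file, and the source code.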
At that point you should be able to build your code, register your providers, add some calls to your event emitting functions, and record traces that contain your user events.
Step 6 – bonus!
As an extra bonus, when etwrecord.bat finishes it actually leaves tracing running. Tracing continues to a circular buffer and at any time you can save that buffer (retroactive profiling!) to disk with “etwcirc.bat c:\temp\retrotrace.etl”. If you don’t like wasting memory on those circular buffers, “etwcirc.bat stop” will fix things for you.
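If the idea of a circular buffer is unfamiliar, the toy model below shows why it allows retroactive profiling: only the most recent events are kept, and a snapshot can be taken at any time without stopping the recording. This is a conceptual sketch in Python, not how ETW implements its buffers, and the class name `CircularTrace` is made up for the illustration.

```python
from collections import deque


class CircularTrace:
    """Keeps only the most recent `capacity` events, like a flight recorder."""

    def __init__(self, capacity):
        # A bounded deque silently drops the oldest entry when it is full.
        self._events = deque(maxlen=capacity)

    def record(self, event):
        self._events.append(event)

    def snapshot(self):
        """Return the retained events, oldest first, without stopping recording."""
        return list(self._events)


trace = CircularTrace(capacity=3)
for i in range(5):
    trace.record(f"event {i}")
print(trace.snapshot())  # the two oldest events have been overwritten
```

The trade-off is the one the post mentions: the buffer costs memory for as long as tracing is left running.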
Step 7 – detailed analysis
Analyzing traces is a huge topic, full of undocumented summary tables, and it will have to be saved for another post.
Step 8 – understanding how it works
This isn’t really a step, but a section to explain some of the details of how this works, to aid in customizing the batch files and providers. There are also lots of comments in the batch files and source code which should be used as a resource.
If you type “etwrecord test.etl” then xperf.exe will be invoked a few times. Once immediately:
xperf -on Latency+POWER+DISPATCHER+FILE_IO+FILE_IO_INIT -stackwalk PROFILE+CSWITCH+READYTHREAD -buffersize 1024 -minbuffers 300 -start gamesession -on Microsoft-Windows-Win32k+Multi-MAIN+Multi-FrameRate+Multi-Input+Multi-Worker
The first command starts tracing. “-on” indicates that the kernel provider should be started, and the plus-sign separated words that follow are individual kernel providers (POWER+DISPATCHER+FILE_IO+FILE_IO_INIT) and kernel provider groups (Latency). See “xperf -providers k” for a list of kernel providers and “xperf -help start” for information on the very complex “-on” syntax.
Then, “-stackwalk” indicates which events should have call stacks recorded for them. Call stacks are very useful, but are also expensive. Note that if you turn on stack walks for an event that is not enabled then nothing happens. See Pigs Can Fly for details of this or look at “xperf -help stackwalk” to see the full list of flags.
Then the size and minimum number of buffers are specified. Setting these too high will waste memory, and setting them too low risks losing events. Note that the buffer size is in KB, so the settings above request 300 MB of buffers for the kernel events. See “xperf -help start” for details.
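The 300 MB figure is just the buffer size multiplied by the buffer count. A tiny helper (hypothetical name, Python purely for illustration) makes it easy to see the cost of other settings before committing to them. Since the count passed is a minimum, treat the result as a lower bound.

```python
def buffer_memory_mb(buffersize_kb, buffer_count):
    """Memory reserved for trace buffers, in MB (1 MB = 1024 KB)."""
    return buffersize_kb * buffer_count / 1024


# Settings from the command above: -buffersize 1024 -minbuffers 300
print(buffer_memory_mb(1024, 300))
```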
Then we get to “-start gamesession” which requests that we start a user-mode logging session called gamesession. Any name will do, but only one session of that name can be running at a time. Following that is “-on” and a list of user-mode providers. The first one is a built-in Windows 7 provider. See “xperf -providers” for a list of user-mode providers. The others are the providers that are defined in MultiProvider.exe and registered with ETWRegister.bat. Customizing these in ETWCommonSettings.bat can be important.
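To make the structure of that long command easier to see, here is a sketch that assembles it from its parts. It is written in Python rather than as a batch file, the function name `build_start_command` is hypothetical, and it only uses the flags shown in the command above. It builds the argument list but does not run xperf.

```python
def build_start_command(kernel_flags, stackwalk_flags, session, user_providers,
                        buffersize_kb=1024, minbuffers=300):
    """Assemble the argument list for the xperf call that starts tracing."""
    return [
        "xperf",
        "-on", "+".join(kernel_flags),            # kernel providers and groups
        "-stackwalk", "+".join(stackwalk_flags),  # events that get call stacks
        "-buffersize", str(buffersize_kb),        # size of each buffer, in KB
        "-minbuffers", str(minbuffers),
        "-start", session,                        # user-mode session name
        "-on", "+".join(user_providers),          # user-mode providers
    ]


args = build_start_command(
    kernel_flags=["Latency", "POWER", "DISPATCHER", "FILE_IO", "FILE_IO_INIT"],
    stackwalk_flags=["PROFILE", "CSWITCH", "READYTHREAD"],
    session="gamesession",
    user_providers=["Microsoft-Windows-Win32k", "Multi-MAIN", "Multi-FrameRate",
                    "Multi-Input", "Multi-Worker"],
)
print(" ".join(args))
```

Keeping the provider lists as data makes it obvious which part to edit when you rename your own providers.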
Xperf is invoked again after you hit a key to stop tracing:
xperf -stop gamesession -stop -d test.etl
The “-stop gamesession” tells xperf to end our user-mode logging session. The “-stop” by itself then tells xperf to halt the kernel session. The “-d test.etl” tells xperf to take those just-stopped sessions and merge them into a single trace file, for integrated analysis.
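The stop step can be sketched the same way. Again this is an illustrative Python helper with a made-up name, using only the flags from the command above. It returns the argument list, which on Windows could be passed to `subprocess.run` from an elevated prompt.

```python
def build_stop_command(session, trace_path):
    """Argument list that stops both sessions and merges them into one file."""
    # "-stop <session>" ends the user-mode session, the bare "-stop" ends the
    # kernel session, and "-d" merges the stopped sessions into trace_path.
    return ["xperf", "-stop", session, "-stop", "-d", trace_path]


print(" ".join(build_stop_command("gamesession", "test.etl")))
```

The session name must match the one given to “-start”, otherwise the user-mode session keeps running after the kernel session has stopped.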
There are also a couple of “xperf -loggers” commands. Those are there for diagnostic purposes. When things go awry this undocumented command gives a list of all active ETW sessions, which can help you understand what is going on.