PAGELATCH_EX Contention on 2:1:103
This blog post is meant to help people troubleshoot page latch contention on 2:1:103. If that’s what brought you to this page, then hopefully you find it useful. But first…
Initial Investigation
Last week I was asked to help tackle a production outage. Queries were slow enough that the system was considered unavailable. And just like any other performance problem I started by having SQL Server tell me what was wrong with itself.
- I first started with sp_whoisactive but it took about 30 seconds for it to return. Several queries were blocked by others and some of the lead blockers were waiting for a PAGELATCH_EX on 2:1:103
- Then I checked on the most common waits using a query found at Paul Randal’s article: Wait statistics, or please tell me where it hurts. (I don’t have that page bookmarked, every time I’m in trouble, I google “Tell me where it hurts Paul Randal”). I learned that PAGELATCH_EX contention was our most serious wait type. Paul mentions that he sees this kind of contention most commonly on tempdb.
- I followed a link from Paul’s article to Robert Davis’s Breaking Down TempDB Contention. A script there helped me to discover that while there was lots of tempdb contention, it was all on page 2:1:103, which is not PFS, GAM or SGAM.
Finding More Info
So I looked for more help.
- Web Search: My work colleague found and then pointed me to Latch waits on 2:1:103? You are probably creating too many temp tables in SQL Server by Microsoft’s Matt Wrock who faced the same problem. If you found this page because you’re facing the same issue then stop reading this article and go read Matt’s first. He explains what’s going on better than I could. Think of this article as a kind of sequel to his article.
- StackOverflow: (serverfault in this case) has TempDB contention on sysmultiobjrefs SQL 2005. The best answer there is Matt pointing back to his blog article.
- Microsoft Support who helped us out, but maybe not as quick as we’d like. To be fair, the turnaround time that we would have been happy with was measured in nanoseconds.
With that information it became clear to us that we were creating temp tables too often. And we were creating them in a way that made it impossible for SQL Server to cache. Did you know that? That SQL Server can cache temp tables? When a query is done with a temp table, SQL Server can truncate that table and give it to the next query to avoid having to create it again. Cached temp tables reduce tempdb contention including contention on this page. But as it turns out, SQL Server cannot cache temp tables from ad hoc queries.
But who creates temp tables that often? We did, just by using a table valued parameter in a parameterized query. Since SQL Server began supporting table valued parameters (introduced in SQL Server 2008), we have been gradually moving towards this practice in lieu of sending xml to be shredded.
… And More Info Including Some Other Links
I can’t help including some extra articles I found on 2:1:103 contention.
Robert Davis blogged about 2:1:103 contention in Tempdb Contention That Can’t Be Soothed. His advice is to remove statements in code like “SELECT … INTO”. However, I believe that such statements contribute to PAGELATCH contention only if the statement is not part of a stored procedure (i.e. can’t be cached). I also think there are more common causes of 2:1:103 PAGELATCH contention than SELECT … INTO statements.
But Robert did link to a demo by Paul Randal (an absolutely amazing 5 minute demo linked from a 2011 issue of his newsletter). Paul tells us
- That “the SQL Team knows about this. It’s a known issue. Hopefully something will be done about it in one of the future releases.”
- But unlike SGAM or GAM contention, there’s absolutely nothing you can do about this to spread the contention around.
From a DBA point of view there’s not much that can be done. So as DB Developers, we have to find a workaround. But before I get to that, I want to mention some things that were not so useful.
… And Some Bad Info
There was some red herrings out there…
- Add more tempdb files (nope, wrong kind of tempdb contention).
- Many resources suggested enabling trace flag T1118. It’s a trace flag that eases tempdb contention, but not this kind.
- Get faster tempdb disks? No, PAGELATCH contention is for in-memory copies of pages, not disk (that’s PAGEIOLATCH).
- Tune the queries in question? Not a bad idea, but this problem is about the number of problem queries, not the performance of each.
- A fix from Microsoft Support: Fix: Poor performance in SQL Server 2008 R2 when table-valued functions use many table variables. Oooh… so close, but we weren’t using table-valued functions. And the workarounds they list (disabling AUTO_UPDATE_STATISTICS) did not help. But maybe it might help you?
- The Object:Created event. Whether traced with Profiler or collected with an Extended Events session, this event can report on created temp tables. Maybe I can use this event to tell me which queries are creating the tempdb tables. Nope, not this time! This event has two drawbacks which make it useless for troubleshooting 2:1:103 contention:
- The Object:Created event reports the creation of temp tables even when they’re cached (which don’t need a latch on 2:1:103).
- The Object:Created event doesn’t report the declaration of any table variables (which may need a latch on 2:1:103).
Strategy: So What Do We Do?
Knowing is half the battle right? But that means we still have a lot of work to do. I’m going to recap what we know so far. We know that we have trouble when there are queries that:
- are executed frequently
- create temp tables (either explicitly or by declaring table valued parameters).
- are not cached (Microsoft explains when temp tables are not cached. In my case, it was because they were ad-hoc queries)
- require a page latch on 2:1:103.
There is a performance counter that can track the all of the above (except maybe for that last bullet). It’s called Temp Tables Creation Rate and it’s found in the perf counter category “General Statistics”. Now this is a metric you can trust. We found that a high temp table creation rate was tightly correlated to the trouble we were seeing. So when troubleshooting, look at this performance counter (and leave the “Object:Create” event alone).
So now what do we do? First, we must find the ad hoc queries that create these temp tables. Then, we have to put them in stored procedures so that the temp tables can be cached. Alternatively, we could reduce the need for creating them. It’s a workaround, but it’s what we’ve got.
Finding Such Queries Is Difficult
But here’s the hard part. In a high volume system, it’s difficult to identify exactly which queries are causing the most trouble. Microsoft support can go through tons of collected trace and performance data to try to find such queries, but it’s a long process. On our side, we looked at sys.dm_os_waiting_tasks:
Select * |
And we saw all the contention on 2:1:103, but when we tried to look up the SQL text for it
Select wt.*, st.text |
The text was often unavailable. Basically, I’m guessing maybe the dmv’s I was using weren’t quick enough to tell me which queries were suffering from (or causing) contention on 2:1:103. So I decided to look through the cache for query candidates that might create temp tables. here’s what I came up with. It’s not a comprehensive list and there might be false positives, but it might be enough to go on. If you know your applications well, you can tailor the filters below to something more relevant for you.
select cp.plan_handle, sql_handle, text, refcounts, usecounts |
Going Forward
Personally after helping identify and implement the workarounds. I’m doing a couple things:
- I’m recommending that developers not create temp tables or declare table variables that cannot be cached. For now, this means we use stored procedures for any query that uses table variables or temp tables.
- We now have thresholds on our performance tests which look at the performance counter Temp Tables Creation Rate.
- I created a Microsoft Connect item. If you’re troubleshooting the same problem, head over there and let Microsoft know you’re having trouble too.
PAGELATCH_EX Contention on 2:1:103的更多相关文章
- 记一则update 发生enq: TX - row lock contention 的处理方法
根据事后在虚拟机中复现客户现场发生的情况,做一次记录(简化部分过程,原理不变) 客户端1执行update语句 SQL> select * from test; ID NAME --------- ...
- Entity Framework 6 Recipes 2nd Edition(10-3)译 -> 返回结果是一个标量值
10-3. 返回结果是一个标量值 问题 想取得存储过程返回的一个标量值. 解决方案 假设我们有如Figure 10-2所示的ATM机和ATM机取款记录的模型 Figure 10-2. 一个ATM机和A ...
- 解决一则enq: TX – row lock contention的性能故障
上周二早上,收到项目组的一封邮件: 早上联代以下时间点用户有反馈EDI导入"假死",我们跟踪了EDI导入服务,服务是正常在跑,可能是处理的慢所以用户感觉是"假死" ...
- ORACLE等待事件:enq: TX - row lock contention
enq: TX - row lock contention等待事件,这个是数据库里面一个比较常见的等待事件.enq是enqueue的缩写,它是一种保护共享资源的锁定机制,一个排队机制,先进先出(FIF ...
- ORACLE AWR结合ASH诊断分析enq: TX - row lock contention
公司用户反馈一系统在14:00~15:00(2016-08-16)这个时间段反应比较慢,于是生成了这个时间段的AWR报告, 如上所示,通过Elapsed Time和DB Time对比分析,可以看出在这 ...
- ORA-00600: internal error code, arguments: [kcblasm_1], [103], [], [], [], [], [], []
一ORACLE 10.2.0.5.0 标准版的数据库的告警日志出现ORA-00600错误,具体错误信息如下所示 Errors in file /u01/app/oracle/admin/SCM2/bd ...
- 【故障处理】队列等待之enq IV - contention案例
[故障处理]队列等待之enq IV - contention案例 1.1 BLOG文档结构图 1.2 前言部分 1.2.1 导读和注意事项 各位技术爱好者,看完本文后,你可以掌握如下的技能,也 ...
- Erlang 103 Erlang分布式编程
Outline 笔记系列 Erlang环境和顺序编程Erlang并发编程Erlang分布式编程YawsErlang/OTP 日期 变更说明 2014-11-23 A Outl ...
- 【等待事件】序列等待事件总结(enq: SQ - contention、row cache lock、DFS lock handle和enq: SV - contention)
[等待事件]序列等待事件总结(enq: SQ - contention.row cache lock.DFS lock handle和enq: SV - contention) 1 BLOG文档结 ...
随机推荐
- 【COGS】2287:[HZOI 2015]疯狂的机器人 FFT+卡特兰数+排列组合
[题意][COGS 2287][HZOI 2015]疯狂的机器人 [算法]FFT+卡特兰数+排列组合 [题解]先考虑一维的情况,支持+1和-1,前缀和不能为负数,就是卡特兰数的形式. 设C(n)表示第 ...
- django错误笔记——URL
django提交表单提示"RuntimeError: You called this URL via POST, but the URL doesn’t end in a slash and ...
- Python练习-短小精干版三级"片儿"
经过今天Alex大神的指点,终于打通任督二脉了!将昨天比较复杂的代码优化至此:(代码注释后期添加) # 编辑者:闫龙 #三级目录 menu = { '北京':{ '海淀':{ '五道口':{'soho ...
- C# 图片和Base64之间的转换
public static Bitmap GetImageFromBase64String(string strBase) { try { MemoryStream stream = new Memo ...
- layui-laypage模块代码详解
/** layui-v2.4.0 MIT License By https://www.layui.com */;layui.define(function(e) { "use strict ...
- STL容器基本功能与分类
STL有7中容器. 分别为: vector 向量 <vector>(头文件) 随机访问容器.顺序容器 deque 双端队列 <deque> 随机访问容器.顺序容器 list ...
- no libsigar-amd64-linux.so in java.library.path 解决方法
关于sigar的介绍可以参考这边博文 :https://www.cnblogs.com/luoruiyuan/p/5603771.html 在Linux上运行java程序时出现 no libsigar ...
- linux编译警告 will be initialized after
http://blog.chinaunix.net/uid-17019762-id-3152012.html 作为一个有强迫症的人,实在是受不了 warning 的存在 这个warning是由于初始化 ...
- C++中关于位域的概念
原文来自于http://topic.csdn.net/t/20060801/11/4918904.html中的回复 位域 有些信息在存储时,并不需要占用一个完整的字节, 而只需占几个或一个二进制位 ...
- linux快速安装mysql教程
#安装mysql服务器:yum install mysql-server #设置开机启动chkconfig mysqld on#现在启动服务service mysqld start #设置root初始 ...