CXPACKET 等待类型是SQL Server 并发执行一个query时产生的。在run一个big query时,SQL Server充分利用系统的所有资源(CPU,Memory,IO),在最短时间内返回结果。在拥有多个Processor的系统中,如果将一个Query的Wordload 分布在多个Processor上,那么每个Processor只处理一部分数据,这将成倍的减少查询的时间消耗,相应地,成倍的提高查询性能。

在执行Query时,SQL Server 估计其时间 Cost,如果大于 Parallelism Threshold Value,那么SQL Server使用多个Thread(每个CPU上在同一时间只能运行一个Thread),以并发方式执行query;如果低于Parallelism Threshold Value,那么SQL Server 使用单个Thread执行query。这个 Threshold Value 是由 cost threshold for parallelism Option 决定,引用《cost threshold for parallelism Option》:

Use the cost threshold for parallelism option to specify the threshold at which Microsoft SQL Server creates and runs parallel plans for queries. SQL Server creates and runs a parallel plan for a query only when the estimated cost to run a serial plan for the same query is higher than the value set in cost threshold for parallelism. The cost refers to an estimated elapsed time in seconds required to run the serial plan on a specific hardware configuration.

Longer queries usually benefit from parallel plans; the performance advantage negates the additional time required to initialize, synchronize, and terminate parallel plans. The cost threshold for parallelism option is actively used when a mix of short and longer queries is run. The short queries run serial plans, whereas the longer queries use parallel plans. The value of cost threshold for parallelism determines which queries are considered short, and they should therefore be run using serial plans.

In certain cases, a parallel plan may be chosen even though the query's cost plan is less than the current cost threshold for parallelism value. This can happen because the decision to use a parallel or serial plan is based on a cost estimate provided before the full optimization is complete.

The cost threshold for parallelism option can be set to any value from 0 through 32767. The default value is 5.

不是所有的query都适合并发执行,并发执行有一定的overhead,如果一个query耗时十分小,SQL Server 创建并发执行结构体的Time Cost都会比执行整个查询的Time Cost高。Cost threshold for parallelism option的默认值是5。

如果SQL Server 以并发方式执行Single Query,那么SQL Server使用多少个Thread来并发执行该Query?这个数值是由 max degree of parallelism Option决定的,引用《max degree of parallelism Option》:

When SQL Server runs on a computer with more than one microprocessor or CPU, it detects the best degree of parallelism, that is, the number of processors employed to run a single statement, for each parallel plan execution. You can use the max degree of parallelism option to limit the number of processors to use in parallel plan execution. To enable the server to determine the maximum degree of parallelism, set this option to 0, the default value. Setting maximum degree of parallelism to 0 allows SQL Server to use all the available processors up to 64 processors. To suppress parallel plan generation, set max degree of parallelism to 1. Set the value to a number greater than 1 to restrict the maximum number of processors used by a single query execution. The maximum value for thedegree of parallelism setting is controlled by the edition of SQL Server, CPU type, and operating system. If a value greater than the number of available processors is specified, the actual number of available processors is used. If the computer has only one processor, the max degree of parallelism value is ignored.

sp_configure ;
GO
reconfigure;
GO
sp_configure ;
GO
reconfigure;
GO

sp_configure ;
GO
RECONFIGURE WITH OVERRIDE;
GO
sp_configure ;
GO
RECONFIGURE WITH OVERRIDE;
GO

在并发执行的Thread中,包括Master Thread,假如MDP=6,那么同时有7个Thread来执行Single Query,其中一个Thread是Master Thread,其他被称作Child Thread,Master Thread负责分配任务和将结果combine在一起,Child Thread负责执行特定的任务。由于Child Thread 被分配的Workload不平均,执行的速度不相同,因此,当一些Child Thread完成任务之后,需要等待未完成的Child Thread。这种在并发执行内部,一些Child Thread需要等待其他Child Thread的Wait type就是:CXPACKET。

CXPACKET indicates that multiple CPUs are working in parallel, dividing up the query in smaller pieces.

推荐阅读:《Wait statistics, or please tell me where it hurts

CXPACKET indicates parallelism, not necessarily that there’s a problem. The coordinator thread in a parallel query always accumulates these waits. If the parallel threads are not given equal amounts of work to do, or one thread blocks, the waiting threads will also accumulate CXPACKET waits, which will make them aggregate a lot faster – this is a problem. One thread may have a lot more to do than the others, and so the whole query is blocked while the long-running thread completes. If this is combined with a high number of PAGEIOLATCH_XX waits, it could be large parallel table scans going on because of incorrect non-clustered indexes, or a bad query plan.

If neither of these are the issue, you might want to try setting MAXDOP to 4, 2, or 1 for the offending queries (or possibly the whole instance). Make sure that if you have a NUMA system that you try setting MAXDOP to the number of cores in a single NUMA node first to see if that helps the problem. You also need to consider the MAXDOP effect on a mixed-load system. To be honest, I’d play with the cost threshold for parallelism setting (bump it up to, say, 25) before reducing the MAXDOP of the whole instance. And don’t forget Resource Governor in Enterprise Edition of  SQL Server 2008 onward that allows DOP governing for a particular group of connections to the server.

引用:Knee-Jerk Wait Statistics : CXPACKET

What does the CXPACKET wait type mean?

The simplest explanation is that CXPACKET means you’ve got queries running in parallel and you will *always* see CXPACKET waits for a parallel query. CXPACKET waits do NOT mean you have problematic parallelism – you need to dig deeper to determine that.

You can see the various threads in a parallel operator that are waiting for a resource using the sys.dm_os_waiting_tasks DMV

select *
from sys.dm_os_waiting_tasks

Script2

SELECT
    [owt].[session_id],
    [owt].[exec_context_id],
    [ot].[scheduler_id],
    [owt].[wait_duration_ms],
    [owt].[wait_type],
    [owt].[blocking_session_id],
    [owt].[resource_description],
    CASE [owt].[wait_type]
        WHEN N'CXPACKET' THEN
            RIGHT ([owt].[resource_description],
                )
        ELSE NULL
    END AS [Node ID],
    --[es].[program_name],
    --[est].text,
    [er].[database_id],
    --[eqp].[query_plan],
    [er].[cpu_time]
FROM sys.dm_os_waiting_tasks [owt]
INNER JOIN sys.dm_os_tasks [ot]
    ON [owt].[waiting_task_address] = [ot].[task_address]
INNER JOIN sys.dm_exec_sessions [es]
    ON [owt].[session_id] = [es].[session_id]
INNER JOIN sys.dm_exec_requests [er]
    ON [es].[session_id] = [er].[session_id]
OUTER APPLY sys.dm_exec_sql_text ([er].[sql_handle]) [est]
--OUTER APPLY sys.dm_exec_query_plan ([er].[plan_handle]) [eqp]

ORDER BY [owt].[session_id],
        [owt].[exec_context_id];
GO

There is always a control thread for any parallel operator, which by historical accident is always thread ID 0. The control thread always registers a CXPACKET wait, with the duration equal to the length of time the operator takes to execute each time it is run by the plan.

The only time non-control threads will register CXPACKET waits is if they complete before the other threads in the operator. This can happen if one of the threads gets stuck waiting for a resource for a long time, so look to see what the wait type is of the thread not showing CXPACKET (using my script above) and troubleshoot appropriately. This can also happen because of a skewed work distribution among the threads, and I’ll go into more depth on that case in my next post here (it’s caused by out-of-date statistics and other cardinality estimation problems).

Unexpected parallelism?

Given that CXPACKET simply means you’ve got parallelism happening, the first thing to look at is whether you expect parallelism for the query that’s using it. My query will give you the query plan node ID where the parallelism is happening (it pulls out the node ID from the XML query plan if the wait type of the thread is CXPACKET) so look for that node ID and determine whether the parallelism makes sense.

One of the common cases of unexpected parallelism is when a table scan happens where you’re expecting a smaller index seek or scan. You’ll either see this in the query plan or you’ll see lots of PAGEIOLATCH_SH waits (discussed in detail here) along with the CXPACKET waits (a classic wait statistics pattern to look out for). There are a variety of causes of unexpected table scans, including:

  • Missing nonclustered index so a table scan is the only alternative
  • Out-of-date statistics so the Query Optimizer thinks a table scan is the best data access method to use
  • An implicit conversion, because of a data type mismatch between a table column and a variable or parameter, which means a nonclustered index cannot be used
  • Arithmetic being performed on a table column instead of a variable or parameter, which means a nonclustered index cannot be used

参考Doc:

Wait statistics, or please tell me where it hurts

What is the CXPACKET Wait Type, and How Do You Reduce It?

WaitType:CXPACKET的更多相关文章

  1. WaitType:ASYNC_IO_COMPLETION

    项目组有一个数据库备份的Job运行异常,该Job将备份数据存储到remote server上,平时5个小时就能完成的备份操作,现在运行19个小时还没有完成,backup命令的Wait type是 AS ...

  2. WaitType:ASYNC_NETWORK_IO

    官方文档的定义,是指SQL Server 产生的结果集需要经过Network传递到Client,Network不能很快将结果集传输到Client,导致结果集仍然驻留在SQL Server的Sessio ...

  3. WaitType:SOS_SCHEDULER_YIELD

    今天遇到一个query,处于SOS_SCHEDULER_YIELD 状态,physical IO 不增加,CPU的使用一直在增长.当一个sql query长时间处于SOS_SCHEDULER_YIEL ...

  4. WaitType:ASYNC

    项目组有一个数据库备份的Job运行异常,该Job将备份数据存储到remote server上,平时5个小时就能完成的备份操作,现在运行19个小时还没有完成,backup命令的Wait type是 AS ...

  5. sql server:查詢系統表

    ---查看所有存储过程或视图的位置 select a.name,a.[type],b.[definition] from sys.all_objects a,sys.sql_modules b whe ...

  6. Expert 诊断优化系列------------------语句调优三板斧

    前面三篇通过CPU.内存.磁盘三巨头,讲述了如何透过现在看本质,怎样定位服务器三巨头反映出的问题.为了方便阅读给出链接: SQL SERVER全面优化-------Expert for SQL Ser ...

  7. SQL Server简洁查询正在运行的进程SQL

    通常我们可以使用 sp_who2 我们希望更加简洁的信息,下面这个查询使用系统表sys.sysprocesses,以及sys.dm_exec_sql_text做OUTER APPLY. T-SQL是这 ...

  8. sys.sysprocesses视图的使用小结

    Sys.SysProcesses 系统表是一个很重要的系统视图,主要用来定位与解决Sql Server的阻塞和死锁包含正在 SQL Server 实例上运行的进程的相关信息.这些进程可以是客户端进程或 ...

  9. MYSQL一次性能优化实战经历[转]

    每次经历数据库性能调优,都是对性能优化的再次认识.对自己知识不足的有力验证,只有不断总结.学习才能少走弯路. 一.性能问题描述 应用端反应系统查询缓慢,长时间出不来结果.SQLServer数据库服务器 ...

随机推荐

  1. iOS APP可执行文件的组成

    iOS APP编译后,除了一些资源文件,剩下的就是一个可执行文件,有时候项目大了,引入的库多了,可执行文件很大,想知道这个可执行文件的构成是怎样,里面的内容都是些什么,哪些库占用空间较高,可以用以下方 ...

  2. iOS UIColor RGB HEX

    +(UIColor *)colorWithR:(CGFloat)r g:(CGFloat)g b:(CGFloat)b a:(CGFloat)a{ return [UIColor colorWithR ...

  3. 【原】iOS学习之PINCache第三方缓存框架

    在项目中总是需要缓存一些网络请求数据以减轻服务器压力,业内也有许多优秀的开源的解决方案.通常的缓存方案都是由内存缓存和磁盘缓存组成的,内存缓存速度快容量小,磁盘缓存容量大速度慢可持久化. 1.PINC ...

  4. 全本软件白名单 Quanben Software Whitelist

    Windows应用软件 Windows Applications (TBU) 全本推荐微软Windows 10操作系统 Quanben recommends Microsoft Windows 10 ...

  5. shell脚本生成限定范围的随机数

    #!/bin/bash +)) 这串代码实现了随机生成从1~50之间是数 这串代码特别简单,就是利用RANDOM这个随机数生成器进行取余就能够实现,至于为什么取余时需要+1是因为在取余时如果被整除那么 ...

  6. 【总结】C# Access 数据库 增删查改 的简单步骤

        引用集: using System.Data.OleDb; static string exePath = System.Environment.CurrentDirectory;//本程序所 ...

  7. JS代码将数据导入Excel

    如果在别的浏览器中无法导入,尝试用IE浏览器 function AutomateExcelall(){try { oXL = new ActiveXObject('Excel.Application' ...

  8. tcpdump

    tcpdump tcp -i eth1 -t -s -c and dst port ! and src net -w ./target.cap (1)tcp: ip icmp arp rarp 和 t ...

  9. 【Beta】Daily Scrum Meeting第五次

    1.任务进度 学号 已完成 接下去要做 502 登陆时将返回的个人信息更新到本地数据库 发布任务到服务器 509 给所有api添加注释 添加及修改职工信息并同步到服务器 517 将提交报课移到报课表界 ...

  10. *HDU 1028 母函数

    Ignatius and the Princess III Time Limit: 2000/1000 MS (Java/Others)    Memory Limit: 65536/32768 K ...