Common Scenarios to avoid with DataWarehousing
Database Design
| Rule | Description | Value | Source | Problem Description |
|---|---|---|---|---|
| 1 | Excessive sorting and RID lookup operations should be reduced with covering indexes. | | sys.dm_exec_sql_text, sys.dm_exec_cached_plans | A large data warehouse can benefit from more indexes. Indexes can be used to cover queries and avoid sorting. The cost of index overhead is only paid when data is loaded. |
| 2 | Excessive fragmentation: avg_fragmentation_in_percent should be <25% | >25% | sys.dm_db_index_physical_stats | Reducing index fragmentation through index rebuilds can benefit big range scans, which are common in data warehouse and reporting scenarios. |
| 3 | Scans and ranges are common. Look for missing indexes. | >= 1 | Perfmon object SQL Server Access Methods; sys.dm_db_missing_index_group_stats, sys.dm_db_missing_index_groups, sys.dm_db_missing_index_details | A missing index forces large scans, which flush the cache. |
| 4 | Unused indexes should be avoided. | | If an index is NEVER used, it will not appear in the DMV sys.dm_db_index_usage_stats. | Index maintenance for unused indexes should be avoided. |

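Rules 2 and 4 boil down to two simple checks once the DMV rows have been fetched. The following is a minimal sketch, assuming the rows were already read from sys.dm_db_index_physical_stats and sys.dm_db_index_usage_stats by some other means; the `IndexFrag` type, the function names, and the sample index names are hypothetical and only serve the illustration.

```python
from typing import Iterable, List, NamedTuple


class IndexFrag(NamedTuple):
    """One row of interest from sys.dm_db_index_physical_stats (hypothetical shape)."""
    index_name: str
    avg_fragmentation_in_percent: float


def fragmented_indexes(rows: Iterable[IndexFrag], threshold: float = 25.0) -> List[str]:
    """Return the names of indexes whose fragmentation exceeds the threshold (rule 2)."""
    return [r.index_name for r in rows if r.avg_fragmentation_in_percent > threshold]


def unused_indexes(all_indexes: Iterable[str], used_indexes: Iterable[str]) -> List[str]:
    """Return indexes that never show up in sys.dm_db_index_usage_stats (rule 4)."""
    used = set(used_indexes)
    return [name for name in all_indexes if name not in used]


# Made-up sample data, for illustration only.
frag_rows = [
    IndexFrag("IX_FactSales_Date", 41.5),
    IndexFrag("IX_FactSales_Customer", 3.2),
]
all_names = ["IX_FactSales_Date", "IX_FactSales_Customer", "IX_Old"]
used_names = ["IX_FactSales_Date", "IX_FactSales_Customer"]

print(fragmented_indexes(frag_rows))
print(unused_indexes(all_names, used_names))
```

Because sys.dm_db_index_usage_stats is reset when the SQL Server service restarts, an index should only be treated as unused after the server has been up long enough to cover a full load and reporting cycle.
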
Resource issue: CPU
| Rule | Description | Value | Source | Problem Description |
|---|---|---|---|---|
| 1 | Signal waits | >25% | sys.dm_os_wait_stats | Time in the runnable queue is pure CPU wait. |
| 2 | Avoid plan reuse | >25% | Perfmon object SQL Server: SQL Statistics | A data warehouse has fewer transactions than OLTP, each with significantly bigger IO. Therefore, having the correct plan is more important than reusing a plan. Unlike OLTP, data warehouse queries are not identical. |
| 3 | Parallelism: CXPACKET waits | <10% | sys.dm_os_wait_stats | Parallelism is desirable in data warehouse or reporting workloads. |

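The signal wait and CXPACKET percentages can both be derived from the `wait_time_ms` and `signal_wait_time_ms` columns of sys.dm_os_wait_stats. Below is a minimal sketch, assuming the rows are already in memory; the `WaitRow` type, the function names, and the sample numbers are hypothetical.

```python
from typing import Iterable, NamedTuple


class WaitRow(NamedTuple):
    """One row from sys.dm_os_wait_stats (only the columns used here)."""
    wait_type: str
    wait_time_ms: int         # total wait time, signal wait included
    signal_wait_time_ms: int  # time spent in the runnable queue


def signal_wait_pct(rows: Iterable[WaitRow]) -> float:
    """Share of total wait time spent waiting for CPU, in percent."""
    rows = list(rows)
    total = sum(r.wait_time_ms for r in rows)
    if total == 0:
        return 0.0
    return 100.0 * sum(r.signal_wait_time_ms for r in rows) / total


def wait_type_pct(rows: Iterable[WaitRow], wait_type: str) -> float:
    """Share of total wait time attributed to one wait type, in percent."""
    rows = list(rows)
    total = sum(r.wait_time_ms for r in rows)
    if total == 0:
        return 0.0
    return 100.0 * sum(r.wait_time_ms for r in rows if r.wait_type == wait_type) / total


# Made-up sample numbers, for illustration only.
sample = [
    WaitRow("CXPACKET", 1000, 100),
    WaitRow("PAGEIOLATCH_SH", 3000, 300),
    WaitRow("SOS_SCHEDULER_YIELD", 1000, 1000),
]
print(signal_wait_pct(sample))            # compare against the 25% value in rule 1
print(wait_type_pct(sample, "CXPACKET"))  # compare against the 10% value in rule 3
```
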
Resource issue: Memory
| Rule | Description | Value | Source | Problem Description |
|---|---|---|---|---|
| 1 | Memory Grants Pending | >1 | Perfmon object SQL Server Memory Manager | A memory grant is not available for the query to run. Check for sufficient memory and page life expectancy. |
| 2 | Page life expectancy | Drops by 50% | Perfmon object SQL Server Buffer Manager | Page life expectancy is the average number of seconds a data page stays in cache. Low values could indicate a cache flush caused by a big read. Look for a possible missing index. |

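The "drops by 50%" rule needs a baseline, which the table does not specify. The sketch below assumes the baseline is simply the previous sample in a series of page life expectancy readings collected at a fixed interval; the function names and sample values are hypothetical.

```python
from typing import List, Sequence


def ple_drops(samples: Sequence[float], drop_ratio: float = 0.5) -> List[int]:
    """Return positions where page life expectancy fell by at least
    drop_ratio relative to the previous sample (assumed baseline)."""
    drops = []
    for i in range(1, len(samples)):
        previous, current = samples[i - 1], samples[i]
        if previous > 0 and current <= previous * (1.0 - drop_ratio):
            drops.append(i)
    return drops


def grants_pending_alert(memory_grants_pending: int, threshold: int = 1) -> bool:
    """Rule 1: flag when Memory Grants Pending exceeds the threshold."""
    return memory_grants_pending > threshold


# Made-up page life expectancy readings in seconds, for illustration only.
readings = [3600, 3700, 1500, 1600, 700]
print(ple_drops(readings))
print(grants_pending_alert(3))
```

A drop flagged this way is a cue to look at what ran just before it, typically a large scan that could be avoided with an index.
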
Resource issue: IO
| Rule | Description | Value | Source | Problem Description |
|---|---|---|---|---|
| 1 | Avg. Disk sec/Read | >20 ms | Perfmon object PhysicalDisk | Reads should take 4-8 ms without any IO pressure. |
| 2 | Avg. Disk sec/Write | >20 ms | Perfmon object PhysicalDisk | Sequential writes can be as fast as 1 ms for the transaction log. |
| 3 | Big scans | >1 | Perfmon object SQL Server Access Methods | A missing index forces large scans, which flush the cache. |
| 4 | The top 2 wait types include any of the following: ASYNC_IO_COMPLETION, IO_COMPLETION, LOGMGR, WRITELOG, PAGEIOLATCH_x | Top 2 | sys.dm_os_wait_stats | If the top 2 wait_stats values include IO waits, there is an IO bottleneck. |

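Rule 4 is a ranking check: sort the wait types by accumulated wait time and see whether an IO-related type lands in the top two. Here is a minimal sketch, assuming a mapping of wait type to `wait_time_ms` has already been read from sys.dm_os_wait_stats; the function names and sample numbers are hypothetical. It also includes a helper for rules 1 and 2, since Perfmon reports Avg. Disk sec/Read and Avg. Disk sec/Write in seconds rather than milliseconds.

```python
from typing import Dict, List

# IO-related wait types from rule 4. ASYNC_IO_COMPLETION is the spelling
# that sys.dm_os_wait_stats uses.
IO_WAIT_TYPES = {"ASYNC_IO_COMPLETION", "IO_COMPLETION", "LOGMGR", "WRITELOG"}


def is_io_wait(wait_type: str) -> bool:
    """True for the listed IO waits and for any PAGEIOLATCH_x variant."""
    return wait_type in IO_WAIT_TYPES or wait_type.startswith("PAGEIOLATCH_")


def top_waits(wait_ms: Dict[str, int], n: int = 2) -> List[str]:
    """Wait types ordered by accumulated wait time, highest first."""
    return sorted(wait_ms, key=wait_ms.get, reverse=True)[:n]


def has_io_bottleneck(wait_ms: Dict[str, int]) -> bool:
    """Rule 4: an IO wait type is among the top 2 waits."""
    return any(is_io_wait(w) for w in top_waits(wait_ms))


def latency_exceeded(avg_disk_sec: float, threshold_ms: float = 20.0) -> bool:
    """Rules 1 and 2: convert the Perfmon value from seconds and compare."""
    return avg_disk_sec * 1000.0 > threshold_ms


# Made-up sample numbers, for illustration only.
waits = {"PAGEIOLATCH_SH": 5000, "CXPACKET": 7000, "SOS_SCHEDULER_YIELD": 1000}
print(top_waits(waits))
print(has_io_bottleneck(waits))
print(latency_exceeded(0.025))
```
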
Resource issue: Blocking
| Rule | Description | Value | Source | Problem Description |
|---|---|---|---|---|
| 1 | Block percentage | >2% | sys.dm_db_index_operational_stats | Frequency of blocks. |
| 2 | Blocked process report | 30 sec | sp_configure, Profiler | Report of blocked statements. |
| 3 | Average row lock waits | >100 ms | sys.dm_db_index_operational_stats | Duration of blocks. |
| 4 | The top 2 wait types include any of the following: LCK_M_BU, LCK_M_IS, LCK_M_IU, LCK_M_IX, LCK_M_RIn_NL, LCK_M_RIn_S, LCK_M_RIn_U, LCK_M_RIn_X, LCK_M_RS_S, LCK_M_RS_U, LCK_M_RX_S, LCK_M_RX_U, LCK_M_RX_X, LCK_M_S, LCK_M_SCH_M, LCK_M_SCH_S, LCK_M_SIU, LCK_M_SIX, LCK_M_U, LCK_M_UIX, LCK_M_X | Top 2 | sys.dm_os_wait_stats | If the top 2 wait_stats values include lock waits (LCK_M_x), there is a blocking bottleneck. Consider using row versioning to minimize shared locking blocks. |

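The block percentage and average row lock wait can be derived from the `row_lock_count`, `row_lock_wait_count`, and `row_lock_wait_in_ms` columns of sys.dm_db_index_operational_stats. The table does not spell out the formulas, so the sketch below assumes that block percentage means the share of row lock requests that had to wait, and that the average wait is taken per waiting request; the `IndexLockStats` type, the function names, and the sample numbers are hypothetical.

```python
from typing import NamedTuple


class IndexLockStats(NamedTuple):
    """Row lock columns from sys.dm_db_index_operational_stats."""
    row_lock_count: int
    row_lock_wait_count: int
    row_lock_wait_in_ms: int


def block_pct(stats: IndexLockStats) -> float:
    """Assumed definition: percent of row lock requests that had to wait."""
    if stats.row_lock_count == 0:
        return 0.0
    return 100.0 * stats.row_lock_wait_count / stats.row_lock_count


def avg_row_lock_wait_ms(stats: IndexLockStats) -> float:
    """Assumed definition: average wait in ms per row lock request that waited."""
    if stats.row_lock_wait_count == 0:
        return 0.0
    return stats.row_lock_wait_in_ms / stats.row_lock_wait_count


# Made-up sample numbers, for illustration only.
stats = IndexLockStats(row_lock_count=10000, row_lock_wait_count=300, row_lock_wait_in_ms=45000)
print(block_pct(stats))             # compare against the 2% value in rule 1
print(avg_row_lock_wait_ms(stats))  # compare against the 100 ms value in rule 3
```
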
In contrast to OLTP applications, reporting and relational data warehouse applications are characterized by a small number of large, varied transactions. These are frequently SELECT-intensive operations. The implications are significant for database design, resource usage, and system performance.
Reporting and data warehouse performance objectives are as follows:
- Reporting and relational data warehouse designs can have more indexes, because the cost of index maintenance is paid only once, during the batch update process.
- Plan reuse should generally be avoided. Reusing a plan may pick up one that was good for a different query (with a different data distribution) but is poor for the current one. For a large data warehouse query, the time spent generating a plan matters far less than having the right plan.
- Sorts can and should be minimized with correct index usage.
- Missing index situations should be investigated and corrected.
- Large IOs such as range scans benefit from on-disk contiguity. Index fragmentation should be monitored frequently and kept to a minimum with index rebuilds.
- Blocking is generally uncommon as most data warehouse transactions are read operations.
- Parallelism is generally desirable for data warehouse applications.