Common Scenarios to avoid with DataWarehousing

Database Design

Rule	Description	Value	Source	Problem Description
1	Excessive sorting and RID lookup operations should be reduced with covering indexes.		sys.dm_exec_sql_text sys.dm_exec_cached_plans	Large data warehouses can benefit from more indexes. Indexes can be used to cover queries and avoid sorting. The cost of index overhead is paid only when data is loaded.
2	Excessive fragmentation: avg_fragmentation_in_percent should be <25%	>25%	sys.dm_db_index_physical_stats	Reducing index fragmentation through index rebuilds can benefit big range scans, which are common in data warehouse and reporting scenarios.
3	Scans and ranges are common. Look for missing indexes.	>= 1	Perfmon object SQL Server Access Methods sys.dm_db_missing_index_group_stats sys.dm_db_missing_index_groups sys.dm_db_missing_index_details	A missing index forces large scans, which flush the cache.
4	Unused indexes should be avoided.	If an index is never used, it does not appear in the DMV sys.dm_db_index_usage_stats.	sys.dm_db_index_usage_stats	Index maintenance for unused indexes should be avoided.
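
As a rough illustration of rule 2, the sketch below filters index rows against the 25% fragmentation threshold. It assumes the rows have already been fetched from sys.dm_db_index_physical_stats into dictionaries; the key `index_name`, the function name, and the sample values are hypothetical and not part of the original rules.

```python
# Threshold from rule 2: fragmentation above 25% is considered excessive.
FRAGMENTATION_LIMIT_PERCENT = 25.0


def indexes_to_rebuild(rows, limit=FRAGMENTATION_LIMIT_PERCENT):
    """Return the names of indexes whose average fragmentation exceeds the limit.

    Each row is a dict with the hypothetical key 'index_name' and the DMV
    column 'avg_fragmentation_in_percent'.
    """
    return [
        row["index_name"]
        for row in rows
        if row["avg_fragmentation_in_percent"] > limit
    ]


# Made-up sample rows, for illustration only.
sample = [
    {"index_name": "IX_FactSales_DateKey", "avg_fragmentation_in_percent": 41.7},
    {"index_name": "IX_FactSales_ProductKey", "avg_fragmentation_in_percent": 3.2},
]

print(indexes_to_rebuild(sample))
```

Only the first sample index crosses the threshold, so it is the only rebuild candidate reported.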

Resource issue: CPU

Rule	Description	Value	Source	Problem Description
1	Signal waits	> 25%	sys.dm_os_wait_stats	Time in the runnable queue is pure CPU wait.
2	Avoid plan reuse	> 25%	Perfmon object SQL Server SQL Statistics	A data warehouse has fewer transactions than OLTP, each with significantly bigger IO. Therefore, having the correct plan is more important than reusing a plan. Unlike OLTP queries, data warehouse queries are not identical.
3	Parallelism: CXPACKET waits	<10%	sys.dm_os_wait_stats	Parallelism is desirable in data warehouse or reporting workloads.
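
The signal wait rule can be checked with simple arithmetic once the wait statistics are available. The sketch below assumes the rows come from sys.dm_os_wait_stats and that the percentage is computed as total signal_wait_time_ms over total wait_time_ms, which is a common way to express it but is an assumption here; the function name and sample numbers are hypothetical.

```python
def signal_wait_percent(rows):
    """Percentage of total wait time spent waiting for CPU (signal waits).

    Each row is a dict with the DMV columns 'wait_time_ms' and
    'signal_wait_time_ms'. Returns 0.0 when no wait time was recorded.
    """
    total_wait = sum(r["wait_time_ms"] for r in rows)
    total_signal = sum(r["signal_wait_time_ms"] for r in rows)
    if total_wait == 0:
        return 0.0
    return 100.0 * total_signal / total_wait


# Made-up sample rows, for illustration only.
sample = [
    {"wait_type": "SOS_SCHEDULER_YIELD", "wait_time_ms": 1000, "signal_wait_time_ms": 1000},
    {"wait_type": "CXPACKET", "wait_time_ms": 6000, "signal_wait_time_ms": 1500},
    {"wait_type": "PAGEIOLATCH_SH", "wait_time_ms": 3000, "signal_wait_time_ms": 500},
]

percent = signal_wait_percent(sample)
print(f"signal waits: {percent:.1f}% (rule threshold: > 25%)")
```

With these sample numbers the result is above 25%, which under rule 1 would point to CPU pressure.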

Resource issue: Memory

Rule	Description	Value	Source	Problem Description
1	Memory grants pending		Perfmon object SQL Server Memory Manager	Memory grant not available for query to run. Check for sufficient memory and page life expectancy.
2	Page life expectancy	Drops by 50%	Perfmon object SQL Server Buffer Manager	Page life expectancy is the average number of seconds a data page stays in cache. Low values could indicate a cache flush that is caused by a big read. Look for possible missing index.
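
The "drops by 50%" rule for page life expectancy implies comparing a current reading against a baseline. A minimal sketch of that comparison follows, assuming both values were sampled from the Buffer Manager counter in seconds; how the baseline is chosen is not specified in the rules, and the function name and numbers are hypothetical.

```python
def ple_dropped(baseline_seconds, current_seconds, drop_fraction=0.5):
    """Return True when page life expectancy fell by at least drop_fraction.

    baseline_seconds: a typical value observed under normal load (assumed).
    current_seconds:  the latest sampled value.
    """
    if baseline_seconds <= 0:
        raise ValueError("baseline_seconds must be positive")
    return current_seconds <= baseline_seconds * (1.0 - drop_fraction)


# Made-up readings, for illustration only.
print(ple_dropped(3000, 1200))  # fell to 40% of the baseline
print(ple_dropped(3000, 2000))  # fell to about 67% of the baseline
```

The first reading counts as a drop of more than 50% and the second does not.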

Resource issue: IO

Rule	Description	Value	Source	Problem Description
1	Avg. Disk sec/Read	>20 ms	Perfmon object Physical Disk	Reads should take 4-8 ms without any IO pressure.
2	Avg. Disk sec/Write	>20 ms	Perfmon object Physical Disk	Sequential writes can be as fast as 1 ms for the transaction log.
3	Big scans	>1	Perfmon object SQL Server Access Methods	A missing index forces large scans, which flush the cache.
4	If the top 2 values for wait stats are any of the following: ASYNC_IO_COMPLETION IO_COMPLETION LOGMGR WRITELOG PAGEIOLATCH_x	Top 2	sys.dm_os_wait_stats	If the top 2 wait stats values include IO waits, there is an IO bottleneck.
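
Rule 4 amounts to ranking wait types by accumulated wait time and checking whether an IO-related type is among the top two. The sketch below shows one way to do that, assuming rows from sys.dm_os_wait_stats have been loaded into dictionaries; ranking by wait_time_ms is an assumption, and the helper names and sample values are hypothetical.

```python
# IO-related wait types listed in rule 4 (PAGEIOLATCH_x matched by prefix).
IO_WAIT_NAMES = {"ASYNC_IO_COMPLETION", "IO_COMPLETION", "LOGMGR", "WRITELOG"}
IO_WAIT_PREFIX = "PAGEIOLATCH_"


def is_io_wait(wait_type):
    """Return True when the wait type is one of the IO waits from rule 4."""
    return wait_type in IO_WAIT_NAMES or wait_type.startswith(IO_WAIT_PREFIX)


def top_waits(rows, n=2):
    """Return the n wait types with the highest accumulated wait_time_ms."""
    ranked = sorted(rows, key=lambda r: r["wait_time_ms"], reverse=True)
    return [r["wait_type"] for r in ranked[:n]]


def has_io_bottleneck(rows):
    """Apply rule 4: an IO wait among the top 2 suggests an IO bottleneck."""
    return any(is_io_wait(w) for w in top_waits(rows))


# Made-up sample rows, for illustration only.
sample = [
    {"wait_type": "CXPACKET", "wait_time_ms": 7000},
    {"wait_type": "PAGEIOLATCH_SH", "wait_time_ms": 9000},
    {"wait_type": "LCK_M_S", "wait_time_ms": 500},
]

print(top_waits(sample), has_io_bottleneck(sample))
```

In practice, benign background wait types are usually filtered out before ranking; that filtering is omitted here to keep the sketch short.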

Resource issue: Blocking

Rule	Description	Value	Source	Problem Description
1	Block percentage	>2%	sys.dm_db_index_operational_stats	Frequency of blocks.
2	Blocked process report	30 sec	sp_configure, Profiler	Report of blocked statements.
3	Average row lock waits	>100 ms	sys.dm_db_index_operational_stats	Duration of blocks.
4	If the top 2 values for wait stats are any of the following: LCK_M_BU LCK_M_IS LCK_M_IU LCK_M_IX LCK_M_RIn_NL LCK_M_RIn_S LCK_M_RIn_U LCK_M_RIn_X LCK_M_RS_S LCK_M_RS_U LCK_M_RX_S LCK_M_RX_U LCK_M_RX_X LCK_M_S LCK_M_SCH_M LCK_M_SCH_S LCK_M_SIU LCK_M_SIX LCK_M_U LCK_M_UIX LCK_M_X	Top 2	sys.dm_os_wait_stats	If the top 2 wait stats values include lock waits, there is a blocking bottleneck. Consider using row versioning to minimize shared locking blocks.
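
Rules 1 and 3 can be derived from the row lock counters in sys.dm_db_index_operational_stats. The sketch below assumes block percentage means row_lock_wait_count divided by row_lock_count and average wait means row_lock_wait_in_ms divided by row_lock_wait_count; the rules do not spell out these formulas, so treat them as an assumption. The function name and sample numbers are hypothetical.

```python
def blocking_metrics(row_lock_count, row_lock_wait_count, row_lock_wait_in_ms):
    """Return (block_percent, avg_row_lock_wait_ms) for one index.

    The three arguments correspond to columns of
    sys.dm_db_index_operational_stats. Zero counts yield 0.0 rather than
    a division error.
    """
    block_percent = (
        100.0 * row_lock_wait_count / row_lock_count if row_lock_count else 0.0
    )
    avg_wait_ms = (
        row_lock_wait_in_ms / row_lock_wait_count if row_lock_wait_count else 0.0
    )
    return block_percent, avg_wait_ms


# Made-up counters, for illustration only.
block_percent, avg_wait_ms = blocking_metrics(200000, 5000, 750000)
print(f"block percentage: {block_percent:.1f}% (rule threshold: > 2%)")
print(f"average row lock wait: {avg_wait_ms:.0f} ms (rule threshold: > 100 ms)")
```

Both sample values exceed their thresholds, which would make this index a candidate for a closer look at blocking.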

In contrast to OLTP applications, reporting and relational data warehouse applications are characterized by a small number of large, dissimilar transactions. These are frequently SELECT-intensive operations. The implications are significant for database design, resource usage, and system performance.

Reporting and data warehouse performance objectives are as follows:

Reporting and relational data warehouse designs can have more indexes, because the cost of index maintenance is paid only once, during the batch update process.
Plan reuse should generally be avoided. A reused plan may have been good for some other query (with a different data distribution) but not for this one. The time taken to generate a plan for a large data warehouse query is not nearly as important as having the right plan.
Sorts can and should be minimized with correct index usage.
Missing index situations should be investigated and corrected.
Large IOs such as range scans benefit from on-disk contiguity. Index fragmentation should be monitored frequently and kept to a minimum with index rebuilds.
Blocking is generally uncommon as most data warehouse transactions are read operations.
Parallelism is generally desirable for data warehouse applications.
