Reference: http://blogs.msdn.com/b/felixmar/archive/2011/08/29/partitioning-amp-archiving-tables-in-sql-server-part-2-split-merge-and-switch-partitions.aspx

In the 1st part of this post, I explained how to create a partitioned table using a partition function as well as a partition schema.  Now I’ll continue talking about how to merge or split partitions changing the partition function and the partition schema and how to move data between partitions using alter table command.

Here´s where the importance of  the function definition makes sense, the way partitions split or merge depends of the RIGHT or LEFT specification. The easiest way you can understand how they work is looking at the following examples. I will use a partition function with  LEFT specification, look at the process and the result after splitting and merging the table:

I already created a database PartitionDBLeft, this is how the database looks like:

And these are the definitions for partition schema, partition function and orders table:

In this case, the orders table is stored in a partition schema (psOrderDateRange) as well as the index. Also note that my primary key contains the columns OrderID plus OrderDate which is the partition key (the column by which the table is partitioned). When you create a primary key, SQL creates an index, usually a clustered index which sorts the table by that column. If the column OrderDate is not part of the primary key, you won´t be able to create a clustered key because in partitioned tables the sort order is determined by the partition key so you would need to create a non-clustered key instead. Later in this post you’ll see how important is this.

The partition function is defined using LEFT, now let´s insert some orders for 2008:

Now, let’s query the partitions, using the queries explained the Part 1, as you see the records are stored inside partition 1 (FG1)

Repeat the procedure inserting 300 for 2009 and 300 records for 2010, (just change the value for @OrderDate), this is how it looks:

Maybe you would expect to have 300 records for each year (2008, 209 and 2010) but the results are different, this is due to the function definition. When using LEFT, each value defined in the function corresponds to the UPPER limit of each partition which you can see in  the the column Max Value in the last query. On the other hand if you had use RIGHT instead of LEFT, the result would be the expected. By now, I will drop the table as well as the schema and the function and will define the function again this time using a different value for each range:

The schema will remain the same as well as the table, this time using the new partition function. After inserting the same records this is how it looks:

In fact, you could insert the dates ‘2008/12/31, 2009/12/31 and 2010/12/31 and those would be the limit for each partition:

Split a partition

According to the partition schema definition, If you insert a record for 2011, that record will be inserted in partition 4 which is the PRIMARY partition. Suppose you need to insert the orders from 2011 in a new partition (inside a new filegroup, let’s say FG4).  First, you have to add a new filegroup (FG4) and a new partition (an .NDF file) inside FG4:

Now you can modify your partition function to SPLIT the last range (which currently holds all records greater than 2010/12/31) to create a new range for orders from 01/01/2011 to 31/12/2010. Before doing so, you need to alter the partition schema to include the new filegroup (FG4) to let the partition function map the new range data to this filegroup, otherwise you will get this error:

The following script will alter the partition schema making FG4 the next available filegroup:

Now let´s alter the partition function to include the new range:

Now, check how the partition schema and partition function look after the change, you can use Management Studio to create the scripts:

This is the result:

This is the expected behavior in most of cases when you need to add a new partition in a rotating base of time to hold new data, in this scenario I would do it in december to prepare my orders table for the next year data. I can always use PRIMARY as the last partition for future data since I´m sure there will not be data there, however the partition function requires an additional range for all data that exceeds the last range value.

If I insert data between 2011/01/01 and 2011/12/31 that would be inserted in FG4:

For the next year, all you have to do is repeat the process, I mean: add FG5, alter partition function marking FG5 the next used filegroup and alter the partition function splitting the last range including 2012/12/31 and voilà, you have a new partition ready to hold 2012 records!

Merge Partitions

Now let´s talk about merging partitions. After some years, you decide all those partitions are difficult to maintain, it would be better to have all data for the first n years in a single partition. Let’s suppose you want to join all orders from 2008/01/01 to 20010/12/31 in a single file. Well, you can merge those 3 partitions but just 2 at time.

Merging is as simple as alter the partition function specifying the range that will be the new upper limit for the joined ranges:

Lets firs join FG1 and FG2 and see the result:

If you query the partitions, this is what you will see:

Note that all records from FG1 moved to FG2, now you only have 5 partitions instead of the 6 originals. and FG1 does not contains any data.

Now let´s join FG2 and FG3:

Let´s look what’s inside each filegroup:

Again, data from FG2 moved to FG3.

You can also check the reports included in management studio:

What about FG1 and FG2?. In fact, this filegroups are empty, you could use for future data, for other kind of data or you could remove them.

Switch Partitions

Now suppose you need to archive all data from 2008/01/01 to 2010/12/31 since this is historical data. At the current time, all this data is stored in FG3 which is now partition No. 1. By the way, If this data will never change anymore, you could mark this filegroup as READONLY:

This is not necessary to archive data but is recommended to protect historical information  against writing and to simplify backup strategy which I’ll discuss in another post.

The first thing you must be aware of is index aligning. Remember: if the partition key (OrderDate column) is part of the primary Key (OrderID + OrderDate) the index is aligned with the data. This means that each corresponding portion of the index is allocated in its corresponding partition because it uses the same partition schema and you can have a clustered index. If this is not the case (suppose your primary key is just OrderID), you can´t have a clustered index in the primary key because the data needs to be sorted using the partition key (which is Order Date) and you would have to create a non-clustered index in the primary key and all the index should be stored in a different partition (usually the PRIMARY filegroup).The main problem with this design is that you would need to drop the index before switching the partitions and then recreate the index after the switch operation but also, the process to move data from 1 partition to another (switch) would be slower. Fortunately our Orders table has a aligned index which you can see in the index properties window:

You can see the index is stored in the partition schema (psOrderDateRange) so it is aligned with the data.

In this scenario, I will move data from FG3 to a different table (Orders_History) which can be in the same or in a different database, I just want to archive those records and free space from my current database. Instead of write a delete query which can be slow and use a lot of transaction log space, I will simply move FG3 to a temporary table inside the same filegroup which will be very fast since only metadata will be moved and then I will transfer those records to the historical table which can be stored in the same or in a different database using an INSERT SELECT clause. Of course this will take some time but I can move it in a simpler way and without locking my production table or affecting my users.

The process is:

  1. Create a temporary  table with the same structure from the original table (orders)  in the SAME FILEGROUP. This is necessary since you can only switch data from partitions located in the same filegroup.
  2. Create the destination table (orders_history) with the same structure as the original and the temporary.
  3. Switch the desired partition using ALTER TABLE clause
  4. insert the data from the temporary table to the destination table using INSERT SELECT clause.
  5. Drop the temporary table.

Let´s see:

This is my temporary table created on FG3:

This is my Historical table created in a different database, note I excluded INDENTITY since this value will exists in the source table.

The data I want to move lives in FG3 which is now partition 1. I just simply alter orders table this way:

What is surprising is that it doesn't matter how big partition 1 is, the data will be switched very fast!

Before moving the data to the history table, let´s query the partitions again:

FG3 has not any records.

All the data is now in the Orders_Temp table:

Now let´s move the data to Orders_History:

Finally, drop Orders_Temp table:

However you cannot drop FG3 because one range of the orders table is allocated in this filegroup, although you know there is not data there. However you can merge FG3 and FG4 this will release FG3 if you want to drop it. in the last query you can se FG3 boundary is 2010-12-31, so you can merge using this value:

If you query the partitions, FG3 is not used anymore:

Now you can remove FG3 if you want.

Conclusion

You can create a partitioning strategy in very large tables in order to simplify administration, enhance performance, create an archiving strategy to purge out data from the database and manage partitions merging or splitting them to provision space for future data. You must take care of table design and choose the correct partition key as well as index. If you make a wrong decision  it could be very difficult to manage future data. Try to reproduce the same examples in this post but now using RIGHT instead of LEFT and you´ll see what I mean. Also, try to create a table without the OrderDate column in the primary key to see how the index is built.

I hope you find this information helpful, If that´s the case leave me a message in the blog.

Thanks.

 
 
 
 
 
 

Partitioning & Archiving tables in SQL Server (Part 2: Split, Merge and Switch partitions)的更多相关文章

  1. Partitioning & Archiving tables in SQL Server (Part 1: The basics)

    Reference: http://blogs.msdn.com/b/felixmar/archive/2011/02/14/partitioning-amp-archiving-tables-in- ...

  2. Part 17 Temporary tables in SQL Server

    Temporary tables in SQL Server

  3. SQL Server 2008中的MERGE(不仅仅是合并)

    SQL Server 2008中的MERGE语句能做很多事情,它的功能是根据源表对目标表执行插入.更新或删除操作.最典型的应用就是进行两个表的同步. 下面通过一个简单示例来演示MERGE语句的使用方法 ...

  4. SQL Server 2008中的MERGE(数据同步)

    OK,就像标题呈现的一样,SQL Server 2008中的MERGE语句能做很多事情,它的功能是根据源表对目标表执行插入.更新或删除操作.最典型的应用就是进行两个表的同步. 下面通过一个简单示例来演 ...

  5. Sql Server字符串拆分(Split)方法汇总

    详细链接:https://shop499704308.taobao.com/?spm=a1z38n.10677092.card.11.594c1debsAGeak--方法0:动态SQL法 declar ...

  6. SQL SERVER 自定义函数 split

    Create function [dbo].[split] ( ), ) ) )) as begin declare @i int set @SourceSql=rtrim(ltrim(@Source ...

  7. sql server 存储过程分隔split

    CREATE FUNCTION [dbo].[F_split] ( ), ) ) , ), f )) --实现split功能 的函数 AS BEGIN DECLARE @i INT SET @Sour ...

  8. sql server:compare data from two tables

    --Comparing data between two tables in SQL Server --Create two Tables-- CREATE TABLE TableA(ID Int, ...

  9. Microsoft SQL Server Trace Flags

    Complete list of Microsoft SQL Server trace flags (585 trace flags) REMEMBER: Be extremely careful w ...

随机推荐

  1. 【 D3.js 入门系列 --- 6 】 如何让图表动起来

    [5.1]节中制作了一个比较完善的图表,但它是静态的,想做出它的动态效果吗?在D3中只需要短短的几行代码即可. 这一节将涉及4个函数的使用. 1.transition() 启动转变效果只需要添加这个即 ...

  2. UVa 673 平衡的括号

    题意:给出包含"()"和"[]"的括号序列,判断是否合法. 用栈来完成,注意空串就行. #include<iostream> #include< ...

  3. eclipse连接mysql,插入数据时乱码

    问题:如果eclipse中项目的编码方式为utf-8 插入数据后,在数据库中查看后,汉字出现乱码情况 解决方法: 1.在获取连接的时候将conn = DriverManager.getConnecti ...

  4. Singly Linked List

    Singly Linked List Singly linked list storage structure:typedef struct Node{ ElemType data; struct N ...

  5. Invert Binary Tree

    Invert a binary tree: 4 / \ 2 7 / \ / \ 1 3 6 9 to 4 / \ 7 2 / \ / \ 9 6 3 1 简单递归实现,调换左右子树,子树的所有子树结构 ...

  6. JavaScript 的循环语句语法摘要

     if条件语句语法: if(condition){ statements; } 理解:圆括号里的是条件参数  ,花括号里的为执行的语句. 示例代码:if(1>2){ alert("Th ...

  7. IntelliJ IDEA 在网页修改数据,但是在浏览器刷新的时候,不能读取到修改之后的数据

    使用IntelliJ IDEA 在网页修改数据,但是在浏览器刷新的时候,不能读取到修改之后的数据? 解决办法:tomcat配置中,On frame deactivation属性选择Update cla ...

  8. 043. asp.net主题之一初识主题和皮肤

    主题由外观, 级联样式表(CSS), 图像和其它资源组成, 主题中至少包含外观. 他是在网站或Web服务器上的特殊目录中定义的: 外观: 外观文件是主题的核心内容, 用于定义页面中服务器控件的外观, ...

  9. [转]js来弹出窗口的详细说明

    1.警告对话框 <script> alert("警告文字") </script> 2.确认对话框 <script> confirm(" ...

  10. varchar2_to_blob,应用向数据库更新LOB字段时的超时问题

    将字符串转换为BLOB类型数据,写入服务器. 1,首先利用to_clob函数把varchar2字段转成 clob字段. 2  利用c2b上面函数将clob转成blob. 即: c2b(to_clob( ...