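SSAS: Automatically Creating Partitions via ETL

I. I won't dwell on the benefits of dynamic partitioning. As data accumulates over time, keeping an entire measure group in a single partition becomes impractical because processing gets very slow. How to add and process partitions dynamically is a headache for many BI engineers who are new to the field, so here is my experience.

II. First, the overall flow. The work is done by an SSIS package. This article partitions by month, but you can define the partitioning rule to suit your own needs.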

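All variables used by the package. Judging from the steps and the script below, they are: Partitions, DataSoureID, CubeName, CubeID, MeasureGroup, MeasureGroupID, Partition, SQL, MinDateKey, MaxDateKey, DatabaseID, IsNetePresent and Xmla_Script.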

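III. The four steps of the flow are explained in turn below.

1. Get all partitions:

① The main settings are as shown in the screenshot.

② The result set must be assigned to the variable Partitions.

③ The SQLStatement is as follows (its columns supply the parameter values needed by the partition-creation statement):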

SELECT 'RmyyHisDW'                                                     AS DataSoureID,    -- data source ID
       'RmyyMZ'                                                        AS CubeName,       -- cube the partition belongs to
       'RmyyMZ'                                                        AS CubeID,
       'Fact Mz Visit Table'                                           AS MeasureGroup,   -- target measure group
       'Fact Mz Visit Table'                                           AS MeasureGroupID,
       'Fact Mz Visit Table' + Cast(MonthInfo.YearMonth AS VARCHAR(6)) AS [Partition],    -- partition name = measure group name + yyyyMM
       'SELECT [dbo].[fact_mz_visit_table].[patient_id],
       [dbo].[fact_mz_visit_table].[times],
       [dbo].[fact_mz_visit_table].[name],
       [dbo].[fact_mz_visit_table].[age],
       [dbo].[fact_mz_visit_table].[ampm],
       [dbo].[fact_mz_visit_table].[charge_type],
       [dbo].[fact_mz_visit_table].[clinic_type],
       [dbo].[fact_mz_visit_table].[contract_code],
       [dbo].[fact_mz_visit_table].[visit_dept],
       [dbo].[fact_mz_visit_table].[doctor_code],
       [dbo].[fact_mz_visit_table].[gh_date],
       [dbo].[fact_mz_visit_table].[gh_date_time],
       [dbo].[fact_mz_visit_table].[gh_opera],
       [dbo].[fact_mz_visit_table].[haoming_code],
       [dbo].[fact_mz_visit_table].[icd_code],
       [dbo].[fact_mz_visit_table].[icd_code1],
       [dbo].[fact_mz_visit_table].[icd_code2],
       [dbo].[fact_mz_visit_table].[icd_code3],
       [dbo].[fact_mz_visit_table].[response_type],
       [dbo].[fact_mz_visit_table].[visit_date],
       [dbo].[fact_mz_visit_table].[visit_date_time],
       [dbo].[fact_mz_visit_table].[visit_flag]
FROM   [dbo].[fact_mz_visit_table]
WHERE  visit_flag &lt;&gt; 9 and  where_clause'                                         AS [SQL],          -- source query of the partition
       Cast(MinDateKey AS VARCHAR(8))                                  AS MinDateKey,     -- smallest datekey of the month
       Cast(MaxDateKey AS VARCHAR(8))                                  AS MaxDateKey      -- largest datekey of the month
FROM   (SELECT t1.YearMonth,
               (SELECT Min(datekey)
                FROM   dim_date t2
                WHERE  CONVERT(VARCHAR(6), t2.Date, 112) = t1.YearMonth) AS MinDateKey,
               (SELECT Max(datekey)
                FROM   dim_date t2
                WHERE  CONVERT(VARCHAR(6), t2.Date, 112) = t1.YearMonth) AS MaxDateKey
        FROM   (SELECT DISTINCT CONVERT(VARCHAR(6), Date, 112) AS YearMonth
                FROM   dim_date) AS t1) MonthInfo
-- Return only the months that actually contain fact rows.
WHERE  EXISTS(SELECT *
              FROM   fact_mz_visit_table
              WHERE  visit_date BETWEEN MonthInfo.MinDateKey AND MonthInfo.MaxDateKey)

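Note: the SQL column ends with the placeholder where_clause. The C# script in the "check partition" Script Task replaces it with the real condition, which restricts the query to the range between MinDateKey and MaxDateKey; that range is what separates one partition from another. The comparison operator in the query is written as XML entities (&lt;&gt;) because the query text ends up inside the XMLA QueryDefinition element.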
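To make the substitution concrete, here is a minimal C# sketch of how a month could be turned into a partition name and a filter. It is not the package's own logic: the post reads the key range from dim_date, whereas this sketch computes calendar boundaries and assumes datekey is an integer in yyyyMMdd form (suggested by the VARCHAR(8) cast, but not stated in the post). The class and method names are hypothetical, and the plain operators used here would still have to be XML-escaped before being embedded in XMLA.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration; not part of the original package.
public static class PartitionFilterSketch
{
    // Partition name = measure group name + yyyyMM, as in the query above.
    public static string PartitionName(string measureGroup, int year, int month)
    {
        return measureGroup
            + new DateTime(year, month, 1).ToString("yyyyMM", CultureInfo.InvariantCulture);
    }

    // Assumption: date keys are yyyyMMdd integers, so the month's range is
    // the first calendar day through the last calendar day.
    public static string WhereClause(int year, int month)
    {
        DateTime first = new DateTime(year, month, 1);
        DateTime last = first.AddMonths(1).AddDays(-1);
        return "visit_date >= " + first.ToString("yyyyMMdd", CultureInfo.InvariantCulture)
             + " and visit_date <= " + last.ToString("yyyyMMdd", CultureInfo.InvariantCulture);
    }

    // Replaces the where_clause placeholder in the query template.
    public static string Apply(string template, int year, int month)
    {
        return template.Replace("where_clause", WhereClause(year, month));
    }

    public static void Main()
    {
        string template =
            "SELECT * FROM fact_mz_visit_table WHERE visit_flag <> 9 and where_clause";
        Console.WriteLine(PartitionName("Fact Mz Visit Table", 2015, 2));
        Console.WriteLine(Apply(template, 2015, 2));
    }
}
```

Keeping the placeholder in the template means the same query text can be reused for every month; only the two key values change per partition.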

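④ The result of running the query in step ③:

2. Foreach Loop Container (it iterates over the rows returned by the SQL statement above). The relevant settings are shown in the screenshot.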

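Note: map the variables in the same order as the columns of the SQL statement. The mapping index is zero-based, so index 0 is DataSoureID, index 1 is CubeName, and so on through index 8, MaxDateKey.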

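3. Check whether the partition exists. Using the values passed out of step 2, the script checks whether the cube already contains the partition. If it does, nothing is created; if it does not, the partition is created by an Analysis Services Execute DDL Task.

① The detailed settings are as follows:

② Click Edit Script.

The script needs a reference to AMO (Microsoft.AnalysisServices).

③ The main code is: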

/*
   Microsoft SQL Server Integration Services Script Task
   Write scripts using Microsoft Visual C# 2008.
   The ScriptMain is the entry point class of the script.
*/
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using Microsoft.AnalysisServices;

namespace ST_f33f263fa3864817a3291fc4715774d3.csproj
{
    [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {
        #region VSTA generated code
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion

        /*
        The execution engine calls this method when the task executes.
        To access the object model, use the Dts property. Connections, variables, events,
        and logging features are available as members of the Dts property as shown in the following examples.

        To reference a variable, call Dts.Variables["MyCaseSensitiveVariableName"].Value;
        To post a log entry, call Dts.Log("This is my log text", 999, null);
        To fire an event, call Dts.Events.FireInformation(99, "test", "hit the help message", "", 0, true);

        To use the connections collection use something like the following:
        ConnectionManager cm = Dts.Connections.Add("OLEDB");
        cm.ConnectionString = "Data Source=localhost;Initial Catalog=AdventureWorks;Provider=SQLNCLI10;Integrated Security=SSPI;Auto Translate=False;";

        Before returning from this method, set the value of Dts.TaskResult to indicate success or failure.

        To open Help, press F1.
    */

        public void Main()
        {
            // Read the values supplied by the Foreach Loop variable mappings.
            String sPartition = (String)Dts.Variables["Partition"].Value;
            String sCubeName = (String)Dts.Variables["CubeName"].Value;
            String sMeasureGroup = (String)Dts.Variables["MeasureGroup"].Value;
            String sServer = "localhost"; // SSAS instance; change this for your environment
            String sDataBaseID = (String)Dts.Variables["DatabaseID"].Value;
            String sCubeID = (String)Dts.Variables["CubeID"].Value;
            String sMeasureGroupID = (String)Dts.Variables["MeasureGroupID"].Value;
            String sDataSoureID = (String)Dts.Variables["DataSoureID"].Value;
            String sSQL = (String)Dts.Variables["SQL"].Value;
            String sMaxDateKey = (String)Dts.Variables["MaxDateKey"].Value;
            String sMinDateKey = (String)Dts.Variables["MinDateKey"].Value;

            // Replace the placeholder with this partition's date-key range. The operators are
            // written as XML entities because the query is embedded in the XMLA script below.
            string aSql = sSQL.Replace("where_clause", "visit_date &gt;=" + sMinDateKey + " and visit_date &lt;=" + sMaxDateKey);

            Microsoft.AnalysisServices.Server aServer = new Server();
            try
            {
                aServer.Connect(sServer);
                Microsoft.AnalysisServices.Database aDatabase = aServer.Databases.FindByName(sDataBaseID);
                Microsoft.AnalysisServices.Cube aCube = aDatabase.Cubes.FindByName(sCubeName);
                Microsoft.AnalysisServices.MeasureGroup aMeasureGroup = aCube.MeasureGroups.FindByName(sMeasureGroup);

                // Check whether the partition already exists. Contains looks the partition up
                // by ID, which the XMLA below sets to the same value as the name.
                if (aMeasureGroup.Partitions.Contains(sPartition))
                {
                    // Already there: nothing to create.
                    Dts.Variables["IsNetePresent"].Value = false;
                    Dts.Variables["Xmla_Script"].Value = "";
                }
                else
                {
                    // Missing: hand the Create script to the Execute DDL Task.
                    Dts.Variables["IsNetePresent"].Value = true;
                    Dts.Variables["Xmla_Script"].Value =
                        "<Create xmlns=\"http://schemas.microsoft.com/analysisservices/2003/engine\">"
                        + "<ParentObject>"
                        + "<DatabaseID>" + sDataBaseID + "</DatabaseID>"
                        + "<CubeID>" + sCubeID + "</CubeID>"
                        + "<MeasureGroupID>" + sMeasureGroupID + "</MeasureGroupID>"
                        + "</ParentObject>"
                        + "<ObjectDefinition>"
                        + "<Partition xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\" "
                        + "xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" "
                        + "xmlns:ddl2=\"http://schemas.microsoft.com/analysisservices/2003/engine/2\" "
                        + "xmlns:ddl2_2=\"http://schemas.microsoft.com/analysisservices/2003/engine/2/2\" "
                        + "xmlns:ddl100_100=\"http://schemas.microsoft.com/analysisservices/2008/engine/100/100\" "
                        + "xmlns:ddl200=\"http://schemas.microsoft.com/analysisservices/2010/engine/200\" "
                        + "xmlns:ddl200_200=\"http://schemas.microsoft.com/analysisservices/2010/engine/200/200\">"
                        + "<ID>" + sPartition + "</ID>"
                        + "<Name>" + sPartition + "</Name>"
                        + "<Source xsi:type=\"QueryBinding\">"
                        + "<DataSourceID>" + sDataSoureID + "</DataSourceID>"
                        + "<QueryDefinition>" + aSql + "</QueryDefinition>"
                        + "</Source>"
                        + "<StorageMode>Molap</StorageMode> <ProcessingMode>Regular</ProcessingMode>"
                        + "<ProactiveCaching> <SilenceInterval>-PT1S</SilenceInterval> <Latency>-PT1S</Latency> <SilenceOverrideInterval>-PT1S</SilenceOverrideInterval> <ForceRebuildInterval>-PT1S</ForceRebuildInterval>"
                        + "<Source xsi:type=\"ProactiveCachingInheritedBinding\" /> </ProactiveCaching>"
                        + "</Partition>"
                        + "</ObjectDefinition>"
                        + "</Create>";
                }
                Dts.TaskResult = (int)ScriptResults.Success;
            }
            finally
            {
                // Release the AMO connection on every loop iteration.
                if (aServer.Connected)
                {
                    aServer.Disconnect();
                }
            }
        }
    }
}
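Because the generated SQL is concatenated into an XML document, any <, > or & in it has to be escaped; the post handles this by writing the entities by hand. An alternative, sketched below under the assumption that you would rather keep plain operators in the SQL, is to escape the text at the point where it is embedded, using the .NET Framework method System.Security.SecurityElement.Escape. The class and method names are hypothetical, and this is not how the original package does it.

```csharp
using System;
using System.Security;

// Hypothetical helper for illustration; the original script concatenates
// a query that already contains hand-written entities.
public static class XmlaEscapeSketch
{
    // Wraps a plain SQL string in a QueryDefinition element, escaping the
    // characters that are not allowed in XML text (<, >, &, quotes).
    public static string QueryDefinition(string sql)
    {
        return "<QueryDefinition>" + SecurityElement.Escape(sql) + "</QueryDefinition>";
    }

    public static void Main()
    {
        Console.WriteLine(QueryDefinition("SELECT 1 WHERE a <> 9 and b >= 20150201"));
        // prints: <QueryDefinition>SELECT 1 WHERE a &lt;&gt; 9 and b &gt;= 20150201</QueryDefinition>
    }
}
```

Use one approach or the other: escaping a query that already contains entities would double-escape the ampersands.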

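④ Decide whether to run the next step (the script sets IsNetePresent to true only when the partition does not exist yet):

4. Create the partition if it does not exist (this task executes the Xmla_Script passed from step 3). The detailed settings are as follows:

5. Run the package and check the result: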
