MDX: Non Empty vs. NonEmpty

Written by Jason Thomas   
Friday, 07 May 2010 00:44

Reposted from Jason Thomas's blog with the author's permission.

Over the last few months, recession worries have eased, and it is now common to see the Talent Acquisition folks at our company chasing us techies at odd hours to conduct interviews for lateral hires. Spending a beautiful Saturday morning and afternoon interviewing is not my cup of tea, but their persistence paid off and I finally agreed to do it. That is when I thought of giving my readers a sneak peek at my list of interview questions, one per post.

One of my favourite questions in MDX is the difference between Non Empty and NonEmpty, because even though many people use them daily to remove NULLs from their queries, very few understand how they work. More than once, I have even got answers like “there is a space between Non and Empty, that is the difference”. The objective of this post is to clearly differentiate between the two.

Let us say my initial query is

SELECT  
  { 
    [Measures].[Hits] 
   ,[Measures].[Subscribers] 
   ,[Measures].[Spam] 
  } ON COLUMNS 
,{ 
    [Geography].[Country].Children 
  } ON ROWS 
FROM [Blog Statistics];

This will give the following output

NON EMPTY

NON EMPTY is a keyword placed before the set that defines an axis, and it removes from that axis the tuples whose cells are all NULL. Let us see what happens when we add NON EMPTY on the rows axis.

SELECT  
  { 
    [Measures].[Hits] 
   ,[Measures].[Subscribers] 
   ,[Measures].[Spam] 
  } ON COLUMNS 
,NON EMPTY  
    { 
      [Geography].[Country].Children 
    } ON ROWS 
FROM [Blog Statistics];

The output is shown below

You will notice that Chile (CL) has been filtered out, while rows like UK, Canada, etc. are still there even though they have NULLs for some of the measures. In short, only the rows that are NULL for all the members of the set defined on the columns axis are filtered out. This is because NON EMPTY works at the top level of the query. Internally, the sets defined for the axes are generated first, and then the tuples whose cells are all NULL are removed. Now that we know how NON EMPTY works, it shouldn’t be hard for us to predict the output of the query below:

SELECT  
  NON EMPTY  
    { 
      [Measures].[Hits] 
     ,[Measures].[Subscribers] 
     ,[Measures].[Spam] 
    } ON COLUMNS 
,{ 
    [Geography].[Country].Children 
  } ON ROWS 
FROM [Blog Statistics];

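The output is shown below. As a reader pointed out in the comments, the Spam column disappears while Chile remains as a row of empty cells. NON EMPTY now acts on the columns axis, so it removes only the measures that are NULL for every country (which Spam must be in this data) and leaves the rows axis untouched.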
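As a small extension of the two queries above, NON EMPTY can also be placed on both axes at once. The sketch below reuses the [Blog Statistics] cube and the names from this post. Its output is not shown here, but by the rule described above it should drop both the rows and the columns whose cells are all NULL.

```mdx
-- NON EMPTY on both axes: each axis is built first, then the tuples
-- that are NULL against every tuple of the other axis are removed.
SELECT  
  NON EMPTY  
    { 
      [Measures].[Hits] 
     ,[Measures].[Subscribers] 
     ,[Measures].[Spam] 
    } ON COLUMNS 
,NON EMPTY  
    { 
      [Geography].[Country].Children 
    } ON ROWS 
FROM [Blog Statistics];
```

Rows such as UK or Canada, which are NULL for only some of the measures, would still be returned, because NON EMPTY removes a row only when every cell in it is empty.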

NONEMPTY()

The NonEmpty() function returns the set of tuples that are not empty from a specified set, based on the cross product of the specified set with a second set. Suppose we want to see all the measures for the countries that have a non-NULL value for Subscribers:

SELECT  
  { 
    [Measures].[Hits] 
   ,[Measures].[Subscribers] 
   ,[Measures].[Spam] 
  } ON COLUMNS 
,{ 
    NonEmpty 
    ( 
      [Geography].[Country].Children 
     ,[Measures].[Subscribers] 
    ) 
  } ON ROWS 
FROM [Blog Statistics];

This will give the following output

As you can see, the NonEmpty function keeps only the countries that have a non-NULL value for Subscribers, and the query then displays all the measures defined on the columns axis for them. Internally, NonEmpty is evaluated when the set defining the axis is evaluated, so at that point there is no context from the other axes. This is better understood from the following example.

Now, we write the below query

SELECT  
  {[Date].[Month].[March]} ON COLUMNS 
,{ 
    [Geography].[Country].Children 
  } ON ROWS 
FROM [Blog Statistics] 
WHERE  
  [Measures].[Hits];

Output is given below

Think for a while and predict which rows would be returned when the NonEmpty function is applied on the rows:

SELECT  
  {[Date].[Month].[March]} ON COLUMNS 
,{ 
    NonEmpty([Geography].[Country].Children) 
  } ON ROWS 
FROM [Blog Statistics] 
WHERE  
  [Measures].[Hits];

If you guessed just IN, US, GB and AU, please go back and read once again. If you replied all rows except Chile, full marks to you, you have been an attentive reader. The reason is that NonEmpty is evaluated when the set defining the axis is evaluated (here, Country), and at that point each country is checked against the default member of the Date dimension (which would generally be the All member), not against March. As you can see, we already have values for CA and AP in other months, and hence they are not filtered out.
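If the intent really is to keep only the countries that have hits in March, there are two ways to say so. The sketch below shows both, reusing the cube and member names from the queries above. It is an illustration, not output-verified against the author's cube. The first variant passes an explicit second set to NonEmpty, and the second relies on NON EMPTY, which does take the columns axis into account.

```mdx
-- Variant 1: pass an explicit second set, so emptiness is tested
-- against (March, Hits) instead of the default Date member.
SELECT  
  {[Date].[Month].[March]} ON COLUMNS 
,{ 
    NonEmpty 
    ( 
      [Geography].[Country].Children 
     ,{([Date].[Month].[March], [Measures].[Hits])} 
    ) 
  } ON ROWS 
FROM [Blog Statistics] 
WHERE  
  [Measures].[Hits];

-- Variant 2: NON EMPTY runs after both axes are built, so a country
-- is dropped when its only cell (March) is NULL.
SELECT  
  {[Date].[Month].[March]} ON COLUMNS 
,NON EMPTY  
    { 
      [Geography].[Country].Children 
    } ON ROWS 
FROM [Blog Statistics] 
WHERE  
  [Measures].[Hits];
```

With the data described in this post, both variants should leave only the countries that have March values, since countries like CA and AP have hits only in other months.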

Optimizing Non Empty by using NonEmpty

OK, now you know how Non Empty and NonEmpty work internally, and we can apply this knowledge to optimize our queries. Suppose there is some complex logic in our axes, like finding all the countries that have more than 30 hits in any month. The query is given below:

SELECT  
  {[Measures].[Hits]} ON COLUMNS 
,{ 
    Filter 
    ( 
        [Geography].[Country].Children 
      *  
        [Date].[Month].Children 
     , 
      [Measures].[Hits] > 30 
    ) 
  } ON ROWS 
FROM [Blog Statistics];

Now suppose my time dimension holds 10 years of data, which means around 120 (10*12) members for the month attribute, and my country attribute has, let’s say, 100 members. Even though I have hits for just 3 months and 10 countries, the Filter function has to go through every combination of country and month (120*100 = 12,000 combinations). Instead, we can apply the NonEmpty function first and bring the combinations down to at most 30 (3 months*10 countries) with the query below:

SELECT  
  {[Measures].[Hits]} ON COLUMNS 
,{ 
    Filter 
    ( 
      NonEmpty 
      ( 
          [Geography].[Country].Children 
        *  
          [Date].[Month].Children 
       ,[Measures].[Hits] 
      ) 
     , 
      [Measures].[Hits] > 30 
    ) 
  } ON ROWS 
FROM [Blog Statistics];
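One way to check how much NonEmpty shrinks the set that Filter has to scan is to count the tuples before and after. The sketch below is only an illustration: the two calculated measures, [All Combinations] and [NonEmpty Combinations], are hypothetical names introduced here and are not part of the original cube.

```mdx
-- Compare the size of the full cross join with the size of the
-- NonEmpty-reduced set (both calculated measure names are hypothetical).
WITH  
  MEMBER [Measures].[All Combinations] AS 
    Count 
    ( 
        [Geography].[Country].Children 
      *  
        [Date].[Month].Children 
    ) 
  MEMBER [Measures].[NonEmpty Combinations] AS 
    Count 
    ( 
      NonEmpty 
      ( 
          [Geography].[Country].Children 
        *  
          [Date].[Month].Children 
       ,[Measures].[Hits] 
      ) 
    ) 
SELECT  
  { 
    [Measures].[All Combinations] 
   ,[Measures].[NonEmpty Combinations] 
  } ON COLUMNS 
FROM [Blog Statistics];
```

With the sizes assumed in this post, the first number would be around 12,000 and the second no more than 30. The gap between the two is the work that Filter is spared.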


 

Jason has been working with Microsoft BI tools since he joined the IT industry in 2006. Since then he has worked across SSAS, SSRS and SSIS for a large number of clients. He is currently working for Mariner and is based out of Charlotte, NC. His personal blog can be found at http://road-blogs.blogspot.com


Comments (10)

10. Wednesday, 22 January 2014 11:30
Rajendra

Nice explanation

9. Thursday, 02 January 2014 18:56
AVANADE Guy

Agreed. Nice explanation. The impact on performance is extremely important when work with attributes/levels with thousands of members. I recently had a NON EMPTY query that was running around 19 minutes tuned to under 10 seconds with NONEMPTY.

8. Thursday, 03 January 2013 05:00
Preeti

Nice explanation

7. Tuesday, 06 March 2012 10:49
Sree1234

Very Usefull ....Here we have one more like 'non empty behavior' it is good to have example for 'non empty behavior'.

6. Sunday, 21 August 2011 10:16
Soniya

Very informative ... thanks

5. Tuesday, 09 August 2011 13:04
Sudhanshu Sekhar Padhan

One helpful article to avoid confusion between Non Empty and Nonempty. Thanks

4. Tuesday, 16 November 2010 12:41
165083

Could you provide information on how to implement the non empty in cube design

3. Tuesday, 16 November 2010 09:21
roopesh babu v

kudos .. really a nice article .. thx for it ..

2. Friday, 14 May 2010 13:39
Jason Tom Thomas

Could you please let me know which example you are referring to, maybe I could post the explanation in the comments section itself.
If you are speaking of the image with mouseover as Nonempty second example, I am actually considering it as the source for my queries below.

1. Tuesday, 11 May 2010 16:47
Saviour Faire

Interesting article, but some of the examples, code and samples, do not seem to make sense because the explanations were not provided.
For example, the last "NON EMPTY" example query and results were not explained, and do not make sense. ie: "spam" column not shown and "Chile" is shown but empty, but these results not explained. As a student of ssas and mdx, based on the author's explanation provided I would be lead to believe a different result would have happened.
As a student of the software, my interest as been piqued and I'll investigate further.
