hive grouping set
reference
data-demo
2015-03,2015-03-10,cookie1
2015-03,2015-03-10,cookie5
2015-03,2015-03-12,cookie7
2015-04,2015-04-12,cookie3
2015-04,2015-04-13,cookie2
2015-04,2015-04-13,cookie4
2015-04,2015-04-16,cookie4
2015-03,2015-03-10,cookie2
2015-03,2015-03-10,cookie3
2015-04,2015-04-12,cookie5
2015-04,2015-04-13,cookie6
2015-04,2015-04-15,cookie3
2015-04,2015-04-15,cookie2
2015-04,2015-04-16,cookie1
grouping query
select
month,day,count(cookieid)
from cookie5
group by month,day
grouping sets (month,day);
same as group query
select month,NULL as day,count(cookieid) as nums from cookie5 group by month
union all
select NULL as month,day,count(cookieid) as nums from cookie5 group by day;
result
| month | day | c2 |
| - | - | - |
| NULL | 2015-03-10 | 4 |
| NULL | 2015-03-12 | 1 |
| NULL | 2015-04-12 | 2 |
| NULL | 2015-04-13 | 3 |
| NULL | 2015-04-15 | 2 |
| NULL | 2015-04-16 | 2 |
| 2015-03 | NULL | 5 |
| 2015-04 | NULL | 19 |
GROUPING__ID query
select
month,
day,
count(distinct cookieid) as uv,
GROUPING__ID
from cookie5
group by month,day
grouping sets (month,day)
order by GROUPING__ID;
same as group query
SELECT month,NULL as day,COUNT(DISTINCT cookieid) AS uv,1 AS GROUPING__ID FROM cookie5 GROUP BY month
UNION ALL
SELECT NULL as month,day,COUNT(DISTINCT cookieid) AS uv,2 AS GROUPING__ID FROM cookie5 GROUP BY day;
result
| _u1.month | _u1.day | _u1.uv | _u1.grouping_id |
| NULL | 2015-03-10 | 4 | 2 |
| NULL | 2015-03-12 | 1 | 2 |
| NULL | 2015-04-12 | 2 | 2 |
| NULL | 2015-04-13 | 3 | 2 |
| NULL | 2015-04-15 | 2 | 2 |
| NULL | 2015-04-16 | 2 | 2 |
| 2015-03 | NULL | 5 | 1 |
| 2015-04 | NULL | 6 | 1 |
all demo query
SELECT month, day,
COUNT(DISTINCT cookieid) AS uv,
GROUPING__ID
FROM cookie5
GROUP BY month,day
GROUPING SETS (month,day,(month,day))
ORDER BY GROUPING__ID;
same as group query
SELECT month,NULL as day,COUNT(DISTINCT cookieid) AS uv,1 AS GROUPING__ID FROM cookie5 GROUP BY month
UNION ALL
SELECT NULL as month,day,COUNT(DISTINCT cookieid) AS uv,2 AS GROUPING__ID FROM cookie5 GROUP BY day
UNION ALL
SELECT month,day,COUNT(DISTINCT cookieid) AS uv,3 AS GROUPING__ID FROM cookie5 GROUP BY month,day;
result
| month | day | uv | grouping_id |
| 2015-04 | NULL | 6 | 1 |
| 2015-03 | NULL | 5 | 1 |
| NULL | 2015-03-10 | 4 | 2 |
| NULL | 2015-04-16 | 2 | 2 |
| NULL | 2015-04-15 | 2 | 2 |
| NULL | 2015-04-13 | 3 | 2 |
| NULL | 2015-04-12 | 2 | 2 |
| NULL | 2015-03-12 | 1 | 2 |
| 2015-04 | 2015-04-16 | 2 | 3 |
| 2015-04 | 2015-04-12 | 2 | 3 |
| 2015-04 | 2015-04-13 | 3 | 3 |
| 2015-03 | 2015-03-12 | 1 | 3 |
| 2015-03 | 2015-03-10 | 4 | 3 |
| 2015-04 | 2015-04-15 | 2 | 3 |
hive grouping set的更多相关文章
- hive grouping sets 实现原理
先下结论: 看了hive 1.1.0 grouping sets 实现(从源码及执行计划都可以看出与kylin实现不一样),(前提是可累加,如sum函数)他并没有像kylin一样先按照group by ...
- hive grouping sets 等聚合函数
函数说明: grouping sets 在一个 group by 查询中,根据不同的维度组合进行聚合,等价于将不同维度的 group by 结果集进行 union allcube 根据 group b ...
- hive中grouping sets的使用
hive中grouping sets 数量较多时如何处理? 可以使用如下设置来 set hive.new.job.grouping.set.cardinality = 30; 这条设置的意义在于 ...
- Hive高级聚合GROUPING SETS,ROLLUP以及CUBE
scala> import org.apache.spark.sql.hive.HiveContextimport org.apache.spark.sql.hive.HiveContext s ...
- Hive高阶聚合函数 GROUPING SETS、Cube、Rollup
-- GROUPING SETS作为GROUP BY的子句,允许开发人员在GROUP BY语句后面指定多个统计选项,可以简单理解为多条group by语句通过union all把查询结果聚合起来结合起 ...
- Hive函数:GROUPING SETS,GROUPING__ID,CUBE,ROLLUP
参考:lxw大数据田地:http://lxw1234.com/archives/2015/04/193.htm 数据准备: CREATE EXTERNAL TABLE test_data ( mont ...
- Hive SQL grouping sets 用法
概述 GROUPING SETS,GROUPING__ID,CUBE,ROLLUP 这几个分析函数通常用于OLAP中,不能累加,而且需要根据不同维度上钻和下钻的指标统计,比如,分小时.天.月的UV数. ...
- hive之案例分析(grouping sets,lateral view explode, concat_ws)
有这样一组搜索结果数据: 租户,平台, 登录用户, 搜索关键词, 搜索的商品结果List {"tenantcode":"", "platform&qu ...
- Hive学习之路 (十七)Hive分析窗口函数(五) GROUPING SETS、GROUPING__ID、CUBE和ROLLUP
概述 GROUPING SETS,GROUPING__ID,CUBE,ROLLUP 这几个分析函数通常用于OLAP中,不能累加,而且需要根据不同维度上钻和下钻的指标统计,比如,分小时.天.月的UV数. ...
- hive
Hive Documentation https://cwiki.apache.org/confluence/display/Hive/Home 2016-12-22 14:52:41 ANTLR ...
随机推荐
- ArcGIS Pro创建、发布、调用GP服务全过程示例(等高线分析)
在之前的文章介绍过使用ArcMap发布GP分析服务,由于ArcGIS后续不在更新ArcMap,改用ArcGIS Pro,本文对ArcGIS Pro发布GP分析服务进行说明. 本文以等高线分析为例,使用 ...
- Sourcetree 提交顺序
总结:一共5个步骤 1.首先获取git主分支的代码. 2.暂存所需要上传的代码. 3.拉取代码(如发生文件冲突先暂不处理). 4.提交代码,然后再次拉取代码(不显示冲突跳下一步).如果还是显示文件冲突 ...
- 2021-04-02:给定一个正方形或者长方形矩阵matrix,实现zigzag打印。[[0,1,2],[3,4,5],[6,7,8]]的打印顺序是0,1,3,6,4,2,5,7,8。
2021-04-02:给定一个正方形或者长方形矩阵matrix,实现zigzag打印.[[0,1,2],[3,4,5],[6,7,8]]的打印顺序是0,1,3,6,4,2,5,7,8. 福大大 答案2 ...
- DataGridViewImageColumn 图片照片
Private Sub BT_PHOTOADDRESS_Click(sender As Object, e As EventArgs) Handles BT_PHOTOADDRESS.Click Di ...
- python通过变量名称的反射,获取变量的引用
有一些极端情况下,例如变量名称是动态的,我们无法直接调用变量名,如何获取到变量的引用呢? aa = [globals()["xxxx"]]
- ENVI手动地理配准栅格图像的方法
本文介绍在ENVI软件中,手动划定地面控制点从而实现栅格图像相互间地理配准的方法:其中,所用软件版本为ENVI Classic 5.3 (64-bit). 首先,在软件中同时打开两景需要进行地 ...
- Kali下压缩解压缩命令大全zip,tar,tar.gz,tar.bz2(转)
转自http://blog.csdn.net/yangjin_unique/article/details/7824852 tar 解包:tar xvf FileName.tar 打包:tar cvf ...
- 安卓第一课:gradle仓库的导入
今天装好android studio,结果刚进入就报错了: SSL peer shut down incorrectly 读注释发现原来是gradle下载文件不成功.果然,原来是vpn掉线了,上网查了 ...
- P1585 魔法阵 题解
题意: 题目传送门 可以看做一个人手中有一些宝石,并将宝石分成两组,一组的编号为 1 至 n×m/2,二组为 n×m/2+1 至 n×m+1.当两组两个宝石编号相差为 n×m/2 为一对.现在要遍历一 ...
- AGC019F Yes or No
题意 有 \(N+M\) 个问题,其中有 \(N\) 个问题的答案是 YES,\(M\) 个问题的答案是 NO.当你回答一个问题之后,会知道这个问题的答案,求最优策略下期望对多少.答案对 \(9982 ...