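Approach 1: see the article "Database row-to-column (pivot) operations: using row_number() over(partition by grouping columns [order by sort columns])". That approach works in SQL Server, Oracle, MySQL, and Hive.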

In Hive, two further approaches are available:

Create a test table and insert the test data:

-- Hive test: rows to columns with collect_set / collect_list
create table tommyduan_test(
gridid string,
height int,
cell string,
mrcount int,
weakmrcount int
);

insert into tommyduan_test values('g1',1,'cell1',12,3);
insert into tommyduan_test values('g1',1,'cell2',22,3);
insert into tommyduan_test values('g1',1,'cell3',23,3);
insert into tommyduan_test values('g1',1,'cell4',1,3);
insert into tommyduan_test values('g1',1,'cell5',3,3);
insert into tommyduan_test values('g1',1,'cell6',4,3);
insert into tommyduan_test values('g1',1,'cell19',21,3);
insert into tommyduan_test values('g2',1,'cell4',1,3);
insert into tommyduan_test values('g2',1,'cell5',3,3);
insert into tommyduan_test values('g2',1,'cell6',4,3);
insert into tommyduan_test values('g2',1,'cell19',21,3);

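Approach 2: using collect_list

Note: collect_list keeps every value, duplicates included, whereas collect_set is a set and discards duplicates. The queries in this approach rely on collect_list so that the three arrays stay the same length. Hive does not document an ordering guarantee for the collected elements, so the order seen in the output reflects the order in which rows reached the aggregation.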
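To make the difference between the two aggregates concrete, here is a small Python model. It is not Hive code: the functions `collect_list` and `collect_set` below are hypothetical stand-ins written for illustration, and the rows mirror the top three g2 rows of the test data.

```python
from collections import defaultdict

# Rows mirror part of the g2 test data:
# (gridid, height, cell, mrcount, weakmrcount)
rows = [
    ("g2", 1, "cell19", 21, 3),
    ("g2", 1, "cell6", 4, 3),
    ("g2", 1, "cell5", 3, 3),
]


def collect_list(values):
    """Model of Hive collect_list: keep every value, duplicates included."""
    return list(values)


def collect_set(values):
    """Model of Hive collect_set: drop duplicates.

    First-seen order is kept here only for readability; it is an
    assumption of this model, not a promise made by Hive.
    """
    return list(dict.fromkeys(values))


# Group rows the way "group by gridid, height" would.
groups = defaultdict(list)
for gridid, height, cell, mrcount, weakmrcount in rows:
    groups[(gridid, height)].append((cell, mrcount, weakmrcount))

for key, members in groups.items():
    weak_values = [m[2] for m in members]
    print(key, collect_list(weak_values), collect_set(weak_values))
    # prints: ('g2', 1) [3, 3, 3] [3]
```

Because every weakmrcount is 3, the set version shrinks to one element while the list version keeps three, which is why only the list version can be indexed in parallel with the cell array.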

select gridid,height,collect_list(cell) cellArray,collect_list(mrcount) mrcountArray,collect_list(weakmrcount) weakmrcountArray
from (
select gridid,height,cell,mrcount,weakmrcount,row_number()over(partition by gridid,height order by mrcount desc) rn
from tommyduan_test
group by gridid,height,cell,mrcount,weakmrcount
) t10
where rn<4
group by gridid,height;
+---------+---------+-----------------------------+---------------+-------------------+--+
| gridid | height | cellarray | mrcountarray | weakmrcountarray |
+---------+---------+-----------------------------+---------------+-------------------+--+
| g1 | 1 | ["cell3","cell2","cell19"] | [23,22,21] | [3,3,3] |
| g2 | 1 | ["cell19","cell6","cell5"] | [21,4,3] | [3,3,3] |
+---------+---------+-----------------------------+---------------+-------------------+--+

select gridid,height,
(case when size(cellArray)>0 then cellArray[0] else '-9999' end) as cell1,
(case when size(cellArray)>0 then mrcountArray[0] else '-9999' end) as cell1_mrcount,
(case when size(cellArray)>0 then weakmrcountArray[0] else '-9999' end) as cell1_weakmrcount,
(case when size(cellArray)>1 then cellArray[1] else '-9999' end) as cell2,
(case when size(cellArray)>1 then mrcountArray[1] else '-9999' end) as cell2_mrcount,
(case when size(cellArray)>1 then weakmrcountArray[1] else '-9999' end) as cell2_weakmrcount,
(case when size(cellArray)>2 then cellArray[2] else '-9999' end) as cell3,
(case when size(cellArray)>2 then mrcountArray[2] else '-9999' end) as cell3_mrcount,
(case when size(cellArray)>2 then weakmrcountArray[2] else '-9999' end) as cell3_weakmrcount
from
(
select gridid,height,collect_list(cell) cellArray,collect_list(mrcount) mrcountArray,collect_list(weakmrcount) weakmrcountArray
from (
select gridid,height,cell,mrcount,weakmrcount,row_number()over(partition by gridid,height order by mrcount desc) rn
from tommyduan_test
group by gridid,height,cell,mrcount,weakmrcount
) t10
where rn<4
group by gridid,height
) t12;
+---------+---------+---------+----------------+--------------------+--------+----------------+--------------------+---------+----------------+--------------------+--+
| gridid | height | cell1 | cell1_mrcount | cell1_weakmrcount | cell2 | cell2_mrcount | cell2_weakmrcount | cell3 | cell3_mrcount | cell3_weakmrcount |
+---------+---------+---------+----------------+--------------------+--------+----------------+--------------------+---------+----------------+--------------------+--+
| g1 | 1 | cell3 | 23 | 3 | cell2 | 22 | 3 | cell19 | 21 | 3 |
| g2 | 1 | cell19 | 21 | 3 | cell6 | 4 | 3 | cell5 | 3 | 3 |
+---------+---------+---------+----------------+--------------------+--------+----------------+--------------------+---------+----------------+--------------------+--+
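As a cross-check of the pivoted result, the same top-3 layout can be produced by tying each output column to rn directly, so that nothing depends on the order of elements inside an array. The sketch below runs such a query from Python against an in-memory SQLite database as a stand-in for Hive (an assumption: it needs SQLite 3.25 or later for window functions). The query uses only standard constructs that Hive is also expected to accept, but it has not been run on Hive here. The weakmrcount columns and the de-duplicating inner group by are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table tommyduan_test("
    "gridid text, height int, cell text, mrcount int, weakmrcount int)"
)
conn.executemany(
    "insert into tommyduan_test values (?,?,?,?,?)",
    [
        ("g1", 1, "cell1", 12, 3), ("g1", 1, "cell2", 22, 3),
        ("g1", 1, "cell3", 23, 3), ("g1", 1, "cell4", 1, 3),
        ("g1", 1, "cell5", 3, 3), ("g1", 1, "cell6", 4, 3),
        ("g1", 1, "cell19", 21, 3), ("g2", 1, "cell4", 1, 3),
        ("g2", 1, "cell5", 3, 3), ("g2", 1, "cell6", 4, 3),
        ("g2", 1, "cell19", 21, 3),
    ],
)

# Each output column picks the row whose rank matches; max() simply
# collapses the single non-null value per group.
PIVOT_SQL = """
select gridid, height,
       max(case when rn = 1 then cell end)    as cell1,
       max(case when rn = 1 then mrcount end) as cell1_mrcount,
       max(case when rn = 2 then cell end)    as cell2,
       max(case when rn = 2 then mrcount end) as cell2_mrcount,
       max(case when rn = 3 then cell end)    as cell3,
       max(case when rn = 3 then mrcount end) as cell3_mrcount
from (
    select gridid, height, cell, mrcount,
           row_number() over (partition by gridid, height
                              order by mrcount desc) as rn
    from tommyduan_test
) t10
where rn < 4
group by gridid, height
order by gridid
"""

pivot_rows = conn.execute(PIVOT_SQL).fetchall()
for row in pivot_rows:
    print(row)
```

A group with fewer than three rows would yield NULL in the missing positions; wrapping each expression in coalesce(..., '-9999') would reproduce the placeholder used in the queries above.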

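Approach 3: using collect_set

Note: collect_set is a set and does not allow duplicate values. Each column is de-duplicated independently, so the collected arrays can end up with different lengths; for example, three rows that all have weakmrcount = 3 yield the one-element array [3]. To keep the fields of one row together, concatenate them into a single string before collecting and split them again afterwards.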

select gridid,height,collect_set(cell),collect_set(mrcount),collect_set(weakmrcount)
from (select * from tommyduan_test order by gridid,height,mrcount desc) t10
group by gridid,height;
+---------+---------+-------------------------------------------------------------+----------------------+------+--+
| gridid | height | _c2 | _c3 | _c4 |
+---------+---------+-------------------------------------------------------------+----------------------+------+--+
| g1 | 1 | ["cell3","cell2","cell19","cell1","cell6","cell5","cell4"] | [23,22,21,12,4,3,1] | [3] |
| g2 | 1 | ["cell19","cell6","cell5","cell4"] | [21,4,3,1] | [3] |
+---------+---------+-------------------------------------------------------------+----------------------+------+--+

select gridid,height,collect_set(cell) cellArray,collect_set(mrcount) mrcountArray,collect_set(weakmrcount) weakmrcountArray
from (
select gridid,height,cell,mrcount,weakmrcount,row_number()over(partition by gridid,height order by mrcount desc) rn
from tommyduan_test
group by gridid,height,cell,mrcount,weakmrcount
) t10
where rn<4
group by gridid,height;
+---------+---------+-----------------------------+---------------+-------------------+--+
| gridid | height | cellarray | mrcountarray | weakmrcountarray |
+---------+---------+-----------------------------+---------------+-------------------+--+
| g1 | 1 | ["cell3","cell2","cell19"] | [23,22,21] | [3] |
| g2 | 1 | ["cell19","cell6","cell5"] | [21,4,3] | [3] |
+---------+---------+-----------------------------+---------------+-------------------+--+

select gridid,height,collect_set(concat_ws(',',cell,cast(mrcount as string), cast(weakmrcount as string))) as cellArray
from (
select gridid,height,cell,mrcount,weakmrcount,row_number()over(partition by gridid,height order by mrcount desc) rn
from tommyduan_test
group by gridid,height,cell,mrcount,weakmrcount
) t10
where rn<4
group by gridid,height;
+---------+---------+--------------------------------------------+--+
| gridid | height | cellarray |
+---------+---------+--------------------------------------------+--+
| g1 | 1 | ["cell3,23,3","cell2,22,3","cell19,21,3"] |
| g2 | 1 | ["cell19,21,3","cell6,4,3","cell5,3,3"] |
+---------+---------+--------------------------------------------+--+

select gridid,height,
(case when size(cellArray)>0 then split(cellArray[0],'_')[0] else '-9999' end) as cell1,
(case when size(cellArray)>0 then split(cellArray[0],'_')[1] else '-9999' end) as cell1_mrcount,
(case when size(cellArray)>0 then split(cellArray[0],'_')[2] else '-9999' end) as cell1_weakmrcount,
(case when size(cellArray)>1 then split(cellArray[1],'_')[0] else '-9999' end) as cell2,
(case when size(cellArray)>1 then split(cellArray[1],'_')[1] else '-9999' end) as cell2_mrcount,
(case when size(cellArray)>1 then split(cellArray[1],'_')[2] else '-9999' end) as cell2_weakmrcount,
(case when size(cellArray)>2 then split(cellArray[2],'_')[0] else '-9999' end) as cell3,
(case when size(cellArray)>2 then split(cellArray[2],'_')[1] else '-9999' end) as cell3_mrcount,
(case when size(cellArray)>2 then split(cellArray[2],'_')[2] else '-9999' end) as cell3_weakmrcount
from
(
select gridid,height,collect_set(concat_ws('_',cell,cast(mrcount as string), cast(weakmrcount as string))) as cellArray
from (
select gridid,height,cell,mrcount,weakmrcount,row_number()over(partition by gridid,height order by mrcount desc) rn
from tommyduan_test
group by gridid,height,cell,mrcount,weakmrcount
) t10
where rn<4
group by gridid,height
) t12;
+---------+---------+---------+----------------+--------------------+--------+----------------+--------------------+---------+----------------+--------------------+--+
| gridid | height | cell1 | cell1_mrcount | cell1_weakmrcount | cell2 | cell2_mrcount | cell2_weakmrcount | cell3 | cell3_mrcount | cell3_weakmrcount |
+---------+---------+---------+----------------+--------------------+--------+----------------+--------------------+---------+----------------+--------------------+--+
| g1 | 1 | cell3 | 23 | 3 | cell2 | 22 | 3 | cell19 | 21 | 3 |
| g2 | 1 | cell19 | 21 | 3 | cell6 | 4 | 3 | cell5 | 3 | 3 |
+---------+---------+---------+----------------+--------------------+--------+----------------+--------------------+---------+----------------+--------------------+--+
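The pack-then-split idea behind the last query can be modelled in a few lines of Python. This is only a sketch of the logic, not Hive code: `pack`, `unpack`, and `field` are hypothetical helper names, `pack` plays the role of concat_ws('_', ...), `unpack` the role of split(..., '_'), and the set-style de-duplication keeps first-seen order purely for readability.

```python
SEP = "_"  # same delimiter as the concat_ws / split pair in the query


def pack(cell, mrcount, weakmrcount, sep=SEP):
    """Join the fields of one row into one string (mirrors concat_ws)."""
    return sep.join([cell, str(mrcount), str(weakmrcount)])


def unpack(packed, sep=SEP):
    """Split a packed string back into its fields (mirrors split)."""
    return packed.split(sep)


# Top three g2 rows, already ranked by mrcount descending.
top3 = [("cell19", 21, 3), ("cell6", 4, 3), ("cell5", 3, 3)]

# Set-style de-duplication acts on whole packed rows, so the fields
# of a row can no longer drift apart.
packed_rows = list(dict.fromkeys(pack(*row) for row in top3))


def field(row_index, field_index, default="-9999"):
    """Return one field of one packed row, or the placeholder if absent."""
    if len(packed_rows) > row_index:
        return unpack(packed_rows[row_index])[field_index]
    return default


print(field(0, 0), field(0, 1), field(0, 2))
print(field(3, 0))
```

One caveat with this design: the delimiter must never occur inside a field value. A cell name such as "cell_a" would split into an extra piece and shift the later fields, so pick a separator that cannot appear in the data.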
