Hive: Simple Window Analytic Functions
Hive window analytic functions
0: jdbc:hive2://localhost:10000> select * from t_access;
+----------------+---------------------------------+-----------------------+--------------+--+
| t_access.ip | t_access.url | t_access.access_time | t_access.dt |
+----------------+---------------------------------+-----------------------+--------------+--+
| 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 20170804 |
| 192.168.33.3 | http://www.xxx.ccc.aa/teach | 2017-08-04 15:35:20 | 20170804 |
| 192.168.33.4 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 20170804 |
| 192.168.33.4 | http://www.xxx.ccc.aa/job | 2017-08-04 16:30:20 | 20170804 |
| 192.168.33.5 | http://www.xxx.ccc.aa/job | 2017-08-04 15:40:20 | 20170804 |
| 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 20170805 |
| 192.168.44.3 | http://www.xxx.ccc.aa/teach | 2017-08-05 15:35:20 | 20170805 |
| 192.168.33.44 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 20170805 |
| 192.168.33.46 | http://www.xxx.ccc.aa/job | 2017-08-05 16:30:20 | 20170805 |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-05 15:40:20 | 20170805 |
| 192.168.133.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:30:20 | 20170806 |
| 192.168.111.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:35:20 | 20170806 |
| 192.168.34.44 | http://www.xxx.ccc.aa/pay | 2017-08-06 15:30:20 | 20170806 |
| 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 20170806 |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 20170806 |
| 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 20170806 |
| 192.168.33.25 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 20170806 |
| 192.168.33.36 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 20170806 |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 20170806 |
+----------------+---------------------------------+-----------------------+--------------+--+

## LAG function
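LAG(col, n, default) reads `col` from the row `n` positions before the current row in the same partition, and falls back to `default` (NULL if omitted) when no such row exists. One typical use is measuring the gap between consecutive requests from the same ip. The query below is a minimal sketch of that idea, not part of the original post: the aliases `prev_access_time` and `seconds_since_prev` are made up for illustration, and it assumes `access_time` is stored as a `yyyy-MM-dd HH:mm:ss` string or a timestamp so that `unix_timestamp` can parse it.

```sql
-- Sketch: seconds elapsed since the same ip's previous request.
-- Assumption: access_time is a 'yyyy-MM-dd HH:mm:ss' string or a timestamp.
select ip, url, access_time, prev_access_time,
       -- NULL for the first request of each ip, because prev_access_time is NULL there
       unix_timestamp(access_time) - unix_timestamp(prev_access_time) as seconds_since_prev
from (
  select ip, url, access_time,
         -- no third argument: the default is NULL rather than 0
         lag(access_time, 1) over (partition by ip order by access_time) as prev_access_time
  from t_access
) t;
```

On the sample data, the second request from 192.168.33.3 (15:35:20, following 15:30:20 on the same day) would get a gap of 300 seconds. The basic form of the function, with its output, follows.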
select ip,url,access_time,
row_number() over(partition by ip order by access_time) as rn,
lag(access_time,1,0) over(partition by ip order by access_time)as last_access_time
from t_access;
+----------------+---------------------------------+----------------------+-----+----------------------+--+
| ip | url | access_time | rn | last_access_time |
+----------------+---------------------------------+----------------------+-----+----------------------+--+
| 192.168.111.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:35:20 | 1 | 0 |
| 192.168.133.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:30:20 | 1 | 0 |
| 192.168.33.25 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 1 | 0 |
| 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 1 | 0 |
| 192.168.33.3 | http://www.xxx.ccc.aa/teach | 2017-08-04 15:35:20 | 2 | 2017-08-04 15:30:20 |
| 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 3 | 2017-08-04 15:35:20 |
| 192.168.33.36 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 1 | 0 |
| 192.168.33.4 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 1 | 0 |
| 192.168.33.4 | http://www.xxx.ccc.aa/job | 2017-08-04 16:30:20 | 2 | 2017-08-04 15:30:20 |
| 192.168.33.44 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 1 | 0 |
| 192.168.33.46 | http://www.xxx.ccc.aa/job | 2017-08-05 16:30:20 | 1 | 0 |
| 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 2 | 2017-08-05 16:30:20 |
| 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 3 | 2017-08-06 16:30:20 |
| 192.168.33.5 | http://www.xxx.ccc.aa/job | 2017-08-04 15:40:20 | 1 | 0 |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-05 15:40:20 | 1 | 0 |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 2 | 2017-08-05 15:40:20 |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 3 | 2017-08-06 15:40:20 |
| 192.168.34.44 | http://www.xxx.ccc.aa/pay | 2017-08-06 15:30:20 | 1 | 0 |
| 192.168.44.3 | http://www.xxx.ccc.aa/teach | 2017-08-05 15:35:20 | 1 | 0 |
+----------------+---------------------------------+----------------------+-----+----------------------+--+

## LEAD function
select ip,url,access_time,
row_number() over(partition by ip order by access_time) as rn,
lead(access_time,1,0) over(partition by ip order by access_time) as next_access_time
from t_access;
+----------------+---------------------------------+----------------------+-----+----------------------+--+
| ip | url | access_time | rn | next_access_time |
+----------------+---------------------------------+----------------------+-----+----------------------+--+
| 192.168.111.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:35:20 | 1 | 0 |
| 192.168.133.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:30:20 | 1 | 0 |
| 192.168.33.25 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 1 | 0 |
| 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 1 | 2017-08-04 15:35:20 |
| 192.168.33.3 | http://www.xxx.ccc.aa/teach | 2017-08-04 15:35:20 | 2 | 2017-08-05 15:30:20 |
| 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 3 | 0 |
| 192.168.33.36 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 1 | 0 |
| 192.168.33.4 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 1 | 2017-08-04 16:30:20 |
| 192.168.33.4 | http://www.xxx.ccc.aa/job | 2017-08-04 16:30:20 | 2 | 0 |
| 192.168.33.44 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 1 | 0 |
| 192.168.33.46 | http://www.xxx.ccc.aa/job | 2017-08-05 16:30:20 | 1 | 2017-08-06 16:30:20 |
| 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 2 | 2017-08-06 16:30:20 |
| 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 3 | 0 |
| 192.168.33.5 | http://www.xxx.ccc.aa/job | 2017-08-04 15:40:20 | 1 | 0 |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-05 15:40:20 | 1 | 2017-08-06 15:40:20 |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 2 | 2017-08-06 15:40:20 |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 3 | 0 |
| 192.168.34.44 | http://www.xxx.ccc.aa/pay | 2017-08-06 15:30:20 | 1 | 0 |
| 192.168.44.3 | http://www.xxx.ccc.aa/teach | 2017-08-05 15:35:20 | 1 | 0 |
+----------------+---------------------------------+----------------------+-----+----------------------+--+

## FIRST_VALUE function
Example: get the first page each user (ip) visited
select ip,url,access_time,
row_number() over(partition by ip order by access_time) as rn,
first_value(url) over(partition by ip order by access_time rows between unbounded preceding and unbounded following) as first_url
from t_access;
+----------------+---------------------------------+----------------------+-----+---------------------------------+--+
| ip | url | access_time | rn | first_url |
+----------------+---------------------------------+----------------------+-----+---------------------------------+--+
| 192.168.111.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:35:20 | 1 | http://www.xxx.ccc.aa/register |
| 192.168.133.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:30:20 | 1 | http://www.xxx.ccc.aa/register |
| 192.168.33.25 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 1 | http://www.xxx.ccc.aa/job |
| 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 1 | http://www.xxx.ccc.aa/stu |
| 192.168.33.3 | http://www.xxx.ccc.aa/teach | 2017-08-04 15:35:20 | 2 | http://www.xxx.ccc.aa/stu |
| 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 3 | http://www.xxx.ccc.aa/stu |
| 192.168.33.36 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 1 | http://www.xxx.ccc.aa/excersize |
| 192.168.33.4 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 1 | http://www.xxx.ccc.aa/stu |
| 192.168.33.4 | http://www.xxx.ccc.aa/job | 2017-08-04 16:30:20 | 2 | http://www.xxx.ccc.aa/stu |
| 192.168.33.44 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 1 | http://www.xxx.ccc.aa/stu |
| 192.168.33.46 | http://www.xxx.ccc.aa/job | 2017-08-05 16:30:20 | 1 | http://www.xxx.ccc.aa/job |
| 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 2 | http://www.xxx.ccc.aa/job |
| 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 3 | http://www.xxx.ccc.aa/job |
| 192.168.33.5 | http://www.xxx.ccc.aa/job | 2017-08-04 15:40:20 | 1 | http://www.xxx.ccc.aa/job |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-05 15:40:20 | 1 | http://www.xxx.ccc.aa/job |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 2 | http://www.xxx.ccc.aa/job |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 3 | http://www.xxx.ccc.aa/job |
| 192.168.34.44 | http://www.xxx.ccc.aa/pay | 2017-08-06 15:30:20 | 1 | http://www.xxx.ccc.aa/pay |
| 192.168.44.3 | http://www.xxx.ccc.aa/teach | 2017-08-05 15:35:20 | 1 | http://www.xxx.ccc.aa/teach |
+----------------+---------------------------------+----------------------+-----+---------------------------------+--+

## LAST_VALUE function
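The explicit frame clause matters more for LAST_VALUE than for FIRST_VALUE. When an OVER clause has ORDER BY but no frame, Hive uses the default frame `range between unbounded preceding and current row`, so the "last" row of the frame is the current row (or its last peer with the same `access_time`), not the last row of the partition. The sketch below puts both forms side by side. The aliases `last_url_default_frame` and `last_url` are illustrative names, not from the original post.

```sql
-- Sketch: LAST_VALUE with the default frame versus a full-partition frame.
select ip, url, access_time,
       -- default frame ends at the current row (and its peers),
       -- so this follows the current row rather than the partition's last row
       last_value(url) over (partition by ip order by access_time) as last_url_default_frame,
       -- frame spans the whole partition: the last page this ip visited
       last_value(url) over (partition by ip order by access_time
                             rows between unbounded preceding and unbounded following) as last_url
from t_access;
```

For an ip with several requests, only the second column stays constant across all of its rows.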
Example: get the last page each user (ip) visited
select ip,url,access_time,
row_number() over(partition by ip order by access_time) as rn,
last_value(url) over(partition by ip order by access_time rows between unbounded preceding and unbounded following) as last_url
from t_access;
+----------------+---------------------------------+----------------------+-----+---------------------------------+--+
| ip | url | access_time | rn | last_url |
+----------------+---------------------------------+----------------------+-----+---------------------------------+--+
| 192.168.111.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:35:20 | 1 | http://www.xxx.ccc.aa/register |
| 192.168.133.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:30:20 | 1 | http://www.xxx.ccc.aa/register |
| 192.168.33.25 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 1 | http://www.xxx.ccc.aa/job |
| 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 1 | http://www.xxx.ccc.aa/stu |
| 192.168.33.3 | http://www.xxx.ccc.aa/teach | 2017-08-04 15:35:20 | 2 | http://www.xxx.ccc.aa/stu |
| 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 3 | http://www.xxx.ccc.aa/stu |
| 192.168.33.36 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 1 | http://www.xxx.ccc.aa/excersize |
| 192.168.33.4 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 1 | http://www.xxx.ccc.aa/job |
| 192.168.33.4 | http://www.xxx.ccc.aa/job | 2017-08-04 16:30:20 | 2 | http://www.xxx.ccc.aa/job |
| 192.168.33.44 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 1 | http://www.xxx.ccc.aa/stu |
| 192.168.33.46 | http://www.xxx.ccc.aa/job | 2017-08-05 16:30:20 | 1 | http://www.xxx.ccc.aa/excersize |
| 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 2 | http://www.xxx.ccc.aa/excersize |
| 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 3 | http://www.xxx.ccc.aa/excersize |
| 192.168.33.5 | http://www.xxx.ccc.aa/job | 2017-08-04 15:40:20 | 1 | http://www.xxx.ccc.aa/job |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-05 15:40:20 | 1 | http://www.xxx.ccc.aa/job |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 2 | http://www.xxx.ccc.aa/job |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 3 | http://www.xxx.ccc.aa/job |
| 192.168.34.44 | http://www.xxx.ccc.aa/pay | 2017-08-06 15:30:20 | 1 | http://www.xxx.ccc.aa/pay |
| 192.168.44.3 | http://www.xxx.ccc.aa/teach | 2017-08-05 15:35:20 | 1 | http://www.xxx.ccc.aa/teach |
+----------------+---------------------------------+----------------------+-----+---------------------------------+--+

/*
Cumulative report: implemented with analytic functions
*/
-- sum() over() function
select id
,month
,sum(amount) over(partition by id order by month rows between unbounded preceding and current row) as running_total
from
(select id,month,
sum(fee) as amount
from t_test
group by id,month) tmp;
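The frame clause decides which rows feed each sum, so the same pattern also gives a rolling total just by changing the frame bounds. The following is a rough sketch, not part of the original report: the aliases `running_total` and `rolling_3_month_total` are illustrative, and it assumes `month` is stored in a form that sorts chronologically (for example `yyyy-MM` strings) with one row per id and month after the inner aggregation.

```sql
-- Sketch: running total and 3-row rolling total over the same monthly amounts.
-- Assumption: month sorts chronologically as stored (e.g. 'yyyy-MM').
select id, month, amount,
       -- everything from the first month of this id up to the current month
       sum(amount) over (partition by id order by month
                         rows between unbounded preceding and current row) as running_total,
       -- the current month plus the two preceding rows only
       sum(amount) over (partition by id order by month
                         rows between 2 preceding and current row) as rolling_3_month_total
from (
  select id, month, sum(fee) as amount
  from t_test
  group by id, month
) tmp;
```

Because the frame is counted in rows, the rolling column covers three calendar months only if no month is missing for an id.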