[Hive - LanguageManual] Windowing and Analytics Functions (pending)
- Added by Lefty Leverenz, last edited by Lefty Leverenz on Aug 01, 2014
Windowing and Analytics Functions
- Windowing and Analytics Functions
  - Enhancements to Hive QL
  - Examples
    - PARTITION BY with one partitioning column, no ORDER BY or window specification
    - PARTITION BY with two partitioning columns, no ORDER BY or window specification
    - PARTITION BY with one partitioning column, one ORDER BY column, and no window specification
    - PARTITION BY with two partitioning columns, two ORDER BY columns, and no window specification
    - PARTITION BY with partitioning, ORDER BY, and window specification
    - WINDOW clause
    - LEAD using default 1 row lead and not specifying default value
    - LAG specifying a lag of 3 rows and default value of 0
Enhancements to Hive QL
Version
Introduced in Hive version 0.11.
This section introduces the Hive QL enhancements for windowing and analytics functions. See "Windowing Specifications in HQL" (attached to HIVE-4197) for details. HIVE-896 has more information, including links to earlier documentation in the initial comments.
All of the windowing and analytics functions operate as per the SQL standard.
The current release supports the following functions for windowing and analytics:
- Windowing functions
  - LEAD
    - The number of rows to lead can optionally be specified. If the number of rows to lead is not specified, the lead is one row.
    - Returns null when the lead for the current row extends beyond the end of the window.
  - LAG
    - The number of rows to lag can optionally be specified. If the number of rows to lag is not specified, the lag is one row.
    - Returns null when the lag for the current row extends before the beginning of the window.
  - FIRST_VALUE
  - LAST_VALUE
- The OVER clause
  - OVER with standard aggregates:
    - COUNT
    - SUM
    - MIN
    - MAX
    - AVG
  - OVER with a PARTITION BY statement with one or more partitioning columns of any primitive datatype.
  - OVER with PARTITION BY and ORDER BY with one or more partitioning and/or ordering columns of any datatype.
  - OVER with a window specification. Windows can be defined separately in a WINDOW clause. Window specifications support these standard options:
    ROWS ((CURRENT ROW) | (UNBOUNDED | [num]) PRECEDING) AND (UNBOUNDED | [num]) FOLLOWING
  - The OVER clause supports the following functions, but it does not support a window with them (see HIVE-4797):
    - Ranking functions: Rank, NTile, DenseRank, CumeDist, PercentRank.
    - Lead and Lag functions.
- Analytics functions
  - RANK
  - ROW_NUMBER
  - DENSE_RANK
  - CUME_DIST
  - PERCENT_RANK
  - NTILE
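The analytics functions listed above differ mainly in how they number rows that tie on the ORDER BY value. The following is a minimal Python sketch of the SQL-standard definitions of ROW_NUMBER, RANK, and DENSE_RANK over a single, already ordered partition. The function name `rank_partition` is hypothetical, and the code only models the semantics; it is not Hive's implementation.

```python
def rank_partition(sorted_keys):
    """Return (row_number, rank, dense_rank) for each key of one partition.

    `sorted_keys` holds the ORDER BY values of the partition, already sorted.
    """
    results = []
    rank = dense_rank = 0
    previous = object()  # sentinel that never equals a real key
    for row_number, key in enumerate(sorted_keys, start=1):
        if key != previous:
            rank = row_number   # RANK jumps to the current row position
            dense_rank += 1     # DENSE_RANK advances by one per distinct key
            previous = key
        results.append((row_number, rank, dense_rank))
    return results


print(rank_partition([10, 20, 20, 30]))
# [(1, 1, 1), (2, 2, 2), (3, 2, 2), (4, 4, 3)]
```

The two rows with key 20 share rank 2. RANK then skips to 4, while DENSE_RANK continues with 3.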
Examples
This section provides examples of how to use the Hive QL windowing and analytics functions in SELECT statements. See HIVE-896 for additional examples.
PARTITION BY with one partitioning column, no ORDER BY or window specification
SELECT a, COUNT(b) OVER (PARTITION BY c)
FROM T;
PARTITION BY with two partitioning columns, no ORDER BY or window specification
SELECT a, COUNT(b) OVER (PARTITION BY c, d)
FROM T;
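Unlike GROUP BY, a PARTITION BY query keeps every input row and attaches the aggregate of that row's partition to it. The Python sketch below models the two COUNT queries above under that reading. The helper `count_over_partition` and the sample rows in `T` are made up for illustration, and the model assumes the standard rule that COUNT(b) skips NULL values of b.

```python
from collections import defaultdict


def count_over_partition(rows, partition_cols, value_col):
    """Model COUNT(value_col) OVER (PARTITION BY partition_cols).

    Returns one count per input row, in input order.
    """
    counts = defaultdict(int)
    for row in rows:
        key = tuple(row[c] for c in partition_cols)
        if row[value_col] is not None:  # COUNT(col) ignores NULLs
            counts[key] += 1
    return [counts[tuple(row[c] for c in partition_cols)] for row in rows]


# Hypothetical contents of table T.
T = [
    {"a": 1, "b": 5,    "c": "x", "d": 1},
    {"a": 2, "b": None, "c": "x", "d": 1},
    {"a": 3, "b": 7,    "c": "x", "d": 2},
    {"a": 4, "b": 9,    "c": "y", "d": 1},
]

print(count_over_partition(T, ["c"], "b"))       # [2, 2, 2, 1]
print(count_over_partition(T, ["c", "d"], "b"))  # [1, 1, 1, 1]
```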
PARTITION BY with one partitioning column, one ORDER BY column, and no window specification
SELECT a, SUM(b) OVER (PARTITION BY c ORDER BY d)
FROM T;
PARTITION BY with two partitioning columns, two ORDER BY columns, and no window specification
SELECT a, SUM(b) OVER (PARTITION BY c, d ORDER BY e, f)
FROM T;
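When ORDER BY is given without a window specification, the aggregate becomes a running value. The sketch below models SUM(b) for one partition, assuming the SQL-standard default frame of RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. That default is an assumption about the engine and is worth verifying on the Hive version in use. The function name `running_sum` and the sample data are hypothetical.

```python
def running_sum(rows):
    """Model SUM(b) OVER (ORDER BY d) for one partition.

    `rows` is a list of (d, b) pairs. Under the assumed RANGE frame, rows
    that tie on d are peers and receive the same total.
    """
    ordered = sorted(rows, key=lambda r: r[0])
    result = []
    for d, b in ordered:
        # The frame holds every row whose ordering value is <= the current one.
        total = sum(b2 for d2, b2 in ordered if d2 <= d)
        result.append((d, b, total))
    return result


print(running_sum([(1, 10), (2, 20), (2, 30), (3, 40)]))
# [(1, 10, 10), (2, 20, 60), (2, 30, 60), (3, 40, 100)]
```

Both rows with d = 2 report 60 because a RANGE frame includes all peers of the current row. A ROWS frame would give 30 and 60 instead.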
PARTITION BY with partitioning, ORDER BY, and window specification
SELECT a, SUM(b) OVER (PARTITION BY c ORDER BY d ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM T;
SELECT a, AVG(b) OVER (PARTITION BY c ORDER BY d ROWS BETWEEN 3 PRECEDING AND CURRENT ROW)
FROM T;
SELECT a, AVG(b) OVER (PARTITION BY c ORDER BY d ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING)
FROM T;
SELECT a, AVG(b) OVER (PARTITION BY c ORDER BY d ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
FROM T;
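A ROWS frame counts physical rows relative to the current row, so the frame shrinks near the edges of the partition. The sketch below models the three AVG queries above for a single partition whose b values are already in ORDER BY order. The helper `avg_over_rows` is hypothetical; it uses `None` for UNBOUNDED and `0` for CURRENT ROW.

```python
def avg_over_rows(values, preceding, following):
    """Model AVG over ROWS BETWEEN <preceding> PRECEDING AND <following> FOLLOWING.

    `values` are the b values of one partition in ORDER BY order.
    A bound of None means UNBOUNDED; a bound of 0 means CURRENT ROW.
    """
    n = len(values)
    out = []
    for i in range(n):
        start = 0 if preceding is None else max(0, i - preceding)
        end = n if following is None else min(n, i + following + 1)
        frame = values[start:end]  # always contains the current row
        out.append(sum(frame) / len(frame))
    return out


b = [10, 20, 30, 40, 50]
print(avg_over_rows(b, 3, 0))     # 3 PRECEDING AND CURRENT ROW
# [10.0, 15.0, 20.0, 25.0, 35.0]
print(avg_over_rows(b, 3, 3))     # 3 PRECEDING AND 3 FOLLOWING
# [25.0, 30.0, 30.0, 30.0, 35.0]
print(avg_over_rows(b, 0, None))  # CURRENT ROW AND UNBOUNDED FOLLOWING
# [30.0, 35.0, 40.0, 45.0, 50.0]
```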
There can be multiple OVER clauses in a single query. A single OVER clause only applies to the immediately preceding function call. In this example, the first OVER clause applies to COUNT(b) and the second OVER clause applies to SUM(b):
SELECT a, COUNT(b) OVER (PARTITION BY c), SUM(b) OVER (PARTITION BY c)
FROM T;
Aliases can be used as well, with or without the keyword AS:
SELECT a, COUNT(b) OVER (PARTITION BY c) AS b_count, SUM(b) OVER (PARTITION BY c) b_sum
FROM T;
WINDOW clause
SELECT a, SUM(b) OVER w
FROM T
WINDOW w AS (PARTITION BY c ORDER BY d ROWS UNBOUNDED PRECEDING);
LEAD using default 1 row lead and not specifying default value
SELECT a, LEAD(a) OVER (PARTITION BY b ORDER BY c ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING)
FROM T;
LAG specifying a lag of 3 rows and default value of 0
SELECT a, LAG(a, 3, 0) OVER (PARTITION BY b ORDER BY c ROWS 3 PRECEDING)
FROM T;
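The LEAD and LAG rules described earlier (an optional offset that defaults to one row, and null or the supplied default when the offset runs past the edge) can be modeled in a few lines of Python. The functions `lead` and `lag` below are illustrative stand-ins operating on one ordered partition, not Hive code. They ignore the ROWS frame, in line with the earlier statement that OVER does not support a window with Lead and Lag (HIVE-4797).

```python
def lead(values, offset=1, default=None):
    """Value `offset` rows after each row, or `default` past the end."""
    n = len(values)
    return [values[i + offset] if i + offset < n else default for i in range(n)]


def lag(values, offset=1, default=None):
    """Value `offset` rows before each row, or `default` before the start."""
    n = len(values)
    return [values[i - offset] if i - offset >= 0 else default for i in range(n)]


a = [1, 2, 3, 4, 5]  # column a of one partition, in ORDER BY order
print(lead(a))       # LEAD(a):       [2, 3, 4, 5, None]
print(lag(a, 3, 0))  # LAG(a, 3, 0):  [0, 0, 0, 1, 2]
```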