[Hive - LanguageManual ] Windowing and Analytics Functions (待)
- Added by Lefty Leverenz, last edited by Lefty Leverenz on Aug 01, 2014 (view change)
- show comment
Windowing and Analytics Functions
- Windowing and Analytics Functions
- Enhancements to Hive QL
- Examples
- PARTITION BY with one partitioning column, no ORDER BY or window specification
- PARTITION BY with two partitioning columns, no ORDER BY or window specification
- PARTITION BY with one partitioning column, one ORDER BY column, and no window specification
- PARTITION BY with two partitioning columns, two ORDER BY columns, and no window specification
- PARTITION BY with partitioning, ORDER BY, and window specification
- WINDOW clause
- LEAD using default 1 row lead and not specifying default value
- LAG specifying a lag of 3 rows and default value of 0
Enhancements to Hive QL
Version
Icon
Introduced in Hive version 0.11.
This section introduces the Hive QL enhancements for windowing and analytics functions. See "Windowing Specifications in HQL" (attached to HIVE-4197) for details. HIVE-896 has more information, including links to earlier documentation in the initial comments.
All of the windowing and analytics functions operate as per the SQL standard.
The current release supports the following functions for windowing and analytics:
- Windowing functions
- LEAD
- The number of rows to lead can optionally be specified. If the number of rows to lead is not specified, the lead is one row.
- Returns null when the lead for the current row extends beyond the end of the window.
- LAG
- The number of rows to lag can optionally be specified. If the number of rows to lag is not specified, the lag is one row.
- Returns null when the lag for the current row extends before the beginning of the window.
- FIRST_VALUE
- LAST_VALUE
- LEAD
- The OVER clause
- OVER with standard aggregates:
- COUNT
- SUM
- MIN
- MAX
- AVG
- OVER with a PARTITION BY statement with one or more partitioning columns of any primitive datatype.
- OVER with PARTITION BY and ORDER BY with one or more partitioning and/or ordering columns of any datatype.
OVER with a window specification. Windows can be defined separately in a WINDOW clause. Window specifications support these standard options:
ROWS ((CURRENT ROW) | (UNBOUNDED | [num]) PRECEDING) AND (UNBOUNDED | [num]) FOLLOWING
IconThe OVER clause supports the following functions, but it does not support a window with them (see HIVE-4797):
Ranking functions: Rank, NTile, DenseRank, CumeDist, PercentRank.
Lead and Lag functions.
- OVER with standard aggregates:
- Analytics functions
- RANK
- ROW_NUMBER
- DENSE_RANK
- CUME_DIST
- PERCENT_RANK
- NTILE
Examples
This section provides examples of how to use the Hive QL windowing and analytics functions in SELECT statements. See HIVE-896 for additional examples.
PARTITION BY with one partitioning column, no ORDER BY or window specification
SELECT a, COUNT (b) OVER (PARTITION BY c) FROM T; |
PARTITION BY with two partitioning columns, no ORDER BY or window specification
SELECT a, COUNT (b) OVER (PARTITION BY c, d) FROM T; |
PARTITION BY with one partitioning column, one ORDER BY column, and no window specification
SELECT a, SUM (b) OVER (PARTITION BY c ORDER BY d) FROM T; |
PARTITION BY with two partitioning columns, two ORDER BY columns, and no window specification
SELECT a, SUM (b) OVER (PARTITION BY c, d ORDER BY e, f) FROM T; |
PARTITION BY with partitioning, ORDER BY, and window specification
SELECT a, SUM (b) OVER (PARTITION BY c ORDER BY d ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) FROM T; |
SELECT a, AVG (b) OVER (PARTITION BY c ORDER BY d ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) FROM T; |
SELECT a, AVG (b) OVER (PARTITION BY c ORDER BY d ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING) FROM T; |
SELECT a, AVG (b) OVER (PARTITION BY c ORDER BY d ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) FROM T; |
There can be multiple OVER
clauses in a single query. A single OVER
clause only applies to the immediately preceding function call. In this example, the first OVER clause applies to COUNT(b) and the second OVER clause applies to SUM(b):
SELECT a, COUNT (b) OVER (PARTITION BY c), SUM (b) OVER (PARTITION BY c) FROM T; |
Aliases can be used as well, with or without the keyword AS:
SELECT a, COUNT (b) OVER (PARTITION BY c) AS b_count, SUM (b) OVER (PARTITION BY c) b_sum FROM T; |
WINDOW clause
SELECT a, SUM (b) OVER w FROM T; WINDOW w AS (PARTITION BY c ORDER BY d ROWS UNBOUNDED PRECEDING) |
LEAD using default 1 row lead and not specifying default value
SELECT a, LEAD(a) OVER (PARTITION BY b ORDER BY C ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING) FROM T; |
LAG specifying a lag of 3 rows and default value of 0
SELECT a, LAG(a, 3, 0) OVER (PARTITION BY b ORDER BY C ROWS 3 PRECEDING) FROM T; |
[Hive - LanguageManual ] Windowing and Analytics Functions (待)的更多相关文章
- [Hive - LanguageManual] Create/Drop/Alter -View、 Index 、 Function
Create/Drop/Alter View Create View Drop View Alter View Properties Alter View As Select Version info ...
- [HIve - LanguageManual] Hive Operators and User-Defined Functions (UDFs)
Hive Operators and User-Defined Functions (UDFs) Hive Operators and User-Defined Functions (UDFs) Bu ...
- [Hive - LanguageManual] Select base use
Select Syntax WHERE Clause ALL and DISTINCT Clauses Partition Based Queries HAVING Clause LIMIT Clau ...
- [Hive - LanguageManual ] ]SQL Standard Based Hive Authorization
Status of Hive Authorization before Hive 0.13 SQL Standards Based Hive Authorization (New in Hive 0. ...
- [HIve - LanguageManual] LateralView
Lateral View Syntax Description Example Multiple Lateral Views Outer Lateral Views Lateral View Synt ...
- [HIve - LanguageManual] XPathUDF
Documentation for Built-In User-Defined Functions Related To XPath UDFs xpath, xpath_short, xpath_in ...
- [Hive - LanguageManual] GroupBy
Group By Syntax Simple Examples Select statement and group by clause Advanced Features Multi-Group-B ...
- [Hive - LanguageManual] Import/Export
LanguageManual ImportExport Skip to end of metadata Added by Carl Steinbach, last edited by Le ...
- [Hive - LanguageManual] DML: Load, Insert, Update, Delete
LanguageManual DML Hive Data Manipulation Language Hive Data Manipulation Language Loading files int ...
随机推荐
- (step4.3.5)hdu 1501(Zipper——DFS)
题目大意:个字符串.此题是个非常经典的dfs题. 解题思路:DFS 代码如下:有详细的注释 /* * 1501_2.cpp * * Created on: 2013年8月17日 * Author: A ...
- C Socket Programming for Linux with a Server and Client Example Code
Typically two processes communicate with each other on a single system through one of the following ...
- BZOJ 1000: A+B Problem
问题:A + B问题 描述:http://acm.wust.edu.cn/problem.php?id=1000&soj=0 代码示例: import java.util.Scanner; p ...
- 结构体mem_pool_t
/** Memory area header */ typedef struct mem_area_struct mem_area_t; /** Memory pool */ typedef stru ...
- 解析UML9种图的作用
本文和大家重点讨论一下UML9种图的概念,UML中有五类图,共有9种图形,每种图形都有各自的特点,下面就让我们一起来看一下这些图形特点的详细介绍吧. UML9种图简介 1.用例图 说明的是谁要使用系统 ...
- aspx中的表单验证 jquery.validate.js 的使用 以及 jquery.validate相关扩展验证(Jquery表单提交验证插件)
这一期我们先讲在aspx中使用 jquery.validate插件进行表单的验证, 关于MVC中使用 validate我们在下一期中再讲 上面是效果,下面来说使用步骤 jQuery.Valid ...
- ti processor sdk linux am335x evm /bin/setup-package-install.sh hacking
#!/bin/sh # # ti processor sdk linux am335x evm /bin/setup-package-install.sh hacking # 说明: # 本文主要对T ...
- vim - 查找替换
:%s/\<key_word_replaced\>/word_you_want_to_say/g
- Python cookbook - 读书笔记
Before: python built-in function: docs 我只想学function map(), THIS - 摘: map(foo, seq) is equivalent to ...
- ZOJ1311, POJ1144 Network
题目描述:TLC电话线路公司正在新建一个电话线路网络.他们将一些地方(这些地方用1到N的整数标明,任何2个地方的标号都不相同)用电话线路连接起来.这些线路是双向的,每条线路连接2个地方,并且每个地方的 ...