[Hive - LanguageManual] Windowing and Analytics Functions (pending)
- Added by Lefty Leverenz, last edited by Lefty Leverenz on Aug 01, 2014
Windowing and Analytics Functions
- Windowing and Analytics Functions
- Enhancements to Hive QL
- Examples
- PARTITION BY with one partitioning column, no ORDER BY or window specification
- PARTITION BY with two partitioning columns, no ORDER BY or window specification
- PARTITION BY with one partitioning column, one ORDER BY column, and no window specification
- PARTITION BY with two partitioning columns, two ORDER BY columns, and no window specification
- PARTITION BY with partitioning, ORDER BY, and window specification
- WINDOW clause
- LEAD using default 1 row lead and not specifying default value
- LAG specifying a lag of 3 rows and default value of 0
Enhancements to Hive QL
Version
Introduced in Hive version 0.11.
This section introduces the Hive QL enhancements for windowing and analytics functions. See "Windowing Specifications in HQL" (attached to HIVE-4197) for details. HIVE-896 has more information, including links to earlier documentation in the initial comments.
All of the windowing and analytics functions operate as per the SQL standard.
The current release supports the following functions for windowing and analytics:
- Windowing functions
  - LEAD
    - The number of rows to lead can optionally be specified. If the number of rows to lead is not specified, the lead is one row.
    - Returns null when the lead for the current row extends beyond the end of the window.
  - LAG
    - The number of rows to lag can optionally be specified. If the number of rows to lag is not specified, the lag is one row.
    - Returns null when the lag for the current row extends before the beginning of the window.
  - FIRST_VALUE
  - LAST_VALUE
- The OVER clause
  - OVER with standard aggregates:
    - COUNT
    - SUM
    - MIN
    - MAX
    - AVG
  - OVER with a PARTITION BY clause with one or more partitioning columns of any primitive datatype.
  - OVER with PARTITION BY and ORDER BY with one or more partitioning and/or ordering columns of any datatype.
  - OVER with a window specification. Windows can be defined separately in a WINDOW clause. Window specifications support these standard options:
    ROWS BETWEEN ((CURRENT ROW) | (UNBOUNDED | [num]) PRECEDING) AND ((CURRENT ROW) | (UNBOUNDED | [num]) FOLLOWING)
  - Note: The OVER clause supports the following functions, but it does not support a window with them (see HIVE-4797):
    - Ranking functions: Rank, NTile, DenseRank, CumeDist, PercentRank.
    - Lead and Lag functions.
- Analytics functions
  - RANK
  - ROW_NUMBER
  - DENSE_RANK
  - CUME_DIST
  - PERCENT_RANK
  - NTILE
Examples
This section provides examples of how to use the Hive QL windowing and analytics functions in SELECT statements. See HIVE-896 for additional examples.
PARTITION BY with one partitioning column, no ORDER BY or window specification
SELECT a, COUNT(b) OVER (PARTITION BY c)
FROM T;
PARTITION BY with two partitioning columns, no ORDER BY or window specification
SELECT a, COUNT(b) OVER (PARTITION BY c, d)
FROM T;
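To make the effect of the partitioning columns concrete, the sketch below runs both PARTITION BY forms against a tiny in-memory table. It is only an illustration under stated assumptions: Python's sqlite3 module (SQLite 3.25 or later) stands in for Hive, the sample rows are hypothetical, and the outer ORDER BY a is added solely to make the row order deterministic.

```python
import sqlite3

# Hypothetical rows for a table T(a, b, c, d); not taken from the Hive manual.
SAMPLE_ROWS = [
    ("r1", 10, "x", 1),
    ("r2", 20, "x", 1),
    ("r3", 30, "x", 2),
    ("r4", None, "y", 1),
]

# The two PARTITION BY examples combined into one query.
# ORDER BY a is an addition so the rows come back in a fixed order.
QUERY = """
    SELECT a,
           COUNT(b) OVER (PARTITION BY c),
           COUNT(b) OVER (PARTITION BY c, d)
    FROM T
    ORDER BY a
"""


def partition_counts():
    """Run the query on an in-memory SQLite table (a stand-in for Hive)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE T (a TEXT, b INTEGER, c TEXT, d INTEGER)")
        conn.executemany("INSERT INTO T VALUES (?, ?, ?, ?)", SAMPLE_ROWS)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Every input row is kept; each one carries the count for its partition.
    # COUNT(b) skips NULLs, so the lone row in partition 'y' gets 0.
    for row in partition_counts():
        print(row)
```

Unlike GROUP BY, the window form does not collapse rows: all four input rows appear in the result, each annotated with the count computed over its own partition.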
PARTITION BY with one partitioning column, one ORDER BY column, and no window specification
SELECT a, SUM(b) OVER (PARTITION BY c ORDER BY d)
FROM T;
PARTITION BY with two partitioning columns, two ORDER BY columns, and no window specification
SELECT a, SUM(b) OVER (PARTITION BY c, d ORDER BY e, f)
FROM T;
PARTITION BY with partitioning, ORDER BY, and window specification
SELECT a, SUM(b) OVER (PARTITION BY c ORDER BY d ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM T;
SELECT a, AVG(b) OVER (PARTITION BY c ORDER BY d ROWS BETWEEN 3 PRECEDING AND CURRENT ROW)
FROM T;
SELECT a, AVG(b) OVER (PARTITION BY c ORDER BY d ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING)
FROM T;
SELECT a, AVG(b) OVER (PARTITION BY c ORDER BY d ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
FROM T;
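The window specifications above differ only in which rows around the current row feed the aggregate. The following sketch shows three of those frames side by side on five ordered rows. As before, this is an assumption-laden illustration: SQLite (via Python's sqlite3, version 3.25 or later) is used in place of Hive, the data is hypothetical, and the outer ORDER BY d is added only for a stable output order.

```python
import sqlite3

# Hypothetical rows for T(a, b, c, d): a single partition (c = 'x') ordered by d.
SAMPLE_ROWS = [
    ("r1", 10, "x", 1),
    ("r2", 20, "x", 2),
    ("r3", 30, "x", 3),
    ("r4", 40, "x", 4),
    ("r5", 50, "x", 5),
]

# Three frames from the examples: running total, trailing average over at most
# four rows (3 preceding + current), and the sum of the current and later rows.
QUERY = """
    SELECT a,
           SUM(b) OVER (PARTITION BY c ORDER BY d
                        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW),
           AVG(b) OVER (PARTITION BY c ORDER BY d
                        ROWS BETWEEN 3 PRECEDING AND CURRENT ROW),
           SUM(b) OVER (PARTITION BY c ORDER BY d
                        ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
    FROM T
    ORDER BY d
"""


def frame_results():
    """Run the query on an in-memory SQLite table (a stand-in for Hive)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE T (a TEXT, b INTEGER, c TEXT, d INTEGER)")
        conn.executemany("INSERT INTO T VALUES (?, ?, ?, ?)", SAMPLE_ROWS)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in frame_results():
        print(row)
```

For row r5 the trailing average covers only r2 through r5, because r1 lies more than three rows before it; near the start of the partition the frame simply contains fewer rows.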
There can be multiple OVER clauses in a single query. A single OVER clause only applies to the immediately preceding function call. In this example, the first OVER clause applies to COUNT(b) and the second OVER clause applies to SUM(b):
SELECT a, COUNT(b) OVER (PARTITION BY c), SUM(b) OVER (PARTITION BY c)
FROM T;
Aliases can be used as well, with or without the keyword AS:
SELECT a, COUNT(b) OVER (PARTITION BY c) AS b_count, SUM(b) OVER (PARTITION BY c) b_sum
FROM T;
WINDOW clause
SELECT a, SUM(b) OVER w
FROM T
WINDOW w AS (PARTITION BY c ORDER BY d ROWS UNBOUNDED PRECEDING);
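A named window is a reusable definition that OVER can refer to by name. The sketch below checks that a window named w produces the same running total as the equivalent inline specification. It rests on a few assumptions: SQLite (through Python's sqlite3, version 3.25 or later) substitutes for Hive, the rows are hypothetical, and in SQLite the WINDOW clause sits before the added outer ORDER BY.

```python
import sqlite3

# Hypothetical rows for T(a, b, c, d): two partitions, ordered by d within each.
SAMPLE_ROWS = [
    ("r1", 1, "x", 1),
    ("r2", 2, "x", 2),
    ("r3", 3, "x", 3),
    ("r4", 10, "y", 1),
]

# Column 2 uses the named window w; column 3 spells the same window inline.
# "ROWS UNBOUNDED PRECEDING" is the standard shorthand for
# "ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW".
QUERY = """
    SELECT a,
           SUM(b) OVER w,
           SUM(b) OVER (PARTITION BY c ORDER BY d
                        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
    FROM T
    WINDOW w AS (PARTITION BY c ORDER BY d ROWS UNBOUNDED PRECEDING)
    ORDER BY c, d
"""


def named_window_results():
    """Run the query on an in-memory SQLite table (a stand-in for Hive)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE T (a TEXT, b INTEGER, c TEXT, d INTEGER)")
        conn.executemany("INSERT INTO T VALUES (?, ?, ?, ?)", SAMPLE_ROWS)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for a, via_named, via_inline in named_window_results():
        print(a, via_named, via_inline)
```

The running total restarts at the partition boundary, so r4 (the only row with c = 'y') reports its own value of b.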
LEAD using default 1 row lead and not specifying default value
SELECT a, LEAD(a) OVER (PARTITION BY b ORDER BY c)
FROM T;
LAG specifying a lag of 3 rows and default value of 0
SELECT a, LAG(a, 3, 0) OVER (PARTITION BY b ORDER BY c)
FROM T;
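The LEAD and LAG behavior described earlier (a one-row default offset, null past the edge of the partition unless a default value is supplied) can be seen in a small run. This sketch assumes SQLite (through Python's sqlite3, version 3.25 or later) as a stand-in for Hive and uses hypothetical rows; LEAD and LAG are called without a window frame, in keeping with the note that OVER does not support a window with them, and the outer ORDER BY b, c is added only for a stable output order.

```python
import sqlite3

# Hypothetical rows for T(a, b, c): partition 'p' has four rows, 'q' has one.
SAMPLE_ROWS = [
    (11, "p", 1),
    (12, "p", 2),
    (13, "p", 3),
    (14, "p", 4),
    (21, "q", 1),
]

# LEAD(a): value of a one row ahead, NULL when there is no such row.
# LAG(a, 3, 0): value of a three rows back, 0 when there is no such row.
QUERY = """
    SELECT a,
           LEAD(a) OVER (PARTITION BY b ORDER BY c),
           LAG(a, 3, 0) OVER (PARTITION BY b ORDER BY c)
    FROM T
    ORDER BY b, c
"""


def lead_lag_results():
    """Run the query on an in-memory SQLite table (a stand-in for Hive)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE T (a INTEGER, b TEXT, c INTEGER)")
        conn.executemany("INSERT INTO T VALUES (?, ?, ?)", SAMPLE_ROWS)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # NULL comes back from sqlite3 as Python's None.
    for row in lead_lag_results():
        print(row)
```

Only the fourth row of partition 'p' has a row three positions behind it, so it is the only one whose LAG column holds a real value (11) rather than the default 0; the last row of each partition has no following row, so its LEAD column is null.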