Skip to end of metadata

 

Go to start of metadata

 

Windowing and Analytics Functions

Enhancements to Hive QL

Version

Icon

Introduced in Hive version 0.11.

This section introduces the Hive QL enhancements for windowing and analytics functions. See "Windowing Specifications in HQL" (attached to HIVE-4197) for details. HIVE-896 has more information, including links to earlier documentation in the initial comments.

All of the windowing and analytics functions operate as per the SQL standard.

The current release supports the following functions for windowing and analytics:

  1. Windowing functions

    • LEAD

      • The number of rows to lead can optionally be specified. If the number of rows to lead is not specified, the lead is one row.
      • Returns null when the lead for the current row extends beyond the end of the window.
    • LAG
      • The number of rows to lag can optionally be specified. If the number of rows to lag is not specified, the lag is one row.
      • Returns null when the lag for the current row extends before the beginning of the window.
    • FIRST_VALUE
    • LAST_VALUE
  2. The OVER clause
    • OVER with standard aggregates:

      • COUNT
      • SUM
      • MIN
      • MAX
      • AVG
    • OVER with a PARTITION BY statement with one or more partitioning columns of any primitive datatype.
    • OVER with PARTITION BY and ORDER BY with one or more partitioning and/or ordering columns of any datatype.
      • OVER with a window specification. Windows can be defined separately in a WINDOW clause. Window specifications support these standard options:

        ROWS ((CURRENT ROW) | (UNBOUNDED | [num]) PRECEDING) AND (UNBOUNDED | [num]) FOLLOWING
        
        Icon

        The OVER clause supports the following functions, but it does not support a window with them (see HIVE-4797):

        Ranking functions: Rank, NTile, DenseRank, CumeDist, PercentRank.

        Lead and Lag functions.

  3. Analytics functions
    • RANK
    • ROW_NUMBER
    • DENSE_RANK
    • CUME_DIST
    • PERCENT_RANK
    • NTILE

Examples

This section provides examples of how to use the Hive QL windowing and analytics functions in SELECT statements. See HIVE-896 for additional examples.

PARTITION BY with one partitioning column, no ORDER BY or window specification

SELECT a, COUNT(b) OVER (PARTITION BY c)
FROM T;

PARTITION BY with two partitioning columns, no ORDER BY or window specification

SELECT a, COUNT(b) OVER (PARTITION BY c, d)
FROM T;

PARTITION BY with one partitioning column, one ORDER BY column, and no window specification

SELECT a, SUM(b) OVER (PARTITION BY ORDER BY d)
FROM T;

PARTITION BY with two partitioning columns, two ORDER BY columns, and no window specification

SELECT a, SUM(b) OVER (PARTITION BY c, d ORDER BY e, f)
FROM T;

PARTITION BY with partitioning, ORDER BY, and window specification

SELECT a, SUM(b) OVER (PARTITION BY ORDER BY ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM T;
SELECT a, AVG(b) OVER (PARTITION BY ORDER BY ROWS BETWEEN 3 PRECEDING AND CURRENT ROW)
FROM T;
SELECT a, AVG(b) OVER (PARTITION BY ORDER BY ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING)
FROM T;
SELECT a, AVG(b) OVER (PARTITION BY ORDER BY ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
FROM T;

There can be multiple OVER clauses in a single query. A single OVER clause only applies to the immediately preceding function call. In this example, the first OVER clause applies to COUNT(b) and the second OVER clause applies to SUM(b):

SELECT 
 a,
 COUNT(b) OVER (PARTITION BY c),
 SUM(b) OVER (PARTITION BY c)
FROM T;

Aliases can be used as well, with or without the keyword AS:

SELECT 
 a,
 COUNT(b) OVER (PARTITION BY c) AS b_count,
 SUM(b) OVER (PARTITION BY c) b_sum
FROM T;

WINDOW clause

SELECT a, SUM(b) OVER w
FROM T;
WINDOW w AS (PARTITION BY ORDER BY ROWS UNBOUNDED PRECEDING)

LEAD using default 1 row lead and not specifying default value

SELECT a, LEAD(a) OVER (PARTITION BY ORDER BY ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING)
FROM T;

LAG specifying a lag of 3 rows and default value of 0

SELECT a, LAG(a, 3, 0) OVER (PARTITION BY ORDER BY ROWS 3 PRECEDING)
FROM T;
 

[Hive - LanguageManual ] Windowing and Analytics Functions (待)的更多相关文章

  1. [Hive - LanguageManual] Create/Drop/Alter -View、 Index 、 Function

    Create/Drop/Alter View Create View Drop View Alter View Properties Alter View As Select Version info ...

  2. [HIve - LanguageManual] Hive Operators and User-Defined Functions (UDFs)

    Hive Operators and User-Defined Functions (UDFs) Hive Operators and User-Defined Functions (UDFs) Bu ...

  3. [Hive - LanguageManual] Select base use

    Select Syntax WHERE Clause ALL and DISTINCT Clauses Partition Based Queries HAVING Clause LIMIT Clau ...

  4. [Hive - LanguageManual ] ]SQL Standard Based Hive Authorization

    Status of Hive Authorization before Hive 0.13 SQL Standards Based Hive Authorization (New in Hive 0. ...

  5. [HIve - LanguageManual] LateralView

    Lateral View Syntax Description Example Multiple Lateral Views Outer Lateral Views Lateral View Synt ...

  6. [HIve - LanguageManual] XPathUDF

    Documentation for Built-In User-Defined Functions Related To XPath UDFs xpath, xpath_short, xpath_in ...

  7. [Hive - LanguageManual] GroupBy

    Group By Syntax Simple Examples Select statement and group by clause Advanced Features Multi-Group-B ...

  8. [Hive - LanguageManual] Import/Export

    LanguageManual ImportExport     Skip to end of metadata   Added by Carl Steinbach, last edited by Le ...

  9. [Hive - LanguageManual] DML: Load, Insert, Update, Delete

    LanguageManual DML Hive Data Manipulation Language Hive Data Manipulation Language Loading files int ...

随机推荐

  1. C#基础练习(事件登陆案例)

    Form1的后台代码: namespace _08事件登陆案例 {     public partial class Form1 : Form     {         public Form1() ...

  2. JavaScript之this,new,delete,call,apply(转)

    JavaScript之this,new,delete,call,apply 1.this 一般而言,在Javascript中,this指向函数执行时的当前对象. 2.new 在JavaScript中, ...

  3. IntelliJ IDEA 15开发Java Maven项目

    1.安装好之后开始创建项目

  4. 持久化框架Hibernate 开发实例(一)

    1 Hibernate简介 Hibernate框架是一个非常流行的持久化框架,其中在web开发中占据了非常重要的地位, Hibernate作为Web应用的底层,实现了对数据库操作的封装.HIberna ...

  5. sublime中文乱码

    今天在用sublime的时候,又出现乱码的情况了.弹层如下: 检测了一下,当前文件,sublime编辑器左下角显示如下: 显示的是 ASCII 编码的文件,而其他没有没问题的页面显示的的 GBK(或G ...

  6. Alpha、Beta、RC、GA版本的区别

    Alpha:是内部测试版,一般不向外部发布,会有很多Bug.一般只有测试人员使用. Beta:也是测试版,这个阶段的版本会一直加入新的功能.在Alpha版之后推出. RC:(Release Candi ...

  7. sql语句记录

    清空日志 DUMP TRANSACTION 库名 WITH NO_LOG 截断事务日志 BACKUP LOG 数据库名 WITH NO_LOG 收缩数据库 DBCC SHRINKDATABASE(数据 ...

  8. UVa 11400 Lighting System Design

    题意: 一共有n种灯泡,不同种类的灯泡必须用不同种电源,但同一种灯泡可以用同一种电源.每种灯泡有四个参数: 电压值V.电源费用K.每个灯泡的费用C.所需该种灯泡的数量L 为了省钱,可以用电压高的灯泡来 ...

  9. 自定义View等待旋转

    效果图 1 string.xml <string name="default_progressbar">Default Progressbar:</string& ...

  10. Share SDK 第三方登录

    import java.util.HashMap; import org.apache.http.Header; import android.app.Activity; import android ...