Lateral View Syntax

lateralView: LATERAL VIEW udtf(expression) tableAlias AS columnAlias (',' columnAlias)*
fromClause: FROM baseTable (lateralView)*

Description

Lateral view is used in conjunction with user-defined table generating functions such as explode(). As mentioned in Built-in Table-Generating Functions, a UDTF generates zero or more output rows for each input row. A lateral view first applies the UDTF to each row of base table and then joins resulting output rows to the input rows to form a virtual table having the supplied table alias.

Version

Icon

Prior to Hive 0.6.0, lateral view did not support the predicate push-down optimization. In Hive 0.5.0 and earlier, if you used a WHERE clause your query may not have compiled. A workaround was to add set hive.optimize.ppd=false; before your query. The fix was made in Hive 0.6.0; seehttps://issues.apache.org/jira/browse/HIVE-1056: Predicate push down does not work with UDTF's.

Version

Icon

From Hive 0.12.0, column aliases can be omitted. In this case, aliases are inherited from field names of StructObjectInspector which is returned from UTDF.

Example

Consider the following base table named pageAds. It has two columns: pageid (name of the page) and adid_list (an array of ads appearing on the page):

Column name

Column type

pageid

STRING

adid_list

Array<int>

An example table with two rows:

pageid

adid_list

front_page

[1, 2, 3]

contact_page

[3, 4, 5]

and the user would like to count the total number of times an ad appears across all pages.

A lateral view with explode() can be used to convert adid_list into separate rows using the query:

SELECT pageid, adid
FROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid;

The resulting output will be

pageid (string)

adid (int)

"front_page"

1

"front_page"

2

"front_page"

3

"contact_page"

3

"contact_page"

4

"contact_page"

5

Then in order to count the number of times a particular ad appears, count/group by can be used:

SELECT adid, count(1)
FROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid
GROUP BY adid;

int adid

count(1)

1

1

2

1

3

2

4

1

5

1

Multiple Lateral Views

A FROM clause can have multiple LATERAL VIEW clauses. Subsequent LATERAL VIEWS can reference columns from any of the tables appearing to the left of the LATERAL VIEW.

For example, the following could be a valid query:

SELECT FROM exampleTable
LATERAL VIEW explode(col1) myTable1 AS myCol1
LATERAL VIEW explode(myCol1) myTable2 AS myCol2;

LATERAL VIEW clauses are applied in the order that they appear. For example with the following base table:

Array<int> col1

Array<string> col2

[1, 2]

[a", "b", "c"]

[3, 4]

[d", "e", "f"]

The query:

SELECT myCol1, col2 FROM baseTable
LATERAL VIEW explode(col1) myTable1 AS myCol1;

Will produce:

int mycol1

Array<string> col2

1

[a", "b", "c"]

2

[a", "b", "c"]

3

[d", "e", "f"]

4

[d", "e", "f"]

A query that adds an additional LATERAL VIEW:

SELECT myCol1, myCol2 FROM baseTable
LATERAL VIEW explode(col1) myTable1 AS myCol1
LATERAL VIEW explode(col2) myTable2 AS myCol2;

Will produce:

int myCol1

string myCol2

1

"a"

1

"b"

1

"c"

2

"a"

2

"b"

2

"c"

3

"d"

3

"e"

3

"f"

4

"d"

4

"e"

4

"f"

Outer Lateral Views

Version

Icon

Introduced in Hive version 0.12.0

The user can specify the optional OUTER keyword to generate rows even when a LATERAL VIEW usually would not generate a row. This happens when the UDTF used does not generate any rows which happens easily with explode when the column to explode is empty. In this case the source row would never appear in the results. OUTER can be used to prevent that and rows will be generated with NULL values in the columns coming from the UDTF.

For example, the following query returns an empty result:

SELEC * FROM src LATERAL VIEW explode(array()) C AS a limit 10;

But with the OUTER keyword

SELECT FROM src LATERAL VIEW OUTER explode(array()) C AS a limit 10;

it will produce:

238 val_238 NULL
86 val_86 NULL
311 val_311 NULL
27 val_27 NULL
165 val_165 NULL
409 val_409 NULL
255 val_255 NULL
278 val_278 NULL
98 val_98 NULL

[HIve - LanguageManual] LateralView的更多相关文章

  1. [HIve - LanguageManual] Hive Operators and User-Defined Functions (UDFs)

    Hive Operators and User-Defined Functions (UDFs) Hive Operators and User-Defined Functions (UDFs) Bu ...

  2. [Hive - LanguageManual ] Windowing and Analytics Functions (待)

    LanguageManual WindowingAndAnalytics     Skip to end of metadata   Added by Lefty Leverenz, last edi ...

  3. [Hive - LanguageManual] Import/Export

    LanguageManual ImportExport     Skip to end of metadata   Added by Carl Steinbach, last edited by Le ...

  4. [Hive - LanguageManual] DML: Load, Insert, Update, Delete

    LanguageManual DML Hive Data Manipulation Language Hive Data Manipulation Language Loading files int ...

  5. [Hive - LanguageManual] Alter Table/Partition/Column

    Alter Table/Partition/Column Alter Table Rename Table Alter Table Properties Alter Table Comment Add ...

  6. Hive LanguageManual DDL

    hive语法规则LanguageManual DDL SQL DML 和 DDL 数据操作语言 (DML) 和 数据定义语言 (DDL) 一.数据库 增删改都在文档里说得也很明白,不重复造车轮 二.表 ...

  7. [Hive - LanguageManual ] ]SQL Standard Based Hive Authorization

    Status of Hive Authorization before Hive 0.13 SQL Standards Based Hive Authorization (New in Hive 0. ...

  8. [Hive - LanguageManual] Hive Concurrency Model (待)

    Hive Concurrency Model Hive Concurrency Model Use Cases Turn Off Concurrency Debugging Configuration ...

  9. [Hive - LanguageManual ] Explain (待)

    EXPLAIN Syntax EXPLAIN Syntax Hive provides an EXPLAIN command that shows the execution plan for a q ...

随机推荐

  1. Android:修改版本

    修改AndroidManifest.xml下的Version即可 <uses-sdk android:minSdkVersion="14" android:targetSdk ...

  2. 查看Linux版本系统信息方法汇总

    Linux下如何查看版本信息, 包括位数.版本信息以及CPU内核信息.CPU具体型号等等,整个CPU信息一目了然.  1.# uname -a   (Linux查看版本当前操作系统内核信息)  Lin ...

  3. BitMask 使用参考

    对于 Java 类应用,内存方面需要注意: 不要占用大量内存,否则可用内存少:触发 GC 或 OutOfMemoryError: 不要频繁创建对象,频繁内存分配,触发 GC. 对于枚举和常量: 使用枚 ...

  4. sqlplus时报Linux-x86_64 Error: 13: Permission denied

    在本机上非oracle用户运行sqlplus时,报以下错误:[cpdds@node1 ~]$ sqlplus cpdds_pdata/cpdds_pdata SQL*Plus: Release 10. ...

  5. C/C++技巧

    C中如何调用C++函数 将 C++ 函数声明为``extern "C"''(在你的 C++ 代码里做这个声明),然后调用它(在你的 C 或者 C++ 代码里调用).例如: // C ...

  6. ORACLE将表中的数据恢复到某一个时间点

    执行如下SQL将test_temp表中的数据恢复到 2013-04-26  21:06:00 注意,这里一定要先删除全部数据,否则可能会导致数据重复 delete from test_temp; in ...

  7. 面试题_48_to_65_Java 集合框架的面试题

    这部分也包含数据结构.算法及数组的面试问题 48) List.Set.Map 和 Queue 之间的区别(答案)List 是一个有序集合,允许元素重复.它的某些实现可以提供基于下标值的常量访问时间,但 ...

  8. java6 新特新

    JAVA6新特性介绍   1. 使用JAXB来实现对象与XML之间的映射 JAXB是Java Architecture for XML Binding的缩写,可以将一个Java对象转变成为XML格式, ...

  9. 418. Sentence Screen Fitting

    首先想到的是直接做,然后TLE. public class Solution { public int wordsTyping(String[] sentence, int rows, int col ...

  10. UVa 11584 Partitioning by Palindromes【DP】

    题意:给出一个字符串,问最少能够划分成多少个回文串 dp[i]表示以第i个字母结束最少能够划分成的回文串的个数 dp[i]=min(dp[i],dp[j]+1)(如果从第j个字母到第i个字母是回文串) ...