hive复杂数据类型的用法

1、简单描述
2、测试

1、简单描述

arrays: ARRAY<data_type>
maps: MAP<primitive_type, data_type>
structs: STRUCT<col_name : data_type [COMMENT col_comment], ...>
union: UNIONTYPE<data_type, data_type, ...>

Hive 中对该类型的完全支持仍然不完整。如果 JOIN、WHERE 和 GROUP BY 子句中引用的 UNIONTYPE 字段的查询将会失败，Hive 没有定义语法来提取 UNIONTYPE 的 tag 或 value 字段。

复杂数据类型的构造函数：

构造函数	操作数	描述
map	(key1, value1, key2, value2, ...)	Creates a map with the given key/value pairs.
struct	(val1, val2, val3, ...)	Creates a struct with the given field values. Struct field names will be col1, col2, ....
named_struct	(name1, val1, name2, val2, ...)	Creates a struct with the given field names and values. (As of Hive 0.8.0.)
array	(val1, val2, ...)	Creates an array with the given elements.
create_union	(tag, val1, val2, ...)	Creates a union type with the value that is being pointed to by the tag parameter.

注：create_union 中的 tag 让我们知道 union 的哪一部分正在被使用。

复杂数据类型访问元素：

构造函数	操作数	描述
A[n]	A is an Array and n is an int	Returns the nth element in the array A. The first element has index 0. For example, if A is an array comprising of ['foo', 'bar'] then A[0] returns 'foo' and A[1] returns 'bar'.
M[key]	M is a Map<K, V> and key has type K	Returns the value corresponding to the key in the map. For example, if M is a map comprising of {'f' -> 'foo', 'b' -> 'bar', 'all' -> 'foobar'} then M['all'] returns 'foobar'.
S.x	S is a struct	Returns the x field of S. For example for the struct foobar {int foo, int bar}, foobar.foo returns the integer stored in the foo field of the struct.

2、测试

-- ------------------------------ ARRAY ------------------------------

-- ARRAY<data_type>

create table arraytest (id int,info array<string>)

row format delimited

fields terminated by '\t'

collection items terminated by ','

stored as textfile;

-- 不要忽略`collection items terminated by ','

-- 它表示数组元素间的分隔符

-- 如果忽略了输出是这样的：

hive> select * from arraytest;

OK

1       ["zhangsan,male"]

2       ["lisi,male"]

-- 数据

1	zhangsan,male

2	lisi,male

-- 导入

load data local inpath '/root/data/arraytest.txt' into table arraytest;

-- 查看

hive> select * from arraytest;

OK

1       ["zhangsan","male"]

2       ["lisi","male"]

-- 索引查看数组元素

hive> select id,info[0] from arraytest;

OK

1       zhangsan

2       lisi

-- 将数组的所有元素展开输出

hive> select explode(info) from arraytest;

OK

zhangsan

male

lisi

male

-- ------------------------------ MAP ------------------------------

-- MAP<primitive_type, data_type>

create table maptest (id int,info map<string,string>)

row format delimited

fields terminated by '\t'

collection items terminated by ','

map keys terminated by ':'

stored as textfile;

-- 不要忽略`map keys terminated by ':'

-- 它表示键值间的分隔符

-- 数据

1	name:zhangsan,sex:male

2	name:lisi,sex:male

-- 导入

load data local inpath '/root/data/maptest.txt' into table maptest;

-- 查看

hive> select * from maptest;

OK

1       {"name":"zhangsan","sex":"male"}

2       {"name":"lisi","sex":"male"}

-- 查看map元素

hive> select id,info["name"] from maptest;

OK

1       zhangsan

2       lisi

-- ------------------------------ STRUCT ------------------------------

-- STRUCT<col_name : data_type [COMMENT col_comment], ...>

create table structtest (id int,info struct<name:string,sex:string>)

row format delimited

fields terminated by '\t'

collection items terminated by ','

stored as textfile;

-- 数据

1	zhangsan,male

2	lisi,male

-- 导入

load data local inpath '/root/data/structtest.txt' into table structtest;

-- 查看

hive> select * from structtest;

OK

1       {"name":"zhangsan","sex":"male"}

2       {"name":"lisi","sex":"male"}

hive> select id,info.name from structtest;

OK

1       zhangsan

2       lisi

-- ------------------------------ 综合array\map\struct ------------------------------

create table alltest(

    id int,

    name string,

    salary bigint,

    sub array<string>,

    details map<string, int>,

    address struct<city:string, state:string, pin:int>

)

row format delimited

fields terminated by ','

collection items terminated by '$'

map keys terminated by '#'

stored as textfile;

-- 数据

1,abc,40000,a$b$c,pf#500$epf#200,hyd$ap$500001

2,def,3000,d$f,pf#500,bang$kar$600038

4,abc,40000,a$b$c,pf#500$epf#200,bhopal$MP$452013

5,def,3000,d$f,pf#500,Indore$MP$452014

-- 导入数据

load data local inpath '/root/data/alltest.txt' into table alltest;

-- 查看

hive> select * from alltest;

OK

1       abc     40000   ["a","b","c"]   {"pf":500,"epf":200}    {"city":"hyd","state":"ap","pin":500001}

2       def     3000    ["d","f"]       {"pf":500}      {"city":"bang","state":"kar","pin":600038}

4       abc     40000   ["a","b","c"]   {"pf":500,"epf":200}    {"city":"bhopal","state":"MP","pin":452013}

5       def     3000    ["d","f"]       {"pf":500}      {"city":"Indore","state":"MP","pin":452014}

-- ------------------------------ UNIONTYPE ------------------------------

-- create_union(tag, val1, val2, ...)

-- Creates a union type with the value that is being pointed to by the tag parameter. 

-- ---- 简单示例：里面都是基本类型 ------

create table uniontest(

    id int,

    info uniontype<string,string>

)

row format delimited

fields terminated by '\t'

collection items terminated by ','

stored as textfile;

-- 插入数据：insert into

-- tag 索引后面的值是从 0 开始的

insert into table uniontest

    values

    (1,create_union(0,"zhangsan","male")),  -- 使用 "zhangsan"

    (1,create_union(1,"zhangsan","male")),  -- 使用 "male"

    (2,create_union(0,"lisi","female")),

    (2,create_union(1,"lisi","female"));

-- 查看

hive> select * from uniontest;

OK

1       {0:"zhangsan"}

1       {1:"male"}

2       {0:"lisi"}

2       {1:"female"}

-- 数据

1	0,zhangsan

1	1,male

2	0,lisi

2	1,female

-- 插入数据：load data

load data local inpath '/root/data/uniontest.txt' into table uniontest;

-- 查看

hive> select * from uniontest;

OK

1       {0:"zhangsan"}

1       {1:"male"}

2       {0:"lisi"}

2       {1:"female"}

-- 如果数据格式是这样的：

-- 1	0,zhangsan,male

-- 1	1,zhangsan,male

-- 2	0,lisi,female

-- 2	1,lisi,female

-- 会把后面的字符串当作一个整体，输出：

-- 1       {0:"zhangsan,male"}

-- 1       {1:"zhangsan,male"}

-- 2       {0:"lisi,female"}

-- 2       {1:"lisi,female"}

-- ---- 复杂示例：里面包含复杂类型 ------

create table uniontest_comp(

    id int,

    info uniontype<int,

                   string,

                   array<string>,

                   map<string,string>,

                   struct<sex:string,age:string>>

)

row format delimited

fields terminated by '\t'

collection items terminated by ','

stored as textfile;

-- 插入数据

-- 也可以使用 `insert into table ....select ....`

insert into table uniontest_comp

    values

    (1,create_union(0,1,"zhangsan",array("male","33"),map("sex","male","age","33"),named_struct("sex","male","age","33"))),

    (1,create_union(1,1,"zhangsan",array("male","33"),map("sex","male","age","33"),named_struct("sex","male","age","33"))),

    (1,create_union(2,1,"zhangsan",array("male","33"),map("sex","male","age","33"),named_struct("sex","male","age","33"))),

    (1,create_union(3,1,"zhangsan",array("male","33"),map("sex","male","age","33"),named_struct("sex","male","age","33"))),

    (1,create_union(4,1,"zhangsan",array("male","33"),map("sex","male","age","33"),named_struct("sex","male","age","33")));

-- 查看

hive> select * from uniontest_comp;

OK

1       {0:1}

1       {1:"zhangsan"}

1       {2:["male","33"]}

1       {3:{"sex":"male","age":"33"}}

1       {4:{"sex":"male","age":"33"}}

参考：http://querydb.blogspot.com/2015/11/hive-complex-data-types.html

hive复杂数据类型的用法的更多相关文章

大数据时代的技术hive：hive的数据类型和数据模型
在上篇文章里,我列举了一个简单的hive操作实例,创建了一张表test,并且向这张表加载了数据,这些操作和关系数据库操作类似,我们常把hive和关系数据库进行比较,也正是因为hive很多知识点和关系数 ...
day01-day04总结- Python 数据类型及其用法
Python 数据类型及其用法: 本文总结一下Python中用到的各种数据类型,以及如何使用可以使得我们的代码变得简洁. 基本结构我们首先要看的是几乎任何语言都具有的数据类型,包括字符串.整型.浮点 ...
Hive 5、Hive 的数据类型和 DDL Data Definition Language)
官方帮助文档:https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL Hive的数据类型 -- 扩展数据类型data_t ...
hadoop笔记之Hive的数据类型
Hive的数据类型 Hive的数据类型前面说过,Hive是一个数据仓库,相当于一个数据库.既然是数据库,那么就必须能创建表,既然有表,那么当中就有列,列中就有对应的类型总的来讲,hive的数据类型 ...
Hive之数据类型
Hive之数据类型 (本文是基于多篇文章根据个人理解进行的整合,参考的文章见末尾的整理) 数据类型 Hive支持两种数据类型,一类叫原子数据类型,一类叫复杂数据类型.原子数据类型包括数值型.布尔型 ...
Hive 复杂数据类型的使用
Hive复杂数据类型 1.Array数据类型的使用 1.1.创建数据库表,以array作为数据类型 hive (hive_demo1)> create table stu_test(name a ...
《Hive编程指南》读书笔记 | 一文看懂Hive的数据类型和文件格式
Hive支持关系型数据库中的大多数基本数据类型,同时也支持关系型数据库中很少出现的3种集合数据类型. 和大多数数据库相比,Hive具有一个独特的功能,那就是其对于数据在文件中的编码方式具有非常大的灵活 ...
Mybatis中动态SQL语句中的parameterType不同数据类型的用法
Mybatis中动态SQL语句中的parameterType不同数据类型的用法1. 简单数据类型, 此时#{id,jdbcType=INTEGER}中id可以取任意名字如#{a,jdbcType ...
【Kylin实战】Hive复杂数据类型与视图
1. 引言在分析广告日志时,会有这样的多维分析需求: 曝光.点击用户分别有多少? 标签能覆盖多少广告用户? 各个标签(标注)类别能覆盖的曝光.点击在各个DSP上所覆盖的用户数 -- 广告数据与标签数 ...

随机推荐

AtCoder Beginner Contest 184 F - Programming Contest (双向搜索)
题意:有一个长度为$n$的数组,你可以从中选一些数出来使得它们的和不大于$t$,问能选出来的最大的和是多少. 题解:$n$的数据范围是$40$,直接二进制枚举贴TLE,之前写过这样的一 ...
002、Python中json字符串与字典转换
1.测试用例文件TestCase.xlsx 2.编写Python文件进行读取 #!/usr/bin/env python # -*- coding:utf-8 -*- import time impo ...
Seq2Seq原理详解
一.Seq2Seq简介 seq2seq 是一个Encoder–Decoder 结构的网络,它的输入是一个序列,输出也是一个序列.Encoder 中将一个可变长度的信号序列变为固定长度的向量表达,Dec ...
Windows下TeamViewer远程控制的安装与使用
Windows下远程控制的安装与使用由于疫情,在家里写论文,资料数据都在学校,只能远程控制了,选的是TeamViewer. 分为控制机和被控制机,安装使用略有不同. 从该网址安装:https://w ...
C# 数据类型(3)
动态类型 dynamic types 动态类型是后来引进的,他其实是一个static type,但是不像其他的静态类型,编译器不会检查你到底是啥类型(也不会检查你能不能去call某个'method') ...
关于ucore实验一的资料查找
任务:阅读实验一makefile 搞清楚ucore.img是如何构建的 $@ $< $^ 这三个变量分别是什么意思 https://blog.csdn.net/YEYUANGEN/arti ...
ASP.Net MVP Framework had been dead !
ASP.Net MVP Framework Project Description A project to get you started with creating and designing w ...
web 安全 & web 攻防: XSS（跨站脚本攻击）和 CSRF（跨站请求伪造）
web 安全 & web 攻防: XSS(跨站脚本攻击)和 CSRF(跨站请求伪造) XSS(跨站脚本攻击)和CSRF(跨站请求伪造) Cross-site Scripting (XSS) h ...
Electron Security All In One
Electron Security All In One https://www.electronjs.org/docs/tutorial/security CSP Content-Security- ...
TensorFlow & Machine Learning
TensorFlow & Machine Learning TensorFlow 实战传统方式规则 + 数据集 => 答案无监督学习机器学习神经元网络答案 + 数据集 =&g ...

hive复杂数据类型的用法

1、简单描述

2、测试

hive复杂数据类型的用法的更多相关文章

随机推荐

热门专题