hiveql basic
set hive.cli.print.current.db=true;
set hive.mapred.mode=strict;
set hive.mapred.mode=nonstrict;
SHOW PARTITIONS tablename;
--Dynamic Partition Inserts --by position not by names
INSERT OVERWRITE TABLE employees
PARTITION (country, state)
SELECT ..., se.cnty, se.st
FROM staged_employees se;SET hive.map.aggr=true;
----with this way , we can not generate the temporary table
FROM (
SELECT upper(name), salary, deductions["Federal Taxes"] as fed_taxes,
round(salary * (1 - deductions["Federal Taxes"])) as salary_minus_fed_taxes
FROM employees
) e
SELECT e.name, e.salary_minus_fed_taxes
WHERE e.salary_minus_fed_taxes > 70000;
--When Hive Can Avoid MapReduce
set hive.exec.mode.local.auto=true;
--Hive supports the classic SQL JOINstatement, but only equi-joinsare supported.
--Hive also assumes that the lasttable in the query is the largest
--It attempts to buffer the other tables and then stream the last table through
-- you should structure your join queries so the largest table is last.
SELECT /*+ STREAMTABLE(s) */ s.ymd, s.symbol, s.price_close, d.dividend
FROM stocks s JOIN dividends d ON s.ymd = d.ymd AND s.symbol = d.symbol
WHERE s.symbol = 'AAPL';
set hive.auto.convert.join=true;
hive.mapjoin.smalltable.filesize=25000000;--table size less than this can use in map phase
SELECT /*+ MAPJOIN(d) */ s.ymd, s.symbol, s.price_close, d.dividend
FROM stocks s JOIN dividends d ON s.ymd = d.ymd AND s.symbol = d.symbol
WHERE s.symbol = 'AAPL';
set hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
set hive.optimize.bucketmapjoin=true;
set hive.optimize.bucketmapjoin.sortedmerge=true;
--Using DISTRIBUTE BY ... SORT BYor the shorthand CLUSTER BYclauses is a way to exploit
--the parallelism of SORT BY, yet achieve a total ordering across the output files.
--this method is better than use order by (just one reducer);
--Queries that Sample Data
SELECT * from numbers TABLESAMPLE(BUCKET 3 OUT OF 10 ON rand()) s;
SELECT * FROM numbersflat TABLESAMPLE(0.1 PERCENT) s;--block sampling
--index
CREATE INDEX employees_index
ON TABLE employees (country)
AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
WITH DEFERRED REBUILD;
hiveql basic的更多相关文章
- Atitit HTTP 认证机制基本验证 (Basic Authentication) 和摘要验证 (Digest Authentication)attilax总结
Atitit HTTP认证机制基本验证 (Basic Authentication) 和摘要验证 (Digest Authentication)attilax总结 1.1. 最广泛使用的是基本验证 ( ...
- Basic Tutorials of Redis(9) -First Edition RedisHelper
After learning the basic opreation of Redis,we should take some time to summarize the usage. And I w ...
- Basic Tutorials of Redis(8) -Transaction
Data play an important part in our project,how can we ensure correctness of the data and prevent the ...
- Basic Tutorials of Redis(7) -Publish and Subscribe
This post is mainly about the publishment and subscription in Redis.I think you may subscribe some o ...
- Basic Tutorials of Redis(6) - List
Redis's List is different from C#'s List,but similar with C#'s LinkedList.Sometimes I confuse with t ...
- Basic Tutorials of Redis(5) - Sorted Set
The last post is mainly about the unsorted set,in this post I will show you the sorted set playing a ...
- Basic Tutorials of Redis(4) -Set
This post will introduce you to some usages of Set in Redis.The Set is a unordered set,it means that ...
- Basic Tutorials of Redis(3) -Hash
When you first saw the name of Hash,what do you think?HashSet,HashTable or other data structs of C#? ...
- Basic Tutorials of Redis(2) - String
This post is mainly about how to use the commands to handle the Strings of Redis.And I will show you ...
随机推荐
- Visual Studio《加载此属性页时出错》的解决办法
打开aspx页面时不能切换到设计视图,vs 2008工具箱中无控件.打开vs 2008的工具>选项>HTML设计器时提示:加载此属性页时出错 有时还会有其它错误提示,比如打开一个Windo ...
- vmware screen
1. Question Description: the screen of the vmware looks small . 2. Solution: 2.1 look the size of sc ...
- 初学者对WAMP服务器的设置
服务器设置 在wamp/bin/apache/Apache###/conf/httpd.conf文件中设置 根文件夹 修改documentroot和directory两项 保存后重启服务 404返回值 ...
- Play 内置模板标签(1.2.3版本)http://www.anool.net/?p=617
a标签: 用来插入一个连接到控制器方法的html link.如下: #{a @Application.logout()}Disconnect#{/a}模板内容被解析后变成: <a href=&q ...
- 自制javascript游戏-点燃火绳
自制javascript游戏-点燃火绳 这是一款多关卡的游戏,目录有21个地图,游戏采纯原生 js库JY编写,所以编写得很简单迅速,这款游戏的思路来源于,一个人撸管太多,手会不会连鼠标也拿不稳,为了验 ...
- Egret Engine(白鹭引擎)介绍及windows下安装
Egret Engine简要介绍----- Egret Engine(白鹭引擎)[Egret Engine官网:http://www.egret-labs.org/]是一款使用TypeScript语言 ...
- 为网站添加ico图标
1.将目标图片装换为.ico格式 在线转换工具:http://www.bitbug.net/ 2.将装换后的图标放入项目文件中,一般命名为favicon 3.在项目文件的index页面中引入 < ...
- 如何在 在SharePoint 2013/2010 解决方案中添加 ashx (HttpHandler)
本文讲述如何在 在SharePoint 2013/2010 解决方案中添加 ashx (HttpHandler). 一般处理程序(HttpHandler)是·NET众多web组件的一种,ashx是其扩 ...
- C语言内存对齐详解
一.字节对齐基本概念 现代计算机中内存空间都是按照byte划分的,从理论上讲似乎对任何类型的变量的访问可以从任何地址开始,但实际情况是在访问特定类型变量的时候经常在特定的内存地址访问,这就需要各种类型 ...
- C迷途指针
在计算机编程领域中,迷途指针,或称悬空指针.野指针,指的是不指向任何合法的对象的指针. 当所指向的对象被释放或者收回,但是对该指针没有作任何的修改,以至于该指针仍旧指向已经回收的内存地址,此情况下该指 ...