hiveql basic
set hive.cli.print.current.db=true;
set hive.mapred.mode=strict;
set hive.mapred.mode=nonstrict;
SHOW PARTITIONS tablename;
--Dynamic Partition Inserts --by position not by names
INSERT OVERWRITE TABLE employees
PARTITION (country, state)
SELECT ..., se.cnty, se.st
FROM staged_employees se;SET hive.map.aggr=true;
----with this way , we can not generate the temporary table
FROM (
SELECT upper(name), salary, deductions["Federal Taxes"] as fed_taxes,
round(salary * (1 - deductions["Federal Taxes"])) as salary_minus_fed_taxes
FROM employees
) e
SELECT e.name, e.salary_minus_fed_taxes
WHERE e.salary_minus_fed_taxes > 70000;
--When Hive Can Avoid MapReduce
set hive.exec.mode.local.auto=true;
--Hive supports the classic SQL JOINstatement, but only equi-joinsare supported.
--Hive also assumes that the lasttable in the query is the largest
--It attempts to buffer the other tables and then stream the last table through
-- you should structure your join queries so the largest table is last.
SELECT /*+ STREAMTABLE(s) */ s.ymd, s.symbol, s.price_close, d.dividend
FROM stocks s JOIN dividends d ON s.ymd = d.ymd AND s.symbol = d.symbol
WHERE s.symbol = 'AAPL';
set hive.auto.convert.join=true;
hive.mapjoin.smalltable.filesize=25000000;--table size less than this can use in map phase
SELECT /*+ MAPJOIN(d) */ s.ymd, s.symbol, s.price_close, d.dividend
FROM stocks s JOIN dividends d ON s.ymd = d.ymd AND s.symbol = d.symbol
WHERE s.symbol = 'AAPL';
set hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
set hive.optimize.bucketmapjoin=true;
set hive.optimize.bucketmapjoin.sortedmerge=true;
--Using DISTRIBUTE BY ... SORT BYor the shorthand CLUSTER BYclauses is a way to exploit
--the parallelism of SORT BY, yet achieve a total ordering across the output files.
--this method is better than use order by (just one reducer);
--Queries that Sample Data
SELECT * from numbers TABLESAMPLE(BUCKET 3 OUT OF 10 ON rand()) s;
SELECT * FROM numbersflat TABLESAMPLE(0.1 PERCENT) s;--block sampling
--index
CREATE INDEX employees_index
ON TABLE employees (country)
AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
WITH DEFERRED REBUILD;
hiveql basic的更多相关文章
- Atitit HTTP 认证机制基本验证 (Basic Authentication) 和摘要验证 (Digest Authentication)attilax总结
Atitit HTTP认证机制基本验证 (Basic Authentication) 和摘要验证 (Digest Authentication)attilax总结 1.1. 最广泛使用的是基本验证 ( ...
- Basic Tutorials of Redis(9) -First Edition RedisHelper
After learning the basic opreation of Redis,we should take some time to summarize the usage. And I w ...
- Basic Tutorials of Redis(8) -Transaction
Data play an important part in our project,how can we ensure correctness of the data and prevent the ...
- Basic Tutorials of Redis(7) -Publish and Subscribe
This post is mainly about the publishment and subscription in Redis.I think you may subscribe some o ...
- Basic Tutorials of Redis(6) - List
Redis's List is different from C#'s List,but similar with C#'s LinkedList.Sometimes I confuse with t ...
- Basic Tutorials of Redis(5) - Sorted Set
The last post is mainly about the unsorted set,in this post I will show you the sorted set playing a ...
- Basic Tutorials of Redis(4) -Set
This post will introduce you to some usages of Set in Redis.The Set is a unordered set,it means that ...
- Basic Tutorials of Redis(3) -Hash
When you first saw the name of Hash,what do you think?HashSet,HashTable or other data structs of C#? ...
- Basic Tutorials of Redis(2) - String
This post is mainly about how to use the commands to handle the Strings of Redis.And I will show you ...
随机推荐
- C# 生成BMP图片
;i<;i++) { Bitmap bmp=new Bitmap(this.pictureBox1.Image); Graphics g=Graphics.FromImage((Image)bm ...
- 创建WCF服务自我寄宿
WCF服务的寄宿方式 WCF寄宿方式是一种非常灵活的操作,可以寄宿在各种进程之中,常见的寄宿有: IIS服务.Windows服务.Winform程序.控制台程序中进行寄宿,从而实现WCF服务的运行,为 ...
- 【NOIP训练】【Tarjan求割边】上学
题目描述 给你一张图,询问当删去某一条边时,起点到终点最短路是否改变. 输入格式 第一行输入两个正整数,分别表示点数和边数.第二行输入两个正整数,起点标号为,终点标号为.接下来行,每行三个整数,表示有 ...
- JavaScript 中有关数组对象的方法
JS 处理数组多种方法 js 中的数据类型分为两大类:原始类型和对象类型. 原始类型包括:数值.字符串.布尔值.null.undefined 对象类型包括:对象即是属性的集合,当然这里又两个特殊的对象 ...
- ADO.NET 增删改查的基本用法
ADO.NET:数据访问技术 就是将C#和MSSQL连接起来的一个纽带 可以通过ADO.NET将内存中的临时数据写入到数据库中也可以将数据库中的数据提取到内存中供程序调用 所有数据访问技术的基础 连接 ...
- SQL数据库基础(六)
子查询,又叫做嵌套查询. 将一个查询语句做为一个结果集供其他SQL语句使用,就像使用普通的表一样,被当作结果集的查询语句被称为子查询. 子查询有两种类型: 一种是只返回一个单值的子查询,这时它可以用在 ...
- Atitit.电脑图片与拍摄图片的分别
Atitit.电脑图片与拍摄图片的分别 1. Extname都是jpg的..1 1.1. 数码照片的Exif信息, 1 1.2. 是否有人脸1 1.3. 是否skin图1 1.4. 是否大面积色素单一 ...
- Sharepoint学习笔记—习题系列--70-573习题解析 -(Q81-Q84)
Question 81You need to create a Web Part that creates a copy of the out-of-the-box Contribute permis ...
- ubuntu下安装程序的三种方法
引言 在ubuntu当中,安装应用程序我所知道的有三种方法,分别是apt-get,dpkg安装deb和make install安装源码包三种.下面针对每一种方法各举例来说明. apt-get方法 使用 ...
- cell自适应高度
MyModel.h #import <Foundation/Foundation.h> #import <UIKit/UIKit.h> @interface MyModel : ...