hiveql basic
set hive.cli.print.current.db=true;
set hive.mapred.mode=strict;
set hive.mapred.mode=nonstrict;
SHOW PARTITIONS tablename;
--Dynamic Partition Inserts --by position not by names
INSERT OVERWRITE TABLE employees
PARTITION (country, state)
SELECT ..., se.cnty, se.st
FROM staged_employees se;SET hive.map.aggr=true;
----with this way , we can not generate the temporary table
FROM (
SELECT upper(name), salary, deductions["Federal Taxes"] as fed_taxes,
round(salary * (1 - deductions["Federal Taxes"])) as salary_minus_fed_taxes
FROM employees
) e
SELECT e.name, e.salary_minus_fed_taxes
WHERE e.salary_minus_fed_taxes > 70000;
--When Hive Can Avoid MapReduce
set hive.exec.mode.local.auto=true;
--Hive supports the classic SQL JOINstatement, but only equi-joinsare supported.
--Hive also assumes that the lasttable in the query is the largest
--It attempts to buffer the other tables and then stream the last table through
-- you should structure your join queries so the largest table is last.
SELECT /*+ STREAMTABLE(s) */ s.ymd, s.symbol, s.price_close, d.dividend
FROM stocks s JOIN dividends d ON s.ymd = d.ymd AND s.symbol = d.symbol
WHERE s.symbol = 'AAPL';
set hive.auto.convert.join=true;
hive.mapjoin.smalltable.filesize=25000000;--table size less than this can use in map phase
SELECT /*+ MAPJOIN(d) */ s.ymd, s.symbol, s.price_close, d.dividend
FROM stocks s JOIN dividends d ON s.ymd = d.ymd AND s.symbol = d.symbol
WHERE s.symbol = 'AAPL';
set hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
set hive.optimize.bucketmapjoin=true;
set hive.optimize.bucketmapjoin.sortedmerge=true;
--Using DISTRIBUTE BY ... SORT BYor the shorthand CLUSTER BYclauses is a way to exploit
--the parallelism of SORT BY, yet achieve a total ordering across the output files.
--this method is better than use order by (just one reducer);
--Queries that Sample Data
SELECT * from numbers TABLESAMPLE(BUCKET 3 OUT OF 10 ON rand()) s;
SELECT * FROM numbersflat TABLESAMPLE(0.1 PERCENT) s;--block sampling
--index
CREATE INDEX employees_index
ON TABLE employees (country)
AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
WITH DEFERRED REBUILD;
hiveql basic的更多相关文章
- Atitit HTTP 认证机制基本验证 (Basic Authentication) 和摘要验证 (Digest Authentication)attilax总结
Atitit HTTP认证机制基本验证 (Basic Authentication) 和摘要验证 (Digest Authentication)attilax总结 1.1. 最广泛使用的是基本验证 ( ...
- Basic Tutorials of Redis(9) -First Edition RedisHelper
After learning the basic opreation of Redis,we should take some time to summarize the usage. And I w ...
- Basic Tutorials of Redis(8) -Transaction
Data play an important part in our project,how can we ensure correctness of the data and prevent the ...
- Basic Tutorials of Redis(7) -Publish and Subscribe
This post is mainly about the publishment and subscription in Redis.I think you may subscribe some o ...
- Basic Tutorials of Redis(6) - List
Redis's List is different from C#'s List,but similar with C#'s LinkedList.Sometimes I confuse with t ...
- Basic Tutorials of Redis(5) - Sorted Set
The last post is mainly about the unsorted set,in this post I will show you the sorted set playing a ...
- Basic Tutorials of Redis(4) -Set
This post will introduce you to some usages of Set in Redis.The Set is a unordered set,it means that ...
- Basic Tutorials of Redis(3) -Hash
When you first saw the name of Hash,what do you think?HashSet,HashTable or other data structs of C#? ...
- Basic Tutorials of Redis(2) - String
This post is mainly about how to use the commands to handle the Strings of Redis.And I will show you ...
随机推荐
- fibonacci数列从a到b的个数
Description 我们定义斐波那契数列如下: f1=1 f2=2 f(n)=f(n-1)+f(n-2)(n>=3) 现在,给定两个数a和b,计算有多少个斐波那契数列中的数在a和b之间( ...
- 最全面的jdbcUtils,总有一种适合你
附加jar包,TxQueryRunner.java文件,dbconfig.properties配置文件(点击链接下载): http://files.cnblogs.com/files/xiaoming ...
- Maven的安装使用以及 Maven+Spring hello world example
关于Maven Maven是一个用于项目构建的工具,通过它便捷的管理项目的生命周期.即项目的jar包依赖,开发,测试,发布打包. 做过.NET的人应该会联想到Nuget,是的Maven其实就是java ...
- Jaxb解析xml准换为javabean
先说下这个的背景吧,前些日子,有个以前的小同事说刚接触webservice,想解析下xml,记得我学的时候还是dom4j,sax的解析方式,最近看别人的代码用的jaxb的方式,觉得注解起来很简练,所以 ...
- nodejs连接mysql并进行简单的增删查改
最近在入门nodejs,正好学习到了如何使用nodejs进行数据库的连接,觉得比较重要,便写一下随笔,简单地记录一下 使用在安装好node之后,我们可以使用npm命令,在项目的根目录,安装nodejs ...
- win7 下配置Openssl
最近刚刚装了openssl,遇到了很多问题,于是记录了下来: 我的PC环境是:系统win7,32位,Microsoft Visual Studio 2010: 下面开始安装: 1.安装前的准备:首先下 ...
- ASP.NET页面间传值总结
本文我们将讨论的是ASP.NET页面间数据传递的几种方法,对此希望能帮助大家正确的理解ASP.NET页面间数据传递的用处以及便利性. Web页面是无状态的,服务器对每一次请求都认为来自不同用户,因此, ...
- android XMl 解析神奇xstream 一: 解析android项目中 asset 文件夹 下的 aa.xml 文件
简介 XStream 是一个开源项目,一套简单实用的类库,用于序列化对象与 XML 对象之间的相互转换. 将 XML 文件内容解析为一个对象或将一个对象序列化为 XML 文件. 1.下载工具 xstr ...
- 【读书笔记】iOS-UIFont-如何知道字体的PostScript名称
一,名词解释 PostScript字体: 按 PostScript 页面描述语言 (PDL) 规则定义的字体,并且只能在 PostScript 兼容的打印机上打印. 二,打开Launchpad---- ...
- mac 下安装android studio(转)
1)下载最新jdk8,下载android studio 2)安装jdk8,双击jdk8的安装包,将jdk8的安装包拖到Application,可能会出现这种问题:要求Mac OS X10.7.3或更高 ...