[Hive] - Hive Parameters Explained in Detail

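　　Hive variables fall into three categories. The first is system variables, which hold system-level properties (in Hive's system: namespace these are the Java system properties). The second is env variables, which are the environment variables of the current user. The third is Hive configuration parameters, which are defined in hive-site.xml and in the current Hive session. This third category in turn consists of Hadoop HDFS parameters (taken directly from Hadoop), MapReduce parameters, metastore storage parameters, metastore connection parameters, and Hive runtime parameters.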
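The categories above can be told apart by name prefix. The following is a minimal Python sketch, not part of Hive itself: it assumes input lines of the form name=value, such as the Hive CLI prints for "set -v", with the system: and env: prefixes marking the first two categories. The sub-category prefixes (dfs., mapred., datanucleus., javax.jdo., hive.) are an assumption read off the parameter names in the table below, and the function names and sample values are hypothetical.

```python
# Prefix -> sub-category mapping for the third category (Hive parameters).
# Assumption: these prefixes are inferred from the parameter names in the
# table of this article, not from an official Hive classification.
HIVE_SUBCATEGORIES = [
    ("dfs.", "hadoop hdfs"),
    ("mapred.", "mapreduce"),
    ("mapreduce.", "mapreduce"),
    ("datanucleus.", "metastore storage"),
    ("javax.jdo.", "metastore connection"),
    ("hive.", "hive runtime"),
]


def classify_variable(name):
    """Return the category of a variable name based on its prefix."""
    if name.startswith("system:"):
        return "system"
    if name.startswith("env:"):
        return "env"
    for prefix, category in HIVE_SUBCATEGORIES:
        if name.startswith(prefix):
            return category
    return "other"


def group_settings(lines):
    """Group 'name=value' lines by category, skipping malformed lines."""
    groups = {}
    for line in lines:
        name, sep, value = line.partition("=")
        if not sep:
            continue  # not a name=value pair
        name = name.strip()
        groups.setdefault(classify_variable(name), {})[name] = value.strip()
    return groups


if __name__ == "__main__":
    # Hypothetical sample lines; the system: and env: values are made up.
    sample_lines = [
        "system:user.name=hadoop",
        "env:HOME=/home/hadoop",
        "hive.exec.parallel=false",
        "javax.jdo.option.Multithreaded=true",
    ]
    for category, settings in group_settings(sample_lines).items():
        print(category, settings)
```

Matching on prefixes keeps the sketch short; names that match none of the prefixes are reported as "other" rather than guessed.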

Hive-0.13.1-cdh5.3.6 Parameter Reference
Parameter	Default	Meaning (purpose)
datanucleus.autoCreateSchema	true	Creates the necessary schema on startup if one doesn't exist. Set this to false after the schema has been created once.
datanucleus.autoStartMechanismMode	checked	Throws an exception if the metadata tables are incorrect. Allowed values: checked, unchecked.
datanucleus.cache.level2	false	Whether to use a level 2 cache. Turn this off if metadata is changed independently of the Hive metastore server.
datanucleus.cache.level2.type	SOFT	Type of the level 2 cache: SOFT = soft-reference-based cache, WEAK = weak-reference-based cache, none = no cache.
datanucleus.connectionPoolingType	BoneCP	Connection pool used for the metastore database.
datanucleus.fixedDatastore	false
datanucleus.identifierFactory	datanucleus1	Name of the identifier factory to use when generating table/column names etc. in the metastore database.
datanucleus.plugin.pluginRegistryBundleCheck	LOG	Defines what happens when plugin bundles are found and are duplicated [EXCEPTION|LOG|NONE].
datanucleus.rdbms.useLegacyNativeValueStrategy	true
datanucleus.storeManagerType	rdbms	Metadata store type.
datanucleus.transactionIsolation	read-committed	Default transaction isolation level for identity generation.
datanucleus.validateColumns	false	Validates the existing schema against code. Turn this on if you want to verify the existing schema.
datanucleus.validateConstraints	false	Whether to validate constraints on existing tables.
datanucleus.validateTables	false	Whether to validate existing tables.
dfs.block.access.key.update.interval	600
hive.archive.enabled	false	Whether archiving operations are permitted.
hive.auto.convert.join	true	Whether Hive enables the optimization of converting a common join into a mapjoin based on the input file size.
hive.auto.convert.join.noconditionaltask	true	Whether Hive enables the optimization of converting a common join into a mapjoin based on the input file size. If this parameter is on, and the sum of the sizes of n-1 of the tables/partitions in an n-way join is smaller than the specified size, the join is directly converted to a mapjoin (there is no conditional task).
hive.auto.convert.join.noconditionaltask.size	10000000	If hive.auto.convert.join.noconditionaltask is off, this parameter has no effect. If it is on, and the sum of the sizes of n-1 of the tables/partitions in an n-way join is smaller than this size, the join is directly converted to a mapjoin (there is no conditional task). The default is 10MB.
hive.auto.convert.join.use.nonstaged	false	For conditional joins, if the input stream from a small alias can be applied directly to the join operator without filtering or projection, the alias need not be pre-staged in the distributed cache via a mapred local task. Currently this does not work with vectorization or the Tez execution engine.
hive.auto.convert.sortmerge.join	false	Whether the join is automatically converted to a sort-merge join if the joined tables pass the criteria for a sort-merge join.
hive.auto.convert.sortmerge.join.bigtable.selection.policy	org.apache.hadoop.hive.ql.optimizer.AvgPartitionSizeBasedBigTableSelectorForAutoSMJ
hive.auto.convert.sortmerge.join.to.mapjoin	false	Whether to try converting a sort-merge join to a map join.
hive.auto.progress.timeout	0	How long to run the autoprogressor for script/UDTF operators (in seconds). Set to 0 for forever.
hive.autogen.columnalias.prefix.includefuncname	false	Whether the column aliases Hive generates automatically include the function name. Not included by default.
hive.autogen.columnalias.prefix.label	_c	Prefix used for the column aliases Hive generates automatically.
hive.binary.record.max.length	1000	Maximum length of a Hive binary record.
hive.cache.expr.evaluation	true	If true, the evaluation result of a deterministic expression referenced twice or more is cached. For example, in a filter condition like ".. where key + 10 > 10 or key + 10 = 0", "key + 10" is evaluated and cached once, then reused for the following expression ("key + 10 = 0"). Currently this applies only to expressions in select or filter operators.
hive.cli.errors.ignore	false
hive.cli.pretty.output.num.cols	-1
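hive.cli.print.current.db	false	Whether to show the name of the current database in the CLI prompt. Not shown by default.
hive.cli.print.header	false	Whether to print a header in query output, for example the column names. Not printed by default.
hive.cli.prompt	hive	Hive command-line prompt. The client must be restarted after this is changed.
hive.cluster.delegation.token.store.class	org.apache.hadoop.hive.thrift.MemoryTokenStore	Class used to store the Hive cluster delegation tokens.
hive.cluster.delegation.token.store.zookeeper.znode	/hive/cluster/delegation	ZooKeeper znode under which the delegation tokens are stored.
hive.compactor.abortedtxn.threshold	1000	Number of aborted transactions on a table or partition that triggers a major compaction.
hive.compactor.check.interval	300	Interval between compaction checks, in seconds.
hive.compactor.delta.num.threshold	10	Number of delta directories in a table or partition that triggers a minor compaction.
hive.compactor.delta.pct.threshold	0.1	Size of the delta files relative to the base, as a fraction, that triggers a major compaction.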
hive.compactor.initiator.on	false
hive.compactor.worker.threads	0
hive.compactor.worker.timeout	86400	In seconds.
hive.compat	0.12	Version to stay compatible with.
hive.compute.query.using.stats	false
hive.compute.splits.in.am	true
hive.conf.restricted.list	hive.security.authenticator.manager,hive.security.authorization.manager
hive.conf.validation	true
hive.convert.join.bucket.mapjoin.tez	false
hive.counters.group.name	HIVE
hive.debug.localtask	false
hive.decode.partition.name	false
hive.default.fileformat	TextFile	Default file format. Defaults to TextFile.
hive.default.rcfile.serde	org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe	SerDe class used for RCFile.
hive.default.serde	org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe	Default SerDe class.
hive.display.partition.cols.separately	true	Whether partition columns are displayed separately.
hive.downloaded.resources.dir	/tmp/${hive.session.id}_resources	Directory in which downloaded resources are stored.
hive.enforce.bucketing	false	Whether bucketing is enforced.
hive.enforce.bucketmapjoin	false	Whether bucket map join is enforced.
hive.enforce.sorting	false	Whether sorting is enforced on insert.
hive.enforce.sortmergebucketmapjoin	false
hive.entity.capture.transform	false
hive.entity.separator	@	Separator used to construct names of tables and partitions. For example, dbname@tablename@partitionname
hive.error.on.empty.partition	false	Whether to throw an exception if a dynamic partition insert generates empty results.
hive.exec.check.crossproducts	true	Whether to check queries for cross products (Cartesian products).
hive.exec.compress.intermediate	false	Whether intermediate results are compressed. The compression settings come from the Hadoop configuration mapred.output.compress*.
hive.exec.compress.output	false	Whether the final output is compressed.
hive.exec.concatenate.check.index	true
hive.exec.copyfile.maxsize	33554432
hive.exec.counters.pull.interval	1000
hive.exec.default.partition.name	__HIVE_DEFAULT_PARTITION__
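hive.exec.drop.ignorenonexistent	true	Whether a drop operation ignores the error raised when the object does not exist. Ignored by default; if set to false, dropping a nonexistent object reports an error.
hive.exec.dynamic.partition	true	Whether dynamic partitions are allowed. If so, the partition values need not be specified when writing data.
hive.exec.dynamic.partition.mode	strict	Dynamic partition mode. strict requires at least one static partition value; nonstrict allows all partitions to be dynamic.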
hive.exec.infer.bucket.sort	false
hive.exec.infer.bucket.sort.num.buckets.power.two	false
hive.exec.job.debug.capture.stacktraces	true
hive.exec.job.debug.timeout	30000
hive.exec.local.scratchdir	/tmp/hadoop
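hive.exec.max.created.files	100000	Maximum number of HDFS files an MR job may create.
hive.exec.max.dynamic.partitions	1000	Maximum total number of dynamic partitions that may be created.
hive.exec.max.dynamic.partitions.pernode	100	Maximum number of dynamic partitions that may be created on each MR node.
hive.exec.mode.local.auto	false	Whether Hive may run jobs in local mode.
hive.exec.mode.local.auto.input.files.max	4	Maximum number of input files for local mode.
hive.exec.mode.local.auto.inputbytes.max	134217728	Maximum input size in bytes for local mode.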
hive.exec.orc.default.block.padding	true
hive.exec.orc.default.buffer.size	262144
hive.exec.orc.default.compress	ZLIB
hive.exec.orc.default.row.index.stride	10000
hive.exec.orc.default.stripe.size	268435456
hive.exec.orc.dictionary.key.size.threshold	0.8
hive.exec.orc.memory.pool	0.5
hive.exec.orc.skip.corrupt.data	false
hive.exec.orc.zerocopy	false
hive.exec.parallel	false	Whether parallel execution is allowed. Disabled by default.
hive.exec.parallel.thread.number	8	Number of threads for parallel execution. 8 by default.
hive.exec.perf.logger	org.apache.hadoop.hive.ql.log.PerfLogger
hive.exec.rcfile.use.explicit.header	true
hive.exec.rcfile.use.sync.cache	true
hive.exec.reducers.bytes.per.reducer	1000000000	Size of the data handled per reducer. The default is 1G, i.e. if the input size is 10G, it will use 10 reducers.
hive.exec.reducers.max	999	Maximum number of reducers allowed. Takes effect when mapred.reduce.tasks is set to a negative value.
hive.exec.rowoffset	false
hive.exec.scratchdir	/etc/hive-hadoop
hive.exec.script.allow.partial.consumption	false
hive.exec.script.maxerrsize	100000
hive.exec.script.trust	false
hive.exec.show.job.failure.debug.info	true
hive.exec.stagingdir	.hive-staging
hive.exec.submitviachild	false
hive.exec.tasklog.debug.timeout	20000
hive.execution.engine	mr	Execution engine: mr or tez (Hadoop 2).
hive.exim.uri.scheme.whitelist	hdfs,pfile
hive.explain.dependency.append.tasktype	false
hive.fetch.output.serde	org.apache.hadoop.hive.serde2.DelimitedJSONSerDe
hive.fetch.task.aggr	false
hive.fetch.task.conversion	minimal
hive.fetch.task.conversion.threshold	-1
hive.file.max.footer	100
hive.fileformat.check	true
hive.groupby.mapaggr.checkinterval	100000
hive.groupby.orderby.position.alias	false
hive.groupby.skewindata	false
hive.hadoop.supports.splittable.combineinputformat	false
hive.hashtable.initialCapacity	100000
hive.hashtable.loadfactor	0.75
hive.hbase.generatehfiles	false
hive.hbase.snapshot.restoredir	/tmp
hive.hbase.wal.enabled	true
hive.heartbeat.interval	1000
hive.hmshandler.force.reload.conf	false
hive.hmshandler.retry.attempts	1
hive.hmshandler.retry.interval	1000
hive.hwi.listen.host	0.0.0.0
hive.hwi.listen.port	9999
hive.hwi.war.file	lib/hive-hwi-${version}.war
hive.ignore.mapjoin.hint	true
hive.in.test	false
hive.index.compact.binary.search	true
hive.index.compact.file.ignore.hdfs	false
hive.index.compact.query.max.entries	10000000
hive.index.compact.query.max.size	10737418240
hive.input.format	org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
hive.insert.into.external.tables	true
hive.insert.into.multilevel.dirs	false
hive.jobname.length	50
hive.join.cache.size	25000
hive.join.emit.interval	1000
hive.lazysimple.extended_boolean_literal	false
hive.limit.optimize.enable	false
hive.limit.optimize.fetch.max	50000
hive.limit.optimize.limit.file	10
hive.limit.pushdown.memory.usage	-1.0
hive.limit.query.max.table.partition	-1
hive.limit.row.max.size	100000
hive.localize.resource.num.wait.attempts	5
hive.localize.resource.wait.interval	5000
hive.lock.manager	org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager
hive.mapred.partitioner	org.apache.hadoop.hive.ql.io.DefaultHivePartitioner
hive.mapred.reduce.tasks.speculative.execution	true
hive.mapred.supports.subdirectories	false
hive.metastore.uris	thrift://hh:9083
hive.metastore.warehouse.dir	/user/hive/warehouse
hive.multi.insert.move.tasks.share.dependencies	false
hive.multigroupby.singlereducer	true
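hive.zookeeper.clean.extra.nodes	false	Whether to clean up extra nodes at the end of the session.
hive.zookeeper.client.port	2181	ZooKeeper client port.
hive.zookeeper.quorum		IP addresses of the ZooKeeper servers.
hive.zookeeper.session.timeout	600000	ZooKeeper client session timeout.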
hive.zookeeper.namespace	hive_zookeeper_namespace
javax.jdo.PersistenceManagerFactoryClass	org.datanucleus.api.jdo.JDOPersistenceManagerFactory
javax.jdo.option.ConnectionDriverName	Changed to: com.mysql.jdbc.Driver
javax.jdo.option.ConnectionPassword	Changed to: hive
javax.jdo.option.ConnectionURL	xxx
javax.jdo.option.ConnectionUserName	xxx
javax.jdo.option.DetachAllOnCommit	true
javax.jdo.option.Multithreaded	true
javax.jdo.option.NonTransactionalRead	true
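
Two of the sizing rules described in the table can be checked with a little arithmetic. The Python sketch below is an illustration, not Hive's actual planner code. It assumes the reducer count is the input size divided by hive.exec.reducers.bytes.per.reducer, rounded up, capped at hive.exec.reducers.max and never below 1 (the rounding and the lower bound are assumptions). It also assumes a join qualifies for direct mapjoin conversion when the sizes of all tables except the largest sum to less than hive.auto.convert.join.noconditionaltask.size. The function names are hypothetical.

```python
# Defaults copied from the table in this article.
BYTES_PER_REDUCER = 1_000_000_000   # hive.exec.reducers.bytes.per.reducer
MAX_REDUCERS = 999                  # hive.exec.reducers.max
MAPJOIN_SIZE = 10_000_000           # hive.auto.convert.join.noconditionaltask.size


def estimate_reducers(input_bytes,
                      bytes_per_reducer=BYTES_PER_REDUCER,
                      max_reducers=MAX_REDUCERS):
    """Estimate the reducer count for a given input size in bytes."""
    if input_bytes < 0:
        raise ValueError("input_bytes must not be negative")
    # Integer ceiling division avoids floating-point rounding issues.
    wanted = (input_bytes + bytes_per_reducer - 1) // bytes_per_reducer
    # Assumption: at least one reducer, at most max_reducers.
    return max(1, min(max_reducers, wanted))


def can_convert_to_mapjoin(table_sizes, threshold=MAPJOIN_SIZE):
    """Check whether n-1 of the n join inputs fit under the threshold.

    The largest input is treated as the big table; the remaining
    n-1 inputs must sum to less than the threshold.
    """
    if len(table_sizes) < 2:
        raise ValueError("a join needs at least two inputs")
    small_total = sum(table_sizes) - max(table_sizes)
    return small_total < threshold


if __name__ == "__main__":
    # 10G of input with the default 1G per reducer.
    print(estimate_reducers(10 * 1_000_000_000))
    # Two small tables (5MB and 4MB) joined with one very large table.
    print(can_convert_to_mapjoin([5_000_000, 4_000_000, 10**12]))
```

With the defaults, 10G of input gives 10 reducers, which matches the example in the hive.exec.reducers.bytes.per.reducer description, and any input beyond 999G is capped at 999 reducers.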