【原创】大数据基础之Impala(3)部分调优
1)将coordinator和executor角色分离
By default, each host in the cluster that runs the impalad daemon can act as the coordinator for an Impala query, execute the fragments of the execution plan for the query, or both. During highly concurrent workloads for large-scale queries, especially on large clusters, the dual roles can cause scalability issues:
- The extra work required for a host to act as the coordinator could interfere with its capacity to perform other work for the earlier phases of the query. For example, the coordinator can experience significant network and CPU overhead during queries containing a large number of query fragments. Each coordinator caches metadata for all table partitions and data files, which can be substantial and contend with memory needed to process joins, aggregations, and other operations performed by query executors.
- Having a large number of hosts act as coordinators can cause unnecessary network overhead, or even timeout errors, as each of those hosts communicates with the statestored daemon for metadata updates.
- The “soft limits” imposed by the admission control feature are more likely to be exceeded when there are a large number of heavily loaded hosts acting as coordinators.
2)default_pool_max_requests,默认是200,要根据自己集群的内存规模以及每个查询需要的内存进行调整;
Maximum number of concurrent outstanding requests allowed to run before incoming requests are queued. Because this limit applies cluster-wide, but each Impala node makes independent decisions to run queries immediately or queue them, it is a soft limit; the overall number of concurrent queries might be slightly higher during times of heavy load. A negative value indicates no limit. Ignored if fair_scheduler_config_path and llama_site_path are set.
3)开启kerberos之后,通过jdbc访问需要做客户端load balance,因为jdbc url里需要携带对应server的principal;
【原创】大数据基础之Impala(3)部分调优的更多相关文章
- 【原创】大数据基础之Impala(1)简介、安装、使用
impala2.12 官方:http://impala.apache.org/ 一 简介 Apache Impala is the open source, native analytic datab ...
- 大数据技术 - MapReduce的Shuffle及调优
本章内容我们学习一下 MapReduce 中的 Shuffle 过程,Shuffle 发生在 map 输出到 reduce 输入的过程,它的中文解释是 “洗牌”,顾名思义该过程涉及数据的重新分配,主要 ...
- 【原创】大数据基础之Impala(2)实现细节
一 架构 Impala is a massively-parallel query execution engine, which runs on hundreds of machines in ex ...
- 【原创】大数据基础之Zookeeper(2)源代码解析
核心枚举 public enum ServerState { LOOKING, FOLLOWING, LEADING, OBSERVING; } zookeeper服务器状态:刚启动LOOKING,f ...
- 【原创】大数据基础之Benchmark(2)TPC-DS
tpc 官方:http://www.tpc.org/ 一 简介 The TPC is a non-profit corporation founded to define transaction pr ...
- 【原创】大数据基础之词频统计Word Count
对文件进行词频统计,是一个大数据领域的hello word级别的应用,来看下实现有多简单: 1 Linux单机处理 egrep -o "\b[[:alpha:]]+\b" test ...
- 【原创】大数据基础之ElasticSearch(4)es数据导入过程
1 准备analyzer 内置analyzer 参考:https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis- ...
- 【原创】大数据基础之Hive(5)性能调优Performance Tuning
1 compress & mr hive默认的execution engine是mr hive> set hive.execution.engine;hive.execution.eng ...
- 【原创】大数据基础之ElasticSearch(5)重要配置及调优
Index Settings 重要索引配置 Index level settings can be set per-index. Settings may be: 1 static 静态索引配置 Th ...
随机推荐
- 基于配置文件的方式配置AOP
之前说的都是通过注释的方式配置,接下来说说如何使用配置文件配置AOP 还是原来的代码,去掉所有注释,接下来配置最基本的几个bean. 然后使用<aop:config>标签进行配置,然后配切 ...
- MyBatis 3源码解析(一)
一.SqlSessionFactory 对象初始化 //加载全局配置文件 String resource = "mybatis-config.xml"; InputStream i ...
- Nginx CONTENT阶段 concat模块
L67 concat_delimiter : 根据js 指定 分隔符 比如 “|” 那么每个文件分隔符为 “|” concat_types : 指定要合并文件的类型 concat_unique : s ...
- MVC EF 移除建表时自动加上s的复数形式
移除建表时自动加上s的复数形式 public class DBContext : DbContext { public DBContext() : base("name=DBContext& ...
- 「NOI2013」小 Q 的修炼 解题报告
「NOI2013」小 Q 的修炼 第一次完整的做出一个提答,花了半个晚上+一个上午+半个下午 总体来说太慢了 对于此题,我认为的难点是观察数据并猜测性质和读入操作 我隔一会就思考这个sb字符串读起来怎 ...
- yii2 gridview默认排序
Yii2 GridView 使用起来很方便,但是默认排序很是个问题,数据默认按 主键 正序排列 但是在使用过程中,大多数数据默认是 倒序才符合正常思维的. 第一次 的解决方法是在 直接为 Model添 ...
- 百度地图--JS版
百度地图JS版本 ----选择关键字地图展示对应地址---- CSS body, html { width: %; height: %; margin: ; font-family: "微软 ...
- 树莓派wiringPi,BCM,BOARD编码对应管脚
wiringPi,BCM,BOARD编码 由于上课需要, 嵌入式学习从树莓派开始 树莓派中执行: $> gpio readall 即可得到关于树莓派管脚的各种信息 上面的图可能不是特别清楚, 可 ...
- Spring MVC通过AOP切面编程 来拦截controller 实现日志的写入
首选需要参考的是:[参考]http://www.cnblogs.com/guokai870510826/p/5977948.html http://www.cnblogs.com/guokai8 ...
- [ZJOI2019]线段树(线段树)
看到这题,首先想到将求和转期望,即每次操作进行概率为1/2,求节点打标记概率. 首先对于每次区间修改操作,对节点进行分类: 1.这个点和其父亲都和修改区间无交,这种情况可以无视. 2.这个点和修改区间 ...