[Original] Big Data Fundamentals: Impala (3) Selected Tuning Tips
1) Separate the coordinator and executor roles
By default, each host in the cluster that runs the impalad daemon can act as the coordinator for an Impala query, execute the fragments of the execution plan for the query, or both. During highly concurrent workloads for large-scale queries, especially on large clusters, the dual roles can cause scalability issues:
- The extra work required for a host to act as the coordinator could interfere with its capacity to perform other work for the earlier phases of the query. For example, the coordinator can experience significant network and CPU overhead during queries containing a large number of query fragments. Each coordinator caches metadata for all table partitions and data files, which can be substantial and contend with memory needed to process joins, aggregations, and other operations performed by query executors.
- Having a large number of hosts act as coordinators can cause unnecessary network overhead, or even timeout errors, as each of those hosts communicates with the statestored daemon for metadata updates.
- The “soft limits” imposed by the admission control feature are more likely to be exceeded when there are a large number of heavily loaded hosts acting as coordinators.
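The role split described above is controlled by two impalad startup flags, is_coordinator and is_executor. The following is a minimal sketch, assuming you generate the flags per host yourself; the helper name impalad_role_flags and the role labels are hypothetical, and how the flags reach impalad (command line, flag file, or a management console) depends on your deployment.

```python
# Sketch: derive the impalad role flags for a host.
# `impalad_role_flags` and the role labels are illustrative names, not part of Impala.

ROLE_FLAGS = {
    # Dedicated coordinator: plans and coordinates queries, runs no fragments.
    "coordinator": {"is_coordinator": False or True, "is_executor": False},
    # Dedicated executor: runs query fragments only.
    "executor": {"is_coordinator": False, "is_executor": True},
    # Default behaviour: the host does both.
    "both": {"is_coordinator": True, "is_executor": True},
}


def impalad_role_flags(role):
    """Return the role-related impalad startup flags as a list of strings."""
    if role not in ROLE_FLAGS:
        raise ValueError(f"unknown role: {role!r}")
    return [
        f"--{name}={str(value).lower()}"
        for name, value in ROLE_FLAGS[role].items()
    ]


if __name__ == "__main__":
    for role in ("coordinator", "executor"):
        print(role, " ".join(impalad_role_flags(role)))
```

A small number of dedicated coordinators with the remaining hosts as executors is the usual shape of such a layout; the right ratio depends on query concurrency and cluster size.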
2) default_pool_max_requests defaults to 200; adjust it according to your cluster's memory size and the memory each query needs.
Maximum number of concurrent outstanding requests allowed to run before incoming requests are queued. Because this limit applies cluster-wide, but each Impala node makes independent decisions to run queries immediately or queue them, it is a soft limit; the overall number of concurrent queries might be slightly higher during times of heavy load. A negative value indicates no limit. Ignored if fair_scheduler_config_path and llama_site_path are set.
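One rough way to pick a starting value is to divide the memory available to Impala on each host by the memory a typical query consumes on each host. The sketch below is only a rule of thumb under stated assumptions: every query is assumed to run on every executor, the function name estimate_max_requests and the headroom factor are hypothetical, and the result should be checked against real workload measurements.

```python
# Sketch: back-of-the-envelope estimate for default_pool_max_requests.
# Assumptions (not from Impala itself): every query uses roughly the same
# amount of memory on every host, and `headroom` reserves part of the
# per-host memory limit for spikes.

def estimate_max_requests(per_host_mem_limit_gb, per_query_mem_per_host_gb, headroom=0.8):
    """Return how many queries fit in memory concurrently, at least 1."""
    if per_host_mem_limit_gb <= 0 or per_query_mem_per_host_gb <= 0:
        raise ValueError("memory sizes must be positive")
    if not 0 < headroom <= 1:
        raise ValueError("headroom must be in (0, 1]")
    usable_gb = per_host_mem_limit_gb * headroom
    return max(1, int(usable_gb // per_query_mem_per_host_gb))


if __name__ == "__main__":
    # 128 GB per host for Impala, about 4 GB per query per host.
    print(estimate_max_requests(128, 4))
```

Because the limit is soft, as the quoted description notes, the actual concurrency can briefly exceed the configured value, which is one reason to leave headroom.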
3) After Kerberos is enabled, access over JDBC requires client-side load balancing, because the JDBC URL must carry the principal of the specific server being connected to.
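A client-side balancer can be as simple as building one URL per impalad host, each with that host's own principal, and rotating through them. The sketch below assumes the Hive JDBC driver's URL form (jdbc:hive2://host:port/;principal=service/host@REALM) and Impala's default HiveServer2 port 21050; the host names, the realm, and the function name kerberos_jdbc_urls are hypothetical, and other JDBC drivers use different URL properties.

```python
# Sketch: client-side round-robin over Kerberos-enabled impalad hosts.
# Host names, realm, and helper names are illustrative assumptions.
from itertools import cycle


def kerberos_jdbc_urls(hosts, realm, port=21050, service="impala"):
    """Build one JDBC URL per host, each carrying that host's principal."""
    if not hosts:
        raise ValueError("at least one host is required")
    return [
        f"jdbc:hive2://{host}:{port}/;principal={service}/{host}@{realm}"
        for host in hosts
    ]


if __name__ == "__main__":
    urls = kerberos_jdbc_urls(
        ["impalad1.example.com", "impalad2.example.com"], "EXAMPLE.COM"
    )
    picker = cycle(urls)  # round-robin iterator over the URLs
    for _ in range(3):
        print(next(picker))
```

Round-robin is the simplest policy; a real client would likely also skip hosts that fail to connect.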