1)将coordinator和executor角色分离

By default, each host in the cluster that runs the impalad daemon can act as the coordinator for an Impala query, execute the fragments of the execution plan for the query, or both. During highly concurrent workloads for large-scale queries, especially on large clusters, the dual roles can cause scalability issues:

  • The extra work required for a host to act as the coordinator could interfere with its capacity to perform other work for the earlier phases of the query. For example, the coordinator can experience significant network and CPU overhead during queries containing a large number of query fragments. Each coordinator caches metadata for all table partitions and data files, which can be substantial and contend with memory needed to process joins, aggregations, and other operations performed by query executors.
  • Having a large number of hosts act as coordinators can cause unnecessary network overhead, or even timeout errors, as each of those hosts communicates with the statestored daemon for metadata updates.
  • The “soft limits” imposed by the admission control feature are more likely to be exceeded when there are a large number of heavily loaded hosts acting as coordinators.

2)default_pool_max_requests,默认是200,要根据自己集群的内存规模以及每个查询需要的内存进行调整;

Maximum number of concurrent outstanding requests allowed to run before incoming requests are queued. Because this limit applies cluster-wide, but each Impala node makes independent decisions to run queries immediately or queue them, it is a soft limit; the overall number of concurrent queries might be slightly higher during times of heavy load. A negative value indicates no limit. Ignored if fair_scheduler_config_path and llama_site_path are set.

3)开启kerberos之后,通过jdbc访问需要做客户端load balance,因为jdbc url里需要携带对应server的principal;

【原创】大数据基础之Impala(3)部分调优的更多相关文章

  1. 【原创】大数据基础之Impala(1)简介、安装、使用

    impala2.12 官方:http://impala.apache.org/ 一 简介 Apache Impala is the open source, native analytic datab ...

  2. 大数据技术 - MapReduce的Shuffle及调优

    本章内容我们学习一下 MapReduce 中的 Shuffle 过程,Shuffle 发生在 map 输出到 reduce 输入的过程,它的中文解释是 “洗牌”,顾名思义该过程涉及数据的重新分配,主要 ...

  3. 【原创】大数据基础之Impala(2)实现细节

    一 架构 Impala is a massively-parallel query execution engine, which runs on hundreds of machines in ex ...

  4. 【原创】大数据基础之Zookeeper(2)源代码解析

    核心枚举 public enum ServerState { LOOKING, FOLLOWING, LEADING, OBSERVING; } zookeeper服务器状态:刚启动LOOKING,f ...

  5. 【原创】大数据基础之Benchmark(2)TPC-DS

    tpc 官方:http://www.tpc.org/ 一 简介 The TPC is a non-profit corporation founded to define transaction pr ...

  6. 【原创】大数据基础之词频统计Word Count

    对文件进行词频统计,是一个大数据领域的hello word级别的应用,来看下实现有多简单: 1 Linux单机处理 egrep -o "\b[[:alpha:]]+\b" test ...

  7. 【原创】大数据基础之ElasticSearch(4)es数据导入过程

    1 准备analyzer 内置analyzer 参考:https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis- ...

  8. 【原创】大数据基础之Hive(5)性能调优Performance Tuning

    1 compress & mr hive默认的execution engine是mr hive> set hive.execution.engine;hive.execution.eng ...

  9. 【原创】大数据基础之ElasticSearch(5)重要配置及调优

    Index Settings 重要索引配置 Index level settings can be set per-index. Settings may be: 1 static 静态索引配置 Th ...

随机推荐

  1. Docker版本与安装介绍

    Docker版本与安装介绍 Docker-CE 和 Docker-EE Centos 上安装 Docker-CE Ubuntu 上安装 Docker-CE Docker-CE和Docker-EE Do ...

  2. webpack(一) 配置

    一.entry  & output mode: 'development', // entry: './src/index', // entry: ['./src/index', './src ...

  3. centos7之zabbix监控mysql(mariadb)数据库

    一.Zabbix3.2.6使用自带模板监控MySQL 添加zabbix_agent客户端方法:http://www.cnblogs.com/lei0213/p/8858269.html mysql服务 ...

  4. MyBatis 3源码解析(四)

    四.MyBatis 查询实现 Employee empById = mapper.getEmpById(1); 首先会调用MapperProxy的invoke方法 @Override public O ...

  5. CS程序自动更新实现原理及代码(支持多版本多文件更新)

    公司主要项目为CS端,经常遇到客户需求变更及bug处理,在没有引用自动更新之前每次更新程序,必须手动对每个客户端进行更新,这样导致技术支持工作量特别大,也给客户不好的印象,因此我需要一个自动更新程序! ...

  6. Codeforces Round #522 (Div. 2, based on Technocup 2019 Elimination Round 3)B. Personalized Cup

    题意:把一长串字符串 排成矩形形式  使得行最小  同时每行不能相差大于等于两个字符 每行也不能大于20个字符 思路: 因为使得行最小 直接行从小到大枚举即可   每行不能相差大于等于两个字符相当于  ...

  7. [PRIMITIVE TECHNOLOGY]澳洲小哥的黑皮豆/black been/摩顿湾板栗(栗子)/Moreton Bay Chestnut

    wiki:https://en.wikipedia.org/wiki/Castanospermum inner:http://blog.sciencenet.cn/blog-309517-770951 ...

  8. Linux 下 fcitx 崩溃

    killall fcitx killall sogou-qimpanel fcitx & 输入以上命令

  9. docker容器网络

    1.我们在使用docker run创建Docker容器时,可以用--net选项指定容器的网络模式,Docker有以下4种网络模式: · host模式,使用--net=host指定 · containe ...

  10. sshpass-Linux命令之非交互SSH密码验证

    sshpass-Linux命令之非交互SSH密码验证 参考网址:https://www.cnblogs.com/chenlaichao/p/7727554.html ssh登陆不能在命令行中指定密码. ...