[Original] Big Data Basics: Impala (3) Selected Tuning Tips
1) Separate the coordinator and executor roles
By default, each host in the cluster that runs the impalad daemon can act as the coordinator for an Impala query, execute the fragments of the execution plan for the query, or both. During highly concurrent workloads for large-scale queries, especially on large clusters, the dual roles can cause scalability issues:
- The extra work required for a host to act as the coordinator could interfere with its capacity to perform other work for the earlier phases of the query. For example, the coordinator can experience significant network and CPU overhead during queries containing a large number of query fragments. Each coordinator caches metadata for all table partitions and data files, which can be substantial and contend with memory needed to process joins, aggregations, and other operations performed by query executors.
- Having a large number of hosts act as coordinators can cause unnecessary network overhead, or even timeout errors, as each of those hosts communicates with the statestored daemon for metadata updates.
- The “soft limits” imposed by the admission control feature are more likely to be exceeded when there are a large number of heavily loaded hosts acting as coordinators.
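The split described above might be planned as in the following sketch. It assumes the impalad startup flags `is_coordinator` and `is_executor` (available from Impala 2.9 onward); the host names and the `plan_roles` helper are hypothetical and serve only as illustration.

```python
from typing import Dict, List


def plan_roles(hosts: List[str], num_coordinators: int) -> Dict[str, str]:
    """Map each host to the impalad role flags it would be started with.

    The first `num_coordinators` hosts become dedicated coordinators;
    every remaining host becomes a dedicated executor.
    """
    if not 0 < num_coordinators < len(hosts):
        raise ValueError("need at least one coordinator and one executor")
    flags = {}
    for index, host in enumerate(hosts):
        if index < num_coordinators:
            # Dedicated coordinator: plans queries, runs no fragments.
            flags[host] = "-is_coordinator=true -is_executor=false"
        else:
            # Dedicated executor: runs fragments, accepts no client queries.
            flags[host] = "-is_coordinator=false -is_executor=true"
    return flags


if __name__ == "__main__":
    # Hypothetical five-node cluster with two dedicated coordinators.
    hosts = [f"impala-{n:02d}.example.com" for n in range(1, 6)]
    for host, role_flags in plan_roles(hosts, 2).items():
        print(host, role_flags)
```

How many coordinators to dedicate depends on query concurrency and cluster size; the value 2 here is only a placeholder.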
2) default_pool_max_requests: the default is 200; adjust it according to your cluster's memory size and the memory each query needs.
Maximum number of concurrent outstanding requests allowed to run before incoming requests are queued. Because this limit applies cluster-wide, but each Impala node makes independent decisions to run queries immediately or queue them, it is a soft limit; the overall number of concurrent queries might be slightly higher during times of heavy load. A negative value indicates no limit. Ignored if fair_scheduler_config_path and llama_site_path are set.
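As a rough starting point, the value can be estimated from memory figures. The sketch below is a heuristic under stated assumptions, not an official formula: it assumes every query consumes memory on every executor, so the per-host memory limit is the bottleneck. The function name, its parameters, and the 0.8 headroom factor are all hypothetical.

```python
def estimate_max_requests(
    mem_limit_per_host_gb: float,
    per_query_mem_per_host_gb: float,
    headroom: float = 0.8,
) -> int:
    """Estimate a candidate value for default_pool_max_requests.

    Assumes each running query uses `per_query_mem_per_host_gb` on every
    executor, and that only `headroom` of the impalad memory limit should
    be committed to queries at once.
    """
    if per_query_mem_per_host_gb <= 0:
        raise ValueError("per-query memory must be positive")
    usable_gb = mem_limit_per_host_gb * headroom
    # Never return less than 1, otherwise no query could ever run.
    return max(1, int(usable_gb // per_query_mem_per_host_gb))


if __name__ == "__main__":
    # Hypothetical numbers: 128 GB impalad limit, about 4 GB per query per host.
    print(estimate_max_requests(128, 4))
```

The estimate is only as good as the per-query memory figure, which is best taken from the query profiles of a representative workload.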
3) Once Kerberos is enabled, access over JDBC requires client-side load balancing, because the JDBC URL must carry the principal of the specific server being connected to.
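A minimal sketch of client-side load balancing: build one URL per impalad host, each carrying that host's own principal, and rotate through them. The URL shape follows the Hive JDBC driver convention (`jdbc:hive2://host:port/;principal=service/host@REALM`), which is an assumption here; other drivers use different connection properties. The host names, the realm, and the helper names are hypothetical.

```python
import itertools
from typing import Iterator, List


def kerberos_jdbc_urls(
    hosts: List[str], realm: str, port: int = 21050, service: str = "impala"
) -> List[str]:
    """Build one JDBC URL per host, each with that host's own principal."""
    return [
        f"jdbc:hive2://{host}:{port}/;principal={service}/{host}@{realm}"
        for host in hosts
    ]


def round_robin(urls: List[str]) -> Iterator[str]:
    """Yield the URLs in an endless rotation, one per new connection."""
    return itertools.cycle(urls)


if __name__ == "__main__":
    # Hypothetical coordinator hosts and Kerberos realm.
    urls = kerberos_jdbc_urls(
        ["impala-01.example.com", "impala-02.example.com"], "EXAMPLE.COM"
    )
    picker = round_robin(urls)
    for _ in range(3):
        print(next(picker))
```

Round-robin is the simplest policy; a real client would probably also skip hosts that fail to connect.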