Tencent Cloud TDSQL PostgreSQL Edition - Best Practices | Optimizing SQL Statements

Check whether the query uses the distribution key

postgres=# explain select * from tbase_1 where f1=1;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
   Node/s: dn001, dn002
   ->  Gather  (cost=1000.00..7827.20 rows=1 width=14)
         Workers Planned: 2
         ->  Parallel Seq Scan on tbase_1  (cost=0.00..6827.10 rows=1 width=14)
               Filter: (f1 = 1)
(6 rows)

postgres=# explain select * from tbase_1 where f2=1;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
   Node/s: dn001
   ->  Gather  (cost=1000.00..7827.20 rows=1 width=14)
         Workers Planned: 2
         ->  Parallel Seq Scan on tbase_1  (cost=0.00..6827.10 rows=1 width=14)
               Filter: (f2 = 1)
(6 rows)

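As shown above, the first query does not filter on the distribution key, so it must be sent to every node (dn001 and dn002). The slowest node then determines the overall response time, so all nodes need to deliver consistent performance. The second query filters on the distribution key and is routed to a single node (dn001). When designing queries, include the distribution key wherever possible.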
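The check above can also be scripted. The following is a minimal sketch, assuming the EXPLAIN output has already been captured as text; the function name target_nodes is hypothetical and not part of TDSQL. It only reads the "Node/s:" line that appears in the plans shown above.

```python
import re


def target_nodes(plan_text):
    """Return the datanode names on the plan's "Node/s:" line ([] if absent)."""
    match = re.search(r"Node/s:\s*(.+)", plan_text)
    if match is None:
        return []
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


# Plan fragments copied from the two EXPLAIN outputs above.
NON_KEY_PLAN = """\
Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
  Node/s: dn001, dn002
  ->  Gather  (cost=1000.00..7827.20 rows=1 width=14)
"""
KEY_PLAN = """\
Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
  Node/s: dn001
  ->  Gather  (cost=1000.00..7827.20 rows=1 width=14)
"""

for label, plan in (("where f1=1", NON_KEY_PLAN), ("where f2=1", KEY_PLAN)):
    nodes = target_nodes(plan)
    scope = "single node" if len(nodes) == 1 else "sent to several nodes"
    print(label, nodes, scope)
```

A query that reports more than one node here is a candidate for adding the distribution key to its WHERE clause.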

Check whether an index is used

postgres=# create index tbase_2_f2_idx on tbase_2(f2);
CREATE INDEX

postgres=# explain select * from tbase_2 where f2=1;
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
   Node/s: dn001, dn002
   ->  Index Scan using tbase_2_f2_idx on tbase_2  (cost=0.42..4.44 rows=1 width=14)
         Index Cond: (f2 = 1)
(4 rows)

postgres=# explain select * from tbase_2 where f3='1';
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
   Node/s: dn001, dn002
   ->  Gather  (cost=1000.00..7827.20 rows=1 width=14)
         Workers Planned: 2
         ->  Parallel Seq Scan on tbase_2  (cost=0.00..6827.10 rows=1 width=14)
               Filter: (f3 = '1'::text)
(6 rows)

postgres=#

The first query uses the index; the second does not. An index usually speeds up queries, but it also adds overhead to updates.
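To spot unindexed access paths across many captured plans, the scan nodes can be extracted from the plan text. This is a rough sketch under the assumption that plans follow the text format shown above; the helper names are hypothetical, and bitmap scans are deliberately not covered.

```python
import re

# "Seq Scan on <table>", optionally prefixed with "Parallel ".
SEQ_SCAN_RE = re.compile(r"(?:Parallel )?Seq Scan on (\w+)")
# "Index Scan using <index> on <table>" (also matches "Index Only Scan").
INDEX_SCAN_RE = re.compile(r"Index (?:Only )?Scan using (\w+) on (\w+)")


def sequential_scans(plan_text):
    """Return the tables that the plan reads with a sequential scan."""
    return SEQ_SCAN_RE.findall(plan_text)


def index_scans(plan_text):
    """Return (index, table) pairs for the index scans in the plan."""
    return INDEX_SCAN_RE.findall(plan_text)


# Plan fragments copied from the two EXPLAIN outputs above.
INDEXED_PLAN = """\
Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
  Node/s: dn001, dn002
  ->  Index Scan using tbase_2_f2_idx on tbase_2  (cost=0.42..4.44 rows=1 width=14)
        Index Cond: (f2 = 1)
"""
UNINDEXED_PLAN = """\
Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
  Node/s: dn001, dn002
  ->  Gather  (cost=1000.00..7827.20 rows=1 width=14)
        Workers Planned: 2
        ->  Parallel Seq Scan on tbase_2  (cost=0.00..6827.10 rows=1 width=14)
              Filter: (f3 = '1'::text)
"""

print("indexed plan:", index_scans(INDEXED_PLAN), sequential_scans(INDEXED_PLAN))
print("unindexed plan:", index_scans(UNINDEXED_PLAN), sequential_scans(UNINDEXED_PLAN))
```

A sequential scan is not always a problem; whether an index is worth its update cost still depends on the workload.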

Check whether the join is on the distribution key

postgres=# explain select tbase_1.* from tbase_1,tbase_2 where tbase_1.f1=tbase_2.f1 ;
                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Remote Subquery Scan on all (dn001,dn002)  (cost=29.80..186.32 rows=3872 width=40)
   ->  Hash Join  (cost=29.80..186.32 rows=3872 width=40)
         Hash Cond: (tbase_1.f1 = tbase_2.f1)
         ->  Remote Subquery Scan on all (dn001,dn002)  (cost=100.00..158.40 rows=880 width=40)
               Distribute results by S: f1
               ->  Seq Scan on tbase_1  (cost=0.00..18.80 rows=880 width=40)
         ->  Hash  (cost=18.80..18.80 rows=880 width=4)
               ->  Seq Scan on tbase_2  (cost=0.00..18.80 rows=880 width=4)
(8 rows)

postgres=# explain select tbase_1.* from tbase_1,tbase_2 where tbase_1.f2=tbase_2.f1 ;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
   Node/s: dn001, dn002
   ->  Hash Join  (cost=18904.69..46257.08 rows=500564 width=14)
         Hash Cond: (tbase_1.f2 = tbase_2.f1)
         ->  Seq Scan on tbase_1  (cost=0.00..9225.64 rows=500564 width=14)
         ->  Hash  (cost=9225.64..9225.64 rows=500564 width=4)
               ->  Seq Scan on tbase_2  (cost=0.00..9225.64 rows=500564 width=4)
(7 rows)

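The first query requires data redistribution (the "Distribute results by S: f1" step), while the second does not. Joins on the distribution key perform better.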
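Redistribution shows up in the plan as a "Distribute results by" line, so it can be detected mechanically. The sketch below assumes that marker text is exactly as printed in the plan above; redistribution_keys is a hypothetical helper name.

```python
import re


def redistribution_keys(plan_text):
    """Return the columns named on "Distribute results by <type>: <column>" lines."""
    return re.findall(r"Distribute results by \w+:\s*(\w+)", plan_text)


# Plan fragments copied from the two EXPLAIN outputs above.
REDISTRIBUTED_PLAN = """\
Remote Subquery Scan on all (dn001,dn002)  (cost=29.80..186.32 rows=3872 width=40)
  ->  Hash Join  (cost=29.80..186.32 rows=3872 width=40)
        Hash Cond: (tbase_1.f1 = tbase_2.f1)
        ->  Remote Subquery Scan on all (dn001,dn002)  (cost=100.00..158.40 rows=880 width=40)
              Distribute results by S: f1
              ->  Seq Scan on tbase_1  (cost=0.00..18.80 rows=880 width=40)
"""
COLOCATED_PLAN = """\
Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
  Node/s: dn001, dn002
  ->  Hash Join  (cost=18904.69..46257.08 rows=500564 width=14)
        Hash Cond: (tbase_1.f2 = tbase_2.f1)
"""

for label, plan in (("f1 = f1", REDISTRIBUTED_PLAN), ("f2 = f1", COLOCATED_PLAN)):
    keys = redistribution_keys(plan)
    print(label, "redistributes on", keys if keys else "nothing")
```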

Check which node the join runs on

postgres=# explain select tbase_1.* from tbase_1,tbase_2 where tbase_1.f1=tbase_2.f1 ;
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------
 Hash Join  (cost=29.80..186.32 rows=3872 width=40)
   Hash Cond: (tbase_1.f1 = tbase_2.f1)
   ->  Remote Subquery Scan on all (dn001,dn002)  (cost=100.00..158.40 rows=880 width=40)
         ->  Seq Scan on tbase_1  (cost=0.00..18.80 rows=880 width=40)
   ->  Hash  (cost=126.72..126.72 rows=880 width=4)
         ->  Remote Subquery Scan on all (dn001,dn002)  (cost=100.00..126.72 rows=880 width=4)
               ->  Seq Scan on tbase_2  (cost=0.00..18.80 rows=880 width=4)
(7 rows)

postgres=# set prefer_olap to on;
SET

postgres=# explain select tbase_1.* from tbase_1,tbase_2 where tbase_1.f1=tbase_2.f1 ;
                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Remote Subquery Scan on all (dn001,dn002)  (cost=29.80..186.32 rows=3872 width=40)
   ->  Hash Join  (cost=29.80..186.32 rows=3872 width=40)
         Hash Cond: (tbase_1.f1 = tbase_2.f1)
         ->  Remote Subquery Scan on all (dn001,dn002)  (cost=100.00..158.40 rows=880 width=40)
               Distribute results by S: f1
               ->  Seq Scan on tbase_1  (cost=0.00..18.80 rows=880 width=40)
         ->  Hash  (cost=18.80..18.80 rows=880 width=4)
               ->  Seq Scan on tbase_2  (cost=0.00..18.80 rows=880 width=4)
(8 rows)

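The first join runs on the CN; the second is redistributed across the DNs and then joined there. As a design guideline, OLTP workloads that join small amounts of data generally perform better when the join runs on the CN.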
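In the two plans above, the difference is whether the join node appears before or after the first remote node. The sketch below encodes only that observation as a heuristic; it is an assumption drawn from these two sample plans rather than a documented rule, and join_location is a hypothetical name.

```python
def join_location(plan_text):
    """Heuristic: "cn" if a join node precedes any remote node, "dn" if a
    remote node comes first, "none" if the plan contains neither."""
    for line in plan_text.splitlines():
        # Drop indentation and the "->" arrow, then keep the node name only.
        name = line.strip().lstrip("-> ").split("(", 1)[0].rstrip()
        if name.startswith("Remote"):
            return "dn"
        if name.endswith("Join") or name.startswith("Nested Loop"):
            return "cn"
    return "none"


# Plan fragments copied from the two EXPLAIN outputs above.
CN_JOIN_PLAN = """\
Hash Join  (cost=29.80..186.32 rows=3872 width=40)
  Hash Cond: (tbase_1.f1 = tbase_2.f1)
  ->  Remote Subquery Scan on all (dn001,dn002)  (cost=100.00..158.40 rows=880 width=40)
        ->  Seq Scan on tbase_1  (cost=0.00..18.80 rows=880 width=40)
"""
DN_JOIN_PLAN = """\
Remote Subquery Scan on all (dn001,dn002)  (cost=29.80..186.32 rows=3872 width=40)
  ->  Hash Join  (cost=29.80..186.32 rows=3872 width=40)
        Hash Cond: (tbase_1.f1 = tbase_2.f1)
"""

print("default plan joins on:", join_location(CN_JOIN_PLAN))
print("prefer_olap plan joins on:", join_location(DN_JOIN_PLAN))
```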

Check the number of parallel workers

postgres=# explain select count(1) from tbase_1;
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=118.81..118.83 rows=1 width=8)
   ->  Remote Subquery Scan on all (dn001,dn002)  (cost=118.80..118.81 rows=1 width=0)
         ->  Partial Aggregate  (cost=18.80..18.81 rows=1 width=8)
               ->  Seq Scan on tbase_1  (cost=0.00..18.80 rows=880 width=0)
(4 rows)

postgres=# analyze tbase_1;
ANALYZE

postgres=# explain select count(1) from tbase_1;
                                             QUERY PLAN
----------------------------------------------------------------------------------------------------
 Parallel Finalize Aggregate  (cost=14728.45..14728.46 rows=1 width=8)
   ->  Parallel Remote Subquery Scan on all (dn001,dn002)  (cost=14728.33..14728.45 rows=1 width=0)
         ->  Gather  (cost=14628.33..14628.44 rows=1 width=8)
               Workers Planned: 2
               ->  Partial Aggregate  (cost=13628.33..13628.34 rows=1 width=8)
                     ->  Parallel Seq Scan on tbase_1  (cost=0.00..12586.67 rows=416667 width=0)
(6 rows)

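The first query above does not run in parallel. Only after analyze refreshes the table statistics does the second query get the correct, parallel plan. Run analyze after large-volume data updates.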
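The planned worker count can be read from the "Workers Planned:" line. A minimal sketch, assuming the plan text format shown above; planned_workers is a hypothetical helper name.

```python
import re


def planned_workers(plan_text):
    """Return the worker counts from every "Workers Planned: N" line in the plan."""
    return [int(n) for n in re.findall(r"Workers Planned:\s*(\d+)", plan_text)]


# Plan fragments copied from the EXPLAIN outputs before and after analyze.
BEFORE_ANALYZE = """\
Finalize Aggregate  (cost=118.81..118.83 rows=1 width=8)
  ->  Remote Subquery Scan on all (dn001,dn002)  (cost=118.80..118.81 rows=1 width=0)
        ->  Partial Aggregate  (cost=18.80..18.81 rows=1 width=8)
              ->  Seq Scan on tbase_1  (cost=0.00..18.80 rows=880 width=0)
"""
AFTER_ANALYZE = """\
Parallel Finalize Aggregate  (cost=14728.45..14728.46 rows=1 width=8)
  ->  Parallel Remote Subquery Scan on all (dn001,dn002)  (cost=14728.33..14728.45 rows=1 width=0)
        ->  Gather  (cost=14628.33..14628.44 rows=1 width=8)
              Workers Planned: 2
"""

for label, plan in (("before analyze", BEFORE_ANALYZE), ("after analyze", AFTER_ANALYZE)):
    workers = planned_workers(plan)
    print(label, "parallel" if workers else "not parallel", workers)
```

An empty list for a scan over a large table is a hint that the statistics may be stale, as in the first plan where the row estimate is still 880.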

Check whether the execution plans are consistent across nodes

./tbase_run_sql_dn_master.sh "explain select * from tbase_2 where f2=1"

dn006 --- psql -h 172.16.0.13 -p 11227 -d postgres -U tbase -c "explain select * from tbase_2 where f2=1"
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Bitmap Heap Scan on tbase_2  (cost=2.18..7.70 rows=4 width=40)
   Recheck Cond: (f2 = 1)
   ->  Bitmap Index Scan on tbase_2_f2_idx  (cost=0.00..2.18 rows=4 width=0)
         Index Cond: (f2 = 1)
(4 rows)

dn002 --- psql -h 172.16.0.42 -p 11012 -d postgres -U tbase -c "explain select * from tbase_2 where f2=1"
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Index Scan using tbase_2_f2_idx on tbase_2  (cost=0.42..4.44 rows=1 width=14)
   Index Cond: (f2 = 1)
(2 rows)

The two DNs produce different execution plans. The most likely causes are data skew or a plan type having been disabled.
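Comparing plans by eye gets tedious with many DNs. One possible approach, sketched here under the assumption that each node's plan has been captured as text, is to strip the cost annotations and compare what remains. The helper names plan_shape and plans_consistent are hypothetical.

```python
import re

COST_RE = re.compile(r"\s*\(cost=[^)]*\)")      # "(cost=... rows=... width=...)"
ROW_FOOTER_RE = re.compile(r"^\(\d+ rows?\)$")  # psql footer such as "(4 rows)"


def plan_shape(plan_text):
    """Return the plan lines without cost estimates, blank lines, or the footer."""
    shape = []
    for line in plan_text.splitlines():
        stripped = line.strip()
        if not stripped or ROW_FOOTER_RE.match(stripped):
            continue
        shape.append(COST_RE.sub("", line.rstrip()))
    return tuple(shape)


def plans_consistent(plans_by_node):
    """True when every node's plan has the same shape once costs are ignored."""
    return len({plan_shape(text) for text in plans_by_node.values()}) <= 1


# Plans copied from the per-DN output above.
DN006_PLAN = """\
Bitmap Heap Scan on tbase_2  (cost=2.18..7.70 rows=4 width=40)
  Recheck Cond: (f2 = 1)
  ->  Bitmap Index Scan on tbase_2_f2_idx  (cost=0.00..2.18 rows=4 width=0)
        Index Cond: (f2 = 1)
(4 rows)
"""
DN002_PLAN = """\
Index Scan using tbase_2_f2_idx on tbase_2  (cost=0.42..4.44 rows=1 width=14)
  Index Cond: (f2 = 1)
(2 rows)
"""

print("consistent:", plans_consistent({"dn006": DN006_PLAN, "dn002": DN002_PLAN}))
```

Because costs are removed before comparison, two nodes that pick the same plan with different estimates still count as consistent.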

Where possible, the DBA can schedule a database-wide analyze and vacuum to run while the system is idle.
