Incremental Sort Support in PostgreSQL 13

One of the important features in PostgreSQL 13 is incremental sort, which can speed up ORDER BY queries such as the following:

select * from test order by a,b limit 10;

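Suppose there is an index on column a and the query sorts on a, b. The index can return rows already ordered by a. Because the input is already sorted on the leading key(s), only the additional key b still needs sorting, and that can be done in batches, one group of equal a values at a time, instead of sorting the whole result set.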
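The idea can be illustrated with a minimal Python sketch. It is a simplified model, not PostgreSQL's executor code: the names incremental_sort and scan and the sample rows are made up for illustration, and the real executor node is more elaborate. The input is assumed to arrive ordered by a, as it would from an index scan on a; each group of equal a values is sorted by b on its own, and a LIMIT lets the sort stop before the whole input has been read.

```python
from itertools import groupby, islice
from operator import itemgetter


def incremental_sort(rows, presorted_key, remaining_key):
    """Yield rows in (presorted_key, remaining_key) order.

    Assumes `rows` already arrives ordered by `presorted_key`.
    """
    for _, group in groupby(rows, key=presorted_key):
        # Only one group of equal prefix values is buffered and sorted at a time.
        yield from sorted(group, key=remaining_key)


# Hypothetical (a, b) rows, already ordered by a like an index scan on a.
rows = [(1, 9), (1, 3), (2, 5), (2, 1), (2, 4), (3, 7), (3, 2)]
consumed = []


def scan(source):
    """Pass rows through while recording how many the sort actually reads."""
    for row in source:
        consumed.append(row)
        yield row


# Emulates ORDER BY a, b LIMIT 3.
top3 = list(islice(incremental_sort(scan(rows), itemgetter(0), itemgetter(1)), 3))
print(top3)           # [(1, 3), (1, 9), (2, 1)]
print(len(consumed))  # 6
```

In this sketch the first three rows come out after reading only six of the seven input rows; one row past the end of the a = 2 group has to be read to know that the group is complete.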

enable_incremental_sort

PostgreSQL 13 adds the configuration parameter enable_incremental_sort, which controls whether the planner may use incremental sort. It is on by default.

Test setup

Create a test table in PostgreSQL 13:

postgres=# create table test(id int,c1 int ,c2 int,info varchar(300),crt_time timestamp);
CREATE TABLE
postgres=# insert into test select t,t,2,'test',clock_timestamp() from generate_series(1,1000000)t;
INSERT 0 1000000
postgres=# create index i_test_id on test(id);
CREATE INDEX
-- Sample of the data
postgres=# select * from test order by id,c1 limit 10;
id | c1 | c2 | info | crt_time
----+----+----+------+----------------------------
1 | 1 | 2 | test | 2022-06-02 14:23:38.253289
2 | 2 | 2 | test | 2022-06-02 14:23:38.253777
3 | 3 | 2 | test | 2022-06-02 14:23:38.253785
4 | 4 | 2 | test | 2022-06-02 14:23:38.253787
5 | 5 | 2 | test | 2022-06-02 14:23:38.25379
6 | 6 | 2 | test | 2022-06-02 14:23:38.253791
7 | 7 | 2 | test | 2022-06-02 14:23:38.253793
8 | 8 | 2 | test | 2022-06-02 14:23:38.253795
9 | 9 | 2 | test | 2022-06-02 14:23:38.253809
10 | 10 | 2 | test | 2022-06-02 14:23:38.25381
(10 rows)

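PostgreSQL 13 test

  • These tests were actually run on PG 14. The parameter was called enable_incrementalsort in the PG 13 betas and was renamed to enable_incremental_sort before the 13 release.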
postgres=# show enable_incremental_sort;
enable_incremental_sort
-------------------------
on
(1 row)

postgres=# explain analyze select * from test order by id,c1 limit 10;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.46..1.16 rows=10 width=25) (actual time=0.159..0.163 rows=10 loops=1)
-> Incremental Sort (cost=0.46..70373.03 rows=1000000 width=25) (actual time=0.157..0.159 rows=10 loops=1)
Sort Key: id, c1
Presorted Key: id
Full-sort Groups: 1 Sort Method: quicksort Average Memory: 25kB Peak Memory: 25kB
-> Index Scan using i_test_id on test (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.103..0.106 rows=11 loops=1)
Planning Time: 0.427 ms
Execution Time: 0.265 ms
(8 rows)
  • The plan shows an Incremental Sort node with Presorted Key: id, fed by an index scan on i_test_id. The query takes 0.265 ms.

Disabling enable_incremental_sort

postgres=# set enable_incremental_sort=off;
SET
postgres=# explain analyze select * from test order by id,c1 limit 10;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Limit (cost=38962.64..38962.67 rows=10 width=25) (actual time=272.945..272.953 rows=10 loops=1)
-> Sort (cost=38962.64..41462.64 rows=1000000 width=25) (actual time=272.933..272.937 rows=10 loops=1)
Sort Key: id, c1
Sort Method: top-N heapsort Memory: 25kB
-> Seq Scan on test (cost=0.00..17353.00 rows=1000000 width=25) (actual time=0.028..118.098 rows=1000000 loops=1)
Planning Time: 0.305 ms
Execution Time: 273.023 ms
(7 rows)
  • With incremental sort disabled the query takes 273.023 ms, roughly three orders of magnitude slower than 0.265 ms.

PostgreSQL 12 test

  • Abase 7.0 is based on PostgreSQL 12.3

Using the same table definition as above, run the following SQL:

postgres=#  explain analyze select * from test order by id,c1 limit 10;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Limit (cost=38962.64..38962.67 rows=10 width=536) (actual time=288.847..288.851 rows=10 loops=1)
-> Sort (cost=38962.64..41462.64 rows=1000000 width=536) (actual time=288.839..288.840 rows=10 loops=1)
Sort Key: id, c1
Sort Method: top-N heapsort Memory: 25kB
-> Seq Scan on test (cost=0.00..17353.00 rows=1000000 width=536) (actual time=0.078..173.460 rows=1000000 loops=1)
Planning Time: 24.726 ms
Execution Time: 289.135 ms
(7 rows)

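The PG 12 plan has the same shape as the PG 14 plan with enable_incremental_sort turned off, and performance is correspondingly poor.

Of course, this is just a simple query. Can Incremental Sort also be used when the query includes a WHERE clause or a join?

With a WHERE condition

Add c1 > 100000; c1 has no index: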

postgres=# explain analyze select * from test where c1 > 100000 order by id,c1 limit 10;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.47..1.23 rows=10 width=25) (actual time=49.470..49.476 rows=10 loops=1)
-> Incremental Sort (cost=0.47..68345.40 rows=899386 width=25) (actual time=49.467..49.469 rows=10 loops=1)
Sort Key: id, c1
Presorted Key: id
Full-sort Groups: 1 Sort Method: quicksort Average Memory: 25kB Peak Memory: 25kB
-> Index Scan using i_test_id on test (cost=0.42..27873.02 rows=899386 width=25) (actual time=49.383..49.387 rows=11 loops=1)
Filter: (c1 > 100000)
Rows Removed by Filter: 100000
Planning Time: 0.879 ms
Execution Time: 49.594 ms
(10 rows)

Add id > 100000; id is indexed:

postgres=# explain analyze select * from test where id > 100000 order by id,c1 limit 10;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.46..1.19 rows=10 width=25) (actual time=0.160..0.164 rows=10 loops=1)
-> Incremental Sort (cost=0.46..65542.05 rows=899386 width=25) (actual time=0.148..0.150 rows=10 loops=1)
Sort Key: id, c1
Presorted Key: id
Full-sort Groups: 1 Sort Method: quicksort Average Memory: 25kB Peak Memory: 25kB
-> Index Scan using i_test_id on test (cost=0.42..25069.68 rows=899386 width=25) (actual time=0.115..0.119 rows=11 loops=1)
Index Cond: (id > 100000)
Planning Time: 0.408 ms
Execution Time: 0.258 ms
(9 rows)

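As shown, even when the WHERE column has no index, incremental sort is still used as long as the leading sort column is indexed; the predicate is simply applied as Filter: (c1 > 100000) on the index scan. The result is still good, although slower than with the indexed id condition (49.594 ms versus 0.258 ms), because the scan reads and discards 100000 rows before reaching the first match (Rows Removed by Filter: 100000).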
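Why the unindexed predicate costs more can be seen in a small Python sketch. This is an illustrative model only, with hypothetical names (filtered_ordered_scan, stats); it assumes rows arrive in id order and that c1 equals id, as in the test data. A filter applied during an ordered scan keeps the order intact, so the presorted key survives, but every non-matching row ahead of the first match still has to be read and thrown away.

```python
from itertools import islice


def filtered_ordered_scan(rows, predicate, stats):
    """Yield rows that pass `predicate`, preserving the input order."""
    for row in rows:
        if predicate(row):
            yield row
        else:
            # Analogous to "Rows Removed by Filter" in the plan.
            stats["removed"] += 1


# Hypothetical (id, c1) rows with c1 == id, produced lazily in id order.
table = ((i, i) for i in range(1, 1_000_001))
stats = {"removed": 0}

# Emulates WHERE c1 > 100000 ... LIMIT 10 on top of an ordered scan.
first10 = list(
    islice(filtered_ordered_scan(table, lambda r: r[1] > 100_000, stats), 10)
)
print(first10[0])        # (100001, 100001)
print(stats["removed"])  # 100000
```

With an index condition on id instead, the scan could start directly at the first matching entry, which is consistent with the much lower time in the second plan.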

PG 13 multi-column sort

  • Sorting by id, c1, c2 also uses incremental sort
postgres=# explain analyze select * from test  order by id,c1,c2 limit 10;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.46..1.16 rows=10 width=25) (actual time=0.175..0.179 rows=10 loops=1)
-> Incremental Sort (cost=0.46..70373.03 rows=1000000 width=25) (actual time=0.172..0.174 rows=10 loops=1)
Sort Key: id, c1, c2
Presorted Key: id
Full-sort Groups: 1 Sort Method: quicksort Average Memory: 25kB Peak Memory: 25kB
-> Index Scan using i_test_id on test (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.126..0.130 rows=11 loops=1)
Planning Time: 0.485 ms
Execution Time: 0.237 ms
(8 rows)

PG 13 join

  • Make a copy of the table as test2
postgres=# create table test2 as select * from test;
SELECT 1000000
postgres=# create index i_test2_id on test2(id);
CREATE INDEX
  • Join the two tables and sort by test.id, test.c1
postgres=# explain analyze select *from test join test2 on test.id = test2.id order by test.id,test.c1 limit 10;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1.93..3.04 rows=10 width=50) (actual time=0.089..0.092 rows=10 loops=1)
-> Incremental Sort (cost=1.93..110738.33 rows=1000000 width=50) (actual time=0.087..0.089 rows=10 loops=1)
Sort Key: test.id, test.c1
Presorted Key: test.id
Full-sort Groups: 1 Sort Method: quicksort Average Memory: 26kB Peak Memory: 26kB
-> Merge Join (cost=1.85..65738.33 rows=1000000 width=50) (actual time=0.044..0.068 rows=11 loops=1)
Merge Cond: (test.id = test2.id)
-> Index Scan using i_test_id on test (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.022..0.036 rows=11 loops=1)
-> Index Scan using i_test2_id on test2 (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.014..0.018 rows=11 loops=1)
Planning Time: 0.599 ms
Execution Time: 0.174 ms
(11 rows)

postgres=# set enable_incremental_sort=off ;
SET
postgres=# explain analyze select *from test join test2 on test.id = test2.id order by test.id,test.c1 limit 10;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=87347.97..87347.99 rows=10 width=50) (actual time=1964.394..1964.407 rows=10 loops=1)
-> Sort (cost=87347.97..89847.97 rows=1000000 width=50) (actual time=1964.391..1964.402 rows=10 loops=1)
Sort Key: test.id, test.c1
Sort Method: top-N heapsort Memory: 26kB
-> Merge Join (cost=1.85..65738.33 rows=1000000 width=50) (actual time=0.070..1690.949 rows=1000000 loops=1)
Merge Cond: (test.id = test2.id)
-> Index Scan using i_test_id on test (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.042..571.732 rows=1000000 loops=1)
-> Index Scan using i_test2_id on test2 (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.017..585.722 rows=1000000 loops=1)
Planning Time: 1.292 ms
Execution Time: 1964.517 ms
(10 rows)

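Sorting after a join can also use incremental sort: 0.174 ms with incremental sort versus 1964.517 ms with it disabled.

  • Sort columns that come from different tables after the join: test.id, test2.c1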
postgres=# explain analyze select *from test join test2 on test.id = test2.id order by test.id,test2.c1 limit 10;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1.93..3.04 rows=10 width=50) (actual time=0.151..0.155 rows=10 loops=1)
-> Incremental Sort (cost=1.93..110738.33 rows=1000000 width=50) (actual time=0.149..0.151 rows=10 loops=1)
Sort Key: test.id, test2.c1
Presorted Key: test.id
Full-sort Groups: 1 Sort Method: quicksort Average Memory: 26kB Peak Memory: 26kB
-> Merge Join (cost=1.85..65738.33 rows=1000000 width=50) (actual time=0.075..0.088 rows=11 loops=1)
Merge Cond: (test.id = test2.id)
-> Index Scan using i_test_id on test (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.040..0.044 rows=11 loops=1)
-> Index Scan using i_test2_id on test2 (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.025..0.028 rows=11 loops=1)
Planning Time: 0.778 ms
Execution Time: 0.230 ms
(11 rows)

postgres=# set enable_incremental_sort=off ;
SET
postgres=# explain analyze select *from test join test2 on test.id = test2.id order by test.id,test2.c1 limit 10;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=87347.97..87347.99 rows=10 width=50) (actual time=1493.513..1493.519 rows=10 loops=1)
-> Sort (cost=87347.97..89847.97 rows=1000000 width=50) (actual time=1493.510..1493.513 rows=10 loops=1)
Sort Key: test.id, test2.c1
Sort Method: top-N heapsort Memory: 26kB
-> Merge Join (cost=1.85..65738.33 rows=1000000 width=50) (actual time=0.065..1228.403 rows=1000000 loops=1)
Merge Cond: (test.id = test2.id)
-> Index Scan using i_test_id on test (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.027..318.044 rows=1000000 loops=1)
-> Index Scan using i_test2_id on test2 (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.027..390.231 rows=1000000 loops=1)
Planning Time: 0.761 ms
Execution Time: 1493.685 ms
(10 rows)

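When the sort columns come from different tables (test.id, test2.c1), incremental sort can still be used: 0.230 ms with it enabled versus 1493.685 ms with it disabled.

Now a slower query:

  • This query joins the two tables, filters on c2 = 2 (every row has c2 = 2, so the filter removes nothing), and uses OFFSET 100000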
postgres=# explain analyze select *from test join test2 on test.id = test2.id where test.c2 = 2 order by test.id,test2.c1 limit 10 offset 100000;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=11325.58..11326.72 rows=10 width=50) (actual time=198.125..198.131 rows=10 loops=1)
-> Incremental Sort (cost=2.02..113237.64 rows=1000000 width=50) (actual time=0.127..193.661 rows=100010 loops=1)
Sort Key: test.id, test2.c1
Presorted Key: test.id
Full-sort Groups: 3126 Sort Method: quicksort Average Memory: 29kB Peak Memory: 29kB
-> Merge Join (cost=1.94..68237.64 rows=1000000 width=50) (actual time=0.052..152.908 rows=100011 loops=1)
Merge Cond: (test.id = test2.id)
-> Index Scan using i_test_id on test (cost=0.42..27873.02 rows=1000000 width=25) (actual time=0.026..46.138 rows=100011 loops=1)
Filter: (c2 = 2)
-> Index Scan using i_test2_id on test2 (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.020..51.088 rows=100011 loops=1)
Planning Time: 0.707 ms
Execution Time: 198.252 ms
(12 rows)

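Thanks to incremental sort the query is still fairly fast: 198.252 ms, even though OFFSET 100000 forces the sort to produce 100010 rows before the limit is satisfied.

  • With incremental sort disabled (enable_incremental_sort=off):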
postgres=# explain analyze select *from test join test2 on test.id = test2.id where test.c2 = 2 order by test.id,test2.c1 limit 10 offset 100000;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=156536.56..156536.59 rows=10 width=50) (actual time=2496.085..2496.093 rows=10 loops=1)
-> Sort (cost=156286.56..158786.56 rows=1000000 width=50) (actual time=2469.643..2491.429 rows=100010 loops=1)
Sort Key: test.id, test2.c1
Sort Method: external merge Disk: 72432kB
-> Merge Join (cost=1.94..68237.64 rows=1000000 width=50) (actual time=0.082..1371.433 rows=1000000 loops=1)
Merge Cond: (test.id = test2.id)
-> Index Scan using i_test_id on test (cost=0.42..27873.02 rows=1000000 width=25) (actual time=0.040..433.114 rows=1000000 loops=1)
Filter: (c2 = 2)
-> Index Scan using i_test2_id on test2 (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.033..401.784 rows=1000000 loops=1)
Planning Time: 0.807 ms
Execution Time: 2530.205 ms
(11 rows)

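This query takes 2530.205 ms, compared with 198.252 ms with incremental sort, so the improvement is still substantial (roughly 13 times faster).

In the queries above, however, the join is on id and the sort also leads with id, which is why they run efficiently. What happens if the leading sort column is changed to crt_time?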

postgres=# explain analyze select *from test join test2 on test.id = test2.id where test.c2 = 2 order by test.crt_time,test2.c1 limit 10 offset 100000;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=156536.56..156536.59 rows=10 width=50) (actual time=2702.107..2702.133 rows=10 loops=1)
-> Sort (cost=156286.56..158786.56 rows=1000000 width=50) (actual time=2667.324..2697.033 rows=100010 loops=1)
Sort Key: test.crt_time, test2.c1
Sort Method: external merge Disk: 72432kB
-> Merge Join (cost=1.94..68237.64 rows=1000000 width=50) (actual time=0.161..1524.794 rows=1000000 loops=1)
Merge Cond: (test.id = test2.id)
-> Index Scan using i_test_id on test (cost=0.42..27873.02 rows=1000000 width=25) (actual time=0.074..488.803 rows=1000000 loops=1)
Filter: (c2 = 2)
-> Index Scan using i_test2_id on test2 (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.073..487.688 rows=1000000 loops=1)
Planning Time: 1.835 ms
Execution Time: 2746.486 ms
(11 rows)

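When the leading ORDER BY column differs from the join column, incremental sort cannot be used: the merge join returns rows ordered by id, which is not a leading prefix of (crt_time, c1), so there is no presorted key to build on. In this test crt_time has no index either; only id is indexed. When the leading sort column matches the join column, incremental sort can be used.

The plan when sorting by test.crt_time is the same as the plan above with incremental sort disabled.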
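The rule can be sketched in Python as a prefix comparison. This is a deliberately simplified model with hypothetical function names (presorted_prefix, choose_sort): the real planner is cost-based and also reasons about columns made equal by join conditions, which the sketch ignores. It compares the order the input already has with the requested sort keys and reports whether a presorted prefix exists.

```python
def presorted_prefix(input_order, sort_keys):
    """Return the leading sort keys that the input is already ordered by."""
    prefix = []
    for have, want in zip(input_order, sort_keys):
        if have != want:
            break
        prefix.append(want)
    return prefix


def choose_sort(input_order, sort_keys):
    """Pick a sort strategy from the input order alone (costs are ignored)."""
    prefix = presorted_prefix(input_order, sort_keys)
    if len(prefix) == len(sort_keys):
        return "no sort needed"
    if prefix:
        return "incremental sort, presorted on " + ", ".join(prefix)
    return "full sort"


# The merge join on id delivers rows ordered by test.id.
print(choose_sort(["test.id"], ["test.id", "test2.c1"]))
# incremental sort, presorted on test.id
print(choose_sort(["test.id"], ["test.crt_time", "test2.c1"]))
# full sort
```

The second call corresponds to the crt_time query: the input order shares no leading key with the requested sort, so nothing can be reused.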

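Summary

  • Incremental sort gives a clear speedup for single-table, multi-column sorts when an index supplies the leading sort column

  • Join queries can also use incremental sort when the join key is the leading sort key; when the leading sort key is a different column, as with crt_time here, the plan falls back to a full sort

References: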

https://postgres.fun/20200721193000.html

New version survey: first look at 13 Beta 1

https://mp.weixin.qq.com/s/mBIL2uzIHB7qVByBIVRmhg
