Function Call Counts and Performance

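When the SELECT clause of a query calls an expensive function or subquery, pay particular attention to how the number of function calls affects the overall execution time of the SQL statement.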

I. Data preparation and SQL statements

  • Simulate an expensive user-defined function

Each call to the function sleeps for 1 second.

create or replace function f001()
returns int stable language sql
as
$$
select 1 from pg_sleep(1);
$$;
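
When experimenting, waiting a full second per call gets slow. Below is a minimal sketch of a stand-in that reports each call instead of sleeping. The name f001_notice is hypothetical and not part of the original setup; the sketch assumes PL/pgSQL is available and keeps the same STABLE marking as f001.

```sql
-- Hypothetical helper (not in the original article): emits a NOTICE on
-- every call so that invocations can be counted in the client output.
create or replace function f001_notice()
returns int stable language plpgsql
as
$$
begin
    raise notice 'f001_notice called';
    return 1;
end;
$$;

-- Each evaluation of the function prints one NOTICE line.
select f001_notice();
```

Swapping f001_notice() for f001() in the queries that follow should make the call count visible directly, without relying on elapsed time.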
  • Simulate a subquery that returns multiple rows

In the result set, the join column contains duplicate values.

test=# create table t1 as select sn, id from (select generate_series(1, 3) sn), (select generate_series(3, 8) id);
SELECT 18
test=# create table t2 as select generate_series(1, 5) id;
SELECT 5
test=# select * from t1;
sn | id
----+----
1 | 3
2 | 3
3 | 3
1 | 4
2 | 4
3 | 4
1 | 5
2 | 5
3 | 5
1 | 6
2 | 6
3 | 6
1 | 7
2 | 7
3 | 7
1 | 8
2 | 8
3 | 8
(18 rows)

test=# select * from t2;
id
----
1
2
3
4
5
(5 rows)  

II. Optimizing a query over all rows

1. Initial SQL

with a as (select sn, id  from t1 )
select a.*, (select id + f001() sq_sum from t2 b where b.id = a.id)
from a
where 1 = 1;

sn | id | sq_sum
----+----+--------
1 | 3 | 4
2 | 3 | 4
3 | 3 | 4
1 | 4 | 5
2 | 4 | 5
3 | 4 | 5
1 | 5 | 6
2 | 5 | 6
3 | 5 | 6
1 | 6 |
2 | 6 |
3 | 6 |
1 | 7 |
2 | 7 |
3 | 7 |
1 | 8 |
2 | 8 |
3 | 8 |

QUERY PLAN
-------------------------------------------------------------------------------------------------------------
Seq Scan on t1  (cost=0.00..101806.05 rows=2260 width=12) (actual time=1000.638..9008.584 rows=18 loops=1)
  SubPlan 1
    ->  Seq Scan on t2 b  (cost=0.00..45.03 rows=13 width=4) (actual time=500.465..500.468 rows=0 loops=18)
          Filter: (id = t1.id)
          Rows Removed by Filter: 4
Planning Time: 0.149 ms
Execution Time: 9008.667 ms
(7 rows)

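For every row of t1, t2 is accessed once, and f001() is called once whenever t2 has a matching row. Here 9 of the 18 rows in t1 (ids 3, 4, and 5) have a match, so the function runs 9 times, which accounts for the roughly 9-second execution time.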
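
The number of calls made by the initial query can be estimated before running it, because it equals the number of t1 rows that have a match in t2. A quick check, assuming the t1 and t2 tables created above (the column alias expected_calls is only illustrative):

```sql
-- Rows of t1 that have a matching id in t2; the scalar subquery in the
-- initial SQL calls f001() once for each of them.
select count(*) as expected_calls
from t1 a
where exists (select 1 from t2 b where b.id = a.id);
```

With the sample data this counts the rows with id 3, 4, and 5, three rows each, giving 9.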

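2. CTE

Join against a precomputed intermediate result to avoid calling the function inside the loop. Note the MATERIALIZED option on the CTE: it makes the expensive part run first and produce its result once, so that it is not merged into the rest of the query and executed repeatedly. The function now runs once per row of t2, that is, 5 times.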

with a as (select sn, id  from t1),
b as materialized (select id, id + f001() sq_sum from t2 )
select a.*, (select sq_sum from b where b.id = a.id)
from a
where 1 = 1;

QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Seq Scan on t1  (cost=676.75..129868.35 rows=2260 width=12) (actual time=5004.377..5004.392 rows=18 loops=1)
  CTE b
    ->  Seq Scan on t2  (cost=0.00..676.75 rows=2540 width=8) (actual time=1000.562..5004.348 rows=5 loops=1)
  SubPlan 2
    ->  CTE Scan on b  (cost=0.00..57.15 rows=13 width=4) (actual time=166.821..278.021 rows=0 loops=18)
          Filter: (id = t1.id)
          Rows Removed by Filter: 4
Planning Time: 0.107 ms
Execution Time: 5004.441 ms
(9 rows)
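
For comparison, here is a sketch of the same query with the CTE explicitly marked NOT MATERIALIZED, assuming PostgreSQL 12 or later, where this keyword exists. It lets the planner fold b back into the outer query, in which case f001() may again run once per matching outer row. It is mainly useful for seeing what MATERIALIZED protects against.

```sql
explain analyse
with a as (select sn, id from t1),
-- NOT MATERIALIZED permits inlining of b into the scalar subquery below.
b as not materialized (select id, id + f001() sq_sum from t2)
select a.*, (select sq_sum from b where b.id = a.id) sq_sum
from a
where 1 = 1;
```

Comparing the plan of this variant with the one above (a SubPlan over t2 versus a separate CTE b node) shows whether the CTE was inlined.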

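3. LEFT JOIN with a subquery

As in the previous example, materialization (the Materialize node in the plan) keeps the function from being called repeatedly.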

with a as (select sn, id from t1)
select a.*, b.sq_sum
from a left join (select id, id + f001() sq_sum from t2 ) b on b.id = a.id
where 1 = 1 ;

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Nested Loop Left Join  (cost=0.00..86821.70 rows=28702 width=12) (actual time=3002.905..5005.627 rows=18 loops=1)
  Join Filter: (t2.id = t1.id)
  Rows Removed by Join Filter: 81
  ->  Seq Scan on t1  (cost=0.00..32.60 rows=2260 width=8) (actual time=0.007..0.011 rows=18 loops=1)
  ->  Materialize  (cost=0.00..689.45 rows=2540 width=8) (actual time=55.621..278.088 rows=5 loops=18)
        ->  Seq Scan on t2  (cost=0.00..676.75 rows=2540 width=8) (actual time=1001.176..5005.568 rows=5 loops=1)
Planning Time: 0.180 ms
Execution Time: 5005.650 ms
(8 rows)

III. Optimizing a query over part of the data (filter on the main table)

1. Initial SQL

with a as (select sn, id from t1)
select a.*, (select id + f001() sq_sum from t2 b where b.id = a.id)
from a
where 1 = 1 and a.id = 3;

sn | id | sq_sum
----+----+--------
1 | 3 | 4
2 | 3 | 4
3 | 3 | 4
(3 rows)

QUERY PLAN
--------------------------------------------------------------------------------------------------------------
Seq Scan on t1  (cost=0.00..533.61 rows=11 width=12) (actual time=1000.356..3002.635 rows=3 loops=1)
  Filter: (id = 3)
  Rows Removed by Filter: 15
  SubPlan 1
    ->  Seq Scan on t2 b  (cost=0.00..45.03 rows=13 width=4) (actual time=1000.859..1000.866 rows=1 loops=3)
          Filter: (id = t1.id)
          Rows Removed by Filter: 4
Planning Time: 0.164 ms
Execution Time: 3002.657 ms
(9 rows)

2. CTE

with a as (select sn, id from t1),
b as materialized (select id, id + f001() sq_sum from t2)
select a.*, (select sq_sum from b where b.id = a.id) sq_sum
from a
where 1 = 1 and a.id=3 ;

QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Seq Scan on t1  (cost=676.75..1343.65 rows=11 width=12) (actual time=5004.958..5004.965 rows=3 loops=1)
  Filter: (id = 3)
  Rows Removed by Filter: 15
  CTE b
    ->  Seq Scan on t2  (cost=0.00..676.75 rows=2540 width=8) (actual time=1001.387..5004.930 rows=5 loops=1)
  SubPlan 2
    ->  CTE Scan on b  (cost=0.00..57.15 rows=13 width=4) (actual time=1001.132..1668.316 rows=1 loops=3)
          Filter: (id = t1.id)
          Rows Removed by Filter: 4
Planning Time: 0.105 ms
Execution Time: 5004.983 ms
(11 rows)

3. LEFT JOIN with a subquery

explain analyse
with a as (select sn, id from t1)
select a.*, b.sq_sum
from a left join (select id, id + f001() sq_sum from t2) b on b.id = a.id
where 1 = 1 and a.id = 3 ;

QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Nested Loop Left Join  (cost=0.00..85.46 rows=143 width=12) (actual time=1000.972..1000.980 rows=3 loops=1)
  Join Filter: (t2.id = t1.id)
  ->  Seq Scan on t1  (cost=0.00..38.25 rows=11 width=8) (actual time=0.008..0.009 rows=3 loops=1)
        Filter: (id = 3)
        Rows Removed by Filter: 15
  ->  Materialize  (cost=0.00..45.10 rows=13 width=8) (actual time=333.654..333.656 rows=1 loops=3)
        ->  Seq Scan on t2  (cost=0.00..45.03 rows=13 width=8) (actual time=1000.959..1000.964 rows=1 loops=1)
              Filter: (id = 3)
              Rows Removed by Filter: 4
Planning Time: 0.107 ms
Execution Time: 1000.996 ms
(11 rows)
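
The Filter: (id = 3) line on t2 in the plan above indicates that the planner derived a condition on t2 from b.id = a.id and a.id = 3, so only one t2 row reaches f001(). If the planner does not make this inference for some other query shape, the same effect can be written by hand. A sketch, assuming the same tables and simply repeating the literal 3 inside the subquery:

```sql
explain analyse
with a as (select sn, id from t1)
select a.*, b.sq_sum
from a left join (
    -- Filter repeated by hand so that f001() is only evaluated for id = 3.
    select id, id + f001() sq_sum from t2 where id = 3
) b on b.id = a.id
where a.id = 3;
```

For rows with a.id = 3 this returns the same result as the query above, because the join condition already restricts b to id = 3.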

IV. Optimizing a query over part of the data (filter on the secondary table)

1. Initial SQL

with a as (select sn, id from (select generate_series(1, 3) sn), (select generate_series(3, 8) id))
select a.*, (select id + f001() sq_sum from (select generate_series(1, 5) id) b where b.id = a.id) sq_sum
from a
where 1 = 1 and sq_sum = 6 ;

sn | id | sq_sum
----+----+--------
1 | 5 | 6
2 | 5 | 6
3 | 5 | 6
(3 rows)

Time: 6007.526 ms (00:06.008)

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Nested Loop  (cost=0.00..3.43 rows=3 width=12) (actual time=4004.455..6006.681 rows=3 loops=1)
  ->  Subquery Scan on "SYSINTERNAL-4-1"  (cost=0.00..2.27 rows=1 width=4) (actual time=3003.273..3003.289 rows=1 loops=1)
        Filter: ((SubPlan 2) = 6)
        Rows Removed by Filter: 5
        ->  ProjectSet  (cost=0.00..0.05 rows=6 width=4) (actual time=0.001..0.005 rows=6 loops=1)
              ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.001 rows=1 loops=1)
        SubPlan 2
          ->  Subquery Scan on b_1  (cost=0.00..0.36 rows=1 width=4) (actual time=500.541..500.544 rows=0 loops=6)
                Filter: (b_1.id = "SYSINTERNAL-4-1".id)
                Rows Removed by Filter: 4
                ->  ProjectSet  (cost=0.00..0.04 rows=5 width=4) (actual time=0.001..0.003 rows=5 loops=6)
                      ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.001 rows=1 loops=6)
  ->  ProjectSet  (cost=0.00..0.03 rows=3 width=4) (actual time=0.002..0.006 rows=3 loops=1)
        ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.001 rows=1 loops=1)
  SubPlan 1
    ->  Subquery Scan on b  (cost=0.00..0.36 rows=1 width=4) (actual time=1001.116..1001.122 rows=1 loops=3)
          Filter: (b.id = "SYSINTERNAL-4-1".id)
          Rows Removed by Filter: 4
          ->  ProjectSet  (cost=0.00..0.04 rows=5 width=4) (actual time=0.001..0.006 rows=5 loops=3)
                ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.001 rows=1 loops=3)
Planning Time: 0.121 ms
Execution Time: 6006.709 ms
(22 rows)

Time: 6007.303 ms (00:06.007)

2. CTE

explain analyse
with a as (select sn, id from (select generate_series(1, 3) sn), (select generate_series(3, 8) id)),
b as (select id, id + f001() sq_sum from generate_series(1, 5) id)
select a.*, (select sq_sum from b where b.id = a.id) sq_sum
from a
where 1 = 1 and sq_sum = 6 ;

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Nested Loop  (cost=1.31..2.54 rows=3 width=12) (actual time=5005.414..5005.422 rows=3 loops=1)
  CTE b
    ->  Function Scan on generate_series id  (cost=0.00..1.31 rows=5 width=8) (actual time=1001.093..5005.376 rows=5 loops=1)
  ->  Subquery Scan on "SYSINTERNAL-4-1"  (cost=0.00..0.80 rows=1 width=4) (actual time=5005.406..5005.410 rows=1 loops=1)
        Filter: ((SubPlan 3) = 6)
        Rows Removed by Filter: 5
        ->  ProjectSet  (cost=0.00..0.05 rows=6 width=4) (actual time=0.002..0.005 rows=6 loops=1)
              ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.001 rows=1 loops=1)
        SubPlan 3
          ->  CTE Scan on b b_1  (cost=0.00..0.11 rows=1 width=4) (actual time=500.540..834.232 rows=0 loops=6)
                Filter: (id = "SYSINTERNAL-4-1".id)
                Rows Removed by Filter: 4
  ->  ProjectSet  (cost=0.00..0.03 rows=3 width=4) (actual time=0.002..0.003 rows=3 loops=1)
        ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.000 rows=1 loops=1)
  SubPlan 2
    ->  CTE Scan on b  (cost=0.00..0.11 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=3)
          Filter: (id = "SYSINTERNAL-4-1".id)
          Rows Removed by Filter: 4
Planning Time: 0.143 ms
Execution Time: 5005.449 ms
(20 rows)

Time: 5006.115 ms (00:05.006)

3. LEFT JOIN

explain analyse
with a as (select sn, id from (select generate_series(1, 3) sn), (select generate_series(3, 8) id))
select a.*, b.sq_sum
from a left join (select id, id + f001() sq_sum from generate_series(1, 5) id) b on b.id = a.id
where 1 = 1 and sq_sum = 6 ;

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop  (cost=1.39..1.62 rows=3 width=12) (actual time=5005.285..5005.291 rows=3 loops=1)
  ->  Hash Join  (cost=1.39..1.53 rows=1 width=8) (actual time=5005.281..5005.285 rows=1 loops=1)
        Hash Cond: ((generate_series(3, 8)) = b.id)
        ->  ProjectSet  (cost=0.00..0.05 rows=6 width=4) (actual time=0.002..0.005 rows=6 loops=1)
              ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
        ->  Hash  (cost=1.38..1.38 rows=1 width=8) (actual time=5005.270..5005.270 rows=1 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 9kB
              ->  Subquery Scan on b  (cost=0.00..1.38 rows=1 width=8) (actual time=5005.264..5005.266 rows=1 loops=1)
                    Filter: (b.sq_sum = 6)
                    Rows Removed by Filter: 4
                    ->  Function Scan on generate_series id  (cost=0.00..1.31 rows=5 width=8) (actual time=1001.089..5005.254 rows=5 loops=1)
  ->  ProjectSet  (cost=0.00..0.03 rows=3 width=4) (actual time=0.002..0.003 rows=3 loops=1)
        ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.001 rows=1 loops=1)
Planning Time: 0.113 ms
Execution Time: 5005.320 ms
(15 rows)

Time: 5005.873 ms (00:05.006)

4. LATERAL join with a CTE

explain analyse
with a as (select sn, id from (select generate_series(1, 3) sn), (select generate_series(3, 8) id)),
b as (select id, id + f001() sq_sum from generate_series(1, 5) id)
select a.*, b.sq_sum
from a join lateral (select * from b where b.id = a.id ) b on true
where 1 = 1 and sq_sum = 6;

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Nested Loop  (cost=1.44..1.67 rows=3 width=12) (actual time=5005.492..5005.499 rows=3 loops=1)
  CTE b
    ->  Function Scan on generate_series id  (cost=0.00..1.31 rows=5 width=8) (actual time=1001.103..5005.440 rows=5 loops=1)
  ->  Hash Join  (cost=0.12..0.27 rows=1 width=8) (actual time=5005.485..5005.489 rows=1 loops=1)
        Hash Cond: ((generate_series(3, 8)) = b.id)
        ->  ProjectSet  (cost=0.00..0.05 rows=6 width=4) (actual time=0.002..0.006 rows=6 loops=1)
              ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.001 rows=1 loops=1)
        ->  Hash  (cost=0.11..0.11 rows=1 width=8) (actual time=5005.472..5005.472 rows=1 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 9kB
              ->  CTE Scan on b  (cost=0.00..0.11 rows=1 width=8) (actual time=5005.457..5005.459 rows=1 loops=1)
                    Filter: (sq_sum = 6)
                    Rows Removed by Filter: 4
  ->  ProjectSet  (cost=0.00..0.03 rows=3 width=4) (actual time=0.004..0.005 rows=3 loops=1)
        ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
Planning Time: 0.205 ms
Execution Time: 5005.531 ms
(16 rows)

Time: 5006.765 ms (00:05.007)

5. LATERAL join with a subquery

explain analyse
with a as (select sn, id from (select generate_series(1, 3) sn), (select generate_series(3, 8) id))
select a.*, b.sq_sum
from a left join lateral (select id, id + f001() sq_sum from generate_series(1, 5) id where id = a.id ) b on true
where 1 = 1 and sq_sum = 6 ;

QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Nested Loop  (cost=0.00..2.44 rows=18 width=12) (actual time=3003.267..3003.280 rows=3 loops=1)
  ->  Nested Loop  (cost=0.00..2.15 rows=6 width=8) (actual time=3003.257..3003.268 rows=1 loops=1)
        ->  ProjectSet  (cost=0.00..0.05 rows=6 width=4) (actual time=0.002..0.007 rows=6 loops=1)
              ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.002 rows=1 loops=1)
        ->  Subquery Scan on b  (cost=0.00..0.33 rows=1 width=4) (actual time=500.541..500.542 rows=0 loops=6)
              Filter: (b.sq_sum = 6)
              Rows Removed by Filter: 0
              ->  Function Scan on generate_series id  (cost=0.00..0.32 rows=1 width=8) (actual time=500.539..500.540 rows=0 loops=6)
                    Filter: (id = (generate_series(3, 8)))
                    Rows Removed by Filter: 4
  ->  Materialize  (cost=0.00..0.08 rows=3 width=4) (actual time=0.007..0.009 rows=3 loops=1)
        ->  ProjectSet  (cost=0.00..0.03 rows=3 width=4) (actual time=0.005..0.006 rows=3 loops=1)
              ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.002 rows=1 loops=1)
Planning Time: 0.125 ms
Execution Time: 3003.306 ms
(15 rows)

Time: 3003.950 ms (00:03.004)

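Summary

  • Expressions in the SELECT clause are evaluated row by row, so the total time grows in proportion to the size of the result set.
  • A materialized CTE computes its full result first and is then joined to the main table, so the total time is predictable.
  • Subqueries and LATERAL joins can avoid recomputing the same value across the result set.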
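
One way to combine these observations is to compute the expensive expression once per distinct key and then join the result back. The following is only a sketch, assuming the t1, t2, and f001() objects defined earlier; the CTE names k and v are made up for illustration and do not appear in the examples above.

```sql
with k as materialized (
    -- Distinct keys actually needed by the outer query.
    select distinct id from t1 where id = 5
),
v as materialized (
    -- f001() is evaluated only for keys that have a match in t2.
    select k.id, b.id + f001() as sq_sum
    from k join t2 b on b.id = k.id
)
select a.sn, a.id, v.sq_sum
from t1 a left join v on v.id = a.id
where a.id = 5;
```

Because v holds one row per key, the three t1 rows with id 5 should share a single computed sq_sum rather than triggering a call each. Running the query under explain analyse is the way to confirm this on a given server.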
