Postgres Indexes 01
I. Index types available in PG 9.3
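1. b-tree
- 1.1 Supports prefix (left-anchored) pattern matching, e.g.
LIKE 'xxx%'
or ~ '^xxx'
(outside the C locale, the index has to be created with an operator class such as text_pattern_ops for this to work)
- 1.2 Supports case-insensitive prefix matching, e.g.
ILIKE 'XXX%'
or ~* '^xxx'
but only when the pattern starts with non-alphabetic characters, i.e. characters that case conversion does not affect
- 1.3 Supports the common comparison operators
< <= = >= >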
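Whether a b-tree index can serve a prefix match depends on the locale, so it may help to see the operator class spelled out. The following is a minimal sketch; the table t_prefix and the index name are hypothetical and not part of the original session.

```sql
-- Hypothetical table, for illustration only.
create table t_prefix (name text);
insert into t_prefix
select md5(i::text) from generate_series(1, 10000) as i;

-- text_pattern_ops compares strings character by character, which lets
-- LIKE 'abc%' and ~ '^abc' use the b-tree even when the locale is not C.
create index idx_t_prefix_name on t_prefix (name text_pattern_ops);
analyze t_prefix;

-- Inspect the plans; whether the index is chosen is up to the planner.
explain select * from t_prefix where name like 'abc%';
explain select * from t_prefix where name ~ '^abc';
```

An index built this way does not help ordinary <, <=, >, >= comparisons under a non-C locale, so a second, default-opclass index may still be needed for those.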
2. hash
- Supports only the = operator
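As a small illustration of the equality-only restriction, here is a sketch with a hypothetical table t_hash (an assumption, not taken from the original session):

```sql
-- Hypothetical table, for illustration only.
create table t_hash (info text);
insert into t_hash
select md5(i::text) from generate_series(1, 10000) as i;

create index idx_t_hash_info on t_hash using hash (info);
analyze t_hash;

-- Equality predicate: the hash index is a candidate.
explain select * from t_hash where info = md5('42');

-- Range predicate: a hash index cannot be used for this.
explain select * from t_hash where info > 'f';
```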
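3. gin
- Supports indexing of multi-valued columns, e.g. array types and full-text search types
<@
is contained by, e.g. array[2,3] <@ array[2,3,4]
@>
contains, e.g. array[1,2,3] @> array[2]
=
equal, e.g. array[1,2,3] = array[1,2,3]
&&
overlaps, e.g. array[1,2,3] && array[2]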
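The array operators above can be tried against an indexed column roughly as follows. This is a sketch only; the table t_tags and its contents are hypothetical.

```sql
-- Hypothetical table with an integer-array column.
create table t_tags (id int, tags int[]);
insert into t_tags
select i, array[i % 10, i % 7, i % 3]
from generate_series(1, 10000) as i;

-- The default GIN operator class for arrays handles <@, @>, = and &&.
create index idx_t_tags_tags on t_tags using gin (tags);
analyze t_tags;

-- Rows whose tags contain the element 2.
explain select * from t_tags where tags @> array[2];

-- Rows whose tags share at least one element with {8,9}.
explain select * from t_tags where tags && array[8, 9];
```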
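4. gist
- Not a single kind of index but an index framework that supports many different indexing strategies; custom operators can be defined
- Supports nearest-neighbor ordering, e.g. fetching the 10 rows nearest to a given point
select * from places order by location <-> point '(101,456)' limit 10;
<<
-- strictly left of, e.g. circle '((0,0),1)' << circle '((5,0),1)'
&<
-- the left object does not extend to the right of the right object, e.g. box '((0,0),(1,1))' &< box '((0,0),(2,2))'
&>
-- the left object does not extend to the left of the right object, e.g. box '((0,0),(3,3))' &> box '((0,0),(2,2))'
>>
-- strictly right of
<<|
-- strictly below
&<|
-- does not extend above
|&>
-- does not extend below
|>>
-- strictly above
@>
-- contains
<@
-- contained in
~=
-- same as
&&
-- overlaps
http://www.postgresql.org/docs/9.3/static/functions-geometry.html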
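The nearest-neighbor query above refers to a places table that the post never defines. A minimal sketch of what it could look like follows; the column names and the random data are assumptions.

```sql
-- Assumed definition of the places table used in the query above.
create table places (id int, location point);
insert into places
select i, point(random() * 1000, random() * 1000)
from generate_series(1, 10000) as i;

-- A GiST index on a point column supports ordering by the <-> distance.
create index idx_places_location on places using gist (location);
analyze places;

-- With the index in place the planner can walk it in distance order
-- instead of sorting all rows; check the plan to confirm.
explain
select * from places
order by location <-> point '(101,456)'
limit 10;
```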
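5. sp-gist
- Like gist, it is also an index framework; it supports disk-based, non-balanced data structures such as quadtrees, k-d trees and radix trees
- Supported operators
<<
>>
~=
<@
<^
is below, e.g. circle '((0,0),1)' <^ circle '((0,5),1)'
(the left circle is below the right circle)
>^
is above, e.g. circle '((0,5),1)' >^ circle '((0,0),1)'
(the left circle is above the right circle)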
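For a concrete starting point, the sketch below builds an SP-GiST index on a point column and uses two of the listed operators. The table t_points is hypothetical.

```sql
-- Hypothetical table of random points.
create table t_points (p point);
insert into t_points
select point(random() * 100, random() * 100)
from generate_series(1, 10000);

-- The default SP-GiST operator class for point is a quadtree.
create index idx_t_points_p on t_points using spgist (p);
analyze t_points;

-- Points contained in a box.
explain select * from t_points where p <@ box '((0,0),(10,10))';

-- Points below a given point.
explain select * from t_points where p <^ point '(50,5)';
```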
II. Benefits of using indexes
1. Using an index for ordering reduces CPU overhead
- 1.1 The query condition is on the indexed column
postgres=# \c db1
You are now connected to database "db1" as user "yzw".
db1=# create table test(id int,info text,crt_time timestamp);
CREATE TABLE
db1=# insert into test select generate_series(1,10000), md5(random()::text),clock_timestamp();
INSERT 0 10000
db1=# create index idx_test_1 on test(id);
CREATE INDEX
db1=# explain analyze select * from test where id<100 order by id;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Sort (cost=396.80..405.13 rows=3333 width=44) (actual time=0.106..0.111 rows=99 loops=1)
Sort Key: id
Sort Method: quicksort Memory: 32kB
-> Bitmap Heap Scan on test (cost=66.12..201.78 rows=3333 width=44) (actual time=0.050..0.059 rows=99 loops=1)
Recheck Cond: (id < 100)
Heap Blocks: exact=1
-> Bitmap Index Scan on idx_test_1 (cost=0.00..65.28 rows=3333 width=0) (actual time=0.036..0.036 rows=99 loops=1)
Index Cond: (id < 100)
Planning time: 0.520 ms
Execution time: 0.178 ms
(10 rows)
- 1.2 The query condition is not on the indexed column
db1=# explain analyze select * from test where info='c969799412fed1c8f91eff5e65353a85' order by id;
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Sort (cost=219.01..219.01 rows=1 width=45) (actual time=1.112..1.112 rows=1 loops=1)
Sort Key: id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on test (cost=0.00..219.00 rows=1 width=45) (actual time=0.011..1.104 rows=1 loops=1)
Filter: (info = 'c969799412fed1c8f91eff5e65353a85'::text)
Rows Removed by Filter: 9999
Planning time: 0.081 ms
Execution time: 1.129 ms
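(8 rows)
> Why do both plans contain a Sort node (Sort Key)? In 1.1 the Bitmap Heap Scan returns rows in heap order rather than index order, and in 1.2 the Seq Scan does not use the index at all, so both need an explicit sort.
# After setting enable_seqscan=off, the query on the indexed column no longer has a Sort node. Note that the row estimate also changed from 3333 to 100, which suggests the table statistics were refreshed in between; that alone may explain the switch to a plain Index Scan, whose output is already ordered by id.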
db1=# set enable_seqscan=off;
SET
db1=# explain analyze select * from test where id<100 order by id;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Index Scan using idx_test_1 on test (cost=0.29..10.04 rows=100 width=45) (actual time=0.005..0.016 rows=99 loops=1)
Index Cond: (id < 100)
Planning time: 0.119 ms
Execution time: 0.034 ms
(4 rows)
enable_seqscan defaults to on in both 9.3 and 9.4.
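One way to check whether the Sort node disappears because of the planner setting or because of the row estimate is to refresh the statistics explicitly and re-run the query with default settings. A sketch against the test table created above; the resulting plans are not asserted here because they depend on the server and its data.

```sql
-- Back to the default planner setting.
reset enable_seqscan;

-- Collect statistics so the planner no longer falls back on default estimates.
analyze test;

-- Ordering by the indexed column: an Index Scan can return rows already sorted.
explain analyze select * from test where id < 100 order by id;

-- Ordering by a column without an index: a Sort node is expected to remain.
explain analyze select * from test where id < 100 order by crt_time;
```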
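2. Speeding up conditional queries, deletes and updates
- 2.1 With sequential scans and index scans both enabled (the default), a lookup on an indexed column uses the index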
db1=# set enable_seqscan=on;
SET
db1=# explain analyze select * from test where id=1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------
Index Scan using idx_test_1 on test (cost=0.29..8.30 rows=1 width=45) (actual time=0.014..0.015 rows=1 loops=1)
Index Cond: (id = 1)
Planning time: 0.067 ms
Execution time: 0.032 ms
(4 rows)
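- 2.2 Query efficiency when the index cannot be used: with index scans and bitmap scans disabled, even an indexed column is read with a sequential scan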
db1=# show enable_indexscan;
enable_indexscan
------------------
on
(1 row)
db1=# show enable_bitmapscan;
enable_bitmapscan
-------------------
on
(1 row)
db1=# set enable_indexscan=off;set enable_bitmapscan=off;
SET
SET
db1=# show enable_indexscan;show enable_bitmapscan;
enable_indexscan
------------------
off
(1 row)
enable_bitmapscan
-------------------
off
(1 row)
# With index access disabled, the plan becomes a sequential scan
db1=# explain analyze select * from test where id=1;
QUERY PLAN
-------------------------------------------------------------------------------------------------
Seq Scan on test (cost=0.00..219.00 rows=1 width=45) (actual time=0.012..0.943 rows=1 loops=1)
Filter: (id = 1)
Rows Removed by Filter: 9999
Planning time: 0.138 ms
Execution time: 0.971 ms
(5 rows)
- 2.3 Speeding up join operations
db1=# set enable_indexscan=on;set enable_bitmapscan=on;
SET
SET
create table test1(id int,info text,crt_time timestamp);
db1=# insert into test1 select generate_series(1,10000), md5(random()::text),clock_timestamp();
INSERT 0 10000
test1 has no index, so it is read with a sequential scan; test uses the index on id; the two are joined with a nested loop
db1=# explain analyze select t1.*,t2.* from test t1 join test1 t2 on(t1.id=t2.id and t2.id=1);
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.29..227.31 rows=1 width=90) (actual time=0.032..0.896 rows=1 loops=1)
-> Index Scan using idx_test_1 on test t1 (cost=0.29..8.30 rows=1 width=45) (actual time=0.019..0.020 rows=1 loops=1)
Index Cond: (id = 1)
-> Seq Scan on test1 t2 (cost=0.00..219.00 rows=1 width=45) (actual time=0.010..0.873 rows=1 loops=1)
Filter: (id = 1)
Rows Removed by Filter: 9999
Planning time: 0.124 ms
Execution time: 0.927 ms
(8 rows)
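After adding an index on test1, it is also read through an index scan. The lookup now touches only a few index and heap pages (which are likely cached in memory) instead of filtering all 10000 rows, so the query is much faster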
db1=# create index idx_test1_id on test1(id);
CREATE INDEX
db1=# explain analyze select t1.*,t2.* from test t1 join test1 t2 on(t1.id=t2.id and t2.id=1);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.57..16.62 rows=1 width=90) (actual time=0.033..0.034 rows=1 loops=1)
-> Index Scan using idx_test_1 on test t1 (cost=0.29..8.30 rows=1 width=45) (actual time=0.011..0.012 rows=1 loops=1)
Index Cond: (id = 1)
-> Index Scan using idx_test1_id on test1 t2 (cost=0.29..8.30 rows=1 width=45) (actual time=0.020..0.020 rows=1 loops=1)
Index Cond: (id = 1)
Planning time: 0.240 ms
Execution time: 0.059 ms
(7 rows)
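Merge join: both tables are sorted on the join column and then joined; an index can supply the sorted input. Generally speaking, where a merge join can be used, a hash join is often faster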
db1=# show enable_hashjoin;
enable_hashjoin
-----------------
on
(1 row)
db1=# show enable_mergejoin;
enable_mergejoin
------------------
on
(1 row)
# Disable hash join
set enable_hashjoin=off;
db1=# explain analyze select t1.*,t2.* from test t1 join test1 t2 on t1.id=t2.id;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=0.57..884.57 rows=10000 width=90) (actual time=0.020..10.837 rows=10000 loops=1)
Merge Cond: (t1.id = t2.id)
-> Index Scan using idx_test_1 on test t1 (cost=0.29..367.29 rows=10000 width=45) (actual time=0.006..2.453 rows=10000 loops=1)
-> Index Scan using idx_test1_id on test1 t2 (cost=0.29..367.29 rows=10000 width=45) (actual time=0.006..3.625 rows=10000 loops=1)
Planning time: 0.309 ms
Execution time: 11.304 ms
(6 rows)
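# Without index access the estimated cost is the highest (1916.77 versus 884.57): sequential scan first, then sort, then join. On this small, cached data set the measured time (7.614 ms) is nevertheless no worse than the index-based merge join (11.304 ms)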
db1=# explain analyze select t1.*,t2.* from test t1 join test1 t2 on t1.id=t2.id;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=1716.77..1916.77 rows=10000 width=90) (actual time=3.090..7.286 rows=10000 loops=1)
Merge Cond: (t1.id = t2.id)
-> Sort (cost=858.39..883.39 rows=10000 width=45) (actual time=1.571..2.007 rows=10000 loops=1)
Sort Key: t1.id
Sort Method: quicksort Memory: 1166kB
-> Seq Scan on test t1 (cost=0.00..194.00 rows=10000 width=45) (actual time=0.005..0.789 rows=10000 loops=1)
-> Sort (cost=858.39..883.39 rows=10000 width=45) (actual time=1.514..2.039 rows=10000 loops=1)
Sort Key: t2.id
Sort Method: quicksort Memory: 1166kB
-> Seq Scan on test1 t2 (cost=0.00..194.00 rows=10000 width=45) (actual time=0.003..0.748 rows=10000 loops=1)
Planning time: 0.171 ms
Execution time: 7.614 ms
(12 rows)
# With everything re-enabled, the planner picks a hash join on its own
db1=# set enable_hashjoin=on;set enable_indexscan=on;set enable_bitmapscan=on;
SET
db1=# explain analyze select t1.*,t2.* from test t1 join test1 t2 on t1.id=t2.id;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=319.00..763.00 rows=10000 width=90) (actual time=2.208..7.150 rows=10000 loops=1)
Hash Cond: (t1.id = t2.id)
-> Seq Scan on test t1 (cost=0.00..194.00 rows=10000 width=45) (actual time=0.005..0.966 rows=10000 loops=1)
-> Hash (cost=194.00..194.00 rows=10000 width=45) (actual time=2.160..2.160 rows=10000 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 782kB
-> Seq Scan on test1 t2 (cost=0.00..194.00 rows=10000 width=45) (actual time=0.003..0.959 rows=10000 loops=1)
Planning time: 0.211 ms
Execution time: 7.502 ms
(8 rows)
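Toggling planner settings at the session level, as done above, makes it easy to forget to switch them back. A hedged alternative is to scope the change to a transaction with SET LOCAL; the sketch below reuses the test and test1 tables from this section.

```sql
begin;

-- Applies only until the end of this transaction.
set local enable_hashjoin = off;

explain
select t1.*, t2.*
from test t1
join test1 t2 on t1.id = t2.id;

rollback;

-- The setting is back to its previous value here.
show enable_hashjoin;
```

These enable_* switches are meant for experiments like this one; they only discourage a plan type and are not a way to tune production queries.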
3. Speeding up updates and deletes that involve foreign key constraints
create table p(id int primary key, info text, crt_time timestamp);
create table f(id int primary key, p_id int references p(id) on delete cascade on update cascade, info text, crt_time timestamp);
insert into p select generate_series(1,10000), md5(random()::text), clock_timestamp();
insert into f select generate_series(1,10000), generate_series(1,10000), md5(random()::text), clock_timestamp();
Without an index on column p_id of table f:
db1=# explain (analyze,verbose,costs,buffers,timing) update p set id=1 where id=0;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Update on public.p (cost=0.29..8.30 rows=1 width=47) (actual time=0.053..0.053 rows=0 loops=1)
Buffers: shared hit=7
-> Index Scan using p_pkey on public.p (cost=0.29..8.30 rows=1 width=47) (actual time=0.019..0.019 rows=1 loops=1)
Output: 1, info, crt_time, ctid
Index Cond: (p.id = 0)
Buffers: shared hit=3
Planning time: 0.068 ms
Trigger RI_ConstraintTrigger_a_16424 for constraint f_p_id_fkey on p: time=1.225 calls=1 # the trigger on p is slow: it must search f for referencing rows
Trigger RI_ConstraintTrigger_c_16426 for constraint f_p_id_fkey on f: time=0.068 calls=1
Execution time: 1.377 ms
(10 rows)
After adding an index on f(p_id):
create index idx_f_1 on f(p_id);
db1=# explain (analyze,verbose,costs,buffers,timing) update p set id=0 where id=1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Update on public.p (cost=0.29..8.30 rows=1 width=47) (actual time=0.055..0.055 rows=0 loops=1)
Buffers: shared hit=7
-> Index Scan using p_pkey on public.p (cost=0.29..8.30 rows=1 width=47) (actual time=0.022..0.023 rows=1 loops=1)
Output: 0, info, crt_time, ctid
Index Cond: (p.id = 1)
Buffers: shared hit=3
Planning time: 0.079 ms
Trigger RI_ConstraintTrigger_a_16424 for constraint f_p_id_fkey on p: time=0.132 calls=1 # the trigger on p is now fast
Trigger RI_ConstraintTrigger_c_16426 for constraint f_p_id_fkey on f: time=0.085 calls=1
Execution time: 0.307 ms
(10 rows)
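The cascade trigger on p has to find the rows in f that reference the changed key, which is roughly a lookup on f.p_id. The sketch below approximates that lookup with an ordinary query (an assumption about what the trigger does internally, not its exact statement) so the effect of the index can be seen directly.

```sql
-- Indexes currently defined on f; the primary key alone does not cover p_id.
select indexname, indexdef
from pg_indexes
where tablename = 'f';

-- Approximation of the lookup the cascade has to perform.
-- Without an index on f(p_id) this is expected to be a sequential scan;
-- with idx_f_1 in place an index scan becomes possible.
explain select * from f where p_id = 1;
```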
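4. Indexes in exclusion constraints
- The operator must be commutative: swapping the left and right operands must not change the result, e.g. x=y and y=x are both true or both unknown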
db1=# CREATE TABLE test2(id int,geo point,EXCLUDE USING btree (id WITH pg_catalog.=));
CREATE TABLE
db1=# insert into test2 (id) values (1);
INSERT 0 1
db1=# insert into test2 (id) values (1);
ERROR: conflicting key value violates exclusion constraint "test2_id_excl"
DETAIL: Key (id)=(1) conflicts with existing key (id)=(1).
> This emulates a UNIQUE constraint
5. Speeding up unique constraints and exclusion constraints
- Primary key
- Unique key
CREATE TABLE test3(id int,geo point,EXCLUDE USING spGIST (geo WITH pg_catalog.~=));
db1=# select * from pg_indexes where tablename='test3';
schemaname | tablename | indexname | tablespace | indexdef
------------+-----------+----------------+------------+---------------------------------------------------------
public | test3 | test3_geo_excl | | CREATE INDEX test3_geo_excl ON test3 USING spgist (geo)
(1 row)
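Exclusion constraints become more interesting with operators other than equality. The sketch below, modeled on the circle example in the PostgreSQL documentation, rejects overlapping circles; the table name circles is illustrative only.

```sql
-- No two rows may have overlapping circles; the constraint is enforced
-- through a GiST index that the system creates automatically.
create table circles (
    c circle,
    exclude using gist (c with &&)
);

insert into circles values ('<(0,0),1>');
insert into circles values ('<(5,5),1>');    -- does not overlap the first
insert into circles values ('<(0.5,0),1>');  -- overlaps the first; expected to be rejected

-- The backing index shows up like any other index.
select indexname, indexdef from pg_indexes where tablename = 'circles';
```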
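III. Drawbacks of indexes
- Indexes have to be updated as rows are written and moved between table blocks, which adds overhead to those operations
- When an updated row stays in the same block and no indexed column changes, the HOT (heap-only tuple) feature applies and the index does not need to be updated
- In write-heavy, read-light workloads the drawbacks of an index may outweigh its benefits
IV. Notes
- 1. A normal CREATE INDEX blocks every operation on the table other than queries
- 2. With the CONCURRENTLY option, DML on the table is allowed while the index is being built, but on a table with frequent DML such a build can take a very long time
- 3. Some indexes are not WAL-logged (hash indexes in this version; they became WAL-logged in PostgreSQL 10), so wherever data is recovered from WAL, such as crash recovery, streaming replication or a warm standby, these indexes must be rebuilt before use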
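How often updates actually avoid index maintenance can be read from the statistics views. A small sketch, using the test table from earlier as the example relation:

```sql
-- n_tup_upd counts all row updates; n_tup_hot_upd counts the subset
-- that were HOT updates and therefore did not touch any index.
select relname,
       n_tup_upd,
       n_tup_hot_upd
from pg_stat_user_tables
where relname = 'test';
```

A low share of HOT updates on a heavily updated table is one hint that an index on a frequently changed column may be costing more than it returns.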
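The points above can be sketched as follows. The index names are hypothetical; test is the table from section II. CREATE INDEX CONCURRENTLY cannot run inside a transaction block, so run these statements one by one.

```sql
-- Build an index without blocking inserts, updates and deletes on test.
create index concurrently idx_test_crt_time on test (crt_time);

-- If a concurrent build fails it leaves an invalid index behind;
-- this lists any such indexes so they can be dropped and rebuilt.
select indexrelid::regclass as index_name
from pg_index
where not indisvalid;

-- A hash index, and how to rebuild it after recovery from WAL
-- on versions where hash indexes are not WAL-logged.
create index idx_test_info_hash on test using hash (info);
reindex index idx_test_info_hash;
```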