Example execution plan:

postgres=# explain select * from tbl where id =1 or id=2 or id=3;
                                  QUERY PLAN
----------------------------------------------------------------------------------
 Bitmap Heap Scan on tbl  (cost=124.97..354.97 rows=10000 width=12)
   Recheck Cond: ((id = 1) OR (id = 2) OR (id = 3))
   ->  BitmapOr  (cost=124.97..124.97 rows=10000 width=0)
         ->  Bitmap Index Scan on idx_tbl  (cost=0.00..114.28 rows=10000 width=0)
               Index Cond: (id = 1)
         ->  Bitmap Index Scan on idx_tbl  (cost=0.00..1.59 rows=1 width=0)
               Index Cond: (id = 2)
         ->  Bitmap Index Scan on idx_tbl  (cost=0.00..1.59 rows=1 width=0)
               Index Cond: (id = 3)
(9 rows)

> what does "Bitmap Heap Scan" phase do?

A plain indexscan fetches one tuple-pointer at a time from the index,
and immediately visits that tuple in the table. A bitmap scan fetches
all the tuple-pointers from the index in one go, sorts them using an
in-memory "bitmap" data structure, and then visits the table tuples in
physical tuple-location order. The bitmap scan improves locality of
reference to the table at the cost of more bookkeeping overhead to
manage the "bitmap" data structure --- and at the cost that the data
is no longer retrieved in index order, which doesn't matter for your
query but would matter if you said ORDER BY.
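
The difference in visiting order can be sketched with a toy model. This is only an illustration, not PostgreSQL code: the index entries, the (page, offset) pairs, and the helper names are all made up, and a Python set stands in for the in-memory "bitmap".

```python
# Toy model: a TID (tuple pointer) is a (page_number, offset_in_page) pair.
# index_entries is what a hypothetical index returns, in index (key) order.
index_entries = [
    (10, (7, 2)),   # (key, tid)
    (11, (1, 5)),
    (12, (7, 1)),
    (13, (3, 4)),
    (14, (1, 2)),
]

def plain_index_scan(entries):
    """Visit heap tuples one at a time, in index order."""
    return [tid for _key, tid in entries]

def bitmap_scan(entries):
    """Collect every TID first, then visit them in physical order."""
    tids = {tid for _key, tid in entries}   # stand-in for the "bitmap"
    return sorted(tids)                     # page number first, then offset

def page_switches(visits):
    """Count how often consecutive visits land on a different page."""
    pages = [page for page, _offset in visits]
    return sum(1 for a, b in zip(pages, pages[1:]) if a != b)

index_order = plain_index_scan(index_entries)
physical_order = bitmap_scan(index_entries)
```

In this toy data the index-order visit jumps between pages on every step, while the physical-order visit touches each page once. The price is that the keys no longer come back in index order, which is why an ORDER BY would need a separate sort.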

> what is "Recheck condition" and why is it needed?

If the bitmap gets too large we convert it to "lossy" style, in which we
only remember which pages contain matching tuples instead of remembering
each tuple individually. When that happens, the table-visiting phase
has to examine each tuple on the page and recheck the scan condition to
see which tuples to return.
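
A rough sketch of why the recheck is needed in the lossy case, using a made-up three-page heap. The data layout and function names are assumptions for illustration only; the condition is the one from the example plan.

```python
# Toy heap: page number -> list of (offset, id_value). All values are made up.
heap = {
    0: [(1, 1), (2, 7), (3, 2)],
    1: [(1, 9), (2, 8)],
    2: [(1, 3), (2, 4)],
}

def condition(id_value):
    """The scan condition from the example plan: id = 1 OR id = 2 OR id = 3."""
    return id_value in (1, 2, 3)

# Exact bitmap: remembers each matching tuple individually.
exact = {(page, off) for page, rows in heap.items()
         for off, value in rows if condition(value)}

# Lossy bitmap: remembers only which pages contain at least one match.
lossy = {page for page, _off in exact}

def heap_scan_exact(tids):
    """Fetch precisely the remembered tuples."""
    lookup = {(p, o): v for p, rows in heap.items() for o, v in rows}
    return [lookup[tid] for tid in sorted(tids)]

def heap_scan_lossy(pages):
    """Examine every tuple on each remembered page and recheck the condition."""
    return [value for page in sorted(pages)
            for _off, value in heap[page] if condition(value)]
```

Both scans return the same rows; the lossy one simply has to look at the non-matching tuples on pages 0 and 2 and throw them away, and page 1 is skipped in either case.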

> I thought "Bitmap Index Scan" was only used when there are two or more applicable indexes in the plan, so I don't understand why is it used now?

True, we can combine multiple bitmaps via AND/OR operations to merge
results from multiple indexes before visiting the table ... but it's
still potentially worthwhile even for one index. A rule of thumb is
that plain indexscan wins for fetching a small number of tuples, bitmap
scan wins for a somewhat larger number of tuples, and seqscan wins if
you're fetching a large percentage of the whole table.

Related Stack Overflow questions

Question

How does PostgreSQL know, from just a bitmap, anything about the rows' physical order?
Answer

The bitmap is one bit per heap page. The bitmap index scan sets the bits based on the heap page address that the index entry points to.

So when it goes to do the bitmap heap scan, it just does a linear table scan, reading the bitmap to see whether it should bother with a particular page or seek over it.
Question

Or does it generate the bitmap so that any element of it can easily be mapped to a pointer to a page?
Answer

No, the bitmap corresponds 1:1 to heap pages.

I wrote some more on this here.

OK, it looks like you might be misunderstanding what "bitmap" means in this context.

It's not a bit string like "101011" that's created for each heap page, or each index read, or whatever.

The whole bitmap is a single bit array, with as many bits as there are heap pages in the relation being scanned.

One bitmap is created by the first index scan, starting off with all entries 0 (false). Whenever an index entry that matches the search condition is found, the heap address pointed to by that index entry is looked up as an offset into the bitmap, and that bit is set to 1 (true). So rather than looking up the heap page directly, the bitmap index scan looks up the corresponding bit position in the bitmap.

The second and further bitmap index scans do the same thing with the other indexes and the search conditions on them.

Then each bitmap is ANDed together. The resulting bitmap has one bit for each heap page, where the bits are true only if they were true in all the individual bitmap index scans, i.e. the search condition matched for every index scan. These are the only heap pages we need to bother to load and examine. Since each heap page might contain multiple rows, we then have to examine each row to see if it matches all the conditions - that's what the "recheck cond" part is about.

One crucial thing to understand with all this is that the tuple address in an index entry points to the row's ctid, which is a combination of the heap page number and the offset within the heap page. A bitmap index scan ignores the offsets, since it'll check the whole page anyway, and sets the bit if any row on that page matches the condition.
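
A minimal sketch of that step, following the answer's simplified one-bit-per-page model (the mailing-list reply quoted earlier describes the bitmap as dropping to page granularity only when it becomes "lossy"). The ctids and the function name are hypothetical.

```python
def build_page_bitmap(ctids, n_pages):
    """Set one bit per heap page that holds at least one matching row."""
    bitmap = [0] * n_pages          # starts off with all entries 0 (false)
    for page, _offset in ctids:     # the offset within the page is ignored
        bitmap[page] = 1
    return bitmap

# Hypothetical ctids (page number, offset) reported by one index scan.
matching_ctids = [(0, 3), (4, 1), (4, 7), (2, 2)]
bitmap = build_page_bitmap(matching_ctids, n_pages=6)
```

Two matches on page 4 still produce a single set bit, which is exactly why the heap scan later has to look at the rows on that page again.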

Graphical example

Heap, one square = one page:
+---------------------------------------------+
|c____u_____X___u___X_________u___cXcc______u_|
+---------------------------------------------+
Rows marked c match customers pkey condition.
Rows marked u match username condition.
Rows marked X match both conditions.

Bitmap scan from customers_pkey:
+---------------------------------------------+
|100000000001000000010000000000000111100000000| bitmap 1
+---------------------------------------------+
One bit per heap page, in the same order as the heap
Bits 1 when condition matches, 0 if not

Bitmap scan from ix_cust_username:
+---------------------------------------------+
|000001000001000100010000000001000010000000010| bitmap 2
+---------------------------------------------+

Once the bitmaps are created a bitwise AND is performed on them:

+---------------------------------------------+
|100000000001000000010000000000000111100000000| bitmap 1
|000001000001000100010000000001000010000000010| bitmap 2
 &&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&
|000000000001000000010000000000000010000000000| Combined bitmap
+-----------+-------+--------------+----------+
            |       |              |
            v       v              v
Used to scan the heap only for matching pages:
+---------------------------------------------+
|___________X_______X______________X__________|
+---------------------------------------------+

The bitmap heap scan then seeks to the start of each page and reads the page:

+---------------------------------------------+
|___________X_______X______________X__________|
+---------------------------------------------+
seek------->^seek-->^seek--------->^
            |       |              |
            ------------------------
            only these pages read

and each read page is then re-checked against the condition since there can be >1 row per page and not all necessarily match the condition.
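
The AND step from the diagram can be reproduced in a few lines. The two bit strings are copied from the diagram; representing them as Python strings is purely for illustration.

```python
bitmap1 = "100000000001000000010000000000000111100000000"  # customers_pkey
bitmap2 = "000001000001000100010000000001000010000000010"  # ix_cust_username

# Bitwise AND, position by position: a page survives only if both scans set it.
combined = "".join(
    "1" if a == "1" and b == "1" else "0"
    for a, b in zip(bitmap1, bitmap2)
)

# Zero-based page numbers the bitmap heap scan still has to read.
pages_to_read = [i for i, bit in enumerate(combined) if bit == "1"]
```

Only three of the 45 pages remain, and they come out in ascending page order, matching the three X marks in the heap diagram.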


The bitmap of pages is created dynamically for each query. It is not cached or re-used, and is discarded once the bitmap heap scan that consumes it has finished.

It doesn't make sense to create the page bitmap in advance because its contents depend on the query predicates.

Say you're searching for x=1 and y=2. You have b-tree indexes on x and y. PostgreSQL doesn't combine x and y into a bitmap then search the bitmap. It scans index x for the page address of all pages with x=1 and makes a bitmap where the pages that might contain x=1 are true. Then it scans y looking for the page addresses where y might equal 2, making a bitmap from that. Then it ANDs them to find pages where both x=1 and y=2 might be true. Finally, it scans the table itself, reading only the pages that might contain candidate values, reading each page and keeping only the rows where x=1 and y=2.

Now, if you're looking for something like a cached, pre-built bitmap index, there is such a thing in PostgreSQL 9.5: BRIN indexes. These are intended for very large tables, and provide a way to find ranges of the table that can be skipped over because they're known not to contain a desired value.


To combine multiple indexes, the system scans each needed index and prepares a bitmap in memory giving the locations of table rows that are reported as matching that index's conditions. The bitmaps are then ANDed and ORed together as needed by the query. Finally, the actual table rows are visited and returned. The table rows are visited in physical order, because that is how the bitmap is laid out; this means that any ordering of the original indexes is lost, and so a separate sort step will be needed if the query has an ORDER BY clause. For this reason, and because each additional index scan adds extra time, the planner will sometimes choose to use a simple index scan even though additional indexes are available that could have been used as well.
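
The example plan at the top of this post uses the OR form (BitmapOr over three scans of idx_tbl). A small sketch of that combination, using a Python integer as the bit array; the page numbers are invented and the helper names are hypothetical.

```python
def to_mask(pages):
    """Pack a collection of page numbers into an integer used as a bit array."""
    mask = 0
    for page in pages:
        mask |= 1 << page
    return mask

def to_pages(mask):
    """Unpack an integer bit array into an ascending list of page numbers."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]

# Hypothetical pages reported by the three index scans of the example plan.
id_eq_1 = to_mask([4, 0])
id_eq_2 = to_mask([2])
id_eq_3 = to_mask([4, 9])

# BitmapOr: a page is visited if any of the three conditions matched on it.
bitmap_or = id_eq_1 | id_eq_2 | id_eq_3
visit_order = to_pages(bitmap_or)
```

The pages come out in ascending order no matter which scan found them, which illustrates why the original index ordering is lost and a separate sort is needed for ORDER BY.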

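Note:

I have a question about the diagram in the Stack Overflow section above. Heap pages that satisfy a WHERE condition are marked 1 and those that do not are marked 0, and the bitmaps are then ANDed, so the result marks the heap pages that contain matches for all of the conditions at once. But the goal of my query is not to find records that sit in the same heap page; it is to find the records that satisfy the WHERE condition, and such records do not have to be in the same heap page. If anyone who understands this could explain it, I would appreciate it.

References: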

https://www.postgresql.org/message-id/12553.1135634231@sss.pgh.pa.us

https://dba.stackexchange.com/questions/119386/understanding-bitmap-heap-scan-and-bitmap-index-scan

https://stackoverflow.com/questions/33100637/undestanding-bitmap-indexes-in-postgresql

https://yq.aliyun.com/articles/70462

https://www.postgresql.org/docs/9.6/static/indexes-bitmap-scans.html
