解析HOT原理

2020-06-09 19:31:01

一、疑问

　　前段时间；QQ群里有人对“这个表（0,4）这行数据我做了update操作，查看索引的page数据，看到索引一直指向（0,4），用ctid='(0,4)'查询业务表是查不到数据的；然后我做了表的vacuum，reindex甚至drop/create index，还是这样的”感到疑惑。

　　在PostgreSQL8.3实现了（heap only tuple）HOT特性。它存在的目的就是消除表非索引列更新对索引影响。但是它如何工作的呢？

二、解析

　　我们来模拟环境

postgres=# create table tbl_hot(id int primary key, info text);

CREATE TABLE

postgres=# insert into tbl_hot select generate_series(1, 4), 'lottu';

INSERT 0 4

postgres=# select ctid ,t.* from tbl_hot t;

 ctid  | id | info

-------+----+-------

 (0,1) |  1 | lottu

 (0,2) |  2 | lottu

 (0,3) |  3 | lottu

 (0,4) |  4 | lottu

(4 rows)

postgres=# \d tbl_hot

              Table "public.tbl_hot"

 Column |  Type   | Collation | Nullable | Default

--------+---------+-----------+----------+---------

 id     | integer |           | not null |

 info   | text    |           |          |

Indexes:

    "tbl_hot_pkey" PRIMARY KEY, btree (id)

我们创建表tbl_hot；并插入4条记录。这是我们更新（0,4）这条记录。如下

postgres=# update tbl_hot set info = 'rax' where id = 4;

UPDATE 1

postgres=# select ctid ,t.* from tbl_hot t;

 ctid  | id | info

-------+----+-------

 (0,1) |  1 | lottu

 (0,2) |  2 | lottu

 (0,3) |  3 | lottu

 (0,5) |  4 | rax

(4 rows)

更新之后我们看下索引有变化没？

postgres=# select * from bt_page_items('tbl_hot_pkey', 1);

 itemoffset | ctid  | itemlen | nulls | vars |          data

------------+-------+---------+-------+------+-------------------------

          1 | (0,1) |      16 | f     | f    | 01 00 00 00 00 00 00 00

          2 | (0,2) |      16 | f     | f    | 02 00 00 00 00 00 00 00

          3 | (0,3) |      16 | f     | f    | 03 00 00 00 00 00 00 00

          4 | (0,4) |      16 | f     | f    | 04 00 00 00 00 00 00 00

(4 rows)

bt_page_items函数是用来：返回关于B-树索引页面上所有项的详细信息，在B树叶子页面中，ctid指向一个堆元组。在内部页面中，ctid的块编号部分指向索引本身中的另一个页面。

　　我们可以看出索引没变化。索引存放是表数据的ctid+索引值。使用索引可以快速找到对应记录的ctid。现在记录id=4 索引的ctid（0,4）跟表对应ctid(0,5)不一致。那是不是索引失效了。我们来测试下

postgres=# explain select id from tbl_hot where id = 4;

                                   QUERY PLAN

---------------------------------------------------------------------------------

 Index Only Scan using tbl_hot_pkey on tbl_hot  (cost=0.15..8.17 rows=1 width=4)

   Index Cond: (id = 4)

(2 rows)

　　索引没失效；那如何找到对应的记录呢？我们先来看下表存储的page情况

get_raw_page: 根据参数表明、数据文件类型（main、fsm、vm）以及page位置，将当前表文件中的page内容返回。还有一个函数于此同名，只有两个参数，是将第二个参数省略，直接使用'main'。

heap_page_items: 参数是函数get_raw_page的返回值，返回值是将page内的项指针(ItemIddata)以及HeapTupleHeaderData的详细信息。

其中理解下下面字段含义

lp：这是插件自己定义的列，在源码中其实没有，这个是项指针的顺序。

lp_off：tuple在page中的位置

lp_flags： 含义如下

#define LP_UNUSED       0       /* unused (should always have lp_len=0) */

#define LP_NORMAL       1       /* used (should always have lp_len>0) */

#define LP_REDIRECT     2       /* HOT redirect (should have lp_len=0) */

#define LP_DEAD         3       /* dead, may or may not have storage */

t_ctid： 这个是指物理ID

t_infomask2：表字段的个数以及一些flags；其中flag含义

#define HEAP_NATTS_MASK         0x07FF

             /* 11 bits for number of attributes *//* bits 0x1800 are available */

#define HEAP_KEYS_UPDATED       0x2000

          /* tuple was updated and key cols* modified, or tuple deleted */

#define HEAP_HOT_UPDATED        0x4000  /* tuple was HOT-updated */

#define HEAP_ONLY_TUPLE         0x8000  /* this is heap-only tuple */

#define HEAP2_XACT_MASK         0xE000  /* visibility-related bits */

postgres=# select * from heap_page_items(get_raw_page('tbl_hot', 0));

 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |         t_data

----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+------------------------

  1 |   8152 |        1 |     34 |    554 |      0 |        0 | (0,1)  |           2 |       2306 |     24 |        |       | \x010000000d6c6f747475

  2 |   8112 |        1 |     34 |    554 |      0 |        0 | (0,2)  |           2 |       2306 |     24 |        |       | \x020000000d6c6f747475

  3 |   8072 |        1 |     34 |    554 |      0 |        0 | (0,3)  |           2 |       2306 |     24 |        |       | \x030000000d6c6f747475

  4 |   8032 |        1 |     34 |    554 |    555 |        0 | (0,5)  |       16386 |       1282 |     24 |        |       | \x040000000d6c6f747475

  5 |   8000 |        1 |     32 |    555 |      0 |        0 | (0,5)  |       32770 |      10498 |     24 |        |       | \x0400000009726178

(5 rows)

我们来理下：我们通过条件id=4；如何找到对应的记录

找到指向目标数据tuple的索引tuple(0,4)
根据获取索引tuple的位置(0,4)；找到行指针lp为4的位置。即对应的ctid为(0,5)
根据ctid为(0,5)；我们可以找到两条tuple。根据PG的MVCC机制连判断哪条tuple可见
可以找到对应tuple

更新多次原理也差不多。

这个时候你会有一个疑问“执行vacuum；清理表tuple(0,4);少了步骤2；那上面的流程就走不通了”。我们来解析下：

postgres=# vacuum tbl_hot;

VACUUM

postgres=# select lp, lp_off, lp_flags, t_ctid, t_infomask2 from heap_page_items(get_raw_page('tbl_hot', 0));

 lp | lp_off | lp_flags | t_ctid | t_infomask2

----+--------+----------+--------+-------------

  1 |   8152 |        1 | (0,1)  |           2

  2 |   8112 |        1 | (0,2)  |           2

  3 |   8072 |        1 | (0,3)  |           2

  4 |      5 |        2 |        |

  5 |   8040 |        1 | (0,5)  |       32770

(5 rows)

这时；为了解决这个问题，postgresql会在合适的时候进行行指针的重定向（redirect）,这个过程称为修剪。现在按照这种情况我们来理下：我们通过条件id=4；如何找到对应的记录

找到指向目标数据tuple的索引tuple(0,4)
根据获取索引tuple的位置(0,4)；找到行指针lp为4的位置；这是lp_flags为2表示指针重定向lp为5；即行指针对应的位置是8040
通过指针可以找到对应tuple。

这是tuple(0,4)；既然vacuum；表示可以再使用；但是这是标记是LP_REDIRECT；表明tuple非dead tuple；未进行回收；不可以重复使用。这时你可能会有一个疑问“那什么时候可以回收？”；答案是这个tuple(0,4)不会标记dead tuple。但是执行vacuum;该page是可以回收空间；这个是PG的MVCC处理机制-vacuum的内容；可以分到下个篇幅再讲。这里我们可以简单演示下：

postgres=# update tbl_hot set info = 'postgres' where id = 4;

UPDATE 1

postgres=# select lp, lp_off, lp_flags, t_ctid, t_infomask2 from heap_page_items(get_raw_page('tbl_hot', 0));

 lp | lp_off | lp_flags | t_ctid | t_infomask2

----+--------+----------+--------+-------------

  1 |   8152 |        1 | (0,1)  |           2

  2 |   8112 |        1 | (0,2)  |           2

  3 |   8072 |        1 | (0,3)  |           2

  4 |      5 |        2 |        |

  5 |   8040 |        1 | (0,6)  |       49154

  6 |   8000 |        1 | (0,6)  |       32770

(6 rows)

postgres=# vacuum tbl_hot;

VACUUM

postgres=# select lp, lp_off, lp_flags, t_ctid, t_infomask2 from heap_page_items(get_raw_page('tbl_hot', 0));

 lp | lp_off | lp_flags | t_ctid | t_infomask2

----+--------+----------+--------+-------------

  1 |   8152 |        1 | (0,1)  |           2

  2 |   8112 |        1 | (0,2)  |           2

  3 |   8072 |        1 | (0,3)  |           2

  4 |      6 |        2 |        |

  5 |      0 |        0 |        |

  6 |   8032 |        1 | (0,6)  |       32770

(6 rows)

postgres=# select ctid,t.* from tbl_hot t;

 ctid  | id |   info

-------+----+----------

 (0,1) |  1 | lottu

 (0,2) |  2 | lottu

 (0,3) |  3 | lottu

 (0,5) |  5 | lottu

 (0,6) |  4 | postgres

(5 rows)

　　最后；当更新的元祖是在其他page；这是索引也会更新；这可以理解是行迁移。这在oracle也是存在这种情况。但是相比oracle更频繁；当然可以设置降低fillfactor；减少这种情况出现。

三、参考

https://blog.csdn.net/xiaohai928ww/article/details/98603707

https://www.postgresql.org/docs/12/pageinspect.html