Hierarchical data in postgres
https://coderwall.com/p/whf3-a/hierarchical-data-in-postgres
-------------------------------------------------------------------------------
This tip will try to answer the following questions:
- How can we represent a tree of data in postgres
- How can we efficiently query for any entire single node and all of it's children (and children's children).
The test data
Since we want to keep this simple we will assume our data is just a bunch of sections. A section just has a nameand each section has a single parent section.
Section A
|--- Section A.1
Section B
|--- Section B.1
|--- Section B.1
|--- Section B.1.1
We'll use this simple data for examples below.
Simple self-referencing
When designing a self-referential table (something that joins itself to itself) the most obvious choice is to have some kind of parent_id column on our table that references itself.
CREATE TABLE section (
id INTEGER PRIMARY KEY,
name TEXT,
parent_id INTEGER REFERENCES section,
);
ALTER TABLE page ADD COLUMN parent_id INTEGER REFERENCES page;
CREATE INDEX section_parent_id_idx ON section (parent_id);
Now insert our example data, using the parent_id to related the nodes together:
INSERT INTO section (id, name, parent_id) VALUES (1, 'Section A', NULL);
INSERT INTO section (id, name, parent_id) VALUES (2, 'Section A.1', 1);
INSERT INTO section (id, name, parent_id) VALUES (3, 'Section B', NULL);
INSERT INTO section (id, name, parent_id) VALUES (4, 'Section B.1', 3);
INSERT INTO section (id, name, parent_id) VALUES (5, 'Section B.2', 3);
INSERT INTO section (id, name, parent_id) VALUES (6, 'Section B.2.1', 5);
This works great for simple queries such as, fetch the direct children of Section B:
SELECT * FROM section WHERE parent = 3
but it will require complex or recursive queries for questions like fetch me all the children and children's children of Section B:
WITH RECURSIVE nodes(id,name,parent_id) AS (
SELECT s1.id, s1.name, s1.parent_id
FROM section s1 WHERE parent_id = 3
UNION
SELECT s2.id, s2.name, s2.parent_id
FROM section s2, nodes s1 WHERE s2.parent_id = s1.id
)
SELECT * FROM nodes;
So we have answered the "how to build a tree" part of the question, but are not happy with the "how to query for a node and all it's children" part.
Enter ltree. (Short for "label tree" - I think?).
The ltree extension
The ltree extension is a great choice for querying hierarchical data. This is especially true for self-referential relationships.
Lets rebuild the above example using ltree. We'll use the page's primary keys as the "labels" within our ltree paths and a special "root" label to denote the top of the tree.
CREATE EXTENSION ltree;
CREATE TABLE section (
id INTEGER PRIMARY KEY,
name TEXT,
parent_path LTREE
);
CREATE INDEX section_parent_path_idx ON section USING GIST (parent_path);
We'll add in our data again, this time rather than using the id for the parent, we will construct an ltree path that represents the parent node.
INSERT INTO section (id, name, parent_path) VALUES (1, 'Section 1', 'root');
INSERT INTO section (id, name, parent_path) VALUES (2, 'Section 1.1', 'root.1');
INSERT INTO section (id, name, parent_path) VALUES (3, 'Section 2', 'root');
INSERT INTO section (id, name, parent_path) VALUES (4, 'Section 2.1', 'root.3');
INSERT INTO section (id, name, parent_path) VALUES (4, 'Section 2.2', 'root.3');
INSERT INTO section (id, name, parent_path) VALUES (5, 'Section 2.2.1', 'root.3.4');
Cool. So now we can make use of ltree's operators @> and <@ to answer our original question like:
SELECT * FROM section WHERE parent_path <@ 'root.3';
However we have introduced a few issues.
- Our simple
parent_idversion ensured referential consistancy by making use of theREFERENCESconstraint. We lost that by switching to ltree paths. - Ensuring that the ltree paths are valid can be a bit of a pain, and if paths become stale for some reason your queries may return unexpected results or you may "orphan" nodes.
The final solution
To fix these issues we want a hybrid of our original parent_id (for the referential consistency and simplicity of the child/parent relationship) and our ltree paths (for improved querying power/indexing). To achieve this we will hide the management of the ltree path behind a trigger and only ever update the parent_id column.
CREATE EXTENSION ltree;
CREATE TABLE section (
id INTEGER PRIMARY KEY,
name TEXT,
parent_id INTEGER REFERENCES section,
parent_path LTREE
);
CREATE INDEX section_parent_path_idx ON section USING GIST (parent_path);
CREATE INDEX section_parent_id_idx ON section (parent_id);
CREATE OR REPLACE FUNCTION update_section_parent_path() RETURNS TRIGGER AS $$
DECLARE
path ltree;
BEGIN
IF NEW.parent_id IS NULL THEN
NEW.parent_path = 'root'::ltree;
ELSEIF TG_OP = 'INSERT' OR OLD.parent_id IS NULL OR OLD.parent_id != NEW.parent_id THEN
SELECT parent_path || id::text FROM section WHERE id = NEW.parent_id INTO path;
IF path IS NULL THEN
RAISE EXCEPTION 'Invalid parent_id %', NEW.parent_id;
END IF;
NEW.parent_path = path;
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER parent_path_tgr
BEFORE INSERT OR UPDATE ON section
FOR EACH ROW EXECUTE PROCEDURE update_section_parent_path();
Much better.
More
Written by Chris Farmiloe
Hierarchical data in postgres的更多相关文章
- asp.net Hierarchical Data
Introduction A Hierarchical Data is a data that is organized in a tree-like structure and structure ...
- mysql 树形数据,层级数据Managing Hierarchical Data in MySQL
原文:http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/ 引言 大多数用户都曾在数据库中处理过分层数据(hiera ...
- Managing Hierarchical Data in MySQL
Managing Hierarchical Data in MySQL Introduction Most users at one time or another have dealt with h ...
- Managing Hierarchical Data in MySQL(邻接表模型)[转载]
原文在:http://dev.mysql.com/tech-resources/articles/hierarchical-data.html 来源: http://www.cnblogs.com/p ...
- 云原生 PostgreSQL 集群 - PGO:来自 Crunchy Data 的 Postgres Operator
使用 PGO 在 Kubernetes 上运行 Cloud Native PostgreSQL:来自 Crunchy Data 的 Postgres Operator! Cloud Native Po ...
- [Postgres] Group and Aggregate Data in Postgres
How can we see a histogram of movies on IMDB with a particular rating? Or how much movies grossed at ...
- 《利用Python进行数据分析: Python for Data Analysis 》学习随笔
NoteBook of <Data Analysis with Python> 3.IPython基础 Tab自动补齐 变量名 变量方法 路径 解释 ?解释, ??显示函数源码 ?搜索命名 ...
- Following a Select Statement Through Postgres Internals
This is the third of a series of posts based on a presentation I did at the Barcelona Ruby Conferenc ...
- ZOJ 3826 Hierarchical Notation 模拟
模拟: 语法的分析 hash一切Key建设规划,对于记录在几个地点的每个节点原始的字符串开始输出. . .. 对每一个询问沿图走就能够了. .. . Hierarchical Notation Tim ...
随机推荐
- 《Scrum实战》第1次课课后任务
1.必做任务:从知行角度总结T平台 从知行角度总结T平台 头(知识,学习) 做得好的 宣贯会 引入敏捷思想 敏捷宣言 敏捷原则 质量风险前移原则 引入最佳实践 包括了XP的大部分实践 不足 项目管理框 ...
- day37-- &MySQL step1
m1.客户端与数据库服务器端是通过socket来交互数据,对数据库的理解:数据库就是一个文件夹,表就类比文件.m2.常用语句#查看数据库show databases:#创建数据库create data ...
- FCKeditor自定义编辑区CSS样式
在网站后台使用FCKeditor编辑器的时候,见到的效果可能并不完全是”所见即所得”的,因为如果在FCKeditor编辑区中使用了前台样式表中的样式,在编辑区中并不能把这些样式显示出来.解决这个问题的 ...
- 蓝桥杯Java输入输出相关
转载自:http://blog.csdn.net/Chen_Tongsheng/article/details/53354169 一.注意点 1. 类名称必须采用public class Main方式 ...
- webdriver高级应用- 修改Chrome设置伪装成手机M站
通过更改PC端Chrome浏览器的属性值,将PC端Chrome浏览器设定为手机端尺寸的浏览器,以便模拟手机端的浏览器,并完成各种页面操作. #encoding=utf-8from selenium i ...
- linux下文件显示被加锁如何解决?
1.很多时候从别的机器上拷贝过来的文件,没有权限打开,上面有一个小锁. 2.判断是权限没有,查询ls -al得知文件的的所有者,和所有者在的组都不是本机 3.使用chown改变用户的所有者和所有者所在 ...
- OA笔记
一:Asp.Net MVC请求处理原理(Asp.Net mvc 是怎样进入请求管道的.)请求-->IIS--->ISAPIRuntime-->HttpWorkRequest--> ...
- 【Luogu】P4234最小差值生成树(LCT)
题目链接 能把LCT打得每个函数都恰有一个错误也是挺令我惊讶的. 本题使用LCT维护生成树,具体做法是对原图中的每个边建一个点,然后连边的时候相当于是将边的起点跟“边”这个点连起来,边的终点也跟它连起 ...
- 【Luogu】P3786萃香抱西瓜(状压DP)
题目链接 水题,数据范围提示得太明显了吧,不用动脑子都能知道是状压. 不过还是有坑(当然更可能是我脑子有坑) f[i][j][k][l]表示当前是第i秒,萃香在(j,k),已经抱到的西瓜状态是l的最少 ...
- [luoguP2463] [SDOI2008]Sandy的卡片(后缀数组 + st表)
传送门 很容易想到,题目中的相同是指差分数组相同. 那么可以把差分数组连起来,中间加上一个没有出现过的且字典序小的数 双指针移动,用st表维护height数组中的最小值. 当然用单调队列应该也可以且更 ...