Hierarchical data in postgres
https://coderwall.com/p/whf3-a/hierarchical-data-in-postgres
-------------------------------------------------------------------------------
This tip will try to answer the following questions:
- How can we represent a tree of data in postgres?
- How can we efficiently query for a single node and all of its children (and children's children)?
The test data
Since we want to keep this simple, we will assume our data is just a bunch of sections. A section just has a name, and each section has a single parent section (or none, for top-level sections).
Section A
|--- Section A.1
Section B
|--- Section B.1
|--- Section B.2
     |--- Section B.2.1
We'll use this simple data for examples below.
Simple self-referencing
When designing a self-referential table (something that joins itself to itself) the most obvious choice is to have some kind of parent_id column on our table that references itself.
CREATE TABLE section (
    id INTEGER PRIMARY KEY,
    name TEXT,
    parent_id INTEGER REFERENCES section
);
CREATE INDEX section_parent_id_idx ON section (parent_id);
Now insert our example data, using the parent_id to relate the nodes together:
INSERT INTO section (id, name, parent_id) VALUES (1, 'Section A', NULL);
INSERT INTO section (id, name, parent_id) VALUES (2, 'Section A.1', 1);
INSERT INTO section (id, name, parent_id) VALUES (3, 'Section B', NULL);
INSERT INTO section (id, name, parent_id) VALUES (4, 'Section B.1', 3);
INSERT INTO section (id, name, parent_id) VALUES (5, 'Section B.2', 3);
INSERT INTO section (id, name, parent_id) VALUES (6, 'Section B.2.1', 5);
This works great for simple queries such as "fetch the direct children of Section B":
SELECT * FROM section WHERE parent_id = 3;
but it requires complex or recursive queries for questions like "fetch all the children and children's children of Section B":
WITH RECURSIVE nodes(id,name,parent_id) AS (
    SELECT s1.id, s1.name, s1.parent_id
    FROM section s1 WHERE parent_id = 3
    UNION
    SELECT s2.id, s2.name, s2.parent_id
    FROM section s2, nodes s1 WHERE s2.parent_id = s1.id
)
SELECT * FROM nodes;
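As a variation on the query above, the sketch below also returns Section B itself and tracks how deep each row sits below it. The `subtree` name and the `depth` column are illustrative additions, not part of the original tip; it assumes the section table and the six sample rows inserted earlier.

```sql
-- Assumes the section table and sample data defined above.
WITH RECURSIVE subtree(id, name, parent_id, depth) AS (
    -- Anchor: start at Section B itself (id = 3), at depth 0.
    SELECT id, name, parent_id, 0
    FROM section
    WHERE id = 3
    UNION ALL
    -- Recursive step: add the children of every row found so far.
    SELECT s.id, s.name, s.parent_id, t.depth + 1
    FROM section s
    JOIN subtree t ON s.parent_id = t.id
)
SELECT id, name, depth FROM subtree ORDER BY depth, id;
-- With the sample data this returns:
--  3 | Section B     | 0
--  4 | Section B.1   | 1
--  5 | Section B.2   | 1
--  6 | Section B.2.1 | 2
```

UNION ALL is enough here because a tree contains no cycles; if the data could ever contain a loop, UNION (as in the query above) stops the recursion from revisiting identical rows.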
So we have answered the "how to build a tree" part of the question, but are not happy with the "how to query for a node and all its children" part.
Enter ltree. (Short for "label tree" - I think?).
The ltree extension
The ltree extension is a great choice for querying hierarchical data. This is especially true for self-referential relationships.
Let's rebuild the above example using ltree. We'll use the sections' primary keys as the "labels" within our ltree paths and a special "root" label to denote the top of the tree.
CREATE EXTENSION ltree;
CREATE TABLE section (
    id INTEGER PRIMARY KEY,
    name TEXT,
    parent_path LTREE
);
CREATE INDEX section_parent_path_idx ON section USING GIST (parent_path);
We'll add our data again. This time, rather than using the id of the parent, we construct an ltree path that represents the parent node.
INSERT INTO section (id, name, parent_path) VALUES (1, 'Section A', 'root');
INSERT INTO section (id, name, parent_path) VALUES (2, 'Section A.1', 'root.1');
INSERT INTO section (id, name, parent_path) VALUES (3, 'Section B', 'root');
INSERT INTO section (id, name, parent_path) VALUES (4, 'Section B.1', 'root.3');
INSERT INTO section (id, name, parent_path) VALUES (5, 'Section B.2', 'root.3');
INSERT INTO section (id, name, parent_path) VALUES (6, 'Section B.2.1', 'root.3.5');
Cool. Now we can use ltree's @> (is an ancestor of) and <@ (is a descendant of) operators to answer our original question. This fetches every section underneath Section B:
SELECT * FROM section WHERE parent_path <@ 'root.3';
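A few more queries in the same style, as a rough sketch against the ltree version of the section table above. The literal paths are only illustrative. Because parent_path stores the path of the parent rather than of the row itself, the query above returns the descendants of Section B but not Section B.

```sql
-- Section B plus everything underneath it.
SELECT * FROM section
WHERE id = 3 OR parent_path <@ 'root.3';

-- Direct children of Section B only: their parent path is exactly 'root.3'.
SELECT * FROM section
WHERE parent_path = 'root.3';

-- Grandchildren and deeper: an lquery pattern requiring at least one
-- extra label after 'root.3'.
SELECT * FROM section
WHERE parent_path ~ 'root.3.*{1,}';

-- Depth of each section: nlevel() counts the labels in a path, so
-- top-level sections (parent_path = 'root') come out at depth 0.
SELECT id, name, nlevel(parent_path) - 1 AS depth
FROM section
ORDER BY parent_path, id;
```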
However, we have introduced a few issues.
- Our simple parent_id version ensured referential consistency by making use of the REFERENCES constraint. We lost that by switching to ltree paths.
- Ensuring that the ltree paths are valid can be a bit of a pain, and if paths become stale for some reason your queries may return unexpected results or you may "orphan" nodes.
The final solution
To fix these issues we want a hybrid of our original parent_id (for the referential consistency and simplicity of the child/parent relationship) and our ltree paths (for improved querying power/indexing). To achieve this we will hide the management of the ltree path behind a trigger and only ever update the parent_id column.
CREATE EXTENSION ltree;
CREATE TABLE section (
    id INTEGER PRIMARY KEY,
    name TEXT,
    parent_id INTEGER REFERENCES section,
    parent_path LTREE
);
CREATE INDEX section_parent_path_idx ON section USING GIST (parent_path);
CREATE INDEX section_parent_id_idx ON section (parent_id);
CREATE OR REPLACE FUNCTION update_section_parent_path() RETURNS TRIGGER AS $$
    DECLARE
        path ltree;
    BEGIN
        IF NEW.parent_id IS NULL THEN
            NEW.parent_path = 'root'::ltree;
        ELSEIF TG_OP = 'INSERT' OR OLD.parent_id IS NULL OR OLD.parent_id != NEW.parent_id THEN
            SELECT parent_path || id::text FROM section WHERE id = NEW.parent_id INTO path;
            IF path IS NULL THEN
                RAISE EXCEPTION 'Invalid parent_id %', NEW.parent_id;
            END IF;
            NEW.parent_path = path;
        END IF;
        RETURN NEW;
    END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER parent_path_tgr
    BEFORE INSERT OR UPDATE ON section
    FOR EACH ROW EXECUTE PROCEDURE update_section_parent_path();
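To see the trigger at work, here is a short usage sketch. It assumes the table, indexes and trigger defined above are in place and reuses the sample sections from earlier; only parent_id is supplied and the trigger fills in parent_path.

```sql
-- Parents are inserted before their children so the trigger can
-- look up the parent's path.
INSERT INTO section (id, name, parent_id) VALUES (1, 'Section A', NULL);
INSERT INTO section (id, name, parent_id) VALUES (2, 'Section A.1', 1);
INSERT INTO section (id, name, parent_id) VALUES (3, 'Section B', NULL);
INSERT INTO section (id, name, parent_id) VALUES (4, 'Section B.1', 3);
INSERT INTO section (id, name, parent_id) VALUES (5, 'Section B.2', 3);
INSERT INTO section (id, name, parent_id) VALUES (6, 'Section B.2.1', 5);

-- Inspect the paths the trigger generated.
SELECT id, name, parent_path FROM section ORDER BY id;
--  1 | Section A     | root
--  2 | Section A.1   | root.1
--  3 | Section B     | root
--  4 | Section B.1   | root.3
--  5 | Section B.2   | root.3
--  6 | Section B.2.1 | root.3.5

-- Everything underneath Section B (ids 4, 5 and 6).
SELECT * FROM section WHERE parent_path <@ 'root.3';
```

One caveat: as written, the trigger only rewrites the row being changed. Moving a section that already has children leaves the children's parent_path pointing at the old location, so their paths would need to be refreshed separately.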
Much better.
Written by Chris Farmiloe