After my last post about ltree and recursive data in PostgreSQL people have asked me privately about performance issues. To share this information, I decided to come up with a follow up post to discuss this topic in a bit more detail. WITH RECURSIVE in PostgreSQL is efficient. However, ltree does have its strengths as well. Let us take a closer look …
Table of Contents
In preparation for a test I have created a table which contains a balanced tree:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
test=# CREATE TABLE tree AS WITH RECURSIVE tree (id, parent, lvl) AS ( SELECT generate_series(1, 5) AS id, NULL :: int4 AS parent, 1 AS lvl UNION ALL SELECT n, id, lvl + 1 FROM tree, generate_series(power(5, lvl) :: int4 + (id - 1)*5 + 1, power(5, lvl) :: int4 + (id -1)*5 + 5 ) g(n) WHERE lvl < 10 ) SELECT * FROM tree ORDER BY id; |
There is a bit of magic involved here but let us not worry too much about the details at this point. Once the data is generated, you should see the following content in your table:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
test=# SELECT * FROM tree LIMIT 20; id | parent | lvl ----+--------+----- 1 | | 1 2 | | 1 3 | | 1 4 | | 1 5 | | 1 6 | 1 | 2 7 | 1 | 2 8 | 1 | 2 9 | 1 | 2 10 | 1 | 2 11 | 2 | 2 12 | 2 | 2 13 | 2 | 2 14 | 2 | 2 15 | 2 | 2 16 | 3 | 2 17 | 3 | 2 18 | 3 | 2 19 | 3 | 2 20 | 3 | 2 (20 rows) |
1 2 3 4 5 |
test=# SELECT count(*) FROM tree; count ---------- 12207030 (1 row) |
12 million rows will be a nice way to show how efficient ltree can really be. As stated before, using a recursive query on this data might take a bit more time than running the query on prepared data. What you can do in the next listing is to resolve the tree and materialize an ltree column containing all the data. Make sure ltree is enabled in your database:
1 2 |
test=# CREATE EXTENSION ltree; CREATE EXTENSION |
Here is a materialized view containing the same data in a different format. The goal is to materialize the tree column to speed up searching:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
test=# CREATE MATERIALIZED VIEW mat AS WITH RECURSIVE x AS ( SELECT id::text, parent::text, id::text::ltree AS mypath FROM tree WHERE parent IS NULL UNION ALL SELECT y.id::text, y.parent::text, ltree_addtext(x.mypath, y.id::text) AS mypath FROM x, tree AS y WHERE x.id::text = y.parent::text ) SELECT * FROM x; |
To access the materialized view efficiently, we have to create a Gist index on the ltree column:
1 2 |
test=# CREATE INDEX idx_mypath ON mat USING gist (mypath); CREATE INDEX |
Note that a btree is not what we are looking for here. A gist index is necessary. The reason is that a btree only supports sorting which is not quite what we need here in this case.
If your data is static materializing makes a lot of sense. In case your data changes frequently and in case you always need 100% correct results it is of course pretty counterproductive to pre-calculate data. However, let us take a look at the speed of the materialized case:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
test=# explain analyze SELECT * FROM mat WHERE '4.21.129.770.4475.25499.143118.793712.4359182.23749035' @> mypath; QUERY PLAN ------------------------------------------------------------------------------------------------ Bitmap Heap Scan on mat (cost=161.88..4775.02 rows=1221 width=112) (actual time=0.117..0.117 rows=1 loops=1) Recheck Cond: ('4.21.129.770.4475.25499.143118.793712.4359182.23749035'::ltree @> mypath) Heap Blocks: exact=1 -> Bitmap Index Scan on idx_mypath (cost=0.00..161.57 rows=1221 width=0) (actual time=0.105..0.105 rows=1 loops=1) Index Cond: (mypath <@ '4.21.129.770.4475.25499.143118.793712.4359182.23749035'::ltree) Planning Time: 0.098 ms Execution Time: 0.270 ms (7 rows) |
The query executes in way less than a millisecond, which is exceptional.
Performance is king and we have written a lot about PostgreSQL performance in the past. One of the key aspects is VACUUM. Read on to find out how to enable and disable AUTOVACUUM.
In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.
You need to load content from reCAPTCHA to submit the form. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from Facebook. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from X. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More Information
I'm afraid the GIST index has a limit of 255 items... Is there any method to use this functionality with very long data structures?