Indexes are a perfect tool to find a certain value or some kind of range in a table. It is possible to speed up a query many times by avoiding a sequential scan on a large table. This kind of behavior is widely known and can be observed in any relational database system.
Table of Contents
What is interesting to note is that indexes are not only good to search for data – they also offer a good way to provide you with sorted output.
Let us try to demonstrate things with a simple example. For the sake of simplicity we create a table with 100.000 rows:
1 2 3 4 5 |
test=# CREATE TABLE t_test (id int4); CREATE TABLE test=# INSERT INTO t_test SELECT * FROM generate_series(1, 100000); INSERT 0 100000 |
The goal of this example is to show what happens if you want to find the top 5 IDs. The query is pretty simple:
1 2 3 4 5 6 7 8 9 |
test=# SELECT * FROM t_test ORDER BY id DESC LIMIT 5; id -------- 100000 99999 99998 99997 99996 (5 rows) |
All we need is an ORDER BY clause as well as a LIMIT clause. There is nothing special about this query. What is more important is the execution plan used for this query:
1 2 3 4 5 6 7 8 9 10 11 12 13 |
test=# explain analyze SELECT * FROM t_test ORDER BY id DESC LIMIT 5; QUERY PLAN ------------------------------------------------------------------------------------------------- Limit (cost=3103.96..3103.98 rows=5 width=4) (actual time=35.758..35.761 rows=5 loops=1) -> Sort (cost=3103.96..3353.96 rows=100000 width=4) (actual time=35.757..35.757 rows=5 loops=1) Sort Key: id Sort Method: top-N heapsort Memory: 25kB -> Seq Scan on t_test (cost=0.00..1443.00 rows=100000 width=4) (actual time=0.007..12.372 rows=100000 loops=1) Total runtime: 35.794 ms (6 rows) |
PostgreSQL has to scan the table completely and perform a top-N heapsort. The larger the table grows the longer this will take. In other words: You cannot do this kind of query on a very large table if you need deterministic runtimes.
An index can help:
1 2 |
test=# CREATE INDEX idx_id ON t_test (id); CREATE INDEX |
The plan we had before needed 35 ms to perform the query. Once we have defined the index we can make use of its sorted content:
1 2 3 4 5 6 7 8 9 10 11 |
test=# explain analyze SELECT * FROM t_test ORDER BY id DESC LIMIT 5; QUERY PLAN ----------------------------------------------------------------------------------------------- Limit (cost=0.29..0.42 rows=5 width=4) (actual time=0.054..0.056 rows=5 loops=1) -> Index Only Scan Backward using idx_id on t_test (cost=0.29..2604.29 rows=100000 width=4) (actual time=0.052..0.054 rows=5 loops=1) Heap Fetches: 0 Total runtime: 0.080 ms (4 rows) |
All PostgreSQL has to do now is to read the index backwards and take the first 5 rows. Note that there is no WHERE clause involved here. We merely take advantage of sorting here.
The same concept can be applied to min and max. The max value is the highest entry in the table, which is not NULL. The min value is the lowest value on the PostgreSQL table, which is not NULL.
For quite some time now the PostgreSQL optimizer is able to handle this kind of query efficiently:
1 2 3 4 5 6 7 8 9 10 11 12 13 |
test=# explain SELECT min(id), max(id) FROM t_test; QUERY PLAN ------------------------------------------------------------------------------------------------------- Result (cost=0.64..0.65 rows=1 width=0) InitPlan 1 (returns $0) -> Limit (cost=0.29..0.32 rows=1 width=4) -> Index Only Scan using idx_id on t_test (cost=0.29..2854.29 rows=100000 width=4) Index Cond: (id IS NOT NULL) InitPlan 2 (returns $1) -> Limit (cost=0.29..0.32 rows=1 width=4) -> Index Only Scan Backward using idx_id on t_test t_test_1 (cost=0.29..2854.29 rows=100000 width=4) Index Cond: (id IS NOT NULL) (9 rows) |
The query executes well under a millisecond because all it does is scanning the index twice and return the data. This is a major benefit when it comes to performance.
----------
In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.
You need to load content from reCAPTCHA to submit the form. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from Facebook. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from X. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More Information