After the pgconfeu23 in Prague - which has been an excellent event - I decided to share some of the things I presented as a blog post to maybe shed some light on some of those topics. One of those ideas presented was the way PostgreSQL handles joins and especially join orders. Internally the PostgreSQL does a good job to optimize queries but how does it really work?
Table of Contents
1 2 3 4 5 6 7 8 9 10 |
plan=# SELECT 'CREATE TABLE x' || id || ' (id int)' FROM generate_series(1, 5) AS id; ?column? -------------------------- CREATE TABLE x1 (id int) CREATE TABLE x2 (id int) CREATE TABLE x3 (id int) CREATE TABLE x4 (id int) CREATE TABLE x5 (id int) (5 rows) |
In PostgreSQL we can easily create SQL using SQL. The beauty of psql is that one can simply run gexec to use the previous output as new input:
1 2 3 4 5 6 |
plan=# \gexec CREATE TABLE CREATE TABLE CREATE TABLE CREATE TABLE CREATE TABLE |
Voila, we have 5 tables which can serve as a sample data structure.
The following query shows a simple join using the tables we have just created:
1 2 3 4 5 6 7 8 |
plan=# explain (timing, analyze) SELECT * FROM x1 JOIN x2 ON (x1.id = x2.id) JOIN x3 ON (x2.id = x3.id) JOIN x4 ON (x3.id = x4.id) JOIN x5 ON (x4.id = x5.id); … Planning Time: 0.297 ms Execution Time: 0.043 ms |
What is the important observation here? Let us take a look at planning time. PostgreSQL needs 0.297 milliseconds to find the best execution plan (= execution strategy) to run the query. The question arising is: Where does the planner need the time to plan the query? The thing is: Even when using explicit joins as shown above PostgreSQL will join those tables implicitly and decide on the best join order. What does that mean in real life? Well let us consider a join “a join b join c”: Even if we write an SQL that says join “a to b” the optimizer might still decide to vote for “c join a join b” in case it guarantees the same result. Why is this so important? Because it offers a great deal of efficiency. Letting the optimizer decide on the best join order is an important internal optimization.
However, we got to keep planning time in mind - especially if there are many tables (10+?) are involved.
In case planning time is an issue we can force PostgreSQL to use the join order we want it to use. The variable controlling this behavior is join_collapse_limit. What does it mean? Basically it controls the number of explicit joins planned implicitly. In other words: How many explicit joins can be optimized by PostgreSQL.
If we set this variable to 1 it means that we force PostgreSQL to use the join order of our choosing:
1 2 3 4 5 6 7 8 9 10 |
plan=# SET join_collapse_limit TO 1; SET plan=# explain (timing, analyze) SELECT * FROM x1 JOIN x2 ON (x1.id = x2.id) JOIN x3 ON (x2.id = x3.id) JOIN x4 ON (x3.id = x4.id) JOIN x5 ON (x4.id = x5.id); … Planning Time: 0.069 ms Execution Time: 0.046 ms |
What is really noteworthy about this is the significant improvement of planning speed. We can see a stunning 4x speedup.
However, let me also issue a word of caution: There is a reason why the optimizer is trying to re-structure joins in the first place. In case the query is more expensive than what we see here in this example it can make a lot of sense to invest more time into plan creation. In other words: Changing this variable can backfire unless the end user is fully aware of what is going on. We therefore advise to test your queries and your entire setups using real data and a real workload before changing this setting. In general it can also be beneficial to ONLY change the variable for a single query and keep the default value in postgresql.conf as it is for all other operations.
More about the pgconfeu23 in Prague in our blog.
You need to load content from reCAPTCHA to submit the form. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from Facebook. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from X. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More Information
There is typo in the timing display in both the output because both are same.
Thanks; I have fixed the typo.