Scaling PostgreSQL: Modern servers provide people with more and more CPU cores. 16 cores or more are not uncommon these days. Unfortunately, PostgreSQL cannot scale a single query to more than one core. A lot of effort is going into this issue already.
Table of Contents
To address the issue, we've done some experiments to see what can be done to scale things up. At this point, we've written a small module which takes an existing execution plan and does some post-processing of the plan using “custom plan nodes”.
The idea is experimental: what happens if the approach described in this post is actually used?
So the goal in our case is really to scale out the processing done in the SELECT clause.
For a start, 100 million rows have been created to do a little testing. 20 million rows are in each partition:
Our prototype module has a simple runtime variable, which allows us to define the number of processes we want to see. Setting parallel.agg_workers to 1 basically means turning the plan-processing code off.
The data in the table is scanned normally. We need around 29 seconds to process the query:
Let us try the same with more cores: 7 cores will be used for worker processes.
The important part here is that more processes are actually a net loss. The reason is simple: The table is scanned by a single process at the moment so there is a serious bottleneck on the scan side. The aggregation part is simply not complex enough to provide us with a speedup. Remember: All we do is adding up some values (which is ways too simple). This is by far not enough to justify the overhead of moving data around via shared memory queues.
SELECT
clauses when scalingHowever, the situation starts to change significantly if we try a query involving a lot of processing in the SELECT
-clause. The more stuff we do in the SELECT
-clause, the more benefit we can observe.
In this case, we do a lot of aggregation and therefore the overhead of moving tuples around is not as big compared to the rest of the query.
What we see is a nice 3.5 times speedup. Of course, this is still far from linear. Linear scalability can hardly be achieved with this approach. However, if there are 50-100 aggregation functions in the SELECT
-clause (which is far from uncommon), there is clearly a major performance improvement. That tends to get pretty close to “linear” in many cases.
The process dispatching the data is running at 100%, while the worker processes are not able to make full use of the CPU. It seems that some overhead can still be shaved off to provide users with better overall performance. A prototype might be possible fairly soon.
In case you need any assistance, please feel free to contact us.
In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.
You need to load content from reCAPTCHA to submit the form. Please note that doing so will share data with third-party providers.
More InformationYou need to load content from reCAPTCHA to submit the form. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from Facebook. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from X. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More Information
The SELECT with multiple CASE statements appears to have the results in inverse order.
Is your use of CASE rather than (or in addition to, I find the construct here somewhat baffling) FILTER an implementation limitation?
FILTER is basically faster ... i did a blog post on that some time ago. i used the slow version here to create some load for the workers. we tested with FILTER. no showstoppers so far 🙂
Sorry I was unclear. Does your implementation allow using parallel workers in the FILTER case?
yes. we can do that. the code will be released when it is in a useful state. still a long road ahead
Oh, and where's the code? 🙂
8 CPU cores: 206667.727 ms
1 CPU core: 59226.385 ms
What we see is a nice 3.5 times speedup...
Seems, like numbers are mixed up here.
Actually they aren't. In the postgresql FAQ there is an explanation for this behaviour. https://wiki.postgresql.org/wiki/FAQ#How_does_PostgreSQL_use_CPU_resources.3F
Well, I see from numbers above that query running on 8 CPUs spent ~4 times more time than running on one CPU. That's not a speedup. The link to PostgreSQL FAQ is not about that. Am I missing something?