CYBERTEC PostgreSQL Logo

PostgreSQL: Speeding up analytics and window functions

03.2018 / Category: / Tags: |

This post will deal with how to speed up analytics and window functions in PostgreSQL. As a professional PostgreSQL support company, we see a lot of SQL performance stuff which is worth sharing. One of those noteworthy things happened this week when I was out on a business trip to Berlin, Germany. This (excellent) customer was making extensive use of window functions and analytics. However, there is always room to speed things up.

“Improving SQL performance” or “how to trick the optimizer”

Most people simply write their SQL code and execute it, assuming that the optimizer will take care of things on its own. While this is usually the case there are still corner cases, where some clever optimization - done by a professional - can give you an edge and better database performance.

How to rearrange window functions for better performance

One way to improve speed is by rearranging window functions in a clever way. Consider the following example:

Our example is pretty simple: All we need is a table containing 5 rows:

Let's take a look at an example:

What we have here is a simple aggregation. For the sake of simplicity I have used array_agg, which simply shows, how our data is aggregated. Of course we could also use min, max, sum, count or any other window function.

Now, let's add a second column to the example:

In this case, there are two columns with two different OVER-clauses. Note that those two aggregations are using different sort orders. One column needs ascending data and one needs descending data.

To understand what is really going on here, we can take a look at the execution plan provided by PostgreSQL:

Sorting the data

First of all, PostgreSQL has to read the data and sort by “id”. This sorted data is fed to the first aggregation, before it is passed on to the next sort step (to sort descending). The second sort passes its output to its window aggregation. Finally, the data is again sorted by id because we want the final output to be ordered by the first column. Overall our data has to be sorted three times, which is not a good thing to do.

Optimizing window functions and avoiding excessive sorting

Sorting data three times is clearly not a good idea. Maybe we can do better. Let us simply swap two columns in the SELECT clause:

Skipping a sorting step

Wow, by making this little change we have actually managed to skip one sorting step. First of all, the data is sorted descending because we need it for the first window function. However, the next column will need data in exactly the same order as the final ORDER BY at the end of the query. PostgreSQL knows that and can already use the sorted input. If you are processing a big data set, this kind of optimization can make a huge difference and speed up your queries tremendously.

At this point PostgreSQL is not able (yet?) to make those adjustments for you so some manual improvements will definitely help. Try to adjust your window functions in a way that columns needing identical sorting are actually next to each other.

Read more about window functions in my post about SQL Trickery: Configuring Window Functions


In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.

Leave a Reply

Your email address will not be published. Required fields are marked *

CYBERTEC Logo white
CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0
office@cybertec.at

Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram