CYBERTEC PostgreSQL Logo

Understanding LATERAL joins in PostgreSQL

07.2021 / Category: / Tags: |

LATERAL joins are one of the lesser-known features of PostgreSQL and other relational databases such as Oracle, DB2 and MS SQL. However, LATERAL joins are a really useful feature, and it makes sense to take a look at what you can accomplish with them.

Inspecting FROM more closely

Before we dive into LATERAL, it makes sense to sit back and think about SELECT and FROM clauses in SQL on a more philosophical level. Here is an example:

Basically, we could see this statement as a loop. Writing this SQL statement in pseudo code would look somewhat like the following snippet:

For each entry in the table, we do what the SELECT clause says. Usually data is simply returned as it is. A SELECT statement can be seen as a loop. But what if we need a “nested” loop? This is exactly what LATERAL is good for.

LATERAL joins: Creating sample data

Let’s imagine a simple example. Imagine we have a line of products, and we’ve also got customer wishlists. The goal now is to find the best 3 products for each wishlist. The following SQL snippet creates some sample data:

The product table is populated with 1000 products. The price is random, and we used a pretty creative name to name the products:

Next, we have a list of wishes.

As you can see, the wishlist belongs to a user and there is a desired price for those three products we want to suggest.

Running LATERAL joins

After providing some sample data and loading it into our PostgreSQL database, we can approach the problem and try to come up with a solution.

Suppose we wanted to find the top three products for every wish, in pseudo-code:

The important thing is that we need two loops. First, we need to iterate through the list of wishes and then we take a look at the sorted list of products, pick 3 and move on to the next wishlist.

Let’s see how this can be done using a LATERAL-join:

We’ll go through it step by step. The first thing you see in the FROM clause is the t_wishlist table. What LATERAL can do now is to use entries from the wishlist to do its magic. So for each entry in the wishlist, we pick three products. To figure out which products we need, we can make use of w.desired_price. In other words: It is like a “join with parameters”. The FROM-clause is the “outer loop” in our pseudo code and the LATERAL can be seen as the “inner loop”.

The result set looks as follows:

PostgreSQL returned three entries for each wishlist, which is exactly what we wanted. The important part here is that the LIMIT-clause is inside the SELECT fed to LATERAL. Thus it limits the number of rows per wishlist, and not the overall number of rows.

PostgreSQL is doing a pretty good job optimizing LATERAL joins. In our case, the execution plan is going to look pretty straightforward:

LATERAL joins are extremely useful, and can be utilized in many cases to speed up operations, or to simply make code a lot easier to understand.

Finally...

If you want to learn more about joins in general and if you want to read more about PostgreSQL right now consider checking out Laurenz Albe’s excellent post about join strategies in PostgreSQL.

If you want to learn more about the PostgreSQL optimizer in general, and if you want to find out more about optimization and other important topics related to PostgreSQL query optimization, check out my blog post about the optimizer.

9 responses to “Understanding LATERAL joins in PostgreSQL”

  1. Great explanation. So it's also very useful when there's no primary key to JOIN on. As I was reading this, I thought, "Why not just JOIN on id and add a WHERE statement after (prob more expensive)?" But there is no primary key shared by the tables. The only relation you have is price. Also, I often hear LATERAL can be done with a CTE but I just tried and could not reproduce the result with a CTE

  2. Winning! thanks for the blog post: clear, concise, lucid.

    btw would you know any university courses where one can learn postgres / querying to an expert / advanced level?

  3. Thanks, great explanation! Lateral join seems to be very dangerous and should only be used as a last-ditch effort. Even in this case which looks to be a very rare problem you are calculating the result in O(n*m) time for an O(n) problem and only works because you have 3 customers instead of say 30000. But it's still much better than cross joins and window functions to solve the same problem.

    Is there a declarative way to force postgresql to solve this in O(n)? In pl/SQL I could put together a simple function that solved this in less than a second for a 1000000 x 100000 dataset which was obviously impossible for the lateral join.

  4. A lateral join always forces a nested loop join. That is not dangerous at all, but can perform badly, particularly if both result sets are large. With the top three customers it shouldn't matter, and I don't think there is a better way.

    If n and m are the row counts being joined, this always has to be O(n*m). "Less than a second" has nothing to do with the big O notation. PL/pgSQL is not particularly fast, and I doubt that a lateral join implemented in PL/pgSQL will beat a lateral join in SQL.

  5. I may have completely misunderstood, true. The comment was not very clear.
    Anyway, the algorithm you describe is known as "nested loops":

    take the first wish and start looping through the products [...]
    take the next wish and continue the loop [...]

    • No, it's not. Please take your time to understand it.

      I edited my comment above and added my SQLs.

Leave a Reply

Your email address will not be published. Required fields are marked *

CYBERTEC Logo white
Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram