These days everybody is talking about time series, time series analysis, and the like for performance tuning. Analyzing time series data in PostgreSQL can provide valuable insights, help in making informed decisions, and deepen our understanding of the data. By utilizing PostgreSQL’s powerful features, we can efficiently query all types of measurement data to track trends, patterns, and anomalies over time. However, there is often one tiny requirement people struggle with. Consider the following data:
```sql
CREATE TABLE t_sample (
    tstamp       timestamptz,
    sensor_id    int,
    measurement  numeric
);
CREATE TABLE

INSERT INTO t_sample
SELECT  x, y, random() * 10000
FROM    generate_series('2025-01-01', '2025-01-06',
                        '1 second'::interval) AS x,
        generate_series(1, 100) AS y
ORDER BY random();
INSERT 0 43200100
```
What we see here is a classic, simple table containing time series data in PostgreSQL. For a couple of days we store one measurement per second for each of 100 sensors. Overall this provides us with roughly 43 million rows of data.
Here is one of the most typical questions somebody might have given this data set: What is the highest value ever measured for each sensor?
In this post we will discuss how this can be done in the most efficient way possible.
This is most likely the most common use case we have seen: for each sensor in our series we want to see the highest value in the data set. Sounds easy? Well, it is … Let us take a look at the query:
```sql
test=# explain analyze SELECT sensor_id, max(measurement)
       FROM   t_sample
       GROUP BY 1
       ORDER BY 1;
                              QUERY PLAN
--------------------------------------------------------------------------
 Finalize GroupAggregate  (cost=546553.97..546579.31 rows=100 width=36)
         (actual time=3620.266..3621.127 rows=100 loops=1)
   Group Key: sensor_id
   ->  Gather Merge  (cost=546553.97..546577.31 rows=200 width=36)
           (actual time=3620.260..3621.079 rows=300 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=545553.95..545554.20 rows=100 width=36)
                 (actual time=3616.385..3616.388 rows=100 loops=3)
               Sort Key: sensor_id
               Sort Method: quicksort  Memory: 28kB
               Worker 0:  Sort Method: quicksort  Memory: 28kB
               Worker 1:  Sort Method: quicksort  Memory: 28kB
               ->  Partial HashAggregate
                       (cost=545549.63..545550.63 rows=100 width=36)
                       (actual time=3616.322..3616.328 rows=100 loops=3)
                     Group Key: sensor_id
                     Batches: 1  Memory Usage: 32kB
                     Worker 0:  Batches: 1  Memory Usage: 32kB
                     Worker 1:  Batches: 1  Memory Usage: 32kB
                     ->  Parallel Seq Scan on t_sample
                             (cost=0.00..455549.42 rows=18000042 width=15)
                             (actual time=0.545..1873.428 rows=14400033 loops=3)
 Planning Time: 0.154 ms
 Execution Time: 3621.199 ms
(18 rows)
```
What happens here is that PostgreSQL will read ALL the data and run the aggregation. To speed things up, we see a parallel query using multiple CPU cores at a time. However, the core problem remains: we have to read all the data. Of course that leads to long runtimes, which keep growing the more data we have.
Let us try to deploy an index and see if it changes anything:
```sql
test=# CREATE INDEX ON t_sample (sensor_id, measurement);
CREATE INDEX
test=# \d t_sample
                        Table "public.t_sample"
   Column    |           Type           | Collation | Nullable | Default
-------------+--------------------------+-----------+----------+---------
 tstamp      | timestamp with time zone |           |          |
 sensor_id   | integer                  |           |          |
 measurement | numeric                  |           |          |
Indexes:
    "t_sample_sensor_id_measurement_idx" btree (sensor_id, measurement)
```
Running the query again will not lead to a better execution plan. The way to address this problem is to simulate something that is generally known as a “skip scan”. What does that mean? For each distinct sensor_id we find the maximum separately: the system can find the maximum value for a single sensor quite quickly.
```sql
test=# explain analyze SELECT max(measurement)
       FROM   t_sample
       WHERE  sensor_id = 10;
                              QUERY PLAN
-----------------------------------------------------------------------------
 Result  (cost=0.60..0.61 rows=1 width=32)
         (actual time=0.111..0.112 rows=1 loops=1)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=0.56..0.60 rows=1 width=11)
             (actual time=0.106..0.107 rows=1 loops=1)
           ->  Index Only Scan Backward
                   using t_sample_sensor_id_measurement_idx on t_sample
                   (cost=0.56..13321.39 rows=375841 width=11)
                   (actual time=0.105..0.105 rows=1 loops=1)
                 Index Cond: ((sensor_id = 10) AND (measurement IS NOT NULL))
                 Heap Fetches: 0
 Planning Time: 0.269 ms
 Execution Time: 0.143 ms
(8 rows)
```
Wow, we can find a single value in a fraction of a millisecond. Unfortunately we have to apply some trickery to do this for more than just a single sensor. There are two ways to emulate a skip scan: a LATERAL join, or a recursive CTE.
The following example shows how we can use a lateral join to solve the problem:
```sql
test=# explain analyze SELECT *
       FROM   generate_series(1, 100) AS x,
              LATERAL (SELECT max(measurement)
                       FROM   t_sample
                       WHERE  sensor_id = x);
                              QUERY PLAN
---------------------------------------------------------------------------
 Nested Loop  (cost=0.60..63.05 rows=100 width=36)
         (actual time=2.656..5.002 rows=100 loops=1)
   ->  Function Scan on generate_series x
           (cost=0.00..1.00 rows=100 width=4)
           (actual time=2.520..2.532 rows=100 loops=1)
   ->  Result  (cost=0.60..0.61 rows=1 width=32)
           (actual time=0.024..0.024 rows=1 loops=100)
         InitPlan 1 (returns $1)
           ->  Limit  (cost=0.56..0.60 rows=1 width=11)
                   (actual time=0.023..0.023 rows=1 loops=100)
                 ->  Index Only Scan Backward
                         using t_sample_sensor_id_measurement_idx on t_sample
                         (cost=0.56..15312.59 rows=432001 width=11)
                         (actual time=0.023..0.023 rows=1 loops=100)
                       Index Cond: ((sensor_id = x.x) AND (measurement IS NOT NULL))
                       Heap Fetches: 0
 Planning Time: 0.481 ms
 Execution Time: 5.124 ms
(10 rows)
```
What is a LATERAL? Let us step back a bit and see SQL from a more philosophical point of view:
```sql
SELECT whatever FROM foo;
```
In code this means:
```
foreach x in foo
loop
    do_whatever
end loop;
```
One can see a FROM clause as a kind of loop. Now, if we want to find the max for a list of values, what we really need is a loop around that loop. That is exactly what LATERAL does for us: for each value returned by the generate_series call (which generates the list of sensors), we run the LATERAL part of the query. The drawback is that generating the list of sensors makes the query a bit too “static”, so it is better to fetch the sensor list from a second relation (normalization).
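As a sketch of that idea, the sensor list could come from a separate catalog table instead of generate_series. The table t_sensor and its column id are assumptions made up for illustration, not part of the example schema above:

```sql
-- Hypothetical catalog table listing all known sensors:
--   CREATE TABLE t_sensor (id int PRIMARY KEY, name text);
-- For each sensor in the catalog, run one fast index lookup.
SELECT s.id AS sensor_id, m.max_measurement
FROM   t_sensor AS s,
       LATERAL (SELECT max(measurement) AS max_measurement
                FROM   t_sample
                WHERE  t_sample.sensor_id = s.id) AS m;
```

This way, adding or removing a sensor only requires changing the catalog table; the query itself stays unchanged.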
However, we can also achieve the same thing using a simple recursion:
```sql
test=# explain analyze WITH RECURSIVE x AS
(
    SELECT min(sensor_id) AS s
    FROM   t_sample
    UNION ALL
    SELECT (SELECT min(sensor_id)
            FROM   t_sample
            WHERE  sensor_id > x.s)
    FROM   x
    WHERE  x.s IS NOT NULL
)
SELECT s AS sensor_id,
       (SELECT max(measurement)
        FROM   t_sample
        WHERE  t_sample.sensor_id = x.s) AS measurement
FROM   x
WHERE  s IS NOT NULL;
                              QUERY PLAN
-----------------------------------------------------------------
 CTE Scan on x  (cost=64.66..127.73 rows=100 width=36)
         (actual time=0.103..2.410 rows=100 loops=1)
   Filter: (s IS NOT NULL)
   Rows Removed by Filter: 1
   CTE x
     ->  Recursive Union  (cost=0.60..64.66 rows=101 width=4)
             (actual time=0.060..0.838 rows=101 loops=1)
           ->  Result  (cost=0.60..0.61 rows=1 width=4)
                   (actual time=0.058..0.059 rows=1 loops=1)
                 InitPlan 3 (returns $1)
                   ->  Limit  (cost=0.56..0.60 rows=1 width=4)
                           (actual time=0.051..0.052 rows=1 loops=1)
                         ->  Index Only Scan
                                 using t_sample_sensor_id_measurement_idx
                                 on t_sample t_sample_1
                                 (cost=0.56..1423086.31 rows=43200100 width=4)
                                 (actual time=0.049..0.050 rows=1 loops=1)
                               Index Cond: (sensor_id IS NOT NULL)
                               Heap Fetches: 0
           ->  WorkTable Scan on x x_1  (cost=0.00..6.30 rows=10 width=4)
                   (actual time=0.007..0.007 rows=1 loops=101)
                 Filter: (s IS NOT NULL)
                 Rows Removed by Filter: 0
                 SubPlan 2
                   ->  Result  (cost=0.60..0.61 rows=1 width=4)
                           (actual time=0.007..0.007 rows=1 loops=100)
                         InitPlan 1 (returns $3)
                           ->  Limit  (cost=0.56..0.60 rows=1 width=4)
                                   (actual time=0.006..0.006 rows=1 loops=100)
                                 ->  Index Only Scan
                                         using t_sample_sensor_id_measurement_idx
                                         on t_sample
                                         (cost=0.56..510365.23 rows=14400033 width=4)
                                         (actual time=0.006..0.006 rows=1 loops=100)
                                       Index Cond: ((sensor_id IS NOT NULL)
                                                    AND (sensor_id > x_1.s))
                                       Heap Fetches: 0
   SubPlan 6
     ->  Result  (cost=0.60..0.61 rows=1 width=32)
             (actual time=0.015..0.015 rows=1 loops=100)
           InitPlan 5 (returns $6)
             ->  Limit  (cost=0.56..0.60 rows=1 width=11)
                     (actual time=0.015..0.015 rows=1 loops=100)
                   ->  Index Only Scan Backward
                           using t_sample_sensor_id_measurement_idx
                           on t_sample t_sample_2
                           (cost=0.56..15312.59 rows=432001 width=11)
                           (actual time=0.014..0.014 rows=1 loops=100)
                         Index Cond: ((sensor_id = x.s) AND (measurement IS NOT NULL))
                         Heap Fetches: 0
 Planning Time: 0.446 ms
 Execution Time: 2.495 ms
(30 rows)
```
The core idea behind the query is as follows: we find the smallest sensor_id in the table, which is really fast. Then we look for the next value larger than the one we found before. We keep doing this in the recursion until we have quickly compiled a unique list of sensor_ids. This recursion is indeed the fastest way to find a list of DISTINCT sensor_ids without having to read the entire table; essentially, this is what it means to simulate a “skip scan” as mentioned before. Finally we use this list of unique sensor IDs (“x”) to find the maximum value for each of them, which is of course a fast index lookup.
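Stripped down to its core, the same recursion can stand on its own as a drop-in for SELECT DISTINCT sensor_id. The following is a minimal sketch of the bare skip scan without the max() lookup, using only the table and index defined above:

```sql
-- Emulates SELECT DISTINCT sensor_id FROM t_sample:
-- each recursive step fetches only the next-larger sensor_id
-- via the (sensor_id, measurement) index instead of scanning the table.
WITH RECURSIVE d AS
(
    SELECT min(sensor_id) AS s
    FROM   t_sample
    UNION ALL
    SELECT (SELECT min(sensor_id)
            FROM   t_sample
            WHERE  sensor_id > d.s)
    FROM   d
    WHERE  d.s IS NOT NULL
)
SELECT s AS sensor_id
FROM   d
WHERE  s IS NOT NULL;
```

This building block is handy whenever a column has few distinct values relative to the table size, not just for the max() use case discussed here.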
If you want to know more about time series in general you might want to check out my post about simple pattern matching and string encoding in PostgreSQL which is available on our website as well.
In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Twitter, Facebook, or LinkedIn.