CYBERTEC PostgreSQL Logo

Citus live query inspection using citus_stat_activity

11.2023 / Category: / Tags: |

Most people who use PostgreSQL on a regular basis may have seen pg_stat_activity, which is a system view. It gives you real-time information about what is happening on your database server. It has grown over the years to provide ever more information about system processes, database connections, parallelism and a lot more.

However, if you are running a Citus deployment, pg_stat_activity doesn't provide enough information to understand what's going on. While pg_stat_activity is the perfect tool to inspect a single server, a Citus deployment usually consists of many servers. In that case, it's hard to see what all the shards are doing, and how a query is distributed.

citus_stat_activity: Monitoring Citus

The solution to the problem is the citus_stat_activity view. When Citus is enabled, this view is available by default and provides the counterpart to pg_stat_activity on the cluster level. The following listing shows what the view looks like:

What you see is, to a large extent, identical to pg_stat_activity. However, there are a couple of fields which are only available in the Citus view (such as nodeid). Those fields will help us to see which shard is doing what, and how.

Running slow queries in Citus

To demonstrate how citus_stat_statements works, we can run a slow query. The problem with “normal” queries is that Citus usually executes them so quickly that it's hard to inspect. Therefore I have implemented an extremely slow query which will never end:

If we run this on the coordinator, we have enough time to check what's really going on:

From line 3 onwards, we can see what's happening on the shards. We have multiple queries per nodeid which operate on tables containing a postfix such as _102220. We can clearly see the parallelism that is going on behind the scenes.

Please note that if you connect to the shard, a lot of the complexity will be hidden from the end user:

psql won’t necessarily show how things are structured. While d will tell us that the shard is not present, we can still query it. Many of you will have noticed similar behavior. citus_stat_activity does not show up in d either, due to some behind-the-scenes magic.

Finally …

What I like about citus_stat_activity is that I can get all the information I need at a glance. It's not necessary to connect to each shard / node separately and gather the information one by one. A single view will show you the information at once, which makes it super easy to debug and to understand how the query is actually distributed efficiently.

See our other Citus and sharding blogs:

In case you need any assistance, please feel free to contact us.

Leave a Reply

Your email address will not be published. Required fields are marked *

CYBERTEC Logo white
CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0
office@cybertec.at

Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram