CYBERTEC PostgreSQL Logo

Performance differences between normal and generic audit triggers

04.2020 / Category: / Tags: |

What are the performance differences between normal and generic audit triggers? Recently I was talking in a more general way about some common auditing / change tracking approaches for PostgreSQL...but it also made me curious, how does it look from the performance side?

To quickly recap the previous blog post: the most common approaches for tracking important changes are mostly solved with writing some triggers. There are two main variations: table specific triggers / audit tables and a more generic approach with only one trigger function and one (or also many) generic audit tables, usually relying on PostgreSQL’s NoSQL features that allow quite a high degree of genericness in quite a convenient and usable way.

Obviously one could “guesstimate” that the generic approach would perform worse than the tailored approach as this is commonly the case in software. But the question is - how much worse? Can it be considered negligible? Well there's only one way to find out I guess….and finally after finding some time to set up a small test case I can say that I was in for a bit of a surprise! But do read on for details or see the last paragraph for the executive summary.

Test schema for the “table specific” approach

I used pgbench to create auditing tables for the three pgbench tables that receive updates (detailed SQL here). It quickly generated SQL for the background auditing tables and attached triggers. While the schema is simple, I’m only including an excerpt here; the full script is available. I kept the history table populated by pgbench since it has no indexes and shouldn’t impact performance, ensuring both test cases remain equal.

Test schema for the “generic” approach

As with the explicit schema, we’re starting with the default pgbench schema but now extending it only with a single trigger function that will be attached to all 3 tables getting updates! And there will also only be a single logging table relying on the “generic” JSONB data type to handle all kinds of various inputs with just one column. Full SQL code below (or here):

Setup info

Some hardware / software information for completeness:

Test host: 4 CPU i5-6600, 16GB RAM, SATA SSD

PostgreSQL: v12.1, all defaults except shared_buffers set to 25% of RAM, i.e. 2GB, checkpoint_completion_target = 0.9, backend_flush_after = '2MB' (see here for explanation).

Pgbench: scale 1000, i.e. ~ 13GB DB size to largely factor out disk access (working set should fit more or less into RAM for good performance on any DB system), 4h runtime, 2 concurrent sessions (--jobs) not to cause any excessive locking on the smaller tables. Query latencies were measured directly in the database using the pg_stat_statement extension, so they should be accurate.

Results - TPS & latencies

Performance has many aspects but let’s start with TPS (Transactions per Second) and transaction latencies (single transaction duration) as they are probably the most important metrics for the performance hungry. Surprisingly it turned out that the generic NoSQL approach was quite a bit faster in total numbers! Whew, I didn’t expect that for sure. Note the “in total numbers” though...as updates on the 2 small tables were still a tiny bit slower still, but the costliest update a.k.a. the “heart” of the transaction (on pgbench_accounts table) was still significantly faster still. See below table for details.

 Explicit triggersGeneric triggersDifference (%)
TPS585794+35.7
Total transaction63245468581130+35.7
Mean time of UPDATE pgbench_accounts (ms)1.11270.8282-25.6
Mean time of UPDATE pgbench_branches (ms)0.04930.0499+1.3
Mean time of UPDATE pgbench_tellers (ms)0.04790.0483+1.0

Results - disk footprint

Well here the decision is very clear when looking at the resulting auditing table sizes after the tests finished - the grand total of tables representing the traditional / explicit approach are much smaller than the one big generic table! So this is definitely also worth considering if you’re building the next Amazon or such. We shouldn’t only look at absolute numbers here as we had 35% more transactions with the generic approach...so we should also factor that in. 

 Explicit triggersGeneric triggersDifference (%)TPS adjusted difference (%)
Total size of auditing tables (in MB)15784255+170+109

Note that the table sizes don’t include indexes.

Quering

While not directly related to this test, an important aspect of planning change tracking is the need to efficiently access the audit trail. After experimenting with various schemas, the explicit audit table approach clearly outperforms others. It eliminates the need for JSONB syntax, which can be challenging for beginners, and offers better query performance for complex queries as data accumulates. Additionally, the generic approach consumes about twice the space, making it less suitable for frequent historical queries.

Also you will most probably need more targeted functional indexes on the JSONB column as I wouldn’t recommend to just index everything with a single GiN index - although it would be the simplest way to go, it only makes sense if you really perform searches on the majority of all “columns”.

Anyways it’s hard to say something definitive here as real life requirements will differ vastly I imagine, and the amount of indexes / data will be decisive.

Summary

My personal learning from this test - it's not over until the fat lady sings 🙂 My initial hunch on the performance cost of generic JSONB storing triggers was not correct this time and the performance is actually great! So one shouldn’t be afraid of this approach due to the performance hit...but if at all, rather due to disk “costs”...or due to possible woes when needing to actually query the gathered data in some complex ways.

By the way - if you’re wondering why the generic approach was actually faster, my best bet is that it has to do with (at least) 3 factors:

1) PostgreSQL JSONB implementation is very-very efficient

2) less PL/pgSQL parsing / planning as same trigger code is used three times

3) better CPU level data caching as we have less active / hot data blocks.

And as always - remember that when choosing between different approaches for your own project I always recommend testing with your exact schema / hardware as schemas vary vastly and a lot depends on data types used. This test was just one of the simpler examples.

Bonus idea - generic “differential” change auditing

After finishing my test I started wondering if it would be somehow possible to reduce the disk “cost” downside of this otherwise neat “generic” approach...and it actually wasn’t too hard to put together some trigger code that looks at the row data and logs only columns and values that are changed!

The core idea can be seen in the code below (full sample here) and also the first short testing looked quite promising actually - about 40% of disk space savings compared to the above described generic approach! So this could be definitely an interesting way to go. Thanks for reading!

Read more about PostgreSQL, triggers and performance, see this blog: Why Are My PostgreSQL Updates Getting Slower? By Laurenz Albe.

5 responses to “Performance differences between normal and generic audit triggers”

    • Ok interesting. You mean ~10% for a single usage of such construct vs explicit column list or end result for a similar test? But seems this amount of penalty should still leave the JSONB approach on top still, even when taking more care when writing the normal triggers.

      • yes, there is a overhead of plpgsql with passing parameters to INSERT, and there is some CPU overhead related to number of columns. If your bottleneck is CPU, then is better to have less columns than more.

  1. Interesting post. Thanks for it.

    Just a tiny glitch: "Test host: 4 CPU i5-6600, 16GB RAM... shared_buffers set to 25% of RAM, i.e. 2GB" => 4GB shared buffers or test host with 8GB of RAM?

Leave a Reply

Your email address will not be published. Required fields are marked *

CYBERTEC Logo white
Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2025
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram