By Kaarel Moppel - In my last post I described what to expect from simple PL/pgSQL triggers in performance degradation sense, when doing some inspection/changing on the incoming row data. The conclusion for the most common “audit fields” type of use case was that we should not worry about it too much and just create those triggers. But in which use cases would it make sense to start worrying a bit?
Table of Contents
So, to get more insights I conjured up some more complex trigger use cases and again measured transaction latencies on them for an extended period of time. Please read on for some extra info on the performed tests or just jump to the concluding results table at end of article.
This was the initial test that I ran for the original blog post - default pgbench transactions, with schema slightly modified to include 2 auditing columns for all tables being updated, doing 3 updates, 1 select, 1 insert (see here to see how the default transaction looks like) vs PL/PgSQL audit triggers on all 3 tables getting updates. The triggers will just set the last modification timestamp to current time and username to current user, if not already specified in the incoming row.
Results: 1.173ms vs 1.178ms i.e. <1% penalty for the version with triggers.
With multi statement transactions a lot of time is actually spent on communication over the network. To get rid of that the next test consisted of just a single update on the pgbench_accounts table (again 2 audit columns added to the schema). And then again the same with an PL/pgSQL auditing trigger enabled that sets the modification timestamp and username if left empty.
Results: 0.390ms vs 0.405ms ~ 4% penalty for the trigger version. Already a bit visible, but still quite dismissible I believe.
1 2 3 4 |
/* script file used for pgbench */ set aid random(1, 100000 * :scale) set delta random(-5000, 5000) UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid; |
But what it the above 4% performance degradation is not acceptable and it sums up if we are actually touching a dozen of tables (ca 60% hit)? Can we somehow shave off some microseconds?
Well one could try to write triggers in the Postgres native language of “C”! As well with optimizing normal functions it should help with triggers. But hughh, “C” you think...sounds daunting? Well...sure, it’s not going to be all fun and play, but there a quite a lot of examples actually included in the Postgres source code to get going, see here for example.
So after some tinkering around (I'm more of a Python / Go guy) I arrived at these numbers: 0.405ms for PL/pgSQL trigger vs 0.401ms for the “C” version meaning only ~ +1% speedup! So in short – absolutely not worth the time for such simple trigger functionality. But why so little speedup against an interpreted PL language you might wonder? Yes, PL/pgSQL is kind of an interpreted language, but with a good property that execution plans and resulting prepared statements actually stay cached within one session. So if we’d use pgbench in "re-connect" mode I’m pretty sure we’d see some very different numbers.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
... // audit field #1 - last_modified_on attnum = SPI_fnumber(tupdesc, 'last_modified_on'); if (attnum <= 0) ereport(ERROR, (errcode(ERRCODE_TRIGGERED_ACTION_EXCEPTION), errmsg('relation '%d' has no attribute '%s'', rel->rd_id, 'last_modified_on'))); valbuf = (char*)SPI_getvalue(rettuple, tupdesc, attnum); if (valbuf == NULL) { newval = GetCurrentTimestamp(); rettuple = heap_modify_tuple_by_cols(rettuple, tupdesc, 1, &attnum, &newval, &newnull); } ... |
See here for my full “C” code.
Here things get a bit incomparable actually as we’re adding some new data, which is not there in the “un-triggered” version. So basically I was doing from the trigger the same as the insert portion (into pgbench_history) from the default pgbench transaction. Important to note though - although were seeing some slowdown...it’s most probably still faster that doing that insert from the user transaction as we can space couple of network bytes + the parsing (in our default pgbench case statements are always re-parsed from text vs pl/pgsql code that are parsed only once (think “prepared statements”). By the way, to test how pgbench works with prepared statements (used mostly to test max IO throughput) set the “protocol” parameter to “prepared“.
Results - 0.390ms vs 0.436ms ~ 12%. Not too bad at all given we double the amount of data!
Here we basically double the amount of data written – all updated tables get a logging entry (including pgbench_accounts, which actually gets an insert already as part on normal transaction). Results - 1.173 vs 1.285 ~ 10%. Very tolerable penalties again – almost doubling the dataset here and only paying a fraction of the price! This again shows that actually the communication latency and transaction mechanics together with the costly but essential fsync during commit have more influence than a bit of extra data itself (given we don’t have tons of indexes on the data of course). For reference - full test script can be found here if you want to try it out yourself.
Use Case | Latencies (ms) | Penalty per TX (%) |
---|---|---|
Pgbench default vs with audit triggers for all 3 updated tables | 1.173 vs 1.178 | 0.4% |
Single table update (pgbench_accounts) vs with 1 audit trigger | 0.390 vs 0.405 | 3.9% |
Single table update (pgbench_accounts) vs with 1 audit trigger written in “C” | 0.390 vs 0.401 | 2.8% |
Single table update vs with 1 “insert logging” trigger | 0.390 vs 0.436 | 11.8% |
Pgbench default vs with 3 “insert logging” triggers on updated tables | 1.173 vs 1.285 | 9.6% |
Did you know that in Postgres one can also write DDL triggers so that you can capture/reject/log structural changes for all kinds of database objects? Most prominent use case might be checking for full table re-writes during business hours.
You need to load content from reCAPTCHA to submit the form. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from Facebook. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from X. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More Information
Leave a Reply