First, the big question – should we be using good old triggers at all? Well, I'm actually not going to recommend anything here, as it's an opinionated topic :) People well-versed in databases would probably see good use cases for them, whereas modern application developers would mostly say it's an awful practice – doing some “magic stuff”™ kind of secretly. Avoiding that holy war, let's say that we already need to use triggers – boss's orders. But then comes the question: should we be afraid of using triggers in PostgreSQL due to possible performance penalties? Should we plan to beef up the hardware or do some app optimizations beforehand? From my personal experience – no, mostly nothing will change in the big picture if triggers are “used moderately”. But let's try to generate some numbers, as that's where the truth lies...
All applications are generally unique, so the most critical part of a usable performance test boils down to setting up a more or less plausible use case. As Postgres comes bundled with the quick benchmarking tool pgbench, I usually take its schema as a baseline and make some modifications to it, based on the type of application the customer has. When deploying triggers, the most common use case is probably “auditing” – making sure at the database level that for all rows we store some data on the author of / reason for the latest change. So to simulate such basic auditing I decided to just add two audit columns to all pgbench tables receiving updates in the default transaction mode (3 of them): last_modified_on (as timestamp) and last_modified_by (as text).
...
ALTER TABLE pgbench_(accounts|branches|tellers)
  ADD COLUMN last_modified_on timestamptz,
  ADD COLUMN last_modified_by text;
...
CREATE FUNCTION trg_last_modified_audit() RETURNS TRIGGER AS
$$
BEGIN
  IF NEW.last_modified_by IS NULL THEN
    NEW.last_modified_by = session_user;
  END IF;
  IF NEW.last_modified_on IS NULL THEN
    NEW.last_modified_on = current_timestamp;
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;
...
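Note that the function above of course still has to be attached to the tables with CREATE TRIGGER – the exact statements are in the linked full test script, but a minimal sketch for one of the tables could look something like below (the trigger name and the BEFORE INSERT OR UPDATE timing are my assumptions here); the same would then be repeated for pgbench_branches and pgbench_tellers.

-- hypothetical trigger attachment; the actual statement is in the linked full script
CREATE TRIGGER pgbench_accounts_audit
  BEFORE INSERT OR UPDATE ON pgbench_accounts
  FOR EACH ROW EXECUTE PROCEDURE trg_last_modified_audit();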
Next, I booted up a moderately specced test machine (2 CPUs, 4 GB RAM, 48 GB SSD) on Linode (Ubuntu 18.04) and installed the latest Postgres (v10.4) binaries from the official Postgres project-managed repository, leaving all postgresql.conf settings at default except shared_buffers, which I set to 3GB. For the pgbench scaling factor I chose 100, giving us a ~1.3GB database (see here for more on how to choose those scale numbers), so that everything is basically cached and we can factor out most IO jitter – checkpoints, the background writer and autovacuum are still somewhat random of course, but they're typically there on real-life systems too, so I'm not sure removing them would be a good idea.
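For reference, the shared_buffers change was the only deviation from the defaults – whether done by editing postgresql.conf directly or, as sketched below, via ALTER SYSTEM (for this particular parameter a server restart is needed either way). The size check afterwards is just my suggestion for verifying that the scale-100 dataset indeed fits into memory.

-- the only non-default setting used for the test; requires a restart to take effect
ALTER SYSTEM SET shared_buffers = '3GB';
-- after "pgbench -i -s 100", verify that the dataset indeed fits comfortably into RAM
SELECT pg_size_pretty(pg_database_size(current_database()));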
NB! For carrying out the actual testing, I then compiled the latest (11devel) pgbench to make use of repeatable test cases (the new --random-seed parameter!), initialized the schema and ran the simplest possible pgbench test for the triggered/untriggered case for 2h with 3 loops each. Basically something like what you can see below (for the full script see here).
pgbench -i -s 100
pgbench -T 7200 --random-seed=2018
So the first test compared pgbench transactions “as is”, with just the 2 auditing columns added to the 3 tables receiving updates, vs. the “triggered” case, where each of those tables additionally had a trigger installed that sets the auditing timestamp/username. The average transaction latency results came back pretty much as expected: 1.173 ms vs 1.178 ms, i.e. a 0.4% difference – meaning basically no difference in average transaction latency at all for transactions where 3 simple data checking/filling triggers are executed in the background!
If you have a typical OLTP transaction touching a couple of tables, PL/pgSQL triggers containing just simple business logic can be used without further performance considerations!
Hmm...but how many simple triggers would you then need to see some noticeable runtime difference in our use case? Probably at least a dozen! In a typical short-lived OLTP transaction context we're still mostly IO bound (largely disk fsync speed) and, especially for multi-statement transactions, also network (round-trip) bound...so worrying about a few extra CPU cycles spent in triggers can be spared.
So what else could be tried to get an idea of the penalties resulting from triggers? First, we could make the transactions thinner by getting rid of the network round-trip latency – a single UPDATE on pgbench_accounts would be good for that in our case. Then we could also let the triggers insert some data into some other tables...but that's enough content for another blog post, I believe. See you soonish!
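PS To give a rough idea of what such a “thin” transaction could look like when expressed as a custom pgbench script (run with “pgbench -f ...”), here's a minimal sketch – the file name, the +1 delta and the randomization range are just my assumptions, mirroring what the default pgbench script does with the accounts table:

-- thin_update.sql (hypothetical): one single-row indexed UPDATE, no other statements
\set aid random(1, 100000 * :scale)
UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid = :aid;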
I presume you actually attached the trigger function to the table using “create trigger” but just forgot to add it to the code sample.

Yes, sure, just an excerpt here...the linked full script has it :)
"whereas modern application developers would mostly say it’s an awful practice – doing some “magic stuff”™ kind of secretly"
I find that many shun things like triggers / stored procs mainly because they often require you to write code in seemingly archaic languages like PL/SQL. plv8 and the like should make the language switch less painful. Modern developers, please do not immediately write off powerful techniques like triggers or stored procedures. We mostly try to prevent the problem of “shipping ores to Paris” by writing good queries (https://stackoverflow.com/a/7518619/535761), but sometimes queries are not sufficient. Sometimes you need a stored proc / db function.
I just gave this a try with the beta1 build from the pgdg yum repo, using an m4.large Amazon Linux EC2 instance with the pgbench client running on a separate machine. I did it while sitting at the SEAPUG meeting and listening to a really interesting presentation about monitoring and PostgreSQL... so I shortened the test a bit to just two 20-minute runs each, but FWIW, my timings were 2.604/2.601 ms average latency without triggers and 2.781/2.727 ms average latency with the plpgsql triggers. Still comes in around 1% – great result!