BY Kaarel Moppel - Having recently witnessed quite an ingenious hack to implement some good old "println" style development debugging from stored procedures into a file, it prompted me to post knowledge to the Interwebs on two other ways how such a goal can be implemented more transparently. Also, with help of some other good old legacy technology in one case. By the way, the main reason for them going for the hack was that being relatively new to databases they didn't know how to name this thing that they were implementing, so Google wasn't able to help – once again proof that naming things is one of the hardest problems of computing 🙂
Table of Contents
But to start with let's explain what an "autonomous transaction" is, since most people who work with databases probably haven't needed / heard of them and actually they're not a standard or anything. They are not too well-supported by various database engines. The phrase itself comes from the Oracle world, I believe, and it basically denotes "fire and forget" (sub)-transactions that are not connected to the main transactions. And sadly, also Postgres does not have direct built-in support for them – you're always in a real transaction and need to invent a bit, if the need arises.
What would be the common use cases? Mostly some logging / auditing / progress tracking into tables, in such a way that the information on the attempt would persist even when the main transaction is rolled back - for example, due to an error. Remember - in a standard transaction, everything is thrown away in case no special measures (savepoints or exception handling sub-blocks in stored procedures) are taken.
So what did the initial implementation that wowed me a bit look like?
1 2 3 4 5 6 7 8 9 |
CREATE FUNCTION public.log(appid text, msg text) RETURNS void LANGUAGE plpgsql SECURITY DEFINER AS $function$ BEGIN execute $ copy (select $ || quote_literal(msg) || $) to program 'tee -a /tmp/$ ||appid|| $.log' $; END; $function$; |
Not bad - it works and is relatively short and simple, basically a one-liner. But there are some issues:
In short it could be improved a bit.
The first and probably most common alternative would be to use the old "dblink" functionality. The feature has been there since ages ago, but as there is basically now something better with Postgres Foreign Data Wrappers (FDW) for most use cases (where there's a fixed set of remote tables), I'm kind of considering it a legacy technology... however, it's still a good fit for this "fire and forget" purpose. To learn more about "dblink" (with random / ad-hoc queries on remote Postgres databases being the main use case) please look at the documentation here.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
-- set up the 'logging sink' table CREATE TABLE log( msg text not null, sender_ts timestamptz not null, local_ts timestamptz default now() ); CREATE EXTENSION dblink; -- dblink is a 'contrib' extension -- define the logger function. -- Note the use of clock_timestamp() -- transactions freeze the 'now()' timestamp CREATE FUNCTION log_dblink(msg text) RETURNS void LANGUAGE sql AS $function$ select dblink('host=/var/run/postgresql port=5432 user=postgres dbname=postgres', format('insert into log select %L, %L', msg, clock_timestamp()::text)) $function$; |
The advantages are that now no superuser is needed for the use of dblink (the installation of the extension still needs it) once a password is specified in the connect string, so basically anyone can use it. Also, no physical access is needed – data is stored in a normal table and available via SQL.
The second approach would be something more complex again, but not so hacky I would say: Stored procedures in PL/Python and opening an explicit external transaction! The code (assuming PL/Python packages and the "psycopg2" Python driver have been installed already) would then look something like that:
1 2 3 4 5 6 7 8 9 10 11 12 13 |
CREATE EXTENSION plpythonu; CREATE FUNCTION public.log_python(msg text) RETURNS void LANGUAGE plpythonu AS $function$ import psycopg2 from datetime import datetime conn = psycopg2.connect(host='/var/run/postgresql', port=5432) conn.autocommit = True cur = conn.cursor() cur.execute('INSERT INTO log select %s, %s', (msg, datetime.now())) $function$; |
Although this is semantically cleaner to read, the downside here is that PL/Python still requires more setup and full superuser access (that's becoming increasingly rare). But on the positive side, it's surprisingly quite a few milliseconds faster than the "dblink"! Here's a tip - to gain back on speed of "dblink", one could actually use PgBouncer to reduce connection initialization time - the toll on a single simple operation should not be underestimated! According to my quick tests, PgBouncer removed those extra milliseconds and things were on par.
To close it off - sadly, as we saw, there are no really perfect solutions for autonomous transactions, but "dblink" would be my preferred one. An idea though for a better solution – it would be really cool if the Postgres FDW would actually have some option to define such autonomous "fire and forget" or "out of transaction" servers / tables that ignore the main transaction, so that a simple INSERT would suffice. Let's hope something like that will pop up one day.
+43 (0) 2622 93022-0
office@cybertec.at
You are currently viewing a placeholder content from Facebook. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from X. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More Information
As part of a large Oracle migration, we built pg_jobmon, a complete framework for just this kind of thing. Check out https://github.com/omniti-labs/pg_jobmon & feel free to reach out if you have questions / need help.
The odd implementation of the
log
function does something different than the other twos: it logs to a file, while your proposed solutions log to a table (even if a remote one).In this context, I think also a foreign data wrapper could help .
Also it is a little awkward the dollar quoting and its nesting in the first example!
Good blog post. love to see different options and workaround using pgbouncer 🙂
Another blog post by Yaser mentioned about overhead of dblink as "The PostgreSQL dblink is session-specific. Any dblink opened in one session cannot be used in a different session. This means that each session will have to open a new DB connection and this increases
response time." Reference : https://aws.amazon.com/blogs/database/migrating-oracle-autonomous-transactions-to-postgresql/