(Last updated on 2023-01-24) Recently, while troubleshooting PostgreSQL performance issues, I twice ran into problems with subtransactions. So I thought this was a nice topic for another blog post.
Everybody knows database transactions. In PostgreSQL, which operates in autocommit mode, you have to start a transaction that spans multiple statements explicitly with BEGIN or START TRANSACTION and close it with END or COMMIT. If you abort the transaction with ROLLBACK (or end the database session without committing), all work done inside the transaction is undone.
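A minimal illustration, using a hypothetical table important_data:

BEGIN;
DELETE FROM important_data;  -- all rows are gone, but only inside this transaction
ROLLBACK;                    -- the DELETE is undone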
Now subtransactions allow you to roll back part of the work done in a transaction. You start a subtransaction inside a transaction with the standard SQL statement:
SAVEPOINT name;
“name” is an identifier (no single quotes!) for the subtransaction. You cannot commit a subtransaction in SQL (it is automatically committed with the transaction that contains it), but you can roll it back with:
ROLLBACK TO SAVEPOINT name;
Subtransactions are useful in longer transactions. In PostgreSQL, any error inside a transaction will abort the transaction:
test=> BEGIN;
BEGIN
test=*> SELECT 'Some work is done';
     ?column?
-------------------
 Some work is done
(1 row)

test=*> SELECT 12 / (factorial(0) - 1);
ERROR:  division by zero
test=!> SELECT 'try to do more work';
ERROR:  current transaction is aborted, commands ignored until end of transaction block
test=!> COMMIT;
ROLLBACK
With a transaction that does a lot of work this is quite annoying, because it means that all work done so far is lost. Subtransactions can help you recover from such a situation:
test=> BEGIN;
BEGIN
test=*> SELECT 'Some work is done';
     ?column?
-------------------
 Some work is done
(1 row)

test=*> SAVEPOINT a;
SAVEPOINT
test=*> SELECT 12 / (factorial(0) - 1);
ERROR:  division by zero
test=!> ROLLBACK TO SAVEPOINT a;
ROLLBACK
test=*> SELECT 'try to do more work';
      ?column?
---------------------
 try to do more work
(1 row)

test=*> COMMIT;
COMMIT
Note that ROLLBACK TO SAVEPOINT starts another subtransaction called a when it rolls back the old one.
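That is why you can roll back to the same savepoint more than once; a sketch:

SAVEPOINT a;
-- ... some work that fails ...
ROLLBACK TO SAVEPOINT a;  -- discards the subtransaction and starts a new one, again called "a"
-- ... more work that fails ...
ROLLBACK TO SAVEPOINT a;  -- works, because the savepoint still exists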
Even if you have never used the SAVEPOINT statement, you may already have encountered subtransactions. Written in PL/pgSQL, the code in the previous section looks like this:
BEGIN
   PERFORM 'Some work is done';

   BEGIN  -- a block inside a block
      PERFORM 12 / (factorial(0) - 1);
   EXCEPTION
      WHEN division_by_zero THEN
         NULL;  -- ignore the error
   END;

   PERFORM 'try to do more work';
END;
Every time you enter a block with an EXCEPTION clause, a new subtransaction is started. The subtransaction is committed when you leave the block, and rolled back when you enter the exception handler.
Many other databases handle errors inside a transaction differently. Rather than aborting the complete transaction, they roll back only the statement that caused the error, leaving the transaction itself active.
When migrating or porting from such a database to PostgreSQL, you might be tempted to wrap every single statement in a subtransaction to emulate the above behavior.
The PostgreSQL JDBC driver even has a connection parameter “autosave” that you can set to “always” to automatically set a savepoint before each statement and roll back in case of failure.
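For reference, the parameter goes into the JDBC connection URL; host, port and database name here are placeholders:

jdbc:postgresql://localhost:5432/test?autosave=always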
As the following will show, this alluring technique will lead to serious performance problems.
To demonstrate the problems caused by overuse of subtransactions, I created a test table:
CREATE UNLOGGED TABLE contend (
   id  integer PRIMARY KEY,
   val integer NOT NULL
) WITH (fillfactor = 50);

INSERT INTO contend (id, val)
SELECT i, 0
FROM generate_series(1, 10000) AS i;

VACUUM (ANALYZE) contend;
The table is small, unlogged and has a low fillfactor to reduce the required I/O as much as possible. This way, I can observe the effects of subtransactions better.
I'll use pgbench, the benchmarking tool shipped with PostgreSQL, to run the following custom SQL script:
BEGIN;

PREPARE sel(integer) AS SELECT count(*) FROM contend WHERE id BETWEEN $1 AND $1 + 100;

PREPARE upd(integer) AS UPDATE contend SET val = val + 1 WHERE id IN ($1, $1 + 10, $1 + 20, $1 + 30);

SAVEPOINT a;
\set rnd random(1, 990)
EXECUTE sel(10 * :rnd + :client_id + 1);
EXECUTE upd(10 * :rnd + :client_id);

SAVEPOINT a;
\set rnd random(1, 990)
EXECUTE sel(10 * :rnd + :client_id + 1);
EXECUTE upd(10 * :rnd + :client_id);

...

SAVEPOINT a;
\set rnd random(1, 990)
EXECUTE sel(10 * :rnd + :client_id + 1);
EXECUTE upd(10 * :rnd + :client_id);

DEALLOCATE ALL;
COMMIT;
The script will set 60 savepoints for test number 1, and 90 for test number 2. It uses prepared statements to minimize the overhead of query parsing.
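Since the script is repetitive, it is convenient to generate it; a small shell sketch (the file name subtrans.sql and the count of 60 are assumptions you would adjust per test):

{
   printf '%s\n' 'BEGIN;' \
      'PREPARE sel(integer) AS SELECT count(*) FROM contend WHERE id BETWEEN $1 AND $1 + 100;' \
      'PREPARE upd(integer) AS UPDATE contend SET val = val + 1 WHERE id IN ($1, $1 + 10, $1 + 20, $1 + 30);'
   for i in $(seq 1 60)   # 60 savepoints for test 1, 90 for test 2
   do
      printf '%s\n' 'SAVEPOINT a;' \
         '\set rnd random(1, 990)' \
         'EXECUTE sel(10 * :rnd + :client_id + 1);' \
         'EXECUTE upd(10 * :rnd + :client_id);'
   done
   printf '%s\n' 'DEALLOCATE ALL;' 'COMMIT;'
} > subtrans.sql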
pgbench will replace :client_id with a number unique to the database session. So as long as there are no more than 10 clients, each client's UPDATEs won't collide with those of other clients, but the clients will SELECT each other's rows.
Since my machine has 8 cores, I'll run the tests with 6 concurrent clients for ten minutes.
To see meaningful information with “perf top”, you need the PostgreSQL debugging symbols installed. This is particularly recommended on production systems.
Here is the result for test 1 (60 subtransactions per transaction):

pgbench -f subtrans.sql -n -c 6 -T 600
transaction type: subtrans.sql
scaling factor: 1
query mode: simple
number of clients: 6
number of threads: 1
duration: 600 s
number of transactions actually processed: 100434
latency average = 35.846 ms
tps = 167.382164 (including connections establishing)
tps = 167.383187 (excluding connections establishing)
This is what “perf top --no-children --call-graph=fp --dsos=/usr/pgsql-12/bin/postgres” shows while the test is running:
+    1.86%  [.] tbm_iterate
+    1.77%  [.] hash_search_with_hash_value
     1.75%  [.] AllocSetAlloc
+    1.36%  [.] pg_qsort
+    1.12%  [.] base_yyparse
+    1.10%  [.] TransactionIdIsCurrentTransactionId
+    0.96%  [.] heap_hot_search_buffer
+    0.96%  [.] LWLockAttemptLock
+    0.85%  [.] HeapTupleSatisfiesVisibility
+    0.82%  [.] heap_page_prune
+    0.81%  [.] ExecInterpExpr
+    0.80%  [.] SearchCatCache1
+    0.79%  [.] BitmapHeapNext
+    0.64%  [.] LWLockRelease
+    0.62%  [.] MemoryContextAllocZeroAligned
+    0.55%  [.] _bt_checkkeys
     0.54%  [.] hash_any
+    0.52%  [.] _bt_compare
     0.51%  [.] ExecScan
And here is the result for test 2 (90 subtransactions per transaction):

pgbench -f subtrans.sql -n -c 6 -T 600
transaction type: subtrans.sql
scaling factor: 1
query mode: simple
number of clients: 6
number of threads: 1
duration: 600 s
number of transactions actually processed: 41400
latency average = 86.965 ms
tps = 68.993634 (including connections establishing)
tps = 68.993993 (excluding connections establishing)
This is what “perf top --no-children --call-graph=fp --dsos=/usr/pgsql-12/bin/postgres” has to say:
+   10.59%  [.] LWLockAttemptLock
+    7.12%  [.] LWLockRelease
+    2.70%  [.] LWLockAcquire
+    2.40%  [.] SimpleLruReadPage_ReadOnly
+    1.30%  [.] TransactionIdIsCurrentTransactionId
+    1.26%  [.] tbm_iterate
+    1.22%  [.] hash_search_with_hash_value
+    1.08%  [.] AllocSetAlloc
+    0.77%  [.] heap_hot_search_buffer
+    0.72%  [.] pg_qsort
+    0.72%  [.] base_yyparse
+    0.66%  [.] SubTransGetParent
+    0.62%  [.] HeapTupleSatisfiesVisibility
+    0.54%  [.] ExecInterpExpr
+    0.51%  [.] SearchCatCache1
Even if we take into account that the transactions in test 2 are one and a half times as long, this is still a performance regression of 60% compared with test 1.
To understand what is going on, we have to understand how transactions and subtransactions are implemented.
Whenever a transaction or subtransaction modifies data, it is assigned a transaction ID. PostgreSQL keeps track of these transaction IDs in the commit log, which is persisted in the pg_xact subdirectory of the data directory.
However, there are some differences between transactions and subtransactions: the information about which (sub)transaction is the parent of a given subtransaction is persisted in the pg_subtrans subdirectory of the data directory. Since this information becomes obsolete as soon as the containing transaction has ended, these data do not have to be preserved across a shutdown or crash.
The visibility of a row version (“tuple”) in PostgreSQL is determined by the xmin and xmax system columns, which contain the transaction IDs of the creating and destroying transactions. If the stored transaction ID is that of a subtransaction, PostgreSQL also has to consult the state of the containing (sub)transaction to determine whether the transaction ID is valid.
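You can watch this from SQL; a small sketch, using a hypothetical scratch table:

BEGIN;
CREATE TABLE subxact_demo (x integer);  -- hypothetical scratch table
INSERT INTO subxact_demo VALUES (1);    -- created by the top-level transaction
SAVEPOINT a;
INSERT INTO subxact_demo VALUES (2);    -- created by the subtransaction
SELECT xmin, x FROM subxact_demo;       -- shows two different creating transaction IDs
COMMIT;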
To determine which tuples a statement can see, PostgreSQL takes a snapshot of the database at the beginning of the statement (or the transaction). Such a snapshot consists of:

- xmin: the lowest transaction ID that was still active; all transactions with a lower ID are either committed or rolled back
- xmax: one past the highest completed transaction ID; all transactions with a higher or equal ID had not yet completed when the snapshot was taken
- the list of the IDs of all transactions (and subtransactions) that were in progress when the snapshot was taken
A snapshot is initialized by looking at the process array, which is stored in shared memory and contains information about all currently running backends. This, of course, contains the current transaction ID of the backend and has room for at most 64 non-aborted subtransactions per session. If there are more than 64 such subtransactions, the snapshot is marked as suboverflowed.
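A hedged sketch of how to provoke that condition: each exception block that modifies data gets its own transaction ID, so a loop like the following overflows the cache.

BEGIN;

DO $$
BEGIN
   FOR i IN 1..70 LOOP   -- more than 64 subtransactions
      BEGIN              -- each block with an EXCEPTION clause is a subtransaction
         UPDATE contend SET val = val WHERE id = i;   -- assigns the subtransaction an ID
      EXCEPTION
         WHEN OTHERS THEN NULL;
      END;
   END LOOP;
END;$$;

-- while this transaction remains open, snapshots taken by
-- other sessions are marked "suboverflowed"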
A suboverflowed snapshot does not contain all the data required to determine visibility, so PostgreSQL will occasionally have to resort to pg_subtrans. These pages are cached in shared buffers, but you can see the overhead of looking them up in the high rank of SimpleLruReadPage_ReadOnly in the perf output. Other transactions have to update pg_subtrans to register subtransactions, and you can see in the perf output how they vie for lightweight locks with the readers.
Apart from looking at “perf top”, there are other symptoms that point in the direction of this problem (sample queries for the first two items follow this list):

- you see the wait event “SubtransSLRU” (“SubtransControlLock” in older releases) in pg_stat_activity
- from PostgreSQL v13 on, you can query pg_stat_slru and check if blks_read in the row with name = 'Subtrans' keeps growing; that indicates that PostgreSQL has to read disk pages because it needs to access subtransactions that are no longer cached
- if you export a snapshot with the “pg_export_snapshot()” function, the resulting file in the pg_snapshots subdirectory of the data directory will contain the line “sof:1” to indicate that the subtransaction array overflowed
- from PostgreSQL v16 on, you can use the function pg_stat_get_backend_subxact(integer) to return the number of subtransactions for a backend process and whether the subtransaction cache overflowed or not
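A minimal sketch of the first two checks, using the views mentioned above:

-- look for sessions waiting on the subtransaction SLRU cache
SELECT pid, wait_event_type, wait_event
FROM pg_stat_activity
WHERE wait_event = 'SubtransSLRU';

-- check whether pg_subtrans lookups have to read from disk
SELECT name, blks_hit, blks_read
FROM pg_stat_slru
WHERE name = 'Subtrans';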
Subtransactions are a great tool, but you should use them wisely. If you need concurrency, don't start more than 64 subtransactions per transaction.
The diagnostics shown in this article should help you determine whether or not you are suffering from the problem.
Finding the cause of that problem can be tricky. For example, it may not be obvious that a function with an exception handler that you call for each result row of an SQL statement (perhaps in a trigger) starts a new subtransaction.
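For illustration, a hypothetical trigger like the following silently starts a subtransaction for every row it processes (all names are made up):

CREATE FUNCTION log_update() RETURNS trigger
   LANGUAGE plpgsql AS
$$BEGIN
   BEGIN  -- this block starts a subtransaction for every row processed
      INSERT INTO audit_log (tab, row_id) VALUES (TG_TABLE_NAME, NEW.id);
   EXCEPTION
      WHEN OTHERS THEN
         NULL;  -- ignore logging failures
   END;
   RETURN NEW;
END;$$;

CREATE TRIGGER log_update AFTER UPDATE ON contend
   FOR EACH ROW EXECUTE FUNCTION log_update();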