CYBERTEC PostgreSQL Logo

Why do I have a slow COMMIT in PostgreSQL?

05.2024 / Category: / Tags: | |
slow commit woes: one burglar tries to crack a safe, while the other tells him to hurry up (since they are in a critical section)
© Laurenz Albe 2024

 

Sometimes one of our customers looks at the most time consuming statements in a database (either with pg_stat_statements or with pgBadger) and finds COMMIT in the high ranks. Normally, COMMIT is a very fast statement in PostgreSQL, so that is worth investigating. In this article, I will explore the possible reasons for a slow COMMIT and discuss what you can do about it.

The basic COMMIT activity in PostgreSQL

A slow COMMIT is a surprising observation, because committing a transaction is a very simple activity in PostgreSQL. In most cases, all a COMMIT has to do is

  • set the two bits for the transaction in the commit log to TRANSACTION_STATUS_COMMITTED (0b01) (persisted in pg_xact)
  • if track_commit_timestamp is set to on, record the commit timestamp (persisted in pg_commit_ts)
  • flush the write-ahead log (WAL) (persisted in pg_wal) to disk, unless synchronous_commit is set to off

Note that because of the multi-versioning architecture of PostgreSQL, both COMMIT and ROLLBACK are normally very fast operations: they both don't have to touch the tables, they only register the status of the transaction in the commit log.

The most frequent reason for slow COMMIT: disk problems

From the above it is clear that a potential cause of slowness is disk I/O. After all, flushing the WAL to disk causes I/O requests. So the first thing you should check is if the disk has a problem or is under load:

  • On Linux, you can use commands like “vmstat 1” or “sar -p 1” to measure the percentage of CPU time spent waiting for I/O (“wa” in vmstat and “%iowait” in sar). If that value is consistently above 10, you can be pretty certain that the I/O system is under stress.
  • With NAS, you should check if the TCP network is overloaded.
  • If the storage is a shared SAN or NAS, the disks may be shared wth other machines, and you should check if there is contention on the storage system.
  • Failing disks, other hardware problems or operating system problems can lead to intermittent performance problems. Check the kernel log for messages.

If we can rule out disk problems as the cause of slow commits, we have to investigate some more.

Reading the source to find out more

If we want to find out what happens during the commit of a transaction, the easiest approach is to read the source. Many people are not aware of the true power of open source: rather than having to guess or buy support from the software vendor, you can just see for yourself. The PostgreSQL source is well written and amply documented, and many parts don't need expert C skills to understand. The relevant code is in the function CommitTransaction() in src/backend/access/transam/xact.c.

There are several causes for a slow COMMIT in exceptional cases, like contention from many concurrent NOTIFY statements (a case you could diagnose from locks on database 0 in pg_locks). However, usually the culprit is one of the three cases described in the following sections.

Slow COMMIT caused by deferred constraints and triggers

Normally, PostgreSQL checks a constraint as part of the statement that modifies the constrained table. With deferred constraints, PostgreSQL waits with the check until the end of the transaction. One use case for that is if you insert data into tables with circular foreign key constraints:

The deferred foreign key makes it easy to create a new department:

Since PostgreSQL checks deferred constraints at commit time, they can slow down COMMIT processing. Ordinarily, checking a constraint is very fast — it is an index lookup. However, many such checks can accumulate in a larger transaction, and the collective execution time can slow down COMMIT considerably.

PostgreSQL also has constraint triggers that are deferrable and can slow down COMMIT just like deferred constraints. See this article for a further discussion of use cases for constraint triggers.

If you can identify deferred constraints or triggers as the cause of slow COMMIT, that may be just fine and no reason to worry. After all, you have to check those constraints at some point.

Slow COMMIT caused by cursors WITH HOLD

A cursor allows the client to fetch a query result set in chunks, which can simplify processing and avoids going out of memory on the client side. However, a regular cursor can only exist in the context of a database transaction. Consequently, you cannot use cursors when a user interaction is involved: the snapshot held by the cursor would stall the progress of VACUUM and lead to table bloat and worse problems. Moreover, ACCESS SHARE locks held for the duration of the transaction will cause problems with concurrent ALTER TABLE or TRUNCATE.

To avoid the limitation of being tied to a transaction, you can use a cursor WITH HOLD. Such a cursor can outlive the transaction that created it and is for example useful for implementing pagination. PostgreSQL implements WITH HOLD cursors by materializing the result set at commit time. If the query behind the cursor is expensive, that can make COMMIT painfully slow. Also, you must not forget to close such cursors when you are done, otherwise the materialized result set will hog server resources until the end of the database session.

If you can establish cursors WITH HOLD as the cause of your slow commits, you can improve the situation by tuning the queries from the cursor definition to run faster. Since PostgreSQL doesn't optimize queries in a cursor for speed of calculating the complete result set, it can sometimes help to set cursor_tuple_fraction to 1.0 to speed up commit processing.

Slow COMMIT caused by synchronous replication

Both streaming replication and logical replication are asynchronous by default. If you use replication for high availability, and you don't want to risk losing an already committed transaction, you can use synchronous replication. With synchronous replication, the order of operations at commit time is:

  • write the COMMIT record to WAL and flush it (this is the actual persistent commit)
  • wait until the synchronous standby reports that it has got all the WAL information
  • make the transaction visible on the primary server
  • report success to the client

If the network latency between primary and synchronous standby is high, COMMIT will take a long time. You can diagnose that by checking pg_stat_activity for frequent or long lasting “SyncRep” wait events. As a rule, you only want to use synchronous replication between machines with low network latency, that is, machines that are physically close together.

Slow COMMIT caused by third-party extensions

One of PostgreSQL's greatest features is its extensibility. It allows you to write extensions that interact with the guts of PostgreSQL without having to modify the server code. Third-party code can “hook” into PostgreSQL to modify its behavior. Among many other options, you can use the C function RegisterXactCallback() to register a callback that PostgreSQL executes at commit time. So if you look for the cause of a slow COMMIT, you should also have a look at the extensions installed in the database. For example, a foreign data wrapper that implements transaction processing at the remote data source may want to commit the remote transaction when PostgreSQL commits the local transaction. Then high network latency or slow transaction processing on the remote side will slow down the PostgreSQL COMMIT.

Conclusion

Looking at the source code, we can easily find the likely causes for slow COMMIT: apart from the obvious cause of disk problems, you should consider deferred constraints, cursors WITH HOLD and synchronous replication as likely causes.

Leave a Reply

Your email address will not be published. Required fields are marked *

CYBERTEC Logo white
CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0
office@cybertec.at

Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram