CYBERTEC PostgreSQL Logo

Autovacuum wraparound protection in PostgreSQL

08.2017 / Category: / Tags: |

Autovacuum is maybe one of the most beloved, but also one of the most misunderstood features in the PostgreSQL world. Many questions reaching our PostgreSQL support team are somehow related to autovacuum or VACUUM in general. So maybe it makes sense to use this opportunity to shine some light on VACUUM and a thing called “wraparound protection”.

The circle of time

Some people might have noticed that PostgreSQL sometimes starts an autovacuum worker process in a “special way”. Here is an example:

Sometimes autovacuum adds this “to prevent wraparound” notice to the process title. The main question now is: What is the point?

As you might know PostgreSQL uses a mechanism called MVCC (= MultiVersion Concurrency Control) to control transaction visibility. What does it mean? Here is a simple example:

User 1User 2Remarks
   
BEGIN;  
UPDATE tab

SET field = 2

WHERE field = 1;

BEGIN; 
COMMIT;SELECT * FROM tab;←User 2 still sees “1” because the first transaction has not committed in time
 COMMIT; 

 

transaction identifiers

Internally, this kind of visibility management is mostly done by comparing transaction IDs. Every version of every row gets tagged with the transaction ID that wrote it and every reading transaction takes note of which transactions it can see by taking a snapshot (if you are interested in how this works, check out tqual.c in the PostgreSQL source code). There is just one problem: Transaction IDs are finite, 4 billion in total. At some point they will wrap around, with preceding 2 billion values considered to be the past and the next 2 billion to be the future. What was once past becomes the future and the database will return incorrect results. PostgreSQL will stop accepting new transactions if this could happen and autovacuum will kick in to prevent things from getting to that point.

The meaning of autovacuum_freeze_max_age

Comparing transaction IDs to determine visibility works for a while but it is pretty expensive, as you have to go and look up from commit log if the transaction actually committed or not. To skip this check the first reader of the row will cache the result of this check in the row header as “hint bits”. Now any future readers only need to compare the transaction ID of the row to their snapshot to see if the row was added before or after the snapshot was taken.

A similar mechanism is used to “freeze” the row. If a vacuum process notices that a row was inserted before any currently running transaction it can use a special combination of the hint bits to tag the row as frozen - telling any future readers that this row was inserted a long time ago, no need to check the transaction ID. And if all rows are frozen before their transaction ID wraps around there is no chance of invalid results.

The solution?

This is where wraparound prevention vacuum comes in. Normally vacuum processes use something called a visibility map to only check pages that have dead or unchecked rows on them. When enough time has passed vacuum will scan the whole table, freeze anything it can and take note of the oldest unfrozen row, so it knows when to do this again the next time. The point when PostgreSQL runs the cleanup process is controlled autovacuum_freeze_max_age, by default 200.000.000 transactions. Under normal circumstances this leaves autovacuum plenty of time to avoid the fatal wraparound situation when a table is older than 2 billion transactions.

The same protection mechanism is used for something called MultiXact IDs. Usually MultiXact IDs are much slower moving than normal transcation IDs and so normally they don’t trigger wraparound vacuums but are worth mentioning still for completeness sake.

What this means to PostgreSQL administrators

There is a lot more to say about the inner workings of VACUUM / autovacuum but what does this all mean to administrators, who want to keep a system up and running?

The most important thing is: If you happen to see …

… don't panic. Autovacuum does its job as it is expected to do. In case autovacuum kicks in after 200 million transactions (or a bit more) there is no need to panic because a wraparound problem is still hundreds of millions of transactions away. Autovacuum will simply ensure that your table is fine and there is no need to take action to prevent things.

However, sometimes it makes sense to see, how far a table is still away from wraparound protection. The following query gives PostgreSQL administrators a tool to see, when autovacuum will initiate a wraparound VACUUM:

What can we see?

The query shows for each table, when the next autovacuum will start to freeze the relation. In my example we are still more than 199 million transactions away from wraparound protection.

A wraparound vacuum has to check the whole table to figure out the oldest non-frozen row. With huge tables and lots of transactions this can happen relatively often and cause lots of disk I/O. In PostgreSQL 9.6 a new bit was added to the visibility map that lets PostgreSQL know which tables are already frozen and don’t need to be touched anymore. This greatly speeds up wraparound vacuums for append-mostly tables, which most huge tables are.

On older versions you can reduce overhead by running closer to the wraparound limit. For example you could set autovacuum_freeze_max_age to 1.000.000.000, making the full scans happen 5x less often. More adventurous souls may want to go even closer to the limit, although the incremental gains are much smaller. If you do increase the value, monitor that autovacuum is actually keeping up so you don’t end up with downtime when your transaction rate outpaces autovacuum’s ability to freeze.

6 responses to “Autovacuum wraparound protection in PostgreSQL”

  1. Is this true that if I have say few 10-Billion row archive tables which are rarely updated (once a day) and a small table of 10K rows that runs hot transaction volume of say 10K tps, if postgres can't vacuum 10-billion table every 5 days I will run into transaction wraparound error?
    What if db can't keep up and vacuum archive tables in time?

  2. Working on Pg9.5, I had to modify your query to:

    SELECT
    oid::regclass::text AS table,
    age(relfrozenxid) AS xid_age,
    mxid_age(relminmxid) AS mxid_age,
    least(
    (SELECT setting::int
    FROM pg_settings
    WHERE name = 'autovacuum_freeze_max_age') - age(relfrozenxid),
    (SELECT setting::int
    FROM pg_settings
    WHERE name = 'autovacuum_multixact_freeze_max_age') - mxid_age(relminmxid)
    ) AS tx_before_wraparound_vacuum,
    pg_size_pretty(pg_total_relation_size(oid)) AS size,
    pg_stat_get_last_autovacuum_time(oid) AS last_autovacuum
    FROM pg_class
    WHERE not (relfrozenxid = 0)
    AND oid > 16384
    ORDER BY tx_before_wraparound_vacuum;

    In order to get a workaround from error:

    ERROR: operator does not exist: xid integer

    This version works on PG 9.5 --> 11beta3

  3. It might be worth mentioning that the “VACUUM to prevent wraparound” is performed even when autovacuum is disabled.

  4. Hi
    Thanks for detailed explanation and post.
    I have a question,
    Say mydb 'autovacuum_freeze_max_age' is 200M(default) and there is no dead tuples/transaction to vaccum.
    How to prevent to run "autovacuum: VACUUM public.x (to prevent wraparound)" in this case .
    How to determine my optimal parameter value ?

    autovacuum_freeze_max_age
    vacuum_freeze_min_age
    vacuum_freeze_table_age

    Regards,
    Raj

  5. One simple question though:

    Why is nobody changing the XID and all dependencies from 32 bit to 64 bit in the source code?

    That should definitely solve the problem of wraparounds under heavy load and potentially long running queues (think of data warehouse analytics), without having to worry about vacuum being able to keep the pace.

Leave a Reply

Your email address will not be published. Required fields are marked *

CYBERTEC Logo white
Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram