Whenever rows in a PostgreSQL table are updated or deleted, dead rows are left behind. VACUUM
gets rid of them so that the space can be reused. If a table doesn't get vacuumed, it will get bloated, which wastes disk space and slows down sequential table scans (and – to a smaller extent – index scans).
VACUUM also takes care of freezing table rows to avoid problems when the transaction ID counter wraps around, but that's a different story.
Normally you don't have to take care of all that, because the autovacuum daemon built into PostgreSQL does it for you. To find out more about enabling and disabling autovacuum, read this post.
If one of your tables becomes bloated, the first thing to check is whether autovacuum has processed it:
SELECT schemaname, relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_all_tables
ORDER BY n_dead_tup
    / (n_live_tup
       * current_setting('autovacuum_vacuum_scale_factor')::float8
       + current_setting('autovacuum_vacuum_threshold')::float8)
    DESC
LIMIT 10;
If your bloated table does not show up here, n_dead_tup is zero and last_autovacuum is NULL, you might have a problem with the statistics collector.
If the bloated table is right there on top, but last_autovacuum is NULL, you might need to configure autovacuum to be more aggressive so that it finishes the table.
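One way to make autovacuum more aggressive for a single table is to lower its per-table thresholds. A sketch (the table name vacme and the values are only examples):

```sql
-- Per-table autovacuum tuning (example values, example table name).
-- Trigger autovacuum when about 1% of the rows are dead instead of the
-- default 20%, and disable cost-based throttling for this table.
ALTER TABLE vacme SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_vacuum_cost_delay = 0
);
```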
But sometimes the result will look like this:
 schemaname |   relname    | n_live_tup | n_dead_tup |   last_autovacuum
------------+--------------+------------+------------+---------------------
 laurenz    | vacme        |      50000 |      50000 | 2018-02-22 13:20:16
 pg_catalog | pg_attribute |         42 |        165 |
 pg_catalog | pg_amop      |        871 |        162 |
 pg_catalog | pg_class     |          9 |         31 |
 pg_catalog | pg_type      |         17 |         27 |
 pg_catalog | pg_index     |          5 |         15 |
 pg_catalog | pg_depend    |       9162 |        471 |
 pg_catalog | pg_trigger   |          0 |         12 |
 pg_catalog | pg_proc      |        183 |         16 |
 pg_catalog | pg_shdepend  |          7 |          6 |
(10 rows)
Here autovacuum ran recently, but it didn't free the dead tuples!
We can verify the problem by running VACUUM (VERBOSE):

test=> VACUUM (VERBOSE) vacme;
INFO:  vacuuming "laurenz.vacme"
INFO:  "vacme": found 0 removable, 100000 nonremovable row versions in 443 out of 443 pages
DETAIL:  50000 dead row versions cannot be removed yet, oldest xmin: 22300
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s.
Why does VACUUM not remove the dead rows?

VACUUM only removes those row versions (also known as "tuples") that are not needed any more. A tuple is not needed if the transaction ID of the deleting transaction (as stored in the xmax system column) is older than the oldest transaction still active in the PostgreSQL database (or in the whole cluster, for shared tables).
This value (22300 in the VACUUM output above) is called the "xmin horizon".
There are four things that can hold back this xmin horizon in a PostgreSQL cluster:
Long-running transactions and VACUUM

You can find those transactions and their xmin value with the following query:
SELECT pid, datname, usename, state, backend_xmin, backend_xid
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
   OR backend_xid IS NOT NULL
ORDER BY greatest(age(backend_xmin), age(backend_xid)) DESC;
You can use the pg_terminate_backend() function to terminate the database session that is blocking your VACUUM.
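For example, assuming the query above showed a blocking session with pid 12345 (a hypothetical value):

```sql
-- Terminate the backend with pid 12345; any open transaction
-- in that session is rolled back.
SELECT pg_terminate_backend(12345);
```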
Abandoned replication slots and VACUUM

A replication slot is a data structure that keeps the PostgreSQL server from discarding information that is still needed by a standby server to catch up with the primary.
If replication is delayed or the standby server is down, the replication slot will prevent VACUUM from deleting old rows.
You can find all replication slots and their xmin value with this query:
SELECT slot_name, slot_type, database, xmin
FROM pg_replication_slots
ORDER BY age(xmin) DESC;
Use the pg_drop_replication_slot() function to drop replication slots that are no longer needed.
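For example, if the query above showed an abandoned slot called standby_slot (a placeholder name):

```sql
-- Drop an abandoned replication slot so VACUUM can advance the xmin horizon.
-- The slot must not be active, otherwise this call raises an error.
SELECT pg_drop_replication_slot('standby_slot');
```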
Note: This can only happen with physical replication if hot_standby_feedback = on. For logical replication there is a similar hazard, but it only affects system catalogs; examine the column catalog_xmin in that case.
Orphaned prepared transactions and VACUUM

During two-phase commit, a distributed transaction is first prepared with the PREPARE TRANSACTION statement and then committed with the COMMIT PREPARED statement.
Once PostgreSQL prepares a transaction, the transaction is kept "hanging around" until it is committed or aborted. It even has to survive a server restart! Normally, transactions don't remain in the prepared state for long, but sometimes things go wrong and the administrator has to remove a prepared transaction manually.
You can find all prepared transactions and their xmin value with the following query:
SELECT gid, prepared, owner, database, transaction AS xmin
FROM pg_prepared_xacts
ORDER BY age(transaction) DESC;
Use the ROLLBACK PREPARED SQL statement to remove prepared transactions.
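A sketch, assuming the query above returned an orphaned transaction with the gid 'my_tx' (a hypothetical identifier):

```sql
-- Roll back the orphaned prepared transaction identified by its gid.
ROLLBACK PREPARED 'my_tx';
```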
Standby server with hot_standby_feedback = on and VACUUM

Normally, the primary server in a streaming replication setup does not care about queries running on the standby server. Thus, VACUUM will happily remove dead rows that may still be needed by a long-running query on the standby, which can lead to replication conflicts. To reduce replication conflicts, you can set hot_standby_feedback = on on the standby server. Then the standby will keep the primary informed about its oldest open transaction, and VACUUM on the primary will not remove old row versions still needed on the standby.
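For example, you could enable it on the standby like this (a sketch; the parameter takes effect after a configuration reload, no restart needed):

```sql
-- Run on the standby server: report the oldest open snapshot to the primary.
ALTER SYSTEM SET hot_standby_feedback = on;
SELECT pg_reload_conf();
```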
To find out the xmin of all standby servers, you can run the following query on the primary server:
SELECT application_name, client_addr, backend_xmin
FROM pg_stat_replication
ORDER BY age(backend_xmin) DESC;
Read more about PostgreSQL table bloat and autocommit in my post here.
Hi Laurenz, Thanks for this post. It was helpful to us in identifying some issues related to auto-vacuum.
We are on postgresql 9.6 and when we run the vacuum verbose cmd, it does not show us the oldest xmin as seen in your output above. Wondering if you tried this on a newer version.
Regards
You are right, this came with commit 9eb344faf54 in PostgreSQL v10.
Hi Laurenz, thanks a lot for the post.
I got a bloated table because of the oldest xmin, but this xmin belongs to physical replication. How can I solve this problem without losing replication?
Best regards.
It's a bit unclear what your problem is, and it seems unrelated to the article, but perhaps you need to drop the replication slot.
Hi Laurenz,
I've got an issue with this auto-vacuum
[2020-08-05 16:45:17.157 07][][][][][431][XX001]ERROR: found xmin 2756976979 from before relfrozenxid 300006063
[2020-08-05 16:45:17.157 07][][][][][431][XX001]CONTEXT: automatic vacuum of table "template1.pg_catalog.pg_authid"
Could you please help me with this problem?
You have data corruption.
Dump and restore the database to a new PostgreSQL cluster. If you need more help, please contact sales@cybertec.at
Hello Laurenz,
on 'long-running transactions'. In isolation level 'read committed'. If I just do selects and keep the transaction open in between. Can that cause problems for vacuum? I guess not as it does not guarantee read consistency? Thank you!
You are right. What holds back VACUUM is open snapshots, and in READ COMMITTED isolation, each statement takes a new snapshot. If you look at the query I provide, you will see that it checks backend_xmin and backend_xid. You will see that your read-only READ COMMITTED transaction has both values set to NULL between queries.

Hi Laurenz,
we have a database where a vacuum on pg_largeobject does not remove the dead items.
I've checked the 4 reasons but I'm not sure if any of the reasons apply here. From my understanding, I would rather say that nothing applies.
Have I overlooked anything? And what else could it be?
VACUUM VERBOSE ANALYSE pg_largeobject;
INFO: vacuuming "pg_catalog.pg_largeobject"
INFO: table "pg_largeobject": index scan bypassed: 420500 pages from table (1.19% of total) have 690946 dead item identifiers
INFO: table "pg_largeobject": found 0 removable, 1816641 nonremovable row versions in 533548 out of 35474423 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 25410676
Skipped 0 pages due to buffer pins, 0 frozen pages.
CPU: user: 2.66 s, system: 1.43 s, elapsed: 4.29 s.
INFO: analyzing "pg_catalog.pg_largeobject"
INFO: "pg_largeobject": scanned 30000 of 35474423 pages, containing 106967 live rows and 587 dead rows; 30000 rows in sample, 126486420 estimated total rows
Please find the results of the queries enclosed:
1) Long-running transactions and VACUUM
pid | datname | usename | state | backend_xmin | backend_xid
--------+----------+----------+---------------------+--------------+-------------
235573 | ........ | ........ | idle in transaction | 25410676 |
236156 | ........ | ........ | idle in transaction | 25410676 |
237118 | ........ | ........ | idle in transaction | 25410676 |
240698 | ........ | ........ | idle in transaction | 25410676 |
242505 | ........ | postgres | active | 25410677 |
2) Abandoned replication slots and VACUUM
slot_name | slot_type | database | xmin
-----------+-----------+----------+------
stndb01 | physical | |
stndb02 | physical | |
(2 rows)
3) Orphaned prepared transactions and VACUUM
gid | prepared | owner | database | xmin
-----+----------+-------+----------+------
(0 rows)
4) Standby server with hot_standby_feedback = on and VACUUM
application_name | client_addr | backend_xmin
------------------+---------------+--------------
stndb01 | 172.16.24.24 |
stndb02 | 172.16.24.22 |
(2 rows)
FYI:
hot_standby = on
hot_standby_feedback = off (default)
Thanks in advance!
Best regards
Seems that the problem is the "idle in transaction" sessions...
Everything is fine, and VACUUM removed all dead rows. See the line "0 dead row versions cannot be removed yet", which is the same as "all dead row versions could be removed". The "nonremovable row versions" are the live rows in the table.

To actually shrink the table, you would need VACUUM (FULL).

Thanks for your fast response.
I just want to say thank you for this! I did not understand that long-lived transactions would block vacuuming across all tables (including those the transaction doesn't use), but this information helped me figure out what was going on. You are a saint for all the help you provide!