Did you ever wonder why VACUUM does not make your PostgreSQL tables smaller? Did you ever wonder why VACUUM does not shrink data files? Then this is the article you have been looking for. The point is: usually, space is not returned to the operating system after a cleanup, and it is important to understand why. People often make wrong assumptions about the inner workings of VACUUM, so it makes sense to dig deeper and understand what's really going on.
The following post reveals some of VACUUM's most useful secrets.
To understand VACUUM in PostgreSQL, it is important to first see how PostgreSQL handles row visibility. The entire concept is based on a set of hidden columns which are part of every row. Here's how it works:
test=# CREATE TABLE t_test (id int);
CREATE TABLE
test=# INSERT INTO t_test VALUES (5), (6), (7);
INSERT 0 3
test=# INSERT INTO t_test VALUES (8), (9), (10);
INSERT 0 3
The table we have just created consists of just one column, to keep things simple. Note that the data has been loaded using two separate transactions. Each transaction has inserted 3 rows, which is clearly visible in the hidden columns:
test=# SELECT ctid, xmin, xmax, cmin, cmax, * FROM t_test;
 ctid  | xmin | xmax | cmin | cmax | id
-------+------+------+------+------+----
 (0,1) |  764 |    0 |    0 |    0 |  5
 (0,2) |  764 |    0 |    0 |    0 |  6
 (0,3) |  764 |    0 |    0 |    0 |  7
 (0,4) |  765 |    0 |    0 |    0 |  8
 (0,5) |  765 |    0 |    0 |    0 |  9
 (0,6) |  765 |    0 |    0 |    0 | 10
(6 rows)
xmin, xmax, cmin and cmax are the hidden columns: xmin and xmax contain transaction IDs, while cmin and cmax hold command numbers within a transaction. As you can see, the first three rows were written by transaction number 764, while the rest of the data was created by transaction number 765.
These hidden columns are what makes visibility work: based on them, PostgreSQL will (often, but not always) determine whether a row can be seen by a certain transaction or not.
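You can watch these numbers being assigned yourself. Here is a minimal sketch using pg_current_xact_id(), which exists since PostgreSQL 13 (on older versions use txid_current()); note that calling it consumes a transaction ID, so if you replay the whole example verbatim, the XIDs in the later listings will shift accordingly:

test=# BEGIN;
BEGIN
test=*# SELECT pg_current_xact_id();  -- this XID becomes the xmin of rows we write
 pg_current_xact_id
--------------------
                766
(1 row)

test=*# INSERT INTO t_test VALUES (99);
INSERT 0 1
test=*# SELECT xmin, id FROM t_test WHERE id = 99;
 xmin | id
------+----
  766 | 99
(1 row)

test=*# ROLLBACK;
ROLLBACK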
Running UPDATE statements will show what's really going on here:
test=# BEGIN;
BEGIN
test=*# UPDATE t_test SET id = id * 10 WHERE id > 9 RETURNING *;
 id
-----
 100
(1 row)

UPDATE 1
test=*# SELECT ctid, xmin, xmax, cmin, cmax, * FROM t_test;
 ctid  | xmin | xmax | cmin | cmax | id
-------+------+------+------+------+-----
 (0,1) |  764 |    0 |    0 |    0 |   5
 (0,2) |  764 |    0 |    0 |    0 |   6
 (0,3) |  764 |    0 |    0 |    0 |   7
 (0,4) |  765 |    0 |    0 |    0 |   8
 (0,5) |  765 |    0 |    0 |    0 |   9
 (0,7) |  766 |    0 |    0 |    0 | 100
(6 rows)
One row has been changed. But let's focus on the ctid, which represents the physical position of the row on disk. Notice that (0,6) is gone, because PostgreSQL had to copy the row to a new position, (0,7). If we run a second UPDATE, the row will be copied again:
test=*# UPDATE t_test SET id = id * 10 WHERE id > 9 RETURNING *;
  id
------
 1000
(1 row)

UPDATE 1
test=*# UPDATE t_test SET id = id * 10 WHERE id > 9 RETURNING *;
  id
-------
 10000
(1 row)

UPDATE 1
Copying these rows is important because we have to keep the old version around; otherwise ROLLBACK would not work. Therefore the old version has to remain in the table.
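You can verify this with a second psql session. While the transaction above is still open, the second session keeps seeing the old row version at its old position (a sketch - the output depends on the exact moment you run it):

-- second session, while the first transaction is still uncommitted
test=# SELECT ctid, xmin, id FROM t_test WHERE id >= 9;
 ctid  | xmin | id
-------+------+----
 (0,5) |  765 |  9
 (0,6) |  765 | 10
(2 rows)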
Let's inspect the table once again:
test=*# SELECT ctid, xmin, xmax, cmin, cmax, * FROM t_test;
 ctid  | xmin | xmax | cmin | cmax |  id
-------+------+------+------+------+-------
 (0,1) |  764 |    0 |    0 |    0 |     5
 (0,2) |  764 |    0 |    0 |    0 |     6
 (0,3) |  764 |    0 |    0 |    0 |     7
 (0,4) |  765 |    0 |    0 |    0 |     8
 (0,5) |  765 |    0 |    0 |    0 |     9
 (0,9) |  766 |    0 |    2 |    2 | 10000
(6 rows)

test=*# COMMIT;
COMMIT
Everything between (0,5) and (0,9) - that is, positions (0,6), (0,7) and (0,8) - is now occupied by dead row versions which have to be removed.
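If you want to look at those dead row versions directly, the pageinspect extension (a contrib module shipped with PostgreSQL; superuser required) can list every item in the page, including the ones that no plain SELECT returns anymore. A minimal sketch:

test=# CREATE EXTENSION pageinspect;
CREATE EXTENSION
test=# SELECT lp, t_xmin, t_xmax, t_ctid
       FROM heap_page_items(get_raw_page('t_test', 0));

Items 6, 7 and 8 will show up with a non-zero t_xmax, even though they are invisible to every transaction.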
Keep in mind that COMMIT cannot remove dead rows either - concurrent transactions might still need to see the old versions. Therefore the cleanup process has to happen asynchronously, and that is exactly what VACUUM does. Let's run it and see what happens:
test=# VACUUM VERBOSE t_test;
INFO:  vacuuming "test.public.t_test"
INFO:  finished vacuuming "test.public.t_test": index scans: 0
pages: 0 removed, 1 remain, 1 scanned (100.00% of total)
tuples: 3 removed, 6 remain, 0 are dead but not yet removable
removable cutoff: 767, which was 0 XIDs old when operation ended
new relfrozenxid: 764, which is 1 XIDs ahead of previous value
index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed
avg read rate: 22.790 MB/s, avg write rate: 27.348 MB/s
buffer usage: 6 hits, 5 misses, 6 dirtied
WAL usage: 3 records, 3 full page images, 14224 bytes
system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
VACUUM
VACUUM actively looks for rows which are not seen by anyone anymore. Those rows can be somewhere in the middle of the data file. What VACUUM does is allow PostgreSQL to reuse that space - however, it does not return that space to the operating system. It can't do that: if you have a data file that is 1 GB in size, you can't simply return “the middle of the file” to the operating system in case it is empty - there is no file system operation which supports that. Instead, PostgreSQL has to remember this free space and reuse it later.
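If you want to measure how much reusable space a table holds, the pgstattuple extension (also a contrib module shipped with PostgreSQL) reports free space and dead tuples. A minimal sketch - the numbers depend on your table:

test=# CREATE EXTENSION pgstattuple;
CREATE EXTENSION
test=# SELECT free_space, free_percent, dead_tuple_count
       FROM pgstattuple('t_test');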
However, there is an exception to the rule. Consider the following code snippet:
test=# SELECT pg_relation_size('t_test');
 pg_relation_size
------------------
             8192
(1 row)

test=# DELETE FROM t_test;
DELETE 6
test=# SELECT pg_relation_size('t_test');
 pg_relation_size
------------------
             8192
(1 row)
The table retains its size even after the DELETE statement. Remember: cleanup is done asynchronously. Therefore VACUUM has to be run to clean up those dead rows:
test=# VACUUM t_test;
VACUUM
This is a bit of a special case. The rule is: if, from a certain position in the table all the way to its end, ALL rows are dead, VACUUM can truncate the table. That's exactly what happened in this example:
test=# SELECT pg_relation_size('t_test');
 pg_relation_size
------------------
                0
(1 row)
In real life, however, large tables almost always contain at least a handful of live rows towards the end of the data file, so under normal circumstances this truncation does not happen. For that reason, don't count on VACUUM to shrink tables.
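It is easy to reproduce this: a single live row at the end of the data file is enough to prevent truncation. A sketch - the exact size you see may vary:

test=# INSERT INTO t_test SELECT generate_series(1, 100000);
INSERT 0 100000
test=# DELETE FROM t_test WHERE id <> 100000;  -- keep only the very last row
DELETE 99999
test=# VACUUM t_test;
VACUUM
test=# SELECT pg_relation_size('t_test');  -- still a few MB, not a single page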
One way to battle table bloat (the technical term for tables growing out of proportion) is to use VACUUM FULL. However, the problem is that VACUUM FULL needs an exclusive lock on the table, which can be a real problem in a business-critical environment.
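To see the difference, here is what VACUUM FULL does to the bloated table from the experiment above: it rewrites the table into a new data file, so the size really does come down - at the price of that exclusive lock for the duration of the rewrite:

test=# VACUUM FULL t_test;
VACUUM
test=# SELECT pg_relation_size('t_test');
 pg_relation_size
------------------
             8192
(1 row)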
That is why we have developed a method to reorganize large tables without extensive locking. The solution is called pg_squeeze. Download it for free from GitHub or our website.
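For illustration, a manual run looks roughly like the following sketch; the exact setup and function signature may differ between versions, so check the pg_squeeze README (among other things, the library has to be added to shared_preload_libraries before the extension can be created):

test=# CREATE EXTENSION pg_squeeze;
CREATE EXTENSION
test=# SELECT squeeze.squeeze_table('public', 't_test', null);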
To find out more, you can watch my YouTube video about VACUUM: https://www.youtube.com/watch?v=lhIUoEsRlZg
If your VACUUM process won't remove dead rows and you don't know why, check out Laurenz Albe's blog post, 4 Reasons Why VACUUM Won't Remove Dead Rows. In case you are looking to dig deeper, check out our other blog posts related to VACUUM.
To get important tips about how disabling AUTOCOMMIT affects autovacuum maintenance, read Laurenz Albe's blog post, Disabling Autocommit in PostgreSQL Can Damage Your Health.
One thing that isn't clear to me is how much free space overhead is required with pg_squeeze. Do I need to store an entire duplicate copy of the table and its indexes during processing?
Yes, just like with VACUUM (FULL).