pg_squeeze - Shrinks tables better than VACUUM

pg_squeeze is an Open Source PostgreSQL extension that automatically fixes table bloat – without extensive table locking. The process works for you completely in the background.

Key Benefits

  • More aggressive space reduction
  • Close to lock-free table reorganization
  • Ability to move tables between tablespaces (without downtime)
  • Ability to cluster tables (without downtime)
  • Built-in advanced scheduling
  • Fully Open Source

pg_squeeze is not a replacement for autovacuum – it is an add-on that performs even more thorough cleanups.

PostgreSQL uses a mechanism called “MVCC” (Multi-Version Concurrency Control) to store data. As the name suggests, MVCC keeps multiple versions of a row to support as much concurrency as possible. At some point, the obsolete row versions must be removed from storage, and this is where VACUUM comes in.
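
To make the idea of accumulating row versions concrete, here is a minimal toy model in Python. It is only an illustration: the `ToyTable` class and its methods are hypothetical names invented for this sketch and do not reflect how PostgreSQL stores tuples internally.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Toy model of MVCC storage: changes leave obsolete row versions behind."""
    live: dict = field(default_factory=dict)  # key -> current value
    dead_versions: int = 0                    # obsolete versions still stored

    def insert(self, key, value):
        self.live[key] = value

    def update(self, key, value):
        # The old version is not overwritten; it lingers as a dead version.
        self.dead_versions += 1
        self.live[key] = value

    def delete(self, key):
        # A deleted row also lingers until cleanup.
        del self.live[key]
        self.dead_versions += 1

    def vacuum(self):
        """Remove dead versions and return how many were reclaimed."""
        reclaimed = self.dead_versions
        self.dead_versions = 0
        return reclaimed


table = ToyTable()
table.insert(1, "a")
table.update(1, "b")
table.update(1, "c")
print(table.dead_versions)  # 2
```

In PostgreSQL itself, a plain VACUUM mostly marks the reclaimed space as reusable instead of shrinking the table file, which is the gap a table rebuild closes.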

Unlike with the built-in commands “VACUUM FULL” or “CLUSTER”, with “pg_squeeze” there are no extended periods of full-table locking, and consequently reads and writes are not blocked during the rebuild. The rebuilding process is very efficient due to a novel approach of using transaction log files and logical decoding (instead of triggers) to capture possible data changes to the table being rebuilt. First of all, this helps save disk space and IO throughput, but even more importantly, it enables very short locking times, making it a perfect fit for mission-critical OLTP systems.
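
The rebuild approach described above can be sketched as a toy model: copy a snapshot of the table without holding a lock, then replay the changes that were captured in the meantime. The function name, the tuple format for changes, and the three-phase split are assumptions made for this illustration; the real extension works on PostgreSQL storage and decoded transaction log records, not on Python dictionaries.

```python
def rebuild_with_change_capture(source_rows, concurrent_changes):
    """Toy model: copy a snapshot, then replay changes captured meanwhile.

    source_rows: dict of key -> value, the table contents at snapshot time.
    concurrent_changes: list of (op, key, value) tuples captured during the copy.
    """
    # Phase 1: bulk copy of the snapshot into a fresh, compact table.
    new_table = dict(source_rows)

    # Phase 2: replay the captured changes in the order they were committed.
    for op, key, value in concurrent_changes:
        if op in ("insert", "update"):
            new_table[key] = value
        elif op == "delete":
            new_table.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")

    # Phase 3 (not modeled here): a short exclusive lock to replay the last
    # few changes and swap the new storage in place of the old one.
    return new_table


snapshot = {1: "alice", 2: "bob"}
changes = [("update", 1, "alicia"), ("insert", 3, "carol"), ("delete", 2, None)]
result = rebuild_with_change_capture(snapshot, changes)
print(result)  # {1: 'alicia', 3: 'carol'}
```

Because the changes are read from a log rather than recorded by triggers, writers to the original table do no extra work while the copy is running.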

How does pg_squeeze work?

  1. The extension is implemented as a background-worker process (a framework introduced in PostgreSQL 9.3) that periodically monitors user-defined tables.
  2. It detects when a table has exceeded the “bloat threshold”.
  3. It then kicks in and rebuilds the table automatically.
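
The three steps above can be sketched as a single monitoring pass. This is a simplified illustration, not the extension's code: `check_tables`, its arguments, and the input format (table name mapped to an estimated bloat percentage) are hypothetical.

```python
def check_tables(tables, bloat_threshold, rebuild):
    """One monitoring pass: rebuild every table whose bloat exceeds the threshold.

    tables: dict of table name -> estimated bloat in percent (hypothetical input).
    rebuild: callback invoked with the name of each table to rebuild.
    Returns the names of the tables that were rebuilt.
    """
    rebuilt = []
    for name, bloat_percent in tables.items():
        if bloat_percent > bloat_threshold:
            rebuild(name)
            rebuilt.append(name)
    return rebuilt


log = []
done = check_tables(
    {"orders": 62.0, "users": 8.5},
    bloat_threshold=50.0,
    rebuild=log.append,
)
print(done)  # ['orders']
```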

Rebuilding happens concurrently in the background with minimal storage and computational overhead, thanks to PostgreSQL’s logical decoding. pg_squeeze uses PostgreSQL’s built-in replication slots to extract any table changes that happen during the rebuild. The bloat threshold is configurable, and the bloat ratio is calculated from the Free Space Map (taking FILLFACTOR into account) or, under certain conditions, by the “pgstattuple” extension when it is available. Additionally, parameters such as “minimum table size” can be set to skip unsuitable tables. It is also possible to reorder the table by an index or to move the table or its indexes to a new tablespace.
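
As a rough illustration of how a bloat ratio might take FILLFACTOR and a minimum table size into account, consider the sketch below. The formula is an assumption chosen for clarity, not necessarily the calculation pg_squeeze performs, and the function and parameter names are hypothetical.

```python
def estimate_bloat_percent(total_bytes, free_bytes, fillfactor=100):
    """Estimate bloat as free space beyond what FILLFACTOR reserves on purpose.

    Assumed formula for illustration only; not taken from pg_squeeze.
    """
    if total_bytes <= 0:
        return 0.0
    if not 10 <= fillfactor <= 100:
        raise ValueError("fillfactor must be between 10 and 100")
    # Space deliberately left empty in each page is not counted as bloat.
    reserved = total_bytes * (100 - fillfactor) / 100
    excess_free = max(free_bytes - reserved, 0)
    return 100.0 * excess_free / total_bytes


def should_rebuild(total_bytes, free_bytes, fillfactor,
                   threshold_percent, min_size_bytes):
    """Skip tables below the minimum size, then compare bloat to the threshold."""
    if total_bytes < min_size_bytes:
        return False
    bloat = estimate_bloat_percent(total_bytes, free_bytes, fillfactor)
    return bloat > threshold_percent


print(estimate_bloat_percent(1_000_000, 600_000, fillfactor=90))  # 50.0
print(should_rebuild(1_000_000, 600_000, 90, 40, 100_000))        # True
print(should_rebuild(50_000, 40_000, 100, 40, 100_000))           # False
```

Subtracting the space reserved by FILLFACTOR keeps a table that was intentionally created with sparse pages from being flagged as bloated.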

pg_squeeze 1.3 Download

The newest version of pg_squeeze can be downloaded from github.com.

License

  • PostgreSQL License

FAQ

Q: Is it safe? What happens when power fails during a table rebuild?

A: Yes, it is safe because the rebuild happens within a single transaction: if the server goes down midway, the transaction is rolled back and the original table remains intact. In addition, a maximum lock time can be set to limit how long the final switch to the new table may take.

Q: How does it differ from “pg_repack”?

A: pg_squeeze is more resource-friendly because it does not use triggers. It also determines on its own which tables are bloated and does not require a separate command-line tool.

Q: What are the requirements for tables to be rebuilt?

A: Besides hitting the “bloat threshold”, the only hard-coded requirement is that the table has an identity key, that is, a primary key or a unique constraint.

Q: Is PostgreSQL version 14 supported?

A: Yes, with the newest update, it is.

Do you want to know more about pg_squeeze?

© 2024 CYBERTEC PostgreSQL International GmbH