pg_resetwal (formerly known as pg_resetxlog
) is a command provided by PostgreSQL which allows you to reset the WAL (Write Ahead Log) in case of trouble. It is a means of last resort to get a broken database server to start. However, what happens behind the scenes when this command is called? Is it dangerous? Required? Useful or just a “nice to have” kind of thing? Let’s dive in and figure it out.
Table of Contents
Before discussing pg_resetwal
, we need to talk about the purpose of WAL in the first place:
Let’s consider this basic transaction:
1 2 3 4 |
BEGIN; INSERT INTO a VALUES (1); INSERT INTO a VALUES (2); … |
In case the database crashes while inserting the second row, we are in trouble. Our data might only be half written (e.g. only half of an 8k block, missing index entries, etc.) which causes a problem, since our data files would face instant corruption (just like MyISAM got corrupted in the old days). However, this is PostgreSQL and the goal is to protect data at all costs. Therefore PostgreSQL writes redo information to the WAL first, and uses it to fix the data files (= remove corruption) after a crash.
Note: During normal operations, data files are not supposed to be up to date, consistent and correct. Yes, you have just read this correctly: In case your database is under load your data files are usually inconsistent. If the WAL gets lost or corrupted, your database is (in case of a crash) simply broken, incomplete and inconsistent.
The rule is:
Your data files alone are not sufficient to ensure consistency.
pg_resetwal
and riskThis is where the problem starts: If your WAL is so corrupted that you have to reset it, it usually means data loss or corruption because, remember, data files are inconsistent during normal operations. What pg_resetwal
does is to remove the “instructions to fix those data files after a restart”.
In reality, this leads to some very important rules:
pg_resetwal
pg_resetwal
carelesslyKeep this warning in mind at all times, otherwise you put your data at risk!
This does not mean that pg_resetwal
is always bad - it just means that you need to thoroughly understand it before you use it.
pg_resetwal
and how to use it correctlyAfter this introduction, it’s time to take a look at the syntax of the pg_resetwal
command:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
postgres_support:~ hs$ pg_resetwal --help pg_resetwal resets the PostgreSQL write-ahead log. Usage: pg_resetwal [OPTION]... DATADIR Options: -c, --commit-timestamp-ids=XID,XID set oldest and newest transactions bearing commit timestamp (zero means no change) [-D, --pgdata=]DATADIR data directory -e, --epoch=XIDEPOCH set next transaction ID epoch -f, --force force update to be done -l, --next-wal-file=WALFILE set minimum starting location for new WAL -m, --multixact-ids=MXID,MXID set next and oldest multitransaction ID -n, --dry-run no update, just show what would be done -o, --next-oid=OID set next OID -O, --multixact-offset=OFFSET set next multitransaction offset -u, --oldest-transaction-id=XID set oldest transaction ID -V, --version output version information, then exit -x, --next-transaction-id=XID set next transaction ID --wal-segsize=SIZE size of WAL segments, in megabytes -?, --help show this help, then exit |
Note is that pg_resetwal
is called when the database is not able to start up. You have to point to a data directory and then you can set a few variables.
The problem is: The WAL is not supposed to have less data than the data directory ($PGDATA
). Therefore setting the status of the WAL back will lead to “interesting” results. You might face duplicate primary key entries, missing index pointers and a lot of other difficulties you have never even heard of.
pg_resetwal
In case you really have to reset the system, it is super important to properly handle the database afterwards. Usually it is not a good idea to consider the content of the database to be trustworthy. Therefore it is a good idea to pg_dump
/ pg_restore
the database (in case this is still possible). A binary backup is NOT what you are looking for, since you would merely create a binary copy of the broken database - which is broken too!
It is therefore a really good idea to use pg_dump
to see what can be extracted from the database and what not. In fact: It is quite likely that pg_dump
will error out after pg_resetwal
as the database might be, well, broken. All you can do in this case is to use manual copy and restore whatever you can extract from the database, step-by-step.
You might be interested in this article that demonstrates how pg_resetwal
can be used to cause data corruption.
If you want to learn more about PostgreSQL and how to spot performance problems, see our performance blogs for more great tips. We cover the latest information to support PostgreSQL and to help people use their systems to the greatest advantage. If you want to know more about PostgreSQL support and PostgreSQL consulting - reach out to us.
In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.
+43 (0) 2622 93022-0
office@cybertec.at
You are currently viewing a placeholder content from Facebook. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from X. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More Information
Leave a Reply