by Kaarel Moppel
It’s almost that time of year again: PostgreSQL's next major version, v12, is on the way! For many users, upgrading a database is a rare event, which breeds misconceptions about how difficult the process is. In reality, upgrading Postgres is straightforward, which makes it surprising how many installations still run outdated versions. This article aims to encourage faster upgrades in the future. For a quick overview, check the TL;DR table at the end.
Some main points that come to my mind:
And to make it even easier for you - there’s a really cool website (kudos to Depesz!) where you can directly get a highlighted listing of all the changes between any two Postgres versions! Link here!
Minor version changes, a.k.a. bugfix releases, happen regularly and can be used as a warm-up here. There is at least one minor release per quarter, plus extra ones if security-critical issues turn up. It is, of course, highly recommended to apply them as soon as they are released.
The good thing about minor releases is that applying them is as fast as restarting your server! I recommend “prefixing” the restart with a hand-initiated CHECKPOINT so that effective downtime for applications is minimized: “pg_ctl stop / restart” performs its own checkpoint in exclusive mode, after active sessions have been kicked from the server, so doing most of the checkpoint work beforehand shortens that window.
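As a rough sketch (the data directory path is an example, adjust to your installation), the minor-release restart could look like this:

```shell
# Write out dirty buffers beforehand, so the exclusive checkpoint
# that "pg_ctl restart" performs has little left to do
psql -c "CHECKPOINT;"

# Restart onto the already-installed new minor-version binaries
pg_ctl -D /var/lib/postgresql/11/main restart
```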
Caution! When using some distribution-specific Postgres packages, extra attention is needed. For example, running “apt upgrade postgresql-X” may immediately restart the running instance once the new binaries have been pulled in! On RedHat / CentOS the situation is usually a bit friendlier: there you need to restart explicitly.
Also, when running a clustered streaming replication setup where downtime on the primary node is scarce and restarts are to be avoided, my recommendation is to immediately update all replicas to the latest minor version (minor versions don’t need to match across the cluster, only the major version does). That way, when some unplanned server issue appears (e.g. reboot, hardware maintenance, etc.), the replica you promote will already be on the latest version, and you get the update with a sort of “free downtime”.
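On a Debian-based replica this could look roughly like the following (the package and service names are assumptions, adjust to your distribution and version):

```shell
# On each replica: pull in the latest minor release...
sudo apt update
sudo apt install --only-upgrade postgresql-11

# ...and restart; replica restarts are effectively "free downtime"
sudo systemctl restart postgresql@11-main
```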
Let's now move on to the real thing - major upgrade options.
This is the original way of doing major version upgrades, and it was the only option up until version 9.0, which introduced the binary / in-place option. Logical dumps use the “pg_dumpall” or “pg_dump” / “pg_restore” tools, which basically “reverse engineer” the database state into plain SQL statements. When those statements are executed sequentially on a target DB, they re-create exactly the same state for all user-level objects as on the source cluster / DB. NB! “pg_dumpall” works on the instance level and “pg_dump” on the database level.
One more recommendation: when dumping out the DB contents, it’s usually best to use the tools from the newest Postgres version. Note that this might not be officially supported if you fall out of the “5 latest releases supported” window. By the way, on the topic of dump / restore and Postgres versions, I’d recommend reading this nice FAQ by our colleague Pavlo. It’s a few years old, but still correct.
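A minimal dump / restore sketch, assuming the old cluster listens on 5432 and the new one on 5433 (the ports, paths and database name are made up for illustration):

```shell
# Whole instance as plain SQL, using the NEW version's pg_dumpall
/usr/lib/postgresql/12/bin/pg_dumpall -p 5432 > dump.sql
psql -p 5433 -d postgres -f dump.sql

# Or a single database in custom format, allowing parallel restore
pg_dump -p 5432 -Fc mydb > mydb.dump
pg_restore -p 5433 -d mydb -j 4 mydb.dump
```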
Binary in-place upgrades are quite different from the logical ones, as they happen on the filesystem level (i.e. always on the same machine if no shared storage is in play) and are a multi-step process with some additional complexity. However, things are still quite straightforward - after installing the new binaries and a bit of preparation, it basically boils down to running a single command that typically finishes in 10 to 20 seconds in “--link” mode! That's also why it's the most common upgrade option nowadays: nobody wants extended downtimes.
High-level steps to take (see documentation for details):
1. Install new major version binaries.
2. Initialize the new instance with new binaries (with the same major settings as on the old instance).
3. Do a test run with “pg_upgrade”. FYI - I tend to test first on a completely spare replica or a P.I.T.R. restored instance.
4. Stop the old instance.
5. Run “pg_upgrade” with or without file linking. Using hard-linking of files from the old data directory into the new one is the fastest way to upgrade! Basically only system catalogs are dumped and reloaded. Without linking all data files are copied over and it boils down to filesystem speed vs instance size, with the benefit that the old cluster is not modified.
6. Start the instance (it is usually also necessary to change the port back to 5432).
7. Run the “analyze” script that the “pg_upgrade” tool generates and points you to. Performance of complex queries might suffer until it finishes.
NB! Also note that some distributions provide wrapper scripts (like “pg_upgradecluster” on Debian-based systems) to assist with in-place upgrades. They might be worth a look.
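The steps above could be condensed into something like this for an 11 -> 12 upgrade (all paths are examples and distribution-dependent):

```shell
# 2. Initialize the new cluster with the new binaries
/usr/lib/postgresql/12/bin/initdb -D /var/lib/postgresql/12/main

# 3. Dry run - verifies compatibility without touching any data
/usr/lib/postgresql/12/bin/pg_upgrade --check \
  -b /usr/lib/postgresql/11/bin -B /usr/lib/postgresql/12/bin \
  -d /var/lib/postgresql/11/main -D /var/lib/postgresql/12/main

# 4. Stop the old instance
pg_ctl -D /var/lib/postgresql/11/main stop

# 5. The real thing, with hard links for speed
/usr/lib/postgresql/12/bin/pg_upgrade --link \
  -b /usr/lib/postgresql/11/bin -B /usr/lib/postgresql/12/bin \
  -d /var/lib/postgresql/11/main -D /var/lib/postgresql/12/main

# 6. + 7. Start the new instance and rebuild planner statistics
pg_ctl -D /var/lib/postgresql/12/main start
./analyze_new_cluster.sh
```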
Logical Replication (LR) is a new option for Postgres major version upgrades, available from version 10 and up, and unofficially from 9.4. It allows you to transfer data and changes from an old primary server to a newly created, independent master server without extended downtime. LR operates at the database level, decoupling from the binary format and using data objects similar to JSON. This enables near real-time syncing, allowing time to test and validate results before finalizing the switch. For more details, check the sample code here.
It looks awesome and relatively simple, doesn’t it? And it mostly is! To spoil the party a bit - there's also a lot of "small print" to read, and an out-of-the-box LR process might not always be possible - it depends somewhat on the data model being used. Some tweaking / changing might be needed.
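A bare-bones LR sketch via psql (host names, the publication / subscription names and the replication user are made up; note that wal_level must be set to “logical” on the publisher, and the schema has to be carried over separately, since logical replication transfers only data):

```shell
# Copy the schema over first - logical replication moves data only
pg_dump -h old_host -s mydb | psql -h new_host -d mydb

# On the old primary: publish all tables
psql -h old_host -d mydb \
  -c "CREATE PUBLICATION upgrade_pub FOR ALL TABLES;"

# On the new-version server: subscribe and start syncing
psql -h new_host -d mydb \
  -c "CREATE SUBSCRIPTION upgrade_sub CONNECTION 'host=old_host dbname=mydb user=repl_user' PUBLICATION upgrade_pub;"
```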
Upgrade method | Pro | Contra
---|---|---
Dump / restore | Simple; works across any version gap; the source cluster is not modified | Slowest option - downtime grows with database size
Binary in-place | Very fast in “--link” mode (typically 10-20 seconds); the most common choice today | Multi-step process; same machine only; in “--link” mode the old cluster is modified
Logical Replication | Near real-time syncing with minimal downtime; results can be tested and validated before the switch | Most complex method; depends on the data model; some tweaking may be needed
Besides the obvious pre-upgrade DBA activities, such as testing, backups, and thoroughly analyzing the release notes, it also pays to discuss application compatibility, as well as the compatibility of UI and monitoring tools, with end users and application teams. All in all, there is a very low risk factor involved in upgrading, since the main SQL features are backwards compatible.
Of course, I can't 100% recommend "lone ranger" style covert database migrations, where the database disappears for a minute only to reappear in a shiny new “coat” without anyone raising an eyebrow. I’ve done my fair share of these, but they do come with risks. However, if your schema consists of only a handful of plain tables with the most common data types, with no stored procedures or other PostgreSQL-specific features like LISTEN / NOTIFY involved, and only standard ANSI SQL is executed, then most likely everything will be just fine. The reason is that the Postgres wire protocol has not changed since v7-point-something: changing it for basic SQL operations would be a massive problem for all users globally, so it has been avoided.
If you're not running an absolutely time-critical 24/7 operation (something like a space shuttle), in my opinion there are no excuses not to upgrade at least once every couple of years. Keep in mind: if you ignore the topic for too long, you’ll stop receiving security patches at some point, and then, when it is most needed, the upgrade process becomes trickier. The PostgreSQL Global Development Group only supports the releases of the previous five years in its tooling, so if you've waited too long you might need to run "pg_upgrade" twice in a row, with the extra risk of hitting corner-case backward compatibility issues that nobody tests for. In short: better to upgrade sooner rather than later!
You might want to mention these other CONs of pglogical/logical replication. No support for:
1. large objects
2. TRUNCATE
3. sequence changes
I think TRUNCATE got fixed with PG 11, and large objects are mostly a non-topic... but true, there are quite a few details with LR - that's why I labelled it the most complex method. Will add a note, but I'm also planning a second, more detailed post on LR.
Are there clearly documented steps to be followed for a minor upgrade... say from 10.10 to 10.12 on a centos 7 vm? I'm an oracle/mysql guy new to postgresql.