PostgreSQL TDE is a version of PostgreSQL which offers on-disk encryption for PostgreSQL. It is part of the CYBERTEC POSTGRESQL ENTERPRISE EDITION (PGEE). What we have noticed recently is that people kept asking about how to integrate PostgreSQL TDE with existing key management solutions. To help you manage encryption keys with PostgreSQL TDE, we have decided to outline the basic concepts and help people to implement key management strategies for PostgreSQL to secure their infrastructures.
Table of Contents
Running PostgreSQL TDE is easy. However, it requires some basic understanding of how we handle keys in general. The following image shows the basic architecture of how to manage encryption keys using our security-enabled version of PostgreSQL:
Let's focus on each of these parts separately.
When a new PostgreSQL instance is created, initdb
is executed behind the scenes. Most systems do that for you when the service is started for the first time. The purpose of initdb
is to create the initial content of the data directory, deploy a config file and so on.
The point is: In the case of TDE (= Transparent Data Encryption), the data files on disk have to be encrypted. The natural question which arises is: How can we find the key? It's not a good idea to store the key directly in the data directory, so we have to be a little creative to handle keys properly.
When we developed PostgreSQL TDE, we decided on a simple plugin architecture. Key management can be quite diverse. Many different keystores exist and the requirements of every single company differ depending on use cases, philosophy, guidelines and sometimes even due to legal requirements.
When PostgreSQL TDE fetches the key, it uses a plugin which is an executable we have to point to:
1 |
[tde@cybertec]$ initdb --help |
initdb
initializes a PostgreSQL database cluster.
1 |
initdb [OPTION]... [DATADIR] |
1 2 3 4 |
... -k, --data-checksums use data page checksums -K, --encryption-key-command command that returns encryption key |
The -K
option requires you to point to a script that returns the key as a string. Note that this key is never visible to anybody but the database server. It does not show up in a process table. It does not show up anywhere. The beauty of this design is that you can write any plugin for any key manager of your choice. For testing purposes, you can even call up the command line and make the administrator type in the key. In the real world, almost no-one does that - but even that approach is possible.
As stated: Provide a script or an executable that echoes the key to stdout. PostgreSQL will read it and use it to encrypt your data files. This script gives you total flexibility to integrate with every possible kind of key management.
1 |
[tde@cybertec]$ ./initdb -D /path/db -K /path/key_manager.sh |
The files which belong to this database system will be owned by user "hs".
This user must also own the server process.
The database cluster will be initialized with locales
1 2 3 4 5 6 |
COLLATE: en_US.UTF-8 CTYPE: en_US.UTF-8 MESSAGES: en_US.UTF-8 MONETARY: de_DE.UTF-8 NUMERIC: de_DE.UTF-8 TIME: de_DE.UTF-8 |
The default database encoding has been set accordingly, to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
Data encryption is enabled.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
creating directory /path/db ... ok creating subdirectories ... ok selecting dynamic shared memory implementation ... posix selecting default max_connections ... 100 selecting default shared_buffers ... 128MB selecting default time zone ... Europe/Vienna creating configuration files ... ok running bootstrap script ... ok performing post-bootstrap initialization ... ok syncing data to disk ... ok initdb: warning: enabling 'trust' authentication for local connections You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb. |
1 |
[tde@cybertec]$ ./pg_ctl -D /path/db -l logfile start |
The -K
option is really the path to success here. In the simplest of all cases the script simply returns a constant:
1 2 3 |
[tde@cybertec]$ cat /path/key_manager.sh #!/bin/sh echo 4e5358ab309bcdea23450934546298ab |
But, as stated earlier, this is really just a showcase. DO NOT store the key on the same server - instead, integrate it with a proper key store and manage the key remotely. Otherwise, TDE would be pointless and simply a waste of resources.
What is really important to note here is that the key has to be in hex and it has to be 32 characters long. Otherwise initdb
will error out.
1 |
initdb: fatal: encryption key is too short, should be a 32 character hex key |
For using secrets that humans can memorize there is a bundled pg_keytool
utility that takes as input a passphrase and uses a key derivation function to form a correctly formatted output key. By using a key derivation function that takes a non-trivial amount of time to calculate, this adds an extra layer of security against dictionary attacks.
Once the instance has been deployed, we can actually start PostgreSQL normally.
What initdb
has done for us is to adjust the relevant values in postgresql.conf
:
1 2 3 |
# - Encryption - encryption_key_command = '/path/key_manager.sh' |
Every time the server starts up, it will run the same program to fetch the key. Remember, we need this key to decrypt those data files. The server has already taken care to ensure that nothing can go wrong - assuming that the program does indeed return the correct key. If the key cannot be provided, then the server cannot be started. There is NO BACKDOOR, no emergency key or anything of that kind. If you cannot provide the correct key your data is lost. That is the entire idea of TDE. We do not want backdoors.
People keep asking: How can I figure out if my PostgreSQL instance is indeed encrypted? The way to do that is to control pg_controldata
:
1 2 3 4 5 6 7 8 9 |
[tde@cybertec]$ pg_controldata /path/db/ ... Database system identifier: 7036323720088499175 Database cluster state: shut down pg_control last modified: Di 30 Nov 2021 12:29:41 CET ... Mock authentication nonce: f12fd3308a81f946ffbb36a0a3bd4d41c87bbb9ba4d612b809a001a2202cdc6a Data encryption: on Data encryption fingerprint: D2D2CED8FE9F3980599289B1F468A9FB |
At the end of the listing, you can find the encryption details. Make sure that you use the RIGHT version of pg_controldata
. Otherwise, you won't be able to find that information.
One of the main questions is: Which key store can one use for PostgreSQL? The answer is: Any of them. Here are some which are quite popular and widely used by many companies. Keep in mind that the list is not exhaustive:
A typical solution for a production system is to use a scheme where the database master key is generated randomly, then encrypted using a key from a key management system, and finally stored in the PostgreSQL data directory. This allows for keys to be rotated or even the key management system to be swapped out without having to re-encrypt the whole database.
You can read more about the topic of encryption in our introduction to Transparent Layer Security (TLS).
Exciting news in the realm of GIS - we recently released a website that offers ready-to-use GIS data. If you are using PostGIS and OSM data, this is the place to go. We provide you with ready-made database dumps featuring the entire planet. Check out our GIS site.
In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.
You need to load content from reCAPTCHA to submit the form. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from Facebook. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from X. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More Information
Leave a Reply