PostgreSQL offers a nice BLOB interface which is widely used. However, we recently came across problems faced by various customers, so it makes sense to take a step back and look at how PostgreSQL handles BLOBs - and especially BLOB cleanup.
In PostgreSQL, you can use various means to store binary data. The simplest form is definitely to make use of the “bytea” (= byte array) data type. In this case a binary field is basically seen as part of a row.
Here is how it works:
```
test=# CREATE TABLE t_image (id int, name text, image bytea);
CREATE TABLE
test=# \d t_image
              Table "public.t_image"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 id     | integer |           |          |
 name   | text    |           |          |
 image  | bytea   |           |          |
```
As you can see, this is a normal column and it can be used just like a normal column. The only thing worth mentioning is the encoding one has to use on the SQL level. PostgreSQL uses a variable to configure this behavior:
```
test=# SHOW bytea_output;
 bytea_output
--------------
 hex
(1 row)
```
The bytea_output variable accepts two values: “hex” tells PostgreSQL to send the data in hex format; “escape” means the data is rendered as octal escape sequences. There is not much the application has to worry about here, apart from the maximum size of 1 GB per field.
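To see the difference between the two formats, you can toggle the variable in a session. A quick sketch; the literal 'abc' is just an arbitrary example value:

```sql
test=# SELECT 'abc'::bytea;
  bytea
----------
 \x616263
(1 row)

test=# SET bytea_output = 'escape';
SET
test=# SELECT 'abc'::bytea;
 bytea
-------
 abc
(1 row)
```

In escape mode, printable characters are shown as-is and only non-printable bytes appear as octal escapes such as \000.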
However, PostgreSQL has a second interface to handle binary data: The BLOB interface. Let me show an example of this powerful tool in action:
```
test=# SELECT lo_import('/etc/hosts');
 lo_import
-----------
     80343
(1 row)
```
In this case, the content of /etc/hosts has been imported into the database. Note that PostgreSQL has a copy of the data - it is not a link to the filesystem. What is noteworthy here is that the database will return the OID (object ID) of the new entry. To keep track of these OIDs, some developers do the following:
```
test=# CREATE TABLE t_file (
        id        int,
        name      text,
        object_id oid
);
CREATE TABLE
test=# INSERT INTO t_file VALUES
        (1, 'some_name', lo_import('/etc/hosts'))
       RETURNING *;
 id |   name    | object_id
----+-----------+-----------
  1 | some_name |     80350
(1 row)

INSERT 0 1
```
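Given the stored OID, the content can be read back in plain SQL, for example with lo_get (available since PostgreSQL 9.4). A small sketch using the OID from the example above:

```sql
-- read the whole object back as bytea (PostgreSQL 9.4+)
SELECT lo_get(80350);

-- or just a slice: 16 bytes starting at offset 0
SELECT lo_get(80350, 0, 16);
```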
This is absolutely fine, unless you do something like the following:
```
test=# DELETE FROM t_file WHERE id = 1;
DELETE 1
```
The problem is that the object ID has been forgotten - but the object itself is still there. pg_largeobject is the system table in charge of storing the binary data inside PostgreSQL. All lo_* functions simply talk to this system table in order to handle these things:
```
test=# \x
Expanded display is on.
test=# SELECT * FROM pg_largeobject WHERE loid = 80350;
-[ RECORD 1 ]------------------------------------------
loid   | 80350
pageno | 0
data   | ##\012# Host Database\012#\012# localhost ...
```
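To find such orphans by hand, one can compare pg_largeobject_metadata (which has one row per large object) against the tracking table. A sketch, assuming t_file.object_id from above is the only place where OIDs are stored:

```sql
-- list large objects no row in t_file points to
SELECT m.oid
FROM   pg_largeobject_metadata AS m
WHERE  NOT EXISTS (SELECT 1 FROM t_file AS f
                   WHERE  f.object_id = m.oid);
```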
Why is that a problem? The reason is simple: Your database will grow and the number of “dead objects” will accumulate.
Therefore the correct way to kill a BLOB entry is as follows:
```
test=# \x
Expanded display is off.
test=# SELECT lo_unlink(80350);
 lo_unlink
-----------
         1
(1 row)

test=# SELECT * FROM pg_largeobject WHERE loid = 80350;
 loid | pageno | data
------+--------+------
(0 rows)
```
If you forget to unlink the object, you will suffer in the long run - and we have often seen that happen. It is a major issue if you are using the BLOB interface.
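One way to make sure the unlink is never forgotten is to tie it to the DELETE itself. A minimal sketch using a PL/pgSQL trigger on the t_file table from above (the function and trigger names are made up; the "lo" contrib module ships a similar trigger function called lo_manage):

```sql
CREATE FUNCTION unlink_lo() RETURNS trigger AS
$$
BEGIN
    -- remove the large object the deleted row pointed to
    PERFORM lo_unlink(OLD.object_id);
    RETURN OLD;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t_file_unlink
    BEFORE DELETE ON t_file
    FOR EACH ROW EXECUTE FUNCTION unlink_lo();
```

Note that a trigger like this also fires on updates only if you add it for UPDATE as well, and it will fail if two rows reference the same object; it is a sketch, not a complete solution.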
However, how can one fix the problem once you have accumulated thousands, or maybe millions, of dead BLOBs? The answer is a command line tool called “vacuumlo”.
Let us first create a dead entry:
```
test=# SELECT lo_import('/etc/hosts');
 lo_import
-----------
     80351
(1 row)
```
Then we can run vacuumlo from any client:
```
iMac:~ hs$ vacuumlo -h localhost -v test
Connected to database "test"
Checking object_id in public.t_file
Successfully removed 2 large objects from database "test".
```
As you can see, two dead objects have been killed by the tool. vacuumlo is the easiest way to clean out orphan objects.
However, there is more than just lo_import and lo_unlink. PostgreSQL offers a variety of functions to handle large objects in a nice way:
```
test=# \df lo_*
                                 List of functions
   Schema   |     Name      | Result data type |    Argument data types    | Type
------------+---------------+------------------+---------------------------+------
 pg_catalog | lo_close      | integer          | integer                   | func
 pg_catalog | lo_creat      | oid              | integer                   | func
 pg_catalog | lo_create     | oid              | oid                       | func
 pg_catalog | lo_export     | integer          | oid, text                 | func
 pg_catalog | lo_from_bytea | oid              | oid, bytea                | func
 pg_catalog | lo_get        | bytea            | oid                       | func
 pg_catalog | lo_get        | bytea            | oid, bigint, integer      | func
 pg_catalog | lo_import     | oid              | text                      | func
 pg_catalog | lo_import     | oid              | text, oid                 | func
 pg_catalog | lo_lseek      | integer          | integer, integer, integer | func
 pg_catalog | lo_lseek64    | bigint           | integer, bigint, integer  | func
 pg_catalog | lo_open       | integer          | oid, integer              | func
 pg_catalog | lo_put        | void             | oid, bigint, bytea        | func
 pg_catalog | lo_tell       | integer          | integer                   | func
 pg_catalog | lo_tell64     | bigint           | integer                   | func
 pg_catalog | lo_truncate   | integer          | integer, integer          | func
 pg_catalog | lo_truncate64 | integer          | integer, bigint           | func
 pg_catalog | lo_unlink     | integer          | oid                       | func
(18 rows)
```
There are two more functions which, for historical reasons, don’t follow the naming convention: loread and lowrite:
```
 pg_catalog | loread        | bytea            | integer, integer          | func
 pg_catalog | lowrite       | integer          | integer, bytea            | func
```
They are functions whose names cannot easily be changed anymore. However, it is worth noting that they exist.
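Together with lo_open and lo_close, these functions allow content to be streamed in chunks instead of being handled as one big value. A sketch of a chunked write inside a transaction (the OID 80400 is illustrative and assumes that is what lo_create returned; 131072 is the value of the INV_WRITE open flag):

```sql
BEGIN;
SELECT lo_create(0);            -- 0 = let the server assign an OID, e.g. 80400
SELECT lo_open(80400, 131072);  -- 131072 = INV_WRITE; returns descriptor 0
SELECT lowrite(0, 'first chunk of data');
SELECT lowrite(0, 'second chunk of data');
SELECT lo_close(0);
COMMIT;
```

Descriptors returned by lo_open are only valid until the end of the transaction, which is one reason why the BLOB interface is fully transactional.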
The PostgreSQL BLOB interface is really useful and can be used for many things. The beauty is that it is fully transactional and therefore binary content and metadata cannot go out of sync anymore.
If you want to learn more about triggers to enforce constraints in PostgreSQL, we recommend you check out our blog post written by Laurenz Albe. It will shed some light on this important topic.
In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.
The BLOB system of PostgreSQL is quite horrible and, I think, should not be used. As noted, there is no referential integrity for BLOB values -- you can accidentally delete BLOBs that are still referenced, or forget to delete them, for which the vacuumlo command is a workaround. In particular, nobody should use the lo_import() function shown here: it requires superuser permissions and reads files on the database server's file system, both bad practices.
In most cases it's far better to create a table with a bytea field to store your binary data, and use foreign key constraints to ensure referential integrity. For files larger than 1 GB one should consider an external data store.
I agree. The only use cases are
- if you need to store data exceeding 1GB
- if you need to stream writes
See my blog post on binary data performance in PostgreSQL.
My table with a bytea column is bloating.
Every day my backup grows, as if deleted records were being backed up.
Is it possible that deleted records are being backed up by pg_dump?
No, that is not possible.
I think you should buy some consulting so that somebody can have a look at your problem.
It was a programmer's mistake; everything looks fine now.