If you are relying heavily on the PostgreSQL COPY command to load data into PostgreSQL quickly, PostgreSQL 12 might offer a feature, which is most likely very beneficial to you. Bulkloading is an important operation and every improvement in this area is certainly going to help many people out there, who want to import data into PostgreSQL as fast as possible.
Table of Contents
When taking a closer look at the syntax of the COPY command in PostgreSQL 12 you will quickly see two things:
• h will now point to the correct page in the documentation
• COPY now supports a WHERE condition
Here is the complete syntax overview:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 |
db12=# h COPY Command: COPY Description: copy data between a file and a table Syntax: COPY table_name [ ( column_name [, ...] ) ] FROM { 'filename' | PROGRAM 'command' | STDIN } [ [ WITH ] ( option [, ...] ) ] [ WHERE condition ] COPY { table_name [ ( column_name [, ...] ) ] | ( query ) } TO { 'filename' | PROGRAM 'command' | STDOUT } [ [ WITH ] ( option [, ...] ) ] where option can be one of: FORMAT format_name FREEZE [ boolean ] DELIMITER 'delimiter_character' NULL 'null_string' HEADER [ boolean ] QUOTE 'quote_character' ESCAPE 'escape_character' FORCE_QUOTE { ( column_name [, ...] ) | * } FORCE_NOT_NULL ( column_name [, ...] ) FORCE_NULL ( column_name [, ...] ) ENCODING 'encoding_name' URL: https://www.postgresql.org/docs/12/sql-copy.html |
While having a link to the documentation around is certainly beneficial, the WHERE condition added to PostgreSQL 12 might even be more important. What is the purpose of this new feature? So far it was possible to completely import a file. However, in some cases this has been a problem: More often than not people only wanted to load a subset of data and had to write a ton of code to filter data before the import or once data has been written into the database already.
In PostgreSQL data can be filtered while importing easily. The COPY become is pretty flexible and allows a lot of trickery. To show you, how the new WHERE clause works, I have compiled a simple example:
1 2 3 |
db12=# CREATE TABLE t_demo AS SELECT * FROM generate_series(1, 1000) AS id; SELECT 1000 |
First of all 1000 rows are generated to make sure that we got some data to play. Then we export the content of this table to a file:
1 2 |
db12=# COPY t_demo TO '/tmp/file.txt'; COPY 1000 |
Finally, we can try to import this data again:
1 2 3 4 5 6 7 8 9 10 11 12 |
db12=# CREATE TABLE t_import (x int); CREATE TABLE db12=# COPY t_import FROM '/tmp/file.txt' WHERE x < 5; COPY 4 db12=# SELECT * FROM t_import; x --- 1 2 3 4 (4 rows) |
As you can see filtering data is pretty simple and very straight forward. One important thing to note here is: I exported an “id” column and imported it as “x”. Keep in mind that the text file does not know the data structure of our target table - you have to make sure that you filter on the column name of the table you want to import.
If you are new to PostgreSQL in general I also want to present one of the older features, which I like a lot personally. COPY can send data to the UNIX pipe or read data from a pipe. Here is how it works:
1 2 3 4 5 6 7 8 9 10 11 12 13 |
db12=# COPY t_demo TO PROGRAM 'gzip -c > /tmp/file.txt.gz'; COPY 1000 db12=# COPY t_import FROM PROGRAM 'gunzip -c /tmp/file.txt.gz' WHERE x BETWEEN 100 AND 103; COPY 4 db12=# SELECT * FROM t_import WHERE x >= 100; x ----- 100 101 102 103 (4 rows) |
In some cases, you might want to do more than to just export data. In this case I decided to compress the data while exporting. Before the data is imported again it is uncompressed and again filtered. As you can see it is pretty simple to combine those features in a flexible way.
If you want to learn more about PostgreSQL and loading data in general, check out our post about rules and triggers. If you want to learn more about COPY, checkout the PostgreSQL documentation.
You need to load content from reCAPTCHA to submit the form. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from Facebook. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from X. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More Information
Do you know if the psql meta-command copy will receive WHERE support? It’s not listed in the draft documentation for 12, but that may be an omission.
i don't see that happen. backslash commands are a frontend thing. WHERE is a backend thing.
Does that mean the copy command implemented with different code than the server-side COPY routine? I always assumed copy would stream the data to the server and then invoke a COPY.
I'm wondering if there are speed advantages to use a server-side COPY beyond extra features like WHERE support.