I've recently seen some really broad tables (hundreds of columns) in a somewhat inefficiently structured database. Our PostgreSQL support customer complained about strange runtime behavior which could not be easily explained. To help other PostgreSQL users in this same situation, I decided to reveal the secrets of a fairly common performance problem many people don’t understand: Column order and column access.
The first question is: How can we create a table containing many columns? The easiest way is to simply generate the CREATE TABLE statement using generate_series:
test=# SELECT 'CREATE TABLE t_broad ('
           || string_agg('t_' || x || ' varchar(10) DEFAULT ''a'' ', ', ')
           || ' )'
       FROM generate_series(1, 4) AS x;
                         ?column?
----------------------------------------------------------
 CREATE TABLE t_broad ( t_1 varchar(10) DEFAULT 'a' , t_2 varchar(10) DEFAULT 'a' , t_3 varchar(10) DEFAULT 'a' , t_4 varchar(10) DEFAULT 'a'  )
(1 row)

test=# \gexec
CREATE TABLE
For the sake of simplicity I have only used 4 columns here. Once the command has been generated, we can use \gexec to execute the string we have just compiled. \gexec is a really powerful feature: it treats the previous result as SQL input, which is exactly what we want here. It leaves us with a table containing 4 columns.
However, let's drop the table and create a really large one.
test=# DROP TABLE t_broad;
DROP TABLE
The following statement generates the CREATE TABLE for a table containing 1500 columns. Mind that the upper limit is 1600 columns per table:
test=# SELECT 'CREATE TABLE t_broad ('
           || string_agg('t_' || x || ' varchar(10) DEFAULT ''a'' ', ', ')
           || ' )'
       FROM generate_series(1, 1500) AS x;
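As in the four-column example above, this SELECT only builds the command; \gexec is what actually executes it:

test=# \gexec
CREATE TABLE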
In real life such a table is far from efficient and should usually not be used to store data. It will simply create too much overhead and in most cases it is not good modelling in the first place.
Let's populate the table and add 1 million rows:
test=# \timing
Timing is on.
test=# INSERT INTO t_broad
       SELECT 'a' FROM generate_series(1, 1000000);
INSERT 0 1000000
Time: 67457,107 ms (01:07,457)
test=# VACUUM ANALYZE;
VACUUM
Time: 155935,761 ms (02:35,936)
Note that the table has default values, so we can be sure that those columns actually contain something. Finally, I have executed VACUUM to make sure that all hint bits and the like are set.
The table we have just created is roughly 4 GB in size which can easily be determined using the following line:
test=# SELECT pg_size_pretty(pg_total_relation_size('t_broad'));
 pg_size_pretty
----------------
 3907 MB
(1 row)
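Note that pg_total_relation_size includes indexes and the TOAST table as well. If you only care about the heap itself, a quick sketch using the same table would be:

test=# SELECT pg_size_pretty(pg_relation_size('t_broad'));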
PostgreSQL stores data in rows. As you might know, data can be stored column- or row-oriented. Depending on your use case, one or the other option might be beneficial. In the case of OLTP, a row-based approach is usually far more efficient.
Let's do a count(*) and see how long it takes:
test=# SELECT count(*) FROM t_broad;
  count
---------
 1000000
(1 row)

Time: 416,732 ms
We can run the query in around 400 ms, which is quite OK. As expected, the optimizer will go for a parallel sequential scan:
test=# explain SELECT count(*) FROM t_broad;
                              QUERY PLAN
--------------------------------------------------------------------
 Finalize Aggregate  (cost=506208.55..506208.56 rows=1 width=8)
   ->  Gather  (cost=506208.33..506208.54 rows=2 width=8)
         Workers Planned: 2
         ->  Partial Aggregate  (cost=505208.33..505208.34 rows=1 width=8)
               ->  Parallel Seq Scan on t_broad  (cost=0.00..504166.67 rows=416667 width=0)
 JIT:
   Functions: 4
   Options: Inlining true, Optimization true, Expressions true, Deforming true
(8 rows)
Let's compare this to a count on the first column. You'll see a small difference in performance. The reason is that count(*) only has to check for the existence of the row, while count(column) has to check whether a NULL value is fed to the aggregate or not. In case of NULL, the value has to be ignored:
test=# SELECT count(t_1) FROM t_broad;
  count
---------
 1000000
(1 row)

Time: 432,803 ms
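To make the NULL handling tangible, here is a tiny self-contained sketch (the table demo_null is made up purely for illustration):

test=# CREATE TEMP TABLE demo_null (x int);
CREATE TABLE
test=# INSERT INTO demo_null VALUES (1), (NULL), (3);
INSERT 0 3
test=# SELECT count(*) AS all_rows, count(x) AS non_null_x FROM demo_null;
 all_rows | non_null_x
----------+------------
        3 |          2
(1 row)

count(*) counts every row, while count(x) skips the row containing NULL.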
But what happens if we access column number 100? The time needed to do that differs significantly:
test=# SELECT count(t_100) FROM t_broad;
  count
---------
 1000000
(1 row)

Time: 857,897 ms
The execution time has basically doubled. The performance is even worse if we do a count on column number 1000:
test=# SELECT count(t_1000) FROM t_broad;
  count
---------
 1000000
(1 row)

Time: 8570,238 ms (00:08,570)
Wow, we are already 20 times slower than before. This is not a small difference but a major problem which has to be understood.
To understand why the problem happens in the first place, we need to take a look at how PostgreSQL stores data: after the tuple header, which is present in every row, we have a couple of varchar columns. We just used varchar here to prove the point. The same issue occurs with other data types - the problem is simply more apparent with varchar, as it is more complicated internally than, say, integer.
How does PostgreSQL access a column? It will fetch the row and then dissect the tuple to calculate the position of the desired column inside the row. So if we want to access column #1000, we first have to figure out how long the 999 columns before our chosen one really are, which can be quite complex. For integer we simply have to add 4, but in the case of varchar the operation turns into something really expensive. Let's inspect how PostgreSQL stores varchar, just to see why it is so expensive:
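One way to see the difference from SQL, without digging into the C source, is to look at pg_attribute (a small sketch using the table we created above). Fixed-width types such as integer carry their width directly in attlen, while varchar is a variable-length ("varlena") type, which shows up as attlen = -1:

test=# SELECT attname, attlen, attalign
       FROM   pg_attribute
       WHERE  attrelid = 't_broad'::regclass
       AND    attnum > 0
       ORDER  BY attnum
       LIMIT  3;

For our varchar columns, attlen is -1: the real length sits in a small header in front of every single value, so PostgreSQL has to read each value just to find out where the next column begins. The offsets cannot simply be precomputed the way they can be for leading fixed-width columns.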
Now imagine what that means if we need to loop over 1000 columns - it creates non-trivial overhead.
The key insight here is that extremely wide tables are often not beneficial from a performance standpoint. It makes sense to use sensible table layouts that offer a good compromise between performance and convenience.
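As a rough illustration of what a more sensible layout can look like (the table and column names below are made up for this example), it helps to put frequently accessed, fixed-width columns at the beginning and push rarely used, variable-width columns towards the end:

-- hypothetical layout: hot, fixed-width columns first,
-- rarely used variable-width columns last
CREATE TABLE t_sensible
(
    id           bigint       PRIMARY KEY,   -- fixed width, accessed constantly
    customer_id  integer      NOT NULL,      -- fixed width, used by most queries
    created_at   timestamptz  NOT NULL,      -- fixed width
    status       smallint     NOT NULL,      -- fixed width
    note         text,                       -- variable width, read occasionally
    raw_payload  jsonb                       -- variable width, rarely touched
);

That way the expensive offset calculation only kicks in for the columns that are rarely read anyway.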
If you are interested in other ways to improve performance, read my blog on CLUSTER.
So for tables with lots of columns, the suggestion is to break them into several tables, consolidating "common" information in each one, and joining the data if additional data from other tables is needed? Does this perform better than holding everything in the same table?
So you suggest normalisation is better?
There is no general solution to the problem. Sometimes normalization is good - sometimes not, because joins are overhead too. One has to decide case by case; this is really important. I am NOT saying that one should normalize the hell out of things - that is certainly counterproductive.