UPDATED August 2023: Tuning shared_buffers in PostgreSQL is a pretty fundamental thing to do when you set up a high-performance server. In most cases, adding more memory speeds things up considerably. Many tests have shown exactly that. However, there are some corner cases, which are not common, that can cause significant performance issues.
DROP TABLE
One of those corner cases is the massive use of DROP TABLE. One might argue that DROP TABLE is a rare thing to do. However, in many applications it is not: DROP TABLE is used massively by countless applications and can turn out to be a bit of a showstopper.
Why is it an issue? The problem is that during DROP TABLE, PostgreSQL has to scan the entire shared buffer cache to force out all cached blocks belonging to the table. The bigger the cache, the bigger the effort.
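To see why the effort grows with the cache size rather than the table size, here is a deliberately simplified model in Python. This is not PostgreSQL's actual code (the real logic lives in DropRelFileNodeBuffers in bufmgr.c); the buffer tags and cache contents are made up for illustration:

```python
def drop_relation_buffers(buffers, relation):
    """Invalidate all cached blocks of `relation`.

    Each slot is either None (empty) or a tag (relation, block_number).
    There is no cheap way to enumerate just one relation's blocks, so
    every single slot has to be inspected: the cost is O(len(buffers)),
    no matter how few blocks the dropped table actually has in cache.
    """
    scanned = 0
    evicted = 0
    for i, tag in enumerate(buffers):
        scanned += 1
        if tag is not None and tag[0] == relation:
            buffers[i] = None  # invalidate the slot
            evicted += 1
    return scanned, evicted

# A toy "cache" with 8 slots, only two of which belong to table 'x':
cache = [('y', 0), ('x', 0), None, ('y', 1),
         ('x', 1), None, ('y', 2), ('y', 3)]
scanned, evicted = drop_relation_buffers(cache, 'x')
print(scanned, evicted)  # 8 slots scanned, only 2 evicted
```

The key point of the sketch: `scanned` always equals the total number of buffer slots, which is why doubling shared_buffers doubles the cost of every DROP TABLE.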
The following test shows what this can actually mean in real life.
SET synchronous_commit TO off;
BEGIN;
CREATE TABLE x(id int);
INSERT INTO x VALUES (1);
DROP TABLE x;
COMMIT;
To make sure that the entire test does not become disk-bound, synchronous_commit is turned off to relax the flushing requirements. Then a basic table is created and dropped again.
The following script runs this test with various sizes of shared_buffers:
#!/bin/sh

DB=test

for x in '8 MB' '32 MB' '128 MB' '1 GB' '8 GB'
do
	pg_ctl -D /tmp/db -l /dev/null -o "--shared_buffers='$x'" start
	sleep 1
	echo tps for $x
	psql -c 'SHOW shared_buffers' $DB
	pgbench --file=run.sql -j 1 -c 1 -T 10 $DB 2> /dev/null
	pg_ctl -D /tmp/db stop
	sleep 1
done
For each shared_buffers setting, the script starts a database instance and runs the benchmark shown above for 10 seconds.
[hs@laura benchmark]$ ./x.sh | grep tps
tps for 8 MB
tps = 1390.858135 (including connections establishing)
tps = 1391.123414 (excluding connections establishing)
tps for 32 MB
tps = 1341.244741 (including connections establishing)
tps = 1341.745728 (excluding connections establishing)
tps for 128 MB
tps = 1193.329832 (including connections establishing)
tps = 1193.796699 (excluding connections establishing)
tps for 1 GB
tps = 532.483333 (including connections establishing)
tps = 532.702853 (excluding connections establishing)
tps for 8 GB
tps = 104.238781 (including connections establishing)
tps = 104.280531 (excluding connections establishing)
As you can see, the number of transactions per second drops dramatically, down to around 100 TPS - just 7.5% of the maximum value reached in this test.
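The 7.5% figure quoted above can be verified directly from the benchmark output:

```python
# TPS values taken from the benchmark output above
best = 1390.858135   # shared_buffers = 8 MB
worst = 104.238781   # shared_buffers = 8 GB

ratio = worst / best
print(f"{ratio:.1%}")  # roughly 7.5%
```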
The reason for this horrible drop is that during DROP TABLE, PostgreSQL will scan and clean out shared_buffers. The trouble is: the larger the shared_buffers area actually is, the longer this will take. For most applications this is not a real issue. However, if your application is heavily built on creating and dropping tables for whatever reason, this can be a showstopper and kill performance.
Of course, creating and dropping so many tables should not happen in the first place, but from experience we have seen that it actually does. (Note that temporary tables are not affected in the same way, because their pages live in the session-local temp_buffers rather than in shared_buffers - see the discussion in the comments below.)
To understand how that works, see the update about vacuum truncation in this blog on table bloat.
Good to know, thanks. Does using temporary and/or unlogged tables reduce the problem?
This is really interesting, could you throw in a link to documentation as well?
wow. this is really interesting. I have a process that drops around 30K tables daily, and the performance of that process was terrible. The worst part was that it would affect other sessions severely while the drop statements were running (I saw huge I/O spikes). Eventually I rewrote the process to drop these tables over a longer period of time, pausing after each batch.
This should be documented on the DROP TABLE page of the Postgres docs... or better: why do we need to purge the whole shared buffer area on DROP TABLE?
Are you sure the same applies to temporary tables? As the pages of temporary tables don't go to shared buffers, I don't see how that would make sense for those.
Right. Per our discussion on #postgresql, temp tables aren't stored in shared_buffers at all.
Also, shared_buffers is *scanned*, not synced, when we drop a regular table.
I think the numbers shown above are bad even for dropping non-temp tables. The same problem will also happen for TRUNCATE. One way to optimize such cases would be for these operations to look up only the buffers holding blocks of that relation - we already track that in the buffer mapping table, so that should be possible. It might not be good for large relations, though, so we should do such an optimization only for relations which are small relative to shared_buffers.
The above shows that it is bad for non-temp tables, but not for temp tables. And yes, TRUNCATE suffers the same issue.
We talked about it on IRC, and reading the source, you can see that DropRelFileNodeBuffers [http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/storage/buffer/bufmgr.c;h=cc973b53a91b7ac34b40c9a5ba5313ec7ddd1da3;hb=HEAD#l2513] has O(N) complexity (with N=shared_buffers/8KB) for non-temp tables, but for temp ones it is handled by DropRelFileNodeLocalBuffers, which also has O(N) complexity, but N is at most temp_buffers/8KB (could be less, as the allocation is lazy).
I have also tried your script. Although my current box isn't good for tests, I could reproduce the behaviour you saw with higher shared_buffers values for non-temp tables, but when I change the table to a TEMP table, the TPS does not decrease.
So, in summary, I'd argue that issuing many DROP TABLE statements is not a good idea for plain tables, but is not bad for TEMP ones. One must still keep in mind that the same behaviour exists for TEMP tables, only driven by temp_buffers instead.
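The difference described in this comment can be made concrete with a back-of-the-envelope calculation. Assuming 8 kB pages and the default temp_buffers of 8 MB (both assumptions, not taken from the benchmark above), the number of buffer slots each O(N) scan has to walk through is:

```python
PAGE = 8 * 1024  # 8 kB pages

def buffers_to_scan(setting_bytes):
    # Number of slots the per-relation buffer scan walks through,
    # per the O(N) behaviour discussed in this thread.
    return setting_bytes // PAGE

shared_8gb = buffers_to_scan(8 * 1024**3)  # plain table, shared_buffers = 8 GB
temp_8mb = buffers_to_scan(8 * 1024**2)    # temp table, temp_buffers = 8 MB (default)

print(shared_8gb, temp_8mb)  # 1048576 vs 1024 slots
```

So with these settings a plain-table DROP has to scan roughly a thousand times more slots than a temp-table DROP, which matches the TPS collapse seen only for plain tables.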
yes, temp tables show no difference. this is a text written late at night. sorry, i missed it during my check ...
No problem. That happens, I'm glad it has been clarified... 🙂
sorry folks, there is a mistake in the blog above, it accidentally says that TEMP TABLES are a problem as well. they are not. sorry, this is a leftover sentence taken from some other, unrelated text.
Interesting post, does 'truncate' present the same behavior?
If we truncate a table would it scan the buffers?