When doing PostgreSQL consulting the other day, the talk went to the topic of connection pools - namely what approaches and products are commonly used and perform well? The topic is pretty wide in itself but mostly well-known for old-timers. Nevertheless it is worth a small write up on basic concepts and a small comparison of the two most common "near to Postgres" products that you should know about - PGBouncer and pgpool-II.
Table of Contents
First a basic intro - connection pools are middleware that speak the database protocol and cache database connections so that clients could spare the time used to negotiate the connection, do authentication and set client defaults (encoding, work_mem) when opening a new connection and also to relieve the database server from storing too much client state in memory. Thus applications connect to the pool, thinking it's the database.
Common approaches for deploying pools:
When talking about separate pooling servers in Postgres context, two products stand out: PgBouncer and pgpool-II. Both products are written in C and seem to be actively maintained, with pgpool-II seeing more action as it also has a higher number of ambitious features. Besides source code, packages for common Linux distros are available and deployment it pretty simple: customize the configuration files and start the daemon. Redirect your clients to the pools instead of the real DB (both use non-standard ports by default though - 6432 and 9999 accordingly). Based on the available online documentation (PgBouncer in current version 1.7.2 and pgpool-II in version 3.5.4, the latter one having sadly some outdated parts) I compiled the following outline of the features so that you can decide on the suitability for your needs yourself.
pgpool-II gotchas
PgBouncer gotchas
From sceptical reasons, to test the claims of my colleague Ants that PgBouncer is significantly faster than pgpool-II. I also decided to run a quick set of tests with all the components running on my laptop. Test setup - a small 13MB "pgbench" in-memory dataset in "--select-only" mode to get fast responses, as we want to test here only connection overhead. Pools were configured without SSL and so that the tested amount of 8 concurrent connections would always be kept cached by the pools and no connection re-establishing would take place during test. For PgBouncer default "session pooling" was used.
1 2 3 4 5 6 |
pgbench -i -s 1 bench # init the bench schema ~13MB for port in 5432 6432 9999 ; do for i in {1..3} ; do pgbench --select-only --connect -T300 -c8 -j2 -p $port bench done done |
Side note - before I could really fire off with testing I ran into a distro-specific problem where connections started to fail after some time, and it required changing some kernel parameters. More info here.
Results (as always, given with a YMMV disclaimer) were such:
Both well-known and battle-tested products, PgBouncer and pgpool-II, provide a good way to grab that low-hanging fruit for performance (very noticeable difference when doing very short and simple transactions) and also to add some flexibility to your setup by hiding the database from direct access, making it easier to do minor maintenance. For most usecases (no replicas or using external HA solutions) PgBouncer would be my pick, due to its lightweight architecture and superior performance.
+43 (0) 2622 93022-0
office@cybertec.at
You are currently viewing a placeholder content from Facebook. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from X. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More Information
Great writeup my friend!
Cheers for this article.
nice explanation, can i have difference between pgpool and pg bouncer,
can i use any one in my environment ?
You should compare features listed in this article and choose product suits for your needs.