Writing a complex database server like PostgreSQL is not an easy task. Memory management in particular needs special attention. Internally, PostgreSQL uses so-called "memory contexts" to organize memory in hierarchically arranged groups. The main advantage of this approach is that in case of an error, all memory belonging to a context, along with all of its children, can be freed at once.
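To make this more tangible, here is a minimal sketch of how backend code typically uses the memory context API. The function names match the PostgreSQL 9.4 sources (from 9.6 onward, AllocSetContextCreate takes a single ALLOCSET_DEFAULT_SIZES macro instead of the three size constants); the context name and allocation size are made up for the example:

#include "postgres.h"
#include "utils/memutils.h"

static void
demo_memory_contexts(void)
{
    MemoryContext mycontext;
    MemoryContext oldcontext;
    char       *buf;

    /* Create a child of the current context. Deleting (or resetting)
     * the parent releases this context and all of its children too. */
    mycontext = AllocSetContextCreate(CurrentMemoryContext,
                                      "demo context",
                                      ALLOCSET_DEFAULT_MINSIZE,
                                      ALLOCSET_DEFAULT_INITSIZE,
                                      ALLOCSET_DEFAULT_MAXSIZE);

    /* palloc() allocates from whichever context is currently active */
    oldcontext = MemoryContextSwitchTo(mycontext);
    buf = palloc(1024);

    /* ... work with buf ... */

    MemoryContextSwitchTo(oldcontext);

    /* a single call frees buf and everything else in the context */
    MemoryContextDelete(mycontext);
}

This is also why a memory leak in PostgreSQL usually shows up as one specific context growing without bounds: allocations are rarely freed one by one, but released when their context is reset or deleted.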
Understanding PostgreSQL memory contexts can be useful for solving a whole bunch of interesting support cases. Here is an example: recently we stumbled across a problem where a database server was constantly running out of memory and was killed by the OOM killer over and over again. The backend processes kept increasing their memory consumption for non-obvious reasons. How can a problem like this be approached?
GDB can come to the rescue and solve the riddle of memory consumption nicely. The basic procedure works as follows: extract a core dump of the running backend, write a gdb macro that walks through the memory contexts in the dump, and run it to see which contexts consume how much memory.
The first part is actually quite simple. To extract a core dump of a running process we have to find out the process ID first:
[hs@jacqueline debug]$ ps ax | grep post
 1987 pts/1    S      0:00 /usr/local/pg94/bin/postgres -D /tmp/db94
 1989 ?        Ss     0:00 postgres: checkpointer process
 1990 ?        Ss     0:00 postgres: writer process
 1991 ?        Ss     0:00 postgres: wal writer process
 1992 ?        Ss     0:00 postgres: autovacuum launcher process
 1993 ?        Ss     0:00 postgres: stats collector process
 1999 ?        Ss     0:00 postgres: hs test [local] idle
 2004 pts/1    S+     0:00 grep post
In this example the process ID of the process we want to inspect is 1999 (a simple, idle local backend).
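As a side note: if you can run SQL in the session you want to inspect, the backend can also tell you its own process ID directly via pg_backend_pid(). A sketch of what that would look like for the example session above:

test=# SELECT pg_backend_pid();
 pg_backend_pid
----------------
           1999
(1 row)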
Then it is time to create the core file. gcore can do exactly that for you:
[hs@jacqueline debug]$ gcore 1999
[Thread debugging using libthread_db enabled]
0x00000033e6ee98c2 in recv () from /lib64/libc.so.6
Saved corefile core.1999
The beauty here is that gcore is just a simple shell script calling some gdb magic internally. The result will be a core file we can then make use of:
[hs@jacqueline debug]$ ls -l
total 251220
-rw-rw-r-- 1 hs hs 257244232 Sep  2 09:23 core.1999
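Since gcore is only a thin wrapper around gdb, the equivalent can be achieved by attaching gdb in batch mode and calling its gcore command by hand. A rough sketch, reusing the PID and file name from the example above:

[hs@jacqueline debug]$ gdb -p 1999 -batch -ex 'gcore core.1999'

In batch mode gdb detaches and exits automatically once the command has run, so the backend only pauses while the dump is being written.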
Then comes the harder part: writing the gdb macros to inspect those memory contexts. Fortunately, gdb has a scripting language for exactly this kind of task. Here is the code:
[hs@jacqueline debug]$ cat /tmp/debug/pg_gdb_marcos
define sum_context_blocks
  set $context = $arg0
  set $block = ((AllocSet) $context)->blocks
  set $size = 0
  while ($block)
    set $size = $size + (((AllocBlock) $block)->endptr - ((char *) $block))
    set $block = ((AllocBlock) $block)->next
  end
  printf "%s: %d\n", ((MemoryContext) $context)->name, $size
end

define walk_contexts
  set $parent_$arg0 = ($arg1)
  set $indent_$arg0 = ($arg0)
  set $i_$arg0 = $indent_$arg0
  while ($i_$arg0)
    printf " "
    set $i_$arg0 = $i_$arg0 - 1
  end
  sum_context_blocks $parent_$arg0
  set $child_$arg0 = ((MemoryContext) $parent_$arg0)->firstchild
  set $indent_$arg0 = $indent_$arg0 + 1
  while ($child_$arg0)
    walk_contexts $indent_$arg0 $child_$arg0
    set $child_$arg0 = ((MemoryContext) $child_$arg0)->nextchild
  end
end

walk_contexts 0 TopMemoryContext
The last line of the file is the actual call executing the macros just defined: walk_contexts recursively visits all memory contexts, starting at TopMemoryContext, and prints the total block size of each one. The $arg0 suffix in the variable names is a trick to give each recursion level its own set of gdb convenience variables, since gdb substitutes macro arguments textually.
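To see what the macros are actually doing, it helps to look at the structures they cast to. Below is an abridged sketch of the relevant definitions, roughly as they appear in the PostgreSQL 9.4 sources (src/include/nodes/memnodes.h and src/backend/utils/mmgr/aset.c); fields the macros do not touch are omitted:

typedef struct MemoryContextData
{
    /* ... */
    MemoryContext firstchild;    /* head of the list of child contexts */
    MemoryContext nextchild;     /* next sibling in the parent's child list */
    char         *name;          /* context name, printed by sum_context_blocks */
    /* ... */
} MemoryContextData;

typedef struct AllocBlockData
{
    AllocSet    aset;            /* the allocation set this block belongs to */
    AllocBlock  next;            /* next block in the set's block list */
    char       *freeptr;         /* start of free space within the block */
    char       *endptr;          /* end of the block */
} AllocBlockData;

typedef struct AllocSetContext
{
    MemoryContextData header;    /* standard memory context fields */
    AllocBlock  blocks;          /* head of the list of blocks in this set */
    /* ... freelist bookkeeping omitted ... */
} AllocSetContext;

sum_context_blocks walks the blocks list of one allocation set and adds up endptr minus the block's start address, i.e. the full size of every block; walk_contexts then recurses through firstchild and nextchild to cover the entire context tree.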
To run the script, it can simply be piped into gdb, together with the core file and the matching postgres binary. The result lists the memory consumption of every context:
[hs@jacqueline debug]$ gdb -c ./core.1999 /usr/local/pg94/bin/postgres < pg_gdb_marcos
*snip*
Loaded symbols for /lib64/libnss_files.so.2
Core was generated by `postgres: hs test [local] idle'.
#0  0x00000033e6ee98c2 in recv () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.132.el6_5.2.x86_64
(gdb) >>>> > > >>>(gdb) (gdb) >>>> > > >>>>> > > >>(gdb) (gdb)
TopMemoryContext: 69936
 MessageContext: 8192
 Operator class cache: 8192
 smgr relation table: 24576
 TransactionAbortContext: 32768
 Portal hash: 8192
 PortalMemory: 0
 Relcache by OID: 24576
 CacheMemoryContext: 516096
  pg_db_role_setting_databaseid_rol_index: 1024
  pg_user_mapping_user_server_index: 1024
  pg_user_mapping_oid_index: 1024
  *snip*
  pg_class_oid_index: 1024
 MdSmgr: 8192
 ident parser context: 0
 hba parser context: 3072
 LOCALLOCK hash: 8192
 Timezones: 83472
 ErrorContext: 8192
(gdb)
The output is actually quite long, so I decided to remove a couple of lines. What you see here is how the memory contexts are organized and how much memory each of them holds.
If you happen to see a context which uses an insane amount of memory, it will definitely bring you one step closer to finding the root cause of a memory-related problem.
One more tip: if you can afford to briefly pause a backend's operation (such as to use gdb's gcore on it), you can probably also afford to take a memory context dump directly with:
print MemoryContextStats(TopMemoryContext)
Per https://wiki.postgresql.org/wiki/Developer_FAQ#Examining_backend_memory_use
It would be nice if Linux offered non-blocking core dumps of live processes (say, by cloning the memory mapping copy-on-write, as it does for fork()ed processes), but as far as I know it doesn't support that, so writing a core dump can actually be more intrusive than simply asking the running PostgreSQL backend to print a dump.
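Put together, such a live dump could look like the following sketch, reusing the PID from the example above. Note that MemoryContextStats() writes its report to the backend's stderr, so the output ends up in the server log rather than in gdb:

[hs@jacqueline debug]$ gdb -p 1999 -batch -ex 'print MemoryContextStats(TopMemoryContext)'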