CYBERTEC PostgreSQL Logo

shared_buffers: Looking into the PostgreSQL I/O cache

02.2020 / Category: , / Tags: |

The PostgreSQL caching system has always been a bit of a miracle to many people and many have asked me during consulting or training sessions: How can I figure out what the PostgreSQL I/O cache really contains? What is in shared buffers and how can one figure out? This post will answer this kind of question and we will dive into the PostgreSQL cache.

shared_buffers-looking-into-the-postgresql-i-o-cache

Creating a simple sample database to illustrate shared_buffers

Before we can inspect shared_buffers, we have to create a little database:

To keep it simple I have created a standard pgbench database containing 1 million rows, as follows:

Deploying 1 million rows is pretty fast. In my case it took around 1.5 seconds (on my laptop).

Deploying pg_buffercache - shared_buffers illustration

Now that we have some data, we can install the pg_buffercache extension, which is ideal if you want to inspect the content of the PostgreSQL I/O cache:

pg_buffercache will return one row per 8k block in shared_buffers. However, to make sense out of the data one has to understand the meaning of those OIDs in the view. To make it easier for you I have created some simple example.

Let us take a look at the sample data first:

My demo database consists of 4 small tables.

Inspecting per database caching

Often the question is how much data from which database is currently cached. While this sounds simple you have to keep some details in mind:

The reldatabase column contains the object ID of the database a block belongs to. However, there is a “special” thing here: 0 does not represent a database but rather the pg_global schema. Some objects in PostgreSQL such as the list of databases, the list of tablespaces or the list of users are not stored in a database – this information is global. Therefore “0” needs some special treatment here. Otherwise, the query is pretty straightforward. To figure out how much RAM is currently not empty, we have to go and count the empty entries which have no counterpart in pg_database. In my example, the cache is not really fully populated, but mostly empty. On a real server with real data and real load, the cache is almost always 100% in use (unless your configuration is dubious).

Inspecting your current database

There is one more question many people are interested in: What does the cache know about my database? To answer that question, I will access an index to make sure some blocks will be held in shared_buffers:

The following SQL statement will calculate how many blocks from which table (r) respectively index (relkind = i) are currently cached:

We deliberately exclude all relations with object ID below 16384, because these low IDs are reserved for system objects. That way, the output only contains data for user tables.

As you can see, the majority of blocks in memory originate from pgbench_accounts. This query is therefore a nice way to instantly find out what is in cache and what is not. There is a lot more information to be extracted, but for most use cases those two queries will answer the most pressing questions.

Finally …

If you want to know more about PostgreSQL and performance in general, I suggest checking out one of our other posts in PostgreSQL performance issues, or take a look at our most recent performance blogs.

 


In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.

Leave a Reply

Your email address will not be published. Required fields are marked *

CYBERTEC Logo white
Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram