As announced in the first part of this series, this article is the second part of our journey towards Docker and Kubernetes. By the end of this post, you will have built your first PostgreSQL container and learned and applied various ideas for optimising your containers.
But before we start, let's take a quick look back:
In the first post, we built our first container and learnt about the FROM, RUN, COPY, ENTRYPOINT and CMD instructions.
Building on these, this post adds further instructions to our vocabulary, among them EXPOSE, USER and ENV.
Equipped with the additional vocabulary, let's take a look at the concept of stages. A stage is the part of the Dockerfile that begins with a FROM instruction and includes all the statements belonging to that FROM. This means that I can define several FROM instructions, and therefore several stages, in one Dockerfile, which is then referred to as a multi-stage Dockerfile.
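As a minimal sketch of the idea (the image, paths and application name here are purely illustrative), a two-stage Dockerfile looks like this:

```dockerfile
# Stage 1: build environment with compilers and build tools
FROM golang:1.23 AS builder
WORKDIR /src
COPY . .
# Build a static binary (hypothetical application)
RUN CGO_ENABLED=0 go build -o /out/app .

# Stage 2: minimal runtime image; only the finished binary is copied over,
# none of the build tools end up in the final image
FROM scratch
COPY --from=builder /out/app /app
ENTRYPOINT ["/app"]
```

Only the last stage determines the final image; everything installed in the builder stage is left behind.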
The multi-stage approach is primarily used to create images that are as small as possible. This means that we can clone and compile something from Github in a first stage, for example, for which we need packages that are not required in the container to be executed later. By creating an additional stage for the container to be executed and copying the compiled software, we reduce the size of our image.
Let's use the example of the CYBERTEC-pg-container and take a look at the Dockerfile for the exporter.
```dockerfile
FROM rockylinux:9 AS builder

RUN dnf -y install --nodocs \
    --setopt=skip_missing_names_on_install=False \
    git \
    go \
    dumb-init \
    && dnf -y clean all ;

RUN git clone https://github.com/prometheus-community/postgres_exporter.git \
    && cd postgres_exporter \
    && make build

FROM rockylinux/rockylinux:9-ubi-micro

COPY --from=builder /usr/bin/dumb-init /usr/bin/dumb-init
COPY --from=builder ./postgres_exporter/postgres_exporter /bin/postgres_exporter
COPY launcher/exporter/launch.sh /
COPY scripts/exporter/queries/ /postgres_exporter/queries

EXPOSE 9187

ENTRYPOINT ["/usr/bin/dumb-init", "--"]
CMD ["/bin/sh", "/launch.sh", "init"]
```

Note: For simplification purposes, the variables have been removed and replaced with fixed values.
Obtaining and compiling the exporter project requires the additional packages git and go, as well as further dependencies that are included in the regular Rocky Linux 9 image. None of these are needed in the image that runs later, so we use a much smaller base image in the FROM clause of the second stage.
Now we could ask why we need multiple stages at all, instead of simply removing everything unnecessary from a single container to make it smaller.
The reason lies in the layers of a container. Every instruction in the Dockerfile (FROM, RUN, ...) creates a layer in the image, and each layer is immutable. This means I can only keep a layer small from within that same layer, by removing unnecessary elements before the instruction finishes. We can see this in the second part of the RUN command that installs the packages: before the layer is finalised, everything we don't need is removed, including the cache.
```dockerfile
RUN dnf -y install --nodocs \
    --setopt=skip_missing_names_on_install=False \
    git \
    go \
    dumb-init \
    && dnf -y clean all ;
```
If we instead remove the packages in a second RUN instruction, the image actually grows: we add an additional layer, while the packages are still present in the previous one.
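To illustrate the anti-pattern (an intentionally bad sketch): the following Dockerfile does not produce a smaller image than one that never installs git at all, because the installation layer is preserved unchanged underneath the removal layer:

```dockerfile
FROM rockylinux:9
# Layer 1: installs git -- this layer permanently contains the package files
RUN dnf -y install git && dnf -y clean all
# Layer 2: removing git adds yet another layer; it does NOT shrink layer 1
RUN dnf -y remove git && dnf -y clean all
```

You can verify this with `docker history <image>`, which lists the size that each individual layer contributes to the image.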
For our PostgreSQL container, however, we use a different trick. Instead of switching to a new stage based on a slimmed-down image, as in the example above, we start with a completely empty container, i.e. one without any layers, and create exactly one layer by copying in everything from our previous builder stage, which we stripped of everything unnecessary in a final step. This works because we take only the last, cleaned-up state of the builder stage, and it is written into exactly one layer.
```dockerfile
FROM rockylinux:9 AS builder

RUN dnf -y install --nodocs \
    --setopt=skip_missing_names_on_install=False \
    git \
    go \
    dumb-init \
    && dnf -y clean all ;

RUN dnf -y remove git go make \
    && dnf -y clean all ;

FROM scratch
COPY --from=builder / /
```
What we have to bear in mind with this approach is that certain settings only take effect when they are defined on the last, and therefore final, stage. This concerns instructions such as USER, ENV, EXPOSE, ENTRYPOINT and CMD.
It should also be noted that permissions on certain folders and files may still need to be adjusted if the final stage runs as a different user than the build stage.
Now let's get to work
We now want to build a simple PostgreSQL container that provides us with an instance.
To do this, we will install PostgreSQL from the PostgreSQL Global Development Group repos, remove unnecessary items from our container and finally create a fresh container with the content of the build stage and start it up.
```dockerfile
FROM rockylinux:9 AS builder

# Install needed repos and packages, remove unneeded packages
RUN dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-9-x86_64/pgdg-redhat-repo-latest.noarch.rpm https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm \
    && dnf -qy module disable postgresql \
    && dnf update -y \
    && dnf install -y postgresql17-server postgresql17 dumb-init \
    && dnf remove -y kernel kernel-core kernel-modules man-db man-pages NetworkManager \
    && dnf groupremove -y "Development Tools" "Base" "Standard" \
    && dnf autoremove -y \
    && dnf clean all \
    && mkdir -p /data \
    && chown -R postgres:postgres /data \
    && rm -rf /var/cache/dnf /usr/share/doc /usr/share/man;

# Create new container and copy the content of the builder stage
FROM scratch
COPY --from=builder / /
COPY launch.sh /launch.sh

RUN chmod +x /launch.sh

USER postgres

ENV PGDATA=/data/pgdata \
    PATH=$PATH:/usr/pgsql-17/bin

# Start dumb-init to ensure it is running as PID 1
ENTRYPOINT ["/usr/bin/dumb-init", "--"]

# Starter script
CMD ["/launch.sh"]
```
```bash
#!/bin/bash

# Initialise the database using the initdb command
initialize_db() {
    initdb -D "$PGDATA"
    if [ $? -eq 0 ]; then
        echo "Initialization successfully completed."
    else
        echo "Error during initialization of the data directory." >&2
        exit 1
    fi
}

# Start PostgreSQL
start_postgres() {
    echo "Starting PostgreSQL Server"
    sed -i "s/logging_collector = on/logging_collector = off/g" "$PGDATA/postgresql.conf"
    exec postgres -D "$PGDATA"
}

# Check whether the data directory exists: if yes, start the database;
# if no, initialise it first and then start it
if [ -d "$PGDATA" ]; then
    start_postgres
else
    initialize_db
    start_postgres
fi
```
```shell
$ docker build . --tag my_first_pg_container:0.0.1
[+] Building 3.0s (11/11) FINISHED
...
 => => naming to docker.io/library/my_first_pg_container:0.0.1

$ docker run -it docker.io/library/my_first_pg_container:0.0.1
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "C".
The default database encoding has accordingly been set to "SQL_ASCII".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory /data/pgdata ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default "max_connections" ... 100
selecting default "shared_buffers" ... 128MB
selecting default time zone ... UTC
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
initdb: hint: You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    pg_ctl -D /data/pgdata -l logfile start

Initialization successfully completed.
Starting PostgreSQL Server
2024-11-20 16:16:23.320 UTC [7] LOG:  starting PostgreSQL 17.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-2), 64-bit
2024-11-20 16:16:23.321 UTC [7] LOG:  listening on IPv6 address "::1", port 5432
2024-11-20 16:16:23.321 UTC [7] LOG:  listening on IPv4 address "127.0.0.1", port 5432
2024-11-20 16:16:23.333 UTC [7] LOG:  listening on Unix socket "/run/postgresql/.s.PGSQL.5432"
2024-11-20 16:16:23.346 UTC [7] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2024-11-20 16:16:23.356 UTC [21] LOG:  database system was shut down at 2024-11-20 16:16:18 UTC
2024-11-20 16:16:23.371 UTC [7] LOG:  database system is ready to accept connections
2024-11-20 16:21:23.453 UTC [19] LOG:  checkpoint starting: time
2024-11-20 16:21:27.840 UTC [19] LOG:  checkpoint complete: wrote 46 buffers (0.3%); 0 WAL file(s) added, 0 removed, 0 recycled; write=4.323 s, sync=0.030 s, total=4.387 s; sync files=11, longest=0.011 s, average=0.003 s; distance=270 kB, estimate=270 kB; lsn=0/1524158, redo lsn=0/1524100
```
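Once the container is running, you can connect to the instance from a second terminal. A quick check might look like this (the container ID is a placeholder; use whatever `docker ps` shows on your system, and note that initdb enabled "trust" authentication for local connections):

```shell
# Find the running container's ID or name
docker ps

# Open a psql session inside the container and run a test query
docker exec -it <container_id> psql -U postgres -c "SELECT version();"
```

Since the Dockerfile does not publish port 5432, the instance is only reachable from inside the container; a `docker run -p 5432:5432 ...` plus a `listen_addresses` change would be needed to connect from the host.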
In this blog article, we have looked at the various Docker instructions and different approaches to building small containers. Finally, we built our first PostgreSQL container, which is of course only intended for experimentation.
In the next part, we will take a deeper look at productively usable PostgreSQL containers and think about the proper use of containers.