CYBERTEC PostgreSQL Logo

PostgreSQL exclusive chain cron jobs using pg_timetable

12.2020 / Category: / Tags: | |

I wrote about the new pg_timetable 3 major release not so long ago. Two essential features were highlighted:
- new session locking implementation
- new jackc/pgx Golang library used

Today I want to reveal one more advanced feature! Fasten your seat belts!
First, we need to distinguish exclusive client session mode from exclusive chain execution mode.

Exclusive client session

Each pg_timetable instance, known as a client, is pre-set to work in the exclusive mode. Meaning there is no possibility each target database can hold more than one active client connection with the same client name. How we distinguish clients? Simple, the command-line parameter:
pg_timetable --clientname=worker01 ...

It is possible to run as many different clients as you want. Take a look:

Of course, life is complicated, and sometimes more than one pg_timetable instance with the same client name may connect to the target database. In this case, pg_timetable ensures that one and only one is executing chains exclusively. The second will wait until the active one will disconnect.

Frankly speaking, you can run whatever number of pg_timetable clients with the same name. The only one will be active (primary), and all others will be waiting (standby). The obvious drawback of such a situation is losing database connection slots to idle clients.

Exclusive chain

Let's get back to chains. By default, all chains are supposed to run simultaneously at their schedule. This number is physically limited by the number of pg_timetable instance parallel workers. Right now, this number is 16 for scheduled chains workers and 16 for interval chains workers. These workers are implemented using lightweight goroutines. That doesn't mean you cannot have more than 32 chains scheduled for the same time. It means no more than 16 scheduled are executed at any given time, and others are waiting in the queue.

Maybe later, we can add a settings parameter to specify the exact number of workers to run. However, we have never yet hit the situation where the default settings were not enough.

Imagine there are several pg_timetable instances working simultaneously against the same PostgreSQL database. And some task (exclusive or not) has no specified executor (by default). That leads all clients will execute the same chain in their own session.

One more time. We can only guarantee exclusive chain execution within the pg_timetable client session. We won't block any chain's simultaneous execution in several parallel pg_timetable sessions by design. Please use the client_name field of the timetable.chain_execution_config table during chain creation to prevent this situation.

Why exclusive chains matter? I'm sure you can find many use cases: report building, software updates, external non-transactional data updates (files), other process waiting, etc.

Let's sum up the concept of the exclusive task:

  1. exclusive chain adds itself to the queue
  2. exclusive chain waits until all working chains and jobs finished
  3. exclusive chain starts and blocks all other chains until it's done

Implementation details

If you look at the problem closer, you might see the classic readers-writers problems.

Readers-writers problems (there are some variations, check the wiki for details) occur when a shared piece of resource needs to be accessed by multiple executors. We assume our exclusive chains are writers, and non-exclusive (regular) chains are readers.

⚠ This is not related to a database, file system, or any other IO read-write. Just the concept itself is identical, so we can use the same algorithms efficiently.

Thank God we have standard Golang sync package which provides RWMutex type to solve this in a simple way.

From the Golang docs:

An RWMutex is a reader/writer mutual exclusion lock. The lock can be held by an arbitrary number of readers or a single writer. The zero value for an RWMutex is an unlocked mutex.

In other words, non-exclusive regular chains (readers) don't have to wait for each other. They only have to wait for exclusive chains (writers) holding the lock. And only one exclusive (writer) lock is possible at any given time.

During parallel operation, regular chains will call RWMutex.RLock() before and RWMutex.RUnlock() after transaction is done.

And exclusive chains will call RWMutex.Lock() before and RWMutex.Unlock() after. Simple and effective. I really like to Go!

Further reading:

Finally

This was the first in a series of posts dedicated to the new pg_timetable v3 features. Stay tuned for the components to be highlighted:

  • Asynchronous chains!
  • Debug mode for developing!

The previous post can be found here.

Stay safe, healthy, and wealthy!
Be happy! Peace! Love! ❤

Leave a Reply

Your email address will not be published. Required fields are marked *

CYBERTEC Logo white
Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram