CYBERTEC PostgreSQL Logo

pg_timetable: Advanced PostgreSQL scheduling

01.2020 / Category: / Tags: |

UPDATED 14.02.2023: pg_timetable is a brand-new job scheduler for PostgreSQL implemented completely from scratch (in Go). Traditional job scheduling is quite limited and does not cover all we needed in real life. Therefore, we thought about job execution in more detail and came up with a new concept that has not been seen in the PostgreSQL world before. pg_timetable does not simply execute some simple SQL statements periodically. It has been designed to execute entire sequences of operations at a time while offering some other important features we will discuss in this post.

Installing and running pg_timetable

You can run pg_timetable in 2 ways:

  • Container
  • Standalone application

Both methods are described on our Github page so it should be easy for most people to run pg_timetable and to give it a try.

For the sake of simplicity we assume that you have installed our new PostgreSQL job scheduler and everything is up and running. So let's take a look and see how pg_timetable can be configured.

Base tasks and chains

The core idea is to allow users to execute more complicated sequences of operations. Now what does that mean? Suppose you want to …

  • Start a transaction
  • Write a log entry to the database
  • Download a file
  • Import the file
  • Run aggregations
  • Commit the transaction

A sequence like that is quite common. Let us take a look at those steps in pg_timetable terminology: Each step on the way is a “base task”. There are various types of base tasks:

  • Built-in tasks (such as “send email”, start transaction, etc.)
  • SQL tasks (some database instruction)
  • Program tasks (some external executables)

Those tasks can be combined and turned into chains. A base task can take parameters which are added at runtime. What does that mean? Suppose you want to send an email once a day. Who is going to receive this email? This kind of information can be passed to chains / base tasks scheduled for execution.

Configuring pg_timetable: An example

Let us take a look at an easy pg_timetable configuration examples:

These are the simplest one-task chain examples, so-called jobs. A job is a chain with only one task according to our terminology.

There are a couple of interesting things here:

  • job_max_instances tells us if this job must not run concurrently. Set the vaslue to one to specify there is only single running job instance allowed at a time
  • job_self_destruct = false means that the chain will survive its own execution. If you set this to true, a chain is only executed once (in case it succeeds).
  • job_exclusive means that other chains doing something should wait until the end of this chain execution.

Detailed description of the add_job() function is available in the manual.

If you want to see more complex multi-step chain examples, please, visit our Samples folder in the repo and appropriate Samples section in the manual.

Using pg_timetable concurrency protection

Concurrency protection is an important aspect. Suppose you want to backup a database daily. What is going to happen if the backup takes 60 hours to complete? At some point, your server is going to die because jobs pile up. Or maybe backup files will start to overwrite each other. In short: If you want to make sure that only one backup at a time is allowed, pg_timetable concurrency protection is what you want. The max_instances = 1 setting is the easiest way to do that. There is no more need for pid files, no need for anything else. pg_timetable does it all for you.

Utilizing self-destructive chains

More often than not, it happens that a job should only be done once. The idea is simple: Suppose somebody gets a 7-day free trial. After 7 days, you want to shut down the person's account. Obviously, shutting down somebody's account is not done daily - it happens once, and if the process was successful, that was basically it.
Self-destructive chains are especially useful for asynchronous execution. What do I mean by that? Suppose you have a website and you press a button to launch a job that runs for 2 hours. Clearly, nobody is going to sit in front of the browser window and wait. What you do instead is to tell pg_timetable to execute stuff as soon as possible and kill the job as soon as it is ended successfully. Asynchronous execution is the prime use case.

High load

In case you're wondering how many simultaneous jobs can pg_timetable run in one session, we have two special examples for you: one simulates many parallel chains and another shows how to handle many tasks within a chain.

What is even better, the later shows you how to check the result code and output of a previous task. If the last task failed, that is possible only if ignore_error boolean = true is set for that task. Otherwise, a scheduler will stop the chain. This sample shows how to calculate failed, successful, and the total number of tasks executed. Based on these values, we can calculate the success ratio:

Making use of cron-style and interval configuration

To make configuration easier, we have introduced a cron-style configuration interface:

But what if you want to have enhanced control over jobs? It's not that unusual one cannot foresee the execution time of some complicated operations, like REFRESH MATERIALIZED VIEW, CREATE INDEX CONCURRENTLY, etc. It's wise to run the next iteration after the some period, let say in an hour, or a day. In this case pg_timetable provides special interval format:

  • '@after 2 hours' will start the next chain iteration in two hours after the finish of the previous run;
  • while '@every 2 hours' will spin a new chain run every two hours no matter if and when the previous run is finished;
  • and a special '@reboot' value will run the chain at the very beginning of a pg_timetable session. Pay attention: not the server reboot! Rotate logs, install updates, check system status are the good examples of things you might want to do during start-up.

Finally …

If you like pg_timetable, we are looking forward to receive you feedback, or maybe even feature requests, bug reports and contributions. We also want to point out that commercial support for pg_timetable is available internationally.

If you want to read something else right now, we can recommend our post about pg_permissions which is available here.


In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.

6 responses to “pg_timetable: Advanced PostgreSQL scheduling”

  1. Fantastic work team.
    I really don't understand why you are implementing core system features in Go.
    Why is this not done C ?
    Cheers

    • Hello Rafael. Thanks for your comment.

      Would you be kind to provide some examples on how PGQ can be used for scheduling? They are completely different, unless we missed something.

      Thanks

  2. Hi, I am new to PostgreSQL as I come from a SQL Server environment.
    I could not get the Windows installer to work but managed to install pg_timetable using the git clone option.

    I do like the tool but have a question regarding the scheduling.
    Do I have to have a command prompt open with the scheduling enabled for it to work (one per database?), or is there another option, i.e., backgroup worker, etc.? As I don't like keeping the command prompt open and running on the LIVE environment as it could be easy closed by someone without realising?

    • No; that would be pretty useless.
      You just add the job, then you can close the database connection, and the job will be started all by itself.

Leave a Reply

Your email address will not be published. Required fields are marked *

CYBERTEC Logo white
Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram