Our team is proud to introduce a new major pg_timetable v4 release!
Table of Contents
This time a huge amount of work has been done and the pg_timetable v4 release is backward-incompatible with previous versions. However, job migration is straightforward and shouldn't stop anyone from updating.
We've added detailed Read the Docs documentation and will enhance it further. Let us know what areas should be highlighted or what topics we should add.
Now, let's check out some of the great new features! You can review the official release page for a complete changelog with links to issues, commits, and pull requests.
I want to remind you that pg_timetable is a community project. So, please, don't hesitate to ask any questions , to report bugs, to star pg_timetable project, and to tell the world about it.
The previous release is still considered stable. There are no confirmed bugs nor critical performance issues and is absolutely safe to stick with it.
However, it won't be backpatched if new bugs are discovered. In addition, new features won't be implemented for the v3 branch.
To increase performance, we worked on several fronts.
First of all, we reworked the database schema and simplified it, as much as possible. This way new pg_timetable can execute fewer system queries leaving more bandwidth and connections to the real workers.
Secondly, we rewrote cron-related functions in pure SQL instead of PL/PgSQL. It allows us not only to increase performance but also to get rid of tedious debugging.
And the third pillar is the comprehensive logging system. See details below in the separate section.
As soon as the functionality grows, it becomes error-prone and tedious to handle configuration using only command-line parameters and environmental variables.
Rules of common sense suggest to use configuration files, in this case. We've added a detailed self-explanatory config.example.yaml file for you.
Of course, you are free to use a custom set of options or not to use it at all. For example, it seems like a good idea to provide general options in the configuration file and connection options using environmental variables, for the sake of security.
One of the most frequent tasks for database schedulers is to import some data from external sources to the database, e.g. sensors, logs, official statistics, etc. The CSV format is standard de facto for such kinds of data. PostgreSQL can use the COPY command to import CSV files. But to use this function, the file should be placed in the server's file system, so the PostgreSQL process can access it. That is not always the case for production environments, especially in the cloud era.
To overcome this limitation in the previous version of pg_timetable, we suggest using psql copy command, which performs a frontend (client) copy. This is an operation that runs an SQL COPY
command, but instead of the server reading or writing the specified file, psql reads or writes the file and routes the data between the server and the local file system. This means that file accessibility and privileges are those of the local user, not the server, and no SQL superuser privileges are required.
From now on you don't need any additional programs to perform client copy. One can use the new built-in CopyFromFile
task. You may find the full example in the manual. Below you will notice that the task accepts JSON parameters in the form of an object, e.g.
1 2 3 4 |
{ 'sql': 'COPY location FROM STDIN WITH DELIMITER '|' NULL '<NULL>'', 'filename': 'download/orte_ansi.txt' } |
sql
is the correct COPY .. FROM STDIN
command. One can use any of the standard clauses controlling file specification, e.g. file encoding, condition for insert, escaping, quoting, etc.filename
is the correct path (absolute or relative to pg_timetable) of the source CSV file.This task perfectly works for all formats supported by PostgreSQL: text
, csv
and binary
.
One of the key architectural decisions made for pg_timetable was the database-driven concept. Meaning scheduler sessions are fully controlled by the database and every piece of the output is stored back in the database for further analysis.
That said, people were using output redirection, if they wanted to have logs stored in files, e.g.
1 |
pg_timetable -c loader -u scheduler -d timetable > pgtt.log |
The drawback of this approach is that information is stored as plain text, making it hard for machine processing. In the new version, we have introduced the ability to duplicate logs to a file system using --log-file
command line parameter. Moreover, the user can choose the exact format of the underlying file specifying --log-file-format
: json
or text
.
Another bottleneck we were trying to fix was the database logging. Heavy clients were producing a lot of information to be stored by separate INSERT
statements, causing intensive use of connections. We have come up with an outstanding log hook implementation for the famous logrus package.
The key benefits are:
COPY
machinery;I personally highly encourage you to use our log hook in your projects. Let us know if you think we should make it a standalone project.
pg_timetable has been storing output for program tasks from the very beginning. It's useful for debugging and back-in-time incident analysis. We never thought this feature could be applied to the built-in SQL tasks until Nikolay Samokhvalov proposed it. So now one can have SQL task with SELECT
statement to determine which output will be saved upon execution, e.g.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
my_database=> timetable=> SELECT timetable.add_job('log-sessions', '* * * * *', 'SELECT * FROM timetable.active_session'); add_job --------- 1 (1 row) ... my_database=> SELECT chain_id, finished, output FROM timetable.execution_log; chain_id | finished | output ----------+-------------------------------+---------- 1 | 2021-08-12 15:40:46.064795+02 | SELECT 2 (1 row) |
The output of the SQL task is the tag of the command executed and the number of rows returned or affected. We are not storing the result set since commands can be arbitrarily complicated or rather huge. Users should take care of this and explicitly store anything important, e.g.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 |
my_database=> CREATE TABLE history_session ( ts timestamptz DEFAULT now(), LIKE timetable.active_session); CREATE TABLE my_database=> SELECT timetable.add_job('dump-sessions', '* * * * *', 'INSERT INTO history_session SELECT now(), * FROM timetable.active_session'); add_job --------- 3 (1 row) ... my_database=> SELECT * FROM history_session; ts | client_pid | client_name | server_pid -------------------------------+------------+-------------+------------ 2021-08-12 15:52:02.378981+02 | 10440 | loader-foo | 15676 2021-08-12 15:52:02.378981+02 | 10440 | loader-foo | 16436 2021-08-12 15:52:02.378981+02 | 10440 | loader-foo | 14892 2021-08-12 15:52:02.378981+02 | 10440 | loader-foo | 25480 2021-08-12 15:53:02.446911+02 | 19664 | writer-boo | 21876 2021-08-12 15:53:02.446911+02 | 19664 | writer-boo | 11852 2021-08-12 15:53:02.446911+02 | 19664 | writer-boo | 10052 2021-08-12 15:53:02.446911+02 | 19664 | writer-boo | 27396 (8 rows) my_database=> SELECT chain_id, finished, output FROM timetable.execution_log; chain_id | finished | output ----------+-------------------------------+------------ 1 | 2021-08-12 15:40:46.064795+02 | SELECT 2 1 | 2021-08-12 15:52:02.380711+02 | SELECT 4 3 | 2021-08-12 15:52:02.382296+02 | INSERT 0 4 1 | 2021-08-12 15:52:07.34872+02 | SELECT 3 3 | 2021-08-12 15:52:07.448914+02 | INSERT 0 4 (5 rows) |
We want our scheduler to be as flexible as possible. That's why we have introduced the options category for resource management:
--cron-workers
: Number of parallel workers for scheduled chains (default: 16);--interval-workers
: Number of parallel workers for interval chains (default: 16);--chain-timeout
: Abort any chain that takes more than the specified number of milliseconds;--task-timeout
: Abort any task within a chain that takes more than the specified numberThis was the first in a series of posts dedicated to the new pg_timetable v4 features. Stay tuned for the coolest features to be highlighted.
You can find previous publications following the tag pg_timetable.
Stay safe, healthy, and wealthy!
Be happy! Peace! Love! ❤
+43 (0) 2622 93022-0
office@cybertec.at
You are currently viewing a placeholder content from Facebook. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from X. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More Information
Great, thx.
Nice to see - it's written in go(lang) and uses pgx.
I'm currently take a look at go also...