There are many reasons, actually, why people might want to export pgAgent jobs to pg_timetable. Most of them may be spotted by taking a look at the PostgreSQL schedulers comparison table I introduced in one of my previous posts and the supported platform table from the official readme.
Table of Contents
In my previous article I highlighted the reasons why and the ways how one might want to move scheduled jobs from pg_cron to pg_timetable.
SQL
and BATCH
(shell) tasks. pg_timetable's task can be one of SQL
, PROGRAM
, and BUILTIN
. Thus to emulate pgAgent's BATCH
step, pg_timetable will run a PROGRAM
task calling shell execution, e.g. bash -c "command"
.To export every piece of information available as precisely as possible, use this SQL snippet. I will explain it step by step.
As I said earlier, we need to convert boolean arrays storing execution information for minutes, hours, days, and months to proper cron notation.
The first thing to notice is that arrays are of different lengths, e.g., 60 for minutes, 12 for months, etc. Second, the cron clause starts from 0 for hours and minutes but with 1 for other units.
The trick is simple: unnest the input array, join it with order numbers, throw out rows with false values, and combine numbers into a cron-acceptable comma-separated string. If all values in the array are false, then just use asterisk cron notation.
1 2 3 4 5 6 7 8 |
CREATE OR REPLACE FUNCTION bool_array_to_cron(bool[], start_with int4 DEFAULT 0) RETURNS TEXT AS $$ WITH u AS ( SELECT unnest($1) e, generate_series($2, array_length($1, 1)-1+$2) AS i ) SELECT COALESCE(string_agg(i::text, ','), '*') FROM u WHERE e $$ LANGUAGE sql; |
pg_timetable is a cross-platform tool that may run on most of the known platforms. Because of this, we want to know exactly which shell will be used for batch pgAgent tasks. On Unix-like platforms the obvious choice would be sh -c "command"
. On Windows, it might be either pwsh -c "command"
or cmd /C "command"
.
Thecte_shell
clause serves precisely that purpose. You can easily specify the default shell to use:
1 2 3 |
cte_shell(shell, cmd_param) AS ( VALUES ('sh', '-c') ) |
Please note the naming convention I'm using here. If the CTE clause starts with 'cte', it somehow produces the final result. Meanwhile, CTE clauses named 'pga_*' prepare pgAgent-specific result sets for transformation.
In this CTE clause, we want to return all active pgAgent schedules with cron notation. The schedule name will later be used in the chain name.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
pga_schedule AS ( SELECT s.jscjobid, s.jscname, format('%s %s %s %s %s', bool_array_to_cron(s.jscminutes), bool_array_to_cron(s.jschours), bool_array_to_cron(s.jscmonthdays), bool_array_to_cron(s.jscmonths, 1), bool_array_to_cron(s.jscweekdays, 1)) AS schedule FROM pgagent.pga_schedule s WHERE s.jscenabled AND now() < COALESCE(s.jscend, 'infinity'::timestamptz) AND now() > s.jscstart ) |
This clause will return the result set for all pgAgent jobs scheduled. Because on the target database there might already be some chains, we cannot use jobid
directly. Instead, we want to generate the next chain_id
from the sequence to use later.
Given that pgAgent might have several schedules for the same job, we are specifying jobname
as a concatenation of the job name and the schedule name. cte_chain
will insert all selected rows into timetable.chain
table.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
pga_chain AS ( SELECT nextval('timetable.chain_chain_id_seq'::regclass) AS chain_id, jobid, format('%s @ %s', jobname, jscname) AS jobname, jobhostagent, jobenabled, schedule FROM pgagent.pga_job JOIN pga_schedule ON jobid = jscjobid ), cte_chain AS ( INSERT INTO timetable.chain (chain_id, chain_name, client_name, run_at, live) SELECT chain_id, jobname, jobhostagent, schedule, jobenabled FROM pga_chain ) |
This clause is the most complicated one here. So let me throw some light on it. We want to generate the task_id
in a similar way as we did for chain_id
.
pgAgent steps are performed in alphanumeric order by name. pg_timetable uses the numeric value to specify the task order. To generate these numeric values, we are using the rank() window function here.
The hardest part is with the connection string. pg_timetable and pgAgent allow remote execution for SQL tasks. Within the pgAgent settings, you may use either the database name to execute a task on the same cluster, or specify the connection string for a custom remote cluster. pg_timetable uses connection string notation in both cases. The logic is:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
pga_step AS ( SELECT c.chain_id, nextval('timetable.task_task_id_seq'::regclass) AS task_id, rank() OVER (ORDER BY jstname) AS jstorder, jstid, jstname, jstenabled, CASE jstkind WHEN 'b' THEN 'PROGRAM' ELSE 'SQL' END AS jstkind, jstcode, COALESCE( NULLIF(jstconnstr, ''), CASE WHEN jstdbname = current_database() THEN NULL WHEN jstdbname > '' THEN 'dbname=' || jstdbname END ) AS jstconnstr, jstonerror != 'f' AS jstignoreerror FROM pga_chain c JOIN pgagent.pga_jobstep js ON c.jobid = js.jstjobid ) |
These clauses are pretty straightforward. The only thing to explain is how PROGRAM
tasks are specified in pg_timetable. The command
column should contain the program name (possibly with full path) only. In our case, it's the name of the shell we specified in the first step. All command-line parameters for executing the shell go into the timetable.parameter
table. Pay attention; we don't do any quote escaping here. pg_timetable handles all parameter values automatically.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
cte_tasks AS ( INSERT INTO timetable.task(task_id, chain_id, task_name, task_order, kind, command, database_connection) SELECT task_id, chain_id, jstname, jstorder, jstkind::timetable.command_kind, CASE jstkind WHEN 'SQL' THEN jstcode ELSE sh.shell END, jstconnstr FROM pga_step, cte_shell sh ), cte_parameters AS ( INSERT INTO timetable.parameter (task_id, order_id, value) SELECT task_id, 1, jsonb_build_array(sh.cmd_param, s.jstcode) FROM pga_step s, cte_shell sh WHERE s.jstkind = 'PROGRAM' ) |
Finally, when you execute the migration script, you might see something like this:
1 2 3 4 5 |
chain_id|jobid|jobname |jobhostagent|jobenabled|schedule | --------+-----+------------------------------------------------+------------+----------+--------------------------------------------------------------------------------------------------------+ 91| 1|Download locations and aggregate @ every min |node1 |true |* * * * * | 92| 1|Download locations and aggregate @ test schedule|node1 |true |2 3,6 3 4 1 | 93| 1|Download locations and aggregate @ run download |node1 |true |0,5 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23 * 1,2,3,4,5,6,7,8,9,10,11,12 2,3,4,5,6| |
From this particular example, one may conclude pgAgent has:
--clientname=node1
.If you want to contribute to pg_timetable and help to make it better:
In conclusion, I wish you all the best! ♥️
One more thing. I want to thank everyone here in the Cybertec and the PostgreSQL community for the support you provided to Ukrainians and me personally! #StandWithUkraine 💙💛
I hope we can meet in person at one of the conferences, meetups, or training sessions!
Please leave your comments below. In order to receive regular updates on important changes in PostgreSQL, subscribe to the CYBERTEC newsletter, or follow us on Twitter, Facebook, or LinkedIn
You need to load content from reCAPTCHA to submit the form. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from Facebook. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from X. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More Information
Leave a Reply