2020-03-08 14:26:56 +00:00
|
|
|
|
---
|
|
|
|
|
layout: docs
|
|
|
|
|
title: Periodic Tasks
|
2020-03-28 15:35:28 +00:00
|
|
|
|
permalink: dev/adr/0012_periodic_tasks
|
2020-03-08 14:26:56 +00:00
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
# Periodic Tasks
|
|
|
|
|
|
|
|
|
|
## Context and Problem Statement
|
|
|
|
|
|
|
|
|
|
Currently there is a `Scheduler` that consumes tasks off a queue in
|
|
|
|
|
the database. This allows multiple job executors running in parallel
|
|
|
|
|
racing for the next job to execute. This is for executing tasks
|
|
|
|
|
immediately – as long as there are enough resource.
|
|
|
|
|
|
|
|
|
|
What is missing, is a component that maintains periodic tasks. The
|
|
|
|
|
reason for this is to have house keeping tasks that run regularily and
|
|
|
|
|
clean up stale or unused data. Later, users should be able to create
|
2020-03-19 21:42:58 +00:00
|
|
|
|
periodic tasks, for example to read e-mails from an inbox or to be
|
|
|
|
|
notified of due items.
|
2020-03-08 14:26:56 +00:00
|
|
|
|
|
|
|
|
|
The problem is again, that it must work with multiple job executor
|
|
|
|
|
instances running at the same time. This is the same pattern as with
|
|
|
|
|
the `Scheduler`: it must be ensured that only one task is used at a
|
|
|
|
|
time. Multiple job exectuors must not schedule a perdiodic task more
|
|
|
|
|
than once. If a periodic tasks takes longer than the time between
|
|
|
|
|
runs, it must wait for the next interval.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
## Considered Options
|
|
|
|
|
|
|
|
|
|
1. Adding a `timer` and `nextrun` field to the current `job` table
|
|
|
|
|
2. Creating a separate table for periodic tasks
|
|
|
|
|
|
|
|
|
|
## Decision Outcome
|
|
|
|
|
|
|
|
|
|
The 2. option.
|
|
|
|
|
|
|
|
|
|
For internal housekeeping tasks, it may suffice to reuse the existing
|
|
|
|
|
`job` queue by adding more fields such that a job may be considered
|
|
|
|
|
periodic. But this conflates with what the `Scheduler` is doing now
|
|
|
|
|
(executing tasks as soon as possible while being bound to some
|
2020-03-19 21:42:58 +00:00
|
|
|
|
resource limits) with a completely different subject.
|
2020-03-08 14:26:56 +00:00
|
|
|
|
|
|
|
|
|
There will be a new `PeriodicScheduler` that works on a new table in
|
|
|
|
|
the database that is representing periodic tasks. This table will
|
2020-03-19 21:42:58 +00:00
|
|
|
|
share fields with the `job` table to be able to create `RJob` records.
|
|
|
|
|
This new component is only taking care of periodically submitting jobs
|
|
|
|
|
to the job queue such that the `Scheduler` will eventually pick it up
|
|
|
|
|
and run it. If the tasks cannot run (for example due to resource
|
|
|
|
|
limitation), the periodic scheduler can't do nothing but wait and try
|
|
|
|
|
next time.
|
2020-03-08 14:26:56 +00:00
|
|
|
|
|
|
|
|
|
```sql
|
|
|
|
|
CREATE TABLE "periodic_task" (
|
|
|
|
|
"id" varchar(254) not null primary key,
|
|
|
|
|
"enabled" boolean not null,
|
|
|
|
|
"task" varchar(254) not null,
|
|
|
|
|
"group_" varchar(254) not null,
|
|
|
|
|
"args" text not null,
|
|
|
|
|
"subject" varchar(254) not null,
|
|
|
|
|
"submitter" varchar(254) not null,
|
|
|
|
|
"priority" int not null,
|
|
|
|
|
"worker" varchar(254),
|
|
|
|
|
"marked" timestamp,
|
|
|
|
|
"timer" varchar(254) not null,
|
|
|
|
|
"nextrun" timestamp not null,
|
|
|
|
|
"created" timestamp not null
|
|
|
|
|
);
|
|
|
|
|
```
|
|
|
|
|
|
2020-03-19 21:42:58 +00:00
|
|
|
|
Preparing for other features, at some point periodic tasks will be
|
|
|
|
|
created by users. It should be possible to disable/enable them. The
|
|
|
|
|
next 6 properties are needed to insert jobs into the `job` table. The
|
|
|
|
|
`worker` field (and `marked`) are used to mark a periodic job as
|
|
|
|
|
"being worked on by a job executor".
|
2020-03-08 14:26:56 +00:00
|
|
|
|
|
|
|
|
|
The `timer` is the schedule, which is a
|
|
|
|
|
[systemd-like](https://man.cx/systemd.time#heading7) calendar event
|
|
|
|
|
string. This is parsed by [this
|
|
|
|
|
library](https://github.com/eikek/calev). The `nextrun` field will
|
|
|
|
|
store the timestamp of the next time the task would need to be
|
|
|
|
|
executed. This is needed to query this table for the newest task.
|
|
|
|
|
|
|
|
|
|
The `PeriodicScheduler` works roughly like this:
|
|
|
|
|
|
|
|
|
|
On startup:
|
|
|
|
|
- Remove stale worker values. If the process has been killed, there
|
|
|
|
|
may be marked tasks which must be cleared now.
|
|
|
|
|
|
|
|
|
|
Main-Loop:
|
|
|
|
|
0. Cancel current scheduled notify (see 4. below)
|
|
|
|
|
1. get next (= earliest & enabled) periodic job
|
|
|
|
|
2. if none: stop
|
|
|
|
|
3. if triggered (= `nextrun <= 'now'`):
|
|
|
|
|
- Mark periodic task. On fail: goto 1.
|
|
|
|
|
- Submit new job into the jobqueue:
|
|
|
|
|
- Update `nextrun` field
|
|
|
|
|
- Check for non-final jobs of that name. This is required to not
|
|
|
|
|
run the same periodic task multiple times concurrently.
|
|
|
|
|
- if exist: goto 4.
|
|
|
|
|
- if not exist: submit job
|
|
|
|
|
- Unmark periodic task
|
|
|
|
|
4. if future
|
|
|
|
|
- schedule notify: notify self to run again next time the task
|
|
|
|
|
schedule triggers
|