mirror of
https://github.com/TheAnachronism/docspell.git
synced 2025-06-23 19:08:26 +00:00
Addon docs
This commit is contained in:
376
website/site/content/docs/addons/writing.md
Normal file
376
website/site/content/docs/addons/writing.md
Normal file
@ -0,0 +1,376 @@
|
||||
+++
|
||||
title = "Writing"
|
||||
insert_anchor_links = "right"
|
||||
description = "How to write addons"
|
||||
weight = 20
|
||||
template = "docs.html"
|
||||
+++
|
||||
|
||||
# Writing Addons
|
||||
|
||||
Writing an addon can be divided into two things:
|
||||
|
||||
- create the program
|
||||
- define how to package and run it
|
||||
|
||||
The next sections describe both parts. For a quick start, check out
|
||||
the example addons.
|
||||
|
||||
As previously written, you can choose a language. The interaction with
|
||||
docspell happens by exchanging JSON data. So, whatever you choose, it
|
||||
should be possible to read and produce JSON with some convenience.
|
||||
|
||||
|
||||
# Writing the program
|
||||
|
||||
## Interface to Docspell
|
||||
|
||||
The interface to Docspell is JSON data. The addon receives all inputs
|
||||
as JSON and may return a JSON object as output (via stdout).
|
||||
|
||||
An addon can be executed in different contexts. Depending on this, the
|
||||
available inputs differ. The addon always receives one argument, which
|
||||
is a file containing the user supplied data (it may be empty). A user
|
||||
is able to provide data to every addon from the web-ui.
|
||||
|
||||
All other things are provided as environment variables. There are
|
||||
environment variables that are always provided and some are only
|
||||
available for specific contexts.
|
||||
|
||||
For example, an addon that is executed in the context of an item
|
||||
(maybe after processing or when a user selects an addon to run "on an
|
||||
item"), Docspell prepares all data for the corresponding item and
|
||||
makes it available to the addon. In contrast, an addon executed
|
||||
periodically by a schedule, won't have this data available.
|
||||
|
||||
|
||||
## Basic Environment
|
||||
|
||||
The following environment variables are always provided by Docspell:
|
||||
|
||||
- `ADDON_DIR` points to the directory containing the extracted addon
|
||||
zip file
|
||||
- `TMPDIR` / `TMP_DIR` a directory for storing temporary data
|
||||
- `OUTPUT_DIR` a directory for storing files that should be processed
|
||||
by docspell
|
||||
- `CACHE_DIR` a directory for storing data that should stay between
|
||||
addon runs
|
||||
|
||||
It is very much recommended to always use these environment variables
|
||||
when reading and writing data. This keeps Docspell in control about
|
||||
the exact location.
|
||||
|
||||
The working directory will be set to a directory that is also
|
||||
temporary, but please don't rely on that. Use the environment
|
||||
variables.
|
||||
|
||||
## Item data
|
||||
|
||||
When executed in the context of an item. Meaning for triggers:
|
||||
`final-process-item`, `final-reprocess-item`, `existing-item`.
|
||||
|
||||
### `ITEM_DATA_JSON`
|
||||
|
||||
This environment variable points to a JSON file containing information
|
||||
about the current item. If it is run at processing time, it includes
|
||||
all information gathered so far by Docspell.
|
||||
|
||||
**Example**
|
||||
{{ incl_json(path="templates/shortcodes/item-data") }}
|
||||
|
||||
|
||||
### `ITEM_ARGS_JSON`
|
||||
|
||||
This environment variable points to a JSON file that contains the user
|
||||
supplied information with an upload request. That is, a user may
|
||||
specify tags or a language when uploading files. This would be in this
|
||||
file.
|
||||
|
||||
*This is only available for uploads. Trigger `final-process-item`.*
|
||||
|
||||
**Example**
|
||||
{{ incl_json(path="templates/shortcodes/item-args") }}
|
||||
|
||||
|
||||
### `ITEM_ORIGINAL_JSON` and `ITEM_PDF_JSON`
|
||||
|
||||
These JSON files contains a list of objects. Each object provides
|
||||
properties about a file - either the original file or the converted
|
||||
pdf. The structure is the same.
|
||||
|
||||
**Example**
|
||||
{{ incl_json(path="templates/shortcodes/file-meta") }}
|
||||
|
||||
|
||||
|
||||
### Directories
|
||||
|
||||
These environment variables point to directories that contain the attachment files.
|
||||
|
||||
- `ITEM_PDF_DIR` contains all converted pdf files, the attachment id is the filename
|
||||
- `ITEM_ORIGINAL_DIR` contains all original files, the attachment id is the filename
|
||||
|
||||
For example, to obtain a converted pdf file, lookup the id in
|
||||
`ITEM_PDF_JSON` and then construct the file name via
|
||||
`ITEM_PDF_DIR/{id}`.
|
||||
|
||||
|
||||
## Session for dsc
|
||||
|
||||
An addon may use [dsc](@/docs/tools/cli.md) which requires for many
|
||||
commands a valid session identifier. Usually this is obtained by
|
||||
logging in (i.e. using `dsc login`). This is not really feasible from
|
||||
inside an addon, of course. Therefore you can configure an addon to
|
||||
run on behalf of some user when creating the run configuration.
|
||||
Docspell then generates a valid session identifier and puts it into
|
||||
the environment. The [dsc](@/docs/tools/cli.md) tool will pick them up
|
||||
automatically.
|
||||
|
||||
It will also setup the URL to connect to some restserver. (If you have
|
||||
multiple rest-servers running, it will pick one randomly).
|
||||
|
||||
- `DSC_SESSION` env variable containing a session identifier. It's
|
||||
validity is coupled on the configured timeout.
|
||||
- `DSC_DOCSPELL_URL` the base url to some rest server
|
||||
|
||||
That means when using an addon in this way, you can simply use `dsc`
|
||||
without worrying about authentication or the correct URL to connect
|
||||
to.
|
||||
|
||||
|
||||
## Output
|
||||
|
||||
Docspell doesn't interpret the returncode of an addon, except checking
|
||||
for being equal to `0` which indicates a successful run.
|
||||
|
||||
In order to do change data in Docspell, the addon program can run
|
||||
`dsc` (for example) to change some state - like setting tags etc. But
|
||||
the preferred approach would be to return instructions for Docspell.
|
||||
Docspell will execute the instructions when the addon terminates
|
||||
successfully - that is with return code `0`.
|
||||
|
||||
These instructions are in a JSON object which needs to go to stdout.
|
||||
You can use stderr in an addon for logging/debugging purposes. But if
|
||||
you specify `collectOutput: true` in the descriptior, then stdout must
|
||||
only return this specific JSON (or nothing, empty output is ignored).
|
||||
|
||||
You find the complete structure below. It consists of these parts:
|
||||
|
||||
- `commands`: let's you declare actions to do for an item or attachment
|
||||
- `files`: defines files relative to `OUTPUT_DIR` that should be
|
||||
processed
|
||||
- `newItems`: declares files relative to `OUTPUT_DIR` that should be
|
||||
processed as new uploads
|
||||
|
||||
The `commands` allows to set tags, fields and other things. All parts
|
||||
are optional, you don't need to return the complete structure. Just
|
||||
returning `commands` or only `files` is ok.
|
||||
|
||||
**Example**
|
||||
{{ incl_json(path="templates/shortcodes/addon-output") }}
|
||||
|
||||
|
||||
# Descriptor
|
||||
|
||||
An addon must provide an *addon descriptior*, which is a yaml or json
|
||||
file looking like this:
|
||||
|
||||
```yaml
|
||||
# The meta section is required. Name and version must not contain
|
||||
# whitespace
|
||||
meta:
|
||||
name: "name-of-addon"
|
||||
version: "2.21"
|
||||
description: |
|
||||
Describe the purpose and how it must be used here
|
||||
|
||||
# Defining when this addon is run. This is used to guide the user
|
||||
# interface in selecting an addon. At least one is required to specify.
|
||||
#
|
||||
# Possible values:
|
||||
# - scheduled: requires to enter a timer to run this addon periodically
|
||||
# - final-process-item: the final step when processing an item
|
||||
# - final-reprocess-item: the final step when reprocessing an item
|
||||
# - existing-item: A user selects the addon to run on an item
|
||||
triggers:
|
||||
- final-process-item
|
||||
- final-reprocess-item
|
||||
- existing-item
|
||||
|
||||
# How to build and run this addon (optional). If missing, auto
|
||||
# detection will enable a nix runner if a `flake.nix` is found in the
|
||||
# source root and docker if a `Dockerfile` is found.
|
||||
#
|
||||
# Both runners are compared to what is enabled at the server.
|
||||
runner:
|
||||
# Building the program using nix flakes. This requires a flake.nix
|
||||
# file in the source root with a default package and a flake-enabled
|
||||
# nix on the joex machine.
|
||||
#
|
||||
# The program is build via `nix build`. If the joex machine has
|
||||
# systemd-nspawn installed, it is used to run the addon inside a
|
||||
# container. Otherwise the addon is run directly on the machine.
|
||||
nix:
|
||||
enable: true
|
||||
|
||||
# Docker based runner can define a custom image to use. If a `build`
|
||||
# key exists pointing to a Dockerfile, the image is build before. If
|
||||
# the docker image is complex, you can build it independently and
|
||||
# provide the pre-build image.
|
||||
#
|
||||
# The program is run via `docker run` passing the arguments to the
|
||||
# addon. Thus it expects the entrypoint to be correctly configured
|
||||
# to the executable. You may use `args` in order to prepend
|
||||
# additional arguments, like the path to an executable if the image
|
||||
# requires that. The joex machine must have docker installed and the
|
||||
# user running joex must be allowed to use docker. You must either
|
||||
# define an image with an appropriate entry point or a dockerfile.
|
||||
docker:
|
||||
enable: false
|
||||
#image: myorg/myimage:latest
|
||||
build: Dockerfile
|
||||
|
||||
# Trivial runner that simply executes the file specified with
|
||||
# `exec`. Nothing is build before. This runner usually requires that
|
||||
# the joex machine contains all dependencies needed to run the
|
||||
# addon. You may need to install additional software on the machine
|
||||
# running joex.
|
||||
trivial:
|
||||
enable: false
|
||||
exec: src/addon.sh
|
||||
|
||||
# Optional arguments/options given to the program. The program
|
||||
# receives at least one argument, which is a file to the user input as
|
||||
# supplied in the application. The arguments here are prepended.
|
||||
args:
|
||||
|
||||
|
||||
options:
|
||||
# If false, the program is run inside a private network, blocking
|
||||
# traffic to the host and networks reachable from there. This only
|
||||
# applies if the addon can be run inside a container.
|
||||
#
|
||||
# If the addon runs side effects (such as using dsc to set tags),
|
||||
# this must be set to `true`.
|
||||
#
|
||||
# Default is false.
|
||||
networking: true
|
||||
|
||||
# If true, the stdout of the program is parsed into a JSON structure
|
||||
# that is interpreted as actions executed by the task that runs the
|
||||
# addon. If the addon runs side effects only, set this to `false`
|
||||
# and the output is ignored.
|
||||
#
|
||||
# It is recommended to use this approach, if possible. It allows
|
||||
# docspell itself to apply any changes and the addon can run
|
||||
# completely isolated.
|
||||
#
|
||||
# Default is false.
|
||||
collectOutput: true
|
||||
```
|
||||
|
||||
|
||||
# Packaging
|
||||
|
||||
Docspell can use different ways to build and run the addon:
|
||||
`nix-flake`, `docker` and `trivial`. The first two allow to package
|
||||
the addon in a defined way (with a single dependency, either nix or
|
||||
docker) and then execute it independently from the underlying system.
|
||||
This makes it possible to execute the addon on a variety of systems.
|
||||
This is especially useful for addons that are meant to be public and
|
||||
reusable by different people.
|
||||
|
||||
The "trivial" runner is only executing some program specified in
|
||||
`docspell-addon.yaml`, directly on the joex machine (or via
|
||||
`systemd-nspawn`). The machine running joex must then provide all
|
||||
necessary dependencies and it must be compatible to run the addon. It
|
||||
may be useful especially for personal addons.
|
||||
|
||||
|
||||
## nix flake
|
||||
|
||||
Using [nix](https://nixos.org) with
|
||||
[flakes](https://nixos.wiki/wiki/Flakes) enabled, is the recommended
|
||||
approach. It is very flexible and reproducible while sharing most
|
||||
dependencies (in contrast to docker where each image contains the same
|
||||
packages again and again).
|
||||
|
||||
Docspell runs `nix build` to build the addon and then executes the
|
||||
file produced to `$out/bin`.
|
||||
|
||||
|
||||
## docker
|
||||
|
||||
For docker it is recommended to provide pre-build images. Docspell can
|
||||
build images from provided `Dockerfile`, but for larger images it
|
||||
might be better to do this apriori.
|
||||
|
||||
Docspell will run the addon using `docker run …` passing it only the
|
||||
user-input file as argument. Thus the image must define an appropriate
|
||||
`ENTRYPOINT`.
|
||||
|
||||
# Examples
|
||||
## Minimal Addon
|
||||
|
||||
The steps below create a minimal addon:
|
||||
|
||||
1. Create a bash script `addon.sh` with this content:
|
||||
|
||||
```bash
|
||||
#!/usr/bin/env bash
|
||||
|
||||
echo "Hello world!"
|
||||
```
|
||||
2. Make it executable:
|
||||
```bash
|
||||
chmod +x addon.sh
|
||||
```
|
||||
3. Create a yaml file `docspell-addon.yaml` with this content:
|
||||
|
||||
```yaml
|
||||
meta:
|
||||
name: "minimal-addon"
|
||||
version: "0.1.0"
|
||||
triggers:
|
||||
- existing-item
|
||||
- scheduled
|
||||
runner:
|
||||
trivial:
|
||||
enable: true
|
||||
exec: addon.sh
|
||||
```
|
||||
4. Create a zip file containing these two files:
|
||||
```bash
|
||||
zip addon.zip docspell-addon.yaml addon.sh
|
||||
```
|
||||
|
||||
The addon is now ready. Make it available via an url (use some file
|
||||
sharing tool, upload it somewhere etc) and then it can be installed
|
||||
and run.
|
||||
|
||||
## Non-Minimal Addon
|
||||
|
||||
The minimal example above is good to see what is required, but it is
|
||||
not very useful…. Please see this post about the [audio file
|
||||
addon](@/blog/2022-05-16_audio_file_addon.md) that walks through a
|
||||
more useful addon.
|
||||
|
||||
# Misc
|
||||
|
||||
## Advantages of "pure" addons
|
||||
|
||||
Although the output structure is not set in stone, it is recommended
|
||||
to use this in contrast to directly changing state via `dsc`.
|
||||
|
||||
- outputs of all addons are collected and only applied if all were
|
||||
successful; in contrast side effects are always applied even if the
|
||||
addon fails shortly after
|
||||
- since addons are executed as joex tasks, their result can be send as
|
||||
events to another http server for further processing.
|
||||
- addons can run in an isolated environment without network (no data
|
||||
can go out)
|
||||
|
||||
## Use addons in other addons?
|
||||
|
||||
This can be achieved very conveniently by using `nix`. If addons are
|
||||
defined as a nik flake, they can be easily consumed by each other.
|
Reference in New Issue
Block a user