docspell/modules/microsite/docs/doc/tools/consumedir.md
Eike Kettner 8500d4d804 Extend consumedir.sh to work with integration endpoint
Now running one consumedir script can upload files to multiple
collectives separately.
2020-06-28 00:08:37 +02:00


layout title permalink
docs Consume Directory doc/tools/consumedir

{{ page.title }}

The consumedir.sh is a bash script that works in two modes:

  • Go through all files in the given directories (recursively, if -r is specified) and send each to docspell.
  • Watch one or more directories for new files and upload them to docspell.

It can watch or go through one or more directories. Files can be uploaded to multiple urls.
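
As an illustration of the first mode, the following minimal sketch lists the files a one-shot run would look at, descending into sub-directories only when asked to. The helper name `list_files` and its `yes`/`no` argument are assumptions made for this example; they are not taken from consumedir.sh.

```bash
# Sketch only: list_files is a hypothetical helper, not part of consumedir.sh.
# Prints the regular files in a directory. Sub-directories are only
# traversed when the second argument is "yes" (mirroring the -r option).
list_files() {
  dir="$1"
  recursive="${2:-no}"
  if [ "$recursive" = "yes" ]; then
    find "$dir" -type f
  else
    find "$dir" -maxdepth 1 -type f
  fi
}

# Usage: show what a recursive one-shot run would consider.
# list_files "$HOME/Downloads" yes
```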

Run the script with the -h or --help option to see a short help text. The help text also shows the values of any options given.

The script requires curl for uploading. It requires the inotifywait command if directories should be watched for new files.
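
Before running the script, it may help to confirm that these commands are installed. The sketch below is one way to do that; `missing_tools` is a hypothetical helper written for this example and not something the script provides.

```bash
# Sketch only: missing_tools is a hypothetical helper, not part of consumedir.sh.
# Prints each named command that cannot be found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Usage: curl is always needed, inotifywait only when watching directories.
# missing_tools curl inotifywait
```

An empty output means all listed commands were found.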

Example for watching two directories:

./tools/consumedir.sh --path ~/Downloads --path ~/pdfs -m -dv http://localhost:7880/api/v1/open/upload/item/5DxhjkvWf9S-CkWqF3Kr892-WgoCspFWDo7-XBykwCyAUxQ

The script by default watches the given directories. If the -o or --once option is used, it will instead go through these directories and upload all files in there.

Example for uploading all files immediately (the same as above, only with -o added):

$ consumedir.sh -o --path ~/Downloads --path ~/pdfs/ -m -dv http://localhost:7880/api/v1/open/upload/item/5DxhjkvWf9S-CkWqF3Kr892-WgoCspFWDo7-XBykwCyAUxQ

The URL can be any docspell url that accepts uploads without authentication. This is usually a source url. It is also possible to use the script with the integration endpoint.

Integration Endpoint

When given the -i or --integration option, the script changes its behaviour slightly to work with the integration endpoint.

First, if -i is given, it implies -r, so the directories are watched or traversed recursively. The script then assumes that there is a subfolder with the collective name. Files must not be placed directly into a folder given by -p, but below a sub-directory that matches a collective name. To determine which collective a file belongs to, the script uses the first subfolder.

If the endpoint is protected, the credentials can be specified with the --iuser and --iheader arguments. Both use the format <name>:<value>, so the username cannot contain a colon character (but the password can).

Example:

$ consumedir.sh -i --iheader 'Docspell-Integration:test123' -m -p ~/Downloads/ http://localhost:7880/api/v1/open/integration/item

The url is the integration endpoint url without the collective, because the script adds the collective itself.
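
The <name>:<value> format of the credential arguments explains why only the name is restricted: splitting at the first colon leaves any further colons in the value. The following sketch shows such a split; `split_pair` is a hypothetical helper for illustration, and the script's own parsing may differ.

```bash
# Sketch only: split_pair is a hypothetical helper, not part of consumedir.sh.
# Splits "<name>:<value>" at the first colon and prints the name and the
# value on separate lines. Colons after the first one stay in the value.
split_pair() {
  pair="$1"
  printf '%s\n%s\n' "${pair%%:*}" "${pair#*:}"
}

# Usage:
# split_pair 'Docspell-Integration:test123'
```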

This watches the folder ~/Downloads. If a file is placed in this folder directly, say ~/Downloads/test.pdf the upload will fail, because the collective cannot be determined. Create a subfolder below ~/Downloads with the name of a collective, for example ~/Downloads/family and place files somewhere below this family subfolder, like ~/Downloads/family/test.pdf.
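
The rule described above (the first subfolder below the watched directory names the collective) can be sketched as follows. `collective_of` is a hypothetical helper written for this example; it only illustrates the rule and is not the script's actual code.

```bash
# Sketch only: collective_of is a hypothetical helper, not part of consumedir.sh.
# Given a watched directory and a file path below it, prints the name of the
# first sub-directory. Returns 1 if the file sits directly in the watched
# directory or is not below it at all.
collective_of() {
  base="${1%/}"              # drop a trailing slash from the watched directory
  rel=${2#"$base"/}          # file path relative to the watched directory
  [ "$rel" != "$2" ] || return 1
  case "$rel" in
    */*) printf '%s\n' "${rel%%/*}" ;;
    *)   return 1 ;;
  esac
}

# Usage:
# collective_of "$HOME/Downloads/" "$HOME/Downloads/family/test.pdf"
```

With the paths from the example above, a file at ~/Downloads/family/test.pdf yields family, while ~/Downloads/test.pdf yields nothing and a non-zero exit status.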

Duplicates

With the -m option, the script will not upload files that already exist at docspell. For this the sha256sum command is required.

So you can move and rename files in those folders without worrying about duplicates. This allows you to keep your files organized using the file-system and have them mirrored into docspell as well.
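
Renaming or moving a file is safe because a checksum depends only on the file's content. The sketch below demonstrates this with sha256sum; `file_checksum` is a hypothetical helper, and how the script asks docspell whether a checksum is already known is not shown here.

```bash
# Sketch only: file_checksum is a hypothetical helper, not part of consumedir.sh.
# Prints the SHA-256 hex digest of a file's content. The file name is not
# part of the digest, so a renamed copy produces the same value.
file_checksum() {
  sha256sum "$1" | cut -d' ' -f1
}

# Usage: both commands print the same digest if the content is identical.
# file_checksum ~/pdfs/invoice.pdf
# file_checksum ~/pdfs/2020/invoice-renamed.pdf
```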

Systemd

The script can be used with systemd to run as a service. This is an example unit file:

[Unit]
After=network.target
Description=Docspell Consumedir

[Service]
Environment="PATH=/set/a/path"

ExecStart=/bin/su -s /bin/bash someuser -c "consumedir.sh --path '/a/path/' -m 'http://localhost:7880/api/v1/open/upload/item/5DxhjkvWf9S-CkWqF3Kr892-WgoCspFWDo7-XBykwCyAUxQ'"

This unit file is just an example and needs some adjusting for your system. It assumes an existing user someuser that is used to run this service. The url http://localhost:7880/api/v1/open/upload/... is an anonymous upload url as described here.

Docker

The provided docker image runs this script to watch a directory for new files. If a new file is detected, it is pushed to docspell.

For this to work, the container must know about a valid upload url. Therefore, you must first sign up and create such an upload url, as described here. Get only the id (something like AvR6sA8GKFm-hgYDgZfwzXa-Tqnu8yqyz6X-KzuefvEvrRf) and define an environment variable SOURCE_ID with that value before running docker-compose up a second time.

export SOURCE_ID="AvR6sA8GKFm-hgYDgZfwzXa-Tqnu8yqyz6X-KzuefvEvrRf"
docker-compose up

Now you can create a folder ./docs and place all files in there that you want to import. Once a file is dropped into this folder, the consumedir container will push it to docspell.