mirror of
https://github.com/TheAnachronism/docspell.git
synced 2025-06-21 09:58:26 +00:00
Extend consumedir.sh to work with integration endpoint
Now running one consumedir script can upload files to multiple collectives separately.
@@ -8,21 +8,20 @@ permalink: doc/tools/consumedir

The `consumedir.sh` is a bash script that works in two modes:

- Go through all files in given directories (non-recursively) and send
  each to docspell.
- Go through all files in given directories (recursively, if `-r` is
  specified) and send each to docspell.
- Watch one or more directories for new files and upload them to
  docspell.

It can watch or go through one or more directories. Files can be
uploaded to multiple urls.

Run the script with the `-h` option to see a short help text. The
help text will also show the values for any given option.
Run the script with the `-h` or `--help` option to see a short help
text. The help text will also show the values for any given option.

The script requires `curl` for uploading. It requires the
`inotifywait` command if directories should be watched for new
files. If the `-m` option is used, the script will skip duplicate
files. For this the `sha256sum` command is required.
files.

Example for watching two directories:

@@ -30,18 +29,69 @@ Example for watching two directories:
./tools/consumedir.sh --path ~/Downloads --path ~/pdfs -m -dv http://localhost:7880/api/v1/open/upload/item/5DxhjkvWf9S-CkWqF3Kr892-WgoCspFWDo7-XBykwCyAUxQ
```

The script by default watches the given directories. If the `-o`
option is used, it will instead go through these directories and
upload all files in there.
The script by default watches the given directories. If the `-o` or
`--once` option is used, it will instead go through these directories
and upload all files in there.

Example for uploading all immediately (the same as above only with `-o`
added):

``` bash
./tools/consumedir.sh -o --path ~/Downloads --path ~/pdfs/ -m -dv http://localhost:7880/api/v1/open/upload/item/5DxhjkvWf9S-CkWqF3Kr892-WgoCspFWDo7-XBykwCyAUxQ
$ consumedir.sh -o --path ~/Downloads --path ~/pdfs/ -m -dv http://localhost:7880/api/v1/open/upload/item/5DxhjkvWf9S-CkWqF3Kr892-WgoCspFWDo7-XBykwCyAUxQ
```


The URL can be any docspell url that accepts uploads without
authentication. This is usually a [source
url](../uploading#anonymous-upload). It is also possible to use the
script with the [integration
endpoint](../uploading#integration-endpoint).


## Integration Endpoint

When given the `-i` or `--integration` option, the script changes its
behaviour slightly to work with the [integration
endpoint](../uploading#integration-endpoint).

First, if `-i` is given, it implies `-r`, so the directories are
watched or traversed recursively. The script then assumes that there
is a subfolder with the collective name. Files must not be placed
directly into a folder given by `-p`, but below a sub-directory that
matches a collective name. To determine which collective a file
belongs to, the script uses the first subfolder.

If the endpoint is protected, the credentials can be specified with
the arguments `--iuser` (a username and password) or `--iheader` (a
header name and value). The format for both is `<name>:<value>`, so
the username cannot contain a colon character (but the password can).

Example:
``` bash
$ consumedir.sh -i --iheader 'Docspell-Integration:test123' -m -p ~/Downloads/ http://localhost:7880/api/v1/open/integration/item
```

The url is the integration endpoint url without the collective, as
this is appended by the script.

This watches the folder `~/Downloads`. If a file is placed in this
folder directly, say `~/Downloads/test.pdf`, the upload will fail,
because the collective cannot be determined. Create a subfolder below
`~/Downloads` with the name of a collective, for example
`~/Downloads/family`, and place files somewhere below this `family`
subfolder, like `~/Downloads/family/test.pdf`.
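
The "first subfolder" rule can be illustrated with a small bash sketch. It is not taken from `consumedir.sh`; the function name `collective_of` and its arguments are hypothetical, and it only assumes that the collective is the first path segment below the watched directory.

```bash
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not the script's actual code):
# print the first sub-directory of $file below $root, which is taken
# to be the collective name. Returns 1 if the file lies directly in
# $root or is not below $root at all.
collective_of() {
  local root="${1%/}" file="$2"   # drop one trailing slash from the root
  case "$file" in
    "$root"/*) ;;                 # file is somewhere below root
    *) return 1 ;;
  esac
  local rel="${file#"$root"/}"    # path relative to root
  case "$rel" in
    */*) printf '%s\n' "${rel%%/*}" ;;  # first path segment
    *) return 1 ;;                      # no subfolder: collective unknown
  esac
}
```

Called as `collective_of ~/Downloads ~/Downloads/family/test.pdf`, this sketch prints `family`. For `~/Downloads/test.pdf` it returns a non-zero status, which corresponds to the failing upload described above.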


## Duplicates

With the `-m` option, the script will not upload files that already
exist at docspell. For this the `sha256sum` command is required.

So you can move and rename files in those folders without worrying
about duplicates. This allows you to keep your files organized using the
file-system and have them mirrored into docspell as well.
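
Renaming or moving a file does not trigger a new upload because the check depends on the file content, not on its name. A minimal bash sketch of computing such a checksum follows; the function name `file_checksum` is hypothetical and this is not necessarily how the script itself invokes `sha256sum`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (an assumption for illustration): print only the
# hex SHA-256 digest of a file. Reading from stdin keeps the file name
# out of the sha256sum output, so the result depends on content alone.
file_checksum() {
  sha256sum < "$1" | cut -d' ' -f1
}
```

Two files with identical content yield the same digest under this sketch, regardless of their names or locations.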


## Systemd

The script can be used with systemd to run as a service. This is an
@@ -112,8 +112,15 @@ the [configuration file](configure#rest-server).
If queried by a `GET` request, it returns whether it is enabled and
the collective exists.

See the [SMTP gateway](tools/smtpgateway) for an example to use this
endpoint.
It is also possible to check for existing files using their sha256
checksum with:

```
/api/v1/open/integration/checkfile/[collective-name]/[sha256-checksum]
```

See the [SMTP gateway](tools/smtpgateway) or the [consumedir
script](tools/consumedir) for examples to use this endpoint.
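
As a rough illustration of the checkfile path shown above, the following bash sketch assembles the URL from its parts. The function name `checkfile_url` and the base URL are assumptions made for this example; only the path layout comes from the text.

```bash
#!/usr/bin/env bash
# Hypothetical helper (an assumption for illustration): build the
# checkfile URL from a base URL, a collective name and a hex SHA-256
# checksum. A trailing slash on the base URL is dropped.
checkfile_url() {
  local base="${1%/}" collective="$2" checksum="$3"
  printf '%s/api/v1/open/integration/checkfile/%s/%s\n' \
    "$base" "$collective" "$checksum"
}
```

The printed URL could then be queried with a `GET` request, for example using `curl`, adding credentials if the endpoint is protected.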

## The Request

@@ -299,8 +299,8 @@ paths:
                $ref: "#/components/schemas/BasicResult"
  /open/integration/item/{id}:
    get:
      tags: [ Upload Integration ]
      summary: Upload files to docspell.
      tags: [ Integration Endpoint ]
      summary: Check if integration endpoint is available.
      description: |
        Allows checking whether an integration endpoint is enabled for
        a collective. The collective is given by the `id` parameter.
@@ -325,7 +325,7 @@ paths:
        401:
          description: Unauthorized
    post:
      tags: [ Upload Integration ]
      tags: [ Integration Endpoint ]
      summary: Upload files to docspell.
      description: |
        Upload a file to docspell for processing. The id is a
@@ -368,6 +368,30 @@ paths:
            application/json:
              schema:
                $ref: "#/components/schemas/BasicResult"
  /open/integration/checkfile/{id}/{checksum}:
    get:
      tags: [ Integration Endpoint ]
      summary: Check if a file is in docspell.
      description: |
        Checks if a file with the given SHA-256 checksum is in
        docspell. The `id` is the *collective name*. This route only
        exists if it is enabled in the configuration file.

        The result shows all items that contain a file with the given
        checksum.
      security:
        - authTokenHeader: []
      parameters:
        - $ref: "#/components/parameters/id"
        - $ref: "#/components/parameters/checksum"
      responses:
        200:
          description: Ok
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/CheckFileResult"

  /open/signup/register:
    post:
      tags: [ Registration ]
@@ -42,7 +42,7 @@ object CheckFileRoutes {
    }
  }

  private def convert(v: Vector[RItem]): CheckFileResult =
  def convert(v: Vector[RItem]): CheckFileResult =
    CheckFileResult(
      v.nonEmpty,
      v.map(r => BasicItem(r.id, r.name, r.direction, r.state, r.created, r.itemDate))
@@ -8,6 +8,7 @@ import docspell.common._
import docspell.restserver.Config
import docspell.restserver.conv.Conversions._
import docspell.restserver.http4s.Responses
import docspell.store.records.RItem
import org.http4s._
import org.http4s.circe.CirceEntityEncoder._
import org.http4s.dsl.Http4sDsl
@@ -24,12 +25,17 @@ object IntegrationEndpointRoutes {
    val dsl = new Http4sDsl[F] {}
    import dsl._

    def validate(req: Request[F], collective: Ident) =
      for {
        _ <- authRequest(req, cfg.integrationEndpoint)
        _ <- checkEnabled(cfg.integrationEndpoint)
        _ <- lookupCollective(collective, backend)
      } yield ()

    HttpRoutes.of {
      case req @ POST -> Root / "item" / Ident(collective) =>
        (for {
          _ <- authRequest(req, cfg.integrationEndpoint)
          _ <- checkEnabled(cfg.integrationEndpoint)
          _ <- lookupCollective(collective, backend)
          _ <- validate(req, collective)
          res <- EitherT.liftF[F, Response[F], Response[F]](
            uploadFile(collective, backend, cfg, dsl)(req)
          )
@@ -37,11 +43,20 @@ object IntegrationEndpointRoutes {

      case req @ GET -> Root / "item" / Ident(collective) =>
        (for {
          _ <- authRequest(req, cfg.integrationEndpoint)
          _ <- checkEnabled(cfg.integrationEndpoint)
          _ <- lookupCollective(collective, backend)
          _ <- validate(req, collective)
          res <- EitherT.liftF[F, Response[F], Response[F]](Ok(()))
        } yield res).fold(identity, identity)

      case req @ GET -> Root / "checkfile" / Ident(collective) / checksum =>
        (for {
          _ <- validate(req, collective)
          items <- EitherT.liftF[F, Response[F], Vector[RItem]](
            backend.itemSearch.findByFileCollective(checksum, collective)
          )
          resp <-
            EitherT.liftF[F, Response[F], Response[F]](Ok(CheckFileRoutes.convert(items)))
        } yield resp).fold(identity, identity)

    }
  }
