Initial website
12
website/site/content/docs/webapp/_index.md
Normal file
@ -0,0 +1,12 @@
|
||||
+++
|
||||
title = "Web-UI"
|
||||
summary = true
|
||||
description = "This section describes the features of the web application."
|
||||
weight = 50
|
||||
insert_anchor_links = "right"
|
||||
template = "pages.html"
|
||||
sort_by = "weight"
|
||||
redirect_to = "docs/webapp/uploading"
|
||||
+++
|
||||
|
||||
No content here.
|
66
website/site/content/docs/webapp/curate.md
Normal file
@ -0,0 +1,66 @@
|
||||
+++
|
||||
title = "Curate Items"
|
||||
weight = 20
|
||||
+++
|
||||
|
||||
Curating an item's meta data helps you find it later. This page
|
||||
describes how you can quickly go through those items and correct or
|
||||
amend with existing data.
|
||||
|
||||
## Select New items
|
||||
|
||||
After files have been uploaded and the job executor created the
|
||||
corresponding items, they will show up on the main page. All items
|
||||
the job executor has created are initially marked as *New*. The option
|
||||
*only New* in the left search menu can be used to select only new
|
||||
items:
|
||||
|
||||
{{ figure(file="docspell-curate-1.jpg") }}
|
||||
|
||||
|
||||
## Check selected items
|
||||
|
||||
Then you can go through all new items and check their metadata: Click
|
||||
on the first item to open the detail view. This shows the documents
|
||||
and the meta data in the header.
|
||||
|
||||
{{ figure(file="docspell-curate-2.jpg") }}
|
||||
|
||||
|
||||
## Modify if necessary
|
||||
|
||||
To change something, click the *Edit* button in the menu above the
|
||||
document view. This will open a form next to your documents. You can
|
||||
compare the data with the documents and change as you like. Since the
|
||||
item status is *New*, you'll see the suggestions docspell found during
|
||||
processing. If there were multiple candidates, you can select another
|
||||
one by clicking its name in the suggestion list.
|
||||
|
||||
{{ figure(file="docspell-curate-3.jpg") }}
|
||||
|
||||
|
||||
When you change something in the form, it is immediately applied. Only
|
||||
when changing text fields, a click on the *Save* symbol next to the
|
||||
field is required.
|
||||
|
||||
|
||||
## Confirm
|
||||
|
||||
If everything looks good, click the *Confirm* button to confirm the
|
||||
current data. The *New* status goes away and also the suggestions are
|
||||
hidden in this state. You can always go back by clicking the
|
||||
*Unconfirm* button.
|
||||
|
||||
|
||||
{{ figure(file="docspell-curate-5.jpg") }}
|
||||
|
||||
|
||||
## Proceed with next item
|
||||
|
||||
To look at the next item in the search results, click the *Next*
|
||||
button in the menu (next to the *Edit* button). Clicking *Next* will
|
||||
keep the current view, so you can continue checking the data. If you
|
||||
are on the last item, the view switches to the listing view when
|
||||
clicking *Next*.
|
||||
|
||||
{{ figure(file="docspell-curate-6.jpg") }}
|
BIN
website/site/content/docs/webapp/docspell-curate-1.jpg
Normal file
After Width: | Height: | Size: 87 KiB |
BIN
website/site/content/docs/webapp/docspell-curate-2.jpg
Normal file
After Width: | Height: | Size: 89 KiB |
BIN
website/site/content/docs/webapp/docspell-curate-3.jpg
Normal file
After Width: | Height: | Size: 123 KiB |
BIN
website/site/content/docs/webapp/docspell-curate-5.jpg
Normal file
After Width: | Height: | Size: 124 KiB |
BIN
website/site/content/docs/webapp/docspell-curate-6.jpg
Normal file
After Width: | Height: | Size: 108 KiB |
234
website/site/content/docs/webapp/emailsettings.md
Normal file
@ -0,0 +1,234 @@
|
||||
+++
|
||||
title = "E-Mail Settings"
|
||||
weight = 40
|
||||
[extra]
|
||||
mktoc = true
|
||||
+++
|
||||
|
||||
Docspell has a good integration for E-Mail. You can send e-mails
|
||||
related to an item and you can import e-mails from your mailbox into
|
||||
docspell.
|
||||
|
||||
This requires defining settings for sending and receiving
e-mails. E-Mails are commonly sent via
|
||||
[SMTP](https://en.wikipedia.org/wiki/Simple_Mail_Transfer_Protocol)
|
||||
and for receiving
|
||||
[IMAP](https://en.wikipedia.org/wiki/Internet_Message_Access_Protocol)
|
||||
is quite common. Docspell has support for SMTP and IMAP. These
|
||||
settings are associated with a user, so that each user can specify their
|
||||
own settings separately from others in the collective.
|
||||
|
||||
*Note: Passwords to your e-mail accounts are stored in plain-text in
|
||||
docspell's database. This is necessary to have docspell connect to
|
||||
your e-mail account to send mails on behalf of you and receive your
|
||||
mails.*
|
||||
|
||||
|
||||
## SMTP Settings
|
||||
|
||||
For sending mail, you need to provide information to connect to an SMTP
|
||||
server. Every e-mail provider has this information somewhere
|
||||
available.
|
||||
|
||||
Configure this in *User Settings -> E-Mail Settings (SMTP)*:
|
||||
|
||||
{{ figure(file="mail-settings-1.png") }}
|
||||
|
||||
First, you need to provide some name that is used to recognize this
|
||||
account. This name is also used in URLs to docspell and so it must not
|
||||
contain whitespace or any special characters. A good value is the
|
||||
domain of your provider, for example `gmail.com`, or something like
|
||||
that.
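Because the name ends up in URLs, it can help to check a candidate name before saving it. The following is a minimal sketch: the function name `is_valid_connection_name` and the exact set of allowed characters (letters, digits, `.`, `-` and `_`) are assumptions made for illustration, not docspell's actual validation rule.

```bash
#!/usr/bin/env bash
# Hypothetical helper: accept only letters, digits, '.', '-' and '_'.
# The allowed character set is an assumption, not docspell's own rule.
is_valid_connection_name() {
  case "$1" in
    '') return 1 ;;                  # empty names are rejected
    *[!A-Za-z0-9._-]*) return 1 ;;   # any other character is rejected
    *) return 0 ;;
  esac
}

is_valid_connection_name "gmail.com" && echo "gmail.com: ok"
is_valid_connection_name "my mail" || echo "my mail: rejected"
```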
|
||||
|
||||
This information should be available from your e-mail provider. For
|
||||
example, for google-mail it is:
|
||||
|
||||
- SMTP Host: `smtp.gmail.com`
|
||||
- SMTP Port: `587` or `465`
|
||||
- SMTP User: Your Gmail address (for example, example@gmail.com)
|
||||
- SMTP Password: Your Gmail password
|
||||
- SSL: use `SSL` for port `465` and `StartTLS` for port `587`
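The relation between port and encryption mode in the list above can be written down as a small lookup. This is only a sketch for the two Gmail ports named here; the function name `ssl_mode_for_port` is hypothetical and other providers may use different ports.

```bash
#!/usr/bin/env bash
# Hypothetical helper mapping the two Gmail SMTP ports to the SSL setting.
ssl_mode_for_port() {
  case "$1" in
    465) echo "SSL" ;;       # connection is encrypted from the start
    587) echo "StartTLS" ;;  # plain connection upgraded via STARTTLS
    *)   echo "unknown port: $1" >&2; return 1 ;;
  esac
}

ssl_mode_for_port 587
```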
|
||||
|
||||
Then you need to define the e-mail address that is used for the `From`
|
||||
field. This is in most cases the same address as used for the SMTP
|
||||
User field.
|
||||
|
||||
The `Reply-To` field is optional and can be set to define a different
|
||||
e-mail address that your recipients should use to answer a mail.
|
||||
|
||||
Once this is set up, you can start sending mails within docspell. It is
|
||||
possible to set up these settings for multiple providers, so you can
|
||||
choose from which account you want to send mails.
|
||||
|
||||
|
||||
## IMAP Settings
|
||||
|
||||
For receiving e-mails, you need to provide information to connect to
|
||||
an IMAP server. Your e-mail provider should have this information
|
||||
somewhere available.
|
||||
|
||||
Configure this in *User Settings -> E-Mail Settings (IMAP)*:
|
||||
|
||||
{{ figure(file="mail-settings-2.png") }}
|
||||
|
||||
First you need to define a *Name* to recognize this connection inside
|
||||
docspell. This name is also used in URLs to docspell and so it must
|
||||
not contain whitespace or any special characters. A good value is the
|
||||
domain of your provider, for example `gmail.com`, or something like
|
||||
that.
|
||||
|
||||
You can provide IMAP connections to multiple mailboxes.
|
||||
|
||||
Here is an example for posteo.de:
|
||||
|
||||
- IMAP Server: `posteo.de`
|
||||
- IMAP Port: 143
|
||||
- IMAP User: Your posteo address
|
||||
- IMAP Password: Your posteo password
|
||||
- SSL: use `StartTLS`
|
||||
|
||||
|
||||
## SSL / TLS / StartTLS
|
||||
|
||||
*Please Note: If `SSL` is set to `None`, then mails will be sent
|
||||
unencrypted to your mail provider! If `Ignore certificate check` is
|
||||
enabled, connections to your mail provider will succeed even if the
|
||||
provider is wrongly configured for SSL/TLS. This flag should only be
|
||||
enabled if you know why.*
|
||||
|
||||
|
||||
## GMail
|
||||
|
||||
Authenticating with GMail may not be so simple. GMail implements an
authentication scheme called *XOAUTH2* (at least for IMAP). It will
|
||||
not work with your normal password. This is to avoid giving an
|
||||
application full access to your gmail account.
|
||||
|
||||
The e-mail integration in docspell relies on the
|
||||
[JavaMail](https://javaee.github.io/javamail) library which has
|
||||
support for XOAUTH2. It also has documentation on what you need to do
|
||||
on your gmail account: <https://javaee.github.io/javamail/OAuth2>.
|
||||
|
||||
First you need to go to the [Google Developers
|
||||
Console](https://console.developers.google.com) and create an "App" to
|
||||
get a Client-Id and a Client-Secret. This "App" will be your instance
|
||||
of docspell. You tell google that this app may send and read your
|
||||
mails and then you get an *access token* that should be used instead
|
||||
of the password.
|
||||
|
||||
Once you set up an App in Google Developers Console, you get the
|
||||
Client-Id and the Client-Secret, which look something like this:
|
||||
|
||||
- Client-Id: 106701....d8c.apps.googleusercontent.com
|
||||
- Client-Secret: 5Z1...Kir_t
|
||||
|
||||
Google has a python tool to help with getting this access token.
|
||||
Download the `oauth2.py` script from
|
||||
[here](https://github.com/google/gmail-oauth2-tools) and first create
|
||||
an *oauth2-token*:
|
||||
|
||||
``` bash
|
||||
./oauth2.py --user=your.name@gmail.com \
|
||||
--client_id=106701....d8c.apps.googleusercontent.com \
|
||||
--client_secret=5Z1...Kir_t \
|
||||
--generate_oauth2_token
|
||||
```
|
||||
|
||||
This will "redirect you" to a URL where you have to authenticate with
|
||||
google. Afterwards it lets you add permissions to the app for
|
||||
accessing your mail account. The result is another code you need to
|
||||
give to the script to proceed:
|
||||
|
||||
```
|
||||
4/zwE....q0QBAb-99yD7lw
|
||||
```
|
||||
|
||||
Then the script produces this:
|
||||
|
||||
```
|
||||
Refresh Token: 1//09zH.........Lj6oc2SmFlZww
|
||||
Access Token: ya29.a0........SECDQ
|
||||
Access Token Expiration Seconds: 3599
|
||||
```
|
||||
|
||||
The access token can be used to sign in via IMAP with google. The
|
||||
Refresh Token doesn't expire and can be used to generate new access
|
||||
tokens:
|
||||
|
||||
```
|
||||
./oauth2.py --user=your.name@gmail.com \
|
||||
--client_id=106701....d8c.apps.googleusercontent.com \
|
||||
--client_secret=5Z1...Kir_t \
|
||||
--refresh_token=1//09zH.........Lj6oc2SmFlZww
|
||||
```
|
||||
|
||||
Output:
|
||||
```
|
||||
Access Token: ya29.a0....._q-lX3ypntk3ln0h9Yk
|
||||
Access Token Expiration Seconds: 3599
|
||||
```
|
||||
|
||||
The problem is that the access token expires. Docspell doesn't support
|
||||
updating the access token. It could be worked around by setting up a
|
||||
cron-job or similar which uses the `oauth2.py` tool to generate new
|
||||
access tokens and update your imap settings via a
|
||||
[REST](@/docs/api/_index.md) call.
|
||||
|
||||
``` bash
|
||||
#!/usr/bin/env bash
|
||||
set -e
|
||||
|
||||
## Change this to your values:
|
||||
|
||||
DOCSPELL_USER="[docspell-user]"
|
||||
DOCSPELL_PASSWORD="[docspell-password]"
|
||||
DOCSPELL_URL="http://localhost:7880"
|
||||
DOCSPELL_IMAP_NAME="gmail.com"
|
||||
|
||||
GMAIL_USER="your.name@gmail.com"
|
||||
CLIENT_ID="106701....d8c.apps.googleusercontent.com"
|
||||
CLIENT_SECRET="5Z1...Kir_t"
|
||||
REFRESH_TOKEN="1//09zH.........Lj6oc2SmFlZww"
|
||||
# Path to the oauth2.py tool
|
||||
OAUTH_TOOL="./oauth2.py"
|
||||
|
||||
##############################################################################
|
||||
## Script
|
||||
|
||||
|
||||
# Login to docspell and store the auth-token
|
||||
AUTH_DATA=$(curl --silent -XPOST \
|
||||
-H 'Content-Type: application/json' \
|
||||
--data-binary "{\"account\":\"$DOCSPELL_USER\",\"password\":\"$DOCSPELL_PASSWORD\"}" \
|
||||
$DOCSPELL_URL/api/v1/open/auth/login)
|
||||
# Stop here if the login failed; there is no token to continue with
if [ "$(echo "$AUTH_DATA" | jq .success)" == "false" ]; then
    echo "Auth failed"
    echo "$AUTH_DATA"
    exit 1
fi
|
||||
TOKEN="$(echo $AUTH_DATA | jq -r .token)"
|
||||
|
||||
|
||||
# Get the imap settings
|
||||
UPDATE_URL="$DOCSPELL_URL/api/v1/sec/email/settings/imap/$DOCSPELL_IMAP_NAME"
|
||||
IMAP_DATA=$(curl -s -H "X-Docspell-Auth: $TOKEN" "$UPDATE_URL")
|
||||
|
||||
echo "Current Settings:"
|
||||
echo $IMAP_DATA | jq
|
||||
|
||||
|
||||
# Get the new access token
|
||||
ACCESS_TOKEN=$($OAUTH_TOOL --user=$GMAIL_USER \
|
||||
--client_id="$CLIENT_ID" \
|
||||
--client_secret="$CLIENT_SECRET" \
|
||||
--refresh_token="$REFRESH_TOKEN" | head -n1 | cut -d':' -f2 | xargs)
|
||||
|
||||
# Update settings
|
||||
echo "Updating IMAP settings"
|
||||
NEW_IMAP=$(echo $IMAP_DATA | jq ".imapPassword |= \"$ACCESS_TOKEN\"")
|
||||
curl -s -XPUT -H "X-Docspell-Auth: $TOKEN" \
|
||||
-H 'Content-Type: application/json' \
|
||||
--data-binary "$NEW_IMAP" "$UPDATE_URL"
|
||||
echo
|
||||
echo "New Settings:"
|
||||
curl -s -H "X-Docspell-Auth: $TOKEN" "$UPDATE_URL" | jq
|
||||
```
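The script above takes the first output line of `oauth2.py` and cuts it at the colon. A slightly more defensive variant, sketched here, selects the line by its label instead of its position. The function name `extract_access_token` is made up for this example, and the sample input only mirrors the output format shown earlier on this page.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the value of the "Access Token:" line read from stdin.
extract_access_token() {
  awk -F': ' '/^Access Token: / { print $2; exit }'
}

# Sample input in the format shown earlier (token shortened).
printf '%s\n' \
  'Access Token: ya29.a0....._q-lX3ypntk3ln0h9Yk' \
  'Access Token Expiration Seconds: 3599' | extract_access_token
```

In the script, this function could stand in for the `head -n1 | cut -d':' -f2 | xargs` part of the pipeline.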
|
178
website/site/content/docs/webapp/finding.md
Normal file
@ -0,0 +1,178 @@
|
||||
+++
|
||||
title = "Finding Items"
|
||||
weight = 30
|
||||
[extra]
|
||||
mktoc = true
|
||||
+++
|
||||
|
||||
Items can be searched by their annotated meta data and their contents
|
||||
using full text search. The landing page shows a list of current
|
||||
items. Items are displayed sorted by their date, newest first.
|
||||
|
||||
Docspell has two modes for searching: a simple search bar and a search
|
||||
menu with many options. Both are active at the same time, but only one
|
||||
is visible. You can switch between them without affecting the results.
|
||||
|
||||
|
||||
## Search Bar
|
||||
|
||||
{{ imgright(file="search-bar.png") }}
|
||||
|
||||
By default, the search bar is shown. It provides a refined view of the
|
||||
search menu. The dropdown contains different options to do a quick
|
||||
search.
|
||||
|
||||
### *All Names* and *Contents*
|
||||
|
||||
These two options correspond to the fields of the same name in the search
|
||||
menu. If you switch between search menu and search bar (by clicking
|
||||
the icon on the left), you'll see that they are the same fields.
|
||||
Typing in the search bar also fills the corresponding field in the
|
||||
search menu (and vice versa).
|
||||
|
||||
- The option *All Names* searches in the item name, item notes, names of
correspondent organization and person, and names of concerning person
|
||||
and equipment. It uses a simple substring search.
|
||||
- The option *Contents* searches the contents of all attachments
|
||||
(documents), attachment names, the item name and item notes. It uses
|
||||
full text search. However, it does not search the names of attached
|
||||
meta data.
|
||||
|
||||
When searching with one of these fields active, it simply submits the
|
||||
(hidden) search menu. So if the menu has other fields filled out, they
|
||||
will affect the result, too. Using one of these fields, the bar is
|
||||
just a reduced view of the search menu.
|
||||
|
||||
So you can choose tags or correspondents in the search menu and
|
||||
further restrict the results using full text search. The results will
|
||||
be returned sorted by the item date, newest first.
|
||||
|
||||
If the left button in the search bar shows a little blue bubble, it
|
||||
means that there are more search fields filled out in the search menu
|
||||
that you currently can't see. In this case the results are not only
|
||||
restricted by the search term given in the search-bar, but also by
|
||||
what is specified in the search menu.
|
||||
|
||||
|
||||
### *Contents Only*
|
||||
|
||||
This option has no corresponding part in the search menu. When searching
with this option active, only a full text search is done in the
attachment contents, attachment names, item name and item notes.
|
||||
|
||||
The results are not ordered by item date, but by relevance with
|
||||
respect to the search term. This ordering is returned from the full
|
||||
text search engine and is simply transferred unmodified.
|
||||
|
||||
|
||||
## Search Menu
|
||||
|
||||
{{ imgright(file="search-menu.png") }}
|
||||
|
||||
The search menu can be opened by clicking the left icon in the top
|
||||
bar. It shows some options to constrain the item list:
|
||||
|
||||
### Show new items
|
||||
|
||||
Clicking the checkbox "Only new" shows items that have not been
|
||||
"Confirmed". All items that have been created by docspell and not
|
||||
looked at are marked as "new" automatically.
|
||||
|
||||
### Names
|
||||
|
||||
Searches in names of certain properties. The `All Names` field is the
|
||||
same as the search in the search bar (see above).
|
||||
|
||||
The `Name` field only searches in the name property of an item.
|
||||
|
||||
### Folder
|
||||
|
||||
Set a folder to only show items in that folder. If no folder is set,
|
||||
all accessible items are shown. These are all items that either have
|
||||
no folder set, or a folder where the current user is a member.
|
||||
|
||||
### Tags
|
||||
|
||||
Specify a list of tags that the items must have. When adding tags to
|
||||
the "Include" list, an item must have all these tags in order to be
|
||||
included in the results.
|
||||
|
||||
When adding tags to the "Exclude" list, then an item is removed from
|
||||
the results if it has at least one of these tags.
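The include and exclude rules can be sketched as a small predicate. This is only an illustration of the semantics described above, not docspell's implementation; the function names are hypothetical and tags are assumed to be single words in space-separated lists.

```bash
#!/usr/bin/env bash
# has_tag TAG "space separated tags": succeeds if TAG is in the list.
has_tag() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# item_matches ITEM_TAGS INCLUDE EXCLUDE
# Succeeds if the item has ALL include tags and NONE of the exclude tags.
item_matches() {
  local item_tags="$1" include="$2" exclude="$3" t
  for t in $include; do
    has_tag "$t" "$item_tags" || return 1   # a required tag is missing
  done
  for t in $exclude; do
    if has_tag "$t" "$item_tags"; then
      return 1                              # one excluded tag is enough
    fi
  done
  return 0
}

item_matches "bill todo" "bill" "done" && echo "bill+todo: included"
item_matches "bill done" "bill" "done" || echo "bill+done: excluded"
```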
|
||||
|
||||
### Correspondent
|
||||
|
||||
Pick a correspondent to show only these items.
|
||||
|
||||
### Concerned
|
||||
|
||||
Pick a concerned entity to show only these items.
|
||||
|
||||
### Date
|
||||
|
||||
Specify a date range to show only items whose date property is within
|
||||
this range. If you want to see items of a specific day, choose the
|
||||
same day for both fields.
|
||||
|
||||
For items that don't have an explicit date property set, the created
|
||||
date is used.
|
||||
|
||||
### Due Date
|
||||
|
||||
Specify a date range to show only items whose due date property is
|
||||
within this range. Items without a due date are not shown.
|
||||
|
||||
|
||||
### Direction
|
||||
|
||||
Specify whether to show only incoming, only outgoing or all items.
|
||||
|
||||
|
||||
## Customize Substring Search
|
||||
|
||||
The substring search of the *All Names* and *Name* field can be
|
||||
customized in the following way: A wildcard `*` can be used at the
|
||||
start or end of a search term to do a substring match. A `*` means
|
||||
"everything". So a term `*company` matches all names ending in
|
||||
`company` and `*company*` matches all names containing the word
|
||||
`company`. The matching is case insensitive.
|
||||
|
||||
Docspell adds a `*` to the front and end of a term automatically,
|
||||
unless one of the following is true:
|
||||
|
||||
- The term already has a wildcard.
|
||||
- The term is enclosed in quotes `"`.
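The automatic wildcard rule can be illustrated with a short function. It is a sketch of the rule as stated here, with a hypothetical name `normalize_term`; in particular, returning a quoted term unchanged is an assumption, since this page does not say how docspell treats the quotes afterwards.

```bash
#!/usr/bin/env bash
# Hypothetical helper: add '*' to both ends of a term unless it already
# contains a wildcard or is enclosed in double quotes.
normalize_term() {
  local term="$1"
  case "$term" in
    *'*'*)   printf '%s\n' "$term" ;;    # already has a wildcard
    '"'*'"') printf '%s\n' "$term" ;;    # quoted: left unchanged (assumption)
    *)       printf '*%s*\n' "$term" ;;  # plain term: substring match
  esac
}

normalize_term 'company'
normalize_term '*company'
normalize_term '"company"'
```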
|
||||
|
||||
|
||||
## Full Text Search
|
||||
|
||||
|
||||
### The Query
|
||||
|
||||
The query string for full text search is very powerful. Docspell
|
||||
currently supports [Apache SOLR](https://lucene.apache.org/solr/) as
|
||||
full text search backend, so you may want to have a look at their
|
||||
[documentation on query
|
||||
syntax](https://lucene.apache.org/solr/guide/8_4/query-syntax-and-parsing.html#query-syntax-and-parsing)
|
||||
for an in-depth guide.
|
||||
|
||||
- Wildcards: `?` matches any single character, `*` matches zero or
|
||||
more characters
|
||||
- Fuzzy search: Appending a `~` to a term results in a fuzzy search
(search for this term and similarly spelled ones)
- Proximity Search: Search for terms that are near each other, again
|
||||
using `~` appended to a search phrase. Example: `"cheese cake"~5`.
|
||||
- Boosting: apply more weight to a term with `^`. Example: `cheese^4
|
||||
cake` – cheese is 4x more important.
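To make the operator syntax concrete, the following sketch builds query strings for the fuzzy, proximity and boosting cases listed above. The helper names are invented for this example; they only assemble text and do not talk to SOLR or docspell.

```bash
#!/usr/bin/env bash
# Hypothetical helpers that assemble query strings using the operators above.
fuzzy()     { printf '%s~\n' "$1"; }           # term~
proximity() { printf '"%s"~%s\n' "$1" "$2"; }  # "phrase"~distance
boost()     { printf '%s^%s\n' "$1" "$2"; }    # term^weight

fuzzy "cheese"
proximity "cheese cake" 5
echo "$(boost cheese 4) cake"
```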
|
||||
|
||||
Docspell will preprocess the search query to prepare a query for SOLR.
|
||||
It will by default search all indexed fields, which are: attachment
|
||||
contents, attachment names, item name and item notes.
|
||||
|
||||
|
||||
### The Results
|
||||
|
||||
When using full text search, each item in the result list is annotated
|
||||
with the highlighted occurrence of the match.
|
||||
|
||||
{{ figure(file="search-content-results.png") }}
|
BIN
website/site/content/docs/webapp/mail-item-1.jpg
Normal file
After Width: | Height: | Size: 162 KiB |
BIN
website/site/content/docs/webapp/mail-item-2.jpg
Normal file
After Width: | Height: | Size: 177 KiB |
BIN
website/site/content/docs/webapp/mail-item-3.jpg
Normal file
After Width: | Height: | Size: 130 KiB |
BIN
website/site/content/docs/webapp/mail-item-4.jpg
Normal file
After Width: | Height: | Size: 150 KiB |
BIN
website/site/content/docs/webapp/mail-settings-1.png
Normal file
After Width: | Height: | Size: 71 KiB |
BIN
website/site/content/docs/webapp/mail-settings-2.png
Normal file
After Width: | Height: | Size: 66 KiB |
70
website/site/content/docs/webapp/mailitem.md
Normal file
@ -0,0 +1,70 @@
|
||||
+++
|
||||
title = "Send items via E-Mail"
|
||||
weight = 50
|
||||
[extra]
|
||||
mktoc = true
|
||||
+++
|
||||
|
||||
You can send e-mails from within docspell attaching the files of an
|
||||
item. This is useful to collaborate or share certain documents with
|
||||
people outside docspell.
|
||||
|
||||
All sent mails are stored attached to the item.
|
||||
|
||||
|
||||
## E-Mail Settings (SMTP)
|
||||
|
||||
Sending mails requires SMTP settings. Please see the page
|
||||
about [e-mail settings](@/docs/webapp/emailsettings.md#smtp-settings).
|
||||
|
||||
|
||||
## Sending Mails
|
||||
|
||||
Currently, it is possible to send mails related to only one item. You
|
||||
can define the mail body and docspell will add the attachments of an
|
||||
item, or you may choose to send the mail without any attachments.
|
||||
|
||||
In the item detail view, click on the envelope icon to open the mail
|
||||
form:
|
||||
|
||||
{{ figure(file="mail-item-1.jpg") }}
|
||||
|
||||
Then write the mail. Multiple recipients may be specified. The input
|
||||
field shows completion proposals from all contacts in your address
|
||||
book (from organizations and persons). Choose an address by pressing
|
||||
*Enter* or by clicking a proposal from the list. The proposal list can
|
||||
be iterated by the *Up* and *Down* arrows. You can type in any
|
||||
address, of course, it doesn't need to match a proposal.
|
||||
|
||||
If you have multiple mail settings defined, you can choose in the top
|
||||
dropdown which account to use for sending.
|
||||
|
||||
The last checkbox lets you choose whether docspell should add all
|
||||
attachments of the item to the mail. If it is unchecked, no
|
||||
attachments will be added. It is currently not possible to pick
|
||||
specific attachments, it's all or nothing.
|
||||
|
||||
Clicking *Cancel* will delete the inputs and close the mail form, but
|
||||
clicking the envelope icon again will only close the form without
|
||||
clearing its contents.
|
||||
|
||||
The *Send* button is active once all input fields have been filled.
|
||||
Once you click *Send*, the docspell server will send the mail using
|
||||
your connection settings. If that succeeds the mail is saved to the
|
||||
database and you'll see a message in the form.
|
||||
|
||||
## Accessing Sent Mails
|
||||
|
||||
If there is an e-mail for an item, a tab shows up at the right side,
|
||||
next to the attachments.
|
||||
|
||||
{{ figure(file="mail-item-2.jpg") }}
|
||||
|
||||
This tab shows a list of all mails that have been sent related to this
|
||||
item.
|
||||
|
||||
{{ figure(file="mail-item-3.jpg") }}
|
||||
|
||||
Clicking on a mail opens it in detail.
|
||||
|
||||
{{ figure(file="mail-item-4.jpg") }}
|
114
website/site/content/docs/webapp/metadata.md
Normal file
@ -0,0 +1,114 @@
|
||||
+++
|
||||
title = "Meta Data"
|
||||
weight = 10
|
||||
[extra]
|
||||
mktoc = true
|
||||
+++
|
||||
|
||||
Docspell processes each uploaded file. Processing involves extracting
|
||||
archives, extracting text, analyzing the extracted text and converting
|
||||
the file into a pdf. Text is analyzed to find metadata that can be set
|
||||
automatically. Docspell compares the extracted text against a set of
|
||||
known meta data. The *Meta Data* page lets you manage this meta data:
|
||||
|
||||
- Tags
|
||||
- Organizations
|
||||
- Persons
|
||||
- Equipments
|
||||
- Folders
|
||||
|
||||
|
||||
### Tags
|
||||
|
||||
Items can be tagged with multiple custom tags (aka labels). This
|
||||
makes it possible to describe many different workflows people may have with their
|
||||
documents.
|
||||
|
||||
A tag can have a *category*. This is meant to group tags together. For
|
||||
example, you may want to have a tag category *doctype* that is
|
||||
comprised of tags like *bill*, *contract*, *receipt* and so on. Or for
|
||||
workflows, a tag category *state* may exist that includes tags like
|
||||
*Todo* or *Waiting*. Or you can tag items with user names to provide
|
||||
"assignment" semantics. Docspell doesn't propose any workflow, but it
|
||||
can help to implement some.
|
||||
|
||||
The tags are *not* taken into account when processing. Docspell will
|
||||
not automatically associate tags to your items. The tags are only
|
||||
meant to be used manually for now.
|
||||
|
||||
|
||||
### Organization and Person
|
||||
|
||||
The organization entity represents a non-personal (organization or
|
||||
company) correspondent of an item. Docspell will choose one or more
|
||||
organizations when processing documents and associate the "best" match
|
||||
with your item.
|
||||
|
||||
The person entity can appear in two roles: It may be a correspondent
|
||||
or the person an item is about. So a person is either a correspondent
|
||||
or a concerning person. Docspell can not know which person is which,
|
||||
therefore you need to tell this by checking the box "Use for
|
||||
concerning person suggestion only". If this is checked, docspell will
|
||||
use this person only to suggest a concerning person. Otherwise the
|
||||
person is used only for correspondent suggestions.
|
||||
|
||||
Document processing uses the following properties:
|
||||
|
||||
- name
|
||||
- websites
|
||||
- e-mails
|
||||
|
||||
The website and e-mails can be added as contact information. If these
|
||||
three are present, you should get good matches from docspell. All
|
||||
other fields of an organization and person are not used during
|
||||
document processing. They might be useful when using this as a real
|
||||
address book.
|
||||
|
||||
|
||||
### Equipment
|
||||
|
||||
The equipment entity is almost like a tag. In fact, it could be
|
||||
replaced by a tag with a specific known category. The difference is
|
||||
that docspell will try to find a match and associate it with your
|
||||
item. The equipment represents non-personal things that an item is
|
||||
about. Examples are: bills or insurances for *cars*, contracts for
|
||||
*houses* or *flats*.
|
||||
|
||||
Equipments don't have contact information, so the only property that
|
||||
is used to find matches during document processing is its name.
|
||||
|
||||
|
||||
### Folders
|
||||
|
||||
Folders provide a way to divide all documents into disjoint subsets.
|
||||
Unlike with tags, an item can have at most one folder or none. A
|
||||
folder has an owner – the user who created the folder. Additionally,
|
||||
it can have members: users of the collective that the owner can assign
|
||||
to a folder.
|
||||
|
||||
When searching for items, the results are restricted to items that
|
||||
have either no folder assigned or a folder where the current user is
|
||||
owner or member. It can be used to control visibility when searching.
|
||||
However: there are no hard access checks. For example, if the item id
|
||||
is known, any user of the collective can see it and modify its meta
|
||||
data.
|
||||
|
||||
One use case is that you can hide items from other users, like bills
|
||||
for birthday presents. In this case it is very unlikely that someone
|
||||
can guess the item-id.
|
||||
|
||||
While folders are *not* taken into account when processing documents,
|
||||
they can be specified with the upload request or a [source
|
||||
url](uploading#anonymous-upload) to have them automatically set when
|
||||
they arrive.
|
||||
|
||||
|
||||
## Document Language
|
||||
|
||||
An important setting is the language of your documents. This helps OCR
|
||||
and text analysis. You can select between English and German
|
||||
currently.
|
||||
|
||||
Go to the *Collective Settings* page and click *Document
|
||||
Language*. This will set the language for all your documents. It is
|
||||
not (yet) possible to specify it when uploading.
|
BIN
website/site/content/docs/webapp/notify-due-items.jpg
Normal file
After Width: | Height: | Size: 233 KiB |
74
website/site/content/docs/webapp/notifydueitems.md
Normal file
@ -0,0 +1,74 @@
|
||||
+++
|
||||
title = "Notify about due items"
|
||||
weight = 60
|
||||
[extra]
|
||||
mktoc = true
|
||||
+++
|
||||
|
||||
A user who provides valid e-mail (SMTP) settings can be notified by
|
||||
docspell about due items. You will then receive an e-mail containing a
|
||||
list of items, sorted by their due date.
|
||||
|
||||
You first need to define SMTP settings; please see [this
|
||||
page](@/docs/webapp/emailsettings.md#smtp-settings).
|
||||
|
||||
Notifying works simply by searching for due items periodically. It
|
||||
will be submitted to the job queue and is picked up by an available
|
||||
[job executor](joex) eventually. This can be set up in the user
|
||||
settings page.
|
||||
|
||||
{{ figure(file="notify-due-items.jpg") }}
|
||||
|
||||
First, the task can be enabled or disabled at any time.
|
||||
|
||||
Then two settings are required for sending an e-mail. You need to
|
||||
specify the connection to use and the recipients.
|
||||
|
||||
Next come some settings to customize the query for searching items.
|
||||
You can choose to only include items that have one or more tags (these
|
||||
are `and`-ed, so all tags must exist on the item). You can also
|
||||
provide tags that must *not* appear on an item (these tags are
|
||||
`or`-ed, so only one such tag is enough to exclude an item). A common
|
||||
use-case would be to manually tag an item with *Done* once there is
|
||||
nothing more to do. Then these items can be excluded from the search.
|
||||
The somewhat inverse use-case is to always tag items with a *Todo* tag
|
||||
and remove it once completed.
|
||||
|
||||
The *Remind Days* field specifies the number of days the due date may be
|
||||
in the future. Each time the task executes, it searches for items with
|
||||
a due date lower than `today + remindDays`.
|
||||
|
||||
If you don't restrict the search using tags, then all items with a due
|
||||
date lower than this value are selected. Since items are (usually) not
|
||||
deleted, this only makes sense, if you remove the due date once you
|
||||
are done with an item.
|
||||
|
||||
The last option is to check *cap overdue items*, which uses the value
|
||||
in *Remind Days* to further restrict the due date of an item: only
|
||||
those with a due date *greater than* `today - remindDays` are
|
||||
selected. In other words, only items with an overdue time of *at most*
|
||||
*Remind Days* are included.
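The two date conditions can be sketched as a predicate. This is an illustration of the rule described above, not docspell's code; the function name `is_selected` is hypothetical, and dates are assumed to be plain day numbers (for example days since the epoch) to keep the arithmetic simple.

```bash
#!/usr/bin/env bash
# is_selected DUE TODAY REMIND_DAYS CAP
# DUE and TODAY are day numbers; CAP is "true" or "false".
is_selected() {
  local due="$1" today="$2" remind="$3" cap="$4"
  # The due date must be lower than today + remindDays.
  [ "$due" -lt $((today + remind)) ] || return 1
  # With "cap overdue items": it must also be greater than today - remindDays.
  if [ "$cap" = "true" ]; then
    [ "$due" -gt $((today - remind)) ] || return 1
  fi
  return 0
}

is_selected 103 100 5 false && echo "due in 3 days: selected"
is_selected 90 100 5 true || echo "10 days overdue with cap: skipped"
```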
|
||||
|
||||
The *Schedule* field specifies the periodicity. The syntax is similar
to a date-time string, like `2019-09-15 12:32`, where each part is a
pattern that can also match multiple values. The UI tries to help a little by
displaying the next two date-times this task would execute. More
in-depth help is available
|
||||
[here](https://github.com/eikek/calev#what-are-calendar-events). For
|
||||
example, to execute the task every monday at noon, you would write:
|
||||
`Mon *-*-* 12:00`. A date-time part can match all values (`*`), a list
|
||||
of values (e.g. `1,5,12,19`) or a range (e.g. `1..9`). Long lists may
|
||||
be written in a shorter way using a repetition value. It is written
|
||||
like this: `1/7` which is the same as a list with `1` and all
|
||||
multiples of `7` added to it. In other words, it matches `1`, `1+7`,
|
||||
`1+7+7`, `1+7+7+7` and so on.
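To make the repetition value concrete, here is a small sketch that expands `start/step` into the list of values it stands for. The helper `expand_repetition` is hypothetical, and the upper bound (31, as for a day-of-month position) is an assumption; the linked calev documentation describes how each date-time part is actually bounded.

```python
from typing import List


def expand_repetition(start: int, step: int, upper: int) -> List[int]:
    """Expand `start/step` into start, start+step, ... up to `upper` (inclusive)."""
    return list(range(start, upper + 1, step))


# `1/7` in a day-of-month position, assuming 31 is the largest allowed value.
print(expand_repetition(1, 7, 31))  # [1, 8, 15, 22, 29]
```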
|
||||
|
||||
You can click on *Start Once* to run this task right now, without
|
||||
saving the form to the database ("right now" means it is picked up by
|
||||
a free job executor).
|
||||
|
||||
If you click *Submit* these settings are saved and the task runs
|
||||
periodically.
|
||||
|
||||
You can see the task executing at the [processing
|
||||
page](@/docs/webapp/processing.md).
|
BIN
website/site/content/docs/webapp/processing-queue.jpg
Normal file
After Width: | Height: | Size: 105 KiB |
39
website/site/content/docs/webapp/processing.md
Normal file
@ -0,0 +1,39 @@
|
||||
+++
|
||||
title = "Processing Queue"
|
||||
weight = 80
|
||||
[extra]
|
||||
mktoc = true
|
||||
+++
|
||||
|
||||
|
||||
The page *Processing Queue* shows the current state of document
|
||||
processing for your uploads.
|
||||
|
||||
At the top of the page a list of running jobs is shown. Below that,
|
||||
the left column shows jobs that wait to be picked up by the job
|
||||
executor. On the right are finished jobs. The number of finished jobs
|
||||
is cut to some maximum and is also restricted by a date range. The
|
||||
page refreshes itself automatically to show the progress.
|
||||
|
||||
Example screenshot:
|
||||
|
||||
{{ figure(file="processing-queue.jpg") }}
|
||||
|
||||
You can cancel running jobs or remove waiting ones from the queue. If
|
||||
you click on the small file symbol on finished jobs, you can inspect
|
||||
its log messages again. A running job displays the job executor id
|
||||
that executes the job.
|
||||
|
||||
The jobs listed here are all long-running tasks for your collective.
|
||||
Most of the time these are document processing tasks. But user
|
||||
defined tasks, like "import mailbox", are also visible here.
|
||||
|
||||
Since job executors are shared among all collectives, it may happen
|
||||
that a job waits some time until it is picked up by a job
|
||||
executor. You can always start more job executors to help out.
|
||||
|
||||
If a job fails, it is retried after some time. Only if it fails too
|
||||
often (this can be configured) is it finished with the *failed* state.
|
||||
|
||||
For the document-processing task, if processing finally fails or a job
|
||||
is cancelled, the item is still created, just without suggestions.
|
BIN
website/site/content/docs/webapp/scanmailbox-detail.png
Normal file
After Width: | Height: | Size: 228 KiB |
BIN
website/site/content/docs/webapp/scanmailbox-list.png
Normal file
After Width: | Height: | Size: 72 KiB |
122
website/site/content/docs/webapp/scanmailbox.md
Normal file
@ -0,0 +1,122 @@
|
||||
+++
|
||||
title = "Scan Mailboxes"
|
||||
weight = 70
|
||||
[extra]
|
||||
mktoc = true
|
||||
+++
|
||||
|
||||
Users who provide valid email (IMAP) settings can import mails from
|
||||
their mailbox into docspell periodically.
|
||||
|
||||
You first need to define IMAP settings; please see [this
|
||||
page](@/docs/webapp/emailsettings.md#imap-settings).
|
||||
|
||||
Go to *User Settings -> Scan Mailbox Task*. You can define periodic
|
||||
tasks that connect to your mailbox and import mails into docspell. It
|
||||
is possible to define multiple tasks, for example, if you have
|
||||
multiple e-mail accounts you want to import periodically.
|
||||
|
||||
{{ figure(file="scanmailbox-list.png") }}
|
||||
|
||||
|
||||
## Details
|
||||
|
||||
Creating a task requires the following information:
|
||||
|
||||
{{ figure(file="scanmailbox-detail.png") }}
|
||||
|
||||
You can enable or disable this task. A disabled task will not run
|
||||
periodically. You can still choose to run it manually if you click the
|
||||
`Start Once` button.
|
||||
|
||||
Then you need to specify which [IMAP
|
||||
connection](@/docs/webapp/emailsettings.md#imap-settings) to use.
|
||||
|
||||
A list of folders is required. Docspell will only look into these
|
||||
folders. You can specify multiple folders. The "Inbox" folder is a
|
||||
special folder, which will usually appear translated in your web-mail
|
||||
client. You can specify "INBOX" case-insensitively; it will then read
|
||||
mails in your inbox. Any other folder is usually case-sensitive
|
||||
(depends on the imap server, but usually they are case sensitive
|
||||
except the INBOX folder). Type in a folder name and click the add
|
||||
button on the right.
|
||||
|
||||
Then the field *Received Since Hours* defines how many hours to go
|
||||
back and look for mails. Usually there are many mails in your inbox
|
||||
and importing them all at once is not feasible or desirable. It can
|
||||
work together with the *Schedule* field below. For example, you could
|
||||
run this task every 6 hours and read mails from 8 hours back.
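The interplay of the two fields can be illustrated with a little arithmetic. This is only a sketch; both function names are hypothetical and the schedule is simplified to a fixed interval in hours.

```python
from datetime import datetime, timedelta


def received_since_cutoff(now: datetime, received_since_hours: int) -> datetime:
    """Oldest receive time that is still considered in this run."""
    return now - timedelta(hours=received_since_hours)


def overlap_hours(schedule_interval_hours: int, received_since_hours: int) -> int:
    """Hours that two consecutive runs both look at (0 if there is a gap or exact fit)."""
    return max(0, received_since_hours - schedule_interval_hours)


now = datetime(2020, 6, 15, 12, 0)
print(received_since_cutoff(now, 8))  # 2020-06-15 04:00:00
print(overlap_hours(6, 8))            # 2
```

Looking back a little further than the schedule interval gives a safety margin, so that a delayed run does not miss mails.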
|
||||
|
||||
The next two settings tell docspell what to do once a mail has been
|
||||
submitted to docspell. It can be moved into another folder in your
|
||||
mail account. This moves it out of the way for the next run. You can
|
||||
also choose to delete the mail, but *note that it will really be
|
||||
deleted and not moved to your trash folder*. If both options are off,
|
||||
nothing happens with that mail, it simply stays (and could be re-read
|
||||
on the next run).
|
||||
|
||||
When docspell creates an item from a mail, it needs to set a direction
|
||||
value (incoming or outgoing). If you know that all mails you want to
|
||||
import have a specific direction, then you can set it here. Otherwise,
|
||||
*automatic* means that docspell chooses a direction based on the
|
||||
`From` header of a mail. If the `From` header is an e-mail address
|
||||
that belongs to a “concerning” person in your address book, then it is
|
||||
set to "outgoing". Otherwise it is set to "incoming". To support this,
|
||||
you need to add your own e-mail address(es) to your address book.
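The *automatic* rule can be sketched like this. It is an illustration only, not docspell's implementation: `guess_direction` is a hypothetical name, the address book is reduced to a plain set of your own e-mail addresses, and comparing addresses case-insensitively is an assumption.

```python
from email.utils import parseaddr


def guess_direction(from_header: str, own_addresses: set) -> str:
    """Return "outgoing" if the sender is one of your own addresses, else "incoming"."""
    # parseaddr splits 'Jane Doe <jane@example.com>' into name and address.
    _name, address = parseaddr(from_header)
    known = {a.lower() for a in own_addresses}
    return "outgoing" if address.lower() in known else "incoming"


own = {"jane@example.com"}
print(guess_direction("Jane Doe <jane@example.com>", own))
print(guess_direction("Shop <orders@shop.example>", own))
```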
|
||||
|
||||
The *Item Folder* setting is used to put all items that are created
|
||||
from mails into the specified [folder](metadata#folders). If you
|
||||
define a folder here that you are not a member of, you won't find
|
||||
resulting items.
|
||||
|
||||
The last field is the *Schedule* which defines when and how often this
|
||||
task should run. The syntax is similar to a date-time string, like
|
||||
`2019-09-15 12:32`, where each part is a pattern to also match multiple
|
||||
values. The UI tries to help a little by displaying the next two
|
||||
date-times this task would execute. A more in depth help is available
|
||||
[here](https://github.com/eikek/calev#what-are-calendar-events). For
|
||||
example, to execute the task every Monday at noon, you would write:
|
||||
`Mon *-*-* 12:00`. A date-time part can match all values (`*`), a list
|
||||
of values (e.g. `1,5,12,19`) or a range (e.g. `1..9`). Long lists may
|
||||
be written in a shorter way using a repetition value. It is written
|
||||
like this: `1/7` which is the same as a list with `1` and all
|
||||
multiples of `7` added to it. In other words, it matches `1`, `1+7`,
|
||||
`1+7+7`, `1+7+7+7` and so on.
|
||||
|
||||
|
||||
## Reading Mails twice / Duplicates
|
||||
|
||||
Since users can move around mails in their mailboxes, it can happen
|
||||
that docspell unintentionally reads a mail multiple times. If docspell
|
||||
reads a mail, it will first check if an item already exists that
|
||||
originated from this mail. It only proceeds to import it, if it cannot
|
||||
find any. If you deleted an item in the meantime, docspell would
|
||||
import the mail again.
|
||||
|
||||
This check uses the
|
||||
[`Message-ID`](https://en.wikipedia.org/wiki/Message-ID) of an e-mail.
|
||||
This is usually there and should identify a complete mail. But it
|
||||
won't catch duplicate mails that are sent multiple times - they might
|
||||
have different `Message-ID`s. Also some mails have no such ids and are
|
||||
then imported by docspell without any checks.
|
||||
|
||||
In later versions, docspell may use the checksum of the generated eml
|
||||
file to look for duplicates, too.
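The `Message-ID` check described in this section can be sketched as follows. This is a simplified illustration, not docspell's code: `should_import` is a hypothetical name, and the set of known ids stands in for the lookup of existing items.

```python
from email import message_from_string


def should_import(raw_mail: str, known_message_ids: set) -> bool:
    """Decide whether a mail should be imported, based on its Message-ID."""
    msg = message_from_string(raw_mail)
    message_id = msg.get("Message-ID")  # header lookup is case-insensitive
    if message_id is None:
        # Mails without an id cannot be checked and are imported anyway.
        return True
    return message_id.strip() not in known_message_ids


mail = "Message-ID: <abc@example.com>\nSubject: Invoice\n\nSee attachment.\n"
print(should_import(mail, set()))                    # not seen before
print(should_import(mail, {"<abc@example.com>"}))    # an item already exists
```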
|
||||
|
||||
|
||||
## How it works
|
||||
|
||||
Docspell will go through all folders and download mails in “batches”.
|
||||
This size can be set by the admin in the [configuration
|
||||
file](@/docs/configure/_index.md#joex) and applies to all these tasks
|
||||
(same for all users). This batch only contains the mail headers and
|
||||
not the complete mail.
|
||||
|
||||
Then each mail is downloaded completely one by one and converted into
|
||||
an [eml](https://en.wikipedia.org/wiki/Email#Filename_extensions) file
|
||||
which is then submitted to docspell. Then the usual processing
|
||||
machinery starts, just like uploading an eml file via the webapp.
|
||||
|
||||
The number of folders and the number of mails to import can be limited
|
||||
by an admin via the config file. Note that this limit applies to one
|
||||
task run only; it is meant to limit the resources used by a single task.
|
BIN
website/site/content/docs/webapp/search-bar.png
Normal file
After Width: | Height: | Size: 6.0 KiB |
BIN
website/site/content/docs/webapp/search-content-results.png
Normal file
After Width: | Height: | Size: 44 KiB |
BIN
website/site/content/docs/webapp/search-menu.png
Normal file
After Width: | Height: | Size: 43 KiB |
BIN
website/site/content/docs/webapp/sources-form.png
Normal file
After Width: | Height: | Size: 146 KiB |
177
website/site/content/docs/webapp/uploading.md
Normal file
@ -0,0 +1,177 @@
|
||||
+++
|
||||
title = "Uploads"
|
||||
weight = 0
|
||||
+++
|
||||
|
||||
This page describes how files can get into docspell. Technically,
|
||||
there is just one way: via http multipart/form-data requests.
|
||||
|
||||
|
||||
## Authenticated Upload
|
||||
|
||||
From within the web application there is the "Upload Files"
|
||||
page. There you can select multiple files to upload. You can also
|
||||
specify whether these files should become one item or if every file is
|
||||
a separate item.
|
||||
|
||||
When you click "Submit" the files are uploaded and stored in the
|
||||
database. Then the job executor(s) are notified which immediately
|
||||
start processing them.
|
||||
|
||||
Go to the top-right menu and click "Processing Queue" to see the
|
||||
current state.
|
||||
|
||||
This obviously requires an authenticated user. While this is handy for
|
||||
ad-hoc uploads, it is very inconvenient for automating it by custom
|
||||
scripts. For this the next variant exists.
|
||||
|
||||
## Anonymous Upload
|
||||
|
||||
It is also possible to upload files without authentication. This
|
||||
should make tools that interact with docspell much easier to write.
|
||||
|
||||
### Creating Anonymous Uploads
|
||||
|
||||
Go to "Collective Settings" and then to the "Source" tab. A *Source*
|
||||
identifies an endpoint where files can be uploaded
|
||||
anonymously. Creating a new source creates a long unique id which is
|
||||
part of a URL that can be used to upload files. You can choose any
|
||||
time to deactivate or delete the source at which point uploading is
|
||||
not possible anymore. The idea is to give this URL away safely. You
|
||||
can delete it any time and no passwords or secrets are visible, even
|
||||
your username is not visible.
|
||||
|
||||
Example screenshot:
|
||||
|
||||
{{ figure(file="sources-form.png") }}
|
||||
|
||||
This example shows a source with name "test". Besides a description
|
||||
and a name that is only used for displaying purposes, a priority and a
|
||||
[folder](@/docs/webapp/metadata.md#folders) can be specified.
|
||||
|
||||
The priority is used for the processing jobs that are submitted when
|
||||
files are uploaded via this endpoint.
|
||||
|
||||
The folder is used to place all items, that result from uploads to
|
||||
this endpoint, into this folder.
|
||||
|
||||
The source endpoint defines two urls:
|
||||
|
||||
- `/app/upload/<id>`
|
||||
- `/api/v1/open/upload/item/<id>`
|
||||
|
||||
The first points to a web page where everyone could upload files into
|
||||
your account. You could give this url to people for sending files
|
||||
directly into your docspell.
|
||||
|
||||
The second url is the API url, which accepts the requests to upload
|
||||
files (which is used by the first url).
|
||||
|
||||
For example, this url can be used to upload files with curl:
|
||||
|
||||
``` bash
|
||||
$ curl -XPOST -F file=@test.pdf http://localhost:7880/api/v1/open/upload/item/CqpFTb7UmGe-9nMVPZSmnwc-AHH6nWFh52t-M1JFQ9y7cdH
|
||||
{"success":true,"message":"Files submitted."}
|
||||
```
|
||||
|
||||
You could add more `-F file=@/path/to/your/file.pdf` to upload
|
||||
multiple files (note, the `@` is required by curl, so it knows that
|
||||
the following is a file).
|
||||
|
||||
When files are uploaded to a source endpoint, the items resulting
|
||||
from these uploads are marked with the name of the source. So you know
|
||||
which source an item originated from.
|
||||
|
||||
If files are uploaded using the web application's *Upload files* page,
|
||||
the source is implicitly set to `webapp`. If you also want to let
|
||||
docspell count the files uploaded through the web interface, just
|
||||
create a source (can be inactive) with that name (`webapp`).
|
||||
|
||||
|
||||
## Integration Endpoint
|
||||
|
||||
Another option for uploading files is the special *integration
|
||||
endpoint*. This endpoint allows an admin to upload files to any
|
||||
collective that is known by name.
|
||||
|
||||
```
|
||||
/api/v1/open/integration/item/[collective-name]
|
||||
```
|
||||
|
||||
The endpoint is behind `/api/v1/open`, so this route is not protected
|
||||
by an authentication token (see [REST Api](@/docs/api/_index.md) for
|
||||
more information). However, it can be protected via settings in the
|
||||
configuration file. The idea is that this endpoint is controlled by an
|
||||
administrator and not the user of the application. The admin can
|
||||
enable this endpoint and choose between some methods to protect it.
|
||||
Then the administrator can upload files to any collective. This might
|
||||
be useful to connect other trusted applications to docspell (that run
|
||||
on the same host or network).
|
||||
|
||||
The endpoint is disabled by default; an admin must change the
|
||||
`docspell.server.integration-endpoint.enabled` flag to `true` in the
|
||||
[configuration file](@/docs/configure/_index.md#rest-server).
|
||||
|
||||
If queried by a `GET` request, it returns whether it is enabled and
|
||||
the collective exists.
|
||||
|
||||
It is also possible to check for existing files using their sha256
|
||||
checksum with:
|
||||
|
||||
```
|
||||
/api/v1/open/integration/checkfile/[collective-name]/[sha256-checksum]
|
||||
```
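As a sketch, the path for such a check could be built like this. The helper `checkfile_path` is hypothetical, and the assumption here is that the checksum is passed as a lowercase hex string; the collective name `demo` is only an example.

```python
import hashlib


def checkfile_path(collective: str, data: bytes) -> str:
    """Build the checkfile path for the given file content."""
    # Assumption: the checksum is the hex-encoded sha256 of the file bytes.
    checksum = hashlib.sha256(data).hexdigest()
    return f"/api/v1/open/integration/checkfile/{collective}/{checksum}"


print(checkfile_path("demo", b"file content goes here"))
```

For real files you would read the bytes from disk (for example with `open(path, "rb").read()`) before hashing.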
|
||||
|
||||
See the [SMTP gateway](@/docs/tools/smtpgateway.md) or the [consumedir
|
||||
script](@/docs/tools/consumedir.md) for examples to use this endpoint.
|
||||
|
||||
## The Request
|
||||
|
||||
This gives more details about the request for uploads. It is a http
|
||||
`multipart/form-data` request, with two possible fields:
|
||||
|
||||
- meta
|
||||
- file
|
||||
|
||||
The `file` field can appear multiple times and is required at least
|
||||
once. It is the part containing the file to upload.
|
||||
|
||||
The `meta` part is completely optional and can define additional meta
|
||||
data, that docspell uses to create items from the given files. It
|
||||
allows you to transfer structured information together with the
|
||||
unstructured binary files.
|
||||
|
||||
The `meta` content must be `application/json` containing this
|
||||
structure:
|
||||
|
||||
``` elm
|
||||
{ multiple: Bool
|
||||
, direction: Maybe String
|
||||
, folder: Maybe String
|
||||
}
|
||||
```
|
||||
|
||||
The `multiple` property is by default `true`. It means that each file
|
||||
in the upload request corresponds to a single item. An upload with 5
|
||||
files will result in 5 items created. If it is `false`, then docspell
|
||||
will create just one item, that will then contain all files.
|
||||
|
||||
Furthermore, the direction of the document (one of `incoming` or
|
||||
`outgoing`) can be given. It is optional; it can be left out or
|
||||
`null`.
|
||||
|
||||
A `folder` id can be specified. Each item created by this request will
|
||||
be placed into this folder. Errors are logged (for example, the folder
|
||||
may have been deleted before the task is executed) and the item is
|
||||
then not put into any folder.
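The `meta` part can be produced with any JSON library. The following sketch builds it from the three properties described above; `build_meta` is a hypothetical helper, and rejecting unknown direction values on the client side is an assumption made for illustration.

```python
import json
from typing import Optional


def build_meta(multiple: bool = True,
               direction: Optional[str] = None,
               folder: Optional[str] = None) -> str:
    """Serialize the upload meta data to a JSON string."""
    if direction not in (None, "incoming", "outgoing"):
        raise ValueError("direction must be 'incoming', 'outgoing' or None")
    # None is serialized as null, which means "not specified".
    return json.dumps({"multiple": multiple, "direction": direction, "folder": folder})


# One item containing all uploaded files, marked as outgoing.
print(build_meta(multiple=False, direction="outgoing"))
```

The resulting string is what goes into the `meta` form field of the multipart request, next to one or more `file` fields.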
|
||||
|
||||
This kind of request is very common and most programming languages
|
||||
have support for this. For example, here is another curl command
|
||||
uploading two files with meta data:
|
||||
|
||||
``` bash
|
||||
curl -XPOST -F meta='{"multiple":false, "direction": "outgoing"}' \
|
||||
-F file=@letter-en-source.pdf \
|
||||
-F file=@letter-de-source.pdf \
|
||||
http://localhost:7880/api/v1/open/upload/item/CqpFTb7UmGe-9nMVPZSmnwc-AHH6nWFh52t-M1JFQ9y7cdH
|
||||
```
|