Initial website

2025-08-05 02:24:52 +00:00 · 2020-07-27 22:13:22 +02:00
parent dbd0f3ff97
commit f8c6f79b10
160 changed files with 8854 additions and 64 deletions
--- a/website/site/content/docs/webapp/_index.md
+++ b/website/site/content/docs/webapp/_index.md
@ -0,0 +1,12 @@
+++
+title = "Web-UI"
+summary = true
+description = "This section describes the features of the web application."
+weight = 50
+insert_anchor_links = "right"
+template = "pages.html"
+sort_by = "weight"
+redirect_to = "docs/webapp/uploading"
+++
+
+No content here.
--- a/website/site/content/docs/webapp/curate.md
+++ b/website/site/content/docs/webapp/curate.md
@ -0,0 +1,66 @@
+++
+title = "Curate Items"
+weight = 20
+++
+
+Curating the meta data of your items helps you find them later. This
+page describes how you can quickly go through new items and correct
+or amend them with existing data.
+
+## Select New items
+
+After files have been uploaded and the job executor has created the
+corresponding items, they show up on the main page. All items that
+the job executor has created are initially marked as *New*. The option
+*only New* in the left search menu can be used to select only new
+items:
+
+{{ figure(file="docspell-curate-1.jpg") }}
+
+
+## Check selected items
+
+Then you can go through all new items and check their metadata: Click
+on the first item to open the detail view. This shows the documents
+and the meta data in the header.
+
+{{ figure(file="docspell-curate-2.jpg") }}
+
+
+## Modify if necessary
+
+To change something, click the *Edit* button in the menu above the
+document view. This will open a form next to your documents. You can
+compare the data with the documents and change as you like. Since the
+item status is *New*, you'll see the suggestions docspell found during
+processing. If there were multiple candidates, you can select another
+one by clicking its name in the suggestion list.
+
+{{ figure(file="docspell-curate-3.jpg") }}
+
+
+When you change something in the form, it is applied immediately.
+Only text fields require a click on the *Save* symbol next to the
+field.
+
+
+## Confirm
+
+If everything looks good, click the *Confirm* button to confirm the
+current data. The *New* status goes away and also the suggestions are
+hidden in this state. You can always go back by clicking the
+*Unconfirm* button.
+
+
+{{ figure(file="docspell-curate-5.jpg") }}
+
+
+## Proceed with next item
+
+To look at the next item in the search results, click the *Next*
+button in the menu (next to the *Edit* button). Clicking *Next* keeps
+the current view, so you can continue checking the data. If you are
+on the last item, clicking *Next* switches to the listing view.
+
+{{ figure(file="docspell-curate-6.jpg") }}
--- a/website/site/content/docs/webapp/docspell-curate-1.jpg
+++ b/website/site/content/docs/webapp/docspell-curate-1.jpg
--- a/website/site/content/docs/webapp/docspell-curate-2.jpg
+++ b/website/site/content/docs/webapp/docspell-curate-2.jpg
--- a/website/site/content/docs/webapp/docspell-curate-3.jpg
+++ b/website/site/content/docs/webapp/docspell-curate-3.jpg
--- a/website/site/content/docs/webapp/docspell-curate-5.jpg
+++ b/website/site/content/docs/webapp/docspell-curate-5.jpg
--- a/website/site/content/docs/webapp/docspell-curate-6.jpg
+++ b/website/site/content/docs/webapp/docspell-curate-6.jpg
--- a/website/site/content/docs/webapp/emailsettings.md
+++ b/website/site/content/docs/webapp/emailsettings.md
@ -0,0 +1,234 @@
+++
+title = "E-Mail Settings"
+weight = 40
+[extra]
+mktoc = true
+++
+
+Docspell has a good integration for E-Mail. You can send e-mails
+related to an item and you can import e-mails from your mailbox into
+docspell.
+
+This requires defining the settings to use for sending and receiving
+e-mails. E-mails are commonly sent via
+[SMTP](https://en.wikipedia.org/wiki/Simple_Mail_Transfer_Protocol)
+and commonly received via
+[IMAP](https://en.wikipedia.org/wiki/Internet_Message_Access_Protocol).
+Docspell supports both SMTP and IMAP. These settings are associated
+with a user, so each user can specify their own settings separately
+from others in the collective.
+
+*Note: Passwords to your e-mail accounts are stored in plain-text in
+docspell's database. This is necessary to have docspell connect to
+your e-mail account to send mails on behalf of you and receive your
+mails.*
+
+
+## SMTP Settings
+
+For sending mail, you need to provide information to connect to an
+SMTP server. Every e-mail provider makes this information available
+somewhere.
+
+Configure this in *User Settings -> E-Mail Settings (SMTP)*:
+
+{{ figure(file="mail-settings-1.png") }}
+
+First, you need to provide some name that is used to recognize this
+account. This name is also used in URLs to docspell and so it must not
+contain whitespace or any special characters. A good value is the
+domain of your provider, for example `gmail.com`, or something like
+that.
+
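The naming rule above can be checked before a connection is saved. The following is a minimal sketch, not docspell's own validation: the function name `is_valid_connection_name` is hypothetical, and the accepted character set (letters, digits, dot, dash, underscore) is an assumption chosen to be URL-safe.

```shell
# Hypothetical helper, not part of docspell: accept only names made of
# letters, digits, dots, dashes and underscores (an assumed URL-safe set).
is_valid_connection_name() {
  case $1 in
    '' | *[!A-Za-z0-9._-]*) return 1 ;;  # empty, or contains another character
    *) return 0 ;;
  esac
}

is_valid_connection_name "gmail.com" && echo "gmail.com: ok"
is_valid_connection_name "my mail" || echo "my mail: rejected"
```
+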
+The connection information should be available from your e-mail
+provider. For example, for Gmail it is:
+
+- SMTP Host: `smtp.gmail.com`
+- SMTP Port: `587` or `465`
+- SMTP User: Your Gmail address (for example, example@gmail.com)
+- SMTP Password: Your Gmail password
+- SSL: use `SSL` for port `465` and `StartTLS` for port `587`
+
+Then you need to define the e-mail address that is used for the `From`
+field. This is in most cases the same address as used for the SMTP
+User field.
+
+The `Reply-To` field is optional and can be set to define a different
+e-mail address that your recipients should use to answer a mail.
+
+Once this is set up, you can start sending mails from within docspell.
+It is possible to define these settings for multiple providers, so you
+can choose from which account you want to send mails.
+
+
+## IMAP Settings
+
+For receiving e-mails, you need to provide information to connect to
+an IMAP server. Your e-mail provider should have this information
+somewhere available.
+
+Configure this in *User Settings -> E-Mail Settings (IMAP)*:
+
+{{ figure(file="mail-settings-2.png") }}
+
+First you need to define a *Name* to recognize this connection inside
+docspell. This name is also used in URLs to docspell and so it must
+not contain whitespace or any special characters. A good value is the
+domain of your provider, for example `gmail.com`, or something like
+that.
+
+You can provide IMAP connections to multiple mailboxes.
+
+Here is an example for posteo.de:
+
+- IMAP Server: `posteo.de`
+- IMAP Port: 143
+- IMAP User: Your posteo address
+- IMAP Password: Your posteo password
+- SSL: use `StartTLS`
+
+
+## SSL / TLS / StartTLS
+
+*Please Note: If `SSL` is set to `None`, then mails will be sent
+unencrypted to your mail provider! If `Ignore certificate check` is
+enabled, connections to your mail provider will succeed even if the
+provider is wrongly configured for SSL/TLS. This flag should only be
+enabled if you know why.*
+
+
+## GMail
+
+Authenticating with GMail may not be so simple. GMail implements an
+authentication scheme called *XOAUTH2* (at least for IMAP). It will
+not work with your normal password. This is to avoid giving an
+application full access to your gmail account.
+
+The e-mail integration in docspell relies on the
+[JavaMail](https://javaee.github.io/javamail) library which has
+support for XOAUTH2. It also has documentation on what you need to do
+on your gmail account: <https://javaee.github.io/javamail/OAuth2>.
+
+First you need to go to the [Google Developers
+Console](https://console.developers.google.com) and create an "App" to
+get a Client-Id and a Client-Secret. This "App" will be your instance
+of docspell. You tell google that this app may send and read your
+mails and then you get an *access token* that should be used instead
+of the password.
+
+Once you have set up an App in the Google Developers Console, you get
+the Client-Id and the Client-Secret, which look something like this:
+
+- Client-Id: 106701....d8c.apps.googleusercontent.com
+- Client-Secret: 5Z1...Kir_t
+
+Google has a python tool to help with getting this access token.
+Download the `oauth2.py` script from
+[here](https://github.com/google/gmail-oauth2-tools) and first create
+an *oauth2-token*:
+
+``` bash
+./oauth2.py --user=your.name@gmail.com \
+   --client_id=106701....d8c.apps.googleusercontent.com \
+   --client_secret=5Z1...Kir_t \
+   --generate_oauth2_token
+```
+
+This will "redirect you" to a URL where you have to authenticate with
+google. Afterwards it lets you add permissions to the app for
+accessing your mail account. The result is another code you need to
+give to the script to proceed:
+
+```
+4/zwE....q0QBAb-99yD7lw
+```
+
+Then the script produces this:
+
+```
+Refresh Token: 1//09zH.........Lj6oc2SmFlZww
+Access Token: ya29.a0........SECDQ
+Access Token Expiration Seconds: 3599
+```
+
+The access token can be used to sign in via IMAP with google. The
+Refresh Token doesn't expire and can be used to generate new access
+tokens:
+
+```
+./oauth2.py --user=your.name@gmail.com \
+   --client_id=106701....d8c.apps.googleusercontent.com \
+   --client_secret=5Z1...Kir_t \
+   --refresh_token=1//09zH.........Lj6oc2SmFlZww
+```
+
+Output:
+```
+Access Token: ya29.a0....._q-lX3ypntk3ln0h9Yk
+Access Token Expiration Seconds: 3599
+```
+
+The problem is that the access token expires. Docspell doesn't support
+updating the access token. This could be worked around by setting up a
+cron-job or similar which uses the `oauth2.py` tool to generate new
+access tokens and updates your IMAP settings via a
+[REST](@/docs/api/_index.md) call.
+
+``` bash
+#!/usr/bin/env bash
+set -e
+
+## Change this to your values:
+
+DOCSPELL_USER="[docspell-user]"
+DOCSPELL_PASSWORD="[docspell-password]"
+DOCSPELL_URL="http://localhost:7880"
+DOCSPELL_IMAP_NAME="gmail.com"
+
+GMAIL_USER="your.name@gmail.com"
+CLIENT_ID="106701....d8c.apps.googleusercontent.com"
+CLIENT_SECRET="5Z1...Kir_t"
+REFRESH_TOKEN="1//09zH.........Lj6oc2SmFlZww"
+# Path to the oauth2.py tool
+OAUTH_TOOL="./oauth2.py"
+
+##############################################################################
+## Script
+
+
+# Login to docspell and store the auth-token
+AUTH_DATA=$(curl --silent -XPOST \
+                 -H 'Content-Type: application/json' \
+                 --data-binary "{\"account\":\"$DOCSPELL_USER\",\"password\":\"$DOCSPELL_PASSWORD\"}" \
+                 $DOCSPELL_URL/api/v1/open/auth/login)
+if [ "$(echo "$AUTH_DATA" | jq .success)" == "false" ]; then
+    echo "Auth failed"
+    echo "$AUTH_DATA"
+    # Stop here: without a valid token the following calls cannot succeed
+    exit 1
+fi
+TOKEN="$(echo $AUTH_DATA | jq -r .token)"
+
+
+# Get the imap settings
+UPDATE_URL="$DOCSPELL_URL/api/v1/sec/email/settings/imap/$DOCSPELL_IMAP_NAME"
+IMAP_DATA=$(curl -s -H "X-Docspell-Auth: $TOKEN" "$UPDATE_URL")
+
+echo "Current Settings:"
+echo $IMAP_DATA | jq
+
+
+# Get the new access token
+ACCESS_TOKEN=$($OAUTH_TOOL --user=$GMAIL_USER \
+    --client_id="$CLIENT_ID" \
+    --client_secret="$CLIENT_SECRET" \
+    --refresh_token="$REFRESH_TOKEN" | head -n1 | cut -d':' -f2 | xargs)
+
+# Update settings
+echo "Updating IMAP settings"
+NEW_IMAP=$(echo $IMAP_DATA | jq ".imapPassword |= \"$ACCESS_TOKEN\"")
+curl -s -XPUT -H "X-Docspell-Auth: $TOKEN" \
+     -H 'Content-Type: application/json' \
+     --data-binary "$NEW_IMAP" "$UPDATE_URL"
+echo
+echo "New Settings:"
+curl -s -H "X-Docspell-Auth: $TOKEN" "$UPDATE_URL" | jq
+```
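+
The token extraction used in the script above (`head -n1 | cut -d':' -f2 | xargs`) can be tried on its own without contacting Google. This is a minimal sketch: the sample text only mirrors the output format shown earlier, `example-token` is a made-up placeholder value, and the function name `extract_access_token` is hypothetical.

```shell
# Take the first output line, keep the part after the first colon and
# let xargs trim the surrounding whitespace.
extract_access_token() {
  head -n1 | cut -d':' -f2 | xargs
}

# Placeholder output in the format printed by oauth2.py (token is made up)
sample_output='Access Token: example-token
Access Token Expiration Seconds: 3599'

printf '%s\n' "$sample_output" | extract_access_token
```

Note that `cut -f2` keeps only the text between the first and second colon, so this approach assumes the token itself contains no colon. Since the access token shown above expires after 3599 seconds, the update script would have to run more often than once per hour, for example from a crontab entry such as `*/30 * * * * /path/to/update-imap-token.sh` (the path is a placeholder).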
--- a/website/site/content/docs/webapp/finding.md
+++ b/website/site/content/docs/webapp/finding.md
@ -0,0 +1,178 @@
+++
+title = "Finding Items"
+weight = 30
+[extra]
+mktoc = true
+++
+
+Items can be searched by their annotated meta data and their contents
+using full text search. The landing page shows a list of current
+items. Items are displayed sorted by their date, newest first.
+
+Docspell has two modes for searching: a simple search bar and a search
+menu with many options. Both are active at the same time, but only one
+is visible. You can switch between them without affecting the results.
+
+
+## Search Bar
+
+{{ imgright(file="search-bar.png") }}
+
+By default, the search bar is shown. It provides a refined view of the
+search menu. The dropdown contains different options to do a quick
+search.
+
+### *All Names* and *Contents*
+
+These two options correspond to the same named field in the search
+menu. If you switch between search menu and search bar (by clicking
+the icon on the left), you'll see that they are the same fields.
+Typing in the search bar also fills the corresponding field in the
+search menu (and vice versa).
+
+- The option *All Names* searches in the item name, item notes, names of
+  correspondent organization and person, and names of concerning person
+  and equipment. It uses a simple substring search.
+- The option *Contents* searches the contents of all attachments
+  (documents), attachment names, the item name and item notes. It uses
+  full text search. However, it does not search the names of attached
+  meta data.
+
+When searching with one of these fields active, it simply submits the
+(hidden) search menu. So if the menu has other fields filled out, they
+will affect the result, too. Using one of these fields, the bar is
+just a reduced view of the search menu.
+
+So you can choose tags or correspondents in the search menu and
+further restrict the results using full text search. The results will
+be returned sorted by the item date, newest first.
+
+If the left button in the search bar shows a little blue bubble, it
+means that there are more search fields filled out in the search menu
+that you currently can't see. In this case the results are not only
+restricted by the search term given in the search-bar, but also by
+what is specified in the search menu.
+
+
+### *Contents Only*
+
+This option has no corresponding part in the search menu. When
+searching with this option active, only a full text search is done in
+the attachment contents, attachment names, item name and item notes.
+
+The results are not ordered by item date, but by relevance with
+respect to the search term. This ordering is returned from the full
+text search engine and is passed through unmodified.
+
+
+## Search Menu
+
+{{ imgright(file="search-menu.png") }}
+
+The search menu can be opened by clicking the left icon in the top
+bar. It shows some options to constrain the item list:
+
+### Show new items
+
+Clicking the checkbox "Only new" shows items that have not been
+"Confirmed". All items that have been created by docspell and not
+looked at are marked as "new" automatically.
+
+### Names
+
+Searches in names of certain properties. The `All Names` field is the
+same as the search in the search bar (see above).
+
+The `Name` field only searches in the name property of an item.
+
+### Folder
+
+Set a folder to only show items in that folder. If no folder is set,
+all accessible items are shown. These are all items that either have
+no folder set, or a folder where the current user is a member.
+
+### Tags
+
+Specify a list of tags that the items must have. When adding tags to
+the "Include" list, an item must have all these tags in order to be
+included in the results.
+
+When adding tags to the "Exclude" list, then an item is removed from
+the results if it has at least one of these tags.
+
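The include and exclude rules can be written down as a small predicate. This is only an illustrative sketch of the semantics described above, not docspell's implementation: the function name `item_matches` is hypothetical, and tags are assumed to be plain words without spaces.

```shell
# item_matches ITEM_TAGS INCLUDE EXCLUDE
# Each argument is a space-separated list of tags (assumed format).
item_matches() {
  local item tag
  item=" $1 "
  # Include list: the item must carry every one of these tags.
  for tag in $2; do
    case $item in
      *" $tag "*) ;;
      *) return 1 ;;
    esac
  done
  # Exclude list: a single matching tag removes the item.
  for tag in $3; do
    case $item in
      *" $tag "*) return 1 ;;
    esac
  done
  return 0
}

item_matches "bill todo" "bill" "done" && echo "bill todo: shown"
item_matches "bill done" "bill" "done" || echo "bill done: hidden"
```
+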
+### Correspondent
+
+Pick a correspondent to show only these items.
+
+### Concerned
+
+Pick a concerned entity to show only these items.
+
+### Date
+
+Specify a date range to show only items whose date property is within
+this range. If you want to see items of a specific day, choose the
+same day for both fields.
+
+For items that don't have an explicit date property set, the created
+date is used.
+
+### Due Date
+
+Specify a date range to show only items whose due date property is
+within this range. Items without a due date are not shown.
+
+
+### Direction
+
+Specify whether to show only incoming, only outgoing or all items.
+
+
+## Customize Substring Search
+
+The substring search of the *All Names* and *Name* field can be
+customized in the following way: A wildcard `*` can be used at the
+start or end of a search term to do a substring match. A `*` means
+"everything". So a term `*company` matches all names ending in
+`company` and `*company*` matches all names containing the word
+`company`. The matching is case insensitive.
+
+Docspell adds a `*` to the front and end of a term automatically,
+unless one of the following is true:
+
+- The term already has a wildcard.
+- The term is enclosed in quotes `"`.
+
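The automatic wildcard rule can be sketched as follows. This is a hypothetical illustration of the two exceptions listed above, not docspell's code: the function name `normalize_term` is made up, and quoted terms are returned unchanged because the text does not say what happens to the quotes.

```shell
# Add a leading and trailing "*" unless the term already contains a
# wildcard or is enclosed in double quotes.
normalize_term() {
  case $1 in
    *'*'*)   printf '%s\n' "$1" ;;    # already has a wildcard
    '"'*'"') printf '%s\n' "$1" ;;    # enclosed in quotes
    *)       printf '*%s*\n' "$1" ;;  # default: substring match
  esac
}

normalize_term 'company'      # prints *company*
normalize_term '*company'     # prints *company
normalize_term '"company"'    # prints "company"
```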
+
+## Full Text Search
+
+
+### The Query
+
+The query string for full text search is very powerful. Docspell
+currently supports [Apache SOLR](https://lucene.apache.org/solr/) as
+full text search backend, so you may want to have a look at their
+[documentation on query
+syntax](https://lucene.apache.org/solr/guide/8_4/query-syntax-and-parsing.html#query-syntax-and-parsing)
+for an in-depth guide.
+
+- Wildcards: `?` matches any single character, `*` matches zero or
+  more characters
+- Fuzzy search: Appending a `~` to a term results in a fuzzy search
+  (search this term and similarly spelled ones)
+- Proximity Search: Search for terms that are near each other, again
+  using `~` appended to a search phrase. Example: `"cheese cake"~5`.
+- Boosting: apply more weight to a term with `^`. Example: `cheese^4
+  cake` – cheese is 4x more important.
+
+Docspell will preprocess the search query to prepare a query for SOLR.
+It will by default search all indexed fields, which are: attachment
+contents, attachment names, item name and item notes.
+
+
+### The Results
+
+When using full text search, each item in the result list is annotated
+with the highlighted occurrence of the match.
+
+{{ figure(file="search-content-results.png") }}
--- a/website/site/content/docs/webapp/mail-item-1.jpg
+++ b/website/site/content/docs/webapp/mail-item-1.jpg
--- a/website/site/content/docs/webapp/mail-item-2.jpg
+++ b/website/site/content/docs/webapp/mail-item-2.jpg
--- a/website/site/content/docs/webapp/mail-item-3.jpg
+++ b/website/site/content/docs/webapp/mail-item-3.jpg
--- a/website/site/content/docs/webapp/mail-item-4.jpg
+++ b/website/site/content/docs/webapp/mail-item-4.jpg
--- a/website/site/content/docs/webapp/mail-settings-1.png
+++ b/website/site/content/docs/webapp/mail-settings-1.png
--- a/website/site/content/docs/webapp/mail-settings-2.png
+++ b/website/site/content/docs/webapp/mail-settings-2.png
--- a/website/site/content/docs/webapp/mailitem.md
+++ b/website/site/content/docs/webapp/mailitem.md
@ -0,0 +1,70 @@
+++
+title = "Send items via E-Mail"
+weight = 50
+[extra]
+mktoc = true
+++
+
+You can send e-mails from within docspell attaching the files of an
+item. This is useful to collaborate or share certain documents with
+people outside docspell.
+
+All sent mails are stored attached to the item.
+
+
+## E-Mail Settings (SMTP)
+
+Sending mails requires SMTP settings. Please see the page
+about [e-mail settings](@/docs/webapp/emailsettings.md#smtp-settings).
+
+
+## Sending Mails
+
+Currently, it is possible to send mails related to only one item. You
+can define the mail body and docspell will add the attachments of an
+item, or you may choose to send the mail without any attachments.
+
+In the item detail view, click on the envelope icon to open the mail
+form:
+
+{{ figure(file="mail-item-1.jpg") }}
+
+Then write the mail. Multiple recipients may be specified. The input
+field shows completion proposals from all contacts in your address
+book (from organizations and persons). Choose an address by pressing
+*Enter* or by clicking a proposal from the list. The proposal list can
+be iterated by the *Up* and *Down* arrows. Of course, you can type in
+any address; it doesn't need to match a proposal.
+
+If you have multiple mail settings defined, you can choose in the top
+dropdown which account to use for sending.
+
+The last checkbox lets you choose whether docspell should add all
+attachments of the item to the mail. If it is unchecked, no
+attachments will be added. It is currently not possible to pick
+specific attachments; it's all or nothing.
+
+Clicking *Cancel* will clear the inputs and close the mail form, but
+clicking the envelope icon again will only close the form without
+clearing its contents.
+
+The *Send* button is active once all input fields have been filled.
+Once you click *Send*, the docspell server will send the mail using
+your connection settings. If that succeeds the mail is saved to the
+database and you'll see a message in the form.
+
+## Accessing Sent Mails
+
+If there is an e-mail for an item, a tab shows up at the right side,
+next to the attachments.
+
+{{ figure(file="mail-item-2.jpg") }}
+
+This tab shows a list of all mails that have been sent related to this
+item.
+
+{{ figure(file="mail-item-3.jpg") }}
+
+Clicking on a mail opens it in detail.
+
+{{ figure(file="mail-item-4.jpg") }}
--- a/website/site/content/docs/webapp/metadata.md
+++ b/website/site/content/docs/webapp/metadata.md
@ -0,0 +1,114 @@
+++
+title = "Meta Data"
+weight = 10
+[extra]
+mktoc = true
+++
+
+Docspell processes each uploaded file. Processing involves extracting
+archives, extracting text, analyzing the extracted text and converting
+the file into a pdf. Text is analyzed to find metadata that can be set
+automatically. Docspell compares the extracted text against a set of
+known meta data. The *Meta Data* page lets you manage this meta data:
+
+- Tags
+- Organizations
+- Persons
+- Equipments
+- Folders
+
+
+### Tags
+
+Items can be tagged with multiple custom tags (aka labels). This
+makes it possible to describe many different workflows people may
+have with their documents.
+
+A tag can have a *category*. This is meant to group tags together. For
+example, you may want to have a tag category *doctype* that is
+comprised of tags like *bill*, *contract*, *receipt* and so on. Or for
+workflows, a tag category *state* may exist that includes tags like
+*Todo* or *Waiting*. Or you can tag items with user names to provide
+"assignment" semantics. Docspell doesn't propose any workflow, but it
+can help to implement some.
+
+The tags are *not* taken into account when processing. Docspell will
+not automatically associate tags to your items. The tags are only
+meant to be used manually for now.
+
+
+### Organization and Person
+
+The organization entity represents a non-personal (organization or
+company) correspondent of an item. Docspell will choose one or more
+organizations when processing documents and associate the "best" match
+with your item.
+
+The person entity can appear in two roles: It may be a correspondent
+or the person an item is about. So a person is either a correspondent
+or a concerning person. Docspell cannot know which person is which,
+so you need to tell it by checking the box "Use for
+concerning person suggestion only". If this is checked, docspell will
+use this person only to suggest a concerning person. Otherwise the
+person is used only for correspondent suggestions.
+
+Document processing uses the following properties:
+
+- name
+- websites
+- e-mails
+
+The website and e-mails can be added as contact information. If these
+three are present, you should get good matches from docspell. All
+other fields of an organization and person are not used during
+document processing. They might be useful when using this as a real
+address book.
+
+
+### Equipment
+
+The equipment entity is almost like a tag. In fact, it could be
+replaced by a tag with a specific known category. The difference is
+that docspell will try to find a match and associate it with your
+item. The equipment represents non-personal things that an item is
+about. Examples are: bills or insurances for *cars*, contracts for
+*houses* or *flats*.
+
+Equipments don't have contact information, so the only property that
+is used to find matches during document processing is its name.
+
+
+### Folders
+
+Folders provide a way to divide all documents into disjoint subsets.
+Unlike with tags, an item can have at most one folder or none. A
+folder has an owner – the user who created the folder. Additionally,
+it can have members: users of the collective that the owner can assign
+to a folder.
+
+When searching for items, the results are restricted to items that
+have either no folder assigned or a folder where the current user is
+owner or member. It can be used to control visibility when searching.
+However: there are no hard access checks. For example, if the item id
+is known, any user of the collective can see it and modify its meta
+data.
+
+One use case is hiding items from other users, like bills
+for birthday presents. In this case it is very unlikely that someone
+can guess the item-id.
+
+While folders are *not* taken into account when processing documents,
+they can be specified with the upload request or a [source
+url](uploading#anonymous-upload) to have them automatically set when
+they arrive.
+
+
+## Document Language
+
+An important setting is the language of your documents. This helps OCR
+and text analysis. You can select between English and German
+currently.
+
+Go to the *Collective Settings* page and click *Document
+Language*. This will set the language for all your documents. It is
+not (yet) possible to specify it when uploading.
--- a/website/site/content/docs/webapp/notify-due-items.jpg
+++ b/website/site/content/docs/webapp/notify-due-items.jpg
--- a/website/site/content/docs/webapp/notifydueitems.md
+++ b/website/site/content/docs/webapp/notifydueitems.md
@ -0,0 +1,74 @@
+++
+title = "Notify about due items"
+weight = 60
+[extra]
+mktoc = true
+++
+
+A user who provides valid e-mail (SMTP) settings can be notified by
+docspell about due items. You will then receive an e-mail containing a
+list of items, sorted by their due date.
+
+You first need to define SMTP settings; please see [this
+page](@/docs/webapp/emailsettings.md#smtp-settings).
+
+Notifying works simply by searching for due items periodically. The
+search is submitted to the job queue and is eventually picked up by an
+available [job executor](joex). This can be set up in the user
+settings page.
+
+{{ figure(file="notify-due-items.jpg") }}
+
+First, the task can be enabled or disabled at any time.
+
+Then two settings are required for sending an e-mail. You need to
+specify the connection to use and the recipients.
+
+Next come some settings to customize the query for searching items.
+You can choose to only include items that have one or more tags (these
+are `and`-ed, so all tags must exist on the item). You can also
+provide tags that must *not* appear on an item (these tags are
+`or`-ed, so only one such tag is enough to exclude an item). A common
+use-case would be to manually tag an item with *Done* once there is
+nothing more to do. Then these items can be excluded from the search.
+The somewhat inverse use-case is to always tag items with a *Todo* tag
+and remove it once completed.
+
+The *Remind Days* field specifies the number of days the due date may be
+in the future. Each time the task executes, it searches for items with
+a due date lower than `today + remindDays`.
+
+If you don't restrict the search using tags, then all items with a due
+date lower than this value are selected. Since items are (usually) not
+deleted, this only makes sense if you remove the due date once you
+are done with an item.
+
+The last option is to check *cap overdue items*, which uses the value
+in *Remind Days* to further restrict the due date of an item: only
+those with a due date *greater than* `today - remindDays` are
+selected. In other words, only items with an overdue time of *at most*
+*Remind Days* are included.
+
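The two date conditions can be sketched as a small check. This is an illustration of the arithmetic described above, not docspell's query: the function names are hypothetical, dates are assumed to be given as `YYYY-MM-DD`, and the conversion relies on GNU `date` (`-d`), which is an assumption about the environment.

```shell
# Convert an ISO date to whole days since the epoch (GNU date assumed).
to_days() {
  echo $(( $(date -u -d "$1" +%s) / 86400 ))
}

# is_selected DUE TODAY REMIND_DAYS [CAP]
# Succeeds if the due date is lower than today + REMIND_DAYS. With CAP
# set to "true", the due date must also be greater than today - REMIND_DAYS.
is_selected() {
  local due today
  due=$(to_days "$1")
  today=$(to_days "$2")
  [ "$due" -lt $(( today + $3 )) ] || return 1
  if [ "${4:-false}" = true ]; then
    [ "$due" -gt $(( today - $3 )) ] || return 1
  fi
  return 0
}

is_selected 2020-07-30 2020-07-27 5 && echo "due in 3 days: selected"
is_selected 2020-07-01 2020-07-27 5 true || echo "26 days overdue: capped"
```
+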
+The *Schedule* field specifies the periodicity. The syntax is similar
+to a date-time string, like `2019-09-15 12:32`, where each part is a
+pattern that can also match multiple values. The ui tries to help a
+little by displaying the next two date-times this task would execute.
+More in-depth help is available
+[here](https://github.com/eikek/calev#what-are-calendar-events). For
+example, to execute the task every monday at noon, you would write:
+`Mon *-*-* 12:00`. A date-time part can match all values (`*`), a list
+of values (e.g. `1,5,12,19`) or a range (e.g. `1..9`). Long lists may
+be written in a shorter way using a repetition value. It is written
+like this: `1/7` which is the same as a list with `1` and all
+multiples of `7` added to it. In other words, it matches `1`, `1+7`,
+`1+7+7`, `1+7+7+7` and so on.
+
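The repetition value can be expanded by hand to see which values it matches. This is a minimal sketch of the rule as described above, not the parser docspell uses: the function name `expand_repetition` is hypothetical, and the upper bound (here `31`, as for a day of month) is an assumption supplied by the caller.

```shell
# expand_repetition START/STEP MAX
# Lists START, START+STEP, START+2*STEP, ... up to MAX.
expand_repetition() {
  local value=${1%/*} step=${1#*/} max=$2 out=""
  while [ "$value" -le "$max" ]; do
    out="$out${out:+ }$value"
    value=$(( value + step ))
  done
  printf '%s\n' "$out"
}

expand_repetition 1/7 31   # prints: 1 8 15 22 29
```
+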
+You can click on *Start Once* to run this task right now, without
+saving the form to the database ("right now" means it is picked up by
+a free job executor).
+
+If you click *Submit* these settings are saved and the task runs
+periodically.
+
+You can see the task executing at the [processing
+page](@/docs/webapp/processing.md).
--- a/website/site/content/docs/webapp/processing-queue.jpg
+++ b/website/site/content/docs/webapp/processing-queue.jpg
--- a/website/site/content/docs/webapp/processing.md
+++ b/website/site/content/docs/webapp/processing.md
@ -0,0 +1,39 @@
+++
+title = "Processing Queue"
+weight = 80
+[extra]
+mktoc = true
+++
+
+
+The page *Processing Queue* shows the current state of document
+processing for your uploads.
+
+At the top of the page a list of running jobs is shown. Below that,
+the left column shows jobs that wait to be picked up by the job
+executor. On the right are finished jobs. The number of finished jobs
+is cut to some maximum and is also restricted by a date range. The
+page refreshes itself automatically to show the progress.
+
+Example screenshot:
+
+{{ figure(file="processing-queue.jpg") }}
+
+You can cancel running jobs or remove waiting ones from the queue. If
+you click on the small file symbol on finished jobs, you can inspect
+its log messages again. A running job displays the job executor id
+that executes the job.
+
+The jobs listed here are all long-running tasks for your collective.
+Most of the time these are document processing tasks. But user
+defined tasks, like "import mailbox", are also visible here.
+
+Since job executors are shared among all collectives, it may happen
+that a job waits for some time until it is picked up by a job
+executor. You can always start more job executors to help out.
+
+If a job fails, it is retried after some time. Only if it fails too
+often (this can be configured) is it finished with the *failed* state.
+
+For the document-processing task, if processing finally fails or a job
+is cancelled, the item is still created, just without suggestions.
--- a/website/site/content/docs/webapp/scanmailbox-detail.png
+++ b/website/site/content/docs/webapp/scanmailbox-detail.png
--- a/website/site/content/docs/webapp/scanmailbox-list.png
+++ b/website/site/content/docs/webapp/scanmailbox-list.png
--- a/website/site/content/docs/webapp/scanmailbox.md
+++ b/website/site/content/docs/webapp/scanmailbox.md
@ -0,0 +1,122 @@
+++
+title = "Scan Mailboxes"
+weight = 70
+[extra]
+mktoc = true
+++
+
+Users who provide valid e-mail (IMAP) settings can import mails from
+their mailbox into docspell periodically.
+
+You first need to define IMAP settings; please see [this
+page](@/docs/webapp/emailsettings.md#imap-settings).
+
+Go to *User Settings -> Scan Mailbox Task*. You can define periodic
+tasks that connect to your mailbox and import mails into docspell. It
+is possible to define multiple tasks, for example, if you have
+multiple e-mail accounts you want to import periodically.
+
+{{ figure(file="scanmailbox-list.png") }}
+
+
+## Details
+
+Creating a task requires the following information:
+
+{{ figure(file="scanmailbox-detail.png") }}
+
+You can enable or disable this task. A disabled task will not run
+periodically. You can still choose to run it manually if you click the
+`Start Once` button.
+
+Then you need to specify which [IMAP
+connection](@/docs/webapp/emailsettings.md#imap-settings) to use.
+
+A list of folders is required. Docspell will only look into these
+folders. You can specify multiple folders. The "Inbox" folder is a
+special folder, which usually appears translated in your web-mail
+client. You can specify "INBOX" in any letter case; docspell will then
+read mails in your inbox. Any other folder name is usually
+case-sensitive, although this depends on the IMAP server. Type in a
+folder name and click the add button on the right.
+
+Then the field *Received Since Hours* defines how many hours to go
+back and look for mails. Usually there are many mails in your inbox
+and importing them all at once is not feasible or desirable. It can
+work together with the *Schedule* field below. For example, you could
+run this task every 6 hours and read mails from 8 hours back, so that
+consecutive runs overlap.
+
+The next two settings tell docspell what to do once a mail has been
+submitted to docspell. It can be moved into another folder in your
+mail account. This moves it out of the way for the next run. You can
+also choose to delete the mail, but *note that it will really be
+deleted and not moved to your trash folder*. If both options are off,
+nothing happens with that mail, it simply stays (and could be re-read
+on the next run).
+
+When docspell creates an item from a mail, it needs to set a direction
+value (incoming or outgoing). If you know that all mails you want to
+import have a specific direction, then you can set it here. Otherwise,
+*automatic* means that docspell chooses a direction based on the
+`From` header of a mail. If the `From` header is an e-mail address
+that belongs to a “concerning” person in your address book, then it is
+set to "outgoing". Otherwise it is set to "incoming". To support this,
+you need to add your own e-mail address(es) to your address book.
+
+The *Item Folder* setting is used to put all items that are created
+from mails into the specified
+[folder](@/docs/webapp/metadata.md#folders). If you choose a folder
+here that you are not a member of, you won't be able to find the
+resulting items.
+
+The last field is the *Schedule*, which defines when and how often this
+task should run. The syntax is similar to a date-time string, like
+`2019-09-15 12:32`, where each part is a pattern that can match
+multiple values. The ui tries to help a little by displaying the next
+two date-times at which this task would execute. More in-depth help is
+available [here](https://github.com/eikek/calev#what-are-calendar-events).
+For example, to execute the task every Monday at noon, you would write:
+`Mon *-*-* 12:00`. A date-time part can match all values (`*`), a list
+of values (e.g. `1,5,12,19`) or a range (e.g. `1..9`). Long lists may
+be written in a shorter way using a repetition value. It is written
+like this: `1/7`, which is the same as a list with `1` and all
+multiples of `7` added to it. In other words, it matches `1`, `1+7`,
+`1+7+7`, `1+7+7+7` and so on.
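+
+As a rough illustration, the following schedules are sketches built
+only from the pattern forms described above (repetition, list and
+range). They have not been checked against a running instance, so
+verify them with the next two date-times the ui displays:
+
+```
+*-*-* 0/6:00
+*-*-1,15 08:00
+Mon *-*-* 8..10:30
+```
+
+Under the rules above, the first should match hours 0, 6, 12 and 18
+of every day, the second 08:00 on the 1st and 15th of every month, and
+the third 08:30, 09:30 and 10:30 on every Monday.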
+
+
+## Reading Mails twice / Duplicates
+
+Since users can move mails around in their mailboxes, it can happen
+that docspell unintentionally reads a mail multiple times. When
+docspell reads a mail, it first checks whether an item already exists
+that originated from this mail. It only proceeds to import the mail if
+it cannot find one. If you deleted such an item in the meantime,
+docspell would import the mail again.
+
+This check uses the
+[`Message-ID`](https://en.wikipedia.org/wiki/Message-ID) of an e-mail.
+This is usually present and should identify a complete mail. But it
+won't catch duplicate mails that are sent multiple times, since they
+might have different `Message-ID`s. Also, some mails have no such id
+and are then imported by docspell without any checks.
+
+In later versions, docspell may use the checksum of the generated eml
+file to look for duplicates, too.
+
+
+## How it works
+
+Docspell will go through all folders and download mails in “batches”.
+The batch size can be set by the admin in the [configuration
+file](@/docs/configure/_index.md#joex) and applies to all these tasks
+(same for all users). A batch contains only the mail headers and
+not the complete mails.
+
+Then each mail is downloaded completely one by one and converted into
+an [eml](https://en.wikipedia.org/wiki/Email#Filename_extensions) file
+which is then submitted to docspell. Then the usual processing
+machinery starts, just like uploading an eml file via the webapp.
+
+The number of folders and the number of mails to import can be limited
+by an admin via the config file. Note that this limit applies to a
+single task run only; it is meant to reduce the resources one task can
+consume.
--- a/website/site/content/docs/webapp/search-bar.png
+++ b/website/site/content/docs/webapp/search-bar.png
--- a/website/site/content/docs/webapp/search-content-results.png
+++ b/website/site/content/docs/webapp/search-content-results.png
--- a/website/site/content/docs/webapp/search-menu.png
+++ b/website/site/content/docs/webapp/search-menu.png
--- a/website/site/content/docs/webapp/sources-form.png
+++ b/website/site/content/docs/webapp/sources-form.png
--- a/website/site/content/docs/webapp/uploading.md
+++ b/website/site/content/docs/webapp/uploading.md
@ -0,0 +1,177 @@
+++
+title = "Uploads"
+weight = 0
+++
+
+This page describes how files can get into docspell. Technically,
+there is just one way: via HTTP multipart/form-data requests.
+
+
+## Authenticated Upload
+
+From within the web application there is the "Upload Files"
+page. There you can select multiple files to upload. You can also
+specify whether these files should become one item or if every file is
+a separate item.
+
+When you click "Submit" the files are uploaded and stored in the
+database. The job executors are then notified and immediately start
+processing them.
+
+Go to the top-right menu and click "Processing Queue" to see the
+current state.
+
+This obviously requires an authenticated user. While this is handy for
+ad-hoc uploads, it is very inconvenient for automating uploads with
+custom scripts. The next variant exists for that purpose.
+
+## Anonymous Upload
+
+It is also possible to upload files without authentication. This
+should make tools that interact with docspell much easier to write.
+
+### Creating Anonymous Uploads
+
+Go to "Collective Settings" and then to the "Source" tab. A *Source*
+identifies an endpoint where files can be uploaded
+anonymously. Creating a new source creates a long unique id which is
+part of a url that can be used to upload files. You can deactivate or
+delete the source at any time, at which point uploading is no longer
+possible. The idea is that this url can be given away safely: no
+passwords or secrets are visible, not even your username.
+
+Example screenshot:
+
+{{ figure(file="sources-form.png") }}
+
+This example shows a source with name "test". Besides a description
+and a name that is only used for display purposes, a priority and a
+[folder](@/docs/webapp/metadata.md#folders) can be specified.
+
+The priority is used for the processing jobs that are submitted when
+files are uploaded via this endpoint.
+
+All items that result from uploads to this endpoint are placed into
+the specified folder.
+
+The source endpoint defines two urls:
+
+- `/app/upload/<id>`
+- `/api/v1/open/upload/item/<id>`
+
+The first points to a web page where anyone can upload files into
+your account. You could give this url to people for sending files
+directly into your docspell.
+
+The second url is the API url, which accepts the requests to upload
+files (it is also used by the web page behind the first url).
+
+For example, this url can be used to upload files with curl:
+
+``` bash
+$ curl -XPOST -F file=@test.pdf http://localhost:7880/api/v1/open/upload/item/CqpFTb7UmGe-9nMVPZSmnwc-AHH6nWFh52t-M1JFQ9y7cdH
+{"success":true,"message":"Files submitted."}
+```
+
+You could add more `-F file=@/path/to/your/file.pdf` to upload
+multiple files (note, the `@` is required by curl, so it knows that
+the following is a file).
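+
+To upload many files at once, the repeated `-F` arguments can be
+generated by a script. The following is a minimal bash sketch, not
+part of docspell: the function name `file_args`, the `./scans`
+directory and the `<id>` placeholder are assumptions for illustration.
+It does not handle paths containing newlines, `;` or `,`.
+
+``` bash
+#!/usr/bin/env bash
+# Print one "-F" / "file=@<path>" argument pair (one argument per line)
+# for each file given on the command line.
+file_args() {
+  local f
+  for f in "$@"; do
+    printf '%s\n' "-F" "file=@$f"
+  done
+}
+
+# Usage sketch (replace <id> with the id of your source):
+#   mapfile -t args < <(file_args ./scans/*.pdf)
+#   curl -XPOST "${args[@]}" "http://localhost:7880/api/v1/open/upload/item/<id>"
+```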
+
+When files are uploaded to a source endpoint, the items resulting
+from these uploads are marked with the name of the source. So you know
+which source an item originated from.
+
+If files are uploaded using the web application's *Upload files* page,
+the source is implicitly set to `webapp`. If you also want to let
+docspell count the files uploaded through the web interface, just
+create a source (can be inactive) with that name (`webapp`).
+
+
+## Integration Endpoint
+
+Another option for uploading files is the special *integration
+endpoint*. This endpoint allows an admin to upload files to any
+collective that is known by name.
+
+```
+/api/v1/open/integration/item/[collective-name]
+```
+
+The endpoint is behind `/api/v1/open`, so this route is not protected
+by an authentication token (see [REST Api](@/docs/api/_index.md) for
+more information). However, it can be protected via settings in the
+configuration file. The idea is that this endpoint is controlled by an
+administrator and not the user of the application. The admin can
+enable this endpoint and choose between some methods to protect it.
+Then the administrator can upload files to any collective. This might
+be useful to connect other trusted applications to docspell (that run
+on the same host or network).
+
+The endpoint is disabled by default. To enable it, an admin must set
+the `docspell.server.integration-endpoint.enabled` flag to `true` in
+the [configuration file](@/docs/configure/_index.md#rest-server).
+
+If queried by a `GET` request, it returns whether it is enabled and
+the collective exists.
+
+It is also possible to check for existing files using their sha256
+checksum with:
+
+```
+/api/v1/open/integration/checkfile/[collective-name]/[sha256-checksum]
+```
+
+See the [SMTP gateway](@/docs/tools/smtpgateway.md) or the [consumedir
+script](@/docs/tools/consumedir.md) for examples of using this endpoint.
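+
+As a minimal sketch of the checkfile route shown above, the url for a
+local file could be assembled like this. The function name
+`checkfile_url`, the collective name `my-collective` and the file name
+are hypothetical. It assumes GNU `sha256sum` is installed, that the
+collective name needs no url-encoding and that the route answers a
+plain `GET`. If the endpoint is protected via the configuration file,
+the matching credentials have to be added to the curl call.
+
+``` bash
+#!/usr/bin/env bash
+# Print the checkfile url for a file:
+#   <base>/api/v1/open/integration/checkfile/<collective>/<sha256>
+checkfile_url() {
+  local base="$1" collective="$2" file="$3" sum
+  # sha256sum prints "<checksum>  -" when reading stdin; keep the checksum
+  sum="$(sha256sum < "$file" | cut -d' ' -f1)"
+  printf '%s/api/v1/open/integration/checkfile/%s/%s\n' \
+    "$base" "$collective" "$sum"
+}
+
+# Usage sketch:
+#   curl "$(checkfile_url http://localhost:7880 my-collective letter.pdf)"
+```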
+
+## The Request
+
+This section gives more details about the upload request. It is an
+HTTP `multipart/form-data` request, with two possible fields:
+
+- meta
+- file
+
+The `file` field can appear multiple times and is required at least
+once. It is the part containing the file to upload.
+
+The `meta` part is completely optional and can define additional meta
+data that docspell uses to create items from the given files. It
+allows you to transfer structured information together with the
+unstructured binary files.
+
+The `meta` content must be `application/json` containing this
+structure:
+
+``` elm
+{ multiple: Bool
+, direction: Maybe String
+, folder: Maybe String
+}
+```
+
+The `multiple` property is by default `true`. It means that each file
+in the upload request corresponds to a single item. An upload with 5
+files will result in 5 items created. If it is `false`, then docspell
+will create just one item, that will then contain all files.
+
+Furthermore, the direction of the document (one of `incoming` or
+`outgoing`) can be given. It is optional: it can be left out or set to
+`null`.
+
+A `folder` id can be specified. Each item created by this request will
+be placed into this folder. Errors are logged (for example, the folder
+may have been deleted before the task is executed) and the item is
+then not put into any folder.
+
+This kind of request is very common and most programming languages
+have support for this. For example, here is another curl command
+uploading two files with meta data:
+
+``` bash
+curl -XPOST -F meta='{"multiple":false, "direction": "outgoing"}' \
+            -F file=@letter-en-source.pdf \
+            -F file=@letter-de-source.pdf \
+            http://localhost:7880/api/v1/open/upload/item/CqpFTb7UmGe-9nMVPZSmnwc-AHH6nWFh52t-M1JFQ9y7cdH
+```
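+
+When the `meta` part is assembled by a script, a small helper can
+avoid quoting mistakes. This is only a sketch: the function name
+`build_meta` is hypothetical, the values are assumed to contain no
+quotes or backslashes (nothing is escaped), and it assumes that
+`folder`, like `direction`, may simply be left out.
+
+``` bash
+#!/usr/bin/env bash
+# Print the JSON for the `meta` part.
+#   $1: multiple  ("true" or "false")
+#   $2: direction ("incoming", "outgoing" or empty to leave it out)
+#   $3: folder id (or empty to leave it out)
+build_meta() {
+  local multiple="$1" direction="${2:-}" folder="${3:-}"
+  local json="{\"multiple\":$multiple"
+  if [ -n "$direction" ]; then
+    json="$json,\"direction\":\"$direction\""
+  fi
+  if [ -n "$folder" ]; then
+    json="$json,\"folder\":\"$folder\""
+  fi
+  printf '%s}\n' "$json"
+}
+
+# Usage sketch (replace <id> with the id of your source):
+#   curl -XPOST -F meta="$(build_meta false outgoing)" \
+#               -F file=@letter-en-source.pdf \
+#               "http://localhost:7880/api/v1/open/upload/item/<id>"
+```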