+++
title = "Introduction"
weight = 0
description = "Gives a short introduction to the goals of docspell and an overview of the components involved."
insert_anchor_links = "right"
[extra]
mktoc = true
+++

# Introduction

Docspell aims to be a simple yet effective document organizer that
makes stowing documents away very quick and finding them later
reliable (and also fast). Using it requires neither a technical
background nor studying huge manuals. With this in mind, it is
rather opinionated and targeted at home use and small to medium
organizations.

Docspell analyzes the text of your files and tries to find metadata
that it then annotates automatically. This metadata is taken from an
address book that must be maintained manually. Docspell looks for
candidates for:

- Correspondents
- Concerned persons or things
- A date
- Tags

For tags, it sets all that it thinks apply. For the others, it
proposes a few candidates and sets the most likely one on your item.
This might be wrong, so it is recommended to curate the results.
However, very often the correct one is either already set or among
the proposals, where you can fix it with a single click.

Besides these properties, there is more metadata you can use to
organize your files, for example tags, folders and notes.

Docspell is also for programmers. Everything is available via a REST
(HTTP) API and can easily be used within your own scripts and tools,
for example using `curl`. There are also features for "advanced use"
and many configuration options.
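
As a rough sketch of what scripting against the API could look like,
the following Python snippet builds a JSON login request without
sending it. The base URL, the endpoint path and the payload field
names are assumptions made for illustration; check them against the
API documentation of your installation before relying on them.

```python
import json
import urllib.request

# Assumption: address of a locally running REST server.
BASE_URL = "http://localhost:7880"

# Assumption: path of the login endpoint; verify it for your version.
LOGIN_PATH = "/api/v1/open/auth/login"


def build_login_request(account: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for signing in."""
    body = json.dumps({"account": account, "password": password}).encode("utf-8")
    return urllib.request.Request(
        url=BASE_URL + LOGIN_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Building the request separately from sending it keeps the example
usable without a server; `send(build_login_request("smith/john", "secret"))`
would perform the actual call.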

# Components

Docspell consists of multiple components that run in separate
processes:

- REST server
- JOEX, short for *job executor*
- Fulltext Search Index (optional, currently Apache SOLR)

The REST server provides the API and the web application. The web
application is a
[SPA](https://en.wikipedia.org/wiki/Single-page_application) written
in [Elm](https://elm-lang.org) and is a client to the REST API. All
features are available via this HTTP/REST API.

The *joex* is the component that does the “heavy work”, executing
long-running tasks like processing files or periodically importing
your mails. While the joex component also exposes a small REST API
for controlling it, the main user interface is the API of the REST
server.

The REST server and the job executor can each be started multiple
times in order to scale out. All instances must connect to the same
database. It is also recommended (though not strictly required)
that all components can reach each other.

The fulltext search index is another separate component; currently
only [SOLR](https://lucene.apache.org/solr) is supported. Fulltext
search is optional, so the SOLR component is not required if
docspell is run without fulltext search support.

# Terms

To better understand the following pages, some terms are explained
here.

## Item

An *item* is roughly your document, except that an item may span
multiple files, which are called *attachments*. An item has the
following *metadata* associated with it:

- a *correspondent*: the other side of the communication. It can be
  an organization or a person.
- a *concerning person* or *equipment*: a person or thing that
  this item is about, for example your car for an insurance contract.
- *tags*: an item can be tagged with one or more tags (or labels). A
  tag can have a *category*, which is intended for grouping tags. For
  example, a category `doctype` could be used to group tags like
  `bill`, `contract`, `receipt` etc. Usually an item is not tagged
  with more than one tag of a category.
- a *folder*: a folder is similar to a tag, but an item can be in at
  most one folder. Furthermore, users can be associated with a
  folder, so that its items are only visible to the users who are
  members of that folder.
- an *item date*: this is the date of the document. If it is not
  set, the created date of the item is used.
- a *due date*: an optional date by which something has to be done
  about this item (e.g. paying a bill or submitting it).
- a *direction*: one of "incoming" or "outgoing"
- a *name*: some item name, defaults to the file name of the
  attachments
- some *notes*: arbitrary descriptive text. You can use markdown
  here, which is properly formatted in the web application.
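
To make the list above more concrete, here is a minimal Python
sketch of an item as a plain data structure. It is only an
illustration: the class and field names are made up for this example
and do not reflect how docspell stores items internally. It shows the
fallback from item date to created date and a check for categories
that occur on more than one tag.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Set


@dataclass
class Tag:
    name: str
    category: Optional[str] = None  # e.g. "doctype"


@dataclass
class Item:
    # All names here are hypothetical and chosen for illustration.
    name: str
    created: date
    direction: str = "incoming"  # "incoming" or "outgoing"
    attachments: List[str] = field(default_factory=list)
    correspondent: Optional[str] = None
    concerning: Optional[str] = None
    tags: List[Tag] = field(default_factory=list)
    folder: Optional[str] = None
    item_date: Optional[date] = None
    due_date: Optional[date] = None
    notes: str = ""

    def effective_date(self) -> date:
        """Return the item date, falling back to the created date."""
        return self.item_date if self.item_date is not None else self.created

    def repeated_categories(self) -> Set[str]:
        """Return the categories that appear on more than one tag."""
        seen: Set[str] = set()
        repeated: Set[str] = set()
        for tag in self.tags:
            if tag.category is None:
                continue
            if tag.category in seen:
                repeated.add(tag.category)
            seen.add(tag.category)
        return repeated
```

An item created with `Item(name="car-insurance.pdf", created=date(2020, 7, 27))`
reports its created date from `effective_date()` until an item date
is set.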

## Collective

The users of the application are part of a *collective*. A
*collective* is a group of users that share access to the same
items. The account name therefore consists of a *collective name*
and a *user name*.

All users of a collective are equal; they have the same permissions
to access all items. The items don't belong to a user, but to the
collective.

That means, to identify yourself when signing in, you have to give
the collective name and your user name. By default, they are
separated by a slash `/`, for example `smith/john`. If your user
name is the same as the collective name, you can omit one; so
`smith/smith` can be abbreviated to just `smith`.
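
The rule for account names can be sketched in a few lines of Python.
The function name `parse_account` and its error handling are
assumptions made for this example, not docspell's own code; only the
separator and the abbreviation rule come from the description above.

```python
from typing import Tuple


def parse_account(account: str, separator: str = "/") -> Tuple[str, str]:
    """Split an account name into (collective name, user name).

    A name without separator stands for a user whose name equals the
    collective name, so "smith" is read as "smith/smith".
    """
    collective, sep, user = account.partition(separator)
    if not collective:
        raise ValueError(f"missing collective name: {account!r}")
    if not sep:
        # Abbreviated form: collective and user share one name.
        return collective, collective
    if not user or separator in user:
        raise ValueError(f"invalid user name in: {account!r}")
    return collective, user
```

With this sketch, `parse_account("smith/john")` gives
`("smith", "john")`, and both `parse_account("smith/smith")` and
`parse_account("smith")` give `("smith", "smith")`.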

By default, all users can see all items of their collective. A
*folder* can be used to implement other visibilities: every user can
create a folder and associate members. Items can be put into these
folders, and docspell shows only items that are either in no folder
or in a folder where the current user is the owner or a member.
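
The visibility rule for folders can be written down as a small
predicate. This is a sketch under the assumptions stated in the text
only; the `Folder` and `Item` classes below are hypothetical and
reduced to what the rule needs.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Folder:
    name: str
    owner: str
    members: Set[str] = field(default_factory=set)


@dataclass
class Item:
    name: str
    folder: Optional[Folder] = None  # an item is in at most one folder


def is_visible(item: Item, user: str) -> bool:
    """Items without a folder are visible to every user of the collective;
    items in a folder only to the folder's owner and members."""
    if item.folder is None:
        return True
    return user == item.folder.owner or user in item.folder.members


def visible_items(items: List[Item], user: str) -> List[Item]:
    """Filter the items of a collective down to those the user may see."""
    return [item for item in items if is_visible(item, user)]
```

For example, with a folder owned by `john` that has `mary` as a
member, an item in that folder is visible to `john` and `mary`, but
not to a third user of the same collective.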