269 Commits

SHA1 Message Date
9013f2de5b Update scalafmt settings 2021-09-22 17:23:24 +02:00
20a829cf7a Refactoring for migrating to binny library 2021-09-22 14:18:43 +02:00
9785db0683 Change license header of all files 2021-09-21 22:35:38 +02:00
193b81bf7d Fix version check
Refs: #1068
2021-09-21 22:07:19 +02:00
751fa3da5a Add attachments-only filter to uploads
When uploading a file which is an e-mail, this option allows skipping
the mail body when the file is being processed.
2021-08-21 13:49:12 +02:00
5d33b3841a Add a task to check for updates periodically
It must be enabled and configured by the admin.

Refs: #990
2021-08-20 00:25:27 +02:00
90421599ea Fix storing empty-trash task
It was wrongly stored using RPeriodicTask directly, but the higher
level `UserTask` must be used instead, because this ensures a
correctly scoped periodic task using the `updateOneTask` method. Since
this is a system task, it can be given a fixed ID which makes it now
safe even if stored using RPeriodicTask directly.

The bug resulted in multiple empty-trash tasks being inserted (on each
restart).

Refs: #347
2021-08-20 00:25:25 +02:00
e4fecefaea Reformat with scalafmt 3.0.0 2021-08-19 08:50:30 +02:00
14e4a8f792 Fixup for deleting items
First, when checking for existence of a file, deleted items are not
considered.

The handling of fulltext search has been changed: deleted items are
removed from fulltext index and are re-added when they are restored.
The fulltext index currently doesn't hold the item state and it would
mean much more work to introduce it into the index (or, worse, to
reprocess the results from the index). Thus, deleted items can only be
searched using database queries. It is probably a very rare use case
when fulltext search should be applied to deleted items. They have
been deleted for a reason and the most likely case is that they are
simply removed.

Refs: #347
2021-08-15 16:00:30 +02:00
f4a2b86ea8 Use a minimum age of items to remove
In order to keep deleted items for a while, the periodic task can now
use a duration to only remove items with a certain age. This can be
used to ensure that a deleted item stays at least X days before it
will be removed from the database.

Refs: #347
2021-08-15 12:32:50 +02:00
31d885ed79 Refactor user tasks to support collective and user scopes
Before, periodic tasks were run per collective rather than per user by
making sure that submitter + group have the same value. This is now
encoded in `UserTaskScope`, which makes it explicit and reduces errors
when using it.
2021-08-14 22:07:56 +02:00
27fd7a5867 Make sure the empty-trash task is started for all collectives 2021-08-14 20:40:04 +02:00
50706c3d6d Add a task implementation to delete items 2021-08-14 19:33:18 +02:00
1901fe1a8c Adopt deprecated APIs from fs2; use fs2.Path 2021-08-07 17:51:56 +02:00
1c0d87527b Log error when setting folder doesn't work 2021-07-17 15:10:00 +02:00
8e5c88fd32 Add copyright header to source files 2021-07-04 10:57:53 +02:00
bd791b4593 Upgrade code base to CE3 2021-06-22 22:53:34 +02:00
ac7d00c28f Refactor re-index task 2021-06-07 21:17:29 +02:00
3ee0846e19 Remove fts_migration table
It is now stored in SOLR instead.
2021-06-07 17:53:47 +02:00
5205ee0623 Store solr migration state in a solr document 2021-06-07 17:53:37 +02:00
bdc7822f50 Add documentation about docker setup 2021-05-31 22:19:49 +02:00
e1bbc2edf5 Apply autoformat 2021-04-10 16:31:58 +02:00
144ea852bf Update fs2-core, fs2-io to 2.5.4 2021-03-31 21:10:42 +02:00
c36073b852 Allow to give human readable summary to user tasks 2021-03-27 22:13:13 +01:00
cc38b850a6 Remove deprecated search routes and some refactoring 2021-03-27 22:13:13 +01:00
f8bd42e5bd Redo pdf conversion and text extraction on reprocess
When processing a new file, conversion and text extraction are skipped
if detected to be already done. This prevents running expensive tasks
again after restarting/retrying. When explicitly reprocessing a file,
these tasks should run again and replace the existing results.
2021-03-12 00:45:28 +01:00
a7ee0aa08b Add a flag to processing task to distinguish re-/processing 2021-03-12 00:45:23 +01:00
058c31e1f6 Reprocessing now sets metadata to an item if not in state confirmed
When reprocessing an item, the metadata of all *files* is replaced.
This change now also sets some metadata to an item, but only if the
item is not in state "confirmed". Confirmed items are not touched, but
the metadata of the files is updated.
2021-03-12 00:16:19 +01:00
0229a867af Add a use column to metadata entities 2021-03-10 23:55:18 +01:00
6a63694a3e Convert unit tests to munit 2021-03-10 19:48:56 +01:00
9013d9264e Add more convenient date parsers and some basic macros 2021-03-01 00:51:01 +01:00
e9ed998e3a Basic poc to search via custom query 2021-03-01 00:51:01 +01:00
186014a1c6 Refactor search to separate between a base query and user query
The `findBase` is adding only strictly required conditions. Everything
else comes from the user.
2021-03-01 00:51:01 +01:00
e6d9ce2c37 Remove obsolete type capabilities
These are now detected by the new scala compiler and lead to compile
errors.
2021-03-01 00:16:30 +01:00
d7bc963450 Cleanup nodes that are not reachable anymore 2021-02-18 00:37:18 +01:00
48eee00c0b Allow person to be correspondent, concerning or both 2021-02-16 22:49:55 +01:00
d99ce76d89 Remove person suggestion if it doesn't match with organization 2021-02-16 00:29:54 +01:00
dd935454c9 First version of new ui based on tailwind
This drops fomantic-ui as css toolkit and introduces tailwindcss. With
tailwind there are no predefined components, but it's very easy to
create those. So customizing the look&feel is much simpler; most of
the time no additional css is needed.

This requires a complete rewrite of the markup + styles. Luckily all
logic can be kept as is. The now old ui is not removed, it is still
available by using a request header `Docspell-Ui` with a value of `1`
for the old ui and `2` for the new ui.

Another addition is "dev mode", where docspell serves assets with a
no-cache header, to disable browser caching. This makes developing a
lot easier.
2021-02-14 01:46:13 +01:00
96612e0e59 Refactor scan mailbox form and add flag for post-processing
Mails are filtered once by using an imap search and then by some globs
to filter files and subjects. Imap can search by subject via a
string-contains, but not via globs or patterns (afaik). The subject
filter is applied to all downloaded mail headers. Now for post
processing (moving to some target folder or deleting), it can be
chosen to post-process all "seen" mails or only those that matched all
filters.
2021-01-24 01:46:31 +01:00
c7e850116f Make the text length limit optional 2021-01-22 23:06:50 +01:00
4cba96f390 Always return classifier results as suggestion
The classifier results are spliced into the suggestion list at second
place. When linking they are only used if nlp didn't find anything.
2021-01-21 21:05:28 +01:00
9957c3267e Add constraints from config to classifier training
For large and/or many documents, training the classifier can lead to
OOM errors. Some limits have been set by default.
2021-01-21 17:46:39 +01:00
a6c31be22f Update documentation 2021-01-20 22:47:15 +01:00
85ddc61d9d Move date proposal setting to nlp config 2021-01-20 19:17:29 +01:00
b12d965223 Improve logging 2021-01-20 00:40:58 +01:00
27c24c128d Store tags guessed with classifier in database 2021-01-20 00:30:40 +01:00
9d83cb7fe4 Store item based proposals in separate table
Classifiers don't work on each attachment, but on all. So the results
must not be stored at an attachment. This reverts some previous
changes to put the classifier results for item entities into its own
table.
2021-01-19 23:48:09 +01:00
75573c905e Use classifier results as fallback when linking proposed metadata 2021-01-19 23:13:34 +01:00
8455d1badf Lookup results from classifier
The model may be out of date, data may change. Then it should be
looked up to fetch the id to be compatible with next stages.
2021-01-19 22:56:01 +01:00
1cd3441462 Run classifier for item entities (concerned, correspondent)
Store the results separately from nlp results in attachment metadata.
2021-01-19 22:08:29 +01:00