172 Commits

Author SHA1 Message Date
Eike Kettner
a4a84abae5 Show errors from failed register request
Also include an `@` in the valid chars for "idents". This allows using
an e-mail address as username.
2021-03-10 22:14:55 +01:00
Eike Kettner
6a63694a3e Convert unit tests to munit 2021-03-10 19:48:56 +01:00
Eike Kettner
9991ad5fcc Add latvian language 2021-03-09 00:23:17 +01:00
Eike Kettner
698ff58aa3 Provide a more convenient interface to search 2021-03-01 11:50:07 +01:00
Eike Kettner
e9ed998e3a Basic poc to search via custom query 2021-03-01 00:51:01 +01:00
Eike Kettner
20ccdda609 Add a notes field to equipments 2021-02-17 22:39:07 +01:00
Eike Kettner
48eee00c0b Allow person to be correspondent, concerning or both 2021-02-16 22:49:55 +01:00
Eike Kettner
d99ce76d89 Remove person suggestion if it doesn't match with organization 2021-02-16 00:29:54 +01:00
Eike Kettner
eb308cfa85 Remove duplicate candidates when creating list of suggestions 2021-02-15 23:56:09 +01:00
Eike Kettner
dd935454c9 First version of new ui based on tailwind
This drops fomantic-ui as css toolkit and introduces tailwindcss. With
tailwind there are no predefined components, but it's very easy to
create those. So customizing the look&feel is much simpler, most of
the time no additional css is needed.

This requires a complete rewrite of the markup + styles. Luckily all
logic can be kept as is. The now old ui is not removed, it is still
available by using a request header `Docspell-Ui` with a value of `1`
for the old ui and `2` for the new ui.

Another addition is "dev mode", where docspell serves assets with a
no-cache header, to disable browser caching. This makes developing a
lot easier.
2021-02-14 01:46:13 +01:00
Eike Kettner
96612e0e59 Refactor scan mailbox form and add flag for post-processing
Mails are filtered once by using an imap search and then by some globs
to filter files and subjects. Imap can search by subject via a
string-contains, but not via globs or patterns (afaik). The subject
filter is applied to all downloaded mail headers. Now for post
processing (moving to some target folder or deleting), it can be
chosen to post-process all "seen" mails or only those that matched all
filters.
2021-01-24 01:46:31 +01:00
mergify[bot]
8dd1672c8c Merge pull request #583 from eikek/fix-baseurl-setting
Render baseurl without trailing slash
2021-01-21 23:44:14 +00:00
Eike Kettner
0ec620fcf0 Render baseurl without trailing slash
The webapp expects it like this currently, because the url is only a
string.
2021-01-21 21:42:08 +01:00
Eike Kettner
4cba96f390 Always return classifier results as suggestion
The classifier results are spliced into the suggestion list at second
place. When linking they are only used if nlp didn't find anything.
2021-01-21 21:05:28 +01:00
Eike Kettner
5c487ef7a9 Refactor running classifier in text analysis 2021-01-19 21:30:02 +01:00
Eike Kettner
a6f29153c4 Control what tag categories to use for auto-tagging 2021-01-19 01:20:13 +01:00
Eike Kettner
3f75af0807 Add 9 more languages to the list of document languages 2021-01-18 17:41:40 +01:00
Eike Kettner
26dff18ae0 Add spanish as an example
Adding a new language without nlp now only requires filling in these
pieces:

- define a list of month names to support date recognition
- add it to joex' dockerfile to be available for tesseract
- update the solr migration/field definitions
- update the elm file so it shows up on the client
2021-01-18 17:41:40 +01:00
Eike Kettner
f01646aeb5 Reorganize nlp pipeline and add nlp-unsupported language italian
Improves and reorganizes how nlp pipelines are set up. Now users can
choose from many options, depending on their hardware and usage
scenario.

This is the base to use more languages without depending on what
stanford-nlp supports. Support then involves text extraction and
simple regex-ner processing.
2021-01-18 17:41:40 +01:00
Eike Kettner
aa937797be Choose nlp mode in config file 2021-01-17 22:56:33 +01:00
Eike Kettner
d712f8303d Make glob matching case-insensitive by default 2021-01-09 13:23:15 +01:00
Eike Kettner
b08e88cd69 Add (unofficial) routes to get system information 2021-01-05 20:54:53 +01:00
Eike Kettner
668abf2140 Add a reset-password admin route 2021-01-04 20:59:31 +01:00
Eike Kettner
77627534bc Improve on basic search summary 2020-12-15 23:37:02 +01:00
Eike Kettner
e3f6892abd Convert job record 2020-12-15 21:03:46 +01:00
Eike Kettner
290989f67f Reorder correspondent person suggestion based on org relationship 2020-12-01 23:39:45 +01:00
Eike Kettner
3fabe0a582 Update to Scala 2.13.4 2020-11-27 20:26:24 +01:00
Eike Kettner
5fe532001b Allow to specify document language with the request 2020-11-23 20:49:01 +01:00
Eike Kettner
7712e02d2d Don't allow empty custom field values 2020-11-23 10:38:59 +01:00
Eike Kettner
93295d63a5 Change custom field values for a single item 2020-11-22 21:41:09 +01:00
Eike Kettner
62313ab03a Add and change custom fields 2020-11-22 21:41:09 +01:00
Eike Kettner
248ad04dd0 Prepare custom fields 2020-11-22 21:41:09 +01:00
Eike Kettner
5034e12bec Add a subject filter to scan-mailbox args 2020-11-13 23:15:20 +01:00
Eike Kettner
4fd6e02ec0 Improve glob and filter archive entries 2020-11-11 21:01:23 +01:00
Eike Kettner
55a6f7aaf6 Add more properties to upload meta data 2020-11-11 21:01:23 +01:00
Eike Kettner
a21a97f7d5 Add a simple glob data type 2020-11-10 22:44:08 +01:00
Eike Kettner
29455d638c Add startup task to find page counts of existing files 2020-11-09 20:35:35 +01:00
Eike Kettner
f4e50c5229 Provide endpoints to submit tasks to re-generate previews
The scaling factor can be given in the config file. When this changes,
images can be regenerated via POSTing to certain endpoints. It is
possible to regenerate just one attachment preview or all within a
collective.
2020-11-09 09:00:02 +01:00
Eike Kettner
709848244c Create tasks to generate all previews
There is a task to generate preview images per attachment. It can
either add them (if not present yet) or overwrite them (e.g. some
config has changed).

There is a task that selects all attachments without previews and
submits a task to create it. This is submitted on start automatically
to generate previews for all existing attachments.
2020-11-08 23:46:02 +01:00
Eike Kettner
ef7cb4e779 Create a preview image of all files during processing 2020-11-08 01:25:59 +01:00
Eike Kettner
0114bb4d72 Use source name from config file for integration endpoint uploads
Fixes: #389
2020-10-26 22:37:30 +01:00
Eike Kettner
f6f63000be Prepend a duplicate check when uploading files 2020-09-23 23:37:00 +02:00
Eike Kettner
d8bb6dcba3 Dynamically configure cookie and base-url
When `base-url` is the default (i.e. localhost), the cookie is now
configured with the domain doing the request and the webapp is
configured to run requests against the host in the address bar of the
browser.
2020-09-13 14:05:20 +02:00
Eike Kettner
c658677032 Autoformat 2020-09-09 00:29:32 +02:00
Eike Kettner
76ccfb8a81 Only learn from confirmed items
Text classification should only learn from confirmed items. Log if
classification is disabled when processing an item.
2020-09-07 13:04:40 +02:00
Eike Kettner
06879456a6 Change job priority on queue page 2020-09-05 18:50:58 +02:00
Eike Kettner
8c4f2e702b Add classifier settings 2020-09-02 18:28:14 +02:00
Eike Kettner
3473cbb773 Use collective data with NER annotation 2020-08-25 20:40:44 +02:00
Eike Kettner
96d2f948f2 Use collective's addressbook to configure regexner 2020-08-24 14:40:52 +02:00
Eike Kettner
8628a0a8b3 Allow configuring stanford-ner and cache based on collective 2020-08-24 10:55:59 +02:00
Eike Kettner
fdb46da26d Add french language and upgrade stanford-nlp to 4.0.0 2020-08-23 17:48:42 +02:00
Eike Kettner
3986487f11 Add api docs and cleanup 2020-08-13 21:22:54 +02:00
Eike Kettner
41ea071555 Add a task to convert all pdfs that have not been converted 2020-08-13 01:06:13 +02:00
Eike Kettner
07e9a9767e Add a task to re-process files of an item 2020-08-12 22:29:56 +02:00
Eike Kettner
45b0deeced Print solr url on start
This is useful info to see which url has been selected, same as db
connection.
2020-08-01 15:59:14 +02:00
Eike Kettner
5b01c93711 Add a folder-id to item processing
This allows defining a folder when uploading files. All generated
items are associated with this folder on creation.
2020-07-14 23:18:39 +02:00
Eike Kettner
347a029af8 Scalafix organize-imports 2020-06-28 21:20:47 +02:00
Eike Kettner
41c0f70d3b Fix cancelling jobs
A request to cancel a job was not processed correctly. The cancelling
routine of a task must run, regardless of the (non-final) state. Now
it works like this: if a job is currently running, it is interrupted
and its cancel routine is invoked. It then enters "cancelled" state.
If it is stuck, it is loaded and only its cancel routine is run. If it
is in a final state or waiting, it is removed from the queue.
2020-06-26 23:08:27 +02:00
Eike Kettner
d79ae6233a Restrict proposals for due date
Avoid dates too far in the future.
2020-06-26 16:58:17 +02:00
Eike Kettner
15c0fb4395 Merge branch 'master' into fts 2020-06-23 00:32:27 +02:00
Eike Kettner
e06a3f8fdd ScalafmtAll 2020-06-23 00:18:59 +02:00
Eike Kettner
0d8b03fc61 Add backend operations for re-creating the full-text index 2020-06-21 15:46:51 +02:00
Eike Kettner
7609b2b7c3 Run scalafmtAll 2020-06-20 23:03:51 +02:00
Eike Kettner
3576c45d1a First basic working solr search 2020-06-20 02:18:49 +02:00
Eike Kettner
146d1b0562 Make data to index more flexible and extensible 2020-06-17 23:20:46 +02:00
Eike Kettner
897d91475e Update scalafmt-core to 2.6.0 2020-06-17 19:53:56 +02:00
Eike Kettner
4b0eb650f2 Rename package to avoid name clashes 2020-05-25 16:22:09 +02:00
Eike Kettner
ee394eae86 Try streamline the different impls for MimeType 2020-05-25 09:24:24 +02:00
Eike Kettner
f4949446e3 Allow to specify an item id to amend files to existing items 2020-05-23 20:15:55 +02:00
Eike Kettner
25d089da6c Update state and proposals only on invalid items
Invalid items are those that are not ready, and not shown to the user.
Metadata should only be changed if the item has not already been
shown to the user.
2020-05-23 15:46:24 +02:00
Eike Kettner
f74f8e5198 Add new way for uploading files to any collective
Applications running next to docspell may want a way to upload files
to any collective for integration purposes. This endpoint can be used
for this. It is disabled by default and can be enabled via the
configuration file.
2020-05-23 14:29:24 +02:00
Eike Kettner
9f9dd6c0fb Change routes for scan-mailbox task to allow multiple tasks per user 2020-05-21 22:04:45 +02:00
Eike Kettner
f2d67dc816 Initial impl of import from mailbox user task 2020-05-20 17:52:38 +02:00
Eike Kettner
6e8582ea80 Implement scan-mailbox form and routes 2020-05-20 17:52:38 +02:00
Eike Kettner
5d5311913c Add ScanMailboxArgs 2020-05-20 17:52:38 +02:00
Eike Kettner
d65c1e0d36 Use date from e-mails to set item date 2020-05-17 11:58:51 +02:00
Eike Kettner
3e10e2175a Sort by weights better and save them 2020-05-17 11:58:51 +02:00
Eike Kettner
c41cdeefec Update scalafmt to 2.5.1 + scalafmtAll 2020-05-04 23:53:57 +02:00
Eike Kettner
0a1b3fcf95 Set list-id header for notification mails 2020-04-30 21:23:56 +02:00
Eike Kettner
75a66ecb86 Update http4s to 0.21.4 2020-04-29 01:05:13 +02:00
Eike Kettner
84e0ebf1a2 Add a flag for restricting overdue items 2020-04-23 21:37:03 +02:00
Eike Kettner
d52efdfcf0 Improve mail template 2020-04-22 23:41:09 +02:00
Eike Kettner
e1f9ae2629 Include links to items into mail template 2020-04-22 21:53:25 +02:00
Eike Kettner
2723d6b43b Implement notify-due-items task 2020-04-22 21:08:45 +02:00
Eike Kettner
3524904faf Add routes to check calendar events 2020-04-22 21:08:45 +02:00
Eike Kettner
ad772c0c25 Server-side stub impl for notify-due-items 2020-04-22 21:08:45 +02:00
Eike Kettner
1206105f0b Fix several bugs with handling e-mail files
- When converting from html->pdf, the wkhtmltopdf program exits with
  errors if the document contains invalid links. The content is now
  cleaned before being handed to wkhtmltopdf.
- Update emil library which fixes a bug when reading mails without
  explicit transfer encoding (8bit)
- Add an info header to converted mails
2020-04-07 22:38:25 +02:00
Eike Kettner
14a25fe23e Fix serializing mediatype parameters 2020-03-27 21:50:06 +01:00
Eike Kettner
9656ba62f4 scalafmtAll 2020-03-26 18:26:00 +01:00
Eike Kettner
0b80572664 Fix encodings for mails with non-utf8 html parts 2020-03-24 23:40:29 +01:00
Eike Kettner
cf7ccd572c Improve handling encodings
Html and text files are no longer assumed to be UTF-8. The encoding is now
detected, which may not work for all files. Default/fallback will be
utf-8.

There is still a problem with mails that contain html parts not in
utf8 encoding. The mail text is always returned as a string and the
original encoding is lost. Then the html is stored using utf-8 bytes,
but wkhtmltopdf reads it using latin1. It seems that the `--encoding`
setting doesn't override encoding provided by the document.
2020-03-23 22:51:28 +01:00
Eike Kettner
cba466ed47 Set item due date candidate
After processing, set the due date of an item to the first candidate.
The earliest due date is considered best match.
2020-03-20 22:39:09 +01:00
Eike Kettner
6b1156182c Add support for eml (rfc822 email) files 2020-03-19 22:42:40 +01:00
Eike Kettner
f0449dd2ce Properly initialize thread pools 2020-03-17 22:37:12 +01:00
Eike Kettner
00ca6b5697 Improve text analysis
- Search for consecutive labels

- Sort list of candidates by a weight

- Search for organizations using person labels
2020-03-17 22:34:50 +01:00
Eike Kettner
854a596da3 Integrate periodic tasks
The first use case for periodic task is the cleanup of expired
invitation keys. This is part of a house-keeping periodic task.
2020-03-08 22:49:49 +01:00
Eike Kettner
616c333fa5 Implement storage routines for periodic scheduler 2020-03-08 13:56:23 +01:00
Eike Kettner
1e598bd902 Sketch a scheduler for running periodic tasks
Periodic tasks are special in that they are usually kept around and
started based on a schedule. A new component checks periodic tasks and
submits them in the queue once they are due.

In order to avoid duplicate periodic jobs, the tracker of a job is
used to store the periodic job id. Each time a periodic task is due,
it is first checked if there is a job running (or queued) for this
task.
2020-03-08 12:55:03 +01:00
Eike Kettner
2f87065b2e sbt scalafmtAll 2020-02-25 20:55:00 +01:00
Eike Kettner
fbe0c1aec5 Allow more chars for mimetype 2020-02-20 00:39:31 +01:00