Commit Graph

258 Commits

Author SHA1 Message Date
25d089da6c Update state and proposals only on invalid items
Invalid items are those that are not ready and have not been shown to
the user. Metadata should only be changed if the item has not already
been shown to the user.
2020-05-23 15:46:24 +02:00
855d4eefa8 Set progress in a linear way between each step 2020-05-23 15:33:58 +02:00
d9782582d8 Use max-mails setting with higher priority
The `mail-chunk-size` is set to its configured value or `max-mails`,
whichever is lower.
2020-05-20 22:44:29 +02:00
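
The rule described in this commit can be sketched as follows. This is only an illustration in Python, not the project's code; the function and parameter names are hypothetical stand-ins for the `mail-chunk-size` and `max-mails` settings.

```python
def effective_chunk_size(mail_chunk_size: int, max_mails: int) -> int:
    """Return the chunk size to use: the configured value, capped by max_mails.

    Both parameter names are assumptions that mirror the settings named
    in the commit message.
    """
    return min(mail_chunk_size, max_mails)
```

With a configured chunk size of 50 and `max-mails` set to 20, the sketch yields 20, so `max-mails` takes priority whenever it is the lower value.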
c0259dba7e Allow enabling the debug flag for javamail 2020-05-20 22:15:25 +02:00
2858d6b853 Notify job executors at the end of the task 2020-05-20 19:44:45 +02:00
31a1abf395 Add server limits to importing mails task 2020-05-20 17:52:38 +02:00
f2d67dc816 Initial impl of import from mailbox user task 2020-05-20 17:52:38 +02:00
852455c610 Add upload operation to task arguments 2020-05-20 17:52:38 +02:00
a4be63fd77 Add stub for scan-mailbox task 2020-05-20 17:52:38 +02:00
d65c1e0d36 Use date from e-mails to set item date 2020-05-17 11:58:51 +02:00
3e10e2175a Sort by weights better and save them 2020-05-17 11:58:51 +02:00
5d6658770e Update emil-common, emil-doobie, ... to 0.6.0 2020-05-17 11:55:53 +02:00
6747a86fea Simplify jsoup sanitizer to reuse from emil 2020-05-14 23:56:08 +02:00
9c882e1be9 Fix package name 2020-05-10 21:03:12 +02:00
bd5066740d Joex depends on backend module
The job executor depends on backend module, since it may control the
application via user tasks. The `ONode` can now be moved from the
store module into the backend module.
2020-05-10 21:03:12 +02:00
c41cdeefec Update scalafmt to 2.5.1 + scalafmtAll 2020-05-04 23:53:57 +02:00
0a1b3fcf95 Set list-id header for notification mails 2020-04-30 21:23:56 +02:00
75a66ecb86 Update http4s to 0.21.4 2020-04-29 01:05:13 +02:00
fa10fe3fae Update scala to 2.13.2 2020-04-24 22:24:31 +02:00
315ea63f44 Improve notify mail template 2020-04-23 23:17:34 +02:00
84e0ebf1a2 Add a flag for restricting overdue items 2020-04-23 21:37:03 +02:00
d52efdfcf0 Improve mail template 2020-04-22 23:41:09 +02:00
ffc1cdee51 Sort due items by their earliest due date 2020-04-22 22:21:28 +02:00
e1f9ae2629 Include links to items in mail template 2020-04-22 21:53:25 +02:00
2723d6b43b Implement notify-due-items task 2020-04-22 21:08:45 +02:00
ad772c0c25 Server-side stub impl for notify-due-items 2020-04-22 21:08:45 +02:00
1206105f0b Fix several bugs with handling e-mail files
- When converting from html->pdf, the wkhtmltopdf program exits with
  errors if the document contains invalid links. The content is now
  cleaned before being handed to wkhtmltopdf.
- Update emil library which fixes a bug when reading mails without
  explicit transfer encoding (8bit)
- Add an info header to converted mails
2020-04-07 22:38:25 +02:00
6a1297fc95 Add a limit for text analysis 2020-03-27 22:54:49 +01:00
9656ba62f4 scalafmtAll 2020-03-26 18:26:00 +01:00
09ea724c13 Store message-id of eml files 2020-03-25 22:00:51 +01:00
e305b46708 Extract tnef attachments and fix incomplete html
wkhtmltopdf requires the content encoding to be set correctly in the
document.
2020-03-24 23:40:29 +01:00
0b80572664 Fix encodings for mails with non-utf8 html parts 2020-03-24 23:40:29 +01:00
cf7ccd572c Improve handling encodings
HTML and text files are no longer assumed to be UTF-8. The encoding is
now detected, which may not work for all files. The default/fallback is
UTF-8.

There is still a problem with mails that contain HTML parts not in
UTF-8 encoding. The mail text is always returned as a string and the
original encoding is lost. Then the HTML is stored using UTF-8 bytes,
but wkhtmltopdf reads it using latin1. It seems that the `--encoding`
setting doesn't override the encoding provided by the document.
2020-03-23 22:51:28 +01:00
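
A minimal Python sketch of "detect, else fall back to UTF-8", together with the symptom described in the second paragraph. It is not the project's implementation: the detection here only looks at byte-order marks (and ignores UTF-32), and all function names are hypothetical.

```python
import codecs

# (BOM, codec name) pairs; a deliberately tiny stand-in for real detection.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def detect_encoding(data: bytes, fallback: str = "utf-8") -> str:
    """Guess the encoding from a byte-order mark, else return the fallback."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return fallback


def decode_text(data: bytes) -> str:
    """Decode bytes with the detected encoding; undecodable bytes are replaced."""
    return data.decode(detect_encoding(data), errors="replace")


def misread_as_latin1(text: str) -> str:
    """Show what a reader sees when UTF-8 bytes are interpreted as latin1."""
    return text.encode("utf-8").decode("latin-1")
```

For example, `misread_as_latin1("ä")` gives `"Ã¤"`, which is the kind of garbling that results when bytes stored as UTF-8 are read as latin1.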
cba466ed47 Set item due date candidate
After processing, set the due date of an item to the first candidate.
The earliest due date is considered best match.
2020-03-20 22:39:09 +01:00
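
As an illustration only (Python, hypothetical names, not the project's code), choosing the earliest candidate as the due date could look like this:

```python
from datetime import date
from typing import Iterable, Optional


def pick_due_date(candidates: Iterable[date]) -> Optional[date]:
    """Return the earliest candidate date, or None if there are no candidates."""
    return min(candidates, default=None)
```

Given candidates 2020-06-01 and 2020-04-15, the sketch selects 2020-04-15; an item without candidates keeps no due date.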
6b1156182c Add support for eml (rfc822 email) files 2020-03-19 22:42:40 +01:00
4ed7a137f7 Add support for archive files
Each attachment is now first extracted into potentially multiple ones,
if it is recognized as an archive. This is the first step in
processing. The original archive file is also stored and the resulting
attachments are associated with their original archive.

First support is implemented for zip files.
2020-03-19 22:42:27 +01:00
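
The extraction step described above can be sketched in Python with the standard `zipfile` module. The `Attachment` type and `extract_archive` function are hypothetical and only illustrate the idea: keep the original archive and link each extracted entry back to it.

```python
import io
import zipfile
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Attachment:
    name: str
    data: bytes
    archive: Optional[str] = None  # name of the archive this entry came from


def extract_archive(att: Attachment) -> List[Attachment]:
    """Return the archive itself plus its entries; non-archives pass through."""
    if not zipfile.is_zipfile(io.BytesIO(att.data)):
        return [att]
    with zipfile.ZipFile(io.BytesIO(att.data)) as zf:
        entries = [
            Attachment(info.filename, zf.read(info), archive=att.name)
            for info in zf.infolist()
            if not info.is_dir()  # skip directory entries
        ]
    # The original archive stays first, followed by the extracted files.
    return [att] + entries
```

A plain PDF passes through as a single attachment, while a zip with two files yields three: the archive and its two entries, each carrying the archive's name.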
f0449dd2ce Properly initialize thread pools 2020-03-17 22:37:12 +01:00
00ca6b5697 Improve text analysis
- Search for consecutive labels

- Sort list of candidates by a weight

- Search for organizations using person labels
2020-03-17 22:34:50 +01:00
718e44a21c Add cleanup jobs task 2020-03-09 20:24:00 +01:00
854a596da3 Integrate periodic tasks
The first use case for periodic tasks is the cleanup of expired
invitation keys. This is part of a house-keeping periodic task.
2020-03-08 22:49:49 +01:00
616c333fa5 Implement storage routines for periodic scheduler 2020-03-08 13:56:23 +01:00
1e598bd902 Sketch a scheduler for running periodic tasks
Periodic tasks are special in that they are usually kept around and
started based on a schedule. A new component checks periodic tasks and
submits them in the queue once they are due.

In order to avoid duplicate periodic jobs, the tracker of a job is
used to store the periodic job id. Each time a periodic task is due,
it is first checked if there is a job running (or queued) for this
task.
2020-03-08 12:55:03 +01:00
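
A rough Python sketch of the duplicate check described in this commit. The `Job`, `JobQueue` and `submit_due` names are hypothetical, and the real scheduler persists its state in a store rather than in memory; the point is only that the job's tracker carries the periodic task id and is consulted before submitting.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Job:
    task: str
    tracker: str           # holds the id of the periodic task
    state: str = "queued"  # "queued", "running" or "done"


@dataclass
class JobQueue:
    jobs: List[Job] = field(default_factory=list)

    def has_active(self, tracker: str) -> bool:
        """True if a queued or running job exists for this tracker."""
        return any(
            j.tracker == tracker and j.state in ("queued", "running")
            for j in self.jobs
        )


def submit_due(queue: JobQueue, due_times: Dict[str, float], now: float) -> List[str]:
    """Submit every periodic task that is due and has no active job.

    due_times maps a periodic task id to the time it next becomes due.
    Returns the ids that were actually submitted.
    """
    submitted = []
    for task_id, due_at in due_times.items():
        if due_at <= now and not queue.has_active(task_id):
            queue.jobs.append(Job(task=task_id, tracker=task_id))
            submitted.append(task_id)
    return submitted
```

Calling `submit_due` twice in a row for the same due task submits it only once; a second job is created only after the first one is no longer queued or running.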
2f87065b2e sbt scalafmtAll 2020-02-25 20:55:00 +01:00
ec419c7bfd Adapt nix modules to new config 2020-02-22 12:40:56 +01:00
3f316ab4d0 Update config file doc 2020-02-20 21:10:00 +01:00
97305d27ff Integrate support for more files into processing and upload
The restriction that only pdf files can be uploaded is removed. All
files can now be uploaded, although processing may not handle every
file type. It is still possible to restrict file uploads by type via
configuration.
2020-02-19 23:27:00 +01:00
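
Restricting uploads by type could be sketched as below. This is a Python illustration under assumptions: the commit does not show the configuration format, so the list of MIME-type patterns and the `upload_allowed` function are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable


def upload_allowed(mime_type: str, allowed_patterns: Iterable[str]) -> bool:
    """Check a MIME type against configured patterns such as "image/*".

    The pattern list is an assumed configuration shape, not the project's.
    """
    mime = mime_type.strip().lower()
    return any(fnmatchcase(mime, pattern.lower()) for pattern in allowed_patterns)
```

With the patterns `["application/pdf", "image/*"]`, a PNG image is accepted and a zip file is rejected; a single `"*"` pattern accepts everything, matching the new default of allowing all files.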
0dcc00836b Make logger configurable in system commands 2020-02-18 12:02:43 +01:00
bd605b8c94 Add first drafts for converting 2020-02-18 01:31:22 +01:00
e0682464b5 Configure pdf extraction; move Logger and DataType to common 2020-02-17 14:01:36 +01:00
3d615181e0 Early draft for text extraction 2020-02-17 01:57:22 +01:00