Commit Graph

165 Commits

Author SHA1 Message Date
Eike Kettner
362e1a5e14 Fix compile errors in test code 2020-04-07 23:00:25 +02:00
Eike Kettner
1206105f0b Fix several bugs with handling e-mail files
- When converting from html->pdf, the wkhtmltopdf program exits with
  errors if the document contains invalid links. The content is now
  cleaned before being handed to wkhtmltopdf.
- Update emil library which fixes a bug when reading mails without
  explicit transfer encoding (8bit)
- Add an info header to converted mails
2020-04-07 22:38:25 +02:00
Eike Kettner
63161b5bdf Add docker setup to quickstart 2020-03-31 22:56:51 +02:00
Eike Kettner
efc73c1060 Set version to 0.5.0-SNAPSHOT 2020-03-28 23:52:15 +01:00
Eike Kettner
c77ead3921 Set version to 0.4.0 2020-03-28 21:44:14 +01:00
Eike Kettner
e331808ecf Update microsite 2020-03-28 21:44:14 +01:00
Eike Kettner
6a1297fc95 Add a limit for text analysis 2020-03-27 22:54:49 +01:00
Eike Kettner
14a25fe23e Fix serializing mediatype parameters 2020-03-27 21:50:06 +01:00
Eike Kettner
aed5dfaff6 Fix mimetype extractors 2020-03-27 21:49:55 +01:00
Eike Kettner
75405dbcba Update documentation 2020-03-27 20:16:18 +01:00
Eike Kettner
16edf84752 Setup new site 2020-03-27 00:35:15 +01:00
Eike Kettner
9656ba62f4 scalafmtAll 2020-03-26 18:26:00 +01:00
Eike Kettner
09ea724c13 Store message-id of eml files 2020-03-25 22:00:51 +01:00
Eike Kettner
43efb4e6ba Use doobie support from emil project 2020-03-24 23:40:29 +01:00
Eike Kettner
e305b46708 Extract tnef attachments and fix incomplete html
wkhtmltopdf requires the content encoding to be set correctly in the
document.
2020-03-24 23:40:29 +01:00
Eike Kettner
0b80572664 Fix encodings for mails with non-utf8 html parts 2020-03-24 23:40:29 +01:00
Eike Kettner
cf7ccd572c Improve handling encodings
HTML and text files are no longer assumed to be UTF-8. The encoding is
now detected, which may not work for all files. The fallback is
UTF-8.

There is still a problem with mails that contain html parts not in
utf8 encoding. The mail text is always returned as a string and the
original encoding is lost. Then the html is stored using utf-8 bytes,
but wkhtmltopdf reads it using latin1. It seems that the `--encoding`
setting doesn't override encoding provided by the document.
2020-03-23 22:51:28 +01:00
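
The detect-then-fall-back idea from the commit above can be sketched roughly as follows. This is only an illustration in Python, not the project's implementation: `guess_encoding` is a hypothetical helper, and the heuristics it uses (a byte order mark, then a `charset=` declaration near the start of the file) are assumptions rather than the detection the commit actually uses. UTF-32 byte order marks are not handled.

```python
import codecs
import re

# Byte order marks checked first (assumption: UTF-8 and UTF-16 only).
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

# Matches declarations such as <meta charset="iso-8859-1"> or
# content="text/html; charset=utf-8".
_CHARSET = re.compile(rb'charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE)


def guess_encoding(data: bytes, fallback: str = "utf-8") -> str:
    """Guess the encoding of HTML or text bytes; return `fallback` if unsure."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    match = _CHARSET.search(data[:2048])
    if match:
        declared = match.group(1).decode("ascii")
        try:
            # Only accept names that Python knows how to decode.
            return codecs.lookup(declared).name
        except LookupError:
            pass
    return fallback


html = '<meta charset="iso-8859-1"><p>Grüße</p>'.encode("latin-1")
print(html.decode(guess_encoding(html)))
```

As the commit message notes, detection of this kind cannot work for every file, which is why a fallback is needed.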
Eike Kettner
b265421a46 Allow to use the browser's pdf viewer
The viewerjs library has some limitations. Sometimes PDFs are quite
blurry and some content is displayed scrambled. Switching to the
browser's built-in PDF viewer (for Chromium and Firefox) fixes this. So
while on mobile viewerjs is the only working viewer, for desktop
use it might be desirable to use the browser's built-in viewer instead.
2020-03-22 22:03:43 +01:00
Eike Kettner
75ead33652 Provide a download link to the original archive file 2020-03-22 21:48:49 +01:00
Eike Kettner
7e6eec9533 Include archive infos in item detail 2020-03-22 21:35:50 +01:00
Eike Kettner
cbc95b11e6 Add routes to retrieve the archive of an attachment 2020-03-22 21:21:49 +01:00
Eike Kettner
9a99c852a8 Fix typo in search menu 2020-03-22 21:08:01 +01:00
Eike Kettner
3703dce9a6 Update fs2 to 2.3.0 2020-03-20 22:47:09 +01:00
Eike Kettner
cba466ed47 Set item due date candidate
After processing, set the due date of an item to the first candidate.
The earliest due date is considered best match.
2020-03-20 22:39:09 +01:00
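
The selection rule in the commit above amounts to taking the minimum of the candidate dates. A minimal sketch in Python, where `pick_due_date` and the sample dates are hypothetical and not taken from the project:

```python
from datetime import date
from typing import Iterable, Optional


def pick_due_date(candidates: Iterable[date]) -> Optional[date]:
    """Return the earliest candidate date, or None if there are no candidates."""
    return min(candidates, default=None)


found = [date(2020, 4, 15), date(2020, 3, 31), date(2020, 5, 1)]
print(pick_due_date(found))
```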
Eike Kettner
74a6cf1dd1 Remove unused migration directory 2020-03-19 22:43:41 +01:00
Eike Kettner
b1a1a2b837 Add archives to collective insights 2020-03-19 22:43:18 +01:00
Eike Kettner
d78bd4142c Update documentation 2020-03-19 22:42:58 +01:00
Eike Kettner
439aaee27b Search archives when looking for files via checksum 2020-03-19 22:42:48 +01:00
Eike Kettner
6b1156182c Add support for eml (rfc822 email) files 2020-03-19 22:42:40 +01:00
Eike Kettner
4ed7a137f7 Add support for archive files
Each attachment is now first extracted into potentially multiple ones,
if it is recognized as an archive. This is the first step in
processing. The original archive file is also stored and the resulting
attachments are associated to their original archive.

First support is implemented for zip files.
2020-03-19 22:42:27 +01:00
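
The extraction step described in the commit above might look roughly like the following Python sketch. The `Attachment` type and the `extract` function are hypothetical names for illustration only. The sketch shows the two ideas from the message: a zip attachment expands into several attachments, and each result keeps a reference to the archive it came from.

```python
import io
import zipfile
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Attachment:
    name: str
    data: bytes
    archive_name: Optional[str] = None  # archive this file was extracted from


def extract(attachment: Attachment) -> List[Attachment]:
    """Expand a zip attachment into its entries; pass other files through."""
    buffer = io.BytesIO(attachment.data)
    if not zipfile.is_zipfile(buffer):
        return [attachment]
    buffer.seek(0)
    with zipfile.ZipFile(buffer) as archive:
        return [
            Attachment(info.filename, archive.read(info), archive_name=attachment.name)
            for info in archive.infolist()
            if not info.is_dir()
        ]


# Build a small zip in memory and extract it again.
raw = io.BytesIO()
with zipfile.ZipFile(raw, "w") as z:
    z.writestr("a.txt", "hello")
    z.writestr("b.txt", "world")
for part in extract(Attachment("docs.zip", raw.getvalue())):
    print(part.name, part.archive_name)
```

The original archive stays available as the input `Attachment`, so a caller can store it alongside the extracted files.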
Eike Kettner
10f3d5b7ed Fix bug to select other attachments 2020-03-17 22:37:43 +01:00
Eike Kettner
f0449dd2ce Properly initialize thread pools 2020-03-17 22:37:12 +01:00
Eike Kettner
00ca6b5697 Improve text analysis
- Search for consecutive labels
- Sort list of candidates by a weight
- Search for organizations using person labels
2020-03-17 22:34:50 +01:00
Eike Kettner
718e44a21c Add cleanup jobs task 2020-03-09 20:24:00 +01:00
Eike Kettner
854a596da3 Integrate periodic tasks
The first use case for periodic tasks is the cleanup of expired
invitation keys. This is part of a house-keeping periodic task.
2020-03-08 22:49:49 +01:00
Eike Kettner
616c333fa5 Implement storage routines for periodic scheduler 2020-03-08 13:56:23 +01:00
Eike Kettner
1e598bd902 Sketch a scheduler for running periodic tasks
Periodic tasks are special in that they are usually kept around and
started based on a schedule. A new component checks periodic tasks and
submits them in the queue once they are due.

In order to avoid duplicate periodic jobs, the tracker of a job is
used to store the periodic job id. Each time a periodic task is due,
it is first checked if there is a job running (or queued) for this
task.
2020-03-08 12:55:03 +01:00
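
The duplicate check described in the commit above can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the project's scheduler: `PeriodicTask`, `Job`, `submit_due_tasks`, and the state names are all hypothetical, and a plain list stands in for the job queue.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PeriodicTask:
    task_id: str
    next_run: float  # time (epoch seconds) at which the task becomes due


@dataclass
class Job:
    tracker: str  # stores the id of the periodic task that created the job
    state: str = "queued"  # "queued", "running" or "done"


def submit_due_tasks(tasks: List[PeriodicTask], queue: List[Job], now: float) -> List[str]:
    """Queue a job for every due task that has no queued or running job."""
    submitted = []
    for task in tasks:
        if task.next_run > now:
            continue  # not due yet
        active = any(
            job.tracker == task.task_id and job.state in ("queued", "running")
            for job in queue
        )
        if not active:
            queue.append(Job(tracker=task.task_id))
            submitted.append(task.task_id)
    return submitted


tasks = [PeriodicTask("cleanup", 10.0), PeriodicTask("later", 100.0)]
queue: List[Job] = []
print(submit_due_tasks(tasks, queue, now=50.0))  # first check
print(submit_due_tasks(tasks, queue, now=50.0))  # second check, job still queued
```

Using the tracker as the lookup key means the check needs no extra bookkeeping beyond what each job already carries.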
Eike Kettner
9b28858d06 Create a simple client for joex in its api module
This client can be used within the backend app and later in other
modules. The `OJoex` object is replaced with a better implementation
where the http client is initialized once on app start.
2020-03-03 23:07:49 +01:00
Eike Kettner
42c59179b8 Fix search by checksum to include source files 2020-03-02 20:56:32 +01:00
Eike Kettner
867b59ac10 Fix link in doc menu 2020-03-01 14:08:21 +01:00
Eike Kettner
d8bbcb1409 Fix front-page links for microsite
The links work while testing locally with jekyll. Must be checked at
the published site.
2020-03-01 09:45:38 +01:00
Eike Kettner
b7f2c051f4 Set next version to 0.4.0-SNAPSHOT 2020-02-28 21:19:01 +01:00
Eike Kettner
aa3b9258c4 Set version to 0.3.0 2020-02-28 20:52:39 +01:00
Eike Kettner
3f53779ae4 Change documentation side menu and front 2020-02-28 20:52:39 +01:00
Eike Kettner
ad8d64eded Fix microsite and add changelog 2020-02-27 23:59:03 +01:00
Eike Kettner
1bb464b9ed Extend tools/ds.sh to check for file existence 2020-02-27 20:03:46 +01:00
Eike Kettner
902fd63125 Fix initializing concerned equipment 2020-02-26 20:43:16 +01:00
Eike Kettner
2f87065b2e sbt scalafmtAll 2020-02-25 20:55:00 +01:00
Eike Kettner
c8d090ae28 Remove small notes form field in favor of the new one 2020-02-24 22:34:32 +01:00
Eike Kettner
381de1e198 Show project version in the documentation 2020-02-24 20:59:15 +01:00