Commit Graph

136 Commits

Author SHA1 Message Date
Eike Kettner
4ed7a137f7 Add support for archive files
Each attachment is now first extracted into potentially multiple ones,
if it is recognized as an archive. This is the first step in
processing. The original archive file is also stored and the resulting
attachments are associated with their original archive.

First support is implemented for zip files.
2020-03-19 22:42:27 +01:00
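
The extraction step described in this commit could look roughly like the following. This is a minimal sketch in Python, not the project's implementation: the `Attachment` type and the `expand` function are hypothetical names, and only zip archives are recognized, as in the commit message.

```python
import io
import zipfile
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Attachment:
    name: str
    data: bytes
    # Name of the archive this file was extracted from (hypothetical field);
    # None for files that were uploaded directly.
    archive: Optional[str] = None


def expand(name: str, data: bytes) -> List[Attachment]:
    """Return the files contained in a zip archive, or the file itself."""
    buf = io.BytesIO(data)
    if not zipfile.is_zipfile(buf):
        # Not an archive: the attachment passes through unchanged.
        return [Attachment(name, data)]
    extracted = []
    with zipfile.ZipFile(buf) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # directory entries carry no content
            extracted.append(Attachment(info.filename, zf.read(info), archive=name))
    return extracted
```

The caller is assumed to store the original archive bytes separately, so that each extracted attachment can point back to it through `archive`.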
Eike Kettner
10f3d5b7ed Fix bug to select other attachments 2020-03-17 22:37:43 +01:00
Eike Kettner
f0449dd2ce Properly initialize thread pools 2020-03-17 22:37:12 +01:00
Eike Kettner
00ca6b5697 Improve text analysis
- Search for consecutive labels
- Sort list of candidates by a weight
- Search for organizations using person labels
2020-03-17 22:34:50 +01:00
Eike Kettner
718e44a21c Add cleanup jobs task 2020-03-09 20:24:00 +01:00
Eike Kettner
854a596da3 Integrate periodic tasks
The first use case for periodic tasks is the cleanup of expired
invitation keys. This is part of a house-keeping periodic task.
2020-03-08 22:49:49 +01:00
Eike Kettner
616c333fa5 Implement storage routines for periodic scheduler 2020-03-08 13:56:23 +01:00
Eike Kettner
1e598bd902 Sketch a scheduler for running periodic tasks
Periodic tasks are special in that they are usually kept around and
started based on a schedule. A new component checks periodic tasks and
submits them in the queue once they are due.

In order to avoid duplicate periodic jobs, the tracker of a job is
used to store the periodic job id. Each time a periodic task is due,
it is first checked if there is a job running (or queued) for this
task.
2020-03-08 12:55:03 +01:00
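
The duplicate check described in this commit can be illustrated as follows. This is a sketch under stated assumptions, written in Python rather than the project's own code: `Job`, `PeriodicTask`, `JobQueue`, and `submit_due` are hypothetical names, the schedule is reduced to a fixed interval, and advancing `next_run` even when a submission is skipped is an assumption the commit message does not confirm.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    task: str
    tracker: str           # stores the id of the periodic task that created it
    state: str = "queued"  # "queued", "running" or "done"


@dataclass
class PeriodicTask:
    id: str
    next_run: float


class JobQueue:
    def __init__(self) -> None:
        self.jobs: List[Job] = []

    def has_active(self, tracker: str) -> bool:
        # A job counts as active while it is queued or running.
        return any(
            j.tracker == tracker and j.state in ("queued", "running")
            for j in self.jobs
        )

    def submit(self, job: Job) -> None:
        self.jobs.append(job)


def submit_due(tasks: List[PeriodicTask], queue: JobQueue,
               now: float, interval: float) -> List[str]:
    """Submit a job for each due task unless one is already queued or running."""
    submitted = []
    for task in tasks:
        if task.next_run > now:
            continue  # not due yet
        if not queue.has_active(task.id):
            queue.submit(Job(task=task.id, tracker=task.id))
            submitted.append(task.id)
        # Assumption: the task is rescheduled whether or not a job was submitted.
        task.next_run = now + interval
    return submitted
```

Because the tracker holds the periodic task id, a second check while the first job is still queued or running submits nothing; a new job is only created once the earlier one has finished.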
Eike Kettner
9b28858d06 Create a simple client for joex in its api module
This client can be used within the backend app and later in other
modules. The `OJoex` object is replaced with a better implementation
where the http client is initialized once on app start.
2020-03-03 23:07:49 +01:00
Eike Kettner
42c59179b8 Fix search by checksum to include source files 2020-03-02 20:56:32 +01:00
Eike Kettner
867b59ac10 Fix link in doc menu 2020-03-01 14:08:21 +01:00
Eike Kettner
d8bbcb1409 Fix front-page links for microsite
The links work while testing locally with jekyll. Must be checked at
the published site.
2020-03-01 09:45:38 +01:00
Eike Kettner
b7f2c051f4 Set next version to 0.4.0-SNAPSHOT 2020-02-28 21:19:01 +01:00
Eike Kettner
aa3b9258c4 Set version to 0.3.0 2020-02-28 20:52:39 +01:00
Eike Kettner
3f53779ae4 Change documentation side menu and front 2020-02-28 20:52:39 +01:00
Eike Kettner
ad8d64eded Fix microsite and add changelog 2020-02-27 23:59:03 +01:00
Eike Kettner
1bb464b9ed Extend tools/ds.sh to check for file existence 2020-02-27 20:03:46 +01:00
Eike Kettner
902fd63125 Fix initializing concerned equipment 2020-02-26 20:43:16 +01:00
Eike Kettner
2f87065b2e sbt scalafmtAll 2020-02-25 20:55:00 +01:00
Eike Kettner
c8d090ae28 Remove small notes form field in favor of the new one 2020-02-24 22:34:32 +01:00
Eike Kettner
381de1e198 Show project version in the documentation 2020-02-24 20:59:15 +01:00
Eike Kettner
25c3f2b541 Add more explaining tooltips 2020-02-24 15:18:42 +01:00
Eike Kettner
478797e2a4 Add a help link to the main menu 2020-02-24 15:11:58 +01:00
Eike Kettner
36093c5d52 Add reverse proxy doc 2020-02-24 15:11:50 +01:00
Eike Kettner
cc16b0c024 Fix query to also work with mariadb 2020-02-24 13:34:54 +01:00
Eike Kettner
5f32eadaba Fix dropdown in source create view 2020-02-23 23:01:48 +01:00
Eike Kettner
661cc3e65f Fix deleting attachments (again) 2020-02-23 20:18:13 +01:00
Eike Kettner
d937e0501a Add source files to collective insights 2020-02-23 20:17:53 +01:00
Eike Kettner
be8eacdbe9 Display full attachment name in title tooltip 2020-02-23 14:36:19 +01:00
Eike Kettner
1f431c3222 Make link to original file enabled if files are different 2020-02-23 14:33:22 +01:00
Eike Kettner
957073fe62 Return info about original files in item detail
This adds data to the current rest api.
2020-02-23 14:25:32 +01:00
Eike Kettner
ec419c7bfd Adapt nix modules to new config 2020-02-22 12:40:56 +01:00
Eike Kettner
74a037887d Fix deleting items and attachments to also remove the binary files 2020-02-22 00:54:55 +01:00
Eike Kettner
8cfecfb3dd Update docs 2020-02-22 00:48:58 +01:00
Eike Kettner
98576a5fb5 Add link to original file 2020-02-20 22:40:27 +01:00
Eike Kettner
72fd3b1a25 Implement downloading original file 2020-02-20 22:33:57 +01:00
Eike Kettner
39809f9d05 Sketch route for retrieving original file 2020-02-20 22:12:27 +01:00
Eike Kettner
7fe8843893 Update documentation sites 2020-02-20 21:43:37 +01:00
Eike Kettner
3f316ab4d0 Update config file doc 2020-02-20 21:10:00 +01:00
Eike Kettner
fbe0c1aec5 Allow more chars for mimetype 2020-02-20 00:39:31 +01:00
Eike Kettner
97305d27ff Integrate support for more files into processing and upload
The restriction that only pdf files can be uploaded is removed. All
files can now be uploaded, although processing may not handle every
file type. It is still possible to restrict file uploads by type via
the configuration.
2020-02-19 23:27:00 +01:00
Eike Kettner
9b1349734e Convert some files to pdf 2020-02-19 02:03:10 +01:00
Eike Kettner
5869e2ee6e Streamline extern-conv stdin/infile 2020-02-18 12:43:47 +01:00
Eike Kettner
0dcc00836b Make logger configurable in system commands 2020-02-18 12:02:43 +01:00
Eike Kettner
bd605b8c94 Add first drafts for converting 2020-02-18 01:31:22 +01:00
Eike Kettner
c665c212a0 Early draft for running wkhtmltopdf 2020-02-17 14:02:23 +01:00
Eike Kettner
e0682464b5 Configure pdf extraction; move Logger and DataType to common 2020-02-17 14:01:36 +01:00
Eike Kettner
3d615181e0 Early draft for text extraction 2020-02-17 01:57:22 +01:00
Eike Kettner
8143a4edcc Adding extraction primitives 2020-02-16 21:37:26 +01:00
Eike Kettner
851ee7ef0f Reorganize processing code
Use separate modules for

- text extraction
- conversion to pdf
- text analysis
2020-02-15 21:25:25 +01:00