Commit Graph

87 Commits

Author SHA1 Message Date
eikek
c2ff9b82be
Merge pull request #96 from scala-steward/update/tika-core-1.24.1
Update tika-core to 1.24.1
2020-04-21 23:01:27 +02:00
Scala Steward
38dfcd6cd0
Update tika-core to 1.24.1 2020-04-21 22:52:50 +02:00
Scala Steward
46fcc07ce8
Update flexmark, ... to 0.61.16 2020-04-21 22:52:45 +02:00
eikek
7f67693003
Merge pull request #95 from scala-steward/update/flyway-core-6.4.0
Update flyway-core to 6.4.0
2020-04-21 21:43:40 +02:00
Scala Steward
75d423dce5
Update flyway-core to 6.4.0 2020-04-21 20:44:27 +02:00
Scala Steward
61edac7460
Update stanford-corenlp to 4.0.0 2020-04-21 16:29:37 +02:00
Scala Steward
0323045715
Update flexmark, ... to 0.61.12 2020-04-18 08:10:50 +02:00
Scala Steward
d3a7eff939
Update flexmark, ... to 0.61.10 2020-04-17 00:11:20 +02:00
Scala Steward
9c3a0d7320
Update jquery to 3.5.0 2020-04-15 00:16:04 +02:00
Scala Steward
b0692e6826
Update flexmark, ... to 0.61.8 2020-04-14 04:09:29 +02:00
Scala Steward
c9e6496236
Update flexmark, ... to 0.61.6 2020-04-12 08:10:20 +02:00
Scala Steward
eb0733f47a
Update flexmark, ... to 0.61.4 2020-04-11 06:13:03 +02:00
Scala Steward
5de6874cb0
Update flexmark, ... to 0.61.2 2020-04-08 22:10:11 +02:00
Eike Kettner
3ffc3fb317 Fix a classpath issue
Remove javax.activation from the stanford-nlp artifact. There are new
coordinates now, as the java-mail library and the activation library
have been moved to Eclipse Jakarta.
2020-04-07 22:38:25 +02:00
Eike Kettner
1206105f0b Fix several bugs with handling e-mail files
- When converting from html->pdf, the wkhtmltopdf program exits with
  errors if the document contains invalid links. The content is now
  cleaned before being handed to wkhtmltopdf.
- Update the emil library, which fixes a bug when reading mails without
  explicit transfer encoding (8bit)
- Add an info header to converted mails
2020-04-07 22:38:25 +02:00
eikek
12672938a0
Merge pull request #78 from scala-steward/update/calev-core-0.3.0
Update calev-core, calev-fs2 to 0.3.0
2020-04-07 09:17:39 +02:00
Scala Steward
d77ab41da1
Update minitest, minitest-laws to 2.8.2 2020-04-07 00:55:52 +02:00
Scala Steward
8fc09cb92e
Update calev-core, calev-fs2 to 0.3.0 2020-04-07 00:55:37 +02:00
Scala Steward
b86fb4042c
Update flyway-core to 6.3.3 2020-04-06 20:11:55 +02:00
Scala Steward
530a7d89f3
Update doobie-core, doobie-hikari to 0.9.0 2020-04-06 00:15:28 +02:00
Scala Steward
28601e6143
Update minitest, minitest-laws to 2.8.1 2020-04-05 16:11:22 +02:00
Scala Steward
45ba0a40ef
Update flexmark, ... to 0.61.0 2020-04-04 20:09:33 +02:00
Scala Steward
11cc9edbd7
Update http4s-blaze-client, ... to 0.21.3 2020-04-02 08:14:42 +02:00
Scala Steward
cea5f05c70
Update calev-core, calev-fs2 to 0.2.0 2020-04-02 02:10:32 +02:00
Scala Steward
9f4f71d4a6
Update postgresql to 42.2.12 2020-04-01 02:37:29 +02:00
Scala Steward
6d2baba052
Update bitpeace-core to 0.4.5 2020-03-28 16:14:53 +01:00
Scala Steward
fe2b27bd49
Update http4s-blaze-client, ... to 0.21.2 2020-03-25 04:13:16 +01:00
Eike Kettner
0b80572664 Fix encodings for mails with non-utf8 html parts 2020-03-24 23:40:29 +01:00
Scala Steward
519a39c991
Update flyway-core to 6.3.2 2020-03-24 14:17:49 +01:00
Eike Kettner
cf7ccd572c Improve handling encodings
HTML and text files are no longer assumed to be UTF-8. The encoding is
now detected, although detection may not work for all files. The
default/fallback is UTF-8.

There is still a problem with mails that contain HTML parts that are
not UTF-8 encoded. The mail text is always returned as a string and the
original encoding is lost. The HTML is then stored using UTF-8 bytes,
but wkhtmltopdf reads it as latin1. It seems that the `--encoding`
setting doesn't override the encoding provided by the document.
2020-03-23 22:51:28 +01:00
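
The "detect, then fall back to UTF-8" behaviour described in the commit above could look roughly like the following. This is a minimal sketch in Python, not the project's code: the function names `detect_charset` and `decode_text` are hypothetical, and the detection strategy (byte-order mark first, then an HTML `<meta>` charset declaration) is an assumption made for illustration. The commit does not say how the project detects encodings.

```python
import codecs
import re

# Byte-order marks checked first; the codec names strip the BOM on decode.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

# Matches <meta charset="..."> and <meta ... content="...; charset=...">.
_META = re.compile(rb"""<meta[^>]+charset=["']?([A-Za-z0-9_-]+)""", re.IGNORECASE)


def detect_charset(data: bytes, fallback: str = "utf-8") -> str:
    """Guess the charset of HTML/text bytes (hypothetical helper)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    match = _META.search(data[:4096])  # only look at the document head
    if match:
        declared = match.group(1).decode("ascii")
        try:
            return codecs.lookup(declared).name  # normalise, reject unknown names
        except LookupError:
            pass
    # Detection may fail for some files; UTF-8 is the fallback.
    return fallback


def decode_text(data: bytes) -> str:
    """Decode bytes with the detected charset, replacing undecodable bytes."""
    return data.decode(detect_charset(data), errors="replace")
```

Decoding to a string and re-encoding as UTF-8 without also rewriting the `<meta>` declaration would leave the document claiming its old charset, which may be related to the wkhtmltopdf problem the commit message mentions.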
Eike Kettner
3703dce9a6 Update fs2 to 2.3.0 2020-03-20 22:47:09 +01:00
Scala Steward
6477745b77
Update mariadb-java-client to 2.6.0 2020-03-20 09:08:34 +01:00
Scala Steward
736a28bc4f
Update tika-core to 1.24 2020-03-17 16:43:17 +01:00
Scala Steward
b6bed9a629
Update emil-common, emil-javamail to 0.3.0 2020-03-15 22:20:18 +01:00
Scala Steward
0076b3782d
Update flyway-core to 6.3.1 2020-03-13 18:25:20 +01:00
Scala Steward
80a6b920d9
Update postgresql to 42.2.11 2020-03-11 00:19:53 +01:00
Scala Steward
e39a983c96
Update bitpeace-core to 0.4.4 2020-03-10 02:25:12 +01:00
eikek
a07a6ff376
Merge pull request #45 from eikek/feature/background-tasks
Feature/background tasks
2020-03-09 20:59:46 +01:00
Eike Kettner
854a596da3 Integrate periodic tasks
The first use case for periodic tasks is the cleanup of expired
invitation keys. This is part of a house-keeping periodic task.
2020-03-08 22:49:49 +01:00
Eike Kettner
1e598bd902 Sketch a scheduler for running periodic tasks
Periodic tasks are special in that they are usually kept around and
started based on a schedule. A new component checks periodic tasks and
submits them in the queue once they are due.

In order to avoid duplicate periodic jobs, the tracker of a job is
used to store the periodic job id. Each time a periodic task is due,
it is first checked if there is a job running (or queued) for this
task.
2020-03-08 12:55:03 +01:00
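
The duplicate-avoidance idea in the commit above can be sketched as follows. This is an illustrative Python model under stated assumptions, not the project's implementation: the names `Job`, `PeriodicTask`, `Scheduler`, and `check`, the fixed `interval` field, and the state strings are all hypothetical. Only the use of the job's tracker to hold the periodic task id comes from the commit message.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Job:
    name: str
    tracker: str            # stores the id of the periodic task that created the job
    state: str = "waiting"  # assumed states: "waiting", "running", "done"


@dataclass
class PeriodicTask:
    id: str
    interval: float  # seconds between runs (assumption: a fixed interval)
    next_run: float  # timestamp at which the task is due next


@dataclass
class Scheduler:
    queue: List[Job] = field(default_factory=list)

    def has_active_job(self, task_id: str) -> bool:
        """True if a job for this periodic task is still queued or running."""
        return any(
            job.tracker == task_id and job.state in ("waiting", "running")
            for job in self.queue
        )

    def check(self, tasks: List[PeriodicTask], now: float) -> List[Job]:
        """Submit a job for every due task that has no active job yet."""
        submitted = []
        for task in tasks:
            if task.next_run > now:
                continue  # not due yet
            if not self.has_active_job(task.id):
                job = Job(name=f"periodic-{task.id}", tracker=task.id)
                self.queue.append(job)
                submitted.append(job)
            # Reschedule in both cases so the task is checked again later.
            task.next_run = now + task.interval
        return submitted
```

Calling `check` while an earlier job for the same task is still waiting or running submits nothing, so at most one job per periodic task is active at a time.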
Scala Steward
b692b31c9f
Update flyway-core to 6.3.0 2020-03-05 12:18:50 +01:00
Scala Steward
f7483e8476
Update pureconfig to 0.12.3 2020-02-29 16:51:38 +01:00
Scala Steward
43f90cdf32
Update poi, poi-ooxml, poi-scratchpad to 4.1.2 2020-02-27 22:12:23 +01:00
Scala Steward
1ca7deb7bd
Update pdfbox to 2.0.19 2020-02-25 00:10:43 +01:00
Eike Kettner
98f8b1a4b8 Merge remote-tracking branch 'origin/master' into feature/file-types 2020-02-24 23:07:24 +01:00
Scala Steward
f03c893148
Update flyway-core to 6.2.4 2020-02-20 16:16:42 +01:00
Eike Kettner
9b1349734e Convert some files to pdf 2020-02-19 02:03:10 +01:00
Eike Kettner
756f8bcb4c Merge remote-tracking branch 'origin/master' into feature/file-types 2020-02-16 21:53:28 +01:00
Eike Kettner
8143a4edcc Adding extraction primitives 2020-02-16 21:37:26 +01:00
Eike Kettner
851ee7ef0f Reorganize processing code
Use separate modules for

- text extraction
- conversion to pdf
- text analysis
2020-02-15 21:25:25 +01:00