Commit Graph

11 Commits

Author SHA1 Message Date
Eike Kettner
3d49ceaab5 Use ocrmypdf tool to create pdf/a during conversion
- Use another external tool to convert PDF to PDF; it also adds the
  extracted text as an additional layer to the PDF.

- Although currently unused, the external conversion routine now checks
  for an existing text file that has the same name as the PDF file with
  the extension `.txt`. If present, it is included in the conversion
  result and is used as the extracted text.

- Text extraction for PDF files now happens on the converted file,
  because it may already contain the text from the conversion step;
  this avoids running OCR twice.

- Errors during conversion are never fatal; processing continues
  without a converted file.
2020-07-18 17:19:29 +02:00
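The behaviour described in commit 3d49ceaab5 can be illustrated with a minimal sketch. It is not the project's code: `convert_pdf`, `ConversionResult`, `cmd_for` and `runner` are hypothetical names, the external command line is supplied by the caller rather than assumed, and the sidecar file is assumed to replace the `.pdf` suffix with `.txt` (the commit message does not say whether the extension is replaced or appended).

```python
import subprocess
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional, Sequence


@dataclass
class ConversionResult:
    pdf: Optional[Path]   # None when the conversion failed
    text: Optional[str]   # None when no sidecar text file was found


def run_command(cmd: Sequence[str]) -> None:
    """Run an external tool and raise if it exits with a non-zero status."""
    subprocess.run(list(cmd), check=True)


def convert_pdf(
    src: Path,
    out: Path,
    cmd_for: Callable[[Path, Path], Sequence[str]],
    runner: Callable[[Sequence[str]], None] = run_command,
) -> ConversionResult:
    """Convert src to out with an external tool; failures are not fatal."""
    try:
        runner(cmd_for(src, out))
    except (OSError, subprocess.CalledProcessError):
        # Conversion errors are swallowed: the caller continues
        # without a converted file.
        return ConversionResult(pdf=None, text=None)

    if not out.is_file():
        return ConversionResult(pdf=None, text=None)

    # Assumption: the sidecar is "<name>.txt" next to "<name>.pdf".
    sidecar = out.with_suffix(".txt")
    text = sidecar.read_text(encoding="utf-8") if sidecar.is_file() else None
    return ConversionResult(pdf=out, text=text)
```

A caller would run its own text extraction on `result.pdf` only when `result.text` is `None`, and fall back to the original file when `result.pdf` is `None`.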
Eike Kettner
7b922fec94 Update documentation and fix changelog wording 2020-06-29 20:37:52 +02:00
Eike Kettner
d5c9923a6d Add a route that only searches the full-text index
It returns the results in the same order as received from the index to
preserve the relevance ordering.
2020-06-24 00:03:17 +02:00
Eike Kettner
c7f598e3b0 Initial module setup 2020-06-17 23:20:46 +02:00
Eike Kettner
e331808ecf Update microsite 2020-03-28 21:44:14 +01:00
Eike Kettner
d78bd4142c Update documentation 2020-03-19 22:42:58 +01:00
Eike Kettner
854a596da3 Integrate periodic tasks
The first use case for periodic tasks is the cleanup of expired
invitation keys. This is part of a periodic house-keeping task.
2020-03-08 22:49:49 +01:00
Eike Kettner
8143a4edcc Adding extraction primitives 2020-02-16 21:37:26 +01:00
Eike Kettner
919381be1e More research on how to create pdfs from other files 2020-02-15 13:57:21 +01:00
Eike Kettner
3026f199f7 Some research on pdf conversion 2020-02-11 22:41:44 +01:00
Eike Kettner
57e274e2b0 Upgrade microsite 2019-12-30 02:33:46 +01:00