Commit Graph

419 Commits

Author SHA1 Message Date
Eike Kettner
cec4948710 Add pdf meta data to extracted text to add it to full-text index 2020-07-19 01:07:49 +02:00
Eike Kettner
209c068436 Use keywords in pdfs to search for existing tags
During processing, keywords stored in PDF metadata are looked up in
the tag database, and any existing tags are associated with the
item.

See #175
2020-07-19 00:28:04 +02:00
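
The keyword lookup described in the commit above can be sketched roughly as follows. This is only an illustration, not the project's code: the function name `match_existing_tags`, the comma-separated keyword format, and the case-insensitive comparison are all assumptions.

```python
from typing import Iterable, List


def match_existing_tags(keywords: str, existing_tags: Iterable[str]) -> List[str]:
    """Return the existing tags whose names appear among the PDF keywords.

    Assumptions (not confirmed by the commit message): keywords arrive as
    one comma-separated string and names are compared case-insensitively.
    """
    # Index the known tags by lower-cased name for the lookup.
    by_name = {tag.lower(): tag for tag in existing_tags}
    matched: List[str] = []
    for raw in keywords.split(","):
        key = raw.strip().lower()
        # Only tags that already exist are associated; unknown keywords are skipped.
        if key in by_name and by_name[key] not in matched:
            matched.append(by_name[key])
    return matched
```

Keywords without a matching tag are ignored in this sketch, which mirrors the "any existing tags" wording of the message.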
Eike Kettner
da68405f9b Extract meta data from pdfs using pdfbox 2020-07-18 23:04:46 +02:00
Eike Kettner
bd20165d1a Use given folder-id when adding initial fts docs 2020-07-18 23:04:01 +02:00
Eike Kettner
3d49ceaab5 Use ocrmypdf tool to create pdf/a during conversion
- Use another external tool to convert pdf to pdf which also adds the
  extracted text as another layer into the pdf

- Although not used, the external conversion routine will now check
  for an existing text file that is named as the pdf file with extension
  `.txt`. If present it is included in the conversion result and will be
  used as the extracted text.

- Text extraction for pdf files now happens on the converted file,
  because it may already contain the text from the conversion step;
  this avoids running OCR twice.

- Errors during conversion are not fatal; processing continues
  without a converted file.
2020-07-18 17:19:29 +02:00
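
The check for an existing text file next to the converted pdf might look like the minimal sketch below. The helper name `find_sidecar_text` is hypothetical, and it assumes the `.txt` extension replaces `.pdf` (the message could also mean appending it) and that the file is UTF-8 encoded.

```python
from pathlib import Path
from typing import Optional


def find_sidecar_text(pdf_path: Path) -> Optional[str]:
    """Return the text stored next to the pdf, or None if there is none.

    Assumption: "scan.pdf" is paired with "scan.txt" in the same directory.
    """
    txt_path = pdf_path.with_suffix(".txt")
    if txt_path.is_file():
        # If present, this text would be used as the extracted text.
        return txt_path.read_text(encoding="utf-8")
    return None
```

A caller would fall back to regular text extraction when the function returns `None`.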
Eike Kettner
99210365ce Update documentation for folders 2020-07-17 00:02:25 +02:00
Eike Kettner
94089fd0b6 Fix decoding joex responses in JoexClient 2020-07-15 20:45:07 +02:00
Eike Kettner
c697501571 Add folders sql changeset for mariadb 2020-07-14 23:22:52 +02:00
Eike Kettner
25538d6a59 Allow to set a folder when importing mailboxes 2020-07-14 23:18:39 +02:00
Eike Kettner
225877a40c Show folder in item detail view 2020-07-14 23:18:39 +02:00
Eike Kettner
ca5b7b999f Update source form to specify folder 2020-07-14 23:18:39 +02:00
Eike Kettner
5b01c93711 Add a folder-id to item processing
This allows defining a folder when uploading files. All generated
items are associated with this folder on creation.
2020-07-14 23:18:39 +02:00
Eike Kettner
ec7f027b4e Fix postgres changeset for folders 2020-07-12 16:15:02 +02:00
Eike Kettner
259526a088 Organize imports 2020-07-12 13:51:52 +02:00
Eike Kettner
22fa1dba13 Apply folder restriction to fulltext only search
And update index when folder changes.
2020-07-12 13:50:45 +02:00
Eike Kettner
aeba4ba913 Refactor full-text migrations and add folder to solr schema 2020-07-12 13:50:14 +02:00
Eike Kettner
e387b5513f Remove items in non-member folders from sql search results 2020-07-11 22:25:56 +02:00
Eike Kettner
5b95fddf3d Make item queries depend on the account-id
Listing items now requires the user as well.
2020-07-11 21:54:51 +02:00
Eike Kettner
e66c501056 Extend dropdown to display additional option info
Use this to display folder information when setting the folder on an
item.
2020-07-11 17:56:08 +02:00
Eike Kettner
0df541f30a Allow to search by folders 2020-07-11 16:52:13 +02:00
Eike Kettner
86443e10a6 Set the folder of an item 2020-07-11 12:57:17 +02:00
Eike Kettner
5bde78083a Hide delete button when creating new folder 2020-07-11 11:54:23 +02:00
Eike Kettner
2ab0b5e222 Rename space -> folder 2020-07-11 11:54:23 +02:00
Eike Kettner
0365c1980a Show new data about spaces in web-ui 2020-07-11 01:30:29 +02:00
Eike Kettner
60a08fc786 Return member count and if current user is owner or member 2020-07-11 01:30:29 +02:00
Eike Kettner
ea4ab11195 Allow to only return owning spaces 2020-07-11 01:30:28 +02:00
Eike Kettner
6c304b4e7a Manage spaces in web-ui 2020-07-11 01:30:28 +02:00
Eike Kettner
752a94a9e2 Implement space operations 2020-07-11 01:30:28 +02:00
Eike Kettner
0e8c9b1819 Initial outline for managing spaces 2020-07-11 01:30:28 +02:00
Eike Kettner
d43e17d9fb Transport user-id to client 2020-07-11 01:30:28 +02:00
Eike Kettner
c12201c4a5 Add routes to manage spaces 2020-07-11 01:30:28 +02:00
Eike Kettner
7ec0fc2593 Add endpoints for managing spaces to openapi spec 2020-07-11 01:30:28 +02:00
Eike Kettner
13ad5e3219 Setup space entities 2020-07-11 01:30:28 +02:00
Eike Kettner
fadd21944f Set version to 0.9.0-SNAPSHOT 2020-06-29 21:04:15 +02:00
Eike Kettner
8998706598 Set version to 0.8.0 2020-06-29 20:37:52 +02:00
Eike Kettner
7b922fec94 Update documentation and fix changelog wording 2020-06-29 20:37:52 +02:00
Eike Kettner
347a029af8 Scalafix organize-imports 2020-06-28 21:20:47 +02:00
Eike Kettner
5bad157b9e Change link on home page 2020-06-28 19:34:28 +02:00
Eike Kettner
82104ff148 Update documentation and changelog 2020-06-28 14:45:04 +02:00
Eike Kettner
d3b3c6289b Prepare docker setup for fulltext search 2020-06-28 13:37:39 +02:00
Eike Kettner
8500d4d804 Extend consumedir.sh to work with integration endpoint
A single running consumedir script can now upload files to multiple
collectives separately.
2020-06-28 00:08:37 +02:00
Eike Kettner
41c0f70d3b Fix cancelling jobs
A request to cancel a job was not processed correctly. The cancelling
routine of a task must run, regardless of the (non-final) state. Now
it works like this: if a job is currently running, it is interrupted
and its cancel routine is invoked. It then enters "cancelled" state.
If it is stuck, it is loaded and only its cancel routine is run. If it
is in a final state or waiting, it is removed from the queue.
2020-06-26 23:08:27 +02:00
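
The cancel behaviour described in the commit above can be summarised as a small decision function. This is a sketch under assumptions: the state names in `JobState` and the action labels are hypothetical and are not taken from the project's scheduler code.

```python
from enum import Enum
from typing import List


class JobState(Enum):
    # Hypothetical state names; the commit only mentions running, stuck,
    # waiting, "cancelled" and unspecified final states.
    WAITING = "waiting"
    RUNNING = "running"
    STUCK = "stuck"
    SUCCESS = "success"
    FAILED = "failed"
    CANCELLED = "cancelled"


def cancel_actions(state: JobState) -> List[str]:
    """Return the steps taken when a cancel request arrives for a job."""
    if state is JobState.RUNNING:
        # A running job is interrupted, its cancel routine is invoked,
        # and it then enters the cancelled state.
        return ["interrupt", "run_cancel_routine", "mark_cancelled"]
    if state is JobState.STUCK:
        # A stuck job is loaded and only its cancel routine is run.
        return ["load_job", "run_cancel_routine"]
    # Waiting jobs and jobs in a final state are removed from the queue.
    return ["remove_from_queue"]
```

For example, `cancel_actions(JobState.STUCK)` yields the load step followed by the cancel routine, without an interrupt.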
Eike Kettner
23477e34f9 Change columns from timestamp to datetime
In MariaDB, the timestamp type has some properties that make it not a
good fit.
2020-06-26 17:07:00 +02:00
Eike Kettner
d79ae6233a Restrict proposals for due date
Avoid dates too far in the future.
2020-06-26 16:58:17 +02:00
Eike Kettner
91da3b149e Reducing default retries to 2
Many errors cannot be recovered from by retrying. There is currently
no way to distinguish these cases, so the default is now set to a
lower value to avoid long wait times until an item arrives.
2020-06-25 23:57:01 +02:00
mergify[bot]
50b3554c9a Merge pull request #160 from eikek/fts
Fts
2020-06-25 21:19:42 +00:00
Eike Kettner
dc8f1a0387 Fix global re-index task to re-create the schema
Otherwise new instances could not be re-indexed.
2020-06-25 23:02:06 +02:00
Eike Kettner
4a41168bbb Allow a collective to re-index their data
If something goes wrong, this might be necessary.
2020-06-25 21:52:38 +02:00
Eike Kettner
2a98c2ca42 Fix openapi spec for joex 2020-06-25 08:43:02 +02:00
Eike Kettner
c81b92af6d Documentation updates 2020-06-25 01:36:26 +02:00