Commit Graph

419 Commits

Author SHA1 Message Date
cec4948710 Add pdf meta data to extracted text to add it to full-text index 2020-07-19 01:07:49 +02:00
209c068436 Use keywords in pdfs to search for existing tags
During processing, keywords stored in the PDF metadata are looked up
in the tag database, and any existing tags that match are associated
with the item.

See #175
2020-07-19 00:28:04 +02:00
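
The lookup described in 209c068436 can be sketched roughly as follows. This is a minimal illustration, not the project's code: the function name `match_keywords_to_tags`, the use of plain strings for tags, and the case-insensitive, whitespace-trimmed comparison are all assumptions made for the example.

```python
def match_keywords_to_tags(keywords, existing_tags):
    """Return the existing tags that match a PDF keyword.

    Assumption: matching ignores case and surrounding whitespace.
    Keywords without a matching tag are dropped; no new tags are created.
    """
    # Index the known tags by a normalized name for quick lookup.
    by_name = {tag.lower(): tag for tag in existing_tags}
    matched = []
    for keyword in keywords:
        tag = by_name.get(keyword.strip().lower())
        # Keep each matching tag only once, in keyword order.
        if tag is not None and tag not in matched:
            matched.append(tag)
    return matched


# Hypothetical data: keywords as read from PDF metadata, tags from the database.
found = match_keywords_to_tags(["Invoice", " tax ", "unknown"],
                               ["invoice", "Tax", "Receipt"])
```

Only tags that already exist are returned, which mirrors the commit's wording that existing tags are associated with the item.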
da68405f9b Extract meta data from pdfs using pdfbox 2020-07-18 23:04:46 +02:00
bd20165d1a Use given folder-id when adding initial fts docs 2020-07-18 23:04:01 +02:00
3d49ceaab5 Use ocrmypdf tool to create pdf/a during conversion
- Use another external tool to convert pdf to pdf which also adds the
  extracted text as another layer into the pdf

- Although not used, the external conversion routine will now check
  for an existing text file that is named as the pdf file with extension
  `.txt`. If present it is included in the conversion result and will be
  used as the extracted text.

- Text extraction for pdf files now happens on the converted file,
  because it may already contain the text from the conversion step,
  which avoids running OCR twice.

- Errors during conversion are not fatal; processing continues
  without a converted file.
2020-07-18 17:19:29 +02:00
99210365ce Update documentation for folders 2020-07-17 00:02:25 +02:00
94089fd0b6 Fix decoding joex responses in JoexClient 2020-07-15 20:45:07 +02:00
c697501571 Add folders sql changeset for mariadb 2020-07-14 23:22:52 +02:00
25538d6a59 Allow to set a folder when importing mailboxes 2020-07-14 23:18:39 +02:00
225877a40c Show folder in item detail view 2020-07-14 23:18:39 +02:00
ca5b7b999f Update source form to specify folder 2020-07-14 23:18:39 +02:00
5b01c93711 Add a folder-id to item processing
This allows defining a folder when uploading files. All generated
items are associated with this folder on creation.
2020-07-14 23:18:39 +02:00
ec7f027b4e Fix postgres changeset for folders 2020-07-12 16:15:02 +02:00
259526a088 Organize imports 2020-07-12 13:51:52 +02:00
22fa1dba13 Apply folder restriction to fulltext only search
And update index when folder changes.
2020-07-12 13:50:45 +02:00
aeba4ba913 Refactor full-text migrations and add folder to solr schema 2020-07-12 13:50:14 +02:00
e387b5513f Remove items in non-member folders from sql search results 2020-07-11 22:25:56 +02:00
5b95fddf3d Make item queries depend on the account-id
The user is now also required in order to list items.
2020-07-11 21:54:51 +02:00
e66c501056 Extend dropdown to display additional option info
Use this to display folder information when setting the folder on an
item.
2020-07-11 17:56:08 +02:00
0df541f30a Allow to search by folders 2020-07-11 16:52:13 +02:00
86443e10a6 Set the folder of an item 2020-07-11 12:57:17 +02:00
5bde78083a Hide delete button when creating new folder 2020-07-11 11:54:23 +02:00
2ab0b5e222 Rename space -> folder 2020-07-11 11:54:23 +02:00
0365c1980a Show new data about spaces in web-ui 2020-07-11 01:30:29 +02:00
60a08fc786 Return member count and if current user is owner or member 2020-07-11 01:30:29 +02:00
ea4ab11195 Allow to only return owning spaces 2020-07-11 01:30:28 +02:00
6c304b4e7a Manage spaces in web-ui 2020-07-11 01:30:28 +02:00
752a94a9e2 Implement space operations 2020-07-11 01:30:28 +02:00
0e8c9b1819 Initial outline for managing spaces 2020-07-11 01:30:28 +02:00
d43e17d9fb Transport user-id to client 2020-07-11 01:30:28 +02:00
c12201c4a5 Add routes to manage spaces 2020-07-11 01:30:28 +02:00
7ec0fc2593 Add endpoints for managing spaces to openapi spec 2020-07-11 01:30:28 +02:00
13ad5e3219 Setup space entities 2020-07-11 01:30:28 +02:00
fadd21944f Set version to 0.9.0-SNAPSHOT 2020-06-29 21:04:15 +02:00
8998706598 Set version to 0.8.0 2020-06-29 20:37:52 +02:00
7b922fec94 Update documentation and fix changelog wording 2020-06-29 20:37:52 +02:00
347a029af8 Scalafix organize-imports 2020-06-28 21:20:47 +02:00
5bad157b9e Change link on home page 2020-06-28 19:34:28 +02:00
82104ff148 Update documentation and changelog 2020-06-28 14:45:04 +02:00
d3b3c6289b Prepare docker setup for fulltext search 2020-06-28 13:37:39 +02:00
8500d4d804 Extend consumedir.sh to work with integration endpoint
A single running consumedir script can now upload files to multiple
collectives separately.
2020-06-28 00:08:37 +02:00
41c0f70d3b Fix cancelling jobs
A request to cancel a job was not processed correctly. The cancelling
routine of a task must run, regardless of the (non-final) state. Now
it works like this: if a job is currently running, it is interrupted
and its cancel routine is invoked. It then enters "cancelled" state.
If it is stuck, it is loaded and only its cancel routine is run. If it
is in a final state or waiting, it is removed from the queue.
2020-06-26 23:08:27 +02:00
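
The cancel behaviour described in 41c0f70d3b can be read as a small decision table. The sketch below only illustrates that reading: the `JobState` enum, the names of the final states other than "cancelled", and the action labels are hypothetical and not taken from the project.

```python
from enum import Enum


class JobState(Enum):
    # "waiting", "running", "stuck" and "cancelled" appear in the commit
    # message; "success" and "failed" are assumed names for final states.
    WAITING = "waiting"
    RUNNING = "running"
    STUCK = "stuck"
    SUCCESS = "success"
    FAILED = "failed"
    CANCELLED = "cancelled"


def cancel_actions(state):
    """Return the ordered steps taken when a cancel request arrives."""
    if state is JobState.RUNNING:
        # A running job is interrupted, its cancel routine is invoked,
        # and it then enters the "cancelled" state.
        return ["interrupt", "run_cancel_routine", "set_cancelled"]
    if state is JobState.STUCK:
        # A stuck job is loaded and only its cancel routine is run.
        return ["load_job", "run_cancel_routine"]
    # Waiting jobs and jobs in a final state are removed from the queue.
    return ["remove_from_queue"]
```

In this reading, the cancel routine runs for every non-final state in which the task may already have done work, which is the point the commit message makes.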
23477e34f9 Change columns from timestamp to datetime
In MariaDB, the timestamp type has some properties that make it a
poor fit.
2020-06-26 17:07:00 +02:00
d79ae6233a Restrict proposals for due date
Avoid dates too far in the future.
2020-06-26 16:58:17 +02:00
91da3b149e Reducing default retries to 2
Many errors cannot be recovered from by retrying. There is currently
no way to distinguish these states, so the value is lowered to avoid
long wait times until an item arrives.
2020-06-25 23:57:01 +02:00
50b3554c9a Merge pull request #160 from eikek/fts
Fts
2020-06-25 21:19:42 +00:00
dc8f1a0387 Fix global re-index task to re-create the schema
Otherwise new instances could not be re-indexed.
2020-06-25 23:02:06 +02:00
4a41168bbb Allow a collective to re-index their data
If something goes wrong, this might be necessary.
2020-06-25 21:52:38 +02:00
2a98c2ca42 Fix openapi spec for joex 2020-06-25 08:43:02 +02:00
c81b92af6d Documentation updates 2020-06-25 01:36:26 +02:00