Commit Graph

281 Commits

Author SHA1 Message Date
eikek
733096f979 Improve querying item results
The join to retrieve the attachment count per item turned out to be
very expensive. And it is not used anymore for the result, only to
support fulltext search. It is now removed from the query. The
DISTINCT keyword is also removed, because it is not necessary and it
is expensive. With the DISTINCT removed, a new index (provided in the
previous commit) can now be used to avoid sorting items.
2021-08-29 11:09:06 +02:00
eikek
cf88f5c2de Allow to specify ordering when retrieving meta data
The query now searches in more fields. For example, when getting a
list of tags, the query is applied to the tag name *and* category.
When listing persons, the query now also looks in the associated
organization name.

This has been used to make some headers in the meta data tables
clickable to sort the list accordingly.

Refs: #965, #538
2021-08-27 23:04:54 +02:00
eikek
993a391c13 Add the attachment-only option to a source
The upload request can now contain a boolean for importing only
attachments when e-mails are uploaded. This option is now also added
to a source url.

Refs: #983
2021-08-23 14:19:11 +02:00
mergify[bot]
45f6357f49
Merge pull request #1024 from eikek/enhance-search-mode
Enhance search mode to search in all items
2021-08-21 13:56:43 +00:00
eikek
d5022f883e Enhance search mode to search in all items 2021-08-21 15:45:14 +02:00
eikek
751fa3da5a Add attachments-only filter to uploads
When uploading a file which is an e-mail, this option allows to skip
the mail body when the file is being processed.
2021-08-21 13:49:12 +02:00
Scala Steward
e4fecefaea
Reformat with scalafmt 3.0.0 2021-08-19 08:50:30 +02:00
eikek
c7c488f0cc Fix position of merged attachments 2021-08-16 15:05:26 +02:00
eikek
a923351b09 Fix item merge when merging tags and text fields
Also hard delete the remaining items. They are empty (have no
attachments), because data is moved if possible. It doesn't make much
sense to keep them, because restoring them isn't very useful.
2021-08-16 14:40:52 +02:00
eikek
85085ec173 Implement item merge 2021-08-16 12:32:54 +02:00
eikek
14e4a8f792 Fixup for deleting items
First, when checking for existence of a file, deleted items are not
considered.

The handling of fulltext search has been changed: deleted items are
removed from fulltext index and are re-added when they are restored.
The fulltext index currently doesn't hold the item state and it would
mean much more work to introduce it into the index (or, worse, to
reprocess the results from the index). Thus, deleted items can only be
searched using database queries. It is probably a very rare use case
when fulltext search should be applied to deleted items. They have
been deleted for a reason and the most likely case is that they are
simply removed.

Refs: #347
2021-08-15 16:00:30 +02:00
eikek
f4a2b86ea8 Use a minimum age of items to remove
In order to keep deleted items for a while, the periodic task can now
use a duration to only remove items with a certain age. This can be
used to ensure that a deleted item stays at least X days before it
will be removed from the database.

Refs: #347
2021-08-15 12:32:50 +02:00
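The minimum-age rule from the commit above can be sketched roughly as follows. This is only an illustration in Python, not the project's code: `items_to_remove`, the `(item_id, deleted_at)` pair layout and the sample data are hypothetical, and it assumes the age is measured from the time of deletion, which the commit message does not state.

```python
from datetime import datetime, timedelta, timezone


def items_to_remove(deleted_items, min_age, now):
    """Return ids of soft-deleted items that are at least `min_age` old.

    `deleted_items` is an iterable of (item_id, deleted_at) pairs
    (hypothetical layout, assumed for this sketch).
    """
    cutoff = now - min_age
    return [item_id for item_id, deleted_at in deleted_items if deleted_at <= cutoff]


now = datetime(2021, 8, 15, 12, 0, tzinfo=timezone.utc)
items = [
    ("old", now - timedelta(days=10)),
    ("recent", now - timedelta(days=2)),
]
# Only the item deleted more than 7 days ago qualifies for removal.
print(items_to_remove(items, timedelta(days=7), now))
```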
eikek
31d885ed79 Refactor user tasks to support collective and user scopes
Before, periodic tasks were run per collective rather than per user by
making sure that submitter and group had the same value. This is now
encoded in `UserTaskScope`, so it is obvious and errors can be
reduced when using this.
2021-08-14 22:07:56 +02:00
eikek
27fd7a5867 Make sure the empty-trash task is started for all collectives 2021-08-14 20:40:04 +02:00
eikek
50706c3d6d Add a task implementation to delete items 2021-08-14 19:33:18 +02:00
eikek
4901276c66 Change "empty trash" settings for a collective and submit the job 2021-08-14 19:33:15 +02:00
eikek
f999662905 Add routes to restore deleted items 2021-08-14 16:48:03 +02:00
eikek
edb344314f Use an enum instead of a boolean to differentiate search
It's not very likely that there will be more search modes besides
normal and trashed, but I have been surprised that way quite often,
and it's nicer this way anyway.
2021-08-14 15:11:48 +02:00
eikek
a7b74bd5ae Allow to search in soft-deleted items
A new query/request parameter can be used to apply a search to only
soft-deleted items.

The query expression `Trashed` has been introduced which selects only
items with state `Deleted`. This is another option, analogous to
`ValidItemStates` (both cannot be used together as they would select
no items). This new query node is not added to the parser, because
users may not use it in their own queries - it must be part of the
"fixed" query so the application can control in which subset to search
(it would otherwise be possible to select any items).
2021-08-14 14:53:05 +02:00
eikek
cb777e30c0 Delete items by introducing a deleted state
When deleting items via the http api, they are not deleted anymore but
a new status "Deleted" is set. The collective insights now contain a
separate count for deleted items.
2021-08-14 14:18:03 +02:00
eikek
48d13a35fc Fix search summary to restrict on valid items 2021-08-14 14:09:07 +02:00
eikek
fcef52856a Allow tag ids or tag names when replacing tags 2021-07-25 21:26:22 +02:00
eikek
916217df4f Make convert-all-pdfs an admin endpoint 2021-07-25 01:25:24 +02:00
eikek
1913877de1 The id must be recreated for each job, obviously
Fixes: #938
2021-07-16 21:14:47 +02:00
eikek
f7eed33545 Return a 404 if a source was not found when checking a file 2021-07-08 21:17:48 +02:00
eikek
1120434cd9 Replace generating preview images with an admin endpoint
It doesn't make much sense to have this per collective, because this
is triggered by an admin after changing the server config file. So it
is now implemented as an admin endpoint that affects all files.
2021-07-04 21:37:34 +02:00
eikek
8e5c88fd32 Add copyright header to source files 2021-07-04 10:57:53 +02:00
eikek
bd791b4593 Upgrade code base to CE3 2021-06-22 22:53:34 +02:00
Eike Kettner
25788a0b23 Add routes for storing/retrieving client settings 2021-05-27 21:34:05 +02:00
Eike Kettner
a1a93e5ca6 Fixes searching items with fulltext
When using fulltext only search, then only the index must be searched.
This wasn't working anymore, because the routes added a query to
always select valid items (those not being processed). But this led
the downstream code to always consult the database, too. Since the
routes are using a "simple-search" interface, this is now adding the
valid-state condition if applicable. There are still more low-level
interfaces that can be used when searching should be done differently.

Closes: #823
2021-05-23 14:14:25 +02:00
Stefan Scheidewig
558197e415 Fixed the imports 2021-04-15 20:49:34 +02:00
Stefan Scheidewig
fa34312020 Implemented endpoint to delete multiple attachments 2021-04-15 18:05:01 +02:00
Eike Kettner
3e0914ece7 Correctly count tag categories
If multiple tags of the same category are applied to the same item,
just summing tag counts will produce the wrong results as now items
are counted multiple times.
2021-04-11 14:34:44 +02:00
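The counting problem described in the commit above can be illustrated with a small Python sketch. The row layout `(item_id, tag_name, category)`, the sample data and both function names are hypothetical, and the project's actual query is not shown here.

```python
from collections import defaultdict

# Hypothetical sample: item1 carries two tags of the same category.
item_tags = [
    ("item1", "invoice", "doctype"),
    ("item1", "receipt", "doctype"),
    ("item2", "invoice", "doctype"),
]


def count_by_summing_tags(rows):
    """Naive approach: count per tag, then sum the tag counts per category."""
    tag_counts = defaultdict(int)
    for _item, tag, category in rows:
        tag_counts[(tag, category)] += 1
    totals = defaultdict(int)
    for (_tag, category), n in tag_counts.items():
        totals[category] += n
    return dict(totals)


def count_distinct_items(rows):
    """Count each item at most once per category."""
    items = defaultdict(set)
    for item, _tag, category in rows:
        items[category].add(item)
    return {category: len(ids) for category, ids in items.items()}


print(count_by_summing_tags(item_tags))  # item1 is counted twice
print(count_distinct_items(item_tags))   # item1 is counted once
```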
Eike Kettner
c36073b852 Allow to give human readable summary to user tasks 2021-03-27 22:13:13 +01:00
Eike Kettner
cc38b850a6 Remove deprecated search routes and some refactoring 2021-03-27 22:13:13 +01:00
Eike Kettner
a7ee0aa08b Add a flag to processing task to distinguish re-/processing 2021-03-12 00:45:23 +01:00
Eike Kettner
7b1ec97c97 Fix sort when using fulltext only 2021-03-08 00:47:15 +01:00
Eike Kettner
63d146c2de Resolve fulltext search queries the same way as before
For now, fulltext search is only possible when being the only term or
inside the root AND expression.
2021-03-07 09:40:47 +01:00
Eike Kettner
dadab0d308 Implement search by query in endpoints 2021-03-01 15:31:02 +01:00
Eike Kettner
698ff58aa3 Provide a more convenient interface to search 2021-03-01 11:50:07 +01:00
Eike Kettner
9013d9264e Add more convenient date parsers and some basic macros 2021-03-01 00:51:01 +01:00
Eike Kettner
e9ed998e3a Basic poc to search via custom query 2021-03-01 00:51:01 +01:00
Eike Kettner
186014a1c6 Refactor search to separate between a base query and user query
The `findBase` is adding only strictly required conditions. Everything
else comes from the user.
2021-03-01 00:51:01 +01:00
Eike Kettner
7ef3185659 Add language to a source
Allows to define upload urls for different languages.
2021-02-18 23:34:42 +01:00
Eike Kettner
4cba96f390 Always return classifier results as suggestion
The classifier results are spliced into the suggestion list at second
place. When linking they are only used if nlp didn't find anything.
2021-01-21 21:05:28 +01:00
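One possible reading of the commit above, sketched in Python. The function names and the list-of-strings representation are hypothetical, and "second place" is assumed to mean index 1 of the suggestion list.

```python
def splice_classifier_results(nlp_suggestions, classifier_results):
    """Insert the classifier results after the first NLP suggestion."""
    return nlp_suggestions[:1] + classifier_results + nlp_suggestions[1:]


def pick_for_linking(nlp_suggestions, classifier_results):
    """Prefer an NLP result; fall back to the classifier only if NLP found nothing."""
    if nlp_suggestions:
        return nlp_suggestions[0]
    if classifier_results:
        return classifier_results[0]
    return None


nlp = ["Acme Corp", "Beta Ltd"]
classifier = ["Gamma Inc"]
print(splice_classifier_results(nlp, classifier))
print(pick_for_linking(nlp, classifier))
print(pick_for_linking([], classifier))
```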
Eike Kettner
668abf2140 Add a reset-password admin route 2021-01-04 20:59:31 +01:00
Eike Kettner
0cfd8974d3 Add a flag to imap settings to enable/disable oauth2 scheme 2021-01-04 11:03:04 +01:00
Eike Kettner
97dfcece97 Fix duplicate check on restarts
Issue: #530
2021-01-02 21:18:05 +01:00
Eike Kettner
6346bf6a34 Add summary for fulltext searches 2020-12-17 00:11:33 +01:00
Eike Kettner
77627534bc Improve on basic search summary 2020-12-15 23:37:02 +01:00
Eike Kettner
4ca6dfccae Get basic search summary 2020-12-15 23:10:13 +01:00
Eike Kettner
2dff686fa0 Introduce unit condition 2020-12-15 21:03:47 +01:00
Eike Kettner
80406cabc2 Refactoring some code into separate files 2020-12-15 21:03:47 +01:00
Eike Kettner
266fec9eb5 Convert find items query 2020-12-15 21:03:46 +01:00
Eike Kettner
a355767fdb Convert all query libs besides QItem 2020-12-15 21:03:46 +01:00
Eike Kettner
e3f6892abd Convert job record 2020-12-15 21:03:46 +01:00
Eike Kettner
0b6f965fcb Fix rememberme for missing local storage 2020-12-04 22:57:21 +01:00
Eike Kettner
a0642905db Use remember-me cookie if present 2020-12-04 17:59:25 +01:00
Eike Kettner
c10c1fad72 Prepare remember-me authentication variant 2020-12-04 17:59:25 +01:00
Eike Kettner
fc2668feee Allow to connect a person to an organization 2020-12-01 23:39:45 +01:00
Eike Kettner
7052bc6b8e Add cc and bcc to item mail 2020-11-28 01:36:59 +01:00
Eike Kettner
0919eec3c2 Improve field query and fix mariadb's pickiness with parens
If no query is given, don't search with `like '%'`. MariaDB doesn't
want parens around columns in the GROUP BY clause.
2020-11-25 21:08:49 +01:00
Eike Kettner
a18ac17f0c Search with wildcards for custom fields 2020-11-24 21:44:27 +01:00
Eike Kettner
5fe532001b Allow to specify document language with the request 2020-11-23 20:49:01 +01:00
Eike Kettner
066c856981 Allow to search for custom field values 2020-11-22 21:41:09 +01:00
Eike Kettner
1aefff37aa Return custom field values with item details 2020-11-22 21:41:09 +01:00
Eike Kettner
af1cca7d83 Fix condition for deleting custom field value 2020-11-22 21:41:09 +01:00
Eike Kettner
8d35d100d6 Change custom fields for multiple items 2020-11-22 21:41:09 +01:00
Eike Kettner
93295d63a5 Change custom field values for a single item 2020-11-22 21:41:09 +01:00
Eike Kettner
62313ab03a Add and change custom fields 2020-11-22 21:41:09 +01:00
Eike Kettner
248ad04dd0 Prepare custom fields 2020-11-22 21:41:09 +01:00
Eike Kettner
04ba14f802 Amend source form with tags and file-filter
Allow to define tags and a file filter per source.
2020-11-12 22:37:28 +01:00
Eike Kettner
4fd6e02ec0 Improve glob and filter archive entries 2020-11-11 21:01:23 +01:00
Eike Kettner
55a6f7aaf6 Add more properties to upload meta data 2020-11-11 21:01:23 +01:00
Eike Kettner
29455d638c Add startup task to find page counts of existing files 2020-11-09 20:35:35 +01:00
Eike Kettner
f4e50c5229 Provide endpoints to submit tasks to re-generate previews
The scaling factor can be given in the config file. When this changes,
images can be regenerated via POSTing to certain endpoints. It is
possible to regenerate just one attachment preview or all within a
collective.
2020-11-09 09:00:02 +01:00
Eike Kettner
757ad31165 Add a route to get the item preview
This is the first available preview of an attachment wrt position. If
all attachments have a preview image, the preview of the first
attachment is returned.
2020-11-08 15:12:56 +01:00
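The selection rule from the commit above, as a minimal Python sketch. `item_preview` and the `(position, preview)` pair layout are hypothetical; a missing preview is represented as `None`.

```python
def item_preview(attachments):
    """Return the preview of the first attachment (by position) that has one.

    `attachments` is a list of (position, preview_or_None) pairs.
    Returns None if no attachment has a preview.
    """
    for _position, preview in sorted(attachments, key=lambda a: a[0]):
        if preview is not None:
            return preview
    return None


# The attachment at position 0 has no preview, so position 1 is used.
print(item_preview([(2, "c.png"), (0, None), (1, "b.png")]))
```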
Eike Kettner
d376ef3ef1 Add simple route to get the preview image 2020-11-08 13:33:39 +01:00
Eike Kettner
f4c79c72ae Allow to remove tags from multiple items 2020-10-31 14:42:17 +01:00
Eike Kettner
998aad5627 Delete multiple items 2020-10-26 14:46:04 +01:00
Eike Kettner
9193d7ca51 Send multiple items to reprocessing 2020-10-26 14:03:56 +01:00
Eike Kettner
26e89bf84e Edit org/person/equipment of multiple items 2020-10-26 13:35:47 +01:00
Eike Kettner
2e6026b817 Edit dates of multiple items 2020-10-26 13:16:03 +01:00
Eike Kettner
d4043634ac Edit direction of multiple items 2020-10-26 12:48:15 +01:00
Eike Kettner
42c989a6cd Edit folder of multiple items 2020-10-26 12:39:44 +01:00
Eike Kettner
17472fa4ca Edit name of multiple items 2020-10-26 12:17:55 +01:00
Eike Kettner
7ad37c8d26 Editing tags for multiple items 2020-10-26 11:54:04 +01:00
Eike Kettner
f6f63000be Prepend a duplicate check when uploading files 2020-09-23 23:37:00 +02:00
Eike Kettner
06879456a6 Change job priority on queue page 2020-09-05 18:50:58 +02:00
Eike Kettner
f9fcee81a5 Add start-now button for train-classifier task 2020-09-02 21:22:22 +02:00
Eike Kettner
68bb65572b Integrate learn-classifier task into the app 2020-09-02 18:28:14 +02:00
Eike Kettner
8c4f2e702b Add classifier settings 2020-09-02 18:28:14 +02:00
Eike Kettner
3986487f11 Add api docs and cleanup 2020-08-13 21:22:54 +02:00
Eike Kettner
081c4da903 Add a route to trigger the convert-all-pdf task for a collective 2020-08-13 01:06:13 +02:00
Eike Kettner
69674eb485 Improve job-queue query to make sure jobs across all states show up 2020-08-13 01:06:13 +02:00
Eike Kettner
07e9a9767e Add a task to re-process files of an item 2020-08-12 22:29:56 +02:00
Eike Kettner
098e4cf868 Fix uploading to enabled/disabled source endpoints 2020-08-09 09:21:23 +02:00
Eike Kettner
43946ed347 Fail early when source id is wrong 2020-08-08 18:43:18 +02:00
Eike Kettner
06ad9ac46c Add routes to conveniently set/toggle tags 2020-08-08 15:08:04 +02:00
Eike Kettner
1c8b66194b Add a route to return used tags
This is part of the `/insights` route without queries for file usage.
2020-08-08 08:35:35 +02:00
Eike Kettner
a4796f3f7f Return more tag details with item insights 2020-08-08 00:41:20 +02:00
Eike Kettner
f3ba224124 Add missing organization/person/equipment routes 2020-08-07 01:30:43 +02:00
Eike Kettner
09d74b7e80 Return item notes with search results
In order to not make the response very large, an admin can define a
limit on how much to return.
2020-08-05 00:09:37 +02:00
Eike Kettner
a06d20a479 Remove duplicate results from index-only search 2020-08-01 15:46:00 +02:00
Eike Kettner
209c068436 Use keywords in pdfs to search for existing tags
During processing, keywords stored in PDF metadata are used to look
them up in the tag database and associate any existing tags to the
item.

See #175
2020-07-19 00:28:04 +02:00
Eike Kettner
5b01c93711 Add a folder-id to item processing
This allows to define a folder when uploading files. All generated
items are associated to this folder on creation.
2020-07-14 23:18:39 +02:00
Eike Kettner
259526a088 Organize imports 2020-07-12 13:51:52 +02:00
Eike Kettner
22fa1dba13 Apply folder restriction to fulltext only search
And update index when folder changes.
2020-07-12 13:50:45 +02:00
Eike Kettner
5b95fddf3d Make item queries depend on the account-id
Now the user is required, too, to list items.
2020-07-11 21:54:51 +02:00
Eike Kettner
86443e10a6 Set the folder of an item 2020-07-11 12:57:17 +02:00
Eike Kettner
2ab0b5e222 Rename space -> folder 2020-07-11 11:54:23 +02:00
Eike Kettner
60a08fc786 Return member count and if current user is owner or member 2020-07-11 01:30:29 +02:00
Eike Kettner
ea4ab11195 Allow to only return owning spaces 2020-07-11 01:30:28 +02:00
Eike Kettner
752a94a9e2 Implement space operations 2020-07-11 01:30:28 +02:00
Eike Kettner
c12201c4a5 Add routes to manage spaces 2020-07-11 01:30:28 +02:00
Eike Kettner
347a029af8 Scalafix organize-imports 2020-06-28 21:20:47 +02:00
Eike Kettner
41c0f70d3b Fix cancelling jobs
A request to cancel a job was not processed correctly. The cancelling
routine of a task must run, regardless of the (non-final) state. Now
it works like this: if a job is currently running, it is interrupted
and its cancel routine is invoked. It then enters "cancelled" state.
If it is stuck, it is loaded and only its cancel routine is run. If it
is in a final state or waiting, it is removed from the queue.
2020-06-26 23:08:27 +02:00
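The cancel behaviour described in the commit above can be summarised as a small decision function. This Python sketch is illustrative only: the state names in `JobState` and the action labels are hypothetical and do not come from the project's code.

```python
from enum import Enum


class JobState(Enum):
    # Hypothetical state names, assumed for this sketch.
    WAITING = "waiting"
    RUNNING = "running"
    STUCK = "stuck"
    SUCCESS = "success"
    FAILED = "failed"
    CANCELLED = "cancelled"


def cancel_actions(state):
    """Return the steps taken when a cancel request arrives for a job."""
    if state is JobState.RUNNING:
        # Interrupt the job, run its cancel routine, then mark it cancelled.
        return ["interrupt", "run_cancel_routine", "mark_cancelled"]
    if state is JobState.STUCK:
        # Nothing is running: load the job and run only its cancel routine.
        return ["load_job", "run_cancel_routine"]
    # Waiting jobs and jobs in a final state are removed from the queue.
    return ["remove_from_queue"]


for s in JobState:
    print(s.value, cancel_actions(s))
```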
Eike Kettner
0ba1736bc8 Remove items/attachments from index on delete 2020-06-25 00:00:10 +02:00
Eike Kettner
d5c9923a6d Add a route that only searches the full-text index
It returns the results in the same order as received from the index to
preserve the relevance ordering.
2020-06-24 00:03:17 +02:00
Eike Kettner
d9f0f05613 Refactor findItemsWithTags to be more generally useful 2020-06-23 21:27:01 +02:00
Eike Kettner
647911563e Fix paging when using full-text search 2020-06-23 01:44:52 +02:00
Eike Kettner
15c0fb4395 Merge branch 'master' into fts 2020-06-23 00:32:27 +02:00
Eike Kettner
ffbb16db45 Transport highlighting information to the client 2020-06-23 00:17:29 +02:00
Eike Kettner
cfe5aa8894 Use no-op fts-client if disabled + push this flag to the webui 2020-06-21 21:06:08 +02:00
Eike Kettner
330fdcdd5b Add rest endpoints to re-create the index 2020-06-21 20:13:33 +02:00
Eike Kettner
0d8b03fc61 Add backend operations for re-creating the full-text index 2020-06-21 15:46:51 +02:00
Eike Kettner
14ea4091c4 Renaming things 2020-06-21 13:15:02 +02:00
Eike Kettner
9acea8307d Update full-text index when changing data 2020-06-21 00:33:39 +02:00
Eike Kettner
7609b2b7c3 Run scalafmtAll 2020-06-20 23:03:51 +02:00
Eike Kettner
1f4ff0d4c4 Add language to schema, extend fts-client 2020-06-20 22:44:47 +02:00
Eike Kettner
3576c45d1a First basic working solr search 2020-06-20 02:18:49 +02:00
Eike Kettner
522daaf57e Introducing fts client into codebase 2020-06-17 23:20:46 +02:00
Eike Kettner
7a3d2e4dc6 Extract OItemSearch from OItem 2020-06-15 23:13:48 +02:00
Eike Kettner
84a26461ed Add a route to update the name of an attachment 2020-06-14 17:03:07 +02:00
Eike Kettner
617487f5b3 Add mail-debug flag to rest-server
It has been added to the joex application, but it should be possible
to debug mail problems on both apps.
2020-06-13 15:10:00 +02:00
Eike Kettner
e51e84408b Change notify-due-item routes to allow multiple tasks per user 2020-06-13 14:26:38 +02:00
Eike Kettner
363eb81aff Add remaining routes to create and update item meta data 2020-06-11 22:28:31 +02:00
Eike Kettner
c6accca0ff Add route to create and associate correspondent org 2020-06-11 22:11:58 +02:00
Eike Kettner
f407f08ed3 Add a route to add a new tag and associate it to an item 2020-06-11 21:51:42 +02:00
Eike Kettner
1d2a6e6caa Add endpoint to search for items and return their tags
This is a more expensive query, since the tags must be resolved per
item. This is now implemented by doing additional queries while
caching each resolved tag.
2020-06-07 15:18:28 +02:00
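The idea of resolving tags per item while caching each resolved tag might look roughly like this in Python. `resolve_item_tags`, `load_tag` and the sample data are hypothetical; `load_tag` stands in for whatever additional query loads a single tag.

```python
def resolve_item_tags(item_tag_ids, load_tag):
    """Resolve tag ids to tags per item, loading each distinct tag only once.

    `item_tag_ids` maps item id -> list of tag ids.
    `load_tag` is a callable performing the (expensive) lookup for one tag id.
    """
    cache = {}
    result = {}
    for item_id, tag_ids in item_tag_ids.items():
        tags = []
        for tag_id in tag_ids:
            if tag_id not in cache:
                cache[tag_id] = load_tag(tag_id)
            tags.append(cache[tag_id])
        result[item_id] = tags
    return result


lookups = []


def fake_load_tag(tag_id):
    # Records every lookup so the effect of the cache is visible.
    lookups.append(tag_id)
    return {"id": tag_id, "name": f"tag-{tag_id}"}


resolved = resolve_item_tags({"item1": [1, 2], "item2": [2, 1]}, fake_load_tag)
print(resolved)
print(lookups)  # each tag id is looked up only once
```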
Eike Kettner
e5b90eff34 Allow client to load items in batches 2020-06-06 11:05:15 +02:00
Eike Kettner
ee394eae86 Try to streamline the different impls for MimeType 2020-05-25 09:24:24 +02:00
Eike Kettner
3cb738568f Allow to change position of attachments 2020-05-24 17:30:25 +02:00
Eike Kettner
24caba1457 Refactor UploadRoutes to remove duplicate code 2020-05-24 11:48:49 +02:00
Eike Kettner
f519a8effa Check for an existing item before attempting to add files 2020-05-24 11:48:49 +02:00
Eike Kettner
f4949446e3 Allow to specify an item id to amend files to existing items 2020-05-23 20:15:55 +02:00
Eike Kettner
f16632bc7f Allow a collective to disable the integration endpoint 2020-05-23 14:29:24 +02:00
Eike Kettner
9f9dd6c0fb Change routes for scan-mailbox task to allow multiple tasks per user 2020-05-21 22:04:45 +02:00
Eike Kettner
451a09dda0 Allow to skip joex notification on uploads 2020-05-20 17:52:38 +02:00
Eike Kettner
852455c610 Add upload operation to task arguments 2020-05-20 17:52:38 +02:00