Commit Graph

279 Commits

Author SHA1 Message Date
mergify[bot]
38e0a50942
Merge pull request #582 from eikek/delete-org-fix
Fix deleting organization
2021-01-21 22:56:56 +00:00
Eike Kettner
f4a03e7c69 Fix deleting organization
The foreign key in person must be reset.
2021-01-21 21:27:02 +01:00
Eike Kettner
4cba96f390 Always return classifier results as suggestion
The classifier results are spliced into the suggestion list at second
place. When linking they are only used if nlp didn't find anything.
2021-01-21 21:05:28 +01:00
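A minimal Python sketch of the behaviour this commit message describes, assuming suggestions are plain lists ordered by relevance. The function names `splice_classifier_results` and `pick_for_linking` are hypothetical and not taken from the project.

```python
from typing import List, Optional


def splice_classifier_results(nlp: List[str], classifier: List[str]) -> List[str]:
    """Put classifier results at second place in the suggestion list."""
    # Keep the first nlp suggestion, then the classifier results,
    # then the remaining nlp suggestions.
    return nlp[:1] + classifier + nlp[1:]


def pick_for_linking(nlp: List[str], classifier: List[str]) -> Optional[str]:
    """For linking, use a classifier result only if nlp found nothing."""
    if nlp:
        return nlp[0]
    return classifier[0] if classifier else None
```

With an empty nlp list, the spliced list consists of the classifier results alone.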
Eike Kettner
9957c3267e Add constraints from config to classifier training
For large and/or many documents, training the classifier can lead to
OOM errors. Some limits have been set by default.
2021-01-21 17:46:39 +01:00
Eike Kettner
363cf5aef0 Quote names in sql changesets 2021-01-21 00:22:58 +01:00
Eike Kettner
38387e00a0 Fix mariadb migration 2021-01-21 00:22:53 +01:00
Eike Kettner
27c24c128d Store tags guessed with classifier in database 2021-01-20 00:30:40 +01:00
Eike Kettner
9d83cb7fe4 Store item based proposals in separate table
Classifiers don't work on each attachment separately, but on all of
them. So the results must not be stored at an attachment. This reverts
some previous changes to put the classifier results for item entities
into their own table.
2021-01-19 23:48:09 +01:00
Eike Kettner
3ff9284a64 Return classifier results as suggestions 2021-01-19 23:13:51 +01:00
Eike Kettner
1cd3441462 Run classifier for item entities (concerned, correspondent)
Store the results separately from nlp results in attachment metadata.
2021-01-19 22:08:29 +01:00
Eike Kettner
d124f0c1a9 Rename db changeset
It's not just a fix, but adds new things
2021-01-19 22:08:29 +01:00
Eike Kettner
99dcaae66b Learn classifiers for item entities
Learns classifiers for concerned and correspondent entities. This can
be used as an alternative to or after nlp.
2021-01-19 20:54:47 +01:00
Eike Kettner
a6f29153c4 Control what tag categories to use for auto-tagging 2021-01-19 01:20:13 +01:00
Eike Kettner
cce8878898 Exclude tags w/o category from classifying; remove obsolete models 2021-01-18 21:51:49 +01:00
Eike Kettner
3e28ce1254 Add the sql concat function to query builder 2021-01-18 21:51:45 +01:00
Eike Kettner
249f9e6e2a Extend guessing tags to all tag categories 2021-01-18 21:51:45 +01:00
Eike Kettner
f01646aeb5 Reorganize nlp pipeline and add nlp-unsupported language italian
Improves and reorganizes how nlp pipelines are set up. Now users can
choose from many options, depending on their hardware and usage
scenario.

This is the base to use more languages without depending on what
stanford-nlp supports. Support then involves text extraction and
simple regex-ner processing.
2021-01-18 17:41:40 +01:00
Eike Kettner
a70e9ab614 Store used language for processing on attachmentmeta
Issue: #570
2021-01-17 22:56:33 +01:00
Eike Kettner
f0f0e6e0d4 Search for categories case-insensitive
The string was already lowercased, but the comparison was not.

Fixes #568
2021-01-17 20:10:24 +01:00
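The kind of bug this commit message describes can be sketched with Python's `sqlite3` module. The table and column names here are made up for illustration, and SQLite stands in for the project's databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tag (name TEXT, category TEXT)")
conn.execute("INSERT INTO tag VALUES ('invoice', 'DocType')")

term = "DocType".lower()  # the search string is already lowercased

# Only the parameter is lowercased, so the stored 'DocType' does not match.
one_sided = conn.execute(
    "SELECT name FROM tag WHERE category = ?", (term,)
).fetchall()

# Lowercasing the column as well makes the comparison case-insensitive.
both_sides = conn.execute(
    "SELECT name FROM tag WHERE lower(category) = ?", (term,)
).fetchall()
```

In this sketch `one_sided` is empty, while `both_sides` contains the `invoice` row.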
Eike Kettner
623a61dbb6 Introduce a lowerEq operator to the query builder 2021-01-17 20:10:00 +01:00
Eike Kettner
3fccc3df39 Return all tags in search stats result
Before, only tags with a count > 0 were included. Now those that are
not attached to any item are returned as well.
2021-01-11 12:13:13 +01:00
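One common way to include tags with a count of zero is an outer join. The commit message does not say how the change was implemented, so the following `sqlite3` sketch with hypothetical `tag` and `tagitem` tables is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE tagitem (tag_id INTEGER, item_id TEXT)")
conn.executemany("INSERT INTO tag VALUES (?, ?)", [(1, "invoice"), (2, "receipt")])
conn.executemany(
    "INSERT INTO tagitem VALUES (?, ?)", [(1, "item-a"), (1, "item-b")]
)

# An inner join drops tags that are not attached to any item.
inner = conn.execute(
    "SELECT t.name, COUNT(ti.item_id) FROM tag t "
    "JOIN tagitem ti ON ti.tag_id = t.id "
    "GROUP BY t.name ORDER BY t.name"
).fetchall()

# A left join keeps every tag; COUNT ignores the NULLs of unmatched rows.
left = conn.execute(
    "SELECT t.name, COUNT(ti.item_id) FROM tag t "
    "LEFT JOIN tagitem ti ON ti.tag_id = t.id "
    "GROUP BY t.name ORDER BY t.name"
).fetchall()
```

Here `inner` lists only `invoice`, while `left` also returns `receipt` with a count of 0.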
Eike Kettner
0cfd8974d3 Add a flag to imap settings to enable/disable oauth2 scheme 2021-01-04 11:03:04 +01:00
Eike Kettner
95fd386c14 Fixing find-by-checksum with exclusions
The NOT-IN query must check for null separately, as any comparison
with null never evaluates to true in sql, resulting in not finding
existing duplicates.
2021-01-03 12:29:03 +01:00
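The null behaviour mentioned in this commit message can be reproduced with Python's `sqlite3` module. The `files` table and its `state` column are hypothetical; only the `NOT IN` semantics are the point of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files (id INTEGER PRIMARY KEY, checksum TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO files (checksum, state) VALUES (?, ?)",
    [("abc", None), ("abc", "deleted"), ("abc", "ok")],
)

# NULL NOT IN (...) yields NULL, so the row with a null state is filtered out.
naive = conn.execute(
    "SELECT id FROM files WHERE checksum = ? "
    "AND state NOT IN ('deleted') ORDER BY id",
    ("abc",),
).fetchall()

# Checking for null separately keeps that row in the result.
fixed = conn.execute(
    "SELECT id FROM files WHERE checksum = ? "
    "AND (state IS NULL OR state NOT IN ('deleted')) ORDER BY id",
    ("abc",),
).fetchall()
```

The naive query misses the row whose `state` is null, which is how an existing duplicate can go unnoticed.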
Eike Kettner
97dfcece97 Fix duplicate check on restarts
Issue: #530
2021-01-02 21:18:05 +01:00
Eike Kettner
a9ed0364d2 Fix linking guessed tags
Since tag names are lower-cased the search must happen lower-cased, too.
2021-01-02 01:30:31 +01:00
Eike Kettner
36858da624 Fix search condition for empty items set 2020-12-17 23:07:04 +01:00
Eike Kettner
8fba637ebe Add folder counts to search summary 2020-12-16 01:14:27 +01:00
Eike Kettner
77627534bc Improve on basic search summary 2020-12-15 23:37:02 +01:00
Eike Kettner
f3855628d5 Extend query builder with more functions 2020-12-15 23:34:12 +01:00
Eike Kettner
4ca6dfccae Get basic search summary 2020-12-15 23:10:13 +01:00
Eike Kettner
56d6d2e2ac Allow changing more parts of a select 2020-12-15 22:12:44 +01:00
Eike Kettner
f1c4b4adb0 Extract find-item query condition 2020-12-15 21:03:47 +01:00
Eike Kettner
2dff686fa0 Introduce unit condition 2020-12-15 21:03:47 +01:00
Eike Kettner
80406cabc2 Refactoring some code into separate files 2020-12-15 21:03:47 +01:00
Eike Kettner
278b1c22c9 Remove old code 2020-12-15 21:03:46 +01:00
Eike Kettner
2cecd01837 Convert rest of QItem 2020-12-15 21:03:46 +01:00
Eike Kettner
d1606d6f16 Remove old commented code 2020-12-15 21:03:46 +01:00
Eike Kettner
266fec9eb5 Convert find items query 2020-12-15 21:03:46 +01:00
Eike Kettner
5e2c5d2a50 Extends query builder 2020-12-15 21:03:46 +01:00
Eike Kettner
35c62049f5 Start converting QItem 2020-12-15 21:03:46 +01:00
Eike Kettner
a355767fdb Convert all query libs besides QItem 2020-12-15 21:03:46 +01:00
Eike Kettner
fd6d09587d Convert more records 2020-12-15 21:03:46 +01:00
Eike Kettner
613696539f Minor refactorings 2020-12-15 21:03:46 +01:00
Eike Kettner
d6f28d3eca Convert folder 2020-12-15 21:03:46 +01:00
Eike Kettner
87eb8c7f55 Convert more records 2020-12-15 21:03:46 +01:00
Eike Kettner
e3f6892abd Convert job record 2020-12-15 21:03:46 +01:00
Eike Kettner
1aa1f4367e Convert periodic tasks 2020-12-15 21:03:46 +01:00
Eike Kettner
3cef932ccd Convert more records 2020-12-15 21:03:46 +01:00
Eike Kettner
fe4815c737 Convert RSentMail 2020-12-15 21:03:46 +01:00
Eike Kettner
5cbf0d5602 Convert more records 2020-12-15 21:03:46 +01:00
Eike Kettner
10b49fccf8 Converting user and userimap records 2020-12-15 21:03:46 +01:00
Eike Kettner
c5c7f7ed3b Convert equipment record 2020-12-15 21:03:46 +01:00
Eike Kettner
adee496b77 Convert source record 2020-12-15 21:03:46 +01:00
Eike Kettner
2dbb1db2fd Initial outline for a simple query builder 2020-12-15 21:03:46 +01:00
Eike Kettner
27d087b14c Fix foreign key constraints 2020-12-14 14:34:22 +01:00
Eike Kettner
a0642905db Use remember-me cookie if present 2020-12-04 17:59:25 +01:00
Eike Kettner
c10c1fad72 Prepare remember-me authentication variant 2020-12-04 17:59:25 +01:00
Eike Kettner
290989f67f Reorder correspondent person suggestion based on org relationship 2020-12-01 23:39:45 +01:00
Eike Kettner
fc2668feee Allow to connect a person to an organization 2020-12-01 23:39:45 +01:00
Eike Kettner
0ee8ff66d5 Allow to search by source name 2020-11-30 14:07:45 +01:00
Eike Kettner
3fabe0a582 Update to Scala 2.13.4 2020-11-27 20:26:24 +01:00
Eike Kettner
0919eec3c2 Improve field query and fix mariadb's pickiness with parens
If no query is given, don't search with `like '%'`. MariaDB doesn't
want parens around columns in the GROUP BY clause.
2020-11-25 21:08:49 +01:00
Eike Kettner
52c6659f9f Add missing schema migrations for custom fields 2020-11-25 21:08:49 +01:00
Eike Kettner
9bea0298ad Allow to query custom field values with wildcards 2020-11-23 10:59:13 +01:00
Eike Kettner
7b7f1e4d6d Return custom field values with search results 2020-11-23 10:23:25 +01:00
Eike Kettner
066c856981 Allow to search for custom field values 2020-11-22 21:41:09 +01:00
Eike Kettner
1aefff37aa Return custom field values with item details 2020-11-22 21:41:09 +01:00
Eike Kettner
af1cca7d83 Fix condition for deleting custom field value 2020-11-22 21:41:09 +01:00
Eike Kettner
8d35d100d6 Change custom fields for multiple items 2020-11-22 21:41:09 +01:00
Eike Kettner
93295d63a5 Change custom field values for a single item 2020-11-22 21:41:09 +01:00
Eike Kettner
62313ab03a Add and change custom fields 2020-11-22 21:41:09 +01:00
Eike Kettner
248ad04dd0 Prepare custom fields 2020-11-22 21:41:09 +01:00
Eike Kettner
04ba14f802 Amend source form with tags and file-filter
Allow to define tags and a file filter per source.
2020-11-12 22:37:28 +01:00
Eike Kettner
10305bc82d Minor improvements 2020-11-09 21:16:53 +01:00
Eike Kettner
29455d638c Add startup task to find page counts of existing files 2020-11-09 20:35:35 +01:00
Eike Kettner
8c08bf233d Amend search results with attachment info
This again uses another query per item to retrieve some information
about each attachment already in the search results.
2020-11-09 14:24:28 +01:00
Eike Kettner
a77f34b7ba Add a processing step to retrieve page counts 2020-11-09 11:08:24 +01:00
Eike Kettner
d4bbb936b6 Count preview image sizes in insight data 2020-11-09 09:00:03 +01:00
Eike Kettner
f4e50c5229 Provide endpoints to submit tasks to re-generate previews
The scaling factor can be given in the config file. When this changes,
images can be regenerated via POSTing to certain endpoints. It is
possible to regenerate just one attachment preview or all within a
collective.
2020-11-09 09:00:02 +01:00
Eike Kettner
709848244c Create tasks to generate all previews
There is a task to generate preview images per attachment. It can
either add them (if not present yet) or overwrite them (e.g. some
config has changed).

There is a task that selects all attachments without previews and
submits a task to create it. This is submitted on start automatically
to generate previews for all existing attachments.
2020-11-08 23:46:02 +01:00
Eike Kettner
eede194352 Fix deleting preview files 2020-11-08 21:27:55 +01:00
Eike Kettner
757ad31165 Add a route to get the item preview
This is the first available preview of an attachment wrt position. If
all attachments have a preview image, the preview of the first
attachment is returned.
2020-11-08 15:12:56 +01:00
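The selection rule from this commit message can be sketched in Python as follows. The `Attachment` class and the `item_preview` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Attachment:
    position: int
    preview: Optional[bytes]  # None if no preview image exists yet


def item_preview(attachments: List[Attachment]) -> Optional[bytes]:
    """Return the first available preview, ordered by attachment position."""
    for att in sorted(attachments, key=lambda a: a.position):
        if att.preview is not None:
            return att.preview
    return None
```

If every attachment has a preview, the loop returns at the first attachment, matching the second sentence of the message.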
Eike Kettner
0841a33ae3 Add a table to hold the preview files 2020-11-08 01:25:38 +01:00
Eike Kettner
0461cfefe7 Fix sql error for mariadb <10.4
MariaDB below 10.4 doesn't support parentheses around selects for
`intersect` and `union`.

https://mariadb.com/kb/en/intersect/#parentheses

Fixes #404
2020-10-28 22:54:51 +01:00
Eike Kettner
b59696a9d3 Make sure to only remove/retry items in premature states 2020-10-26 23:39:26 +01:00
Eike Kettner
26e89bf84e Edit org/person/equipment of multiple items 2020-10-26 13:35:47 +01:00
Eike Kettner
2e6026b817 Edit dates of multiple items 2020-10-26 13:16:03 +01:00
Eike Kettner
d4043634ac Edit direction of multiple items 2020-10-26 12:48:15 +01:00
Eike Kettner
7ad37c8d26 Editing tags for multiple items 2020-10-26 11:54:04 +01:00
Eike Kettner
3e2d272746 Add unique constraint for equipment names
Fixes #370
2020-10-21 22:42:19 +02:00
Eike Kettner
3771587e55 Find duplicate tags without category 2020-10-19 00:30:41 +02:00
Eike Kettner
6a3386ce66 Fix sql comparison with optional values 2020-10-19 00:29:41 +02:00
Eike Kettner
80ddca9aa3 Add counter to joblog for correct log order
This is to distinguish log entries created at the same time.
2020-10-02 22:14:30 +02:00
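A small Python sketch of why a counter helps, assuming log entries carry a millisecond timestamp. The `LogEntry` class and `in_log_order` function are hypothetical and do not mirror the project's joblog table.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import List

# Increasing counter, assigned in creation order.
_counter = count()


@dataclass
class LogEntry:
    created: int  # timestamp in milliseconds
    message: str
    counter: int = field(default_factory=lambda: next(_counter))


def in_log_order(entries: List[LogEntry]) -> List[LogEntry]:
    """Sort by timestamp; the counter breaks ties between equal timestamps."""
    return sorted(entries, key=lambda e: (e.created, e.counter))
```

Two entries created within the same millisecond keep their creation order because the counter differs even when `created` does not.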
Eike Kettner
d4354b8b49 Skip pdf conversion if a converted file exists
For images the conversion also returns the extracted text. If saving
this text has failed, it is extracted in the following
text-extraction step.
2020-10-02 17:39:39 +02:00
Eike Kettner
b6f23b038a Fix finding attachments for retries
The attachments to process again must be searched in sources and
archives, too.
2020-10-02 17:39:34 +02:00
Eike Kettner
e26d7129e7 Add fix for mariadb text columns
The `text` data type can only store up to 64kb data. The `mediumtext`
up to 16M and `longtext` up to 4G.

Issue: #297
2020-10-02 16:50:51 +02:00
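The size limits named in this commit message can be turned into a small helper. This is an illustrative Python sketch; `smallest_text_type` is a hypothetical function, and the limits are byte counts (2^16 - 1, 2^24 - 1 and 2^32 - 1).

```python
from typing import Optional

# Maximum sizes in bytes for the MariaDB text column types.
LIMITS = [
    ("text", 2**16 - 1),        # about 64kb
    ("mediumtext", 2**24 - 1),  # about 16M
    ("longtext", 2**32 - 1),    # about 4G
]


def smallest_text_type(content: str) -> Optional[str]:
    """Return the smallest column type that can hold the UTF-8 encoded content."""
    size = len(content.encode("utf-8"))
    for name, limit in LIMITS:
        if size <= limit:
            return name
    return None
```

Because the limits count bytes, text with multi-byte characters reaches them with fewer characters than plain ASCII text.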
Eike Kettner
552cdac1d3 Apply flyway api changes 2020-09-28 15:12:10 +02:00
Eike Kettner
f6f63000be Prepend a duplicate check when uploading files 2020-09-23 23:37:00 +02:00
Eike Kettner
c658677032 Autoformat 2020-09-09 00:29:32 +02:00
Eike Kettner
eb11b33028 Fix mariadb changesets 2020-09-07 20:02:50 +02:00