Eike Kettner
a1a93e5ca6
Fixes searching items with fulltext
...
When using fulltext-only search, only the index must be searched.
This wasn't working anymore, because the routes added a query to
always select valid items (those not being processed). But this led
the downstream code to always consult the database, too. Since the
routes are using a "simple-search" interface, that interface now adds
the valid-state condition if applicable. There are still more low-level
interfaces that can be used when searching should be done differently.
Closes: #823
2021-05-23 14:14:25 +02:00
Stefan Scheidewig
6149a2ab89
Restored unused imports to make it compile again
2021-04-15 18:34:54 +02:00
Stefan Scheidewig
fa34312020
Implemented endpoint to delete multiple attachments
2021-04-15 18:05:01 +02:00
Eike Kettner
994e3df597
Fix query for getting tag category summary
2021-04-12 13:40:22 +02:00
Eike Kettner
3e0914ece7
Correctly count tag categories
...
If multiple tags of the same category are applied to the same item,
just summing tag counts will produce the wrong results as now items
are counted multiple times.
2021-04-11 14:34:44 +02:00
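The double-counting problem described in the "Correctly count tag categories" entry can be illustrated with a minimal sketch. It uses SQLite from Python rather than the project's own query builder, and the tables `tag` and `tagitem` and their columns are hypothetical names chosen for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical schema, for illustration only
    CREATE TABLE tag (id TEXT PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE tagitem (item_id TEXT, tag_id TEXT);
    INSERT INTO tag VALUES ('t1', 'invoice', 'doctype'), ('t2', 'receipt', 'doctype');
    -- item i1 carries both tags of category 'doctype'; item i2 carries one
    INSERT INTO tagitem VALUES ('i1', 't1'), ('i1', 't2'), ('i2', 't1');
""")

# Summing tag usages per category counts item i1 twice.
naive = conn.execute("""
    SELECT t.category, COUNT(ti.item_id)
    FROM tag t JOIN tagitem ti ON ti.tag_id = t.id
    GROUP BY t.category
""").fetchall()

# Counting distinct items per category counts each item once.
distinct = conn.execute("""
    SELECT t.category, COUNT(DISTINCT ti.item_id)
    FROM tag t JOIN tagitem ti ON ti.tag_id = t.id
    GROUP BY t.category
""").fetchall()

print(naive, distinct)
```

With two items in total, the naive query reports three for the category, while the distinct count reports two.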
Eike Kettner
4041018c47
Reduce not expressions
...
Fixes queries containing macros inside a "not".
2021-04-11 12:57:42 +02:00
Eike Kettner
e1bbc2edf5
Apply autoformat
2021-04-10 16:31:58 +02:00
Scala Steward
144ea852bf
Update fs2-core, fs2-io to 2.5.4
2021-03-31 21:10:42 +02:00
Eike Kettner
c36073b852
Allow to give human readable summary to user tasks
2021-03-27 22:13:13 +01:00
Eike Kettner
cc38b850a6
Remove deprecated search routes and some refactoring
2021-03-27 22:13:13 +01:00
Eike Kettner
177488817d
Fix h2 migration
...
Using java source code obviously requires `javac` during migration.
2021-03-13 16:38:48 +01:00
Eike Kettner
2e443bc9b9
Fix mariadb migration
2021-03-13 15:52:38 +01:00
Eike Kettner
0229a867af
Add a use column to metadata entities
2021-03-10 23:55:18 +01:00
Eike Kettner
6a63694a3e
Convert unit tests to munit
2021-03-10 19:48:56 +01:00
Eike Kettner
77a87782b7
Refactoring parser
...
- put all used strings in one place to have it easier to track
- don't use `$` for shortcuts, it's a detail not interesting to a
user; now names must not clash (which is a good idea anyways)
- Added two more shortcuts `conc` and `corr`
2021-03-08 22:51:14 +01:00
Eike Kettner
e681ffa96f
Extend query builder allowing more conditions
...
Before only a column or a dbfunction could be used in a condition. It
is now allowed for all `SelectExpr`.
2021-03-08 22:51:08 +01:00
Eike Kettner
30c901ddf1
Add more ways to query for attachments
...
- find items with a specified attachment count
- find items by attachment id
2021-03-08 09:49:38 +01:00
Eike Kettner
2b2f913e85
Add checksum query expr
2021-03-08 01:53:21 +01:00
Eike Kettner
1c834cbb77
Correctly compare numeric field values
2021-03-03 22:54:55 +01:00
Eike Kettner
71985244f1
Use a better representation for macros
2021-03-03 00:44:49 +01:00
Eike Kettner
a48504debb
Specifically search for field id vs name
2021-03-02 21:09:31 +01:00
Eike Kettner
168f5a1a98
Fix like search for custom fields
2021-03-01 20:56:23 +01:00
Eike Kettner
f8307f77c6
Search by field id or name
2021-03-01 20:56:23 +01:00
Eike Kettner
698ff58aa3
Provide a more convenient interface to search
2021-03-01 11:50:07 +01:00
Eike Kettner
d737da768e
Move to munit in query module
2021-03-01 00:51:01 +01:00
Eike Kettner
9013d9264e
Add more convenient date parsers and some basic macros
2021-03-01 00:51:01 +01:00
Eike Kettner
af73b59ec2
Parser improvements
...
- default expressions into an `and` node
- fix parsing string lists that end in whitespace
- fix package names of internal classes
2021-03-01 00:51:01 +01:00
Eike Kettner
a80d73d5d2
Optimize imports
2021-03-01 00:51:01 +01:00
Eike Kettner
e9ed998e3a
Basic poc to search via custom query
2021-03-01 00:51:01 +01:00
Eike Kettner
186014a1c6
Refactor search to separate between a base query and user query
...
The `findBase` is adding only strictly required conditions. Everything
else comes from the user.
2021-03-01 00:51:01 +01:00
Eike Kettner
c3cdec416c
Sketching some basic tests
2021-03-01 00:50:52 +01:00
Eike Kettner
be5c7ffb88
First draft of ast and parser
2021-03-01 00:46:57 +01:00
Eike Kettner
e6d9ce2c37
Remove obsolete type capabilities
...
These are now detected by the new scala compiler and lead to compile
errors.
2021-03-01 00:16:30 +01:00
Eike Kettner
7ef3185659
Add language to a source
...
Allows to define upload urls for different languages.
2021-02-18 23:34:42 +01:00
Eike Kettner
d7bc963450
Cleanup nodes that are not reachable anymore
2021-02-18 00:37:18 +01:00
Eike Kettner
5181283b1b
Add a short-name to organizations
2021-02-17 22:55:35 +01:00
Eike Kettner
20ccdda609
Add a notes field to equipments
2021-02-17 22:39:07 +01:00
Eike Kettner
48eee00c0b
Allow person to be correspondent, concerning or both
2021-02-16 22:49:55 +01:00
Eike Kettner
394aeeccb6
Introduce a sql literal and constants in query builder
...
The h2 jdbc driver could not translate the union query in QCollective
when the `kind` was set via a constant value. Using literals works
here. Renamed the corresponding elements in the query builder.
2021-01-25 00:18:24 +01:00
Eike Kettner
1b66e2af5c
Fix classifier_settings table
2021-01-23 21:30:26 +01:00
Eike Kettner
c7e850116f
Make the text length limit optional
2021-01-22 23:06:50 +01:00
mergify[bot]
38e0a50942
Merge pull request #582 from eikek/delete-org-fix
...
Fix deleting organization
2021-01-21 22:56:56 +00:00
Eike Kettner
f4a03e7c69
Fix deleting organization
...
The foreign key in person must be reset.
2021-01-21 21:27:02 +01:00
Eike Kettner
4cba96f390
Always return classifier results as suggestion
...
The classifier results are spliced into the suggestion list at second
place. When linking they are only used if nlp didn't find anything.
2021-01-21 21:05:28 +01:00
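The ordering rule from the "Always return classifier results as suggestion" entry could look roughly like the sketch below. The function names and the plain string lists are hypothetical; the actual code works on the project's own suggestion types.

```python
def splice_suggestions(nlp, classifier):
    """Insert the classifier results right after the first NLP suggestion."""
    return nlp[:1] + classifier + nlp[1:]


def choose_for_linking(nlp, classifier):
    """Use the classifier results for linking only if NLP found nothing."""
    return nlp if nlp else classifier


suggestions = splice_suggestions(["Acme Corp", "Beta Ltd"], ["Gamma Inc"])
linked = choose_for_linking([], ["Gamma Inc"])
print(suggestions, linked)
```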
Eike Kettner
9957c3267e
Add constraints from config to classifier training
...
For large and/or many documents, training the classifier can lead to
OOM errors. Some limits have been set by default.
2021-01-21 17:46:39 +01:00
Eike Kettner
363cf5aef0
Quote names in sql changesets
2021-01-21 00:22:58 +01:00
Eike Kettner
38387e00a0
Fix mariadb migration
2021-01-21 00:22:53 +01:00
Eike Kettner
27c24c128d
Store tags guessed with classifier in database
2021-01-20 00:30:40 +01:00
Eike Kettner
9d83cb7fe4
Store item based proposals in separate table
...
Classifiers don't work on each attachment, but on all of them. So the
results must not be stored with an attachment. This reverts some
previous changes in order to put the classifier results for item
entities into their own table.
2021-01-19 23:48:09 +01:00
Eike Kettner
3ff9284a64
Return classifier results as suggestions
2021-01-19 23:13:51 +01:00
Eike Kettner
1cd3441462
Run classifier for item entities (concerned, correspondent)
...
Store the results separately from nlp results in attachment metadata.
2021-01-19 22:08:29 +01:00
Eike Kettner
d124f0c1a9
Rename db changeset
...
It's not just a fix, but adds new things
2021-01-19 22:08:29 +01:00
Eike Kettner
99dcaae66b
Learn classifiers for item entities
...
Learns classifiers for concerned and correspondent entities. This can
be used as an alternative to or after nlp.
2021-01-19 20:54:47 +01:00
Eike Kettner
a6f29153c4
Control what tag categories to use for auto-tagging
2021-01-19 01:20:13 +01:00
Eike Kettner
cce8878898
Exclude tags w/o category from classifying; remove obsolete models
2021-01-18 21:51:49 +01:00
Eike Kettner
3e28ce1254
Add the sql concat function to query builder
2021-01-18 21:51:45 +01:00
Eike Kettner
249f9e6e2a
Extend guessing tags to all tag categories
2021-01-18 21:51:45 +01:00
Eike Kettner
f01646aeb5
Reorganize nlp pipeline and add nlp-unsupported language italian
...
Improves and reorganizes how nlp pipelines are set up. Now users can
choose from many options, depending on their hardware and usage
scenario.
This is the base to use more languages without depending on what
stanford-nlp supports. Support then involves text extraction and
simple regex-ner processing.
2021-01-18 17:41:40 +01:00
Eike Kettner
a70e9ab614
Store used language for processing on attachmentmeta
...
Issue: #570
2021-01-17 22:56:33 +01:00
Eike Kettner
f0f0e6e0d4
Search for categories case-insensitive
...
The string was already lowercased, but the comparison was not.
Fixes #568
2021-01-17 20:10:24 +01:00
Eike Kettner
623a61dbb6
Introduce a lowerEq operator to the query builder
2021-01-17 20:10:00 +01:00
Eike Kettner
3fccc3df39
Return all tags in search stats result
...
Before, only tags with a count > 0 were included. Now those that are
not attached to any item are returned as well.
2021-01-11 12:13:13 +01:00
Eike Kettner
0cfd8974d3
Add a flag to imap settings to enable/disable oauth2 scheme
2021-01-04 11:03:04 +01:00
Eike Kettner
95fd386c14
Fixing find-by-checksum with exclusions
...
The NOT-IN query must check for null separately, as everything with
null evaluates to false in sql resulting in not finding existing
duplicates.
2021-01-03 12:29:03 +01:00
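The NULL behaviour behind the "Fixing find-by-checksum with exclusions" entry can be reproduced with a small sketch. It uses SQLite from Python, and the table `file` with its columns is a hypothetical stand-in, not the project's schema. Strictly speaking, `NULL NOT IN (...)` evaluates to NULL rather than false, but a `WHERE` clause drops the row either way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical schema, for illustration only
    CREATE TABLE file (id TEXT, checksum TEXT, item_id TEXT);
    INSERT INTO file VALUES ('f1', 'abc', 'i1'), ('f2', 'abc', NULL);
""")

# NOT IN alone: the row with a NULL item_id is silently dropped,
# so the existing duplicate f2 is not found.
naive = conn.execute(
    "SELECT id FROM file WHERE checksum = ? AND item_id NOT IN (?)",
    ("abc", "i1"),
).fetchall()

# Checking for NULL separately keeps that row.
fixed = conn.execute(
    "SELECT id FROM file WHERE checksum = ? "
    "AND (item_id IS NULL OR item_id NOT IN (?))",
    ("abc", "i1"),
).fetchall()

print(naive, fixed)
```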
Eike Kettner
97dfcece97
Fix duplicate check on restarts
...
Issue: #530
2021-01-02 21:18:05 +01:00
Eike Kettner
a9ed0364d2
Fix linking guessed tags
...
Since tag names are lower-cased the search must happen lower-cased, too.
2021-01-02 01:30:31 +01:00
Eike Kettner
36858da624
Fix search condition for empty items set
2020-12-17 23:07:04 +01:00
Eike Kettner
8fba637ebe
Add folder counts to search summary
2020-12-16 01:14:27 +01:00
Eike Kettner
77627534bc
Improve on basic search summary
2020-12-15 23:37:02 +01:00
Eike Kettner
f3855628d5
Extend query builder with more functions
2020-12-15 23:34:12 +01:00
Eike Kettner
4ca6dfccae
Get basic search summary
2020-12-15 23:10:13 +01:00
Eike Kettner
56d6d2e2ac
Allow changing more parts of a select
2020-12-15 22:12:44 +01:00
Eike Kettner
f1c4b4adb0
Extract find-item query condition
2020-12-15 21:03:47 +01:00
Eike Kettner
2dff686fa0
Introduce unit condition
2020-12-15 21:03:47 +01:00
Eike Kettner
80406cabc2
Refactoring some code into separate files
2020-12-15 21:03:47 +01:00
Eike Kettner
278b1c22c9
Remove old code
2020-12-15 21:03:46 +01:00
Eike Kettner
2cecd01837
Convert rest of QItem
2020-12-15 21:03:46 +01:00
Eike Kettner
d1606d6f16
Remove old commented code
2020-12-15 21:03:46 +01:00
Eike Kettner
266fec9eb5
Convert find items query
2020-12-15 21:03:46 +01:00
Eike Kettner
5e2c5d2a50
Extends query builder
2020-12-15 21:03:46 +01:00
Eike Kettner
35c62049f5
Start converting QItem
2020-12-15 21:03:46 +01:00
Eike Kettner
a355767fdb
Convert all query libs besides QItem
2020-12-15 21:03:46 +01:00
Eike Kettner
fd6d09587d
Convert more records
2020-12-15 21:03:46 +01:00
Eike Kettner
613696539f
Minor refactorings
2020-12-15 21:03:46 +01:00
Eike Kettner
d6f28d3eca
Convert folder
2020-12-15 21:03:46 +01:00
Eike Kettner
87eb8c7f55
Convert more records
2020-12-15 21:03:46 +01:00
Eike Kettner
e3f6892abd
Convert job record
2020-12-15 21:03:46 +01:00
Eike Kettner
1aa1f4367e
Convert periodic tasks
2020-12-15 21:03:46 +01:00
Eike Kettner
3cef932ccd
Convert more records
2020-12-15 21:03:46 +01:00
Eike Kettner
fe4815c737
Convert RSentMail
2020-12-15 21:03:46 +01:00
Eike Kettner
5cbf0d5602
Convert more records
2020-12-15 21:03:46 +01:00
Eike Kettner
10b49fccf8
Converting user and userimap records
2020-12-15 21:03:46 +01:00
Eike Kettner
c5c7f7ed3b
Convert equipment record
2020-12-15 21:03:46 +01:00
Eike Kettner
adee496b77
Convert source record
2020-12-15 21:03:46 +01:00
Eike Kettner
2dbb1db2fd
Initial outline for a simple query builder
2020-12-15 21:03:46 +01:00
Eike Kettner
27d087b14c
Fix foreign key constraints
2020-12-14 14:34:22 +01:00
Eike Kettner
a0642905db
Use remember-me cookie if present
2020-12-04 17:59:25 +01:00
Eike Kettner
c10c1fad72
Prepare remember-me authentication variant
2020-12-04 17:59:25 +01:00
Eike Kettner
290989f67f
Reorder correspondent person suggestion based on org relationship
2020-12-01 23:39:45 +01:00
Eike Kettner
fc2668feee
Allow to connect a person to an organization
2020-12-01 23:39:45 +01:00
Eike Kettner
0ee8ff66d5
Allow to search by source name
2020-11-30 14:07:45 +01:00
Eike Kettner
3fabe0a582
Update to Scala 2.13.4
2020-11-27 20:26:24 +01:00
Eike Kettner
0919eec3c2
Improve field query and fix mariadb's pickiness with parens
...
If no query is given, don't search with `like '%'`. MariaDB doesn't
want parens around columns in the GROUP BY clause.
2020-11-25 21:08:49 +01:00
Eike Kettner
52c6659f9f
Add missing schema migrations for custom fields
2020-11-25 21:08:49 +01:00
Eike Kettner
9bea0298ad
Allow to query custom field values with wildcards
2020-11-23 10:59:13 +01:00
Eike Kettner
7b7f1e4d6d
Return custom field values with search results
2020-11-23 10:23:25 +01:00
Eike Kettner
066c856981
Allow to search for custom field values
2020-11-22 21:41:09 +01:00
Eike Kettner
1aefff37aa
Return custom field values with item details
2020-11-22 21:41:09 +01:00
Eike Kettner
af1cca7d83
Fix condition for deleting custom field value
2020-11-22 21:41:09 +01:00
Eike Kettner
8d35d100d6
Change custom fields for multiple items
2020-11-22 21:41:09 +01:00
Eike Kettner
93295d63a5
Change custom field values for a single item
2020-11-22 21:41:09 +01:00
Eike Kettner
62313ab03a
Add and change custom fields
2020-11-22 21:41:09 +01:00
Eike Kettner
248ad04dd0
Prepare custom fields
2020-11-22 21:41:09 +01:00
Eike Kettner
04ba14f802
Amend source form with tags and file-filter
...
Allow to define tags and a file filter per source.
2020-11-12 22:37:28 +01:00
Eike Kettner
10305bc82d
Minor improvements
2020-11-09 21:16:53 +01:00
Eike Kettner
29455d638c
Add startup task to find page counts of existing files
2020-11-09 20:35:35 +01:00
Eike Kettner
8c08bf233d
Amend search results with attachment info
...
This again uses another query per item to retrieve some information
about each attachment already in the search results.
2020-11-09 14:24:28 +01:00
Eike Kettner
a77f34b7ba
Add a processing step to retrieve page counts
2020-11-09 11:08:24 +01:00
Eike Kettner
d4bbb936b6
Count preview image sizes in insight data
2020-11-09 09:00:03 +01:00
Eike Kettner
f4e50c5229
Provide endpoints to submit tasks to re-generate previews
...
The scaling factor can be given in the config file. When this changes,
images can be regenerated via POSTing to certain endpoints. It is
possible to regenerate just one attachment preview or all within a
collective.
2020-11-09 09:00:02 +01:00
Eike Kettner
709848244c
Create tasks to generate all previews
...
There is a task to generate preview images per attachment. It can
either add them (if not present yet) or overwrite them (e.g. some
config has changed).
There is a task that selects all attachments without previews and
submits a task to create them. This is submitted on start automatically
to generate previews for all existing attachments.
2020-11-08 23:46:02 +01:00
Eike Kettner
eede194352
Fix deleting preview files
2020-11-08 21:27:55 +01:00
Eike Kettner
757ad31165
Add a route to get the item preview
...
This is the first available preview of an attachment wrt position. If
all attachments have a preview image, the preview of the first
attachment is returned.
2020-11-08 15:12:56 +01:00
Eike Kettner
0841a33ae3
Add a table to hold the preview files
2020-11-08 01:25:38 +01:00
Eike Kettner
0461cfefe7
Fix sql error for mariadb <10.4
...
MariaDB below 10.4 doesn't support parentheses around selects for
`intersect` and `union`.
https://mariadb.com/kb/en/intersect/#parentheses
Fixes #404
2020-10-28 22:54:51 +01:00
Eike Kettner
b59696a9d3
Make sure to only remove/retry items in premature states
2020-10-26 23:39:26 +01:00
Eike Kettner
26e89bf84e
Edit org/person/equipment of multiple items
2020-10-26 13:35:47 +01:00
Eike Kettner
2e6026b817
Edit dates of multiple items
2020-10-26 13:16:03 +01:00
Eike Kettner
d4043634ac
Edit direction of multiple items
2020-10-26 12:48:15 +01:00
Eike Kettner
7ad37c8d26
Editing tags for multiple items
2020-10-26 11:54:04 +01:00
Eike Kettner
3e2d272746
Add unique constraint for equipment names
...
Fixes #370
2020-10-21 22:42:19 +02:00
Eike Kettner
3771587e55
Find duplicate tags without category
2020-10-19 00:30:41 +02:00
Eike Kettner
6a3386ce66
Fix sql comparison with optional values
2020-10-19 00:29:41 +02:00
Eike Kettner
80ddca9aa3
Add counter to joblog for correct log order
...
This is to distinguish log entries created at the same time.
2020-10-02 22:14:30 +02:00
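Why a counter helps with log order, as noted in the "Add counter to joblog" entry, can be shown with a minimal sketch. The `LogEntry` class and its fields are hypothetical, and the sketch assumes the counter increases with every written entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    created: int  # timestamp in milliseconds (hypothetical field)
    counter: int  # assumed to increase with every written entry
    message: str


entries = [
    LogEntry(1000, 2, "conversion finished"),
    LogEntry(1000, 1, "conversion started"),
    LogEntry(999, 0, "job picked up"),
]

# Sorting by timestamp alone leaves the order of the two entries created
# at the same time to chance; the counter breaks the tie.
ordered = sorted(entries, key=lambda e: (e.created, e.counter))
messages = [e.message for e in ordered]
print(messages)
```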
Eike Kettner
d4354b8b49
Skip pdf conversion if a converted file exists
...
For images, the conversion also returns the extracted text. If saving
it failed, the text is extracted in the following text-extraction
step.
2020-10-02 17:39:39 +02:00
Eike Kettner
b6f23b038a
Fix finding attachments for retries
...
The attachments to process again must be searched in sources and
archives, too.
2020-10-02 17:39:34 +02:00
Eike Kettner
e26d7129e7
Add fix for mariadb text columns
...
The `text` data type can only store up to 64KB of data. `mediumtext`
stores up to 16MB and `longtext` up to 4GB.
Issue: #297
2020-10-02 16:50:51 +02:00
Eike Kettner
552cdac1d3
Apply flyway api changes
2020-09-28 15:12:10 +02:00
Eike Kettner
f6f63000be
Prepend a duplicate check when uploading files
2020-09-23 23:37:00 +02:00
Eike Kettner
c658677032
Autoformat
2020-09-09 00:29:32 +02:00
Eike Kettner
eb11b33028
Fix mariadb changesets
2020-09-07 20:02:50 +02:00
Eike Kettner
76ccfb8a81
Only learn from confirmed items
...
Text classification should only learn from confirmed items. Log if
classification is disabled when processing an item.
2020-09-07 13:04:40 +02:00
Eike Kettner
cb1a9e0699
Use separate sql migration for h2
2020-09-07 13:04:29 +02:00
Eike Kettner
06879456a6
Change job priority on queue page
2020-09-05 18:50:58 +02:00
Eike Kettner
4309bd8dfd
Some cleanup
2020-09-02 21:22:30 +02:00
Eike Kettner
316b490008
Implement learning a text classifier from collective data
2020-09-02 18:28:14 +02:00
Eike Kettner
68bb65572b
Integrate learn-classifier task into the app
2020-09-02 18:28:14 +02:00
Eike Kettner
8c4f2e702b
Add classifier settings
2020-09-02 18:28:14 +02:00
Eike Kettner
de5b33c40d
Add `updated` column to some tables
2020-08-24 21:30:52 +02:00
Eike Kettner
96d2f948f2
Use collective's addressbook to configure regexner
2020-08-24 14:40:52 +02:00