600 Commits

Author SHA1 Message Date
Eike Kettner
76ccfb8a81 Only learn from confirmed items
Text classification should only learn from confirmed items. Log if
classification is disabled when processing an item.
2020-09-07 13:04:40 +02:00
Eike Kettner
cb1a9e0699 Use separate sql migration for h2 2020-09-07 13:04:29 +02:00
Eike Kettner
06879456a6 Change job priority on queue page 2020-09-05 18:50:58 +02:00
Eike Kettner
1dcccbcf7d Allow to hide classification settings in the webapp 2020-09-05 16:00:19 +02:00
Eike Kettner
7a0f71604d Serve static files/assets preferring the gzip version 2020-09-03 01:29:09 +02:00
Eike Kettner
4309bd8dfd Some cleanup 2020-09-02 21:22:30 +02:00
Eike Kettner
f9fcee81a5 Add start-now button for train-classifier task 2020-09-02 21:22:22 +02:00
Eike Kettner
8677eca6d4 Fix setting default in dropdown 2020-09-02 18:28:14 +02:00
Eike Kettner
237b960625 Guess a tag on item processing using a trained model if available 2020-09-02 18:28:14 +02:00
Eike Kettner
316b490008 Implement learning a text classifier from collective data 2020-09-02 18:28:14 +02:00
Eike Kettner
68bb65572b Integrate learn-classifier task into the app 2020-09-02 18:28:14 +02:00
Eike Kettner
0c97b4ef76 Initial impl of a text classifier based on stanford-nlp 2020-09-02 18:28:14 +02:00
Eike Kettner
8c4f2e702b Add classifier settings 2020-09-02 18:28:14 +02:00
Eike Kettner
3473cbb773 Use collective data with NER annotation 2020-08-25 20:40:44 +02:00
Eike Kettner
de5b33c40d Add updated column to some tables 2020-08-24 21:30:52 +02:00
Eike Kettner
96d2f948f2 Use collective's addressbook to configure regexner 2020-08-24 14:40:52 +02:00
Eike Kettner
8628a0a8b3 Allow configuring stanford-ner and cache based on collective 2020-08-24 10:55:59 +02:00
Eike Kettner
fdb46da26d Add french language and upgrade stanford-nlp to 4.0.0 2020-08-23 17:48:42 +02:00
Eike Kettner
30d5abddd8 Set version to 0.11.0-SNAPSHOT 2020-08-15 00:41:58 +02:00
Eike Kettner
f2fbf20f00 Set version to 0.10.0 2020-08-14 23:42:01 +02:00
Eike Kettner
7921dca665 Fixup for dropdown improvement 2020-08-14 23:37:28 +02:00
Eike Kettner
fde52bbbb0 Make dropdowns searchable by default and improve open/close clicks
Ref #207
2020-08-14 23:04:39 +02:00
Eike Kettner
760dec2230 Rename new route for retrieving used tags 2020-08-13 23:25:30 +02:00
Eike Kettner
3986487f11 Add api docs and cleanup 2020-08-13 21:22:54 +02:00
Eike Kettner
081c4da903 Add a route to trigger the convert-all-pdf task for a collective 2020-08-13 01:06:13 +02:00
Eike Kettner
69674eb485 Improve job-queue query to make sure jobs across all states show up 2020-08-13 01:06:13 +02:00
Eike Kettner
41ea071555 Add a task to convert all pdfs that have not been converted 2020-08-13 01:06:13 +02:00
Eike Kettner
07e9a9767e Add a task to re-process files of an item 2020-08-12 22:29:56 +02:00
Eike Kettner
57c1144f40 Allow to filter tags/categories in search menu 2020-08-10 13:25:25 +02:00
Eike Kettner
098e4cf868 Fix uploading to enabled/disabled source endpoints 2020-08-09 09:21:23 +02:00
Eike Kettner
6460315b2b Improve menu shadow 2020-08-09 09:12:28 +02:00
Eike Kettner
e793b63248 Allow to hide fields in menus based on ui settings 2020-08-08 22:51:02 +02:00
Eike Kettner
43946ed347 Fail early when source id is wrong 2020-08-08 18:43:18 +02:00
Eike Kettner
5810eac899 Fix remembering selection when going to detail view 2020-08-08 17:24:27 +02:00
Eike Kettner
75c958281e Redesign search/landing page 2020-08-08 16:38:52 +02:00
Eike Kettner
000d1aff2b Toggle tags via drag-drop from list view 2020-08-08 15:50:54 +02:00
Eike Kettner
06ad9ac46c Add routes to conveniently set/toggle tags 2020-08-08 15:08:04 +02:00
Eike Kettner
f86f644365 Prepare for drag-drop items into tags in list view 2020-08-08 14:34:26 +02:00
Eike Kettner
b1ef0c55af Show only visible folders in search menu 2020-08-08 14:16:13 +02:00
Eike Kettner
d6d16e39bd Drag-drop items into folders in list view 2020-08-08 14:03:36 +02:00
Eike Kettner
9c50a85363 Prepare drag-drop for items into folders 2020-08-08 13:20:29 +02:00
Eike Kettner
f0a5f84c8b Define how many tags to see in ui settings 2020-08-08 11:16:45 +02:00
Eike Kettner
4c57d16501 Rename ui setting field 2020-08-08 10:23:08 +02:00
Eike Kettner
7c8c2f856f Include tag categories into the new tag selection field 2020-08-08 10:20:43 +02:00
Eike Kettner
3642b95f8c Add a better tag selection field 2020-08-08 09:23:48 +02:00
Eike Kettner
1c8b66194b Add a route to return used tags
This is part of the `/insights` route without queries for file usage.
2020-08-08 08:35:35 +02:00
Eike Kettner
a4796f3f7f Return more tag details with item insights 2020-08-08 00:41:20 +02:00
Eike Kettner
c8ad9bf11f Put number of folders to display in ui settings 2020-08-08 00:06:23 +02:00
Eike Kettner
873d9fafc3 Add better folder field to search menu and re-order fields 2020-08-08 00:06:21 +02:00
Eike Kettner
c0a7c0d62c Fix modal positioning in item detail 2020-08-07 16:56:15 +02:00
Eike Kettner
af7cfa0ae1 Allow editing metadata in item-detail 2020-08-07 01:30:43 +02:00
Eike Kettner
f3ba224124 Add missing organization/person/equipment routes 2020-08-07 01:30:43 +02:00
Eike Kettner
639ab7440e Fix edit menu layout 2020-08-06 23:49:54 +02:00
Eike Kettner
a8ea391715 Render edit-modals above the menu and not the whole page 2020-08-06 23:38:55 +02:00
Eike Kettner
a6a6e334d5 Search by tag category via web ui 2020-08-06 22:23:35 +02:00
Eike Kettner
070c2b5e5f Allow to search by tag categories
The server accepts one list of tag categories to include and one to
exclude. The include list returns only items that have at least one
tag of each listed category. The exclude list returns only items that
have no tag in any of the listed categories.
2020-08-06 21:43:27 +02:00
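The include/exclude semantics described in this commit message can be sketched as below. This is only an illustration over in-memory data, not the server's query: `Item`, `filter_by_categories`, and the tag-to-category mapping are hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class Item:
    name: str
    # Hypothetical stand-in for the database: tag name -> tag category.
    tags: Dict[str, str] = field(default_factory=dict)


def filter_by_categories(
    items: Iterable[Item],
    include: Iterable[str] = (),
    exclude: Iterable[str] = (),
) -> List[Item]:
    """Keep items with a tag in every included category and none in any excluded one."""
    include = list(include)
    exclude = list(exclude)
    result = []
    for item in items:
        categories = set(item.tags.values())
        has_all_included = all(c in categories for c in include)
        has_no_excluded = not any(c in categories for c in exclude)
        if has_all_included and has_no_excluded:
            result.append(item)
    return result
```

With empty lists, every item passes, which matches a search that does not restrict by category.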
Eike Kettner
cf3e051e83 Fix load more button 2020-08-06 00:49:15 +02:00
Eike Kettner
dfbbcdf73c Allow only one horizontal form being open 2020-08-05 23:11:21 +02:00
Eike Kettner
082f468155 Use an icon menu for the edit menu top bar 2020-08-05 22:43:04 +02:00
Eike Kettner
baa25d0f2f Allow to set item notes below or above the files 2020-08-05 22:43:04 +02:00
Eike Kettner
0453494cc6 Make notes below the files view and always visible
It looks similar to GitHub's readme. If there are no notes, the form
is displayed.
2020-08-05 22:41:08 +02:00
Eike Kettner
1662e1e2c8 Split ItemDetail file into multiple files due to its size 2020-08-05 17:57:45 +02:00
Eike Kettner
08f953dd52 Display item notes in card view if configured
The user can set a maximum length of the item notes to display in each
card. If set to 0, it is hidden.
2020-08-05 00:09:44 +02:00
Eike Kettner
09d74b7e80 Return item notes with search results
To keep the response from growing too large, an admin can define a
limit on how much to return.
2020-08-05 00:09:37 +02:00
Eike Kettner
dbd27057d1 Improve source view and add qrcode for urls
The qr-code for urls is added so that these urls are easy to copy into
a phone. Buttons for copying them to the clipboard have also been
added.
2020-08-03 23:58:41 +02:00
Eike Kettner
ed8f16fe73 Add a qr-code for source urls 2020-08-03 18:27:13 +02:00
Eike Kettner
deacd8e9f6 Set version to 0.10.0-SNAPSHOT 2020-08-01 19:03:32 +02:00
Eike Kettner
2664b3ddb2 Set version to 0.9.0 2020-08-01 16:09:24 +02:00
Eike Kettner
45b0deeced Print solr url on start
This is useful info to see which url has been selected, same as db
connection.
2020-08-01 15:59:14 +02:00
Eike Kettner
1fc57fc2b2 Set default value for min-text-len to 500
This value is used to decide whether to try OCR or not. If the text is
shorter than this value, OCR is run and both results are compared. It
was set to 10, which is just one or two words. Since docspell deals
with documents, this value is too low.
2020-08-01 15:46:00 +02:00
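A minimal sketch of the decision described in this commit message. The function name, the `run_ocr` callback, and in particular the rule for comparing the two results (keep the longer text) are assumptions made for illustration; the commit message does not say how the results are compared.

```python
from typing import Callable

# Default taken from the commit message above.
MIN_TEXT_LEN = 500


def extract_text(
    embedded_text: str,
    run_ocr: Callable[[], str],
    min_text_len: int = MIN_TEXT_LEN,
) -> str:
    """Return the embedded text, falling back to OCR when it is too short."""
    stripped = embedded_text.strip()
    if len(stripped) >= min_text_len:
        # Enough text was found; the expensive OCR step is skipped.
        return stripped
    ocr_text = run_ocr().strip()
    # Assumption: "compared" is modelled here as keeping the longer result.
    return ocr_text if len(ocr_text) > len(stripped) else stripped
```

With the old default of 10, a PDF containing only a short embedded stamp or header would have skipped OCR entirely, which is the case the new default avoids.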
Eike Kettner
a06d20a479 Remove duplicate results from index-only search 2020-08-01 15:46:00 +02:00
Eike Kettner
b4e11a7264 Fixes a race condition when initializing the calendar-event field
The problem was that the field executes a request to validate its
state. This was initiated at the same time for two values, so it was
undetermined which response would come back first.
2020-08-01 11:42:01 +02:00
mergify[bot]
f95f01759b
Merge pull request #198 from eikek/default-search
Default search
2020-07-31 23:28:23 +00:00
Eike Kettner
0599176ae8 Update scala to 2.13.3 2020-08-01 01:03:43 +02:00
Eike Kettner
46b784cc33 Simplify search bar and menu
The option "contents" has been removed from the search bar. This field
is not intended to be used alone, but rather in conjunction with other
fields. Otherwise it may be really slow on large databases.

The "name" option has been removed from the search menu. It offers
nothing beyond the "Names" field, which searches more fields,
including item names.
2020-08-01 00:26:41 +02:00
mergify[bot]
5bf302a40e
Merge pull request #196 from eikek/website
Website
2020-07-31 20:51:38 +00:00
Eike Kettner
808a5a3c94 Remove old site 2020-07-31 01:28:09 +02:00
mergify[bot]
2f5036231c
Merge pull request #194 from eikek/sort-tag-list
Sort tag list by count
2020-07-30 22:23:42 +00:00
Eike Kettner
091ded50cb Sort tag list by count
It was displayed in some random order. Now the most used tag is first.
2020-07-31 00:14:25 +02:00
Eike Kettner
79eb7b4d66 Fix datepicker position for action input fields
Either the width and appearance must be changed to match those of a
`ui action input`, or the position must be fixed as done here. It is
not correctly positioned because the `ui input` class uses a flex layout.
2020-07-31 00:02:57 +02:00
Eike Kettner
f8c6f79b10 Initial website 2020-07-30 20:33:26 +02:00
Eike Kettner
cec4948710 Add pdf meta data to extracted text to add it to full-text index 2020-07-19 01:07:49 +02:00
Eike Kettner
209c068436 Use keywords in pdfs to search for existing tags
During processing, keywords stored in PDF metadata are looked up in
the tag database, and any matching existing tags are associated with
the item.

See #175
2020-07-19 00:28:04 +02:00
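The lookup described in this commit message might look roughly like the following. The function name and the matching rule (case-insensitive exact match on the tag name) are assumptions; the commit message only states that keywords are looked up among existing tags and that no new tags are created.

```python
from typing import Iterable, List


def tags_from_keywords(keywords: Iterable[str], existing_tags: Iterable[str]) -> List[str]:
    """Return existing tags whose names match one of the PDF keywords."""
    # Assumption: matching ignores case and surrounding whitespace.
    by_name = {tag.lower(): tag for tag in existing_tags}
    found: List[str] = []
    for keyword in keywords:
        tag = by_name.get(keyword.strip().lower())
        # Unknown keywords are ignored; only existing tags are associated.
        if tag is not None and tag not in found:
            found.append(tag)
    return found
```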
Eike Kettner
da68405f9b Extract meta data from pdfs using pdfbox 2020-07-18 23:04:46 +02:00
Eike Kettner
bd20165d1a Use given folder-id when adding initial fts docs 2020-07-18 23:04:01 +02:00
Eike Kettner
3d49ceaab5 Use ocrmypdf tool to create pdf/a during conversion
- Use another external tool to convert pdf to pdf which also adds the
  extracted text as another layer into the pdf

- Although not used, the external conversion routine will now check
  for an existing text file that is named as the pdf file with extension
  `.txt`. If present it is included in the conversion result and will be
  used as the extracted text.

- Text extraction for pdf files now happens on the converted file,
  because it may already contain the text from the conversion step and
  thus avoids running OCR twice.

- Errors during conversion are not fatal; processing continues
  without a converted file.
2020-07-18 17:19:29 +02:00
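The check for an existing text file mentioned in the second bullet could be sketched as follows. The helper name is hypothetical, and the naming rule is an assumption: the sketch replaces the `.pdf` suffix with `.txt`, although the commit message could also be read as appending `.txt` to the full file name.

```python
from pathlib import Path
from typing import Optional


def read_sidecar_text(pdf_path: Path) -> Optional[str]:
    """Return the text stored next to the pdf file, or None if there is none."""
    # Assumption: "scan.pdf" is accompanied by "scan.txt".
    txt_path = pdf_path.with_suffix(".txt")
    if txt_path.is_file():
        return txt_path.read_text(encoding="utf-8")
    return None
```

A caller would use the returned text as the extracted text and run its own extraction only when the result is `None`.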
Eike Kettner
99210365ce Update documentation for folders 2020-07-17 00:02:25 +02:00
Eike Kettner
94089fd0b6 Fix decoding joex responses in JoexClient 2020-07-15 20:45:07 +02:00
Eike Kettner
c697501571 Add folders sql changeset for mariadb 2020-07-14 23:22:52 +02:00
Eike Kettner
25538d6a59 Allow to set a folder when importing mailboxes 2020-07-14 23:18:39 +02:00
Eike Kettner
225877a40c Show folder in item detail view 2020-07-14 23:18:39 +02:00
Eike Kettner
ca5b7b999f Update source form to specify folder 2020-07-14 23:18:39 +02:00
Eike Kettner
5b01c93711 Add a folder-id to item processing
This allows defining a folder when uploading files. All generated
items are associated with this folder on creation.
2020-07-14 23:18:39 +02:00
Eike Kettner
ec7f027b4e Fix postgres changeset for folders 2020-07-12 16:15:02 +02:00
Eike Kettner
259526a088 Organize imports 2020-07-12 13:51:52 +02:00
Eike Kettner
22fa1dba13 Apply folder restriction to fulltext only search
And update index when folder changes.
2020-07-12 13:50:45 +02:00
Eike Kettner
aeba4ba913 Refactor full-text migrations and add folder to solr schema 2020-07-12 13:50:14 +02:00
Eike Kettner
e387b5513f Remove items in non-member folders from sql search results 2020-07-11 22:25:56 +02:00
Eike Kettner
5b95fddf3d Make item queries depend on the account-id
Listing items now requires the user as well.
2020-07-11 21:54:51 +02:00
Eike Kettner
e66c501056 Extend dropdown to display additional option info
Use this to display folder information when setting the folder on an
item.
2020-07-11 17:56:08 +02:00
Eike Kettner
0df541f30a Allow to search by folders 2020-07-11 16:52:13 +02:00
Eike Kettner
86443e10a6 Set the folder of an item 2020-07-11 12:57:17 +02:00
Eike Kettner
5bde78083a Hide delete button when creating new folder 2020-07-11 11:54:23 +02:00
Eike Kettner
2ab0b5e222 Rename space -> folder 2020-07-11 11:54:23 +02:00
Eike Kettner
0365c1980a Show new data about spaces in web-ui 2020-07-11 01:30:29 +02:00
Eike Kettner
60a08fc786 Return member count and if current user is owner or member 2020-07-11 01:30:29 +02:00
Eike Kettner
ea4ab11195 Allow to only return owning spaces 2020-07-11 01:30:28 +02:00
Eike Kettner
6c304b4e7a Manage spaces in web-ui 2020-07-11 01:30:28 +02:00
Eike Kettner
752a94a9e2 Implement space operations 2020-07-11 01:30:28 +02:00
Eike Kettner
0e8c9b1819 Initial outline for managing spaces 2020-07-11 01:30:28 +02:00
Eike Kettner
d43e17d9fb Transport user-id to client 2020-07-11 01:30:28 +02:00
Eike Kettner
c12201c4a5 Add routes to manage spaces 2020-07-11 01:30:28 +02:00
Eike Kettner
7ec0fc2593 Add endpoints for managing spaces to openapi spec 2020-07-11 01:30:28 +02:00
Eike Kettner
13ad5e3219 Setup space entities 2020-07-11 01:30:28 +02:00
Eike Kettner
fadd21944f Set version to 0.9.0-SNAPSHOT 2020-06-29 21:04:15 +02:00
Eike Kettner
8998706598 Set version to 0.8.0 2020-06-29 20:37:52 +02:00
Eike Kettner
7b922fec94 Update documentation and fix changelog wording 2020-06-29 20:37:52 +02:00
Eike Kettner
347a029af8 Scalafix organize-imports 2020-06-28 21:20:47 +02:00
Eike Kettner
5bad157b9e Change link on home page 2020-06-28 19:34:28 +02:00
Eike Kettner
82104ff148 Update documentation and changelog 2020-06-28 14:45:04 +02:00
Eike Kettner
d3b3c6289b Prepare docker setup for fulltext search 2020-06-28 13:37:39 +02:00
Eike Kettner
8500d4d804 Extend consumedir.sh to work with integration endpoint
Now running one consumedir script can upload files to multiple
collectives separately.
2020-06-28 00:08:37 +02:00
Eike Kettner
41c0f70d3b Fix cancelling jobs
A request to cancel a job was not processed correctly. The cancelling
routine of a task must run, regardless of the (non-final) state. Now
it works like this: if a job is currently running, it is interrupted
and its cancel routine is invoked. It then enters "cancelled" state.
If it is stuck, it is loaded and only its cancel routine is run. If it
is in a final state or waiting, it is removed from the queue.
2020-06-26 23:08:27 +02:00
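The cancel behaviour described in this commit message amounts to a small dispatch on the job state. The sketch below only models that dispatch. The enum and step names are hypothetical, and the set of final states is an assumption apart from "cancelled", which the message names.

```python
from enum import Enum
from typing import List


class JobState(Enum):
    WAITING = "waiting"
    RUNNING = "running"
    STUCK = "stuck"
    # Assumed final states; only "cancelled" is named in the commit message.
    SUCCESS = "success"
    FAILED = "failed"
    CANCELLED = "cancelled"


def cancel_steps(state: JobState) -> List[str]:
    """Return the steps a cancel request triggers for a job in the given state."""
    if state is JobState.RUNNING:
        # A running job is interrupted, cleaned up, and marked cancelled.
        return ["interrupt", "run cancel routine", "mark cancelled"]
    if state is JobState.STUCK:
        # A stuck job is not executing, so it is loaded and only cleaned up.
        return ["load job", "run cancel routine"]
    # Waiting jobs and jobs in a final state are just removed from the queue.
    return ["remove from queue"]
```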
Eike Kettner
23477e34f9 Change columns from timestamp to datetime
In MariaDB the timestamp type has some properties that make it a poor
fit.
2020-06-26 17:07:00 +02:00
Eike Kettner
d79ae6233a Restrict proposals for due date
Avoid dates too far in the future.
2020-06-26 16:58:17 +02:00
Eike Kettner
91da3b149e Reducing default retries to 2
Many errors cannot be recovered from by retrying. There is currently
no way to distinguish these cases, so the default is lowered to avoid
long wait times until an item arrives.
2020-06-25 23:57:01 +02:00
mergify[bot]
50b3554c9a
Merge pull request #160 from eikek/fts
Fts
2020-06-25 21:19:42 +00:00
Eike Kettner
dc8f1a0387 Fix global re-index task to re-create the schema
Otherwise new instances could not be re-indexed.
2020-06-25 23:02:06 +02:00
Eike Kettner
4a41168bbb Allow a collective to re-index their data
If something goes wrong, this might be necessary.
2020-06-25 21:52:38 +02:00
Eike Kettner
2a98c2ca42 Fix openapi spec for joex 2020-06-25 08:43:02 +02:00
Eike Kettner
c81b92af6d Documentation updates 2020-06-25 01:36:26 +02:00
Eike Kettner
0ba1736bc8 Remove items/attachments from index on delete 2020-06-25 00:00:10 +02:00
Eike Kettner
64c96942a9 Fix deleting items that have sent mails 2020-06-24 23:47:58 +02:00
Eike Kettner
14213c4c27 Allow some solr query options in the config file 2020-06-24 23:37:20 +02:00
Eike Kettner
793f33b640 Update finding documentation 2020-06-24 23:37:20 +02:00
Eike Kettner
532caed84c Consistent logging of request/responses to solr
Using a middleware. Also add missing changesets for mariadb.
2020-06-24 21:25:46 +02:00
Eike Kettner
47697a8056 Set some logs to trace 2020-06-24 01:16:13 +02:00
Eike Kettner
7df77208fe Fix duplicate search results 2020-06-24 01:15:53 +02:00
Eike Kettner
8e0282c25f Indicate when the search-menu is not used 2020-06-24 01:15:41 +02:00
Eike Kettner
7d7460b1c9 Cleanup + hiding false errors from log 2020-06-24 00:23:22 +02:00
Eike Kettner
30937d4908 Set default max page size to 200 2020-06-24 00:04:10 +02:00
Eike Kettner
43b18db76a Don't scroll when loading more items 2020-06-24 00:03:58 +02:00
Eike Kettner
b8558d6837 Don't trigger search when fields are cleared 2020-06-24 00:03:17 +02:00
Eike Kettner
6846f2f46e Add new search-index route to web-ui 2020-06-24 00:03:17 +02:00
Eike Kettner
d5c9923a6d Add a route that only searches the full-text index
It returns the results in the same order as received from the index to
preserve the relevance ordering.
2020-06-24 00:03:17 +02:00
Eike Kettner
d9f0f05613 Refactor findItemsWithTags to be more generally useful 2020-06-23 21:27:01 +02:00
Eike Kettner
647911563e Fix paging when using full-text search 2020-06-23 01:44:52 +02:00
Eike Kettner
15c0fb4395 Merge branch 'master' into fts 2020-06-23 00:32:27 +02:00
Eike Kettner
e06a3f8fdd ScalafmtAll 2020-06-23 00:18:59 +02:00
Eike Kettner
a3e16e57de Display search highlighting in webapp 2020-06-23 00:17:29 +02:00