Eike Kettner
14a25fe23e
Fix serializing mediatype parameters
2020-03-27 21:50:06 +01:00
Eike Kettner
aed5dfaff6
Fix mimetype extractors
2020-03-27 21:49:55 +01:00
Eike Kettner
75405dbcba
Update documentation
2020-03-27 20:16:18 +01:00
Eike Kettner
16edf84752
Setup new site
2020-03-27 00:35:15 +01:00
Eike Kettner
9656ba62f4
scalafmtAll
2020-03-26 18:26:00 +01:00
Eike Kettner
09ea724c13
Store message-id of eml files
2020-03-25 22:00:51 +01:00
Eike Kettner
43efb4e6ba
Use doobie support from emil project
2020-03-24 23:40:29 +01:00
Eike Kettner
e305b46708
Extract tnef attachments and fix incomplete html
...
The wkhtmltopdf requires the content encoding set correctly in the
document.
2020-03-24 23:40:29 +01:00
Eike Kettner
0b80572664
Fix encodings for mails with non-utf8 html parts
2020-03-24 23:40:29 +01:00
Eike Kettner
cf7ccd572c
Improve handling encodings
...
HTML and text files are no longer assumed to be UTF-8. The encoding is now
detected, which may not work for all files. Default/fallback will be
utf-8.
There is still a problem with mails that contain html parts not in
utf8 encoding. The mail text is always returned as a string and the
original encoding is lost. Then the html is stored using utf-8 bytes,
but wkhtmltopdf reads it using latin1. It seems that the `--encoding`
setting doesn't override encoding provided by the document.
2020-03-23 22:51:28 +01:00
Eike Kettner
b265421a46
Allow to use the browser's pdf viewer
...
The viewerjs library has some limitations. Sometimes PDFs are quite
blurry and some content is displayed scrambled. Switching to the
browser's built-in PDF viewer (for chromium and firefox) fixes this. So
while on mobile the viewerjs is the only working viewer, for desktop
use it might be desirable to use the browser's built-in viewer instead.
2020-03-22 22:03:43 +01:00
Eike Kettner
75ead33652
Provide a download link to the original archive file
2020-03-22 21:48:49 +01:00
Eike Kettner
7e6eec9533
Include archive infos in item detail
2020-03-22 21:35:50 +01:00
Eike Kettner
cbc95b11e6
Add routes to retrieve the archive of an attachment
2020-03-22 21:21:49 +01:00
Eike Kettner
9a99c852a8
Fix typo in search menu
2020-03-22 21:08:01 +01:00
Eike Kettner
3703dce9a6
Update fs2 to 2.3.0
2020-03-20 22:47:09 +01:00
Eike Kettner
cba466ed47
Set item due date candidate
...
After processing, set the due date of an item to the first candidate.
The earliest due date is considered best match.
2020-03-20 22:39:09 +01:00
Eike Kettner
74a6cf1dd1
Remove unused migration directory
2020-03-19 22:43:41 +01:00
Eike Kettner
b1a1a2b837
Add archives to collective insights
2020-03-19 22:43:18 +01:00
Eike Kettner
d78bd4142c
Update documentation
2020-03-19 22:42:58 +01:00
Eike Kettner
439aaee27b
Search archives when looking for files via checksum
2020-03-19 22:42:48 +01:00
Eike Kettner
6b1156182c
Add support for eml (rfc822 email) files
2020-03-19 22:42:40 +01:00
Eike Kettner
4ed7a137f7
Add support for archive files
...
Each attachment is now first extracted into potentially multiple ones,
if it is recognized as an archive. This is the first step in
processing. The original archive file is also stored and the resulting
attachments are associated with their original archive.
First support is implemented for zip files.
2020-03-19 22:42:27 +01:00
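The extraction step described in the commit above can be sketched roughly as follows. This is only an illustration in Python, not the project's code; the `Attachment` type and the `extract_attachments` helper are hypothetical names introduced here, and only zip files are handled, as in the commit.

```python
import io
import zipfile
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Attachment:
    name: str
    data: bytes
    # Name of the archive this attachment was extracted from, if any.
    archive: Optional[str] = None


def extract_attachments(att: Attachment) -> List[Attachment]:
    """Expand a zip attachment into its entries; pass other files through."""
    buf = io.BytesIO(att.data)
    if not zipfile.is_zipfile(buf):
        # Not recognized as an archive: keep the attachment unchanged.
        return [att]
    buf.seek(0)
    extracted = []
    with zipfile.ZipFile(buf) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            # Each entry becomes its own attachment, linked to the archive.
            extracted.append(
                Attachment(info.filename, zf.read(info), archive=att.name)
            )
    return extracted
```

The caller would still store the original archive file itself; the `archive` field only records the association.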
Eike Kettner
10f3d5b7ed
Fix bug to select other attachments
2020-03-17 22:37:43 +01:00
Eike Kettner
f0449dd2ce
Properly initialize thread pools
2020-03-17 22:37:12 +01:00
Eike Kettner
00ca6b5697
Improve text analysis
...
- Search for consecutive labels
- Sort list of candidates by a weight
- Search for organizations using person labels
2020-03-17 22:34:50 +01:00
Eike Kettner
718e44a21c
Add cleanup jobs task
2020-03-09 20:24:00 +01:00
Eike Kettner
854a596da3
Integrate periodic tasks
...
The first use case for periodic tasks is the cleanup of expired
invitation keys. This is part of a house-keeping periodic task.
2020-03-08 22:49:49 +01:00
Eike Kettner
616c333fa5
Implement storage routines for periodic scheduler
2020-03-08 13:56:23 +01:00
Eike Kettner
1e598bd902
Sketch a scheduler for running periodic tasks
...
Periodic tasks are special in that they are usually kept around and
started based on a schedule. A new component checks periodic tasks and
submits them in the queue once they are due.
In order to avoid duplicate periodic jobs, the tracker of a job is
used to store the periodic job id. Each time a periodic task is due,
it is first checked if there is a job running (or queued) for this
task.
2020-03-08 12:55:03 +01:00
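The duplicate check described in the commit above might look like this in outline. It is a Python sketch under assumptions: the `Job` and `JobQueue` types, the state names, and `submit_periodic` are hypothetical and do not mirror the project's actual scheduler.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Job:
    id: int
    # The tracker stores the id of the periodic task that created the job.
    tracker: Optional[str]
    state: str = "waiting"  # assumed states: "waiting", "running", "done"


@dataclass
class JobQueue:
    jobs: List[Job] = field(default_factory=list)

    def active_for(self, tracker: str) -> bool:
        """True if a job for this tracker is queued or running."""
        return any(
            j.tracker == tracker and j.state in ("waiting", "running")
            for j in self.jobs
        )

    def submit_periodic(self, task_id: str) -> Optional[Job]:
        """Submit a job for a due periodic task unless one is still active."""
        if self.active_for(task_id):
            return None  # avoid a duplicate periodic job
        job = Job(id=len(self.jobs) + 1, tracker=task_id)
        self.jobs.append(job)
        return job
```

Submitting the same task twice returns `None` the second time; once the first job's state is set to `"done"`, a new job can be queued again.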
Eike Kettner
9b28858d06
Create a simple client for joex in its api module
...
This client can be used within the backend app and later in other
modules. The `OJoex` object is replaced with a better implementation
where the http client is initialized once on app start.
2020-03-03 23:07:49 +01:00
Eike Kettner
42c59179b8
Fix search by checksum to include source files
2020-03-02 20:56:32 +01:00
Eike Kettner
867b59ac10
Fix link in doc menu
2020-03-01 14:08:21 +01:00
Eike Kettner
d8bbcb1409
Fix front-page links for microsite
...
The links work while testing locally with jekyll. Must be checked at
the published site.
2020-03-01 09:45:38 +01:00
Eike Kettner
b7f2c051f4
Set next version to 0.4.0-SNAPSHOT
2020-02-28 21:19:01 +01:00
Eike Kettner
aa3b9258c4
Set version to 0.3.0
2020-02-28 20:52:39 +01:00
Eike Kettner
3f53779ae4
Change documentation side menu and front
2020-02-28 20:52:39 +01:00
Eike Kettner
ad8d64eded
Fix microsite and add changelog
2020-02-27 23:59:03 +01:00
Eike Kettner
1bb464b9ed
Extend tools/ds.sh to check for file existence
2020-02-27 20:03:46 +01:00
Eike Kettner
902fd63125
Fix initializing concerned equipment
2020-02-26 20:43:16 +01:00
Eike Kettner
2f87065b2e
sbt scalafmtAll
2020-02-25 20:55:00 +01:00
Eike Kettner
c8d090ae28
Remove small notes form field in favor of the new one
2020-02-24 22:34:32 +01:00
Eike Kettner
381de1e198
Show project version in the documentation
2020-02-24 20:59:15 +01:00
Eike Kettner
25c3f2b541
Add more explaining tooltips
2020-02-24 15:18:42 +01:00
Eike Kettner
478797e2a4
Add a help link to the main menu
2020-02-24 15:11:58 +01:00
Eike Kettner
36093c5d52
Add reverse proxy doc
2020-02-24 15:11:50 +01:00
Eike Kettner
cc16b0c024
Fix query to also work with mariadb
2020-02-24 13:34:54 +01:00
Eike Kettner
5f32eadaba
Fix dropdown in source create view
2020-02-23 23:01:48 +01:00
Eike Kettner
661cc3e65f
Fix deleting attachments (again)
2020-02-23 20:18:13 +01:00
Eike Kettner
d937e0501a
Add source files to collective insights
2020-02-23 20:17:53 +01:00
Eike Kettner
be8eacdbe9
Display full attachment name in title tooltip
2020-02-23 14:36:19 +01:00
Eike Kettner
1f431c3222
Make link to original file enabled if files are different
2020-02-23 14:33:22 +01:00
Eike Kettner
957073fe62
Return info about original files in item detail
...
This adds data to the current rest api.
2020-02-23 14:25:32 +01:00
Eike Kettner
ec419c7bfd
Adapt nix modules to new config
2020-02-22 12:40:56 +01:00
Eike Kettner
74a037887d
Fix deleting items and attachments to also remove the binary files
2020-02-22 00:54:55 +01:00
Eike Kettner
8cfecfb3dd
Update docs
2020-02-22 00:48:58 +01:00
Eike Kettner
98576a5fb5
Add link to original file
2020-02-20 22:40:27 +01:00
Eike Kettner
72fd3b1a25
Implement downloading original file
2020-02-20 22:33:57 +01:00
Eike Kettner
39809f9d05
Sketch route for retrieving original file
2020-02-20 22:12:27 +01:00
Eike Kettner
7fe8843893
Update documentation sites
2020-02-20 21:43:37 +01:00
Eike Kettner
3f316ab4d0
Update config file doc
2020-02-20 21:10:00 +01:00
Eike Kettner
fbe0c1aec5
Allow more chars for mimetype
2020-02-20 00:39:31 +01:00
Eike Kettner
97305d27ff
Integrate support for more files into processing and upload
...
The restriction that only pdf files can be uploaded is removed. All
files can now be uploaded. The processing may not process all. It is
still possible to restrict file uploads by types via a configuration.
2020-02-19 23:27:00 +01:00
Eike Kettner
9b1349734e
Convert some files to pdf
2020-02-19 02:03:10 +01:00
Eike Kettner
5869e2ee6e
Streamline extern-conv stdin/infile
2020-02-18 12:43:47 +01:00
Eike Kettner
0dcc00836b
Make logger configurable in system commands
2020-02-18 12:02:43 +01:00
Eike Kettner
bd605b8c94
Add first drafts for converting
2020-02-18 01:31:22 +01:00
Eike Kettner
c665c212a0
Early draft for running wkhtmltopdf
2020-02-17 14:02:23 +01:00
Eike Kettner
e0682464b5
Configure pdf extraction; move Logger and DataType to common
2020-02-17 14:01:36 +01:00
Eike Kettner
3d615181e0
Early draft for text extraction
2020-02-17 01:57:22 +01:00
Eike Kettner
8143a4edcc
Adding extraction primitives
2020-02-16 21:37:26 +01:00
Eike Kettner
851ee7ef0f
Reorganize processing code
...
Use separate modules for
- text extraction
- conversion to pdf
- text analysis
2020-02-15 21:25:25 +01:00
Eike Kettner
919381be1e
More research on how to create pdfs from other files
2020-02-15 13:57:21 +01:00
Eike Kettner
3deba44282
Rename example files
2020-02-15 12:52:24 +01:00
Eike Kettner
1309c8b7fa
Move mimetype detection to docspell-files
2020-02-14 22:06:18 +01:00
Eike Kettner
5c3d2b2e28
Rename example-files to files
2020-02-14 11:14:09 +01:00
Eike Kettner
bf9bf25502
Rename example files
2020-02-14 11:10:54 +01:00
Eike Kettner
569aae3038
Add example files into its own project
...
The text and convert module can use them in their tests.
2020-02-11 22:46:23 +01:00
Eike Kettner
2c0425433e
Move File class to common module
2020-02-11 22:42:04 +01:00
Eike Kettner
3026f199f7
Some research on pdf conversion
2020-02-11 22:41:44 +01:00
Eike Kettner
ce22b727b1
Add new convert module and sketch its integration
2020-02-11 00:33:52 +01:00
Eike Kettner
3be90d64d5
Move SystemCommand to common module
2020-02-10 22:23:06 +01:00
Eike Kettner
ba3865ef5e
Starting to support more file types
...
First, files are converted to PDF for archiving. It is also easier
to create a preview. This is done via the `ConvertPdf` processing
task (which is not yet implemented).
Text extraction then tries first with the original file. If that
fails, OCR is done on the (potentially) converted pdf file.
To not lose information of the original file, it is saved using the
table `attachment_source`. If the original file is already a pdf, or
the conversion did not succeed, the `attachment` and
`attachment_source` record point to the same file.
2020-02-10 12:42:45 +01:00
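The relation between the `attachment` and `attachment_source` records described in the commit above can be sketched as follows. This is an illustrative Python outline only; `StoredFile`, `AttachmentRecords`, `make_records` and the `convert` callback are hypothetical names, not part of the project.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class StoredFile:
    file_id: str
    mimetype: str


@dataclass(frozen=True)
class AttachmentRecords:
    attachment: StoredFile         # what is archived and previewed
    attachment_source: StoredFile  # the file as originally uploaded


def make_records(
    original: StoredFile,
    convert: Callable[[StoredFile], Optional[StoredFile]],
) -> AttachmentRecords:
    """Pair the original file with its PDF conversion, if there is one."""
    if original.mimetype == "application/pdf":
        # Already a pdf: both records point to the same file.
        return AttachmentRecords(original, original)
    converted = convert(original)
    if converted is None:
        # Conversion did not succeed: fall back to the original file.
        return AttachmentRecords(original, original)
    return AttachmentRecords(converted, original)
```

Here `convert` returning `None` stands in for a failed conversion; how failure is actually signalled is an assumption of this sketch.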
Eike Kettner
5c37efeaba
Apply scalafmt to all files
2020-02-09 01:54:26 +01:00
Eike Kettner
533396d386
Using the new preview route to show the attachment in webui
2020-02-08 18:02:31 +01:00
Eike Kettner
8908ad2561
Add attachment preview url based on ViewerJS
...
The viewerJS library can display PDF files easily using pdfjs. Another
attachment route redirects to the viewerjs application to display the
current attachment.
The attachment responses have been improved in that now the response
headers are added to all responses. Additionally, a HEAD route has been
added to support the viewerJS application.
2020-02-08 18:02:31 +01:00
Eike Kettner
e1826f39ac
Disable revolver plugin on non-app projects
...
This allows typing `reStart` in the root sbt project to start both
applications.
2020-02-08 18:02:31 +01:00
Eike Kettner
9b66604b96
Include item notes in search
2020-02-08 13:39:06 +01:00
Eike Kettner
d2edddd238
Show attachment meta data in ui
...
Allow viewing the extracted text and results from text analysis of an
attachment.
2020-02-08 12:23:59 +01:00
Eike Kettner
f8aa5c28ac
Update http4s to 0.21.0-RC3, fs2 to 2.2.2
2020-02-04 22:14:18 +01:00
Eike Kettner
c9c8672234
Fix line-breaks in mail body
2020-02-02 12:25:15 +01:00
Eike Kettner
518d6911f0
Edit notes in a larger area
2020-01-29 21:57:02 +01:00
Eike Kettner
c504a3df42
Fix elm-analyse issues
2020-01-29 20:56:14 +01:00
Eike Kettner
1c8a143623
Add a complete example for nixos
2020-01-24 23:12:08 +01:00
Eike Kettner
61bbdab8b5
nix: add user doc and pkg fixes
...
- Add user doc for how to use with nix/nixos
- fix potential collisions in packages if both are installed via
`nix-env`
2020-01-24 21:56:48 +01:00
Eike Kettner
23af8acff8
Add support for integrating into nix/nixos
2020-01-20 00:21:15 +01:00
Eike Kettner
2454f358b1
Add sbt task to create a zip for things in tools/
2020-01-19 20:32:52 +01:00
Eike Kettner
8f7e8c7800
Add redirect for root (/) to gui (/app)
2020-01-18 17:48:45 +01:00
Eike Kettner
1c13537f47
Set version to 0.3.0-SNAPSHOT
2020-01-12 15:36:09 +01:00
Eike Kettner
ab045b0ce6
Set version to 0.2.0
2020-01-12 13:58:04 +01:00