mirror of
https://github.com/TheAnachronism/docspell.git
synced 2025-06-21 09:58:26 +00:00
Improve handling encodings
HTML and text files are no longer assumed to be UTF-8. The encoding is now detected, which may not work for all files; the fallback is UTF-8.

There is still a problem with mails that contain HTML parts that are not UTF-8 encoded. The mail text is always returned as a string, so the original encoding is lost. The HTML is then stored as UTF-8 bytes, but wkhtmltopdf reads it as latin1. It seems that the `--encoding` setting doesn't override the encoding provided by the document.
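The remaining mail problem can be reproduced with the standard library alone. The sketch below is not code from this commit: `EncodingMismatch`, `store` and `read` are hypothetical names, and the latin1 reader only stands in for what wkhtmltopdf appears to do. It shows that bytes written as UTF-8 come back garbled when decoded as ISO-8859-1.

```scala
import java.nio.charset.{Charset, StandardCharsets}

// Hypothetical illustration, not part of docspell.
object EncodingMismatch {
  // The mail text arrives as a String, so the original charset is gone
  // and the HTML is written out as UTF-8 bytes.
  def store(html: String): Array[Byte] =
    html.getBytes(StandardCharsets.UTF_8)

  // A consumer decoding the stored bytes with a charset of its choosing.
  def read(bytes: Array[Byte], cs: Charset): String =
    new String(bytes, cs)

  def main(args: Array[String]): Unit = {
    val html = "<p>Caf\u00e9</p>" // contains one non-ASCII character (e with acute)
    val bytes = store(html)

    val asUtf8 = read(bytes, StandardCharsets.UTF_8)
    val asLatin1 = read(bytes, StandardCharsets.ISO_8859_1)

    println(asUtf8 == html)                // true: same charset on both sides
    println(asLatin1 == html)              // false: the accented letter is garbled
    println(asLatin1.length - html.length) // 1: two UTF-8 bytes became two chars
  }
}
```

Pure ASCII content survives the mismatch unchanged, which is why the problem only shows up in mails with non-ASCII characters.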
@@ -161,7 +161,8 @@ val files = project.in(file("modules/files")).
   settings(
     name := "docspell-files",
     libraryDependencies ++=
-      Dependencies.tika,
+      Dependencies.tika ++
+      Dependencies.icu4j,
     Test / sourceGenerators += Def.task {
       val base = (Test/resourceDirectory).value
       val files = (base ** (_.isFile)) pair sbt.io.Path.relativeTo(base)