mirror of
https://github.com/TheAnachronism/docspell.git
synced 2024-11-13 02:31:10 +00:00
bc1ec90b6e
This cuts memory use considerably when high-dpi images are provided in PDFs. The test file, scanned at 600 dpi into a 5.4M PDF, contains a 9900x13800 image. PDFBox loads this image into memory in order to scale it down, which easily results in out-of-memory errors (this image alone already requires ~400M). With subsampling, the size is reduced by a factor of at most 8. It is still recommended to avoid high-dpi image-only scans for text-based documents, or to increase the heap size for joex. |
||
---|---|---|
analysis/src | ||
backend/src/main/scala/docspell/backend | ||
common/src | ||
config/src | ||
convert/src | ||
extract/src | ||
files/src | ||
fts-client/src/main/scala/docspell/ftsclient | ||
fts-solr/src/main/scala/docspell/ftssolr | ||
joex/src | ||
joexapi/src/main | ||
jsonminiq/src | ||
notification | ||
oidc/src/main/scala/docspell/oidc | ||
pubsub | ||
query | ||
restapi/src | ||
restserver/src | ||
store/src | ||
totp/src | ||
webapp |
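
The memory figure quoted in the commit message above can be checked with a little arithmetic. The following is a minimal sketch, not code from docspell or PDFBox: the object `SubsamplingEstimate` and its methods are hypothetical names, the 3 bytes per pixel (8-bit RGB without alpha) is an assumption, and reading "a factor of 8" as applying to each axis is only one possible interpretation.

```scala
// Hypothetical helper for illustration; not part of docspell or PDFBox.
object SubsamplingEstimate {

  // Assumption: 8-bit RGB without an alpha channel, so 3 bytes per pixel.
  val BytesPerPixel: Long = 3L

  /** Bytes needed to hold a fully decoded width x height raster. */
  def decodedBytes(width: Int, height: Int, bytesPerPixel: Long = BytesPerPixel): Long =
    width.toLong * height.toLong * bytesPerPixel

  /** Dimensions after keeping every `factor`-th pixel along each axis, rounded up.
    * Assumption: the subsampling factor applies per axis.
    */
  def subsampled(width: Int, height: Int, factor: Int): (Int, Int) = {
    require(factor >= 1, "factor must be at least 1")
    ((width + factor - 1) / factor, (height + factor - 1) / factor)
  }
}
```

Under these assumptions, `SubsamplingEstimate.decodedBytes(9900, 13800)` gives 409,860,000 bytes, which is in line with the ~400M mentioned in the commit message. With a per-axis factor of 8, `subsampled(9900, 13800, 8)` gives 1238 x 1725 pixels, or about 6.4 million bytes when decoded.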