Initial website

2025-08-05 02:24:52 +00:00 · 2020-07-27 22:13:22 +02:00
parent dbd0f3ff97
commit f8c6f79b10
160 changed files with 8854 additions and 64 deletions
--- a/website/site/content/docs/dev/adr/0013_archive_files.md
+++ b/website/site/content/docs/dev/adr/0013_archive_files.md
@@ -0,0 +1,42 @@
+++
+title = "Archive Files"
+weight = 140
+++
+
+
+# Context and Problem Statement
+
+Docspell should support archive files, such as zip files, that contain
+the files that actually matter. It should extract their contents
+automatically.
+
+Since Docspell should never drop or modify user data, the archive file
+must be stored in the database, and it must be possible to download
+the file unmodified.
+
+On the other hand, the files inside the archive need to be
+text-analysed and converted to PDF files.
+
+# Decision Outcome
+
+There is currently a table `attachment_source` which holds references
+to "original" files. These are the files as uploaded by the user,
+before being converted to PDF. Archive files add a subtlety to this:
+for an archive, an `attachment_source` is the original (non-archive)
+file inside the archive.
+
+The archive file itself will be stored in a separate table `attachment_archive`.
+
+Example: uploading a `files.zip` ZIP file containing `report.jpg`:
+
+- `attachment_source`: report.jpg
+- `attachment`: report.pdf
+- `attachment_archive`: files.zip
+
+An archive may contain other archives. In that case, the inner archives
+are not saved. The archive file is extracted recursively until no
+known archive file is found.
+
+# Initial Support
+
+Initial support is implemented for ZIP and EML (e-mail) files.
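
The recursive extraction described under "Decision Outcome" can be illustrated with a small sketch. This is not Docspell's implementation: it is written in Python for brevity, handles only ZIP (not EML), and the names `extract_recursively` and `make_zip` are hypothetical. Under those assumptions, the bytes passed in at the top level correspond to what would go into `attachment_archive`, and each returned pair corresponds to one `attachment_source` entry. Nested archives are opened but do not appear in the result.

```python
import io
import zipfile


def extract_recursively(name, data):
    """Return (name, bytes) pairs for every non-archive file found in `data`.

    Nested ZIP archives are opened and their contents are returned, but the
    nested archives themselves are not part of the result.
    """
    # Anything that is not a known archive type is returned as-is.
    if not zipfile.is_zipfile(io.BytesIO(data)):
        return [(name, data)]

    found = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries carry no file content
            # Recurse: the entry may itself be an archive.
            found.extend(extract_recursively(info.filename, archive.read(info)))
    return found


def make_zip(entries):
    """Helper for the example: build an in-memory ZIP from a name -> bytes dict."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for entry_name, entry_data in entries.items():
            archive.writestr(entry_name, entry_data)
    return buffer.getvalue()


# Example: files.zip contains notes.txt and inner.zip, which contains report.jpg.
inner = make_zip({"report.jpg": b"jpeg bytes"})
outer = make_zip({"inner.zip": inner, "notes.txt": b"hello"})
sources = extract_recursively("files.zip", outer)
```

In this example, `sources` holds entries for `notes.txt` and `report.jpg` only; `inner.zip` is traversed but not kept, and `outer` stays available unmodified for download.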