docspell/modules/microsite/docs/dev/adr.md
Eike Kettner 3d49ceaab5 Use ocrmypdf tool to create pdf/a during conversion
- Use another external tool to convert PDF to PDF, which also adds the
  extracted text as another layer to the PDF

- Although not used, the external conversion routine will now check
  for an existing text file that is named as the pdf file with extension
  `.txt`. If present it is included in the conversion result and will be
  used as the extracted text.

- Text extraction for PDF files now happens on the converted file,
  because it may already contain the text from the conversion step;
  this avoids running OCR twice.

- Errors during conversion are not fatal; processing continues
  without a converted file.
2020-07-18 17:19:29 +02:00


---
layout: docs
title: ADRs
permalink: dev/adr
---
# ADR

Some early design details are recorded in the following
[ADRs](https://adr.github.io/) (architectural decision records):

- [0001 Components](adr/0001_components)
- [0002 Component Interaction](adr/0002_component_interaction)
- [0003 Encryption](adr/0003_encryption)
- [0004 ISO8601 vs Unix](adr/0004_iso8601vsEpoch)
- [0005 Job Executor](adr/0005_job-executor)
- [0006 More File Types](adr/0006_more-file-types)
- [0007 Convert HTML](adr/0007_convert_html_files)
- [0008 Convert Text](adr/0008_convert_plain_text)
- [0009 Convert Office Files](adr/0009_convert_office_docs)
- [0010 Convert Image Files](adr/0010_convert_image_files)
- [0011 Extract Text](adr/0011_extract_text)
- [0012 Periodic Tasks](adr/0012_periodic_tasks)
- [0013 Archive Files](adr/0013_archive_files)
- [0014 Full-Text Search](adr/0014_fulltext_search_engine)
- [0015 Convert PDF Files](adr/0015_convert_pdf_files)