mirror of
https://github.com/TheAnachronism/docspell.git
synced 2024-11-13 02:31:10 +00:00
3d49ceaab5
- Use another external tool to convert pdf to pdf, which also adds the extracted text as another layer to the pdf.
- Although not used, the external conversion routine now checks for an existing text file that is named like the pdf file with extension `.txt`. If present, it is included in the conversion result and used as the extracted text.
- Text extraction for pdf files now happens on the converted file, because it may already contain the text from the conversion step; this avoids running OCR twice.
- Errors during conversion are not fatal; processing continues without a converted file.
---
layout: docs
title: ADRs
permalink: dev/adr
---

# ADR

Some early information about certain details can be found in the
following [ADRs](https://adr.github.io/):

- [0001 Components](adr/0001_components)
- [0002 Component Interaction](adr/0002_component_interaction)
- [0003 Encryption](adr/0003_encryption)
- [0004 ISO8601 vs Unix](adr/0004_iso8601vsEpoch)
- [0005 Job Executor](adr/0005_job-executor)
- [0006 More File Types](adr/0006_more-file-types)
- [0007 Convert HTML](adr/0007_convert_html_files)
- [0008 Convert Text](adr/0008_convert_plain_text)
- [0009 Convert Office Files](adr/0009_convert_office_docs)
- [0010 Convert Image Files](adr/0010_convert_image_files)
- [0011 Extract Text](adr/0011_extract_text)
- [0012 Periodic Tasks](adr/0012_periodic_tasks)
- [0013 Archive Files](adr/0013_archive_files)
- [0014 Full-Text Search](adr/0014_fulltext_search_engine)
- [0015 Convert PDF files](adr/0015_convert_pdf_files)