The server accepts two lists of tag categories, one for inclusion and
one for exclusion. The include list selects items that have at least
one tag in each of its categories. The exclude list selects items that
have no tag in any of its categories.
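The include/exclude semantics can be sketched as a predicate over the
categories of an item's tags. This is only an illustration in Python,
not the server's actual query code; the function name and its
parameters are hypothetical.

```python
from typing import Iterable


def matches_categories(
    item_categories: Iterable[str],
    include: Iterable[str],
    exclude: Iterable[str],
) -> bool:
    """Check one item against the include and exclude category lists.

    item_categories holds the category of every tag on the item
    (hypothetical representation, chosen for this sketch).
    """
    cats = set(item_categories)
    # Include: at least one tag in *each* listed category.
    has_all_included = all(c in cats for c in include)
    # Exclude: no tag in *any* listed category.
    has_no_excluded = not any(c in cats for c in exclude)
    return has_all_included and has_no_excluded
```

With both lists empty, every item matches, since neither condition
restricts anything.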
This value is used to decide whether to try OCR. If the extracted text
is shorter than this value, OCR is run and both results are compared.
It was set to 10, which is just one or two words. Since docspell deals
with documents, this value is too low.
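A minimal sketch of the decision this threshold controls, written in
Python for illustration. The names `choose_text`, `min_length` and
`run_ocr` are hypothetical, and "keep the longer result" is an assumed
way of comparing the two texts, not necessarily what docspell does.

```python
from typing import Callable


def choose_text(stripped: str, min_length: int, run_ocr: Callable[[], str]) -> str:
    """Return the text to use for a document.

    stripped   -- text extracted directly from the file
    min_length -- threshold below which OCR is attempted
    run_ocr    -- callback that performs OCR and returns its text
    """
    if len(stripped.strip()) >= min_length:
        # Enough text was found; OCR is skipped entirely.
        return stripped
    ocr_text = run_ocr()
    # Assumption: the comparison simply prefers the longer result.
    if len(ocr_text.strip()) > len(stripped.strip()):
        return ocr_text
    return stripped
```

With a threshold of 10, a scanned page whose text layer holds only a
stray word or two would still skip OCR, which is why a larger value
fits documents better.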
The problem was that the field executes a request to validate its
state. This request was initiated at the same time for two values, so
it was undetermined which response would come back first.
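One common guard against this kind of race is to number the requests
and ignore any response that does not belong to the latest one. The
sketch below shows the idea in Python; it is not necessarily how the
problem was fixed here, and the class and method names are
hypothetical.

```python
class ValidatedField:
    """A field whose state is set by asynchronous validation responses."""

    def __init__(self) -> None:
        self._latest = 0   # id of the most recently started request
        self.state = None  # last accepted validation result

    def start_request(self) -> int:
        # Each new validation request gets a fresh, increasing id.
        self._latest += 1
        return self._latest

    def on_response(self, request_id: int, result: str) -> bool:
        # Responses to superseded requests are dropped, so arrival
        # order no longer decides the final state.
        if request_id != self._latest:
            return False
        self.state = result
        return True
```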
The option "contents" has been removed from the search bar. This field
is not intended to be used alone, but rather in conjunction with other
fields. Otherwise it may be really slow on large databases.
The "name" option has been removed from the search menu. It offers
nothing over the "Names" field, which searches more fields, including
item names.
Either the width and appearance must be changed to match that of a
`ui action input`, or the position must be fixed as done here. It is
not correctly positioned, because the `ui input` class uses a flex
layout.
- Use another external tool to convert pdf to pdf, which also adds the
extracted text as another layer to the pdf.
- Although not used yet, the external conversion routine now checks
for an existing text file that is named like the pdf file, with the
extension `.txt`. If present, it is included in the conversion result
and used as the extracted text.
- Text extraction for pdf files now happens on the converted file,
because it may already contain the text from the conversion step. This
avoids running OCR twice.
- Errors during conversion are not fatal; processing continues
without a converted file.
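
The conversion behaviour listed above can be sketched roughly as
follows. This is an illustration in Python, not the actual routine:
`convert`, `sidecar_text` and `run_tool` are hypothetical names, and
the sketch assumes the text file sits next to the converted pdf with
`.pdf` replaced by `.txt`.

```python
from pathlib import Path
from typing import Callable, Optional, Tuple


def sidecar_text(pdf: Path) -> Optional[str]:
    # Assumption: "out.pdf" is accompanied by "out.txt".
    txt = pdf.with_suffix(".txt")
    return txt.read_text(encoding="utf-8") if txt.is_file() else None


def convert(
    src: Path, out: Path, run_tool: Callable[[Path, Path], None]
) -> Tuple[Optional[Path], Optional[str]]:
    """Run the external tool; return (converted file, extracted text)."""
    try:
        run_tool(src, out)
    except Exception:
        # Conversion errors are not fatal: report "no converted file"
        # and let the caller continue with the original.
        return None, None
    if not out.is_file():
        return None, None
    # The text may be None; the caller then extracts text from the
    # converted file itself.
    return out, sidecar_text(out)
```

Returning `None` instead of raising keeps the rest of the processing
independent of whether the external tool is installed or succeeds.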