Update documentation

Eike Kettner
2021-01-25 08:50:46 +01:00
parent e9a4f904c9
commit 946204e809
16 changed files with 154 additions and 93 deletions

@@ -25,3 +25,8 @@ To get started, here are some quick links:
user provided [notes and unraid
templates](https://github.com/vakilando/unraid-docker-templates)
which can get you started. Thanks for providing these!
Every [component](@docs/intro/_index.md#components) (restserver, joex,
consumedir) can run on a different machine and in multiple instances.
Most of the time, running everything on one machine is sufficient, and
for simplicity the docker-compose setup reflects this variant.

@@ -27,7 +27,7 @@ result in long processing times for OCR and text analysis. The board
should provide 4G of RAM (like the current RPi4), especially if also a
database and solr are running next to it. The memory required by joex
depends on the config and document language. Please pick a value that
suits your setup from [here](@/docs/install/running.md#memory-usage).
suits your setup from [here](@/docs/configure/_index.md#memory-usage).
For boards like the RPi, it might be necessary to use
`nlp.mode=basic`, rather than `nlp.mode=full`. You should also set the
joex pool size to 1.

@@ -45,42 +45,14 @@ when opened up to the outside, it is recommended to lock this down.
{% end %}
## Memory Usage
## Memory
The memory requirements for the joex component depend on the document
language and the configuration for [file
processing](@/docs/configure/_index.md#file-processing). The
`nlp.mode` setting has significant impact, especially when your
documents are in German. Here are some rough numbers on jvm heap usage
(the same small jpeg file was used for all tries):
<table class="table is-hoverable is-striped">
<thead>
<tr><th>nlp.mode</th><th>English</th><th>German</th><th>French</th></tr>
</thead>
<tfoot>
</tfoot>
<tbody>
<tr><td>full</td><td>420M</td><td>950M</td><td>490M</td></tr>
<tr><td>basic</td><td>170M</td><td>380M</td><td>390M</td></tr>
</tbody>
</table>
When using `mode=full`, a heap setting of at least `-Xmx1400M` is
recommended. For `mode=basic` a heap setting of at least `-Xmx500M` is
recommended.
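As a rough illustration of the recommendation above, the following Python sketch maps an `nlp.mode` value to the minimum heap flag named in the text. The helper `heap_flag` and the dictionary are hypothetical names used only for this example; they are not part of docspell.

```python
# Minimum recommended JVM heap (in MB) per nlp.mode, taken from the text above.
RECOMMENDED_HEAP_MB = {"full": 1400, "basic": 500}


def heap_flag(nlp_mode: str) -> str:
    """Return the -Xmx flag recommended for the given nlp.mode.

    Hypothetical helper for illustration; only "full" and "basic"
    have a recommendation in the text.
    """
    try:
        megabytes = RECOMMENDED_HEAP_MB[nlp_mode]
    except KeyError:
        raise ValueError(f"no recommendation for nlp.mode={nlp_mode!r}") from None
    return f"-Xmx{megabytes}M"


if __name__ == "__main__":
    for mode in ("full", "basic"):
        print(mode, heap_flag(mode))
```

These values are lower bounds from the text, so a real setup may well need more heap depending on document size and language.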
Other languages can't use these two modes and therefore need less
memory, though the results are not as good. For those you can use a
smaller heap.
More details about these modes can be found
[here](@/docs/joex/file-processing.md#text-analysis).
The restserver component is very lightweight; the defaults are
sufficient here.
Using the options below you can define how much memory the JVM process
is able to use. You may need to adapt this depending on the usage
scenario and the configured text analysis features.
Please have a look at the corresponding [configuration
section](@/docs/configure/_index.md#memory-usage).
## Options