mirror of
https://github.com/TheAnachronism/docspell.git
synced 2025-06-22 02:18:26 +00:00
Updating stanford corenlp to 4.3.2; adding more languages
Models for Spanish are available and have now been added. Hungarian has also been added to the list of supported languages (mainly for tesseract; there are no NLP models for it).
@@ -147,11 +147,11 @@ experience. The features of text analysis strongly depend on the
 language. Docspell uses the [Stanford NLP
 Library](https://nlp.stanford.edu/software/) for its great machine
 learning algorithms. Some of them, like certain NLP features, are only
-available for some languages – namely German, English and French. The
-reason is that the required statistical models are not available for
-other languages. However, docspell can still run other algorithms for
-the other languages, like classification and custom rules based on the
-address book.
+available for some languages – namely German, English, French and
+Spanish. The reason is that the required statistical models are not
+available for other languages. However, docspell can still run other
+algorithms for the other languages, like classification and custom
+rules based on the address book.
 
 More information about file processing and text analysis can be found
 [here](@/docs/joex/file-processing.md#text-analysis).