Initial impl of a text classifier based on stanford-nlp

Eike Kettner
2020-08-31 22:35:27 +02:00
parent 8c4f2e702b
commit 0c97b4ef76
16 changed files with 376 additions and 18 deletions

@@ -298,7 +298,7 @@ docspell.joex {
# These settings are used to configure the classifier. If
# multiple are given, they are all tried and the "best" is
# chosen at the end. See
# https://nlp.stanford.edu/wiki/Software/Classifier/20_Newsgroups
# https://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/classify/ColumnDataClassifier.html
# for more info about these settings. The settings are almost
# identical to them, as they yielded best results with *my*
# dataset.