Commit Graph

8 Commits

Author SHA1 Message Date
Eike Kettner
53c8d3031d Skip invalid dates find in texts
Fixes: #298
2020-10-02 22:37:15 +02:00
Eike Kettner
c658677032 Autoformat 2020-09-09 00:29:32 +02:00
Eike Kettner
0c97b4ef76 Initial impl of a text classifier based on stanford-nlp 2020-09-02 18:28:14 +02:00
Eike Kettner
96d2f948f2 Use collective's addressbook to configure regexner 2020-08-24 14:40:52 +02:00
Eike Kettner
fdb46da26d Add french language and upgrade stanford-nlp to 4.0.0 2020-08-23 17:48:42 +02:00
Eike Kettner
9656ba62f4 scalafmtAll 2020-03-26 18:26:00 +01:00
Eike Kettner
8143a4edcc Adding extraction primitives 2020-02-16 21:37:26 +01:00
Eike Kettner
851ee7ef0f Reorganize processing code
Use separate modules for

- text extraction
- conversion to pdf
- text analysis
2020-02-15 21:25:25 +01:00