Eike Kettner
|
0c97b4ef76
|
Initial impl of a text classifier based on stanford-nlp
|
2020-09-02 18:28:14 +02:00 |
|
Eike Kettner
|
96d2f948f2
|
Use collective's addressbook to configure regexner
|
2020-08-24 14:40:52 +02:00 |
|
Eike Kettner
|
8628a0a8b3
|
Allow configuring stanford-ner and cache based on collective
|
2020-08-24 10:55:59 +02:00 |
|
Eike Kettner
|
fdb46da26d
|
Add french language and upgrade stanford-nlp to 4.0.0
|
2020-08-23 17:48:42 +02:00 |
|
Eike Kettner
|
347a029af8
|
Scalafix organize-imports
|
2020-06-28 21:20:47 +02:00 |
|
Eike Kettner
|
897d91475e
|
Update scalafmt-core to 2.6.0
|
2020-06-17 19:53:56 +02:00 |
|
Eike Kettner
|
075b665c68
|
Add some more tlds to look for
|
2020-05-24 11:48:49 +02:00 |
|
Eike Kettner
|
5e6ce1737c
|
Change recognizing dates with short years
Short years are now added to the current century (2000) so that date
strings like 12/26/11 result in 12/26/2011 and not 12/26/1911.
|
2020-05-17 11:58:51 +02:00 |
|
Eike Kettner
|
c41cdeefec
|
Update scalafmt to 2.5.1 + scalafmtAll
|
2020-05-04 23:53:57 +02:00 |
|
Eike Kettner
|
6a1297fc95
|
Add a limit for text analysis
|
2020-03-27 22:54:49 +01:00 |
|
Eike Kettner
|
9656ba62f4
|
scalafmtAll
|
2020-03-26 18:26:00 +01:00 |
|
Eike Kettner
|
2f87065b2e
|
sbt scalafmtAll
|
2020-02-25 20:55:00 +01:00 |
|
Eike Kettner
|
8143a4edcc
|
Adding extraction primitives
|
2020-02-16 21:37:26 +01:00 |
|
Eike Kettner
|
851ee7ef0f
|
Reorganize processing code
Use separate modules for
- text extraction
- conversion to pdf
- text analysis
|
2020-02-15 21:25:25 +01:00 |
|