mirror of
https://github.com/TheAnachronism/docspell.git
synced 2025-04-04 10:29:34 +00:00
Add some docs for postgres fts
This commit is contained in:
parent
21e13341e3
commit
3e87feff7b
@ -177,17 +177,37 @@ this account and setup the notification hooks in there - not in your
|
||||
normal account.
|
||||
|
||||
|
||||
## Full-Text Search: SOLR
|
||||
## Full-Text Search
|
||||
|
||||
[Apache SOLR](https://solr.apache.org) is used to provide the
|
||||
full-text search. Both docspell components must provide the same
|
||||
connection setup. This is defined in the `full-text-search.solr`
|
||||
Fulltext search is optional and provided by external systems. Two
backends are currently available: [Apache SOLR](https://solr.apache.org)
and [PostgreSQL's text
search](https://www.postgresql.org/docs/14/textsearch.html).

You can enable and configure the fulltext search backends as described
below and then choose the desired backend:
|
||||
|
||||
```conf
full-text-search {
  enabled = true
  # Which backend to use, either solr or postgresql
  backend = "solr"
  …
}
```
|
||||
|
||||
All docspell components must provide the same fulltext search
|
||||
configuration.
|
||||
|
||||
### SOLR
|
||||
|
||||
[Apache SOLR](https://solr.apache.org) can be used to provide the
full-text search. Its connection is defined in the
`full-text-search.solr` subsection:
|
||||
|
||||
``` bash
|
||||
...
|
||||
full-text-search {
|
||||
enabled = true
|
||||
...
|
||||
solr = {
|
||||
url = "http://localhost:8983/solr/docspell"
|
||||
@ -247,6 +267,79 @@ The solr index doesn't contain any new information, it can be
|
||||
regenerated any time using the above REST call. Thus it doesn't need
|
||||
to be backed up.
|
||||
|
||||
### PostgreSQL
|
||||
|
||||
PostgreSQL provides many additional features, one of which is [text
search](https://www.postgresql.org/docs/14/textsearch.html). Docspell
can use it to provide the fulltext search feature. This is especially
useful if PostgreSQL is already the primary database for Docspell.
|
||||
|
||||
You can use the same database or a separate connection. The fulltext
search creates a single table, `ftspsql_search`, that holds all
necessary data. You can exclude this table from backups, because it
can be recreated from the primary data at any time.
|
||||
|
||||
The configuration is placed inside `full-text-search`:
|
||||
|
||||
```conf
full-text-search {
  …
  postgresql = {
    use-default-connection = false

    jdbc {
      url = "jdbc:postgresql://server:5432/db"
      user = "pguser"
      password = ""
    }

    pg-config = {
    }
    pg-query-parser = "websearch_to_tsquery"
    pg-rank-normalization = [ 4 ]
  }
}
```
|
||||
|
||||
If PostgreSQL is your primary database, you can set
`use-default-connection` to `true` to use that same connection for the
fulltext search. If it is set to `false`, the subsequent `jdbc` block
defines the connection to the PostgreSQL database to use.
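
For instance, a setup that already runs on PostgreSQL might look like
the following minimal sketch. The backend value `postgresql` is taken
from the comment in the backend selection example; leaving out the
`jdbc` block is an assumption, based on it only being described for
the `false` case.

```conf
full-text-search {
  enabled = true
  backend = "postgresql"

  postgresql = {
    # Reuse the connection of the primary database.
    # Assumption: the jdbc block can be omitted in this case.
    use-default-connection = true
  }
}
```
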
|
||||
|
||||
The remaining settings tune PostgreSQL's text search feature. Please
see [the PostgreSQL
documentation](https://www.postgresql.org/docs/14/textsearch.html) for
all the details.
|
||||
|
||||
- `pg-config`: this is an optional mapping from document languages as
  used in Docspell to a PostgreSQL text search configuration. Not all
  languages are equally well supported out of the box. You can create
  your own text search config in PostgreSQL and then define it in this
  map for your language. For example:

  ```conf
  pg-config = {
    english = "my-english"
    german = "my-german"
  }
  ```

  By default, the predefined configs are used for some languages; all
  others fall back to `simple`.

  *If you change this setting, you must re-index everything.*
|
||||
- `pg-query-parser`: the parser applied to the fulltext query. By
  default it is `websearch_to_tsquery`. (relevant [doc
  link](https://www.postgresql.org/docs/14/textsearch-controls.html#TEXTSEARCH-PARSING-QUERIES))
- `pg-rank-normalization`: this is used to tweak the rank calculation
  that affects the order of the elements returned from a query. It is
  an array of numbers out of `1`, `2`, `4`, `8`, `16` or `32`.
  (relevant [doc
  link](https://www.postgresql.org/docs/14/textsearch-controls.html#TEXTSEARCH-RANKING))
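
As a rough sketch of how the tuning settings fit together, the
fragment below switches the query parser and lists two normalization
flags. It assumes that Docspell accepts other PostgreSQL query-parsing
functions besides the default, and that the listed flags are combined
into the bitmask that PostgreSQL's ranking functions expect. Neither
is stated above, so treat this as an illustration and check the
reference configuration first.

```conf
full-text-search {
  postgresql = {
    # Assumption: other PostgreSQL parser functions, such as
    # plainto_tsquery, are accepted here as well.
    pg-query-parser = "plainto_tsquery"

    # Assumption: the listed flags are combined. In PostgreSQL,
    # 2 divides the rank by the document length and 4 divides it by
    # the mean harmonic distance between extents.
    pg-rank-normalization = [ 2, 4 ]
  }
}
```
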
|
||||
|
||||
|
||||
|
||||
## Bind
|
||||
|
||||
The host and port the http server binds to. This applies to both
|
||||
|
@ -110,7 +110,7 @@ Fulltext search is powered by [SOLR](https://solr.apache.org). You
|
||||
need to install solr and create a core for docspell. Then change the
|
||||
solr url for both components (restserver and joex) accordingly. See
|
||||
the relevant section in the [config
|
||||
page](@/docs/configure/_index.md#full-text-search-solr).
|
||||
page](@/docs/configure/_index.md#full-text-search).
|
||||
|
||||
|
||||
### Watching a directory
|
||||
|