diff --git a/tools/convert-all-pdfs.sh b/tools/convert-all-pdfs.sh
new file mode 100755
index 00000000..5e47e2e1
--- /dev/null
+++ b/tools/convert-all-pdfs.sh
@@ -0,0 +1,29 @@
+#!/usr/bin/env bash
+#
+# Simple script to authenticate with docspell and trigger the "convert
+# all pdf" route that submits a task to convert all pdf files using
+# ocrmypdf.
+
+set -e
+
+BASE_URL="${1:-http://localhost:7880}"
+LOGIN_URL="$BASE_URL/api/v1/open/auth/login"
+TRIGGER_URL="$BASE_URL/api/v1/sec/item/convertallpdfs"
+
+echo "Login to trigger converting all pdfs."
+echo "Using url: $BASE_URL"
+echo -n "Account: "
+read -r USER
+echo -n "Password: "
+read -r -s PASS
+echo
+
+auth=$(curl --fail -XPOST --silent --data-binary "{\"account\":\"$USER\", \"password\":\"$PASS\"}" "$LOGIN_URL")
+
+if [ "$(echo "$auth" | jq .success)" == "true" ]; then
+    echo "Login successful"
+    auth_token=$(echo "$auth" | jq -r .token)
+    curl --fail -XPOST -H "X-Docspell-Auth: $auth_token" "$TRIGGER_URL"
+else
+    echo "Login failed."
+fi
diff --git a/website/site/content/docs/joex/_index.md b/website/site/content/docs/joex/_index.md
index c9ac7fd7..bc0bd517 100644
--- a/website/site/content/docs/joex/_index.md
+++ b/website/site/content/docs/joex/_index.md
@@ -67,7 +67,7 @@ logged in.
 The relevant part of the config file regarding the scheduler is shown
 below with some explanations.
 
-```
+``` conf
 docspell.joex {
   # other settings left out for brevity
 
diff --git a/website/site/content/docs/tools/convert-all-pdf.md b/website/site/content/docs/tools/convert-all-pdf.md
new file mode 100644
index 00000000..a0b91aea
--- /dev/null
+++ b/website/site/content/docs/tools/convert-all-pdf.md
@@ -0,0 +1,46 @@
++++
+title = "Convert All PDFs"
+description = "Convert all PDF files using OCRmyPDF."
+weight = 60
++++
+
+# convert-all-pdfs.sh
+
+With version 0.9.0, support was added for another external tool,
+[OCRmyPDF](https://github.com/jbarlow83/OCRmyPDF), that can convert
+PDF files such that they contain the OCR-ed text layer. This tool is
+optional and can be disabled.
+
+In order to convert all previously processed files with this tool,
+there is an
+[endpoint](/openapi/docspell-openapi.html#api-Item-secItemConvertallpdfsPost)
+that submits a task to convert all PDF files not already converted for
+your collective.
+
+There is no UI part to trigger this route, so you need to use curl or
+the script `convert-all-pdfs.sh` in the `tools/` directory.
+
+
+# Usage
+
+```
+./convert-all-pdfs.sh [docspell-base-url]
+```
+
+For example, if docspell is at `http://localhost:7880`:
+
+```
+./convert-all-pdfs.sh http://localhost:7880
+```
+
+The script asks for your account name and password. It then logs in
+and triggers the endpoint described above. After this you should see
+a few tasks running.
+
+There will be one task per file to convert. All these tasks are
+submitted with a low priority, so files uploaded through the webapp or
+a [source](@/docs/webapp/uploading.md#anonymous-upload) with a high
+priority will be preferred, as [configured in the job
+executor](@/docs/joex/_index.md#scheduler-config). This avoids
+disturbing normal processing when many conversion tasks are being
+executed.
diff --git a/website/site/content/docs/webapp/sources-edit.png b/website/site/content/docs/webapp/sources-edit.png
new file mode 100644
index 00000000..23804991
Binary files /dev/null and b/website/site/content/docs/webapp/sources-edit.png differ
diff --git a/website/site/content/docs/webapp/uploading.md b/website/site/content/docs/webapp/uploading.md
index 4f729435..a8f37381 100644
--- a/website/site/content/docs/webapp/uploading.md
+++ b/website/site/content/docs/webapp/uploading.md
@@ -29,6 +29,8 @@ scripts. For this the next variant exists.
 
 It is also possible to upload files without authentication. This
 should make tools that interact with docspell much easier to write.
+The [Android Client App](@/docs/tools/android.md) uses these urls to
+upload files.
 
 Go to "Collective Settings" and then to the "Source" tab. A *Source*
 identifies an endpoint where files can be uploaded anonymously.
@@ -41,7 +43,7 @@ username is not visible.
 
 Example screenshot:
 
-{{ figure(file="sources-form.png") }}
+{{ figure(file="sources-edit.png") }}
 
 This example shows a source with name "test". Besides a description
 and a name that is only used for displaying purposes, a priority and a
@@ -58,25 +60,26 @@ The source endpoint defines two urls:
 - `/app/upload/`
 - `/api/v1/open/upload/item/`
 
+{{ figure(file="sources-form.png") }}
+
 The first points to a web page where everyone could upload files into
 your account. You could give this url to people for sending files
 directly into your docspell.
 
 The second url is the API url, which accepts the requests to upload
-files (it is used by the upload page, the first url).
+files. This second url can be used with the [Android Client
+App](@/docs/tools/android.md) to upload files.
 
-For example, the api url can be used to upload files with curl:
+Another example is to use curl for uploading files from the command
+line:
 
 ``` bash
 $ curl -XPOST -F file=@test.pdf http://192.168.1.95:7880/api/v1/open/upload/item/3H7hvJcDJuk-NrAW4zxsdfj-K6TMPyb6BGP-xKptVxUdqWa
 {"success":true,"message":"Files submitted."}
 ```
 
-You could add more `-F file=@/path/to/your/file.pdf` to upload
-multiple files (note, the `@` is required by curl, so it knows that
-the following is a file). There is a [script
-provided](@/docs/tools/ds.md) that uses this to upload files from the
-command line.
+There is a [script provided](@/docs/tools/ds.md) that uses curl to
+upload files from the command line more conveniently.
 
 When files are uploaded to a source endpoint, the items resulting
 from these uploads are marked with the name of the source. So you know