mirror of
https://github.com/TheAnachronism/docspell.git
synced 2024-11-13 02:31:10 +00:00
---
layout: docs
title: Convert HTML Files
---

# {{ page.title }}

## Context and Problem Statement

How can HTML documents be converted into a PDF file that looks as much
as possible like the original?

A Java-only solution would be preferable. However, if an external tool
produces a better result, then an external tool is fine, too.

Since Docspell is free software, the tools must also be free.


## Considered Options

* [pandoc](https://pandoc.org/) external command
* [wkhtmltopdf](https://wkhtmltopdf.org/) external command
* [Unoconv](https://github.com/unoconv/unoconv) external command

For comparison, this is the native (Firefox) view of the example file:

<div class="thumbnail">
<img src="./img/example-html-native.jpg" title="Native view of an HTML example file">
</div>

Note: the example HTML file is taken from
[sparksuite.com](https://www.sparksuite.com/open-source/invoice.html).

I downloaded the HTML file to disk together with its resources (using
*Save as...* in the browser).


### Pandoc

<div class="thumbnail">
<img src="./img/example-html-pandoc-latex.jpg" title="Pandoc (Latex) HTML->PDF">
</div>

<div class="thumbnail">
<img src="./img/example-html-pandoc-html.jpg" title="Pandoc (html) HTML->PDF">
</div>

The version using the `context` pdf-engine is not shown, since it looked
very similar to the LaTeX variant.
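
The Pandoc variants differ mainly in the PDF engine that Pandoc delegates to.
The exact commands used for the screenshots are not recorded here, so the
following is only a sketch of how such command lines could be built with
Pandoc's `--pdf-engine` option. The helper name `pandoc_command`, the variant
keys and the choice of `pdflatex` and `wkhtmltopdf` as engines are assumptions
made for illustration.

```python
from pathlib import Path
from typing import Dict, List

# Assumed mapping from the variants discussed in this section to pandoc
# arguments. The commands behind the screenshots are not part of this document.
PANDOC_VARIANTS: Dict[str, List[str]] = {
    "latex": ["--pdf-engine=pdflatex"],
    "context": ["--pdf-engine=context"],
    # Writing HTML first lets pandoc hand the rendering to an HTML-to-PDF tool.
    "html": ["-t", "html", "--pdf-engine=wkhtmltopdf"],
}


def pandoc_command(variant: str, html_file: Path, pdf_file: Path) -> List[str]:
    """Build (but do not run) a pandoc command line for one variant."""
    if variant not in PANDOC_VARIANTS:
        raise ValueError(f"unknown pandoc variant: {variant}")
    return ["pandoc", str(html_file), "-o", str(pdf_file), *PANDOC_VARIANTS[variant]]


if __name__ == "__main__":
    # Print one command line per variant for a hypothetical input file.
    for name in PANDOC_VARIANTS:
        cmd = pandoc_command(name, Path("invoice.html"), Path(f"invoice-{name}.pdf"))
        print(" ".join(cmd))
```

Each engine has to be installed separately, so only the printed commands whose
engine is available on the machine can actually be executed.
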


### wkhtmltopdf

<div class="thumbnail">
<img src="./img/example-html-wkhtmltopdf.jpg" title="wkhtmltopdf HTML->PDF">
</div>


### Unoconv

<div class="thumbnail">
<img src="./img/example-html-unoconv.jpg" title="Unoconv HTML->PDF">
</div>
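
For reference, a conversion like this one could be started with a command along
the following lines. This is a sketch, not the command used for the screenshot:
the helper name `unoconv_command` and the file names are hypothetical. Unoconv
drives a LibreOffice installation, which must be present for the command to
work.

```python
from pathlib import Path
from typing import List


def unoconv_command(html_file: Path, pdf_file: Path) -> List[str]:
    """Build (but do not run) an unoconv command line for HTML -> PDF."""
    # -f selects the output format, -o the output file.
    return ["unoconv", "-f", "pdf", "-o", str(pdf_file), str(html_file)]


if __name__ == "__main__":
    print(" ".join(unoconv_command(Path("invoice.html"), Path("invoice.pdf"))))
```
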


## Decision Outcome

wkhtmltopdf.

Of the options compared, its output is closest to the native view of
the example file.
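
Since wkhtmltopdf is an external command, the application has to run it as a
child process. The following is a minimal sketch of that idea, written in
Python for brevity; it is not Docspell's implementation. The function names,
the timeout value and the decision to return `None` when the binary is missing
are assumptions for illustration.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def wkhtmltopdf_command(html_file: Path, pdf_file: Path,
                        binary: str = "wkhtmltopdf") -> List[str]:
    """Build the basic command line: wkhtmltopdf <input> <output>."""
    return [binary, str(html_file), str(pdf_file)]


def convert_html(html_file: Path, pdf_file: Path,
                 timeout_seconds: float = 120.0) -> Optional[Path]:
    """Run wkhtmltopdf; return the PDF path, or None if it is not installed."""
    binary = shutil.which("wkhtmltopdf")
    if binary is None:
        # Hypothetical policy: let the caller decide how to handle a missing tool.
        return None
    # check=True raises CalledProcessError on a non-zero exit code;
    # timeout guards against a conversion that never finishes.
    subprocess.run(
        wkhtmltopdf_command(html_file, pdf_file, binary),
        check=True,
        timeout=timeout_seconds,
    )
    return pdf_file


if __name__ == "__main__":
    result = convert_html(Path("invoice.html"), Path("invoice.pdf"))
    print("wkhtmltopdf not found" if result is None else f"wrote {result}")
```

Keeping the command construction separate from the process call makes the
command line easy to inspect or log before anything is executed.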