Forum OpenACS Q&A: creating PDFs from OpenACS

Posted by Brian Fenton on
What is the current best practice for creating PDFs from OpenACS? My requirements: it must work on Windows and Linux, and it should be able to convert Microsoft Word, Excel and HTML documents. The ability to concatenate several documents into one PDF would also be good.

Any suggestions?

many thanks
Brian

Posted by Claudio Pasolini on
Hi Brian,

I'm using htmldoc to produce simple documents and trml2pdf (Python based) for quality printing.

I don't know if they run on Windows.
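
For anyone who wants to try the htmldoc route, here is a minimal sketch of calling it from a script. It assumes the `htmldoc` binary is on the PATH; the function names and file names are made up for illustration and are not part of htmldoc or OpenACS.

```python
import subprocess
from typing import List


def htmldoc_command(html_path: str, pdf_path: str) -> List[str]:
    """Build the htmldoc argument list for a single HTML page.

    Hypothetical helper: only the htmldoc flags themselves are real.
    """
    return [
        "htmldoc",
        "--webpage",     # treat the input as a plain web page, not a structured book
        "-t", "pdf",     # request PDF output explicitly
        "-f", pdf_path,  # write the result to this file
        html_path,
    ]


def html_to_pdf(html_path: str, pdf_path: str) -> None:
    """Run htmldoc; raises CalledProcessError if it exits with a non-zero status."""
    subprocess.run(htmldoc_command(html_path, pdf_path), check=True)
```

Calling `html_to_pdf("report.html", "report.pdf")` would then write the PDF next to the page. Keeping the command builder separate from the call makes it easy to log or inspect the exact command line.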

I recently tested the OpenOffice filters (with Java and JODConverter). They can convert any of odt, doc, txt, rtf and html into pdf, odt, doc, txt, rtf or html.

Unfortunately the rendering of the documents converted from html is awful, while the other conversions are quite good.

Posted by Brian Fenton on
Claudio,

many thanks for the pointers. The OpenOffice filters sound like the way to go, and I believe that with the correct templates it should be possible to clean up the converted HTML.

I wonder whether there is much overhead in having an OpenOffice installation sitting on the server?

Cheers!
Brian

Posted by Hamilton Chua on
I have had success using PrinceXML (http://www.princexml.com/) but it only seems to support converting HTML to PDF. No support for MS Word or Excel.

Nevertheless, it is the most accurate tool I've seen when it comes to translating the layout of an HTML page with CSS into a high-quality PDF document.

Posted by Brian Fenton on
Thanks Hamilton, that's good to know, although it doesn't look cheap!

Posted by Guillem Fabregó on
Hi,

After using htmldoc for some time to convert HTML to PDF, we have recently discovered wkhtmltopdf (http://code.google.com/p/wkhtmltopdf/), which provides CSS support.
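
A rough sketch of how wkhtmltopdf can be driven from a script, assuming the `wkhtmltopdf` binary is on the PATH. The helper names and the default page size are illustrative assumptions, not something taken from our setup.

```python
import subprocess
from typing import List


def wkhtmltopdf_command(source: str, pdf_path: str, page_size: str = "A4") -> List[str]:
    """Build the wkhtmltopdf argument list (hypothetical helper).

    `source` may be a local HTML file or a URL.
    """
    return [
        "wkhtmltopdf",
        "--page-size", page_size,  # paper size; options go before the input
        source,
        pdf_path,                  # the output file is always the last argument
    ]


def render(source: str, pdf_path: str) -> None:
    """Run wkhtmltopdf; raises CalledProcessError if it exits with a non-zero status."""
    subprocess.run(wkhtmltopdf_command(source, pdf_path), check=True)
```

Because the page is rendered by a browser engine, the page's own stylesheets are applied, which is the main difference from htmldoc.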

Posted by Brian Fenton on
To answer my own question: the solution we came up with is to use OpenOffice + PyODConverter to convert the documents. We then use ImageMagick (we were already using it in OpenACS, so it was a perfect fit for us) to convert images and to concatenate the PDFs.
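
In case it helps anyone, the pipeline described above could be sketched roughly like this. It is an illustration under assumptions, not our production code: it assumes `soffice`, `python` and ImageMagick's `convert` are on the PATH, that PyODConverter's `DocumentConverter.py` script is in the working directory, and that OpenOffice listens on port 8100 (the port PyODConverter uses by default). All function names are hypothetical.

```python
import subprocess
from typing import List, Sequence


def soffice_command(port: int = 8100) -> List[str]:
    """Command to start a headless OpenOffice listener.

    Single-dash options are the OpenOffice.org style; LibreOffice uses
    double dashes instead (for example --headless).
    """
    return [
        "soffice",
        "-headless",
        "-nofirststartwizard",
        f"-accept=socket,host=localhost,port={port};urp;",
    ]


def pyod_command(src: str, dst: str, script: str = "DocumentConverter.py") -> List[str]:
    """Command to convert one document; the output format follows dst's extension."""
    return ["python", script, src, dst]


def concat_command(pdfs: Sequence[str], dst: str, density: int = 150) -> List[str]:
    """Command to join several PDFs into one with ImageMagick.

    -density must come before the inputs because it controls how they are read.
    """
    if not pdfs:
        raise ValueError("need at least one input PDF")
    return ["convert", "-density", str(density), *pdfs, dst]


def build_combined_pdf(documents: Sequence[str], dst: str) -> None:
    """Convert each document to PDF, then concatenate the results.

    Assumes the listener from soffice_command() is already running.
    """
    pdfs = []
    for doc in documents:
        pdf = doc.rsplit(".", 1)[0] + ".pdf"
        subprocess.run(pyod_command(doc, pdf), check=True)
        pdfs.append(pdf)
    subprocess.run(concat_command(pdfs, dst), check=True)
```

Two things to watch for: ImageMagick rasterises PDF pages (through Ghostscript) when it reads them, so the text in the combined file is no longer selectable, and on Windows the bare name `convert` can resolve to the system's own convert.exe, so the full path to the ImageMagick binary may be needed.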