In the previous article I introduced you to OpenTBS, a class that can create and edit OpenXML and OpenDocument documents in PHP, which is extremely useful for report generation. But what if you need to provide the documents to the end users as PDFs?
In this article I’ll demonstrate how you can convert OpenXML or (preferably) OpenDocument reports to PDF using LibreOffice or OpenOffice. All this using PHP.
- Necessary tools
- How does it work
- Why is unoconv necessary
- Configurations and important remarks
- I don’t have control over the server. What can I do?
Necessary tools
The solution I’m showing here works on both Windows and Linux servers as long as you have LibreOffice or OpenOffice installed. If you don’t have these tools installed and you can’t install them because you have no control over the server (for instance, because you are on shared hosting), check my suggestion at the end of the article.
Besides this, you’ll also need a Python script called unoconv (download the latest version on GitHub).
How does it work
The document is converted using the soffice executable that is provided with the installation of LibreOffice or OpenOffice. The execution of soffice is performed through the unoconv script and you only need to specify the document’s location.
$documentPath = 'documentos' . DIRECTORY_SEPARATOR . 'relatorio.odt';
// Path for the python executable on Linux
$pythonPath = '/usr/bin/python';
// If you're using Windows the path should be something like the following
// $pythonPath = 'C:\Program Files (x86)\LibreOffice 5\program\python';
// Path to the unoconv script
$unoconvPath = 'lib' . DIRECTORY_SEPARATOR . 'unoconv';
// Here we're creating the command that will be used to generate the PDF file. This command executes
// the unoconv script with Python and converts the specified OpenDocument file into a PDF file with
// the same name
$command = sprintf(
    '"%s" "%s" -f pdf "%s" 2>&1',
    $pythonPath,
    $unoconvPath,
    $documentPath
);
// Execute the command; $output receives the output lines (stderr included, because of 2>&1)
exec($command, $output, $return);
// Note: If the $output has some content, then there was an error trying to convert the file
if (is_array($output) && !empty($output[0])) {
    // json_encode() takes flags as its second argument, not a boolean, so none is passed here
    throw new Exception(json_encode(implode('; ', $output)));
}
As you can see, converting the document into PDF is relatively simple in terms of code. You just need to check whether the $output variable has any content and, if so, handle the error.
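The command above wraps each path in double quotes by hand, which can break if a path itself contains a quote. Below is a minimal sketch of an alternative, assuming you prefer to let PHP’s escapeshellarg() do the quoting. The function name buildUnoconvCommand is made up for this example and is not part of unoconv or OpenTBS.

```php
<?php
/**
 * Builds the conversion command, letting PHP quote every argument.
 * buildUnoconvCommand is a hypothetical helper name used only in this sketch.
 */
function buildUnoconvCommand(string $pythonPath, string $unoconvPath, string $documentPath): string
{
    // escapeshellarg() adds the surrounding quotes itself,
    // so the format string contains no quotes of its own
    return sprintf(
        '%s %s -f pdf %s 2>&1',
        escapeshellarg($pythonPath),
        escapeshellarg($unoconvPath),
        escapeshellarg($documentPath)
    );
}

// Same example paths as in the article's snippet
$command = buildUnoconvCommand(
    '/usr/bin/python',
    'lib' . DIRECTORY_SEPARATOR . 'unoconv',
    'documentos' . DIRECTORY_SEPARATOR . 'relatorio.odt'
);

echo $command, PHP_EOL;
```

The resulting string can be passed to exec() exactly like the $command built earlier.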
Why is unoconv necessary
As mentioned before, the soffice executable is responsible for converting the document into PDF. However, in this article, we’re using the third-party intermediary unoconv, which calls soffice for us. The reason we can’t call soffice directly is that the process ends right after being called and doesn’t wait for the conversion result.
In other words, using soffice directly doesn’t guarantee that the file has been converted by the time the exec() call returns, nor that the conversion succeeded. If you use unoconv, you’re making sure the PHP script waits until the conversion has ended. Hence, if you need to access the converted PDF immediately after the exec() call, you need to use unoconv.
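Since unoconv blocks until the conversion is done, the PDF can be looked for right after exec() returns. Here is a minimal sketch of that check, assuming the PDF is written next to the source document with the same base name (as the comment in the conversion code describes). expectedPdfPath is a hypothetical helper name.

```php
<?php
/**
 * Derives the path where the converted PDF is expected to appear.
 * Assumption: same directory and base name as the source document.
 */
function expectedPdfPath(string $documentPath): string
{
    $info = pathinfo($documentPath);

    return $info['dirname'] . DIRECTORY_SEPARATOR . $info['filename'] . '.pdf';
}

$documentPath = 'documentos' . DIRECTORY_SEPARATOR . 'relatorio.odt';
$pdfPath = expectedPdfPath($documentPath);

// Run this check right after the exec() call that performs the conversion
if (is_file($pdfPath)) {
    echo 'PDF ready: ', $pdfPath, PHP_EOL;
} else {
    echo 'PDF not found: ', $pdfPath, PHP_EOL;
}
```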
Configurations and important remarks
We’re calling exec with a third-party script (unoconv) and, as you may guess, there are 1001 things that could go wrong. So that you don’t make the same mistakes I did, here’s a list of things you absolutely need to be aware of:
- To ensure there are no incompatibilities between Python versions, you should always use the Python provided by the LibreOffice installation (even in cases where you have another Python installed).
If you don’t have LibreOffice installed, here’s how to install it (on CentOS):
yum install libreoffice-headless
yum install libreoffice
yum install libreoffice-pyuno
- For this solution to work, when unoconv is executed you can’t have any soffice.exe or soffice.bin processes running, otherwise the conversion will throw an error. Hence, you need to make sure no such process is running. Alternatively, you can solve this by running soffice as a service.
- You need to set execute permissions on unoconv and soffice to avoid issues when calling exec().
- This solution works for OpenXML (docx, xlsx) and for OpenDocument (odt, ods) documents. However, I recommend you use it with OpenDocument documents. The reason is simple: although soffice supports OpenXML files, it is much more compatible with OpenDocument documents. So, although this works with OpenXML documents, you might get unexpected results (the layout of the converted PDF might be broken).
- In this article I refer to LibreOffice more than OpenOffice because I have tried OpenOffice and, for some yet unknown reason, I couldn’t successfully convert the documents. There are, however, reports of people using OpenOffice successfully.
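The remarks on permissions and running processes can be turned into a small pre-flight check that runs before the conversion. What follows is only a sketch and makes several assumptions: it targets Linux, it relies on the pgrep utility being installed, and the function names findPreflightProblems and isSofficeRunning are hypothetical.

```php
<?php
/**
 * Returns a list of human-readable problems found before converting.
 * An empty array means the permission checks passed.
 */
function findPreflightProblems(string $pythonPath, string $unoconvPath): array
{
    $problems = [];

    if (!is_executable($pythonPath)) {
        $problems[] = "Python is missing or not executable: $pythonPath";
    }
    if (!is_executable($unoconvPath)) {
        $problems[] = "unoconv is missing or not executable: $unoconvPath";
    }

    return $problems;
}

/**
 * Tells whether a soffice.bin process is already running.
 * Assumption: Linux, with pgrep available on the PATH.
 */
function isSofficeRunning(): bool
{
    // pgrep exits with code 0 when at least one process matches the exact name
    exec('pgrep -x soffice.bin 2>/dev/null', $pids, $exitCode);

    return $exitCode === 0;
}

$problems = findPreflightProblems('/usr/bin/python', 'lib' . DIRECTORY_SEPARATOR . 'unoconv');
foreach ($problems as $problem) {
    echo $problem, PHP_EOL;
}
```

On Windows the process would be soffice.exe and a different tool than pgrep would be needed, so treat isSofficeRunning() as Linux-only.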
I don’t have control over the server. What can I do?
You don’t have enough permissions on the server to install LibreOffice or to change permissions? The best advice I can give is to get a VPS, which can be found at a relatively low cost (for instance on DigitalOcean), and create an API whose only purpose is to receive the OpenDocument files and return the converted PDFs.
Obviously this depends on the type of project you’re working on and its variables (more specifically in financial terms), but it’s quite simple to create a REST API and add an SSL Certificate (check our guide on how to create a free SSL certificate) to the server so the whole process is secure.
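To give an idea of what calling such an API could look like from the application side, here is a minimal client sketch using PHP’s cURL extension. Everything about the API itself is an assumption: the endpoint URL is a placeholder, the upload field name "document" is made up, and the server is assumed to answer with HTTP 200 and the raw PDF in the response body.

```php
<?php
/**
 * Sends a document to a remote conversion API and saves the returned PDF.
 * Hypothetical contract: POST with a "document" file field, PDF bytes in the body.
 */
function convertRemotely(string $apiUrl, string $documentPath, string $pdfPath): void
{
    if (!is_file($documentPath)) {
        throw new InvalidArgumentException("Document not found: $documentPath");
    }

    $curl = curl_init($apiUrl);
    curl_setopt_array($curl, [
        CURLOPT_POST => true,
        // CURLFile makes cURL send the file as a multipart/form-data upload
        CURLOPT_POSTFIELDS => ['document' => new CURLFile($documentPath)],
        // Return the response body instead of printing it
        CURLOPT_RETURNTRANSFER => true,
    ]);

    $body = curl_exec($curl);
    $status = curl_getinfo($curl, CURLINFO_HTTP_CODE);
    $error = curl_error($curl);

    if ($body === false || $status !== 200) {
        throw new RuntimeException("Conversion failed (HTTP $status): $error");
    }
    if (file_put_contents($pdfPath, $body) === false) {
        throw new RuntimeException("Could not write the PDF to $pdfPath");
    }
}

// Example call (the URL is a placeholder, not a real service):
// convertRemotely('https://converter.example.com/convert', 'documentos/relatorio.odt', 'documentos/relatorio.pdf');
```

On the VPS side, the endpoint would save the upload, run the same unoconv command shown earlier, and stream the resulting PDF back.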
But, as I’ve said, it all depends on the project’s restrictions. If you can’t use such a solution, I’m afraid I have no other to offer.
I hope these two articles have helped you create a powerful and flexible report management system for your web application. Drop me a line to let me know how it went.