Create PDF with Python — Server side report generation

The business world is moving towards paperless offices and full digitalization. Nevertheless, printing PDF and Excel reports is still a very common use case. In 2011 we started developing a B2B web application for simplifying time tracking and invoicing. That was when we had to decide which reporting tool to use.

Requirements

Our application features PDF and Excel exports of invoices, statistics, time reports and so on. Our use cases for the reporting tool included:
Create standardized reports of lists, time reports, statistics
Support text, lines, images and tables
Customize specific reports to our clients' individual needs (colors, images, element placement, ...)
Adapt layout and content details during runtime

Based on these considerations we were looking for something that includes a design tool allowing us to visually create and adapt reports whenever needed.

Java vs. Python

Our application is written in Python, so the obvious tool to use would have been ReportLab. However, while ReportLab provided the functionality we needed, it lacked a design tool. That’s why we started using JasperReports Server together with Jaspersoft Studio as the design tool. Even though our application had no need for data warehousing and wouldn’t use most of Jasper’s reporting capabilities for complex data visualisations, it best met our requirements at that time.

A win for Java (and Jasper) in the first round ☕

Time for a new tool

After some years of use, Jasper could no longer keep up with our changing requirements for a web application with a Python server backend. We decided to develop our own reporting framework: written in Python, lightweight, fast, optimized for small reports and, most importantly, equipped with a visual designer that can be embedded into the web application.

We identified five major server-side drawbacks of using the Java-based reporting framework Jasper. I'll discuss them and show how we addressed each one with our Python framework ReportBro. The client-side aspects are covered in the second part of this article, Browser-based report template design.

1 Installation

Jasper requires a Java Runtime Environment (JRE) plus a corresponding Tomcat server configuration. Besides having to acquire the know-how for its setup and optimization, this means reserving dedicated resources for running the reporting server. JasperReports Server is needed to generate reports with REST requests (see Request handling below).

ReportBro is installed from PyPI just like other Python packages:

$ pip install reportbro-lib

2 Resource allocation

Running and maintaining a resource-intensive Java server for the sole purpose of reporting is a major disadvantage when the main application runs in Python. Resources have to be allocated for the Java server, but they serve no purpose other than reporting.

ReportBro generates the reports directly in the server backend; everything runs in the same thread. There is no additional overhead for a separate reporting server.

3 Request handling

Users of web applications expect quick results and low latency. Report generation shouldn’t degrade overall application performance, especially when hundreds of relatively simple reports are created at once, each with a fairly manageable amount of data (like invoices or monthly analyses).

When generating a report with Jasper, we perform a REST request to the JasperReports Server in our controller function. We use the jasperclient lib as a convenient way to perform the request. Initially we faced some minor hurdles, e.g. encoding date parameters and escaping <, > and & in strings. In our web application it is possible to download multiple (up to 100) reports at once: each report is generated individually and merged into a single PDF file afterwards. This can take quite some time because each report has to be created by a separate REST request and may also require a database query (see Data processing below).
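The encoding hurdles mentioned above can be illustrated with a small sketch. It is not our actual client code: the helper name encode_param and the ISO 8601 date format are assumptions chosen for illustration, and only the Python standard library is used.

```python
from datetime import date
from xml.sax.saxutils import escape


def encode_param(value):
    """Convert a native Python value into a string that is safe for an XML request body."""
    if isinstance(value, date):
        # Dates have to be sent as text; ISO 8601 is an assumed wire format.
        return value.isoformat()
    # escape() replaces &, < and > with their XML entities.
    return escape(str(value))


# Hypothetical report parameters, as they exist in the application.
params = {"customer": "Smith & Sons <Ltd>", "invoice_date": date(2019, 9, 4)}

# Every parameter has to be converted before it can be put into the request.
encoded = {name: encode_param(value) for name, value in params.items()}
```

Every parameter type that the application uses needs such a conversion rule, which is the kind of glue code a separate reporting server tends to require.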

With ReportBro, we call a controller in our application to access a report. This way we can use our existing authentication and authorization, access data and perform calculations as needed. The reports are created directly in the application controller itself, which avoids a lot of overhead. We also do not have to convert any data in our server backend: we can simply pass native Python objects to ReportBro as report parameters.
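A minimal sketch of such a controller might look like the following. All names here (invoice_report_controller, Forbidden, the dictionary keys) are hypothetical, and render is a placeholder callable standing in for the call into the reporting library, so the example runs without any third-party package.

```python
from datetime import date
from decimal import Decimal


class Forbidden(Exception):
    """Raised when the current user may not access the requested report."""


def invoice_report_controller(user, invoice, template, render):
    """Create an invoice report inside the application controller.

    `render` is a hypothetical callable (template, data) -> bytes that
    stands in for the reporting library call.
    """
    # Reuse the application's own authorization instead of a separate server login.
    if invoice["client_id"] not in user["client_ids"]:
        raise Forbidden("no access to this invoice")

    # Native Python objects are passed as they are: no string encoding step.
    data = {
        "number": invoice["number"],
        "invoice_date": invoice["invoice_date"],
        "total": sum((item["amount"] for item in invoice["items"]), Decimal("0")),
        "items": invoice["items"],
    }
    return render(template, data)


def fake_render(template, data):
    """Placeholder renderer used only to make the sketch runnable."""
    return f"{template['name']}: {data['number']} / {data['total']}".encode()


user = {"client_ids": {7}}
invoice = {
    "client_id": 7,
    "number": "2019-001",
    "invoice_date": date(2019, 9, 4),
    "items": [{"amount": Decimal("100.00")}, {"amount": Decimal("50.50")}],
}
result = invoice_report_controller(user, invoice, {"name": "invoice"}, fake_render)
```

Passing the renderer in as an argument is only a convenience for this sketch; in a real controller the library call would typically be made directly.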

4 Data processing

Jasper is specialized in processing huge volumes of data. Therefore, the request for data is usually done within the report via a data source (e.g. a database query). Besides additional requests to the database, this also carries the risk of inconsistencies, because data is queried and processed separately in the application (for the web view) and in the report.

We eliminate the risk of data inconsistency by using the same functions for data requests both in the web application and for the report. This is possible because the controllers for both the web view and the report are executed in Python.
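The idea can be sketched as follows, with an in-memory list standing in for the database. The function names and the sample data are made up for illustration; the point is only that the web view and the report both call the same aggregation function.

```python
from datetime import date

# Stand-in for the application database (illustrative data only).
TIME_ENTRIES = [
    {"employee": "Ann", "day": date(2019, 9, 2), "hours": 7.5},
    {"employee": "Ann", "day": date(2019, 9, 3), "hours": 8.0},
    {"employee": "Bob", "day": date(2019, 9, 2), "hours": 6.0},
]


def hours_per_employee(entries):
    """Single source of truth for the aggregation used by view and report."""
    totals = {}
    for entry in entries:
        totals[entry["employee"]] = totals.get(entry["employee"], 0.0) + entry["hours"]
    return totals


def web_view_context():
    """Data handed to the web view template."""
    return {"rows": hours_per_employee(TIME_ENTRIES)}


def report_parameters():
    """Data handed to the report; uses the same function as the web view."""
    return {"rows": hours_per_employee(TIME_ENTRIES), "title": "Monthly hours"}
```

Because both functions delegate to hours_per_employee, a change in the calculation automatically applies to the web view and the report alike.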

5 Creation and maintenance of reports

We have a separate instance for each of our clients, which means we also have separate databases and customized reports. Therefore, each time we set up a new client we also have to configure the client reports. In Jasper this includes creating a data source (for the database query), a folder (for the customer), report objects (containing the report itself) and report units to access a report with the correct data source. We also have to update the report on the server every time the report is changed. These tasks are quite cumbersome. Of course, we automated the repetitive jobs as much as possible with a script, but many manual and error-prone steps were left (e.g. creating different versions of the same report).

In ReportBro the report template is passed to the lib when creating a report. There are no additional dependencies like data sources. Therefore, we can easily integrate report maintenance in our web application by storing the report template in our application database. This saves us from a lot of annoying administrative work, especially when we have to maintain or compare different versions of a report.
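As a rough sketch of this approach, the following stores versioned templates as JSON in a database table. The table layout, the function names and the template contents are assumptions made for illustration, and SQLite is used only because it ships with Python; it does not show the real template format.

```python
import json
import sqlite3


def create_schema(conn):
    """Create a hypothetical table holding one row per template version."""
    conn.execute(
        "CREATE TABLE report_template ("
        " name TEXT NOT NULL,"
        " version INTEGER NOT NULL,"
        " definition TEXT NOT NULL,"
        " PRIMARY KEY (name, version))"
    )


def save_template(conn, name, definition):
    """Store a new version of the template and return its version number."""
    row = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM report_template WHERE name = ?",
        (name,),
    ).fetchone()
    version = row[0] + 1
    conn.execute(
        "INSERT INTO report_template (name, version, definition) VALUES (?, ?, ?)",
        (name, version, json.dumps(definition)),
    )
    return version


def load_template(conn, name, version=None):
    """Load a specific version, or the latest one when version is None."""
    if version is None:
        query = ("SELECT definition FROM report_template WHERE name = ?"
                 " ORDER BY version DESC LIMIT 1")
        args = (name,)
    else:
        query = "SELECT definition FROM report_template WHERE name = ? AND version = ?"
        args = (name, version)
    row = conn.execute(query, args).fetchone()
    return json.loads(row[0]) if row else None


conn = sqlite3.connect(":memory:")
create_schema(conn)
save_template(conn, "invoice", {"title": "Invoice"})
save_template(conn, "invoice", {"title": "Invoice", "logo": True})
latest = load_template(conn, "invoice")
first = load_template(conn, "invoice", version=1)
```

Keeping every version as its own row makes it straightforward to compare versions or roll back to an earlier one.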

Figure: Report management within our web application

Python over Java

I'm happy to say that we were able to address our most important requirements with ReportBro:
Native Python
Lightweight and fast
Optimized for small reports
A visual designer which can be embedded into a web application (covered in the second part of this article)

A win for Python (and ReportBro) in the long run

This article is the first of a two-part series. It has been revised from its first version, published on Sep 4, 2019 on Medium.
The second article, Create PDF with Python — Browser-based report template design, covers client side aspects that led us to developing our own reporting framework.

By the way, ReportBro is also available for everyone as open source on GitHub (see Download page). Star us on GitHub if you like what you see!