It is best practice to deliver content in HTML as it generally is the most accessible format. When designing a site we generally try to avoid creating more PDFs. However, sometimes the convenience of a format such as PDF is preferable, especially when users need to download the details of an interaction. This can be the case when users have completed a form and stepped through a process. In these cases the site needs to generate a PDF dynamically.
Dynamically generating PDFs is therefore a feature our clients ask for fairly often. Many of our projects need to gather data from users and then summarise that data in a portable format. How to do this? As the title says, it is more easily said than done. There are many different approaches, and each one has its advantages and disadvantages. In this article I will walk you through some of the approaches we have tried and how successful they were. At the end of the article I give you my preferred approach to dynamically produced PDFs. As with most things, YMMV.
Print PDF
This is the classic and simplest approach. All the complexity of PDF generation can be sidestepped if the browser can do the heavy lifting for you. Users can easily use the Print option in the browser to download a PDF or to print it out. All that is required is a well-targeted print.css file to make the output look good when printed.
This approach is the most natural from a web development point of view, but it is not the most obvious for many users. Many users will not be familiar with this approach and it will therefore remain hidden from them.
The main problem with this approach is that the downloaded PDF is not accessible. The headings and other structural elements are not retained. The content is suitable for print, but it is not fully accessible. If accessibility is important to you, and it should be, the print PDF option is convenient but not perfect.
Pros
- Simple to implement
- Leverages the browser
Cons
- Not known by all users
- Output is not accessible.
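To give a feel for what a print stylesheet tends to contain, here is a minimal sketch written in JavaScript so that it matches the other examples in this article. The function name buildPrintCss and the selectors passed to it are hypothetical; the rules themselves are only a starting point, not the stylesheet we ship.

```javascript
// Sketch: build the kind of rules a print stylesheet might contain.
// `hiddenSelectors` lists page furniture to hide when printing; the
// selectors used in the usage note are made up for illustration.
function buildPrintCss(hiddenSelectors = [], pageMargin = '20mm') {
  const rules = [`@page { margin: ${pageMargin}; }`, '@media print {'];
  if (hiddenSelectors.length > 0) {
    // Hide navigation, sidebars and similar clutter on paper.
    rules.push(`  ${hiddenSelectors.join(', ')} { display: none !important; }`);
  }
  // Show link targets after the link text, since paper is not clickable.
  rules.push('  a[href]::after { content: " (" attr(href) ")"; }');
  // Try not to leave a heading stranded at the bottom of a page.
  rules.push('  h1, h2, h3 { break-after: avoid; }');
  rules.push('}');
  return rules.join('\n');
}
```

In a browser the result could be attached by creating a style element, setting its textContent to buildPrintCss(['nav', 'footer', '.sidebar']) and appending it to document.head. In practice a static print.css file does the same job; the function form is only used here to keep the example self-contained.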
Print PDF - IFrame
The next logical step for Print PDF is to use the same core functionality but wrap it up in a more obvious way for the user. The user can be presented with a download link on the page that opens the print dialog for just the selected content, avoiding the need to find the browser's Print option. The user still chooses Save as PDF in that dialog.
Technically, this is achieved by creating a new iframe in the DOM and writing the selected HTML into it. The content can then be printed by calling the print method on the iframe's window. This approach has the advantages of leveraging the browser and making things better for the user. However, as you might guess, the PDF will lack key accessibility features.
Pros
- Selectable text - You can select text in PDF and do copy and paste
- Full control - You decide exactly what gets printed and how it looks
- Clean output - No webpage clutter, just your content
- Formatting - Can format data specifically for printing
Cons
- Mobile - does not work in 100% of cases - User needs to have the PDF app installed and select PDF from the dropdown
- Slower - Creates iframe + 500ms delay every time
- More complex / Harder to maintain - CSS for printing adds another layer that needs to be maintained/updated
- Browser issues - Some browsers block iframe manipulation
- PDF problems - Print dialogue sometimes can't save to PDF properly (mobile device issues mentioned above etc.)
- No error handling - The User won't know if something goes wrong
Code example:
function printContent() {
  // Get the content you want to print
  const contentToPrint = document.querySelector('.content-to-print');
  if (!contentToPrint) {
    return;
  }
  const htmlContent = contentToPrint.innerHTML;

  // Remove the iframe left over from a previous click, if any
  document.getElementById('print-iframe')?.remove();

  // Create a hidden iframe to hold the printable document
  const printIframe = document.createElement('iframe');
  printIframe.id = 'print-iframe';
  printIframe.style.position = 'absolute';
  printIframe.style.top = '0';
  printIframe.style.left = '-1000px';
  printIframe.style.width = '0';
  printIframe.style.height = '0';
  printIframe.style.border = 'none';
  document.body.appendChild(printIframe);

  // Write HTML content to iframe
  const iframeDoc = printIframe.contentDocument;
  iframeDoc.open();
  iframeDoc.write(`
    <!DOCTYPE html>
    <html>
      <head>
        <meta charset="UTF-8">
        <title>Data Print</title>
        <style>
          /* PDF CSS styles belong here */
        </style>
      </head>
      <body>
        ${htmlContent}
      </body>
    </html>
  `);
  iframeDoc.close();

  // Wait for content to load, then print
  setTimeout(() => {
    printIframe.contentWindow.focus();
    printIframe.contentWindow.print();
  }, 500);
}

// Wait for DOM to be fully loaded before adding event listeners
document.addEventListener('DOMContentLoaded', function() {
  // Add click event listener to the button
  document.getElementById('btn').addEventListener('click', printContent);
});
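Two of the cons listed above, the fixed 500ms delay and the missing error handling, can be softened a little. The following is a sketch only, under a few assumptions: it uses the iframe's srcdoc attribute and load event instead of a timer, and it relies on the afterprint event firing on the iframe's window, which may not happen in every browser (mobile in particular). The function name printHtmlViaIframe and the doc parameter are mine; doc exists only so the function can be exercised outside a browser.

```javascript
// Sketch: print HTML through a hidden iframe, waiting for its load event
// instead of a fixed timeout, and removing the iframe afterwards.
// `doc` defaults to the page's document.
function printHtmlViaIframe(html, doc = globalThis.document) {
  return new Promise((resolve, reject) => {
    if (!html) {
      reject(new Error('Nothing to print'));
      return;
    }
    const iframe = doc.createElement('iframe');
    iframe.style.cssText =
      'position:absolute;left:-1000px;width:0;height:0;border:none';

    iframe.addEventListener('load', () => {
      const win = iframe.contentWindow;
      // Clean up once the print dialog has been closed.
      win.addEventListener('afterprint', () => {
        iframe.remove();
        resolve();
      });
      try {
        win.focus();
        win.print();
      } catch (err) {
        // Surface the failure to the caller instead of failing silently.
        iframe.remove();
        reject(err);
      }
    });

    // The load event fires once the iframe is attached and parsed.
    iframe.srcdoc = html;
    doc.body.appendChild(iframe);
  });
}
```

The caller can then attach a catch handler to the returned promise and show the user a message if printing could not be started. It does not solve the cases where the dialog opens but cannot save to PDF.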
mPDF - direct api
Most of our work is done with Drupal and PHP. Given this context, it is possible to use a PHP library server side to deliver the PDF to the user. One such library is mPDF, which is based on FPDF and HTML2FPDF with a number of enhancements.
Use mPDF if you cannot use a non-PHP approach to generate PDF files or if you want to leverage some of the benefits of mPDF over the browser approach. Advantages include color handling, pre-print, barcode support, headers and footers, page numbering and TOCs. There are complexities though. An HTML/CSS template tailored for mPDF might be necessary, leading to extra development work. If you are looking for state-of-the-art CSS support, mirroring existing HTML pages to PDF, using headless Chrome may be preferable.
Much of our work is on the GovCMS platform, where adding custom routes for PDF downloads is not possible. In those cases we generally opt for a frontend approach with JavaScript.
Pros
- HTML/CSS support - Converts web pages directly to PDF.
- Rich features - Headers, footers, watermarks, barcodes, TOCs
- Well documented - https://mpdf.github.io/
- Short code initialisation - (info in the code example)
Cons
- Dated - mPDF as a whole is quite dated software. Better alternatives are now available, albeit not written in PHP.
- A lot of open issues - https://github.com/mpdf/mpdf/issues
Code example:
// Assumes mPDF was installed with Composer
require_once __DIR__ . '/vendor/autoload.php';
$mpdf = new \Mpdf\Mpdf();
$mpdf->WriteHTML('<h1>Hello world!</h1>');
$mpdf->Output();
html2pdf
html2pdf is a JavaScript library that allows you to turn web content directly into a PDF for download. The most important thing to understand about html2pdf is that it doesn't actually translate HTML text into PDF text. Instead, it acts as a wrapper that chains two other powerful libraries together (html2canvas and jsPDF).
By far the biggest problem with this approach is that the text in the output cannot be copied and pasted. Content is rendered to a canvas and placed in the PDF as an image, which severely restricts the utility of the content once rendered out. The end result is convenient, but the accessibility is completely missing. This was almost the solution we were looking for, but it fails at this last hurdle.
Pros
- HTML/CSS support - Converts web pages directly to PDF.
- Short code initialisation - See code example below.
- Client-side rendering - Less heavy on the server
Cons
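- Issues with styling - current sitewide styles can skew the PDF, because the content is not isolated as it is in an iframe. This can be resolved by redirecting to an endpoint that loads only the PDF styles, generating the PDF there, and then redirecting back
- No selectable text - content is rendered to a canvas, so text in the PDF cannot be copied and pasted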
Code example:
const elementToPrint = document.getElementById('element-to-print');

// PDF generation options with default values
const opt = {
  margin: 8,
  filename: 'selected-items.pdf',
  image: {
    type: 'jpeg',
    quality: 0.98
  },
  html2canvas: {
    scale: 2,
    useCORS: true,
    letterRendering: true
  },
  jsPDF: {
    unit: 'mm',
    format: 'a4',
    orientation: 'portrait',
    compress: true
  }
};

html2pdf().set(opt).from(elementToPrint).save();
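The chaining of html2canvas and jsPDF described above can be made visible by spelling out the intermediate stages of html2pdf's worker API. This is a sketch to illustrate the pipeline, not something you need in production, since save() runs the missing stages for you. The wrapper name exportElementAsPdf and its html2pdfFactory parameter are my own; in a browser you would pass the global html2pdf function.

```javascript
// Sketch: the same export with html2pdf's intermediate stages spelled out.
// `html2pdfFactory` is the global `html2pdf` function in a browser; it is
// passed in so the function has no hidden dependency.
function exportElementAsPdf(html2pdfFactory, element, options) {
  return html2pdfFactory()
    .set(options)
    .from(element)
    .toCanvas() // html2canvas rasterises the element onto a canvas
    .toImg()    // the canvas is converted to an image
    .toPdf()    // jsPDF places that image onto the PDF pages
    .save();    // the browser downloads the file
}
```

Seeing the toImg step in the middle makes it clear why the text in the result cannot be selected: by the time jsPDF is involved, there is only an image left.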
jsPDF - direct API
jsPDF is a JavaScript library that takes a different approach. Rather than outputting a PDF based on HTML, it relies on the content being built up programmatically. This requires more attention to detail from the developer, as the PDF needs to be built up piece by piece. It is a good option for simpler PDFs without too much complexity.
The generated PDF is text-based, so the text can be copied and pasted. This is a big win for users who need to interact with the document once downloaded.
Pros:
- Client-side generation reduces server load
- No server dependencies
- Ideal for simple PDFs
- Selectable text - You can select text in PDF and copy and paste
Cons:
- Minimal formatting and layout options
- Issues with text formatting - it is tricky to keep longer text well formatted
- Browser compatibility inconsistencies
Code example:
function generatePDF() {
  const { jsPDF } = window.jspdf;
  const doc = new jsPDF();
  doc.text("Hello, this is a PDF generated with jsPDF!", 100, 100);
  doc.save("sample.pdf");
}
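Keeping longer text well formatted is listed as a con above. One way to approach it, sketched here under the assumption of jsPDF's default millimetre units, is to let jsPDF split the text to the page width with splitTextToSize and to add a page whenever the cursor runs past the bottom margin. The helper name writeWrappedText and its margin and lineHeight defaults are hypothetical values you would tune for your own font size.

```javascript
// Sketch: write a long string into a jsPDF document with wrapping and
// automatic page breaks. `doc` is a jsPDF instance.
function writeWrappedText(doc, text, { margin = 15, lineHeight = 7 } = {}) {
  const pageWidth = doc.internal.pageSize.getWidth();
  const pageHeight = doc.internal.pageSize.getHeight();

  // Let jsPDF break the text into lines that fit between the margins.
  const lines = doc.splitTextToSize(text, pageWidth - margin * 2);

  let y = margin;
  for (const line of lines) {
    // Start a new page when the next line would cross the bottom margin.
    if (y > pageHeight - margin) {
      doc.addPage();
      y = margin;
    }
    doc.text(line, margin, y);
    y += lineHeight;
  }
  return lines.length;
}
```

Called between new jsPDF() and doc.save(), this keeps paragraphs inside the page. Headings, lists and tables still have to be positioned by hand, which is why this approach suits simple documents best.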
Headless Chrome print
Headless Chrome is a tool that takes a different approach. Rather than building the PDF programmatically, it uses an invisible Chrome browser to print your HTML and CSS exactly as it appears on the web. This requires some more setup from the developer, as running background browser instances on a server can be tricky and resource-heavy. It is a great option for complex PDFs that rely on modern web styling.
The generated PDF is text-based so that the text can be copied and pasted. This is a big win for users who need to interact with the document once downloaded.
Pros:
- Useful to print the page as is
- All CSS options should be available
- Selectable text - You can select text in PDF and copy and paste
Cons:
- Tricky to set up
- Chrome is an external binary and does not understand Drupal's stream wrappers (like public:// or temporary://)
- Can be restricted on the server side - permission to execute the google-chrome binary needed
- Browser compatibility inconsistencies
CLI code example:
google-chrome --headless --disable-gpu --print-to-pdf https://www.chromestatus.com/
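If the command needs to be run from code rather than by hand, it can be wrapped in a small Node.js script. This is a sketch under assumptions: the function names buildChromeArgs and printUrlToPdf are mine, the binary is assumed to be available as google-chrome as in the command above, and the output path must be a real filesystem path, since Chrome cannot resolve stream wrappers.

```javascript
const { execFile } = require('node:child_process');

// Build the argument list for a headless Chrome PDF export.
// `--print-to-pdf=<path>` tells Chrome where to write the file.
function buildChromeArgs(url, outputPath) {
  return ['--headless', '--disable-gpu', `--print-to-pdf=${outputPath}`, url];
}

// Run Chrome and resolve with the output path once the process exits.
// `chromeBinary` defaults to the name used in the CLI example; the real
// name or path depends on the server.
function printUrlToPdf(url, outputPath, chromeBinary = 'google-chrome') {
  return new Promise((resolve, reject) => {
    execFile(chromeBinary, buildChromeArgs(url, outputPath), (error) => {
      if (error) {
        // Typically the binary is missing or may not be executed.
        reject(error);
        return;
      }
      resolve(outputPath);
    });
  });
}
```

Passing the arguments as an array to execFile, rather than building one shell string, avoids quoting problems when the URL contains characters such as ampersands.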
Accessible documents
The following methods do not preserve headings and structure, so please be careful if accessibility is a priority:
- Native Browser Print & IFrame Approach: These methods strip out HTML headings and semantic tags. The visual look is retained via CSS, but the document becomes structurally flat and lacks the tagging necessary for screen readers to navigate by headings.
- html2pdf: Accessibility is completely missing. Because it takes a canvas "snapshot" of the page, headings are essentially flattened into a single image. You cannot even select the text, let alone navigate it structurally.
- jsPDF (Direct API): Because you are building the document programmatically line-by-line (e.g., doc.text("Hello", 100, 100)), it does not automatically convert HTML heading tags into PDF structural tags.
The following methods can handle document structure:
- mPDF (Server-side): Because this PHP library translates HTML/CSS directly into PDF objects on the server, it is the best equipped to understand and utilize document structure. It natively supports generating Tables of Contents (TOCs), which relies on parsing document headings.
- Headless Chrome: It uses the browser's print engine, which traditionally struggles with semantic tags, but it is the most robust tool for rendering HTML/CSS accurately. Depending on the exact command-line flags and Chrome version, it has better potential for generating tagged PDFs than the standard frontend tools, though its main benefits remain visual fidelity and selectable text.
Conclusion:
At the start of this article I said I would make a recommendation as to what I think is best. Let's remind ourselves of the contenders and their strengths and weaknesses.
- Print PDF - The easiest and most native solution suffers from usability issues for users who do not know about it.
- Print PDF - IFrame offers full printing control and clean output but suffers from inconsistent mobile support and complex CSS.
- mPDF is a PHP solution that provides rich HTML/CSS support but is a dated library with integration challenges.
- html2pdf is a solution that offers easy integration and simple styling. Unfortunately, because it uses canvas internally, you cannot copy text from the generated PDF. It does at least carry href links over into the PDF.
- jsPDF is excellent for simple, client-side PDFs but has minimal formatting options with potential browser inconsistencies.
- Headless Chrome is difficult to set up, but it is a great option to print the page as is into a PDF. No CSS restrictions because we are utilising the latest Chrome browser.
Selecting the optimal method for dynamic PDFs often comes down to the environment. I recommend using Headless Chrome whenever feasible. By employing a modern browser instance to handle the rendering, you bypass most CSS limitations and ensure the visual output matches the web version. This approach yields a text-based, selectable document, which is a major win for users. Although configuring server permissions for the binary can be tricky, the resulting fidelity makes it the superior choice for professional results.
Nevertheless, technical constraints often dictate the path forward. In restricted ecosystems like GovCMS where external binaries aren't permitted, the html2pdf library serves as a highly capable alternative for frontend-driven requirements.