Optimizing Documents for PDF.js

9 Oct 2019

Adam Pez

Mozilla’s open-source library PDF.js renders small, simple PDFs well enough for most workflows. But users in high-stakes enterprise or demanding design-agency settings may find its many rendering issues a source of irritation and disruption.

If you embed a PDF.js viewer in a website or application, your users may not have the patience to wait months or years for the open-source community to resolve a rendering issue that keeps them from using their documents. We’ve therefore created this guide, with tips on how to reduce the risk of rendering inaccuracies in your PDF.js viewer.

Controlling Users or Controlling Documents

PDF.js suffers from several rendering inaccuracies because it implements the PDF specification incompletely and relies heavily on the browser for rendering.

There are two general strategies you can use to pre-empt some of these issues. One is to politely ask users upstream to limit which PDF features, browsers, and documents they use. However, this approach may prove impractical or send the wrong message.

The alternative is to fix documents yourself before they are viewed. In what follows, we’ll discuss two techniques recommended by the open-source community.

Flattening PDFs

A common cause of incorrect rendering is a document that uses a feature PDF.js does not support, such as a PDF transparency, pattern, or gradient. A technique known as “flattening” may correct for these issues. Flattening merges different colors, objects, and images into a single layer, simplifying the document and even improving performance when used in conjunction with PDF linearization.
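To decide which documents are worth flattening, you can screen them for these features first. The Python sketch below is only a rough heuristic: it scans a file's raw bytes for name tokens that usually accompany transparency (/SMask, /S /Transparency), patterns (/PatternType), and gradients (/ShadingType). The function name and the marker list are hypothetical, chosen for illustration and not taken from PDF.js, and the scan will miss anything stored inside compressed object streams.

```python
from __future__ import annotations

import re

# Name tokens that hint at the features this article lists as risky.
# This mapping is an illustrative assumption, not an official PDF.js list.
RISK_MARKERS = {
    # A soft mask other than /None, or a transparency group dictionary.
    "transparency": re.compile(rb"/SMask\b(?!\s*/None)|/S\s*/Transparency\b"),
    # Tiling and shading patterns both carry a /PatternType entry.
    "pattern": re.compile(rb"/PatternType\b"),
    # Gradients are described by shading dictionaries.
    "gradient": re.compile(rb"/ShadingType\b"),
}


def find_risky_features(pdf_bytes: bytes) -> set[str]:
    """Return the risk categories whose markers appear in the raw bytes."""
    return {
        name
        for name, pattern in RISK_MARKERS.items()
        if pattern.search(pdf_bytes)
    }
```

You could call it as find_risky_features(Path("upload.pdf").read_bytes()) and send only flagged documents through a flattening step. An empty result does not prove a document is safe, so treat it as a first filter rather than a guarantee.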

However, it is important to bear in mind that flattening imposes certain tradeoffs:

[Figure: Flattened vs Unflattened PDF]

First, as seen above, it can reduce image quality and color fidelity, two areas where PDF.js already struggles.

Second, flattening may sometimes create rendering inaccuracies of its own, requiring that you inspect documents manually. For example, flattening certain PDF transparencies with Adobe Acrobat is known to “divide” overlapping art.

If you want to flatten your documents, you can do so manually with design software such as Acrobat -- or programmatically in a server environment via another library.

(For this purpose, you can consider the PDFTron SDK -- we offer a quick guide on how to flatten & optimize your documents.)

Converting to PDF/A

Thousands of different PDF generators are used to create PDFs. Unfortunately, the PDF specification doesn’t require that these generators embed fonts. As a result, many PDFs may contain missing, non-standard, or otherwise malformed fonts, and as we’ve documented, PDF.js may be unable to correct for these issues, resulting in text with mismatched fonts, wrong spacing, or ugly kerning. In a worst-case scenario, text may become illegible.

If you commonly run into text issues in your PDF.js viewer, you can try to convert documents to PDF/A.

PDF/A is a type of PDF created for the purpose of long-term preservation of the original intended appearance of the document. It therefore requires all font information to be embedded in the file so any standard-conformant PDF reader will be able to easily render its text.
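Before converting everything, it can help to estimate which documents lack embedded fonts. The sketch below is a crude heuristic, not a validator: it counts font descriptors in the raw bytes and compares them with the number of embedded font program references (/FontFile, /FontFile2, /FontFile3). The function name is hypothetical. The check assumes objects are not hidden in compressed object streams, and it ignores fonts that have no descriptor at all, such as some uses of the standard 14 fonts.

```python
from __future__ import annotations

import re

# A font descriptor dictionary declares itself with /Type /FontDescriptor.
FONT_DESCRIPTOR = re.compile(rb"/Type\s*/FontDescriptor\b")
# An embedded font program is referenced by one of three keys:
# /FontFile (Type 1), /FontFile2 (TrueType), /FontFile3 (CFF and others).
FONT_FILE = re.compile(rb"/FontFile[23]?\b")


def count_unembedded_fonts(pdf_bytes: bytes) -> int:
    """Estimate how many font descriptors have no embedded font program."""
    descriptors = len(FONT_DESCRIPTOR.findall(pdf_bytes))
    embedded = len(FONT_FILE.findall(pdf_bytes))
    # Each descriptor references at most one font program, so any surplus
    # of descriptors suggests fonts that are referenced but not embedded.
    return max(descriptors - embedded, 0)
```

A result greater than zero suggests the document is a candidate for PDF/A conversion. A result of zero is not proof of conformance, which only a real PDF/A validator can establish.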

Converting documents to PDF/A can be done one document at a time with this free online PDF/A converter, or programmatically with a professional commercial PDF SDK. The latter should be able to correct automatically for issues such as missing fonts so that your documents pass verification without effort on your part.

Due to the complexity of PDF, however, automatic conversion with a commercial SDK can't be entirely guaranteed. You may need to inspect and edit documents manually to ensure they pass verification. To do so, you can use any of a number of commercial tools that let you inspect and edit objects in the PDF itself (such as this one).
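To give a feel for what programmatic conversion looks like, here is a minimal Python sketch that shells out to the open-source Ghostscript tool instead of a commercial SDK. This is a swapped-in illustration, not the method this article recommends. It assumes a gs binary is on your PATH and that you have a copy of the PDFA_def.ps template that ships with Ghostscript, edited to point at an ICC profile. The function names, file names, and default arguments are hypothetical. As with any automatic conversion, the output should still be checked with a PDF/A validator.

```python
from __future__ import annotations

import subprocess


def build_pdfa_command(
    src: str,
    dst: str,
    pdfa_def: str = "PDFA_def.ps",  # assumed: edited copy of Ghostscript's template
    gs: str = "gs",  # assumed: Ghostscript executable on PATH
) -> list[str]:
    """Build a Ghostscript command line that requests PDF/A-2 output."""
    return [
        gs,
        "-dPDFA=2",                       # target PDF/A-2
        "-dBATCH",                        # exit when processing is done
        "-dNOPAUSE",                      # do not prompt between pages
        "-sColorConversionStrategy=RGB",  # convert colors to one color space
        "-sDEVICE=pdfwrite",              # write a PDF
        "-dPDFACompatibilityPolicy=1",    # drop non-conforming content, keep going
        f"-sOutputFile={dst}",
        pdfa_def,                         # must come before the input file
        src,
    ]


def convert_to_pdfa(src: str, dst: str) -> None:
    """Run the conversion and raise CalledProcessError if Ghostscript fails."""
    subprocess.run(build_pdfa_command(src, dst), check=True)
```

Keeping the command builder separate from the subprocess call makes the flags easy to review and test without Ghostscript installed. A call such as convert_to_pdfa("report.pdf", "report-pdfa.pdf") would then run the conversion.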

The Bottom Line

You can reduce the frequency of rendering inaccuracies in PDF.js either by controlling document creation or by leveraging techniques such as flattening and converting to PDF/A prior to viewing in PDF.js. Either may prove impractical, however, if you have a large and growing user base or if you frequently deal with many large and complex documents.

Don’t hesitate to contact us directly to share your thoughts on this article or our SDK. We’re always looking for feedback.