Safeguards against loops in cross-reference structure of PDF files
We've made some improvements to how we parse cross-reference tables and streams in existing documents. The PDF file format uses these cross-reference (or xref) tables and streams to store information about the structure of a PDF more efficiently.
In layman's terms, the cross-reference table or stream of a PDF document is the place where all objects of that PDF are listed. For a more detailed explanation of cross-reference tables, you might like to refer to the following blog.
When iText loads a PDF document, it reads the cross-reference table or stream as part of the process. These cross-reference tables and streams can contain loops. People with malicious intent can abuse this to create PDFs that send the parser into an infinite loop, or that slowly drive up memory usage until the process runs out of memory.
iText already had some safeguards against this, but they weren't as complete as we wanted them to be. So we've spent some time improving our cross-reference parsing logic to make sure that it cannot be forced into infinite loops or denial-of-service states.
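To give a feel for the kind of guard involved, here is a minimal sketch of loop detection while following a chain of cross-reference sections. It is not iText's actual implementation. The class XrefChainWalker, the method walk, and the map of offsets are hypothetical names made up for illustration, and the sketch assumes the chain is formed by each section's /Prev entry pointing at the byte offset of an earlier section.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; this is not part of the iText API.
class XrefChainWalker {

    /**
     * Follows a chain of cross-reference sections, starting at startOffset.
     * prevPointers maps the byte offset of each section to the offset named
     * by its /Prev entry; a section without /Prev has no entry in the map.
     * Returns the offsets in the order they were visited.
     */
    static List<Long> walk(Map<Long, Long> prevPointers, long startOffset) {
        List<Long> order = new ArrayList<>();
        Set<Long> visited = new HashSet<>();
        Long current = startOffset;
        while (current != null) {
            // Set.add returns false when the offset was already seen,
            // which means the chain points back into itself.
            if (!visited.add(current)) {
                throw new IllegalStateException(
                        "Cross-reference loop detected at offset " + current);
            }
            order.add(current);
            // null when the current section has no /Prev entry: end of chain.
            current = prevPointers.get(current);
        }
        return order;
    }
}
```

With a well-formed chain, walk returns every offset once and stops at the section that has no /Prev entry. If a section points back to one that was already visited, the method throws instead of looping forever. Because each offset is recorded at most once, the memory used by the walk is bounded by the number of distinct sections rather than by the length of a malicious chain.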