Lemon Aquarium
Challenge Information
Project: pdfbox
Type: full
Harnesses: 6
Vulnerabilities: 8
AFC Challenge Performance
Number of Unique Vulnerabilities Discovered: #
Number of Teams with Scoring PoVs: 5
Number of Teams with Scoring Patches: 3
Number of Teams with Scoring Bundles: 2
Total Points Scored for this Challenge: 83.76047069101571
What design decisions were considered for this challenge?
Unlike commons-compress, where the focus was on individual challenges, this challenge embeds a large number of vulnerabilities into one full repository scan.
Why this set of vulnerabilities?
Although the set leans heavily on the Type1 font parser, it also includes vulnerabilities that target the general structure of the PDF file.
Delta vs Full and why?
Full repository scan, with a large number of vulnerabilities to test breadth of detection across the codebase.
Challenge Harnesses
- DomXfaParserFuzzer
- DomXmpParserFuzzer
- PDFExtractTextFuzzer
- PDFStreamParserFuzzer
- PDFWriteReadFuzzer
- PDFOCRFuzzer
Challenge Timeouts
enabled
Challenge Vulnerabilities
- https://issues.apache.org/jira/browse/PDFBOX-5624
- https://github.com/apache/pdfbox/commit/aa7dc6ccd1c3055b70c8084d7bf383f799047ad5
- https://issues.apache.org/jira/browse/PDFBOX-5624
- https://github.com/apache/pdfbox/commit/aa7dc6ccd1c3055b70c8084d7bf383f799047ad5
- https://issues.apache.org/jira/browse/PDFBOX-6044
- https://issues.apache.org/jira/browse/PDFBOX-4623
- The POV is https://issues.apache.org/jira/secure/attachment/12993517/loop_in_page_tree.pdf
- https://www.ieee-security.org/TC/SPW2014/papers/5103a198.PDF
- https://www.bleepingcomputer.com/news/software/six-year-old-loop-bug-re-discovered-to-affect-almost-all-major-pdf-viewers/
- https://blog.fuzzing-project.org/59-Six-year-old-PDF-loop-bug-affects-most-major-implementations.html
SSRF in xmp
Vulnerability Information
Author: Tim Allison
Harness: DomXmpParserFuzzer
CWE Classification: CWE-611, CWE-918
What functions and functionality is relevant?
Parsing of Extensible Metadata Platform (XMP) within a PDF.
Why is this vulnerable?
The XML parser is not securely configured.
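As a rough illustration of what "securely configured" usually means for a JAXP DOM parser, the sketch below disables DOCTYPE declarations so that external entities are never resolved. The class and method names are hypothetical, and this is not the actual PDFBox or XmpBox fix; it only assumes the standard `javax.xml.parsers` API and the Xerces `disallow-doctype-decl` feature that ships with the JDK.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of PDFBox.
final class HardenedXmlParsing {

    // Builds a DOM parser that refuses DOCTYPE declarations outright,
    // which blocks external entity resolution (XXE) and the SSRF it enables.
    static DocumentBuilder newHardenedBuilder() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory.newDocumentBuilder();
    }

    // Returns true when the hardened parser rejects the given document.
    static boolean rejectsDoctype(String xml) {
        try {
            newHardenedBuilder().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false;
        } catch (SAXException e) {
            return true;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this configuration a document that declares an external entity fails at the DOCTYPE, before any URL is fetched, while ordinary XML without a DOCTYPE still parses.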
Is this a replay and/or is inspired by anything?
This is a replay of CVE-2016-2175.
What makes it interesting?
This vulnerability is buried fairly deeply in the codebase. The vulnerability should be easy to fix, but finding it in the full codebase and generating a proof of vulnerability are both good challenges.
SSRF in xfa
Vulnerability Information
Author: Tim Allison
Harness: DomXfaParserFuzzer
CWE Classification: CWE-611, CWE-918
What functions and functionality is relevant?
Parsing of XML Forms Architecture (XFA) within a PDF.
Why is this vulnerable?
The code fails to configure the XML DOM builder securely.
Is this a replay and/or is inspired by anything?
This is a replay of a code refactoring that was part of the DRY fixes for CVE-2019-0228. At the time of that fix, the XFA code was already secured against XXE.
What makes it interesting?
This vulnerability is buried fairly deeply in the codebase. The vulnerability should be easy to fix, but finding it in the full codebase and generating a proof of vulnerability are both good challenges.
Infinite loops in type 1 font parser
Vulnerability Information
Author: Tim Allison
Harness: PDFExtractTextFuzzer
CWE Classification: CWE-835, CWE-834
What functions and functionality is relevant?
Parsing a Type1 font embedded in a PDF.
Why is this vulnerable?
Failure to check if next value is null.
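The general shape of this bug can be sketched as a read loop that waits for a terminator token and never notices that the input has run out. The lexer and method below are hypothetical stand-ins, not code from PDFBox's Type1Parser; the only assumption carried over from the description is that the lexer signals end of input by returning null.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration, not the PDFBox Type1Parser.
final class TokenLoopSketch {

    // Minimal lexer stand-in: returns null once the input is exhausted.
    static final class Lexer {
        private final Deque<String> tokens;

        Lexer(List<String> tokens) {
            this.tokens = new ArrayDeque<>(tokens);
        }

        String nextToken() {
            return tokens.poll();
        }
    }

    // Collects tokens up to (but excluding) the terminator.
    // Without the null check, truncated input would keep the loop spinning
    // on null tokens that never equal the terminator.
    static List<String> readUntil(Lexer lexer, String terminator) {
        List<String> body = new ArrayList<>();
        while (true) {
            String token = lexer.nextToken();
            if (token == null) {
                throw new IllegalStateException(
                        "unexpected end of input before '" + terminator + "'");
            }
            if (token.equals(terminator)) {
                return body;
            }
            body.add(token);
        }
    }
}
```

The design choice here is to fail fast on truncated input rather than return a partial result; a real parser would more likely throw an IOException.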
Is this a replay and/or is inspired by anything?
This reintroduces an infinite loop that was fixed in PDFBOX-5624.
What makes it interesting?
As with the other Type1 font vulnerabilities, the POV was fairly easily generated with a custom harness and a custom seed corpus. However, neither of these resources was made available in the competition.
Further, finding the vulnerability is non-trivial.
Additional details
Infinite loops in type 1 font parser
Vulnerability Information
Author: Tim Allison
Harness: PDFExtractTextFuzzer
CWE Classification: CWE-835, CWE-834
What functions and functionality is relevant?
Parsing a Type1 font embedded in a PDF.
Why is this vulnerable?
Failure to check for null when calling “nextToken”.
Is this a replay and/or is inspired by anything?
This reintroduces an infinite loop that was fixed in PDFBOX-5624.
What makes it interesting?
This is similar to vuln_3, but it sits at a slightly different point within the Type1Parser.
Additional details
Type1Font lexer OutOfMemoryError
Vulnerability Information
Author: Tim Allison
Harness: PDFExtractTextFuzzer
CWE Classification: CWE-789
What functions and functionality is relevant?
Parsing a Type1 font in a PDF.
Why is this vulnerable?
The code reads a value from user input and then allocates that amount of memory without any checks.
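A minimal sketch of the usual defence against this pattern is to validate the declared length before allocating. The class name, the 4-byte big-endian length prefix, and the 1 MiB cap are all assumptions made for illustration; they do not describe the Type1 lexer's actual record layout.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration, not the PDFBox Type1 lexer.
final class BoundedAllocationSketch {

    // Assumed upper bound for a single record; a real limit depends on the format.
    static final int MAX_RECORD_LENGTH = 1 << 20;

    // Reads a 4-byte big-endian length, validates it, then reads that many bytes.
    // The check happens before the allocation, so a hostile length never
    // reaches "new byte[...]".
    static byte[] readRecord(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int declared = data.readInt();
        if (declared < 0 || declared > MAX_RECORD_LENGTH) {
            throw new IOException("record length out of range: " + declared);
        }
        byte[] buffer = new byte[declared];
        data.readFully(buffer);
        return buffer;
    }
}
```

A fixed cap is the simplest option; when the total input size is known, comparing the declared length against the bytes actually remaining is a tighter check.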
Is this a replay and/or is inspired by anything?
This is inspired by the unchecked “read length, then allocate” pattern that is common in MS Office OLE-based file formats and several compression formats. However, this is an organic memory usage vulnerability.
What makes it interesting?
As with the other Type1 font vulnerabilities, the POV was fairly easily generated with a custom harness and a custom seed corpus. However, neither of these resources was made available in the competition.
Type1Font int overflow into OOM
Vulnerability Information
Author: Tim Allison
Harness: PDFExtractTextFuzzer
CWE Classification: CWE-789
What functions and functionality is relevant?
Parsing Printer Font Binary (PFB) Type 1 fonts within a PDF.
Why is this vulnerable?
Integer overflow in a check that is intended to prevent an out-of-memory allocation. With a very small crafted file, the parser can allocate 2 GB of memory.
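To show how a size limit can be defeated by overflow, the sketch below contrasts a check done in 32-bit arithmetic with one done in 64-bit arithmetic. The limit, the parameter names, and the exact form of the sum are assumptions for illustration; the challenge's real check is not reproduced here.

```java
// Hypothetical illustration, not the PDFBox PFB parser.
final class OverflowCheckSketch {

    // Assumed heuristic limit on the total number of bytes to buffer.
    static final int MAX_TOTAL = 10_000_000;

    // Flawed: the int addition wraps to a negative value for large inputs,
    // so the comparison passes and the caller goes on to allocate.
    static boolean flawedCheckAccepts(int alreadyRead, int declared) {
        return alreadyRead + declared <= MAX_TOTAL;
    }

    // Safer: reject negatives, then add in 64-bit arithmetic so the sum cannot wrap.
    static boolean safeCheckAccepts(int alreadyRead, int declared) {
        if (alreadyRead < 0 || declared < 0) {
            return false;
        }
        return (long) alreadyRead + declared <= MAX_TOTAL;
    }
}
```

For example, with 6 bytes already read and a declared length of Integer.MAX_VALUE, the flawed check accepts the record while the widened check rejects it. Math.addExact is another option when an exception on overflow is preferred.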
Is this a replay and/or is inspired by anything?
This is an organic vulnerability. It is based on read-length-then-allocate vulnerabilities that are common in other file formats. The twist is that an incorrect fix added a heuristic record limit, and that fix, in turn, fails to account for integer overflow. The actual PDFBox codebase had no such check.
What makes it interesting?
As with the other Type1 font vulnerabilities, the POV was fairly easily generated with a custom harness and a custom seed corpus. However, neither of these resources was made available in the competition, so generating the POV is challenging. Finding the vulnerability should be straightforward with static analysis, but it would be very difficult to find via fuzzing with the harnesses supplied during the competition.
Additional details
PageTrees -> PageForests
Vulnerability Information
Author: Tim Allison
Harness: PDFExtractTextFuzzer
CWE Classification: CWE-834
What functions and functionality is relevant?
Parsing a PDF’s page tree.
Why is this vulnerable?
There’s no check on which objects have been processed, and a crafted PDF may contain a loop in the page tree.
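One common way to guard a tree walk against loops is to remember which nodes have already been visited. The sketch below uses a hypothetical Node type with a kids list and treats any node without kids as a page; it is a simplified model, not PDFBox's PDPageTree.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical illustration, not the PDFBox page tree implementation.
final class PageTreeWalkSketch {

    // Simplified page tree node: a node with no kids is treated as a page.
    static final class Node {
        final List<Node> kids = new ArrayList<>();
    }

    // Counts pages iteratively (no recursion, so no stack overflow) and
    // rejects any node reached twice, which covers both loops and shared kids.
    static int countLeafPages(Node root) {
        Set<Node> visited = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> pending = new ArrayDeque<>();
        pending.push(root);
        int pages = 0;
        while (!pending.isEmpty()) {
            Node node = pending.pop();
            if (!visited.add(node)) {
                throw new IllegalStateException("page tree contains a loop or shared node");
            }
            if (node.kids.isEmpty()) {
                pages++;
            } else {
                for (Node kid : node.kids) {
                    pending.push(kid);
                }
            }
        }
        return pages;
    }
}
```

Tracking by object identity rather than equality keeps the check cheap and avoids depending on how nodes implement equals.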
Is this a replay and/or is inspired by anything?
Replay of PDFBOX-4623, but rewritten to trigger a timeout instead of a StackOverflow.
What makes it interesting?
A crafted PDF with a loop in the page tree triggers a timeout rather than a StackOverflow, making detection and diagnosis less straightforward.
Additional details
Ye olde infinite XRefs
Vulnerability Information
Author: Tim Allison
Harness: PDFExtractTextFuzzer
CWE Classification: CWE-834
What functions and functionality is relevant?
Parsing an xref table in a PDF.
Why is this vulnerable?
There’s no check for circular references in the xref table.
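A sketch of the missing check, assuming the circularity arises through the chain of previous-section pointers (the trailer's /Prev entries) that links one xref section to an earlier one. The map-based model and all names are hypothetical; a real parser would read each offset from the file instead.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration, not the PDFBox xref parser.
final class XrefChainSketch {

    // prevByOffset maps an xref section's byte offset to the offset named by
    // its /Prev entry; a missing key means the chain ends there.
    // Returns the offsets in the order they are visited.
    static List<Long> followPrevChain(long startOffset, Map<Long, Long> prevByOffset) {
        List<Long> order = new ArrayList<>();
        Set<Long> seen = new HashSet<>();
        Long offset = startOffset;
        while (offset != null) {
            if (!seen.add(offset)) {
                throw new IllegalStateException(
                        "circular /Prev reference at offset " + offset);
            }
            order.add(offset);
            offset = prevByOffset.get(offset);
        }
        return order;
    }
}
```

Because every offset is recorded before the next one is followed, a section that points back to itself or to any earlier section is caught on the first repeat.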
Is this a replay and/or is inspired by anything?
This is a replay of a famous infinite loop/Denial of Service vulnerability that was fixed in PDFBOX-3919. Andreas Bogk presented this vulnerability at Chaos Communication Camp in 2011. It affected poppler, qpdf, and PDFBox, and probably many other PDF parsers as well.
What makes it interesting?
This is a very famous vulnerability. It would be challenging to identify and patch without historical context.
Additional details
The POV is taken from: https://bugs.launchpad.net/ubuntu/+source/poppler/+bug/825554 See also:
