Manipulated inline images can force PyPDF2 into an infinite loop

When you try to read the content stream of [this attached PDF](https://github.com/mstamy2/PyPDF2/files/783085/malicious.pdf), PyPDF2 ends up in an infinite loop. This is probably a security issue, because it could be used to mount a denial-of-service attack against applications that use PyPDF2.

The reason is that the last while loop in [ContentStream._readInlineImage](https://github.com/mstamy2/PyPDF2/blob/2a9d76d1244444f7bdd1e8f42eaeee159eadf7fa/PyPDF2/pdf.py#L2772-L2818) only terminates when it finds the `EI` token and never checks whether the stream has already ended. Triggering it is as simple as adding a (broken) inline image that has no `EI` token at all, as in the attached PDF.
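To illustrate the missing check, here is a simplified sketch of such a scanning loop with an end-of-stream guard. It is not the PyPDF2 code; the function name `read_inline_image_data` and the whitespace set are assumptions made for the example.

```python
from io import BytesIO

# Bytes treated as PDF whitespace in this sketch (an assumption for the example).
WHITESPACE = b" \n\r\t\x00\x0c"


def read_inline_image_data(stream):
    """Collect bytes up to the EI token; raise if the stream ends first.

    Hypothetical helper, not part of PyPDF2.
    """
    data = b""
    while True:
        tok = stream.read(1)
        if not tok:
            # read() returns b"" at end of stream. Without this check the
            # loop would keep reading empty bytes forever.
            raise ValueError("inline image has no EI token")
        if tok != b"E":
            data += tok
            continue
        nxt = stream.read(1)
        if nxt != b"I":
            # Not an EI token: keep the "E" and re-read the next byte.
            if nxt:
                stream.seek(-1, 1)
            data += tok
            continue
        after = stream.read(1)
        if after:
            stream.seek(-1, 1)
        if not after or after in WHITESPACE:
            # "EI" followed by whitespace or end of stream closes the image.
            return data
        # "EI" was part of the image data itself.
        data += tok + nxt


print(read_inline_image_data(BytesIO(b"\x01\x02EI Q")))
```

With this guard, a truncated image such as `BytesIO(b"\x01\x02\x03")` raises `ValueError` instead of looping.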

You can see the infinite loop by running this test script with the attached PDF:

```python
import sys

from PyPDF2 import PdfFileReader
from PyPDF2.pdf import ContentStream

# Usage: pass the path of the PDF as the first argument.
with open(sys.argv[1], 'rb') as f:
    pdf = PdfFileReader(f, strict=False)
    for page in pdf.pages:
        # Parsing the content stream is where the loop never returns
        # for the attached PDF.
        contentstream = ContentStream(page.getContents(), pdf)
        for operands, command in contentstream.operations:
            if command == b'INLINE IMAGE':
                data = operands['data']
                print(len(data))
```
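If you would rather not block your shell while reproducing this, one option is to run the script in a child process with a timeout. This is only a sketch: the helper name `times_out` and the file names in the usage note are hypothetical.

```python
import subprocess
import sys


def times_out(args, timeout_s):
    """Return True if the command is still running after timeout_s seconds."""
    try:
        # subprocess.run kills the child process when the timeout expires
        # and then raises TimeoutExpired.
        subprocess.run(
            args,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return True
    return False
```

For example, `times_out([sys.executable, "test_script.py", "malicious.pdf"], 10)` should return `True` as long as the loop is unbounded, assuming the test script is saved as `test_script.py` next to the PDF.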

I will soon prepare a pull request that fixes this issue.
