PageObject.images compresses JPEG images with only 75% quality, and there is no option to control this

Hi!

I'm a maintainer of `pdfly`, which has an `extract-images` subcommand that uses [`PageObject.images`](https://pypdf.readthedocs.io/en/latest/modules/PageObject.html#pypdf._page.PageObject.images) internally.

Recently, a user reported an image compression issue: https://github.com/py-pdf/pdfly/issues/200

I investigated, and I think I found the cause:
* in `pypdf`, `PageObject.images` invokes `PageObject._get_image()`, which calls `_xobj_to_image()`
* `_xobj_to_image()` calls Pillow's `Image.save()` method, which uses a JPEG **`quality` of `75` by default**: https://pillow.readthedocs.io/en/stable/handbook/image-file-formats.html#jpeg-saving

## Suggested bug fix
In `_xobj_to_image()`, `pypdf` could simply pass an extra `quality="keep"` argument to `Image.save()` when saving JPEG images.

IMHO this seems like the best default value,
even if it means a change to `pypdf` that is not fully backward compatible.
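
To illustrate the difference, here is a minimal sketch that uses Pillow only (not `pypdf`). It builds a small synthetic JPEG as a stand-in for an image embedded in a PDF, re-encodes it once with Pillow's defaults and once with `quality="keep"`, and compares the quantization tables of each copy with those of the source. The helper names (`make_source_jpeg`, `reencode`, `qtables`) are made up for this example. Note that `quality="keep"` is only accepted when the image was opened from a JPEG, so I'm assuming that is the case for the images in question.

```python
from io import BytesIO

from PIL import Image


def make_source_jpeg(quality: int = 95) -> bytes:
    """Build a small in-memory JPEG standing in for an image embedded in a PDF."""
    size = 64
    img = Image.new("RGB", (size, size))
    # Deterministic colour pattern so the encoder has some detail to quantize.
    pixels = [
        ((x * 4) % 256, (y * 4) % 256, (x * y) % 256)
        for y in range(size)
        for x in range(size)
    ]
    img.putdata(pixels)
    buf = BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    return buf.getvalue()


def reencode(data: bytes, **save_kwargs) -> bytes:
    """Open JPEG bytes with Pillow and save them again as JPEG."""
    img = Image.open(BytesIO(data))
    out = BytesIO()
    img.save(out, format="JPEG", **save_kwargs)
    return out.getvalue()


def qtables(data: bytes):
    """Return the quantization tables Pillow reads from JPEG bytes."""
    return Image.open(BytesIO(data)).quantization


source = make_source_jpeg(quality=95)
default_copy = reencode(source)               # Pillow's default JPEG quality
kept_copy = reencode(source, quality="keep")  # reuse the source's tables

print("default save keeps source tables:", qtables(default_copy) == qtables(source))
print('quality="keep" keeps source tables:', qtables(kept_copy) == qtables(source))
```

Even with `quality="keep"` the image is still decoded and encoded again, so the output bytes are not necessarily identical to the embedded stream; what is preserved is the quality level, subsampling, and quantization tables.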

## Feature request
Would it also be possible to introduce a way to pass a custom value for the `quality` parameter of `Image.save()`, please?

Ideally, a new optional argument would be great, but I don't quite see how to make this work with the `PageObject.images` property, which returns a `VirtualListImages`...

Maybe through a global variable?
Or through a new parameter of `PdfReader`?
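
To make the two ideas concrete, here is a rough sketch with stand-in classes. Nothing in it is existing `pypdf` API: `JPEG_SAVE_OPTIONS`, `SketchReader`, `SketchPage` and the `image_save_options` parameter are all hypothetical names. The point is only that a property cannot take arguments, so the options have to come either from module-level state or from an object the page can reach, such as its reader.

```python
from typing import Any, Dict, List, Optional

# Option 1 (hypothetical name): a module-level default that users can change globally.
JPEG_SAVE_OPTIONS: Dict[str, Any] = {"quality": "keep"}


class SketchReader:
    """Stand-in for PdfReader; `image_save_options` is a hypothetical parameter."""

    def __init__(self, image_save_options: Optional[Dict[str, Any]] = None) -> None:
        self.image_save_options = image_save_options
        self.pages: List["SketchPage"] = [SketchPage(self)]


class SketchPage:
    """Stand-in for PageObject; keeps a reference to the reader that owns it."""

    def __init__(self, reader: SketchReader) -> None:
        self._reader = reader

    @property
    def images(self) -> List[Dict[str, Any]]:
        # A property takes no arguments, so the options are resolved here:
        # reader-level setting first (option 2), module default otherwise (option 1).
        options = self._reader.image_save_options
        if options is None:
            options = JPEG_SAVE_OPTIONS
        # A real implementation would forward these to Image.save();
        # the sketch only returns the options it would use.
        return [dict(options)]


print(SketchReader().pages[0].images)
print(SketchReader(image_save_options={"quality": 90}).pages[0].images)
```

The reader-level parameter keeps the setting scoped to one document, while the global is simpler but affects every reader in the process.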


PageObject.images compresses JPEG images with only 75% quality, and there is no option to control this #3515
