Description
Hi!
I'm a maintainer of pdfly, which has an extract-images subcommand that uses PageObject.images internally.
Recently, a user reported an image compression issue: py-pdf/pdfly#200
I investigated and I think I found what happens:
- in pypdf, PageObject.images invokes PageObject._get_image(), which calls _xobj_to_image()
- _xobj_to_image() calls the Image.save() method of Pillow, which uses a JPEG quality of 75 by default: https://pillow.readthedocs.io/en/stable/handbook/image-file-formats.html#jpeg-saving
Suggested bug fix
In _xobj_to_image(), pypdf could simply provide an extra quality="keep" to Image.save().
IMHO this seems like the best default value,
even if that means a non-fully-backward-compatible change to pypdf.
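To illustrate the difference, here is a minimal sketch that uses Pillow only, without pypdf. It builds a synthetic JPEG at quality 95 (the noise image and the helper names make_source_jpeg and resave are made up for this example), then re-saves it once with Pillow's defaults and once with quality="keep". Note that, per the Pillow documentation, "keep" is only valid when the source image is itself a JPEG, so the real fix in _xobj_to_image() would probably need to guard on that.

```python
import io
import random

from PIL import Image


def make_source_jpeg(quality=95):
    """Build a small, deterministic noise image and encode it as JPEG."""
    rng = random.Random(0)
    pixels = bytes(rng.randrange(256) for _ in range(64 * 64 * 3))
    img = Image.frombytes("RGB", (64, 64), pixels)
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    return buf.getvalue()


def resave(jpeg_bytes, **save_kwargs):
    """Decode a JPEG and encode it again, forwarding options to Image.save()."""
    img = Image.open(io.BytesIO(jpeg_bytes))  # img.format is "JPEG" here
    out = io.BytesIO()
    img.save(out, format="JPEG", **save_kwargs)
    return out.getvalue()


source = make_source_jpeg(quality=95)
default_copy = resave(source)               # Pillow default quality (75)
kept_copy = resave(source, quality="keep")  # reuse the source quantization tables

print(len(source), len(default_copy), len(kept_copy))
```

On this input the default re-save should come out noticeably smaller than the source, while the "keep" re-save should stay close to the original size. Both copies are still re-encoded, so "keep" avoids the extra quality drop but is not a byte-for-byte passthrough.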
Feature request
Would it also be possible to introduce a way to provide a custom value for the quality parameter passed to Image.save(), please?
Ideally a new optional argument would be great, but I don't quite see how to make this work with the PageObject.images property, which returns a VirtualListImages...
Maybe through a global variable?
Or through a new parameter of PdfReader?
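To make the last option more concrete, here is a rough sketch of how a reader-level setting could reach a lazily evaluated image list even though images is a property and takes no arguments. Everything in it is hypothetical: SketchReader, SketchPage, SketchLazyImages and the image_quality option are stand-ins invented for this example, not pypdf classes or parameters, and the "encoding" step just returns the keyword arguments that would be forwarded to Image.save().

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union

Quality = Union[int, str]


@dataclass
class SketchReader:
    """Hypothetical stand-in for a reader carrying an image quality option."""
    image_quality: Quality = "keep"


class SketchLazyImages:
    """Hypothetical lazy list: each item is only built when it is accessed."""

    def __init__(self, names: List[str], build: Callable[[str], Dict]):
        self._names = list(names)
        self._build = build

    def __len__(self) -> int:
        return len(self._names)

    def __getitem__(self, index: int) -> Dict:
        return self._build(self._names[index])


class SketchPage:
    """Hypothetical page that keeps a reference to the reader that owns it."""

    def __init__(self, reader: SketchReader, image_names: List[str]):
        self._reader = reader
        self._image_names = image_names

    @property
    def images(self) -> SketchLazyImages:
        # The property takes no arguments, so the option is read from the reader.
        quality = self._reader.image_quality
        return SketchLazyImages(
            self._image_names,
            lambda name: {"name": name, "save_kwargs": {"quality": quality}},
        )


page = SketchPage(SketchReader(image_quality=90), ["Im0", "Im1"])
print(page.images[0])
```

Compared to a global variable, this keeps the setting scoped to one reader instance, so two readers in the same process could use different qualities. Whether pypdf's pages can reach their reader this way is an assumption on my side.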