
Conversation


@ReinforcedKnowledge ReinforcedKnowledge commented Oct 30, 2025

TL;DR

When loading Qwen2VL image processor with from_pretrained(max_pixels=X), the parameter is stored as an attribute but not synchronized with the size dict that controls actual image resizing behavior. This causes images to be resized using default max_pixels=16,777,216 instead of the user's specified value, resulting in excessive token counts.

This fix implements max_pixels and min_pixels as properties that automatically synchronize with the size dict when set via setattr() during from_pretrained() loading. The properties ensure size['longest_edge'] and size['shortest_edge'] stay in sync with max_pixels and min_pixels attributes respectively.

The property-based approach is safe and localized to Qwen2VL, avoiding potential impacts on other image processors, but this issue might concern other processors as well, and I'm not totally a fan of the property lookup.

We could argue there is no need for two sources of truth for the same thing (size['longest_edge'] and max_pixels), but we could also argue that the base class should, or might, be smarter about user kwargs that don't synchronize with processor configs through its __init__ logic.

Since it's my first PR / issue to transformers, I thought I'd give a more detailed explanation:

Issue

When loading Qwen2VL/Qwen3VL image processor with:

processor = Qwen2VLImageProcessorFast.from_pretrained(
    'Qwen/Qwen3-VL-2B-Instruct',
    max_pixels=200_000
)

The expected behavior:

  • processor.max_pixels = 200,000
  • processor.size['longest_edge'] = 200,000

That way, images get resized to ~200k pixels, giving us around ~195 tokens.

Actual behavior (before fix):

  • processor.max_pixels = 200,000 which is correct
  • processor.size['longest_edge'] = 16,777,216 which is incorrect (loaded from config)
  • Images are resized to the default ~16M pixels, giving around ~3,844 tokens

Root cause analysis

The Qwen2VL processor uses dual representation for pixel constraints:

  1. self.max_pixels: an attribute
  2. self.size['longest_edge']: the actual value used during image processing

The __init__ method is designed to keep these synchronized:

def __init__(self, **kwargs):
    max_pixels = kwargs.pop("max_pixels", None)
    # ...
    if max_pixels is not None:
        size["longest_edge"] = max_pixels

However, when loading via from_pretrained() we lose this synchronization because:

  1. ImageProcessingMixin.from_dict() calls [cls(**config_dict)](https://github.com/huggingface/transformers/blob/02c324f43fe0ef5d484e846417e5f3bf4484524c/src/transformers/image_processing_base.py#L374) without max_pixels
  2. Synchronization logic skips (max_pixels=None)
  3. Later, from_dict() sets max_pixels via setattr(), which is too late: it only sets the attribute, without updating size

As a result, max_pixels is stored correctly but size['longest_edge'] is never updated.
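The desynchronization can be reproduced with a minimal, self-contained sketch (this is not the actual transformers code; the class and default value are hypothetical stand-ins mirroring the description above):

```python
# Minimal sketch of the failure mode: __init__ only syncs size when
# max_pixels is passed to it, but from_dict() instead calls __init__
# without max_pixels and sets the attribute afterwards via setattr().
class BuggyProcessorSketch:
    DEFAULT_LONGEST_EDGE = 16_777_216

    def __init__(self, size=None, max_pixels=None):
        self.size = dict(size) if size else {"longest_edge": self.DEFAULT_LONGEST_EDGE}
        if max_pixels is not None:  # skipped: from_dict() passes no max_pixels
            self.size["longest_edge"] = max_pixels
        self.max_pixels = max_pixels


proc = BuggyProcessorSketch()          # cls(**config_dict), without max_pixels
setattr(proc, "max_pixels", 200_000)   # from_dict()'s late setattr()
print(proc.max_pixels)                 # 200000 (looks correct)
print(proc.size["longest_edge"])       # 16777216 (never updated)
```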

Other fixes I considered

The "obvious" fix would be to add special handling in image_processing_base.py:

# In ImageProcessingMixin.from_dict()
if "max_pixels" in kwargs:
    image_processor_dict["max_pixels"] = kwargs.pop("max_pixels")

This would forward max_pixels to __init__() before the synchronization logic runs, and it would spare us from tracking every past and future processor with logic similar to Qwen2VL's image processor. But the issues are that:

  1. Not all image processors may accept max_pixels in __init__(). If a processor's __init__ doesn't have **kwargs, passing max_pixels would presumably raise a TypeError, and we can't guarantee all processors handle this parameter.
  2. Changes to the base class affect all processors, so they would require extensive testing across all image processor implementations and carry a higher risk of unintended side effects
  3. The existing pattern is incomplete: current code only forwards params already in config:
if "size" in kwargs and "size" in image_processor_dict:  # requires the parameter to be both in the user kwargs and the config
    image_processor_dict["size"] = kwargs.pop("size")

But Qwen2VL's config doesn't have max_pixels in the dict, so this pattern wouldn't help.
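A small self-contained simulation of why the existing pattern doesn't help here (hypothetical dict literals, not the actual from_dict() code): the kwarg is only forwarded when the config dict already contains the same key, and Qwen2VL's config has no max_pixels entry.

```python
# Simulated config as loaded from preprocessor_config.json: it has
# "size" but no "max_pixels" key.
image_processor_dict = {"size": {"longest_edge": 16_777_216}}
kwargs = {"max_pixels": 200_000}

# The existing forwarding pattern, generalized to max_pixels: it
# requires the key to exist in BOTH the user kwargs and the config.
for key in ("size", "max_pixels"):
    if key in kwargs and key in image_processor_dict:
        image_processor_dict[key] = kwargs.pop(key)

print("max_pixels" in image_processor_dict)  # False: never forwarded to __init__
```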

That's why I went with properties instead: minimal scope, and max_pixels (or min_pixels) and size['longest_edge'] (or size['shortest_edge']) stay synchronized whether the class is instantiated directly or via from_pretrained(), and whether or not the user later changes the attribute directly.

This PR implements a safer, processor-specific fix using Python properties.
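As a rough sketch of the property-based approach (simplified, with hypothetical names; the real implementation lives in the Qwen2VL image processor), the setter keeps size['longest_edge'] in sync no matter when or how max_pixels is assigned:

```python
class FixedProcessorSketch:
    def __init__(self, size=None, max_pixels=None):
        self.size = dict(size) if size else {"longest_edge": 16_777_216}
        if max_pixels is not None:
            self.max_pixels = max_pixels  # routed through the property setter

    @property
    def max_pixels(self):
        # Single source of truth: read straight from the size dict.
        return self.size["longest_edge"]

    @max_pixels.setter
    def max_pixels(self, value):
        # Any assignment, including from_dict()'s late setattr(),
        # updates the value actually used for resizing.
        self.size["longest_edge"] = value


proc = FixedProcessorSketch()          # cls(**config_dict), without max_pixels
setattr(proc, "max_pixels", 200_000)   # from_dict()'s late setattr()
print(proc.max_pixels)                 # 200000
print(proc.size["longest_edge"])       # 200000: now in sync
```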

Future Considerations

While this PR fixes the issue for Qwen2VL/Qwen3VL, the underlying issue in from_dict() could affect other processors with similar synchronization patterns.

Related issue

#41955

ReinforcedKnowledge and others added 2 commits October 30, 2025 16:35
…retrained()

@ReinforcedKnowledge
Author

ReinforcedKnowledge commented Nov 3, 2025

Hi everyone, if anyone is reviewing this PR or wants to review it, do not hesitate to tell me if there's anything I can do to get the rest of the tests to go through (first time contributing to HF but I made sure that the bug exists and is solved with my PR).

@github-actions
Contributor

github-actions bot commented Nov 3, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: qwen2_vl

@Rocketknight1
Member

cc @yonigozlan @molbap

Member

@yonigozlan yonigozlan left a comment


Hello @ReinforcedKnowledge ! Thanks a lot for raising this issue, I was able to reproduce it, but the issue actually comes from bad logic in from_pretrained, and can affect other image processors, so I opened another PR to fix it here #41997.
Thanks again for flagging this!

@ReinforcedKnowledge
Author

@yonigozlan Thank you for your response! I wanted to update the from_pretrained as I explained in the body of the PR but doing so impacts a lot of stuff and I didn't want to bear that responsibility as it's my first contribution. I have one quick question but I'll ask it on your PR 😄

@yonigozlan
Member

Sorry I missed that you had already pointed out the issue with from_pretrained!

@ReinforcedKnowledge
Author

It's okay! Don't hesitate to close this PR when you merge yours if needed

@yonigozlan
Member

Closing this as the fix PR is merged :)

@yonigozlan yonigozlan closed this Nov 4, 2025