@HollowMan6
Contributor

What does this PR do?

Alternative root fix for volcengine/verl#3281

For example, this PR fixes the following error when we pass an empty list of images to Qwen2.5-VL:

  File "torchdata/stateful_dataloader/worker.py", line 242, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[union-attr]
  File "torch/utils/data/_utils/fetch.py", line 52, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "torch/utils/data/_utils/fetch.py", line 52, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "verl/utils/dataset/rl_dataset.py", line 248, in __getitem__
    model_inputs = self.processor(text=[raw_prompt], images=images, videos=videos, return_tensors="pt")
  File "transformers/models/qwen2_5_vl/processing_qwen2_5_vl.py", line 150, in __call__
    image_inputs = self.image_processor(images=images, **output_kwargs["images_kwargs"])
  File "transformers/image_processing_utils_fast.py", line 637, in __call__
    return self.preprocess(images, *args, **kwargs)
  File "transformers/models/qwen2_vl/image_processing_qwen2_vl_fast.py", line 151, in preprocess
    return super().preprocess(images, videos, **kwargs)
  File "transformers/image_processing_utils_fast.py", line 662, in preprocess
    return self._preprocess_image_like_inputs(
  File "transformers/models/qwen2_vl/image_processing_qwen2_vl_fast.py", line 173, in _preprocess_image_like_inputs
    batch_feature = self._preprocess(images, **kwargs)
  File "transformers/models/qwen2_vl/image_processing_qwen2_vl_fast.py", line 211, in _preprocess
    grouped_images, grouped_images_index = group_images_by_shape(images, disable_grouping=disable_grouping)
  File "transformers/image_transforms.py", line 917, in group_images_by_shape
    device = images[0][0].device if is_nested else images[0].device
IndexError: list index out of range
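
The last frame of the traceback can be reproduced without transformers installed. The following is only a minimal sketch: `FakeImage`, `pick_device_unsafe`, and `pick_device_guarded` are hypothetical stand-ins, not the library's actual code. It illustrates why indexing `images[0]` fails on an empty list and what an early-return guard could look like.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class FakeImage:
    """Stand-in for a tensor; only the `device` attribute matters here."""

    device: str = "cpu"


def pick_device_unsafe(images: Sequence, is_nested: bool = False) -> str:
    # Same indexing pattern as the final frame of the traceback:
    # raises IndexError when `images` is empty.
    return images[0][0].device if is_nested else images[0].device


def pick_device_guarded(images: Sequence, is_nested: bool = False) -> Optional[str]:
    # Hypothetical guard (an assumption, not the merged fix):
    # flatten nested input first, then return None if nothing is left.
    if is_nested:
        images = [img for sample in images for img in sample]
    if len(images) == 0:
        return None
    return images[0].device
```

With this sketch, `pick_device_unsafe([])` raises the same `IndexError`, while `pick_device_guarded([])` returns `None` so a caller can skip image preprocessing.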

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@ArthurZucker, @amyeroberts, @qubvel

Member

@zucchini-nlp zucchini-nlp left a comment

Thanks for the PR @doubao2021-ai !

We had another one in progress here (#36682), which is more complete and fixes all models. I'd prefer that one to be merged.

@HollowMan6
Contributor Author

HollowMan6 commented Sep 2, 2025

Hi @zucchini-nlp, thanks for your reply! I think #36682 doesn't consider the situation when there's no image at all and batch size is 1, i.e., images = []. Feel free to take over this PR as I don't have the bandwidth to fix all models. Thanks in advance!

@zucchini-nlp
Member

@HollowMan6 if there are no images at all, it is recommended to simply pass images=None, which is the default when there is no image input.
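
On the caller side, that recommendation could be applied with a small wrapper. This is a sketch under assumptions: `images_or_none` is a hypothetical helper name, and the commented call site simply mirrors the processor call shown in the traceback.

```python
from typing import Any, Optional, Sequence


def images_or_none(images: Optional[Sequence[Any]]) -> Optional[Sequence[Any]]:
    """Return None when `images` is None or empty, else return it unchanged."""
    if images is None or len(images) == 0:
        return None
    return images


# Hypothetical call site, mirroring the call in the traceback:
# model_inputs = processor(
#     text=[raw_prompt], images=images_or_none(images), return_tensors="pt"
# )
```

The helper only handles a flat, empty list; a nested batch in which every sample is empty would need its own check.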

@HollowMan6
Contributor Author

Yeah, that's what I proposed in volcengine/verl#3281, but it might also be good to enforce some checks on the transformers library side.

For example, this PR fixes the following error when we pass an empty list
of `images` to Qwen2.5-VL:

```log
  File "torchdata/stateful_dataloader/worker.py", line 242, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[union-attr]
  File "torch/utils/data/_utils/fetch.py", line 52, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "torch/utils/data/_utils/fetch.py", line 52, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "verl/utils/dataset/rl_dataset.py", line 248, in __getitem__
    model_inputs = self.processor(text=[raw_prompt], images=images, videos=videos, return_tensors="pt")
  File "transformers/models/qwen2_5_vl/processing_qwen2_5_vl.py", line 150, in __call__
    image_inputs = self.image_processor(images=images, **output_kwargs["images_kwargs"])
  File "transformers/image_processing_utils_fast.py", line 637, in __call__
    return self.preprocess(images, *args, **kwargs)
  File "transformers/models/qwen2_vl/image_processing_qwen2_vl_fast.py", line 151, in preprocess
    return super().preprocess(images, videos, **kwargs)
  File "transformers/image_processing_utils_fast.py", line 662, in preprocess
    return self._preprocess_image_like_inputs(
  File "transformers/models/qwen2_vl/image_processing_qwen2_vl_fast.py", line 173, in _preprocess_image_like_inputs
    batch_feature = self._preprocess(images, **kwargs)
  File "transformers/models/qwen2_vl/image_processing_qwen2_vl_fast.py", line 211, in _preprocess
    grouped_images, grouped_images_index = group_images_by_shape(images, disable_grouping=disable_grouping)
  File "transformers/image_transforms.py", line 917, in group_images_by_shape
    device = images[0][0].device if is_nested else images[0].device
IndexError: list index out of range
```

Signed-off-by: Hollow Man <[email protected]>
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: glm4v, llama4, qwen2_vl, qwen3_vl

@zucchini-nlp
Member

zucchini-nlp commented Oct 13, 2025

Hey @HollowMan6, the feature was already added to transformers in one of the older PRs, so I think you can close this one now. Passing empty lists for images should work, e.g. images = [[im1, im2], [], [im3]].
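
To make the nested form concrete, here is a small sketch of how a batch with an image-free sample can be flattened while keeping per-sample counts. `flatten_per_sample` is a hypothetical helper for illustration only, and the strings stand in for real images; it is not how transformers handles this internally.

```python
from typing import Any, List, Sequence, Tuple


def flatten_per_sample(batch: Sequence[Sequence[Any]]) -> Tuple[List[Any], List[int]]:
    """Flatten per-sample image lists and record how many images each sample had."""
    flat = [img for sample in batch for img in sample]
    counts = [len(sample) for sample in batch]
    return flat, counts


# The middle sample has no images, so its count is 0.
flat, counts = flatten_per_sample([["im1", "im2"], [], ["im3"]])
```

Keeping the counts alongside the flat list lets the images be mapped back to their samples even when some samples contribute none.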

@HollowMan6
Contributor Author

Oh okay, good to know! Thanks!

@HollowMan6 HollowMan6 closed this Oct 13, 2025
@HollowMan6 HollowMan6 deleted the noimage branch October 13, 2025 08:42