Skip to content

Conversation

@ywang96
Copy link
Member

@ywang96 ywang96 commented Nov 11, 2025

Purpose

This PR fixes the bug in dummy data generation for the image data that would result in the max number of features/embeddings as well as cleaning up some code since it's always assumed that resizing is turned on.

  • On main
Encoder cache will be initialized with a budget of 2048 tokens, and profiled with 10 image items of the maximum feature size.

Max image embedding size is 196.

  • This branch
Encoder cache will be initialized with a budget of 3600 tokens, and profiled with 1 image items of the maximum feature size.

Max image embedding size is 3600.

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request aims to fix a bug in dummy data generation for PaddleOCR-VL to correctly calculate the maximum image embedding size. The changes involve removing a redundant if condition for resizing and updating the logic in get_image_size_with_most_features.

While the intention is correct, I've found a flaw in the new logic for get_image_size_with_most_features. The current implementation is brittle and only works correctly if the maximum number of tokens is a perfect square. I've provided a critical review comment with a more robust implementation to ensure it works for all cases.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Signed-off-by: Roger Wang <[email protected]>
@ywang96
Copy link
Member Author

ywang96 commented Nov 11, 2025

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request fixes a bug in calculating the maximum image size for PaddleOCR-VL models, which is important for memory profiling. The change correctly computes the image dimensions that yield the maximum number of features based on max_pixels. Additionally, it cleans up some redundant code related to image resizing. The core logic of the fix is sound, but I've identified a potential edge case that could lead to a crash.

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) November 11, 2025 05:02
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Nov 11, 2025
@DarkLight1337 DarkLight1337 merged commit 4fd4b74 into vllm-project:main Nov 11, 2025
54 checks passed
@ywang96 ywang96 added this to the v0.11.1 milestone Nov 13, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Nov 13, 2025
khluu pushed a commit that referenced this pull request Nov 16, 2025
Signed-off-by: Roger Wang <[email protected]>
(cherry picked from commit 4fd4b74)
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025
charlotte12l pushed a commit to charlotte12l/vllm that referenced this pull request Dec 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready ONLY add when PR is ready to merge/full CI is needed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants