Support tokenization_kwargs override #29794
Conversation
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Code Review
This pull request adds support for overriding tokenization behavior by introducing a tokenization_kwargs parameter to the embedding, classification, and scoring APIs. The changes are well-implemented by plumbing this new parameter through the call stack. However, I've found a critical issue in the test utility function get_inputs where the passed processor_kwargs dictionary is modified in-place within a loop, which could lead to incorrect test behavior. I've provided a code suggestion to fix this.
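As an illustration of the flagged pattern and its fix, here is a minimal hypothetical sketch (not the actual test helper):

```python
# Hypothetical reconstruction of the get_inputs hazard: mutating the
# caller's processor_kwargs inside the loop leaks state across iterations
# and back to the caller. Copying per item keeps the input dict pristine.
def get_inputs(prompts, processor_kwargs):
    inputs = []
    for prompt in prompts:
        kwargs = dict(processor_kwargs)  # copy; never modify the argument in place
        kwargs["text"] = prompt          # safe: only the copy is touched
        inputs.append(kwargs)
    return inputs
```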
/gemini review
Code Review
This pull request adds support for overriding tokenization behavior by introducing a tokenization_kwargs parameter in various API methods. The changes are well-implemented for the most part, correctly plumbing the new parameter through the call stack. The tests have also been updated to cover the new functionality. However, I've identified a potential issue in _cross_encoding_score where tokenization_kwargs is used in two different places in the call chain, which could lead to confusion and bugs. My review comment details this concern.
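As a self-contained toy illustration of that concern (hypothetical names; the real call chain differs):

```python
# Toy example of why consuming the same tokenization_kwargs at two levels
# of the call chain is error-prone: max_length truncation is applied twice.
def tokenize(text, tokenization_kwargs=None):
    kwargs = tokenization_kwargs or {}
    if kwargs.get("truncation"):
        text = text[: kwargs.get("max_length", len(text))]
    return text

def cross_encoding_score(query, doc, tokenization_kwargs=None):
    # first use: kwargs applied while tokenizing each text separately
    prompt = tokenize(query, tokenization_kwargs) + " " + tokenize(doc, tokenization_kwargs)
    # second use: the same kwargs applied again to the joint prompt,
    # silently truncating the document half away
    return tokenize(prompt, tokenization_kwargs)

print(cross_encoding_score("q" * 8, "relevant doc",
                           {"truncation": True, "max_length": 8}))
# -> 'qqqqqqqq'  (the document text is dropped by the second pass)
```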
/gemini review
Code Review
This pull request adds support for tokenization_kwargs to customize tokenization behavior in embedding, classification, and scoring APIs. The changes correctly plumb the new parameter through the different layers of the application. The tests have also been updated to cover the new functionality. I've found a couple of high-severity issues where the user-provided tokenization_kwargs dictionary is modified in-place, which can lead to unexpected side effects. I've provided suggestions to fix this by creating a copy of the dictionary.
/gemini review
Code Review
This pull request adds support for overriding tokenization behavior via tokenization_kwargs. The changes are well-implemented across the embedding and classification APIs. However, I've identified a few critical issues where passing conflicting keys in tokenization_kwargs could lead to a TypeError. I've also noted a high-severity issue where multimodal arguments could be dropped during cross-encoder scoring.
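For illustration, a toy reproduction of that failure mode (names are hypothetical; the real code paths differ):

```python
# Passing a keyword both explicitly and via **tokenization_kwargs raises
# TypeError; merging defaults first lets the user's override win instead.
def encode(text, truncation=True, **kwargs):
    return text, truncation, kwargs

tokenization_kwargs = {"truncation": False}

# encode("hi", truncation=True, **tokenization_kwargs)
# TypeError: encode() got multiple values for keyword argument 'truncation'

merged = {"truncation": True, **tokenization_kwargs}
print(encode("hi", **merged))  # ('hi', False, {})
```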
Additionally, the public score() method was not updated to accept tokenization_kwargs, which makes this new feature inaccessible for scoring tasks. Please consider adding this parameter to the score() method and passing it to the internal _embedding_score and _cross_encoding_score methods to complete the feature.
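A simplified sketch of what completing the feature could look like, with heavily abbreviated signatures (the actual vLLM methods take many more parameters):

```python
class Scorer:
    """Stand-in for the pooling entrypoint; the internal methods are stubs."""

    def __init__(self, is_cross_encoder: bool):
        self._is_cross_encoder = is_cross_encoder

    def score(self, text_1, text_2, tokenization_kwargs=None):
        # forward the override to whichever internal path serves the model
        if self._is_cross_encoder:
            return self._cross_encoding_score(text_1, text_2, tokenization_kwargs)
        return self._embedding_score(text_1, text_2, tokenization_kwargs)

    def _cross_encoding_score(self, text_1, text_2, tokenization_kwargs):
        return ("cross-encoder", tokenization_kwargs)

    def _embedding_score(self, text_1, text_2, tokenization_kwargs):
        return ("embedding", tokenization_kwargs)
```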
All tests still pass locally after implementing the suggested fixes. Please review it, thanks! @DarkLight1337
Force-pushed from 2caeb7c to bdfb80f
DarkLight1337 left a comment:
Thanks and sorry for the delay!
Purpose
Add support for a tokenization_kwargs parameter in the embedding, classification, and scoring APIs to allow users to override tokenization behavior during inference.
This enhancement enables users to customize tokenization parameters (e.g., padding, max_length, truncation) when calling embedding-related methods (embed(), classify(), score()), providing more flexibility for handling different input scenarios.
Fix #27566 (comment)
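For illustration, a hedged usage sketch of the behavior described above (the model name is an assumption, not part of this PR):

```python
from vllm import LLM

# Assumed example checkpoint; any embedding/pooling model would do.
llm = LLM(model="BAAI/bge-base-en-v1.5", runner="pooling")

# Override tokenization per request, mirroring the values used in the tests.
outputs = llm.embed(
    ["vLLM supports overriding tokenization per request."],
    tokenization_kwargs={"padding": "max_length", "max_length": 64},
)
print(len(outputs[0].outputs.embedding))
```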
Test Plan
Run the SiGLIP multimodal pooling tests with custom tokenization parameters:
pytest tests/models/multimodal/pooling/test_siglip.py

The test suite now validates the scenarios listed under Test Result below.
Test Result
All SiGLIP pooling tests pass successfully with the following configurations:
✅ test_models_text - validates custom tokenization_kwargs (padding="max_length", max_length=64)
✅ test_models_image - validates default behavior with empty tokenization_kwargs