Merged
lilyjge approved these changes on Feb 24, 2026
Mostly agentic coding; both vLLM inference modes are tested via demos, with no impact on the Gemini and GPT rankers.
I will add tests for sglang and tensorrt in a follow-up PR if we decide to keep them.
Before this change we would create the prompts for all requests, then process the sliding window for all requests in parallel. This meant every request had to finish ranking the current window before the next batch could be sent. For thinking models, a single harder query was enough to keep everyone waiting, leaving lots of idle time.
This CL performs prompt creation, LLM inference, permutation update, and enqueuing of the next window for each request individually; batch_size controls the number of in-flight requests to the vLLM inference handlers.
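A minimal sketch of that per-request pipeline, assuming a hypothetical async `llm.rank_window()` handle and an illustrative `build_prompt()` helper; neither name comes from this PR or the vLLM API. The point is the shape: each request slides its own ranking window independently, and a semaphore caps in-flight requests at `batch_size`.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Request:
    query: str
    docs: list[str]


def build_prompt(query: str, docs: list[str]) -> str:
    # Placeholder prompt construction; the real template lives in the ranker.
    passages = "\n".join(f"[{i + 1}] {d}" for i, d in enumerate(docs))
    return f"Query: {query}\nRank these passages:\n{passages}"


async def rerank_request(llm, req: Request, window_size: int = 20, step: int = 10) -> list[int]:
    """Slide a ranking window over one request's documents, bottom to top,
    without waiting on any other request."""
    permutation = list(range(len(req.docs)))
    start = max(0, len(permutation) - window_size)
    while True:
        window = permutation[start:start + window_size]
        # Prompt creation, LLM inference, and permutation update happen
        # per request, so a slow window here stalls only this request.
        prompt = build_prompt(req.query, [req.docs[i] for i in window])
        order = await llm.rank_window(prompt)  # hypothetical: indices into window, best first
        permutation[start:start + window_size] = [window[i] for i in order]
        if start == 0:
            break
        start = max(0, start - step)  # enqueue the next window immediately
    return permutation


async def rerank_all(llm, requests: list[Request], batch_size: int = 32) -> list[list[int]]:
    """batch_size bounds the number of requests in flight to the inference handler."""
    sem = asyncio.Semaphore(batch_size)

    async def bounded(req: Request) -> list[int]:
        async with sem:
            return await rerank_request(llm, req)

    return await asyncio.gather(*(bounded(r) for r in requests))
```

With this shape, a hard query on a thinking model only delays its own window loop; the semaphore slot it holds is the sole shared resource.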
Pull Request Checklist
Reference Issue
Please provide a reference to the issue this PR is addressing (# followed by the issue number). If there is no associated issue, write "N/A".
ref:
Checklist Items
Before submitting your pull request, please review these items:
PR Type
What kind of change does this PR introduce?