
[RFC]: Improving vLLM Dependency Compatibility with Downstream Ecosystems #33599

@jeffreywang-anyscale


Motivation.

vLLM maintains a set of Python dependencies that are strictly pinned or tightly constrained (example). While this helps ensure internal stability, external libraries that consume vLLM as a dependency can run into incompatibilities with their existing dependency stacks.

In practice, these constraints can conflict with widely used ecosystem configurations. For example, vLLM currently requires numpy >= 2 through transitive dependencies, while Ray’s released container images and lock files pin numpy==1.26.4. No numpy version satisfies both constraints, so Ray LLM images that depend on both Ray and vLLM cannot be built.
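To make the numpy conflict concrete, the sketch below checks a pinned version against a lower-bound requirement. It is an illustration only: `parse_version` and `satisfies_lower_bound` are hypothetical helpers that handle plain dotted numeric versions, not the full version grammar that pip implements.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a plain dotted version such as '1.26.4' into (1, 26, 4).

    Trailing zeros are dropped so that '2', '2.0' and '2.0.0' compare equal.
    Assumption: only numeric components (no 'rc1', '.post1', etc.).
    """
    parts = [int(part) for part in text.strip().split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def satisfies_lower_bound(pinned: str, minimum: str) -> bool:
    """Return True if the pinned version is >= the required minimum."""
    return parse_version(pinned) >= parse_version(minimum)


# Ray's lock files pin numpy==1.26.4; vLLM's dependency tree needs numpy>=2.
ray_pin = "1.26.4"
vllm_minimum = "2"
print(satisfies_lower_bound(ray_pin, vllm_minimum))  # False
```

Because the pin fixes a single version and that version is below the lower bound, a resolver has nothing left to choose from and reports the requirement set as unsatisfiable.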

These conflicts are often only discovered when Ray attempts to upgrade its vLLM dependency, at which point the incompatibility is already blocking integration or release timelines. Earlier detection of ecosystem-level incompatibilities would reduce integration friction and allow maintainers to make more informed tradeoffs when introducing or tightening dependency constraints.

Proposed Change.

We propose adding CI coverage that tests vLLM against representative downstream ecosystem configurations, starting with Ray. The goal is not to guarantee universal compatibility, but to surface breaking dependency changes earlier and make them explicit.

[Recommended] Approach 1: CI testing against Ray lock files

Add CI tests that attempt to install vLLM using Ray’s published dependency lock files. For example:

pip install vllm.whl -c ray_test_py311_cu128.lock

Using Ray’s recent lock files (example lock file) would allow vLLM to detect dependency conflicts at build time rather than during downstream integration.
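A CI step along these lines could wrap the install command in a small script. The following is a minimal sketch, not a proposed implementation: `build_dry_run_command`, `check_compat`, and `main` are hypothetical names, and it assumes pip 22.2 or newer (for `--dry-run`) and that the Ray lock file is usable as a pip constraints file.

```python
import subprocess
import sys
from pathlib import Path


def build_dry_run_command(wheel: Path, constraints: Path) -> list[str]:
    """Build a pip command that resolves the wheel under the constraints
    without installing anything."""
    return [
        sys.executable, "-m", "pip", "install",
        "--dry-run",              # resolve only; available in pip >= 22.2
        "-c", str(constraints),   # downstream lock file used as constraints
        str(wheel),
    ]


def check_compat(wheel: Path, constraints: Path) -> bool:
    """Return True if pip can resolve the wheel under the constraints."""
    result = subprocess.run(
        build_dry_run_command(wheel, constraints),
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Surface pip's resolver message so the conflict is visible in CI logs.
        print(result.stderr, file=sys.stderr)
    return result.returncode == 0


def main(argv: list[str]) -> int:
    """Exit code 0 when compatible, 1 when the resolver reports a conflict."""
    wheel, constraints = Path(argv[0]), Path(argv[1])
    return 0 if check_compat(wheel, constraints) else 1
```

A job would call `main` with the built wheel path and the lock file path and fail the build on a non-zero return. Using a dry run keeps the check focused on dependency resolution, so a failure points at a constraint conflict rather than an unrelated install problem.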

Pros

  • Catches incompatibilities early
  • Minimal changes to vLLM packaging
  • Scales to additional ecosystem partners over time

Cons

  • Requires agreement on which downstream configurations are representative

Approach 2: Provide an alternative wheel with relaxed constraints

Maintain a separate vLLM wheel or dependency profile with relaxed version constraints to improve composability with downstream systems.

Pros

  • Maximizes flexibility for downstream consumers

Cons

  • Requires parallel dependency definitions and CI
  • Does not scale well to multiple ecosystems

Feedback Period.

2 weeks.

CC List.

@simon-mo @khluu @kouroshHakha

Any Other Things.

No response
