[Feature]: LoRA adapter support for vLLM alignment #281

@AndyZhou952

Description

🚀 The feature, motivation and pitch

Following up on our earlier discussion under the roadmap, I’m opening this issue to propose implementing LoRA support for the vLLM alignment workflow. Multimodal RL projects such as mm_grpo are also looking to adopt vllm-omni as a rollout engine. Since MM RL typically fine-tunes only the LoRA adapters (e.g., FlowGRPO, DiffusionNFT), this integration would directly enable RL training workflows where weights are updated dynamically.

Having LoRA in vllm-omni also aligns with the base vLLM design, and it brings additional benefits:

  • Dynamic adaptation: load/unload adapters without restarting the engine
  • Memory efficiency: adapters have a far smaller memory footprint than full model copies
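To make the two benefits above concrete, here is a minimal numpy sketch of the LoRA mechanism (not any vllm-omni API; all dimensions and names are illustrative). Because the adapter is a low-rank product B @ A added onto a frozen base weight, "loading" it is a cheap in-place merge and "unloading" subtracts it back out, and the adapter itself stores orders of magnitude fewer parameters than a full fine-tuned copy of the weight:

```python
import numpy as np

# Illustrative dimensions, not taken from any specific model.
d_out, d_in, rank = 4096, 4096, 16

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in)).astype(np.float32)  # frozen base weight
A = rng.standard_normal((rank, d_in)).astype(np.float32)   # LoRA down-projection
B = np.zeros((d_out, rank), dtype=np.float32)              # LoRA up-projection (zero-init)
alpha = 32.0                                               # LoRA scaling hyperparameter

# "Loading" the adapter is just adding the scaled low-rank product;
# "unloading" subtracts it back out -- no engine restart required.
W_adapted = W + (alpha / rank) * (B @ A)

full_params = W.size           # a full fine-tuned copy of W
lora_params = A.size + B.size  # the adapter alone
print(f"full copy: {full_params:,} params, adapter: {lora_params:,} params "
      f"({full_params // lora_params}x smaller)")
```

At these (hypothetical) dimensions the adapter is 128x smaller than a full copy of the single weight matrix, which is why serving many adapters over one base model is so much cheaper than serving many full model copies.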

Happy to take on the work if no one else is currently assigned.

Alternatives

No response

Additional context

No response

Before submitting a new issue...

  • Make sure you have already searched for relevant issues and asked the chatbot at the bottom-right corner of the documentation page, which can answer many frequently asked questions.

Metadata

Labels

help wanted (Extra attention is needed)
