
Conversation

@noooop
Collaborator

@noooop noooop commented Mar 3, 2025

TLDR

Use model_redirect to redirect a model name to a local folder.
This lets code refer to the model by name without hard-coding the model path.

Usage

One redirect rule per line, with model_name and redirect_name separated by a tab (\t).

For example, given a redirection facebook/opt-125m -> /data/LLM-model/opt-125m

echo -e "facebook/opt-125m\t/data/LLM-model/opt-125m\n" > .model.redirect

VLLM_MODEL_REDIRECT_PATH=".model.redirect" vllm serve facebook/opt-125m

should be equivalent to

vllm serve /data/LLM-model/opt-125m --served-model-name facebook/opt-125m

Use Case

  1. Serve a local model (e.g. your own fine-tuned model) by model name instead of model path.
  2. Allow models from different sources (e.g. ModelScope and Hugging Face), models stored at non-standard paths, and local models to coexist harmoniously.
  3. Offline mode. When a model name is used, vLLM needs to contact Hugging Face several times to resolve the model path and check whether the model is up to date. Redirecting the model name to a local folder avoids network access entirely.
  4. Pin a model version. Using the model name will automatically query Hugging Face and download the latest model version, but sometimes you need a specific one. Redirecting the model name to a local folder lets you avoid hardcoding revisions.
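The lookup behind this can be sketched as a small helper. This is an illustrative sketch, not vLLM's actual implementation; the function name maybe_model_redirect follows the name adopted during review below, and the file format is the one described above (one rule per line, tab-separated):

```python
import os


def maybe_model_redirect(model: str) -> str:
    """Return the redirected local path for `model` if a rule exists,
    otherwise return `model` unchanged. (Illustrative sketch only.)"""
    redirect_path = os.getenv("VLLM_MODEL_REDIRECT_PATH")
    if not redirect_path or not os.path.isfile(redirect_path):
        return model
    with open(redirect_path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # One rule per line: model_name and redirect_name separated by \t
            name, sep, target = line.partition("\t")
            if sep and name == model:
                return target
    return model
```

With the example rule above, maybe_model_redirect("facebook/opt-125m") would return /data/LLM-model/opt-125m, while any name without a matching rule passes through unchanged.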

@noooop noooop marked this pull request as draft March 3, 2025 06:09
@github-actions

github-actions bot commented Mar 3, 2025

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, they only run fastcheck CI, which runs a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

@noooop noooop force-pushed the model_overwrite branch 3 times, most recently from 50acdc7 to 5fecb39 Compare March 3, 2025 06:49
@noooop noooop marked this pull request as ready for review March 3, 2025 07:08
@noooop noooop changed the title [WIP] Use model_overwrite to redirect the model name to a local folder. [Misc] Use model_overwrite to redirect the model name to a local folder. Mar 3, 2025
@noooop noooop marked this pull request as draft March 3, 2025 07:39
@noooop noooop marked this pull request as ready for review March 3, 2025 08:33
@DarkLight1337
Member

I think this is not necessary anymore since we have migrated the CI/CD to using our own file storage which has pre-downloaded models.

@noooop
Collaborator Author

noooop commented Mar 25, 2025

@DarkLight1337

This feature is very convenient for everyone who needs to load models locally: no need to type out the model path every time.

It's not just for vLLM CI/CD.

@mergify

mergify bot commented Mar 25, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @noooop.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Mar 25, 2025
@DarkLight1337
Member

Don't we already have HF_HOME to change the directory containing locally downloaded HF repos?

@noooop
Collaborator Author

noooop commented Mar 25, 2025

Some of my models come from ModelScope, some from HF, and some, e.g. DeepSeek V3, need to be downloaded to another partition.

I found that using model_overwrite is very convenient.

@DarkLight1337 DarkLight1337 requested a review from Isotr0py March 25, 2025 06:10
@DarkLight1337
Member

Personally I don't use ModelScope, so I'll have @Isotr0py review this instead.

@noooop noooop closed this Mar 25, 2025
@noooop noooop reopened this Mar 25, 2025
@mergify mergify bot added documentation Improvements or additions to documentation and removed needs-rebase labels Mar 25, 2025
@noooop
Collaborator Author

noooop commented Mar 25, 2025

@Isotr0py
Ready for review.

examples/offline_inference/basic/use_model_overwrite.py will be deleted.

I think the config is too complicated, and it will also request hf multiple times.

Member

@Isotr0py Isotr0py left a comment


Hmmm, I'm fine with having a function to manually redirect the model_repo to a downloaded local directory, but I don't really like the name "overwrite" or the introduction of .model.overwrite (it doesn't seem to be a formal file used by either HF or ModelScope)...

Member


Suggested change
def model_overwrite(model: str):
def maybe_model_redirect(model: str):

I prefer to use maybe_model_redirect here.

Collaborator Author


maybe_model_redirect is fine.

Member


We should move the helper function to vllm.transformers_utils.utils if it's used by both config and tokenizer.

Comment on lines 270 to 272
Member


I think we just need to call the redirect function when initializing ModelConfig, so that we don't need to add it here and there.

Collaborator Author

@noooop noooop Mar 25, 2025


I don't want to override any model name in ModelConfig.

Yes, reading the config is too complicated, and I don't want to write it everywhere either.

Member


Is this a formal file in ModelScope? I can't find it in ModelScope's API documentation.

Collaborator Author


I named it myself.

@Isotr0py
Member

I think the config is too complicated, and it will also request hf multiple times.

I think we just need to redirect to local location when initializing ModelConfig, so that we don't need to request hf multiple times.

@noooop
Collaborator Author

noooop commented Mar 25, 2025

I think we just need to redirect to local location when initializing ModelConfig, so that we don't need to request hf multiple times.

  1. Redirect to the local location when initializing ModelConfig. Log output and served model names will be very weird, and it may even trigger strange bugs.

  2. Redirect in each submodule. The redirection logic ends up written in many places.

  3. Add a model_path parameter to each module? That's a huge project.

It's all very hacky anyway.

@Isotr0py

Maybe I should stop this stupid idea.

@Isotr0py
Member

Isotr0py commented Mar 25, 2025

Otherwise, log output and server name will be very weird.

IMO, if we redirect the model_repo to a local directory manually, we should also make sure the model's name is updated accordingly; otherwise it's a little bit hacky and makes bugs caused by outdated custom code difficult to find. (For HF, a model with custom code won't be updated automatically when loading an outdated local checkpoint.)

About the server name, we can use --served-model-name to use model repo name.

@noooop
Collaborator Author

noooop commented Mar 25, 2025

I personally prefer option 2:

2. Redirect in each submodule. The redirection logic ends up written in many places.

@Isotr0py

Looking forward to hearing your opinion.

Member

@Isotr0py Isotr0py left a comment


LGTM now!

@Isotr0py Isotr0py enabled auto-merge (squash) March 26, 2025 09:14
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Mar 26, 2025
@noooop
Collaborator Author

noooop commented Mar 26, 2025

Thanks for reviewing the code!

auto-merge was automatically disabled March 27, 2025 03:12

Head branch was pushed to by a user without write access

@noooop noooop closed this Mar 27, 2025
auto-merge was automatically disabled March 27, 2025 03:12

Pull request was closed

@noooop noooop reopened this Mar 27, 2025
@Isotr0py Isotr0py enabled auto-merge (squash) March 27, 2025 04:18
@noooop
Collaborator Author

noooop commented Mar 27, 2025

@Isotr0py

Please restart the tests.

@noooop
Collaborator Author

noooop commented Mar 27, 2025

I'm not sure if this PR is the cause of the problem. It seems that these tests also reported errors yesterday.

QVQ

@Isotr0py
Member

The entrypoint test failure should be unrelated, I can confirm it's passed locally. The V1 test is flaky currently. 😅

@noooop
Collaborator Author

noooop commented Mar 27, 2025

The entrypoint test failure should be unrelated, I can confirm it's passed locally. The V1 test is flaky currently. 😅

QVQ

@vllm-bot vllm-bot merged commit 3f532cb into vllm-project:main Mar 27, 2025
31 of 33 checks passed
Alex4210987 pushed a commit to LeiWang1999/vllm-bitblas that referenced this pull request Apr 5, 2025
lulmer pushed a commit to lulmer/vllm that referenced this pull request Apr 7, 2025
lk-chen pushed a commit to lk-chen/vllm that referenced this pull request Apr 29, 2025
shreyankg pushed a commit to shreyankg/vllm that referenced this pull request May 3, 2025
RichardoMrMu pushed a commit to RichardoMrMu/vllm that referenced this pull request May 12, 2025
@noooop noooop deleted the model_overwrite branch July 10, 2025 04:47

Labels

documentation Improvements or additions to documentation ready ONLY add when PR is ready to merge/full CI is needed
