
Conversation

@youkaichao
Member

temp fix for #4193

For users who don't use guided decoding but run on a Slurm cluster, this makes their lives easier.

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, they only run the fastcheck CI, which consists of a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of the default ones by unblocking the steps in your fast-check build on the Buildkite UI.

Once the PR is approved and ready to go, please make sure to run the full CI, as it is required for merging (or just use auto-merge).

To run full CI, you can do one of these:

  • Comment /ready on the PR
  • Add ready label to the PR
  • Enable auto-merge.

🚀

@simon-mo
Collaborator

Nice, this is a good fix for now! But users will still have problems with the actual usage :(

@youkaichao
Member Author

Yep, this is a temp fix; I heard many people are suffering from this problem even though they don't use outlines at all.

For people who really want guided decoding and still hit the problem, there is --guided-decoding-backend=lm-format-enforcer :)
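
For example, launching the server with that backend looks like this (a minimal sketch; the model name is a placeholder, but the --guided-decoding-backend flag is the one mentioned above):

python -m vllm.entrypoints.openai.api_server \
    --model meta-llama/Meta-Llama-3-8B-Instruct \
    --guided-decoding-backend lm-format-enforcer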

@youkaichao
Member Author

I'm waiting for users' confirmation to see if it works.

@chujiezheng

Tested on Slurm, and this PR works well for me.

@alex2awesome

When will this be released? I'm unable to build from source on my HPC server.

@youkaichao
Member Author

@alex2awesome you don't need to wait for the release; we publish per-commit wheels. See:

https://docs.vllm.ai/en/latest/getting_started/installation.html

@alex2awesome

alex2awesome commented Aug 29, 2024

That's good to know!! Unfortunately, after installing the nightly build, I'm still getting this error. Is there a way to delete/refresh the database?

  File "/project/jonmay_231/spangher/Projects/conditional-information-retrieval/source_summaries/data_vllm_70b.py", line 19, in <module>
    from vllm import LLM,  SamplingParams
  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/vllm/__init__.py", line 6, in <module>
    from vllm.entrypoints.llm import LLM
  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/vllm/entrypoints/llm.py", line 13, in <module>
    from vllm.model_executor.guided_decoding import (
  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/vllm/model_executor/guided_decoding/__init__.py", line 8, in <module>
    from vllm.model_executor.guided_decoding.outlines_decoding import (
  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/vllm/model_executor/guided_decoding/outlines_decoding.py", line 15, in <module>
    from vllm.model_executor.guided_decoding.outlines_logits_processors import (
  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/vllm/model_executor/guided_decoding/outlines_logits_processors.py", line 25, in <module>
    from outlines import grammars
  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/outlines/__init__.py", line 2, in <module>
    import outlines.generate
  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/outlines/generate/__init__.py", line 2, in <module>
    from .cfg import cfg
  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/outlines/generate/cfg.py", line 3, in <module>
    from outlines.fsm.guide import CFGGuide
  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/outlines/fsm/guide.py", line 109, in <module>
    def create_states_mapping(
  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/outlines/caching.py", line 93, in decorator
    memory = get_cache()
  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/outlines/caching.py", line 65, in get_cache
    memory["__version__"] = outlines_version
  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/diskcache/core.py", line 823, in __setitem__
    self.set(key, value, retry=True)
  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/diskcache/core.py", line 806, in set
    self._row_update(rowid, now, columns)
  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/diskcache/core.py", line 828, in _row_update
    sql(
sqlite3.DatabaseError: database disk image is malformed

@youkaichao
Member Author

File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/vllm/model_executor/guided_decoding/init.py", line 8, in
from vllm.model_executor.guided_decoding.outlines_decoding import (

your installation might be wrong. if you have the latest commit installed, line 8 should not be this one.

see

from vllm.sampling_params import LogitsProcessor
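
A quick way to verify which code you actually have installed (a sanity-check sketch; paths will differ per environment):

python -c "import vllm; print(vllm.__version__, vllm.__file__)"
python -c "import vllm.model_executor.guided_decoding as m; print(open(m.__file__).readlines()[7], end='')"

The second command prints line 8 of the installed guided_decoding/__init__.py, which on the latest commit should match the import shown above.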

@alex2awesome

alex2awesome commented Aug 30, 2024

Ahh thanks @youkaichao — dumb error on my part, I just copy/pasted the instructions in the docs. The right version to use for anyone coming here is:

export VLLM_VERSION=0.5.5
pip install https://vllm-wheels.s3.us-west-2.amazonaws.com/nightly/vllm-${VLLM_VERSION}-cp38-abi3-manylinux1_x86_64.whl

@gustavosm

gustavosm commented Aug 30, 2024

Hello, guys!!
I'm facing the same problem as @alex2awesome, but I don't know what I can do here, because I'm running a Go app that starts a Docker container for vLLM, so I'm not sure how to run the pip install mentioned above: my container crashes as soon as it starts. I have tried vLLM version 0.5.5, but with no success.
Any suggestions?
The error message I'm getting is exactly the same as the one @alex2awesome posted a few comments above.

Something that might be worth mentioning: the problem only occurs if I run the app with my Linux user... if I log in with another user, everything works fine.

@LucWeber

Hello, guys!! I'm facing the same problem as @alex2awesome, but I don't know what I can do here, because I'm running a Go app that starts a Docker container for vLLM, so I'm not sure how to run the pip install mentioned above: my container crashes as soon as it starts. I have tried vLLM version 0.5.5, but with no success. Any suggestions? The error message I'm getting is exactly the same as the one @alex2awesome posted a few comments above.

Something that might be worth mentioning: the problem only occurs if I run the app with my Linux user... if I log in with another user, everything works fine.

As I understand it, the fix is not yet in 0.5.5. You will have to install from source.
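
If installing a newer build isn't an option, a workaround that may help in the meantime (this assumes outlines' default diskcache location of ~/.cache/outlines and its OUTLINES_CACHE_DIR environment variable; both are outlines conventions, not vLLM options):

# delete the possibly corrupted outlines cache so it gets rebuilt
rm -rf ~/.cache/outlines

# or give each user/job its own writable cache directory
export OUTLINES_CACHE_DIR=/tmp/outlines-cache-$USER

A per-user cache directory may also sidestep the situation @gustavosm describes, where the app works under one Linux user but not another.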

@youkaichao youkaichao mentioned this pull request Sep 2, 2024
@alex2awesome

Hi @youkaichao, has the fix made it into the OpenAI-compatible inference server yet?

I've tested it, and loading the model in Python works fine, as in: https://docs.vllm.ai/en/latest/getting_started/quickstart.html

but when I try to launch the inference server using:

python -m vllm.entrypoints.openai.api_server \
    --model NousResearch/Meta-Llama-3-70B-Instruct \
    --dtype float16 \
    --tensor-parallel-size $NUM_GPUS \
    --api-key token-abc123 \
    --enforce-eager &

I get the same errors:

  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/vllm/model_executor/guided_decoding/outlines_decoding.py", line 15, in <module>
    from vllm.model_executor.guided_decoding.outlines_logits_processors import (
  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/vllm/model_executor/guided_decoding/outlines_logits_processors.py", line 25, in <module>
    from outlines import grammars
  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/outlines/__init__.py", line 2, in <module>
    import outlines.generate
  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/outlines/generate/__init__.py", line 2, in <module>
    from .cfg import cfg
  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/outlines/generate/cfg.py", line 3, in <module>
    from outlines.fsm.guide import CFGGuide
  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/outlines/fsm/guide.py", line 109, in <module>
    def create_states_mapping(
  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/outlines/caching.py", line 93, in decorator
    memory = get_cache()
  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/outlines/caching.py", line 65, in get_cache
    memory["__version__"] = outlines_version
  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/diskcache/core.py", line 823, in __setitem__
    self.set(key, value, retry=True)
  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/diskcache/core.py", line 806, in set
    self._row_update(rowid, now, columns)
  File "/home1/spangher/miniconda3/envs/vllm-py310/lib/python3.10/site-packages/diskcache/core.py", line 828, in _row_update
    sql(
sqlite3.DatabaseError: database disk image is malformed

@youkaichao
Member Author

@alex2awesome it should also work for the api server.

Your stack trace is incomplete, and it is unclear whether you are using guided decoding or not.
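
Since the crash happens at import time, one way to check is whether importing the server module eagerly pulls in outlines (a diagnostic sketch; the expectation that this prints False rests on the deferred import this PR introduces):

python -c "import vllm.entrypoints.openai.api_server, sys; print('outlines' in sys.modules)"

If it prints True, the installed build still imports outlines at startup and likely predates this fix.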

@youkaichao youkaichao mentioned this pull request Oct 1, 2024
Alvant pushed a commit to compressa-ai/vllm that referenced this pull request Oct 26, 2024
LeiWang1999 pushed a commit to LeiWang1999/vllm-bitblas that referenced this pull request Mar 26, 2025