Closed
Labels: bug (Something isn't working)
Description
Your current environment
The output of `python collect_env.py`
🐛 Describe the bug
I'm attempting to integrate LLaVA with vLLM for image processing, but I'm encountering a tensor size mismatch error when executing my script.
Setup:
I installed vLLM along with other required packages using the following command:
!pip install vllm==0.4.1 kaleido python-multipart torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1
Code:
Here's the script I used to run LLaVA:
import torch
from vllm import LLM
from vllm.sequence import MultiModalData


def run_llava_pixel_values_debug():
    llm = LLM(
        model="llava-hf/llava-1.5-7b-hf",
        enforce_eager=True,
        tensor_parallel_size=1,
        image_input_type="pixel_values",
        image_token_id=32000,
        image_input_shape="1,3,224,224",
        image_feature_size=576,
    )
    # One <image> placeholder token per expected image feature.
    prompt = "<image>" * 576 + (
        "\nUSER: What is the content of this image?\nASSISTANT:")
    # Load a smaller or test image file if available, or adjust the
    # existing one to match the test size.
    with open("3d-background-with-hexagonal-shapes-texture_23-2150473185.jpg",
              "rb") as f:
        image_file = f.read()
    outputs = llm.generate(prompt,
                           multi_modal_data=MultiModalData(
                               type=MultiModalData.Type.IMAGE,
                               data=image_file))
    for o in outputs:
        generated_text = o.outputs[0].text
        print(generated_text)


run_llava_pixel_values_debug()
Error:
Upon running this script, I receive the following error:
RuntimeError: The size of tensor a (257) must match the size of tensor b (577) at non-singleton dimension 1.
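If I'm reading the numbers right, they line up with CLIP ViT patch-token counts (assuming the patch size is 14, as in the CLIP-ViT-L/14-336 vision tower that llava-hf/llava-1.5-7b-hf uses): a 224×224 input yields 16² = 256 patch tokens plus one CLS token (257), while a 336×336 input yields 24² + 1 = 577. This plain-arithmetic sketch, which needs no vLLM at all, reproduces both numbers from the error:

```python
# Patch-token arithmetic for a CLIP ViT, assuming patch size 14.
# Token count = (image_side // patch_size) ** 2 patches + 1 CLS token.
def vit_token_count(image_side: int, patch_size: int = 14) -> int:
    return (image_side // patch_size) ** 2 + 1

print(vit_token_count(224))  # 257 -- "tensor a" in the error
print(vit_token_count(336))  # 577 -- "tensor b" in the error
```

So the mismatch may simply mean my configured image_input_shape of 1,3,224,224 disagrees with the 336×336 resolution the model's vision tower expects, though I'd appreciate confirmation.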
Could anyone assist in identifying the source of this issue and suggest how I might correct the tensor size mismatch? Any help or suggestions would be greatly appreciated.