Your current environment
The output of `python collect_env.py`
🐛 Describe the bug
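The crash below is triggered by a request to the `/v1/embeddings` endpoint (visible as the `POST /v1/embeddings ... 500` line in the log). A request along these lines appears to be enough to trigger it; the host, port, and model name are placeholders, since the log does not show how the server was launched:

```python
# Hypothetical minimal reproduction. Assumes a vLLM OpenAI-compatible server
# is already running on localhost:8000; the model name is a placeholder.
import requests

resp = requests.post(
    "http://localhost:8000/v1/embeddings",
    json={"model": "my-embedding-model", "input": "hello world"},
)
print(resp.status_code)  # 500 Internal Server Error, as in the log below
print(resp.text)
```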
```
ERROR 07-01 08:12:10 async_llm_engine.py:52] Engine background task failed
ERROR 07-01 08:12:10 async_llm_engine.py:52] Traceback (most recent call last):
ERROR 07-01 08:12:10 async_llm_engine.py:52] File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 42, in _log_task_completion
ERROR 07-01 08:12:10 async_llm_engine.py:52] return_value = task.result()
ERROR 07-01 08:12:10 async_llm_engine.py:52] File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 532, in run_engine_loop
ERROR 07-01 08:12:10 async_llm_engine.py:52] has_requests_in_progress = await asyncio.wait_for(
ERROR 07-01 08:12:10 async_llm_engine.py:52] File "/opt/conda/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
ERROR 07-01 08:12:10 async_llm_engine.py:52] return fut.result()
ERROR 07-01 08:12:10 async_llm_engine.py:52] File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 506, in engine_step
ERROR 07-01 08:12:10 async_llm_engine.py:52] request_outputs = await self.engine.step_async()
ERROR 07-01 08:12:10 async_llm_engine.py:52] File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 235, in step_async
ERROR 07-01 08:12:10 async_llm_engine.py:52] output = await self.model_executor.execute_model_async(
ERROR 07-01 08:12:10 async_llm_engine.py:52] File "/opt/conda/lib/python3.10/site-packages/vllm/executor/distributed_gpu_executor.py", line 166, in execute_model_async
ERROR 07-01 08:12:10 async_llm_engine.py:52] return await self._driver_execute_model_async(execute_model_req)
ERROR 07-01 08:12:10 async_llm_engine.py:52] File "/opt/conda/lib/python3.10/site-packages/vllm/executor/multiproc_gpu_executor.py", line 149, in _driver_execute_model_async
ERROR 07-01 08:12:10 async_llm_engine.py:52] return await self.driver_exec_model(execute_model_req)
ERROR 07-01 08:12:10 async_llm_engine.py:52] File "/opt/conda/lib/python3.10/concurrent/futures/thread.py", line 58, in run
ERROR 07-01 08:12:10 async_llm_engine.py:52] result = self.fn(*self.args, **self.kwargs)
ERROR 07-01 08:12:10 async_llm_engine.py:52] File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
ERROR 07-01 08:12:10 async_llm_engine.py:52] return func(*args, **kwargs)
ERROR 07-01 08:12:10 async_llm_engine.py:52] File "/opt/conda/lib/python3.10/site-packages/vllm/worker/worker.py", line 280, in execute_model
ERROR 07-01 08:12:10 async_llm_engine.py:52] output = self.model_runner.execute_model(seq_group_metadata_list,
ERROR 07-01 08:12:10 async_llm_engine.py:52] File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
ERROR 07-01 08:12:10 async_llm_engine.py:52] return func(*args, **kwargs)
ERROR 07-01 08:12:10 async_llm_engine.py:52] File "/opt/conda/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 735, in execute_model
ERROR 07-01 08:12:10 async_llm_engine.py:52] ) = self.prepare_input_tensors(seq_group_metadata_list)
ERROR 07-01 08:12:10 async_llm_engine.py:52] File "/opt/conda/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 682, in prepare_input_tensors
ERROR 07-01 08:12:10 async_llm_engine.py:52] sampling_metadata = SamplingMetadata.prepare(
ERROR 07-01 08:12:10 async_llm_engine.py:52] File "/opt/conda/lib/python3.10/site-packages/vllm/model_executor/sampling_metadata.py", line 116, in prepare
ERROR 07-01 08:12:10 async_llm_engine.py:52] ) = _prepare_seq_groups(seq_group_metadata_list, seq_lens, query_lens,
ERROR 07-01 08:12:10 async_llm_engine.py:52] File "/opt/conda/lib/python3.10/site-packages/vllm/model_executor/sampling_metadata.py", line 208, in _prepare_seq_groups
ERROR 07-01 08:12:10 async_llm_engine.py:52] if sampling_params.seed is not None:
ERROR 07-01 08:12:10 async_llm_engine.py:52] AttributeError: 'NoneType' object has no attribute 'seed'
Exception in callback functools.partial(<function _log_task_completion at 0x7f40f9a39630>, error_callback=<bound method AsyncLLMEngine._error_callback of <vllm.engine.async_llm_engine.AsyncLLMEngine object at 0x7f40b87a3160>>)
handle: <Handle functools.partial(<function _log_task_completion at 0x7f40f9a39630>, error_callback=<bound method AsyncLLMEngine._error_callback of <vllm.engine.async_llm_engine.AsyncLLMEngine object at 0x7f40b87a3160>>)>
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 42, in _log_task_completion
return_value = task.result()
File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 532, in run_engine_loop
has_requests_in_progress = await asyncio.wait_for(
File "/opt/conda/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
return fut.result()
File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 506, in engine_step
request_outputs = await self.engine.step_async()
File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 235, in step_async
output = await self.model_executor.execute_model_async(
File "/opt/conda/lib/python3.10/site-packages/vllm/executor/distributed_gpu_executor.py", line 166, in execute_model_async
return await self._driver_execute_model_async(execute_model_req)
File "/opt/conda/lib/python3.10/site-packages/vllm/executor/multiproc_gpu_executor.py", line 149, in _driver_execute_model_async
return await self.driver_exec_model(execute_model_req)
File "/opt/conda/lib/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/vllm/worker/worker.py", line 280, in execute_model
output = self.model_runner.execute_model(seq_group_metadata_list,
File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 735, in execute_model
) = self.prepare_input_tensors(seq_group_metadata_list)
File "/opt/conda/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 682, in prepare_input_tensors
sampling_metadata = SamplingMetadata.prepare(
File "/opt/conda/lib/python3.10/site-packages/vllm/model_executor/sampling_metadata.py", line 116, in prepare
) = _prepare_seq_groups(seq_group_metadata_list, seq_lens, query_lens,
File "/opt/conda/lib/python3.10/site-packages/vllm/model_executor/sampling_metadata.py", line 208, in _prepare_seq_groups
if sampling_params.seed is not None:
AttributeError: 'NoneType' object has no attribute 'seed'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "uvloop/cbhandles.pyx", line 63, in uvloop.loop.Handle._run
File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 54, in _log_task_completion
raise AsyncEngineDeadError(
vllm.engine.async_llm_engine.AsyncEngineDeadError: Task finished unexpectedly. This should never happen! Please open an issue on Github. See stack trace above for the actual cause.
INFO 07-01 08:12:10 async_llm_engine.py:167] Aborted request cmpl-13a5e1f614ab4afe99ca9ccc99097603-0.
INFO: 192.168.30.254:63180 - "POST /v1/embeddings HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 399, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "/opt/conda/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in call
return await self.app(scope, receive, send)
File "/opt/conda/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in call
await super().call(scope, receive, send)
File "/opt/conda/lib/python3.10/site-packages/starlette/applications.py", line 123, in call
await self.middleware_stack(scope, receive, send)
File "/opt/conda/lib/python3.10/site-packages/starlette/middleware/errors.py", line 186, in call
raise exc
File "/opt/conda/lib/python3.10/site-packages/starlette/middleware/errors.py", line 164, in call
await self.app(scope, receive, _send)
File "/opt/conda/lib/python3.10/site-packages/starlette/middleware/cors.py", line 85, in call
await self.app(scope, receive, send)
File "/opt/conda/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 65, in call
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
File "/opt/conda/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
raise exc
File "/opt/conda/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
await app(scope, receive, sender)
File "/opt/conda/lib/python3.10/site-packages/starlette/routing.py", line 756, in call
await self.middleware_stack(scope, receive, send)
File "/opt/conda/lib/python3.10/site-packages/starlette/routing.py", line 776, in app
await route.handle(scope, receive, send)
File "/opt/conda/lib/python3.10/site-packages/starlette/routing.py", line 297, in handle
await self.app(scope, receive, send)
File "/opt/conda/lib/python3.10/site-packages/starlette/routing.py", line 77, in app
await wrap_app_handling_exceptions(app, request)(scope, receive, send)
File "/opt/conda/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
raise exc
File "/opt/conda/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
await app(scope, receive, sender)
File "/opt/conda/lib/python3.10/site-packages/starlette/routing.py", line 72, in app
response = await func(request)
File "/opt/conda/lib/python3.10/site-packages/fastapi/routing.py", line 278, in app
raw_response = await run_endpoint_function(
File "/opt/conda/lib/python3.10/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
return await dependant.call(**values)
File "/opt/conda/lib/python3.10/site-packages/vllm/entrypoints/openai/api_server.py", line 132, in create_embedding
generator = await openai_serving_embedding.create_embedding(
File "/opt/conda/lib/python3.10/site-packages/vllm/entrypoints/openai/serving_embedding.py", line 124, in create_embedding
async for i, res in result_generator:
File "/opt/conda/lib/python3.10/site-packages/vllm/utils.py", line 250, in consumer
raise e
File "/opt/conda/lib/python3.10/site-packages/vllm/utils.py", line 241, in consumer
raise item
File "/opt/conda/lib/python3.10/site-packages/vllm/utils.py", line 225, in producer
async for item in iterator:
File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 747, in encode
async for output in self._process_request(
File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 780, in _process_request
raise e
File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 776, in _process_request
async for request_output in stream:
File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 89, in anext
raise result
File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 42, in _log_task_completion
return_value = task.result()
File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 532, in run_engine_loop
has_requests_in_progress = await asyncio.wait_for(
File "/opt/conda/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
return fut.result()
File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 506, in engine_step
request_outputs = await self.engine.step_async()
File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 235, in step_async
output = await self.model_executor.execute_model_async(
File "/opt/conda/lib/python3.10/site-packages/vllm/executor/distributed_gpu_executor.py", line 166, in execute_model_async
return await self._driver_execute_model_async(execute_model_req)
File "/opt/conda/lib/python3.10/site-packages/vllm/executor/multiproc_gpu_executor.py", line 149, in _driver_execute_model_async
return await self.driver_exec_model(execute_model_req)
File "/opt/conda/lib/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/vllm/worker/worker.py", line 280, in execute_model
output = self.model_runner.execute_model(seq_group_metadata_list,
File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 735, in execute_model
) = self.prepare_input_tensors(seq_group_metadata_list)
File "/opt/conda/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 682, in prepare_input_tensors
sampling_metadata = SamplingMetadata.prepare(
File "/opt/conda/lib/python3.10/site-packages/vllm/model_executor/sampling_metadata.py", line 116, in prepare
) = _prepare_seq_groups(seq_group_metadata_list, seq_lens, query_lens,
File "/opt/conda/lib/python3.10/site-packages/vllm/model_executor/sampling_metadata.py", line 208, in _prepare_seq_groups
if sampling_params.seed is not None:
AttributeError: 'NoneType' object has no attribute 'seed'
```
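All three tracebacks bottom out in the same frame: `_prepare_seq_groups` (`sampling_metadata.py`, line 208) dereferences `sampling_params.seed` without a `None` check. Embedding requests carry pooling parameters rather than sampling parameters, so `seq_group_metadata.sampling_params` is presumably `None` here, and the attribute access raises. The sketch below illustrates the failure mode and an obvious guard; it is a simplified stand-in, not the actual vLLM source or an official fix:

```python
# Simplified stand-in for the failing check in _prepare_seq_groups; not the
# actual vLLM source. For embedding/pooling requests sampling_params is None.
from typing import Optional


class SamplingParams:  # minimal placeholder for vllm.SamplingParams
    def __init__(self, seed: Optional[int] = None) -> None:
        self.seed = seed


def seed_of(sampling_params: Optional[SamplingParams]) -> Optional[int]:
    # Original shape of the code: `if sampling_params.seed is not None: ...`
    # raises AttributeError when sampling_params is None (this bug).
    # Guarded version: treat a missing sampling_params as "no seed".
    if sampling_params is not None and sampling_params.seed is not None:
        return sampling_params.seed  # seeded-sampling path
    return None  # embedding/pooling requests end up here


assert seed_of(None) is None                   # no longer raises
assert seed_of(SamplingParams(seed=42)) == 42  # seeded request unaffected
```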