$ time curl http://{host_id}:8888/v1/docsum -H "Content-Type: multipart/form-data" -F "type=text" -F "messages=" -F "files=@./pubmed_100.txt" -F "max_tokens=1024" -F "language=en" -F "stream=true" &
[1] 1089380
$ time curl http://{host_id}:8888/v1/docsum -H "Content-Type: multipart/form-data" -F "type=text" -F "messages=" -F "files=@./pubmed_100.txt" -F "max_tokens=1024" -F "language=en" -F "stream=true" &
[2] 1089395
$ time curl http://{host_id}:8888/v1/docsum -H "Content-Type: multipart/form-data" -F "type=text" -F "messages=" -F "files=@./pubmed_100.txt" -F "max_tokens=1024" -F "language=en" -F "stream=true" &
[3] 1089397
$ data: This
curl: (18) transfer closed with outstanding read data remaining
real 2m0,611s
user 0m0,022s
sys 0m0,016s
curl: (18) transfer closed with outstanding read data remaining
real 2m0,837s
user 0m0,015s
sys 0m0,017s
curl: (18) transfer closed with outstanding read data remaining
real 2m0,343s
user 0m0,016s
sys 0m0,021s
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/home/user/.local/lib/python3.11/site-packages/starlette/responses.py", line 268, in __call__
await wrap(partial(self.listen_for_disconnect, receive))
File "/home/user/.local/lib/python3.11/site-packages/starlette/responses.py", line 264, in wrap
await func()
File "/home/user/.local/lib/python3.11/site-packages/starlette/responses.py", line 233, in listen_for_disconnect
message = await receive()
^^^^^^^^^^^^^^^
File "/home/user/.local/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 531, in receive
await self.message_event.wait()
File "/usr/local/lib/python3.11/asyncio/locks.py", line 213, in wait
await fut
asyncio.exceptions.CancelledError: Cancelled by cancel scope 780f1c98b110
During handling of the above exception, another exception occurred:
+ Exception Group Traceback (most recent call last):
| File "/home/user/.local/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 403, in run_asgi
| result = await app( # type: ignore[func-returns-value]
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/home/user/.local/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
| return await self.app(scope, receive, send)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/home/user/.local/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
| await super().__call__(scope, receive, send)
| File "/home/user/.local/lib/python3.11/site-packages/starlette/applications.py", line 112, in __call__
| await self.middleware_stack(scope, receive, send)
| File "/home/user/.local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 187, in __call__
| raise exc
| File "/home/user/.local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 165, in __call__
| await self.app(scope, receive, _send)
| File "/home/user/.local/lib/python3.11/site-packages/prometheus_fastapi_instrumentator/middleware.py", line 174, in __call__
| raise exc
| File "/home/user/.local/lib/python3.11/site-packages/prometheus_fastapi_instrumentator/middleware.py", line 172, in __call__
| await self.app(scope, receive, send_wrapper)
| File "/home/user/.local/lib/python3.11/site-packages/starlette/middleware/cors.py", line 85, in __call__
| await self.app(scope, receive, send)
| File "/home/user/.local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
| await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
| File "/home/user/.local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
| raise exc
| File "/home/user/.local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
| await app(scope, receive, sender)
| File "/home/user/.local/lib/python3.11/site-packages/starlette/routing.py", line 715, in __call__
| await self.middleware_stack(scope, receive, send)
| File "/home/user/.local/lib/python3.11/site-packages/starlette/routing.py", line 735, in app
| await route.handle(scope, receive, send)
| File "/home/user/.local/lib/python3.11/site-packages/starlette/routing.py", line 288, in handle
| await self.app(scope, receive, send)
| File "/home/user/.local/lib/python3.11/site-packages/starlette/routing.py", line 76, in app
| await wrap_app_handling_exceptions(app, request)(scope, receive, send)
| File "/home/user/.local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
| raise exc
| File "/home/user/.local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
| await app(scope, receive, sender)
| File "/home/user/.local/lib/python3.11/site-packages/starlette/routing.py", line 74, in app
| await response(scope, receive, send)
| File "/home/user/.local/lib/python3.11/site-packages/starlette/responses.py", line 261, in __call__
| async with anyio.create_task_group() as task_group:
| File "/home/user/.local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 767, in __aexit__
| raise BaseExceptionGroup(
| ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
+-+---------------- 1 ----------------
| Traceback (most recent call last):
| File "/home/user/.local/lib/python3.11/site-packages/aiohttp/streams.py", line 347, in _wait
| await waiter
| asyncio.exceptions.CancelledError
|
| The above exception was the direct cause of the following exception:
|
| Traceback (most recent call last):
| File "/home/user/.local/lib/python3.11/site-packages/starlette/responses.py", line 264, in wrap
| await func()
| File "/home/user/.local/lib/python3.11/site-packages/starlette/responses.py", line 245, in stream_response
| async for chunk in self.body_iterator:
| File "/home/user/comps/llms/src/doc-summarization/integrations/common.py", line 211, in stream_generator
| async for chunk in llm_chain.astream_log(docs):
| File "/home/user/.local/lib/python3.11/site-packages/langchain_core/runnables/base.py", line 1112, in astream_log
| async for item in _astream_log_implementation( # type: ignore
| File "/home/user/.local/lib/python3.11/site-packages/langchain_core/tracers/log_stream.py", line 675, in _astream_log_implementation
| await task
| File "/home/user/.local/lib/python3.11/site-packages/langchain_core/tracers/log_stream.py", line 629, in consume_astream
| async for chunk in runnable.astream(input, config, **kwargs):
| File "/home/user/.local/lib/python3.11/site-packages/langchain_core/runnables/base.py", line 1016, in astream
| yield await self.ainvoke(input, config, **kwargs)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/home/user/.local/lib/python3.11/site-packages/langchain/chains/base.py", line 221, in ainvoke
| raise e
| File "/home/user/.local/lib/python3.11/site-packages/langchain/chains/base.py", line 212, in ainvoke
| await self._acall(inputs, run_manager=run_manager)
| File "/home/user/.local/lib/python3.11/site-packages/langchain/chains/combine_documents/base.py", line 154, in _acall
| output, extra_return_dict = await self.acombine_docs(
| ^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/home/user/.local/lib/python3.11/site-packages/langchain/chains/combine_documents/stuff.py", line 277, in acombine_docs
| return await self.llm_chain.apredict(callbacks=callbacks, **inputs), {}
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/home/user/.local/lib/python3.11/site-packages/langchain/chains/llm.py", line 335, in apredict
| return (await self.acall(kwargs, callbacks=callbacks))[self.output_key]
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/home/user/.local/lib/python3.11/site-packages/langchain_core/_api/deprecation.py", line 191, in awarning_emitting_wrapper
| return await wrapped(*args, **kwargs)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/home/user/.local/lib/python3.11/site-packages/langchain/chains/base.py", line 439, in acall
| return await self.ainvoke(
| ^^^^^^^^^^^^^^^^^^^
| File "/home/user/.local/lib/python3.11/site-packages/langchain/chains/base.py", line 221, in ainvoke
| raise e
| File "/home/user/.local/lib/python3.11/site-packages/langchain/chains/base.py", line 212, in ainvoke
| await self._acall(inputs, run_manager=run_manager)
| File "/home/user/.local/lib/python3.11/site-packages/langchain/chains/llm.py", line 300, in _acall
| response = await self.agenerate([inputs], run_manager=run_manager)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/home/user/.local/lib/python3.11/site-packages/langchain/chains/llm.py", line 165, in agenerate
| return await self.llm.agenerate_prompt(
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/home/user/.local/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 769, in agenerate_prompt
| return await self.agenerate(
| ^^^^^^^^^^^^^^^^^^^^^
| File "/home/user/.local/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 1210, in agenerate
| output = await self._agenerate_helper(
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/home/user/.local/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 1042, in _agenerate_helper
| raise e
| File "/home/user/.local/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 1026, in _agenerate_helper
| await self._agenerate(
| File "/home/user/.local/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 1541, in _agenerate
| await self._acall(prompt, stop=stop, run_manager=run_manager, **kwargs)
| File "/home/user/.local/lib/python3.11/site-packages/langchain_community/llms/huggingface_endpoint.py", line 294, in _acall
| async for chunk in self._astream(
| File "/home/user/.local/lib/python3.11/site-packages/langchain_community/llms/huggingface_endpoint.py", line 363, in _astream
| async for response in await self.async_client.text_generation(
| File "/home/user/.local/lib/python3.11/site-packages/huggingface_hub/inference/_common.py", line 322, in _async_stream_text_generation_response
| async for byte_payload in bytes_output_as_lines:
| File "/home/user/.local/lib/python3.11/site-packages/huggingface_hub/inference/_common.py", line 401, in _async_yield_from
| async for byte_payload in response.content:
| File "/home/user/.local/lib/python3.11/site-packages/aiohttp/streams.py", line 52, in __anext__
| rv = await self.read_func()
| ^^^^^^^^^^^^^^^^^^^^^^
| File "/home/user/.local/lib/python3.11/site-packages/aiohttp/streams.py", line 352, in readline
| return await self.readuntil()
| ^^^^^^^^^^^^^^^^^^^^^^
| File "/home/user/.local/lib/python3.11/site-packages/aiohttp/streams.py", line 386, in readuntil
| await self._wait("readuntil")
| File "/home/user/.local/lib/python3.11/site-packages/aiohttp/streams.py", line 346, in _wait
| with self._timer:
| File "/home/user/.local/lib/python3.11/site-packages/aiohttp/helpers.py", line 671, in __exit__
| raise asyncio.TimeoutError from exc_val
| TimeoutError
+------------------------------------
Priority
P2-High
OS type
Ubuntu
Hardware type
Xeon-SPR
Installation method
Deploy method
Running nodes
Single Node
What's the version?
Docsum images: tag 1.2
Commit that caused the issue:
opea-project/GenAIComps@5aba3b2
Description
When a DocSum request runs longer than 120 s, it hits a timeout and the task is aborted. That timeout is far too short for a document summarization service and makes real-world use of the service unusable: requests take a long time when inference runs on CPU, or when many users hit the service concurrently. Version 1.1 handled longer inputs and ran well beyond 120 s, so this is a regression.
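The aiohttp `TimeoutError` at the bottom of the raw log fires on the client side of the LLM stream, after almost exactly 120 s of wall clock in each curl timing. That appears to match the 120 s default `timeout` that `langchain_community`'s `HuggingFaceEndpoint` hands to its inference client — an assumption worth verifying against the commit above. The failure mode can be sketched with plain asyncio, no OPEA code involved: a token stream that outlives its read budget is cancelled mid-stream.

```python
import asyncio

async def slow_token_stream(n_tokens: int, delay: float):
    # Stand-in for the LLM token stream: each chunk takes `delay` seconds.
    for i in range(n_tokens):
        await asyncio.sleep(delay)
        yield f"token-{i}"

async def read_stream(n_tokens: int, delay: float) -> list:
    return [chunk async for chunk in slow_token_stream(n_tokens, delay)]

def summarize(n_tokens: int, delay: float, timeout: float):
    # The whole stream must finish within `timeout`, otherwise the read
    # task is cancelled and a TimeoutError surfaces -- the same shape as
    # the aiohttp read timer firing in the traceback above.
    try:
        return asyncio.run(asyncio.wait_for(read_stream(n_tokens, delay), timeout))
    except asyncio.TimeoutError:
        return "timed out"

print(len(summarize(10, 0.02, timeout=1.0)))  # fast generation fits the budget: 10
print(summarize(10, 0.2, timeout=0.5))        # slow generation is cut off: timed out
```

Note the asymmetry with the real service: the backend keeps generating; only the reader gives up, which is why curl reports "transfer closed with outstanding read data remaining".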
Reproduce steps
Start a couple of concurrent curl requests against the DocSum service, as in the transcript above, so that completing a request takes more than 2 minutes. Each request is then cut off at the 2-minute mark with `curl: (18) transfer closed with outstanding read data remaining`.
Raw log
Attachments
No response