[Bug] DocSum times out/fails to execute a long task [Regression] #1481

@vrantala

Priority

P2-High

OS type

Ubuntu

Hardware type

Xeon-SPR

Installation method

  • Pull docker images from hub.docker.com
  • Build docker images from source
  • Other

Deploy method

  • Docker
  • Docker Compose
  • Kubernetes Helm Charts
  • Kubernetes GMC
  • Other

Running nodes

Single Node

What's the version?

Docsum images:
tag:1.2

Commit that caused the issue:
opea-project/GenAIComps@5aba3b2

Description

When a task on DocSum runs longer than 120 s, the request hits a timeout and the task is stopped. This timeout is far too small for a document summarization service and makes the service unusable in practice. Requests take a long time when they run on CPU or when many users access the service concurrently. In 1.1, longer inputs worked, including requests that ran beyond 120 s, so this is a regression.

Reproduce steps

Send a few concurrent curl requests to the DocSum service so that completing a request takes more than 2 minutes.

$ time curl http://{host_id}:8888/v1/docsum    -H "Content-Type: multipart/form-data"    -F "type=text"    -F "messages="    -F "files=@./pubmed_100.txt"    -F "max_tokens=1024"    -F "language=en"    -F "stream=true" &
[1] 1089380
$ time curl http://{host_id}:8888/v1/docsum    -H "Content-Type: multipart/form-data"    -F "type=text"    -F "messages="    -F "files=@./pubmed_100.txt"    -F "max_tokens=1024"    -F "language=en"    -F "stream=true" &
[2] 1089395
$ time curl http://{host_id}:8888/v1/docsum    -H "Content-Type: multipart/form-data"    -F "type=text"    -F "messages="    -F "files=@./pubmed_100.txt"    -F "max_tokens=1024"    -F "language=en"    -F "stream=true" &
[3] 1089397
$ data:  This

curl: (18) transfer closed with outstanding read data remaining

real	2m0,611s
user	0m0,022s
sys	0m0,016s
curl: (18) transfer closed with outstanding read data remaining

real	2m0,837s
user	0m0,015s
sys	0m0,017s
curl: (18) transfer closed with outstanding read data remaining

real	2m0,343s
user	0m0,016s
sys	0m0,021s
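
All three requests above ended within a second of the 2-minute mark, which points to a fixed deadline rather than to three unrelated failures. As a rough check, the `real` lines can be converted to seconds with a small Python sketch. The helper name `parse_real_time` is hypothetical, and the parser assumes the `NmS,FFFs` layout shown in the output above (it accepts both a comma and a dot as the decimal separator).

```python
import re

# Hypothetical helper: convert a `time` line such as "real 2m0,611s" to seconds.
# Accepts "," or "." as the decimal separator, since it depends on the locale.
def parse_real_time(line):
    match = re.fullmatch(r"real\s+(\d+)m(\d+)[.,](\d+)s", line.strip())
    if match is None:
        raise ValueError(f"unrecognized time line: {line!r}")
    minutes, seconds, fraction = match.groups()
    return int(minutes) * 60 + float(f"{seconds}.{fraction}")

# The three `real` lines copied from the terminal output in this report.
reported = ["real\t2m0,611s", "real\t2m0,837s", "real\t2m0,343s"]
elapsed = [parse_real_time(line) for line in reported]

# True when every request was cut off between 120 s and 121 s.
all_near_120s = all(120.0 <= seconds < 121.0 for seconds in elapsed)

if __name__ == "__main__":
    print(elapsed, all_near_120s)
```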

Raw log

ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/home/user/.local/lib/python3.11/site-packages/starlette/responses.py", line 268, in __call__
    await wrap(partial(self.listen_for_disconnect, receive))
  File "/home/user/.local/lib/python3.11/site-packages/starlette/responses.py", line 264, in wrap
    await func()
  File "/home/user/.local/lib/python3.11/site-packages/starlette/responses.py", line 233, in listen_for_disconnect
    message = await receive()
              ^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 531, in receive
    await self.message_event.wait()
  File "/usr/local/lib/python3.11/asyncio/locks.py", line 213, in wait
    await fut
asyncio.exceptions.CancelledError: Cancelled by cancel scope 780f1c98b110

During handling of the above exception, another exception occurred:

  + Exception Group Traceback (most recent call last):
  |   File "/home/user/.local/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 403, in run_asgi
  |     result = await app(  # type: ignore[func-returns-value]
  |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File "/home/user/.local/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
  |     return await self.app(scope, receive, send)
  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File "/home/user/.local/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
  |     await super().__call__(scope, receive, send)
  |   File "/home/user/.local/lib/python3.11/site-packages/starlette/applications.py", line 112, in __call__
  |     await self.middleware_stack(scope, receive, send)
  |   File "/home/user/.local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 187, in __call__
  |     raise exc
  |   File "/home/user/.local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 165, in __call__
  |     await self.app(scope, receive, _send)
  |   File "/home/user/.local/lib/python3.11/site-packages/prometheus_fastapi_instrumentator/middleware.py", line 174, in __call__
  |     raise exc
  |   File "/home/user/.local/lib/python3.11/site-packages/prometheus_fastapi_instrumentator/middleware.py", line 172, in __call__
  |     await self.app(scope, receive, send_wrapper)
  |   File "/home/user/.local/lib/python3.11/site-packages/starlette/middleware/cors.py", line 85, in __call__
  |     await self.app(scope, receive, send)
  |   File "/home/user/.local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
  |     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  |   File "/home/user/.local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
  |     raise exc
  |   File "/home/user/.local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
  |     await app(scope, receive, sender)
  |   File "/home/user/.local/lib/python3.11/site-packages/starlette/routing.py", line 715, in __call__
  |     await self.middleware_stack(scope, receive, send)
  |   File "/home/user/.local/lib/python3.11/site-packages/starlette/routing.py", line 735, in app
  |     await route.handle(scope, receive, send)
  |   File "/home/user/.local/lib/python3.11/site-packages/starlette/routing.py", line 288, in handle
  |     await self.app(scope, receive, send)
  |   File "/home/user/.local/lib/python3.11/site-packages/starlette/routing.py", line 76, in app
  |     await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  |   File "/home/user/.local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
  |     raise exc
  |   File "/home/user/.local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
  |     await app(scope, receive, sender)
  |   File "/home/user/.local/lib/python3.11/site-packages/starlette/routing.py", line 74, in app
  |     await response(scope, receive, send)
  |   File "/home/user/.local/lib/python3.11/site-packages/starlette/responses.py", line 261, in __call__
  |     async with anyio.create_task_group() as task_group:
  |   File "/home/user/.local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 767, in __aexit__
  |     raise BaseExceptionGroup(
  | ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
  +-+---------------- 1 ----------------
    | Traceback (most recent call last):
    |   File "/home/user/.local/lib/python3.11/site-packages/aiohttp/streams.py", line 347, in _wait
    |     await waiter
    | asyncio.exceptions.CancelledError
    | 
    | The above exception was the direct cause of the following exception:
    | 
    | Traceback (most recent call last):
    |   File "/home/user/.local/lib/python3.11/site-packages/starlette/responses.py", line 264, in wrap
    |     await func()
    |   File "/home/user/.local/lib/python3.11/site-packages/starlette/responses.py", line 245, in stream_response
    |     async for chunk in self.body_iterator:
    |   File "/home/user/comps/llms/src/doc-summarization/integrations/common.py", line 211, in stream_generator
    |     async for chunk in llm_chain.astream_log(docs):
    |   File "/home/user/.local/lib/python3.11/site-packages/langchain_core/runnables/base.py", line 1112, in astream_log
    |     async for item in _astream_log_implementation(  # type: ignore
    |   File "/home/user/.local/lib/python3.11/site-packages/langchain_core/tracers/log_stream.py", line 675, in _astream_log_implementation
    |     await task
    |   File "/home/user/.local/lib/python3.11/site-packages/langchain_core/tracers/log_stream.py", line 629, in consume_astream
    |     async for chunk in runnable.astream(input, config, **kwargs):
    |   File "/home/user/.local/lib/python3.11/site-packages/langchain_core/runnables/base.py", line 1016, in astream
    |     yield await self.ainvoke(input, config, **kwargs)
    |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/home/user/.local/lib/python3.11/site-packages/langchain/chains/base.py", line 221, in ainvoke
    |     raise e
    |   File "/home/user/.local/lib/python3.11/site-packages/langchain/chains/base.py", line 212, in ainvoke
    |     await self._acall(inputs, run_manager=run_manager)
    |   File "/home/user/.local/lib/python3.11/site-packages/langchain/chains/combine_documents/base.py", line 154, in _acall
    |     output, extra_return_dict = await self.acombine_docs(
    |                                 ^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/home/user/.local/lib/python3.11/site-packages/langchain/chains/combine_documents/stuff.py", line 277, in acombine_docs
    |     return await self.llm_chain.apredict(callbacks=callbacks, **inputs), {}
    |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/home/user/.local/lib/python3.11/site-packages/langchain/chains/llm.py", line 335, in apredict
    |     return (await self.acall(kwargs, callbacks=callbacks))[self.output_key]
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/home/user/.local/lib/python3.11/site-packages/langchain_core/_api/deprecation.py", line 191, in awarning_emitting_wrapper
    |     return await wrapped(*args, **kwargs)
    |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/home/user/.local/lib/python3.11/site-packages/langchain/chains/base.py", line 439, in acall
    |     return await self.ainvoke(
    |            ^^^^^^^^^^^^^^^^^^^
    |   File "/home/user/.local/lib/python3.11/site-packages/langchain/chains/base.py", line 221, in ainvoke
    |     raise e
    |   File "/home/user/.local/lib/python3.11/site-packages/langchain/chains/base.py", line 212, in ainvoke
    |     await self._acall(inputs, run_manager=run_manager)
    |   File "/home/user/.local/lib/python3.11/site-packages/langchain/chains/llm.py", line 300, in _acall
    |     response = await self.agenerate([inputs], run_manager=run_manager)
    |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/home/user/.local/lib/python3.11/site-packages/langchain/chains/llm.py", line 165, in agenerate
    |     return await self.llm.agenerate_prompt(
    |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/home/user/.local/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 769, in agenerate_prompt
    |     return await self.agenerate(
    |            ^^^^^^^^^^^^^^^^^^^^^
    |   File "/home/user/.local/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 1210, in agenerate
    |     output = await self._agenerate_helper(
    |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/home/user/.local/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 1042, in _agenerate_helper
    |     raise e
    |   File "/home/user/.local/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 1026, in _agenerate_helper
    |     await self._agenerate(
    |   File "/home/user/.local/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 1541, in _agenerate
    |     await self._acall(prompt, stop=stop, run_manager=run_manager, **kwargs)
    |   File "/home/user/.local/lib/python3.11/site-packages/langchain_community/llms/huggingface_endpoint.py", line 294, in _acall
    |     async for chunk in self._astream(
    |   File "/home/user/.local/lib/python3.11/site-packages/langchain_community/llms/huggingface_endpoint.py", line 363, in _astream
    |     async for response in await self.async_client.text_generation(
    |   File "/home/user/.local/lib/python3.11/site-packages/huggingface_hub/inference/_common.py", line 322, in _async_stream_text_generation_response
    |     async for byte_payload in bytes_output_as_lines:
    |   File "/home/user/.local/lib/python3.11/site-packages/huggingface_hub/inference/_common.py", line 401, in _async_yield_from
    |     async for byte_payload in response.content:
    |   File "/home/user/.local/lib/python3.11/site-packages/aiohttp/streams.py", line 52, in __anext__
    |     rv = await self.read_func()
    |          ^^^^^^^^^^^^^^^^^^^^^^
    |   File "/home/user/.local/lib/python3.11/site-packages/aiohttp/streams.py", line 352, in readline
    |     return await self.readuntil()
    |            ^^^^^^^^^^^^^^^^^^^^^^
    |   File "/home/user/.local/lib/python3.11/site-packages/aiohttp/streams.py", line 386, in readuntil
    |     await self._wait("readuntil")
    |   File "/home/user/.local/lib/python3.11/site-packages/aiohttp/streams.py", line 346, in _wait
    |     with self._timer:
    |   File "/home/user/.local/lib/python3.11/site-packages/aiohttp/helpers.py", line 671, in __exit__
    |     raise asyncio.TimeoutError from exc_val
    | TimeoutError
    +------------------------------------
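
The traceback ends in aiohttp's timer context raising TimeoutError while the response stream was still being read. One plausible reading, not confirmed in this report, is that a fixed deadline of about 120 s covers the whole streamed response, so a stream that is still delivering chunks gets cut off. The sketch below reproduces that pattern with the standard library only. It is not the DocSum or aiohttp code: `slow_token_stream` and `summarize_with_deadline` are hypothetical names, and the durations are scaled down from 120 s to fractions of a second.

```python
import asyncio

async def slow_token_stream(n_chunks, delay):
    """Hypothetical stand-in for an LLM backend that streams tokens slowly."""
    for i in range(n_chunks):
        await asyncio.sleep(delay)
        yield f"token-{i}"

async def consume(stream, received):
    # Collect chunks as they arrive so partial output survives a timeout.
    async for chunk in stream:
        received.append(chunk)

async def summarize_with_deadline(n_chunks, delay, total_timeout):
    """Read the whole stream under a single overall deadline."""
    received = []
    try:
        await asyncio.wait_for(
            consume(slow_token_stream(n_chunks, delay), received),
            timeout=total_timeout,
        )
        return received, "completed"
    except asyncio.TimeoutError:
        # The deadline fired even though the stream was still producing data.
        return received, "timed out"

if __name__ == "__main__":
    # Short job: finishes well inside the deadline.
    print(asyncio.run(summarize_with_deadline(3, 0.01, 1.0)))
    # Long job: still streaming when the deadline expires.
    print(asyncio.run(summarize_with_deadline(100, 0.02, 0.2)))
```

In this sketch the long job returns a partial result with the status "timed out", which resembles the truncated `data:  This` output and the curl error 18 seen by the client. If the deadline in the real service works this way, making it configurable or scoping it to idle time between chunks would avoid the cutoff; that is an assumption to verify against the GenAIComps code.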

Attachments

No response

Metadata

Labels

bug: Something isn't working
