Commit db74ce2

📝 document logic for suppressing abort timeouts
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
1 parent 2f7d8a6 commit db74ce2

1 file changed

Lines changed: 10 additions & 2 deletions

vllm/entrypoints/openai/rpc/client.py

@@ -336,8 +336,16 @@ async def _is_tracing_enabled_rpc(self) -> bool:
     async def abort(self, request_id: str):
         """Send an ABORT_REQUEST signal to the RPC Server"""

-        # Suppress timeouts as well- if the server is busy and does not ack in
-        # time we assume it got the message.
+        # Suppress timeouts as well.
+        # In cases where the server is busy processing requests and a very
+        # large volume of abort requests arrive, it is likely that the server
+        # will not be able to ack all of them in time. We have seen this when
+        # we abort 20k requests at once while another 2k are processing- many
+        # of them time out, but we see the server successfully abort all of the
+        # requests.
+        # In this case we assume that the server has received or will receive
+        # these abort requests, and ignore the timeout. This prevents a massive
+        # wall of `TimeoutError` stack traces.
         with suppress(RPCClientClosedError, TimeoutError):
             await self._send_one_way_rpc_request(
                 request=RPCAbortRequest(request_id),
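
The behaviour the new comment describes can be illustrated with a small standalone sketch. It is not the vLLM client: `send_one_way`, `abort_many`, the delays, and the local `RPCClientClosedError` class are hypothetical stand-ins, and the ack is simulated with `asyncio.sleep`. It only shows how `contextlib.suppress` keeps late acks from surfacing as `TimeoutError` when many aborts are issued at once.

```python
import asyncio
from contextlib import suppress


class RPCClientClosedError(Exception):
    """Hypothetical stand-in for the client's 'already closed' error."""


async def send_one_way(request_id: str, ack_delay: float, timeout: float) -> None:
    """Hypothetical one-way RPC: wait for a simulated ack, or time out."""
    # asyncio.wait_for raises a timeout error if the ack takes too long.
    await asyncio.wait_for(asyncio.sleep(ack_delay), timeout=timeout)


async def abort(request_id: str, ack_delay: float, timeout: float = 0.05) -> None:
    # A late ack is assumed to mean the server has received, or will
    # receive, the abort. asyncio.TimeoutError is listed as well because it
    # is a distinct class from TimeoutError before Python 3.11.
    with suppress(RPCClientClosedError, TimeoutError, asyncio.TimeoutError):
        await send_one_way(request_id, ack_delay, timeout)


async def abort_many(n: int) -> int:
    """Issue n aborts concurrently; return how many raised an exception."""
    # Every other simulated ack is slower than the timeout.
    results = await asyncio.gather(
        *(abort(f"req-{i}", ack_delay=0.2 if i % 2 else 0.0) for i in range(n)),
        return_exceptions=True,
    )
    return sum(isinstance(r, BaseException) for r in results)


if __name__ == "__main__":
    print(asyncio.run(abort_many(10)))
```

The trade-off is the one the comment states: a timed-out abort is indistinguishable from a lost one on the client side, so this pattern only makes sense when the server is expected to process the request eventually.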
