Commit f647afb

gshtras authored and shreyankg committed
[ROCm] Avoid using the default stream on ROCm (vllm-project#13238)
Signed-off-by: Gregory Shtrasberg <[email protected]>
1 parent 40ce1ee commit f647afb

1 file changed, +6 -1 lines changed

vllm/utils.py

Lines changed: 6 additions & 1 deletion
@@ -942,11 +942,16 @@ def current_stream() -> torch.cuda.Stream:
     the underlying hypothesis is that we do not call `torch._C._cuda_setStream`
     from C/C++ code.
     """
+    from vllm.platforms import current_platform
     global _current_stream
     if _current_stream is None:
         # when this function is called before any stream is set,
         # we return the default stream.
-        _current_stream = torch.cuda.current_stream()
+        # On ROCm using the default 0 stream in combination with RCCL
+        # is hurting performance. Therefore creating a dedicated stream
+        # per process
+        _current_stream = torch.cuda.Stream() if current_platform.is_rocm(
+        ) else torch.cuda.current_stream()
     return _current_stream
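
The lazy selection in the hunk above can be sketched without a GPU. This is only an illustration of the caching pattern, not the vLLM code: `FakeStream` and `StreamCache` are hypothetical stand-ins for `torch.cuda.Stream` and the module-level `_current_stream` global, and the platform check is passed in as a plain boolean instead of calling `current_platform.is_rocm()`.

```python
from typing import Callable, Optional


class FakeStream:
    """Hypothetical stand-in for torch.cuda.Stream; it only carries a label."""

    def __init__(self, label: str) -> None:
        self.label = label


class StreamCache:
    """Picks a stream on first use, then returns the same object afterwards."""

    def __init__(self, is_rocm: bool,
                 new_stream: Callable[[], FakeStream],
                 default_stream: Callable[[], FakeStream]) -> None:
        self._is_rocm = is_rocm
        self._new_stream = new_stream
        self._default_stream = default_stream
        self._current: Optional[FakeStream] = None

    def current_stream(self) -> FakeStream:
        if self._current is None:
            # Same branch as the diff: a dedicated stream on ROCm,
            # the default stream on every other platform.
            self._current = (self._new_stream()
                             if self._is_rocm else self._default_stream())
        return self._current


# Assumption: one shared object plays the role of the default stream.
DEFAULT = FakeStream("default")

rocm = StreamCache(True, lambda: FakeStream("dedicated"), lambda: DEFAULT)
cuda = StreamCache(False, lambda: FakeStream("dedicated"), lambda: DEFAULT)

print(rocm.current_stream().label)                      # dedicated
print(cuda.current_stream().label)                      # default
print(rocm.current_stream() is rocm.current_stream())   # True
```

Because the choice is made only while the cache is empty, the ROCm branch creates one stream per process and later calls reuse it, which matches the "dedicated stream per process" comment in the commit.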