
Commit c85b51b

Superjomn authored and dominicshanshan committed
[https://nvbugs/5427043][fix] cherrypick: request length exceeds max_num_tokens (NVIDIA#7718)
Signed-off-by: Superjomn <[email protected]>
Signed-off-by: Wangshanshan <[email protected]>
1 parent 0b6ea14 commit c85b51b

File tree

1 file changed: +14 −0 lines changed


tests/unittest/llmapi/test_llm_pytorch.py

Lines changed: 14 additions & 0 deletions
@@ -892,3 +892,17 @@ def test_min_tokens(use_speculative: bool):
 
     assert len(res.outputs) == 1
     assert len(res.outputs[0].token_ids) == output_len
+
+
+class TestLlmError:
+
+    def test_max_num_token_check(self):
+        """ LLM should raise an error when the prompt length exceeds the valid range. """
+        llm = LLM(llama_model_path,
+                  kv_cache_config=global_kvcache_config,
+                  max_num_tokens=100)
+
+        with pytest.raises(ValueError,
+                           match="should not exceed max_num_tokens"):
+            ids = [random.randint(10, 100) for _ in range(101)]
+            llm.generate([ids])
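
For context on what the new test exercises, below is a minimal sketch of the kind of request-length guard whose error message the test matches against. It is an assumption for illustration only: the helper name check_prompt_length and the exact message wording are hypothetical, not the actual TensorRT-LLM implementation; only the condition (prompt tokens exceeding max_num_tokens raising a ValueError containing "should not exceed max_num_tokens") comes from the test itself.

# Hypothetical sketch of the length guard the new test exercises. The real check
# lives inside TensorRT-LLM's request handling and may differ in detail.
import random


def check_prompt_length(prompt_token_ids, max_num_tokens):
    """Reject prompts whose token count exceeds the engine's max_num_tokens budget."""
    if len(prompt_token_ids) > max_num_tokens:
        raise ValueError(
            f"Prompt length ({len(prompt_token_ids)}) should not exceed "
            f"max_num_tokens ({max_num_tokens}).")


if __name__ == "__main__":
    # Mirrors the test: 101 random token ids against a budget of 100 tokens.
    ids = [random.randint(10, 100) for _ in range(101)]
    try:
        check_prompt_length(ids, max_num_tokens=100)
    except ValueError as err:
        print(err)  # e.g. "Prompt length (101) should not exceed max_num_tokens (100)."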
