
Commit 8437354

Superjomn authored and dominicshanshan committed
[https://nvbugs/5427043][fix] cherrypick: request length exceeds max_num_tokens (NVIDIA#7718)
Signed-off-by: Superjomn <[email protected]>
Signed-off-by: Wangshanshan <[email protected]>
1 parent ea419b8 commit 8437354

File tree

1 file changed: +12 -0 lines changed


tests/unittest/llmapi/test_llm_pytorch.py

Lines changed: 12 additions & 0 deletions
@@ -925,3 +925,15 @@ def test_llm_return_logprobs_streaming(prompt_logprobs, logprobs,
                                        return_generation_logits,
                                        streaming=True,
                                        backend="pytorch")
+class TestLlmError:
+
+    def test_max_num_token_check(self):
+        """ LLM should raise an error when the prompt length exceeds the valid range. """
+        llm = LLM(llama_model_path,
+                  kv_cache_config=global_kvcache_config,
+                  max_num_tokens=100)
+
+        with pytest.raises(ValueError,
+                           match="should not exceed max_num_tokens"):
+            ids = [random.randint(10, 100) for _ in range(101)]
+            llm.generate([ids])
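
The new test exercises the validation referenced in the commit title: a request whose tokenized prompt is longer than max_num_tokens should be rejected with a ValueError up front rather than failing later inside the engine. The sketch below illustrates that kind of guard; check_prompt_length is a hypothetical helper written for illustration, not the actual TensorRT-LLM code path.

# Minimal sketch of a prompt-length guard, assuming a hypothetical helper
# name (check_prompt_length); the real check lives inside the LLM API.
def check_prompt_length(prompt_token_ids, max_num_tokens):
    if max_num_tokens is not None and len(prompt_token_ids) > max_num_tokens:
        raise ValueError(
            f"The prompt length ({len(prompt_token_ids)}) should not exceed "
            f"max_num_tokens ({max_num_tokens}).")

# With max_num_tokens=100, a 101-token prompt triggers the error, matching the
# pytest.raises(ValueError, match="should not exceed max_num_tokens") assertion
# in the test above.
check_prompt_length(list(range(101)), max_num_tokens=100)  # raises ValueError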
