[Bug] LLMPipeline.generate doesn't properly sample #3584

@droans

Bug:

Running the same prompt through LLMPipeline.generate produces identical results across separate runs. I'm uncertain of the cause, but I believe rng_seed is being ignored. Because answers are often exactly the same within a run as well, it could also be that the sampler is choosing the most likely option.

Adjusting top_p, top_k, and temperature does still change the results. So does changing max_new_tokens, as long as the change affects the output (e.g., going from 10 to 500 or from 500 to 20 has an effect, but going from 500 to 1000 doesn't).
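To make the expectation concrete, here is a minimal pure-Python sketch of seeded sampling with temperature, top_k, and top_p. It is not OpenVINO GenAI's implementation; the helper names `sample_token` and `sample_sequence` and the toy logits are assumptions made up for illustration. It shows the behavior I'd expect: a fixed seed reproduces a sequence, and changing the seed should change it.

```python
import math
import random


def sample_token(logits, rng, temperature=1.0, top_k=0, top_p=1.0):
    """Draw one token index after temperature, top-k, and top-p filtering."""
    # Temperature-scaled softmax (subtract the max for numerical stability).
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    ranked = sorted(
        ((index, e / total) for index, e in enumerate(exps)),
        key=lambda pair: pair[1],
        reverse=True,
    )
    # top-k: keep only the k most likely tokens (0 disables the filter).
    if top_k > 0:
        ranked = ranked[:top_k]
    # top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for index, prob in ranked:
        kept.append((index, prob))
        cumulative += prob
        if cumulative >= top_p:
            break
    indices = [index for index, _ in kept]
    weights = [prob for _, prob in kept]
    # random.Random.choices renormalizes the remaining weights.
    return rng.choices(indices, weights=weights, k=1)[0]


def sample_sequence(logits, seed, n=20, **sampling):
    """Sample n tokens from one RNG that is seeded once, like one script run."""
    rng = random.Random(seed)
    return [sample_token(logits, rng, **sampling) for _ in range(n)]


# Toy logits standing in for a model's next-token scores (hypothetical values).
logits = [2.0, 1.5, 1.0, 0.5, 0.0, -0.5, -1.0, -1.5]
params = dict(temperature=2.0, top_k=25, top_p=0.95)

run_a = sample_sequence(logits, seed=1658945, **params)
run_b = sample_sequence(logits, seed=1658945, **params)
run_c = sample_sequence(logits, seed=494189156, **params)

print(run_a == run_b)  # True: same seed, same sequence
print(run_a == run_c)  # expected to differ when the seed is honored
```

The results reported in this issue match what this sketch would produce if the RNG were seeded with one fixed value no matter what rng_seed is set to: draws vary within a run, but every run replays the same sequence.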

The issue occurs on both my GPU and my CPU.

Expected Results:

I would expect the same prompt to produce different results each time the script is run. Repeated calls on the same pipe do produce different outputs within a run, but the whole sequence of outputs is identical when the script is run again.

I have an example script I've been using below. The model used is Echo9Zulu/Qwen3-4B-Instruct-2507-int4_asym-awq-ov; however, every model I've tested has the same issue. The only difference between the two runs below is the seed. Even so, you'll notice both produced the exact same ten responses.

Example script:

import openvino_genai as ov
pipe = ov.LLMPipeline("qwen3-instruct", "GPU")
config = pipe.get_generation_config()
config.top_p = 0.95
config.top_k = 25
config.temperature = 2.0
config.rng_seed = 1658945 # Ignored silently
config.max_new_tokens = 100
pipe.set_generation_config(config)
prompt = "Generate a random number between 100,000 and 100,000,000,000."
for i in range(0,10):
    print(pipe.generate(prompt))

7829346512 🎲
*(This is a random number between 100,000 and 100,000,000,000.)*8764329015
This is a randomly generated number between 100,000 and 100,000,000,000.
Sure! Here's a randomly generated number between 100,000 and 100,000,000,000:
**5,328,647,912**
Let me know if you'd like a different range or have another request! 😊 numbers!
Sure! Here's a randomly generated number between 100,000 and 100,000,000,000:
**4,829,156,378**
Let me know if you'd like a different range or more digits! (Note: This is randomly selected for your requestnumbers up to 100 billion with 11 or 12 digits.)
Sure! Here's a randomly generated number between 100,000 and 100,000,000,000:
**7,482,316,953**
Let me know if youd like a different range or format! 🎯
9876543210
This is a random number between 100,000 and 100,000,000,000.
9742863452
Sure! Here's a random number between 100,000 and 100,000,000,000:
**4,723,851,960**
Let me know if you'd like another or a different range! 🍀
Sure! Here's a randomly generated number between 100,000 and 100,000,000,000:
**4,783,261,590**
Let me know if you want it within a specific range or format! 🎱
97,643,218,590

Same script with a different seed:

import openvino_genai as ov
pipe = ov.LLMPipeline("qwen3-instruct", "GPU")
config = pipe.get_generation_config()
config.top_p = 0.95
config.top_k = 25
config.temperature = 2.0
config.rng_seed = 494189156 # Ignored silently
config.max_new_tokens = 100
pipe.set_generation_config(config)
prompt = "Generate a random number between 100,000 and 100,000,000,000."
for i in range(0,10):
    print(pipe.generate(prompt))

7829346512 🎲
*(This is a random number between 100,000 and 100,000,000,000.)*8764329015
This is a randomly generated number between 100,000 and 100,000,000,000.
Sure! Here's a randomly generated number between 100,000 and 100,000,000,000:
**5,328,647,912**
Let me know if you'd like a different range or have another request! 😊 numbers!
Sure! Here's a randomly generated number between 100,000 and 100,000,000,000:
**4,829,156,378**
Let me know if you'd like a different range or more digits! (Note: This is randomly selected for your requestnumbers up to 100 billion with 11 or 12 digits.)
Sure! Here's a randomly generated number between 100,000 and 100,000,000,000:
**7,482,316,953**
Let me know if youd like a different range or format! 🎯
9876543210
This is a random number between 100,000 and 100,000,000,000.
9742863452
Sure! Here's a random number between 100,000 and 100,000,000,000:
**4,723,851,960**
Let me know if you'd like another or a different range! 🍀
Sure! Here's a randomly generated number between 100,000 and 100,000,000,000:
**4,783,261,590**
Let me know if you want it within a specific range or format! 🎱
97,643,218,590
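
Comparing two long listings by eye is error-prone, so here is a small sketch for checking whether two runs really produced the same responses. The helper name `run_fingerprint` is hypothetical, and the sample lists below are short stand-ins; in practice each list would hold the ten strings returned by pipe.generate in one run.

```python
import hashlib


def run_fingerprint(outputs):
    """Return a SHA-256 digest over an ordered list of generated strings."""
    digest = hashlib.sha256()
    for text in outputs:
        digest.update(str(text).encode("utf-8"))
        # Separator so ["ab", ""] and ["a", "b"] hash differently.
        digest.update(b"\x00")
    return digest.hexdigest()


# Stand-in data: two runs with identical responses, one with a different response.
run_seed_1 = ["7829346512", "9876543210"]
run_seed_2 = ["7829346512", "9876543210"]
run_other = ["7829346512", "9742863452"]

print(run_fingerprint(run_seed_1) == run_fingerprint(run_seed_2))  # True
print(run_fingerprint(run_seed_1) == run_fingerprint(run_other))   # False
```

If rng_seed were honored, the fingerprints of runs with different seeds should almost never match.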

However, if I run it with optimum.intel, I don't have the same issue:

from optimum.intel.openvino import OVModelForCausalLM
from transformers import AutoProcessor
model_id = "qwen3-instruct"
model = OVModelForCausalLM.from_pretrained(model_id, device="GPU")
processor = AutoProcessor.from_pretrained(model_id, use_fast=False)
prompt = "Generate a random number between 100,000 and 100,000,000,000."
input = processor(prompt, return_tensors="pt")
for i in range(0,10):
    output = model.generate(**input)
    print(processor.batch_decode(output, skip_special_tokens=True, clean_up_tokenization_spaces=True))

["Generate a random number between 100,000 and 100,000,000,000. The number should be 11 digits long. Here's a randomly generated 11-digit number"]
['Generate a random number between 100,000 and 100,000,000,000. A random number between 100,000 and 100,00']
['Generate a random number between 100,000 and 100,000,000,000. The number should be a whole number. A random number between 100,000']
['Generate a random number between 100,000 and 100,000,000,000. The number should be a whole number, and it should not contain any repeated digits. Additionally, ensure']
['Generate a random number between 100,000 and 100,000,000,000. The number should be a whole number.\n\nWe are to generate a random whole number between 10']
['Generate a random number between 100,000 and 100,000,000,000. A random number between 100,000 and 100,00']
['Generate a random number between 100,000 and 100,000,000,000. We are to generate a random number between 100,000 and 10']
['Generate a random number between 100,000 and 100,000,000,000. To generate a random number between 100,000 and 100,']
['Generate a random number between 100,000 and 100,000,000,000. To generate a random number between 100,000 and 100,']
['Generate a random number between 100,000 and 100,000,000,000. The number should be a whole number. The number should be between 100,00']

Second run of the same script:

from optimum.intel.openvino import OVModelForCausalLM
from transformers import AutoProcessor
model_id = "qwen3-instruct"
model = OVModelForCausalLM.from_pretrained(model_id, device="GPU")
processor = AutoProcessor.from_pretrained(model_id, use_fast=False)
prompt = "Generate a random number between 100,000 and 100,000,000,000."
input = processor(prompt, return_tensors="pt")
for i in range(0,10):
    output = model.generate(**input)
    print(processor.batch_decode(output, skip_special_tokens=True, clean_up_tokenization_spaces=True))

['Generate a random number between 100,000 and 100,000,000,000. To generate a random number between 100,000 and 100,']
['Generate a random number between 100,000 and 100,000,000,000. We are generating a random number between 100,000 and 100']
['Generate a random number between 100,000 and 100,000,000,000. To generate a random number between 100,000 and 100,']
['Generate a random number between 100,000 and 100,000,000,000. We are generating a random number between 100,000 and 100']
['Generate a random number between 100,000 and 100,000,000,000. The number should be a whole number. The number should not be a multiple of 10.']
['Generate a random number between 100,000 and 100,000,000,000. To generate a random number between 100,000 and 100,']
['Generate a random number between 100,000 and 100,000,000,000. The number should be a whole number.\n\nWe are to generate a random whole number between 10']
["Generate a random number between 100,000 and 100,000,000,000. The number must be a multiple of 100,000. Here's a random"]
['Generate a random number between 100,000 and 100,000,000,000. The number should be a whole number. A random number between 100,000']
['Generate a random number between 100,000 and 100,000,000,000. A random number between 100,000 and 100,00']

Environment:

OS: Ubuntu 24.04, Linux Kernel 6.17.0-19-generic
CPU: AMD 5600x
GPU: Sparkle Intel Arc B60 Pro 24GB

Python:

  • openvino: 2026.0.0
  • openvino-genai: 2026.0.0.0

System packages:

  • libze-dev: 1.27.0.1-24.04-ppa2
  • libze-intel-gpu-dev: 26.05.37020.3-1-24.04-ppa3
  • libze-intel-gpu1: 26.05.37020.3-1-24.04-ppa3
  • libze1: 1.27.0.1-24.04-ppa2
  • intel-opencl-icd: 26.05.37020.3-1-24.04-ppa3

I've also tested with openvino-genai 2025.4.1.0 but still experience the same issue.
