Describe the bug
Description
In the _prepare_chunked_prefill function, we slice the input embeds based on the chunk size, chunk id, etc. However, for multimodal (MM) input, we recompute the embeds using mm_features and only the slice of the input embeds that corresponds to the current chunk, not the full sequence. As a result, image features are scattered into the wrong positions, and the last chunks are affected the most.
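To illustrate the kind of misplacement described above, here is a toy sketch in plain Python. It is not the actual implementation: the helper names (merge_full, chunk_then_merge, merge_then_chunk), the IMAGE_TOKEN placeholder id, and the string "embeddings" are all hypothetical, and the sketch assumes features are consumed in order, one per placeholder token.

```python
IMAGE_TOKEN = -1  # hypothetical placeholder id for image positions


def merge_full(token_ids, text_embeds, mm_features):
    """Write image features, in order, into the placeholder slots."""
    merged = list(text_embeds)
    feats = iter(mm_features)
    for pos, tok in enumerate(token_ids):
        if tok == IMAGE_TOKEN:
            merged[pos] = next(feats)
    return merged


def chunk_then_merge(token_ids, text_embeds, mm_features, chunk_id, chunk_size):
    """Suspected buggy order: slice the chunk first, then scatter into it.

    The scatter restarts from the first image feature for every chunk,
    so later chunks receive features that belong to earlier positions.
    """
    start = chunk_id * chunk_size
    end = start + chunk_size
    return merge_full(token_ids[start:end], text_embeds[start:end], mm_features)


def merge_then_chunk(token_ids, text_embeds, mm_features, chunk_id, chunk_size):
    """Possible fix: scatter over the full sequence, then slice the chunk."""
    start = chunk_id * chunk_size
    end = start + chunk_size
    return merge_full(token_ids, text_embeds, mm_features)[start:end]


# Toy sequence: 8 tokens, 4 of them image placeholders, chunk size 4.
token_ids = [10, 11, IMAGE_TOKEN, IMAGE_TOKEN, 12, IMAGE_TOKEN, IMAGE_TOKEN, 13]
text_embeds = [f"t{i}" for i in range(len(token_ids))]
mm_features = ["img0", "img1", "img2", "img3"]

buggy = chunk_then_merge(token_ids, text_embeds, mm_features, chunk_id=1, chunk_size=4)
fixed = merge_then_chunk(token_ids, text_embeds, mm_features, chunk_id=1, chunk_size=4)
print("buggy:", buggy)
print("fixed:", fixed)
```

In this toy setup the first chunk comes out the same either way; only the second chunk differs, which is consistent with the observation that the last chunks are the ones that get corrupted.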
How to reproduce
This does not cause a crash or any visible error, but it degrades output quality.
I noticed quality issues with the ministral3 model, which led to the discovery of this issue.
Additional context
No response
Checklist