Describe the bug
I'm trying to use mistral.rs to run MiniCPM-O on Google Cloud Run (with an NVIDIA L4 GPU). I created a custom Dockerfile based on Dockerfile.cuda-all (linked at the end) and built the latest master branch. The main addition in my Dockerfile is downloading the model and storing it in the Docker image, roughly as sketched below.
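The gist at the end has the real file; the gist of the change is just baking the weights into the image at build time so Cloud Run doesn't download them on cold start. A minimal sketch (the model repo ID and target directory are illustrative, and this assumes Python/pip are available in the base image):

```dockerfile
# Sketch only: download the model at build time so it ships inside the image.
# Repo ID and paths are illustrative; see the linked gist for the real file.
RUN pip install "huggingface_hub[cli]" && \
    huggingface-cli download openbmb/MiniCPM-o-2_6 --local-dir /models/minicpmo_2_6
```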
Everything works fine when I send a request with text only, e.g.:

```python
response = requests.post(
    "http://localhost:9000/v1/chat/completions",
    json={
        "model": "minicpmo_2_6",
        "messages": [
            {
                "role": "user",
                "content": "What is your name?",
            }
        ],
        "max_tokens": 256,
        "frequency_penalty": 1.0,
        "top_p": 0.1,
        "temperature": 0,
    },
)
```

When I try sending the example from your docs, it fails. The request I'm sending is like this:
```python
response = requests.post(
    "http://localhost:9000/v1/chat/completions",
    json={
        "model": "minicpmo_2_6",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "https://www.nhmagazine.com/content/uploads/2019/05/mtwashingtonFranconia-2-19-18-108-Edit-Edit.jpg"
                        },
                    },
                    {
                        "type": "text",
                        "text": "(<image>./</image>) What is shown in this image? Write a detailed response analyzing the scene.",
                    },
                ],
            }
        ],
        "max_tokens": 256,
        "frequency_penalty": 1.0,
        "top_p": 0.1,
        "temperature": 0,
    },
)
```

I also tried with a different image URL and with a base64-encoded image (sketched below).
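The base64 variant looked roughly like this (a sketch: the local file path is illustrative, and the full script is in the gist linked at the end):

```python
import base64

import requests

# Read a local image and encode it as a data URL (the file path is illustrative).
with open("mountain.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = requests.post(
    "http://localhost:9000/v1/chat/completions",
    json={
        "model": "minicpmo_2_6",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                    },
                    {
                        "type": "text",
                        "text": "(<image>./</image>) What is shown in this image?",
                    },
                ],
            }
        ],
        "max_tokens": 256,
    },
)
```

The result is always the same: an error like this in the logs: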
```
2025-02-28T17:16:20.454460Z ERROR mistralrs_core::engine: step - Model failed with error: WithBacktrace { inner: Msg("shape mismatch slot_mapping [49], expected 511"), backtrace: Backtrace [{ fn: "candle_core::error::Error::bt" }, { fn: "mistralrs_paged_attn::cuda::backend::paged_attention::reshape_and_cache" }, { fn: "mistralrs_core::paged_attention::layers::paged_attention::PagedAttention::forward" }, { fn: "mistralrs_core::models::qwen2::Model::forward_embed" }, { fn: "<mistralrs_core::vision_models::minicpmo::MiniCpmOModel as mistralrs_core::pipeline::loaders::vision_loaders::VisionModel>::forward" }, { fn: "<mistralrs_core::pipeline::vision::VisionPipeline as mistralrs_core::pipeline::Pipeline>::forward_inputs" }, { fn: "mistralrs_core::pipeline::Pipeline::step::{{closure}}" }, { fn: "mistralrs_core::engine::Engine::run::{{closure}}" }, { fn: "tokio::runtime::runtime::Runtime::block_on" }, { fn: "std::sys::backtrace::__rust_begin_short_backtrace" }, { fn: "core::ops::function::FnOnce::call_once{{vtable.shim}}" }, { fn: "std::sys::pal::unix::thread::Thread::new::thread_start" }] }
thread '<unnamed>' panicked at mistralrs-core/src/pipeline/inputs_processor.rs:395:21:
Block table is too small (completion)! start_pos=510 block_size=32 table_len=2
stack backtrace:
   0: rust_begin_unwind
   1: core::panicking::panic_fmt
   2: mistralrs_core::pipeline::inputs_processor::text_models_inputs_processor::make_completion_chunk
   3: mistralrs_core::pipeline::inputs_processor::text_models_inputs_processor::get_completion_input
   4: <mistralrs_core::vision_models::minicpmo::inputs_processor::MiniCpmOImageProcessor as mistralrs_core::pipeline::inputs_processor::InputsProcessor>::process_inputs
   5: mistralrs_core::pipeline::Pipeline::step::{{closure}}
   6: mistralrs_core::engine::Engine::run::{{closure}}
   7: tokio::runtime::runtime::Runtime::block_on
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
```
Full log: downloaded-logs-20250228-092532.txt
Dockerfile + Cloud Build config: https://gist.github.com/jgonera/3c792ee3f44ec1fc12ba7ede7f723550
Full request-making code: https://gist.github.com/jgonera/326ff5d1612a72d0b80194636146f38c
Am I missing something obvious? I'd appreciate any help!