
ENH: Add MinerU2.5-2509-1.2B for VLLM#4523

Closed
GaoLeiA wants to merge 4 commits into xorbitsai:main from GaoLeiA:mineru

Conversation

@GaoLeiA
Contributor

@GaoLeiA GaoLeiA commented Jan 22, 2026

Add MinerU2.5-2509-1.2B as a Vision-Language Model to Xinference
Support image upload and interactive analysis in the VLM Chat interface
It is recommended to use MinerU CLI for processing PDF documents

@XprobeBot XprobeBot added this to the v1.x milestone Jan 22, 2026
@ZhikaiGuo960110
Collaborator

ZhikaiGuo960110 commented Jan 23, 2026

This fixes issue #4518. This PR:
Add MinerU2.5-2509-1.2B as a Vision-Language Model to Xinference
Support image upload and interactive analysis in the VLM Chat interface
It is recommended to use MinerU CLI for processing PDF documents

@GaoLeiA GaoLeiA changed the title Mineru Enhance: Add MinerU2.5-2509-1.2B for VLLM Jan 23, 2026
@qinxuye qinxuye changed the title Enhance: Add MinerU2.5-2509-1.2B for VLLM ENH: Add MinerU2.5-2509-1.2B for VLLM Jan 23, 2026
@XprobeBot XprobeBot added the enhancement New feature or request label Jan 23, 2026
@qinxuye
Contributor

qinxuye commented Jan 29, 2026

Could you rebase the code?

@OliverBryant
Collaborator

Could you please take a look at two things?

  1. Lint is reporting errors, which may need fixing. You can install a pre-commit hook to help identify where the issues are.
  2. The new version of xinference enables virtual environment configuration by default. Therefore, the virtualenv field in the JSON needs to be adapted for the new version. For details, please refer to the latest documentation (it hasn't been merged yet; you can preview it in PR DOC: add v2.0 doc #4545 for now).
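For reference, the virtualenv field mentioned in point 2 lives in the model's entry in llm_family.json. The sketch below is only an assumption of its shape (the field layout and the mineru package pin are illustrative, not taken from the v2.0 docs, which were unmerged at the time):

```json
{
  "model_name": "MinerU2.5-2509-1.2B",
  "virtualenv": {
    "packages": [
      "mineru ; #engine# == \"vLLM\""
    ]
  }
}
```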

@OliverBryant
Collaborator

You need to modify the JSON file in ModelHub. Directly editing JSON within the code is not permitted. You may need to revert your recent changes and then edit the JSON in ModelHub, which will automatically commit to the current PR.
Additionally, the engines available in the virtual environment are identified via the #engine# field. Therefore, if the model can be launched by other engines, additional dependencies must be added. Your current changes restrict the model to launching only via vllm, rendering frameworks like Transformers unusable. To enable Transformers support, add a dependency like: transformers_dependencies ; #engine# == \"Transformers\"
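To illustrate the engine-conditional dependencies described above, a hypothetical packages list might look like the following (package names are placeholders; only the "spec ; #engine# == ..." marker syntax comes from the comment above):

```json
"virtualenv": {
  "packages": [
    "mineru ; #engine# == \"vLLM\"",
    "transformers ; #engine# == \"Transformers\""
  ]
}
```

With a list like this, only the dependencies whose marker matches the engine actually chosen at launch time would be installed into the model's virtual environment.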

@GaoLeiA
Contributor Author

GaoLeiA commented Jan 29, 2026

As you said, I added: transformers_dependencies ; #engine# == \"Transformers\"

@GaoLeiA
Contributor Author

GaoLeiA commented Jan 30, 2026

Verification succeeded. Please help get this approved.

Integrate MinerU VLM model support into Xinference:

 Core Changes:
- xinference/model/llm/llm_family.json: Add mineru-vlm model configuration
- xinference/model/llm/vllm/core.py: Add base64 image/PDF handling
- xinference/ui/gradio/chat_interface.py: Add PDF upload support in chat UI
- xinference/ui/gradio/media_interface.py: Enhance media interface
- xinference/model/image/model_spec.json: Update OCR model specifications

 Features:
- VLM chat with multi-modal support (image, video, audio, PDF)
- Base64 data URI handling for seamless file uploads
- MinerU VLM model ready for serving via vLLM engine

 Lint Fixes:
- W293: Remove trailing whitespace from blank lines in media_interface.py
- E303: Remove extra blank lines in media_interface.py
- F841: Remove unused variable file_ext in media_interface.py
- W391: Remove blank line at end of core.py
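The "Base64 data URI handling" feature above can be sketched as a small helper that splits a data URI into its media type and decoded bytes. This is an illustrative reimplementation, not the PR's actual code; the name parse_data_uri and its signature are hypothetical:

```python
import base64


def parse_data_uri(uri: str) -> tuple[str, bytes]:
    """Split a base64 data URI into (media type, decoded bytes)."""
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    # Everything before the first comma is the header, e.g.
    # "data:application/pdf;base64"; the rest is the payload.
    header, _, payload = uri.partition(",")
    if ";base64" not in header:
        raise ValueError("only base64-encoded data URIs are supported")
    media_type = header[len("data:"):].split(";")[0] or "text/plain"
    return media_type, base64.b64decode(payload)


# Round-trip a tiny PDF-like payload through a data URI.
raw = b"%PDF-1.4 example"
uri = "data:application/pdf;base64," + base64.b64encode(raw).decode()
media_type, data = parse_data_uri(uri)
print(media_type, data == raw)  # → application/pdf True
```

A helper along these lines would let the vLLM entry point accept uploaded images or PDFs as data URIs and recover the raw bytes before handing them to the engine.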
@qinxuye
Contributor

qinxuye commented Feb 6, 2026

Replaced by #4569

@qinxuye qinxuye closed this Feb 6, 2026