mtmd: (WIP) gemma3n vision support#17961

Draft
ngxson wants to merge 23 commits into ggml-org:master from ngxson:gemma3n_mtmd

Contributor

ngxson commented Dec 12, 2025

This is a non-working WIP of the gemma 3n vision implementation in ggml. The commits date from June 2025 because the work was under NDA before the model was released.

I could not finish it on time because testing was tricky and there was no native transformers implementation. My code is heavily based on the mlx-vlm implementation: https://github.com/Blaizzy/mlx-vlm/blob/main/mlx_vlm/models/gemma3n/vision.py

IMPORTANT INFO: I am NOT actively working on this feature; I am only pushing the PR for visibility. Contributors can take over this task if needed. --> It's better to wait and see whether their next model still uses the mobilenet architecture; if yes, we continue with this implementation, otherwise we drop it.

Labels

examples, python (python script changes)
