Repositories list
21 repositories
OpenTrackVLA (Public): Open & Reproducible Research for Tracking VLAs
VLM-FO1 (Public): VLM-FO1: Bridging the Gap Between High-Level Reasoning and Fine-Grained Perception in VLMs
(name not captured): [EMNLP-2025 Oral] ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration
VLM-R1 (Public): Solve Visual Understanding with Reinforced VLMs
om-ai-lab.github.io (Public)
(name not captured): Reproducible Language Agent Research
vlm-r1seg (Public)
VLM-R1.github.io (Public)
RS5M (Public): RS5M: a large-scale vision language dataset for remote sensing [TGRS]
OmChat (Public)
OmAgentDocs (Public)
OmDet (Public): Real-time and accurate open-vocabulary end-to-end object detection
VL-CheckList (Public): Evaluating Vision & Language Pretraining Models with Objects, Attributes and Relations. [EMNLP 2022]
OmModel (Public)
awesome-RSVLM (Public)
OVDEval (Public): A Comprehensive Evaluation Benchmark for Open-Vocabulary Detection (AAAI 2024)
GroundVLP (Public): GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection (AAAI 2024)
habitat-lab (Public)