vlm-exam

Benchmark suite for Vision Language Models. Compare accuracy, cost, and speed across frontier VLMs on standardized visual tasks.

Supported tasks

VQA / OCR -- visual question answering and optical character recognition

Supported providers

Anthropic (Claude)
Google (Gemini)
OpenAI (GPT)

Installation

pip install vlm-exam

Or install from source:

git clone https://github.com/roboflow/vlm-exam.git
cd vlm-exam
pip install -e ".[dev]"

Quick start

CLI

export ANTHROPIC_API_KEY=...
export GOOGLE_API_KEY=...
export OPENAI_API_KEY=...

vlm-exam run \
    --task vqa \
    --models claude-fable-5,gemini-3.5-flash,gpt-5.5 \
    --effort high \
    --dataset-directory /path/to/vqa/dataset

vlm-exam report --results-directory results/

Python

from vlm_exam import load_config, create_provider, create_task, run_benchmark

config = load_config()
task = create_task("vqa")
samples = task.load_samples("/path/to/vqa/dataset")
provider = create_provider("anthropic", model="claude-fable-5")

results = run_benchmark(task=task, provider=provider, samples=samples, effort="high")

Configuration

Model definitions, pricing, and lab branding live in src/vlm_exam/configs/models.yaml. Add a new model by editing this file -- no code changes required.

License

Apache 2.0. See LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
notebooks		notebooks
src/vlm_exam		src/vlm_exam
tests		tests
.gitignore		.gitignore
AGENTS.md		AGENTS.md
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

vlm-exam

Supported tasks

Supported providers

Installation

Quick start

CLI

Python

Configuration

License

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

vlm-exam

Supported tasks

Supported providers

Installation

Quick start

CLI

Python

Configuration

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages