feat: Add HDF5 baseline inference cache to eliminate redundant forward passes in perturbation evaluation #567

Description

@AdityaX18

Problem

The current eval loop in perceptionmetrics/models/torch_detection.py
re-runs full model inference for every perturbation condition. For N images
× P perturbation types × I intensities, this produces N·P·I forward passes.

For COCO val2017 (5,000 images) with 5 perturbation types and 5 intensities,
that is 125,000 forward passes. The clean baseline preprocessing and inference
are repeated 25× even though the model and data are identical each time.
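
The arithmetic above can be checked with a short sketch. The variable names are illustrative, and the baseline count rests on one assumption that the issue implies but does not spell out: that, without a cache, the clean baseline is re-run over all N images once per perturbation condition.

```python
# Back-of-the-envelope count using the figures quoted in this issue.
n_images = 5_000       # COCO val2017 (N)
n_types = 5            # perturbation types (P)
n_intensities = 5      # intensity levels per type (I)

conditions = n_types * n_intensities        # P * I perturbation conditions
perturbed_passes = n_images * conditions    # N * P * I forward passes

# Assumption: the clean baseline is recomputed once per condition today,
# and only once in total when it is cached on disk.
baseline_passes_uncached = n_images * conditions
baseline_passes_cached = n_images
baseline_passes_saved = baseline_passes_uncached - baseline_passes_cached

print(conditions, perturbed_passes, baseline_passes_saved)
```

Under that assumption the cache does not reduce the perturbed forward passes themselves; it only removes the repeated baseline work.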

Proposed solution

A standalone perceptionmetrics/utils/cache.py with:

  • CacheWriter — context manager, writes preprocessed tensors + detection
    predictions (bboxes, labels, scores) to HDF5 after one baseline eval run
  • CacheReader — validates model_hash + schema_version on open, lazy access
  • is_cache_valid(path, model_hash) → bool — O(1) guard for eval loop

Layer 1 only (disk cache write/read). Integration into torch_detection.py
is a follow-up PR.
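
A minimal sketch of how these three pieces could fit together, assuming h5py and NumPy. This is not the proposed implementation: the method names (write, tensor, predictions, close), the SCHEMA_VERSION value, storing metadata as HDF5 attributes, and the demo values at the bottom are all assumptions made for illustration.

```python
import os
import tempfile
import time

import h5py
import numpy as np

SCHEMA_VERSION = 1  # assumed value; the issue does not specify one


class CacheWriter:
    """Context manager that writes baseline tensors and predictions to HDF5."""

    def __init__(self, path, model_name, coco_split, model_hash):
        self.path = path
        self.meta = {"model_name": model_name, "coco_split": coco_split,
                     "model_hash": model_hash}

    def __enter__(self):
        self.f = h5py.File(self.path, "w")
        meta = self.f.create_group("metadata")
        for key, value in self.meta.items():
            meta.attrs[key] = value
        meta.attrs["timestamp"] = time.time()
        meta.attrs["schema_version"] = SCHEMA_VERSION
        return self

    def write(self, img_id, tensor, bboxes, labels, scores):
        # Intermediate groups (tensors/, preds/) are created automatically.
        self.f.create_dataset(f"tensors/{img_id}",
                              data=np.asarray(tensor, dtype=np.float32))
        g = self.f.create_group(f"preds/{img_id}")
        g.create_dataset("bboxes", data=np.asarray(bboxes, dtype=np.float32))
        g.create_dataset("labels", data=np.asarray(labels, dtype=np.int64))
        g.create_dataset("scores", data=np.asarray(scores, dtype=np.float32))

    def __exit__(self, exc_type, exc, tb):
        self.f.close()
        return False  # never swallow exceptions


class CacheReader:
    """Validates model_hash and schema_version on open; reads lazily per image."""

    def __init__(self, path, model_hash):
        self.f = h5py.File(path, "r")
        attrs = self.f["metadata"].attrs
        if (attrs["model_hash"] != model_hash
                or int(attrs["schema_version"]) != SCHEMA_VERSION):
            self.f.close()
            raise ValueError("stale or incompatible cache")

    def tensor(self, img_id):
        return self.f[f"tensors/{img_id}"][()]

    def predictions(self, img_id):
        g = self.f[f"preds/{img_id}"]
        return g["bboxes"][()], g["labels"][()], g["scores"][()]

    def close(self):
        self.f.close()


def is_cache_valid(path, model_hash):
    """Cheap guard: only metadata attributes are read, never the image data."""
    try:
        with h5py.File(path, "r") as f:
            attrs = f["metadata"].attrs
            return bool(attrs["model_hash"] == model_hash
                        and int(attrs["schema_version"]) == SCHEMA_VERSION)
    except (OSError, KeyError):
        return False  # missing file, unreadable file, or missing metadata


# Demo with made-up values: one image, one detection.
path = os.path.join(tempfile.mkdtemp(), "baseline.h5")
with CacheWriter(path, "demo_model", "val2017", "abc123") as writer:
    writer.write(42, np.zeros((3, 8, 8)), [[0, 0, 4, 4]], [1], [0.9])

reader = CacheReader(path, "abc123")
bboxes, labels, scores = reader.predictions(42)
cached_tensor = reader.tensor(42)
reader.close()
```

In an eval loop, is_cache_valid(path, model_hash) would decide whether to open a CacheReader or to re-run the baseline and rewrite the file with CacheWriter.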

HDF5 schema

├── metadata/                (model_name, coco_split, model_hash, timestamp, schema_version)
├── tensors/{img_id}         float32 (C, H, W)
└── preds/{img_id}/bboxes    float32 (N_det, 4)
                  /labels    int64   (N_det,)
                  /scores    float32 (N_det,)

Zero-detection images write empty (0,4)/(0,)/(0,) datasets — never skip.
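
A small sketch of the zero-detection rule, assuming h5py and NumPy. It uses an in-memory file (h5py's core driver with backing_store=False) so nothing is written to disk; the file name and the image ID 139 are arbitrary placeholders.

```python
import h5py
import numpy as np

# In-memory HDF5 file, for illustration only.
with h5py.File("zero_det_demo.h5", "w", driver="core", backing_store=False) as f:
    g = f.create_group("preds/139")  # arbitrary example image ID
    # Empty arrays keep the dtype and trailing dimensions of the schema.
    g.create_dataset("bboxes", data=np.empty((0, 4), dtype=np.float32))
    g.create_dataset("labels", data=np.empty((0,), dtype=np.int64))
    g.create_dataset("scores", data=np.empty((0,), dtype=np.float32))

    # The image is still present in the cache, just with zero rows.
    has_entry = "preds/139" in f
    shapes = (g["bboxes"].shape, g["labels"].shape, g["scores"].shape)

print(has_entry, shapes)
```

Writing the empty datasets lets a reader tell "the model found nothing in this image" apart from "this image was never cached", which a skipped entry would not.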

Why HDF5

Variable-length prediction arrays + fixed-shape image tensors in one file
with O(1) random access by image ID. Parquet, LMDB, and zarr each fail one
of these requirements.

Tests included: round-trip, stale hash, is_cache_valid, zero-detection, metadata.
I plan to submit a solution for this.
