Problem
The current eval loop in perceptionmetrics/models/torch_detection.py
re-runs full model inference for every perturbation condition. For N images
× P perturbation types × I intensities this produces N·P·I forward passes.
For COCO val2017 (5,000 images) with 5 types and 5 intensities, that is
125,000 forward passes. The clean baseline preprocessing and inference are
repeated 25× even though the model and data are identical each time.
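The arithmetic above can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the last figure assumes that "repeated 25×" means the baseline runs once per (type, intensity) condition, which is one reading of the numbers above.

```python
# Counts taken from the issue text (COCO val2017, 5 types, 5 intensities).
n_images = 5_000
n_types = 5
n_intensities = 5

# One perturbation condition per (type, intensity) pair.
conditions = n_types * n_intensities

# N * P * I forward passes for the full sweep.
total_passes = n_images * conditions

# Assumption: the clean baseline is recomputed once per condition,
# so all but one of those baseline runs are redundant.
redundant_baseline_passes = (conditions - 1) * n_images

print(conditions, total_passes, redundant_baseline_passes)
```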
Proposed solution
A standalone perceptionmetrics/utils/cache.py with:
CacheWriter — context manager, writes preprocessed tensors + detection
predictions (bboxes, labels, scores) to HDF5 after one baseline eval run
CacheReader — validates model_hash + schema_version on open, lazy access
is_cache_valid(path, model_hash) → bool — O(1) guard for eval loop
Layer 1 only (disk cache write/read). Integration into torch_detection.py
is a follow-up PR.
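A minimal sketch of how the three pieces could fit together, assuming h5py and NumPy. Several details here are assumptions rather than the final design: metadata is stored as attributes on the metadata group, SCHEMA_VERSION is 1, the write/tensor/preds method names are hypothetical, and model_hash is an opaque string supplied by the caller.

```python
import os
import time

import h5py
import numpy as np

SCHEMA_VERSION = 1  # assumed value; the issue does not fix one


class CacheWriter:
    """Context manager that writes one baseline eval run to HDF5."""

    def __init__(self, path, model_name, coco_split, model_hash):
        self.path = path
        self.meta = {
            "model_name": model_name,
            "coco_split": coco_split,
            "model_hash": model_hash,
            "timestamp": time.time(),
            "schema_version": SCHEMA_VERSION,
        }

    def __enter__(self):
        self.f = h5py.File(self.path, "w")
        meta = self.f.create_group("metadata")
        for key, value in self.meta.items():
            meta.attrs[key] = value  # assumption: metadata lives in attrs
        self.f.create_group("tensors")
        self.f.create_group("preds")
        return self

    def __exit__(self, exc_type, exc, tb):
        self.f.close()
        return False  # do not swallow exceptions

    def write(self, img_id, tensor, bboxes, labels, scores):
        key = str(img_id)
        self.f["tensors"].create_dataset(key, data=np.asarray(tensor, dtype=np.float32))
        g = self.f["preds"].create_group(key)
        g.create_dataset("bboxes", data=np.asarray(bboxes, dtype=np.float32))
        g.create_dataset("labels", data=np.asarray(labels, dtype=np.int64))
        g.create_dataset("scores", data=np.asarray(scores, dtype=np.float32))


class CacheReader:
    """Validates model_hash and schema_version on open; reads per image on demand."""

    def __init__(self, path, model_hash):
        self.f = h5py.File(path, "r")
        meta = self.f["metadata"].attrs
        stale = meta["model_hash"] != model_hash
        wrong_schema = int(meta["schema_version"]) != SCHEMA_VERSION
        if stale or wrong_schema:
            self.f.close()
            raise ValueError("stale or incompatible cache: " + path)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.f.close()
        return False

    def tensor(self, img_id):
        # Only this image's dataset is read from disk.
        return self.f["tensors"][str(img_id)][()]

    def preds(self, img_id):
        g = self.f["preds"][str(img_id)]
        return g["bboxes"][()], g["labels"][()], g["scores"][()]


def is_cache_valid(path, model_hash):
    """Cheap guard: reads only the metadata attributes, never the tensors."""
    if not os.path.isfile(path):
        return False
    try:
        with h5py.File(path, "r") as f:
            meta = f["metadata"].attrs
            return bool(
                meta["model_hash"] == model_hash
                and int(meta["schema_version"]) == SCHEMA_VERSION
            )
    except (OSError, KeyError):
        # Not an HDF5 file, or metadata is missing.
        return False
```

In the eval loop, the intended use would be to call is_cache_valid first and fall back to a fresh baseline run (wrapped in CacheWriter) when it returns False.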
HDF5 schema
├── metadata/                 (model_name, coco_split, model_hash, timestamp, schema_version)
├── tensors/{img_id}          float32 (C, H, W)
└── preds/{img_id}/bboxes     float32 (N_det, 4)
    preds/{img_id}/labels     int64   (N_det,)
    preds/{img_id}/scores     float32 (N_det,)
Images with zero detections still get empty (0, 4)/(0,)/(0,) datasets; they are never skipped.
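A small sketch of the zero-detection case, assuming h5py and NumPy. The write_preds helper and the image ID 42 are hypothetical, and the file is kept in memory only (driver="core", backing_store=False), so nothing is written to disk.

```python
import h5py
import numpy as np


def write_preds(group, bboxes, labels, scores):
    """Store one image's predictions; empty arrays are written, not skipped."""
    group.create_dataset("bboxes", data=np.asarray(bboxes, dtype=np.float32))
    group.create_dataset("labels", data=np.asarray(labels, dtype=np.int64))
    group.create_dataset("scores", data=np.asarray(scores, dtype=np.float32))


# An image for which the detector returned nothing.
empty_bboxes = np.zeros((0, 4), dtype=np.float32)
empty_labels = np.zeros((0,), dtype=np.int64)
empty_scores = np.zeros((0,), dtype=np.float32)

# In-memory HDF5 file, used here only for illustration.
with h5py.File("zero_det_demo.h5", "w", driver="core", backing_store=False) as f:
    g = f.create_group("preds/42")  # intermediate groups are created automatically
    write_preds(g, empty_bboxes, empty_labels, empty_scores)
    shapes = {name: g[name].shape for name in ("bboxes", "labels", "scores")}
    has_entry = "42" in f["preds"]

print(shapes, has_entry)
```

Writing the empty datasets lets a reader tell "the model found nothing in this image" apart from "this image is missing from the cache", which would otherwise look the same.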
Why HDF5
Variable-length prediction arrays + fixed-shape image tensors in one file
with O(1) random access by image ID. Parquet, LMDB, and zarr each fail one
of these requirements.
Tests included: round-trip, stale hash, is_cache_valid, zero-detection, metadata.
I plan to submit a solution for this.