Goal
Port the OrcaHello SRKW detection model from fastai to pure PyTorch, removing the fastai==1.0.61 dependency. This greatly simplifies deployment/CI and makes the model easier to share and use (e.g. on HuggingFace).
Context
- Happened to revisit, and noticed how complicated local setup, the prod container, and running CI tests had gotten, with [patch_fastai_audio.sh](https://github.com/orcasound/aifororcas-livesystem/blob/main/InferenceSystem/patch_fastai_audio.sh).
- Even after fixes from @dthaler in #377 (Optimize FastAIModel for reduced peak memory during inference), the fastai inference code (using a `databunch` to load segments, as during training) isn't maintainable and has opaque behavior (#390, fastai_inference.py is non-deterministic).
- Hinders continued evaluation/tuning on new data, usage and citation elsewhere (e.g. processing archives), and comparison with newer candidates (e.g. Perch 2.0) (#378, AI model appears to be 5 years old).
Plan
There's now a pure PyTorch implementation in branch akash/inference-v1-nofastai, with pytests that confirm numerical parity of both (1) audio preprocessing and (2) model inference on test inputs. The checkpoint is converted once and can then be used without any fastai dependency.
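For illustration, a numerical-parity pytest of that shape might look like the sketch below. The pipeline functions here (`old_preprocess`, `new_preprocess`) are hypothetical stand-ins, not the actual code in the branch:

```python
# Hypothetical sketch of a numerical-parity test between two pipeline
# implementations; function names and the toy transform are illustrative.
import numpy as np
import torch


def old_preprocess(waveform: np.ndarray) -> np.ndarray:
    # Stand-in for the legacy (fastai-era) preprocessing path
    return np.log1p(np.abs(waveform))


def new_preprocess(waveform: np.ndarray) -> np.ndarray:
    # Stand-in for the pure-PyTorch preprocessing path
    return torch.log1p(torch.abs(torch.from_numpy(waveform))).numpy()


def test_preprocessing_parity():
    rng = np.random.default_rng(0)
    waveform = rng.standard_normal(16000).astype(np.float32)
    # Elementwise agreement within float32 tolerance
    np.testing.assert_allclose(
        new_preprocess(waveform), old_preprocess(waveform),
        rtol=1e-5, atol=1e-6,
    )
```

The same pattern (fixed random input, `assert_allclose` with explicit tolerances) applies to comparing model logits between the old and new inference paths.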
When fully merged to main, this addresses #150, making model inference much easier to use, e.g. directly from HuggingFace (orcasound/orcahello-srkw-detector-v1) as below:
```python
from model_v1.inference import OrcaHelloSRKWDetectorV1

# Load model from HuggingFace Hub
model = OrcaHelloSRKWDetectorV1.from_pretrained("orcasound/orcahello-srkw-detector-v1")

# Detect SRKW calls in an audio file (config defined elsewhere)
result = model.detect_srkw_from_file("audio.wav", config)
print(f"Orca detected: {result.global_prediction}")
print(f"Confidence: {result.global_confidence:.1f}%")
```
Will incrementally open PRs and track overall on this issue.