I would like to use CropResistantHash to quickly find near-duplicates from a large set of reference images.
With other hash functions I would normally use some kind of approximate nearest neighbor index, such as NMSLib or Annoy. The challenge is that CropResistantHash is variable-length and cannot be compared using one of the standard distance functions (Angular, Hamming, Manhattan, ...).
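
To make the mismatch concrete, here is a minimal sketch in plain Python. It assumes each image is represented by a variable number of 64-bit segment hashes. The integer values are made-up placeholders, not real crop-resistant hash output, and `hamming`, `matching_segments`, and `max_bits` are hypothetical names chosen for illustration. The comparison is between two sets of different sizes, so it does not reduce to a single vector distance.

```python
from typing import Dict, List


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer-encoded hashes."""
    return bin(a ^ b).count("1")


def matching_segments(query: List[int], ref: List[int], max_bits: int = 8) -> int:
    """Count query segments that have at least one reference segment
    within max_bits Hamming distance (max_bits is an arbitrary threshold)."""
    return sum(1 for q in query if any(hamming(q, r) <= max_bits for r in ref))


# Placeholder data: images produce different numbers of segment hashes.
reference: Dict[str, List[int]] = {
    "a.jpg": [0xFFFF0000FFFF0000, 0x0F0F0F0F0F0F0F0F, 0x123456789ABCDEF0],
    "b.jpg": [0x00000000FFFFFFFF],
}

# A query with two segments, as might result from a crop of a.jpg.
query = [0xFFFF0000FFFF0001, 0x0F0F0F0F0F0F0F0F]

# Linear scan over every reference image.
scores = {name: matching_segments(query, segs) for name, segs in reference.items()}
print(scores)
```

Each individual segment comparison is an ordinary Hamming distance, but the per-image score is a set-to-set match count, and a linear scan like this one does not scale to a large reference set.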
Can anyone point me to an alternative solution? How do you use this with large datasets?