[Question] Recommended KNN/ANN index for large datasets #127

I would like to use `CropResistantHash` to quickly find near-duplicates from a large set of reference images.

With other hash functions I would normally use some kind of approximate nearest neighbor index, such as NMSLib or Annoy. The challenge is that `CropResistantHash` is variable-length and cannot be compared using one of the standard distance functions (Angular, Hamming, Manhattan, ...).
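
To make the problem concrete, here is a rough sketch of the brute-force matching that an index would need to replace. It is an illustration under assumptions, not the library's API: each hash is modeled as a variable-length list of fixed-length segment hashes (plain integers here), and the names `hamming`, `count_matching_segments`, and `brute_force_search`, along with the `cutoff` and `min_matches` parameters, are hypothetical.

```python
from typing import Dict, List, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer-encoded segment hashes."""
    return bin(a ^ b).count("1")


def count_matching_segments(query: List[int], reference: List[int], cutoff: int) -> int:
    """Count query segments that have at least one reference segment within `cutoff` bits."""
    return sum(
        1
        for q in query
        if any(hamming(q, r) <= cutoff for r in reference)
    )


def brute_force_search(
    query: List[int],
    references: Dict[str, List[int]],
    cutoff: int = 4,
    min_matches: int = 1,
) -> List[Tuple[str, int]]:
    """Compare the query against every reference image; best matches first."""
    hits = []
    for name, segments in references.items():
        matches = count_matching_segments(query, segments, cutoff)
        if matches >= min_matches:
            hits.append((name, matches))
    # Sort by match count (descending), then by name for a stable order.
    return sorted(hits, key=lambda item: (-item[1], item[0]))


# Toy data: note that every image has a different number of segments.
references = {
    "a.jpg": [0b11110000, 0b10101010, 0b00001111],
    "b.jpg": [0b01010101],
}
query = [0b11110001, 0b00001111]

print(brute_force_search(query, references, cutoff=1))
```

Each segment comparison is an ordinary Hamming distance, but the per-image score is a count over a variable number of segments, so the whole hash does not map onto a single fixed-length vector. The loop above also touches every reference image for every query, which is what makes it impractical for large datasets.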

Can anyone point me to an alternative solution? How do you use this with large datasets?
