Skip to content

[Question] Recommended KNN/ANN index for large datasets #127

@misotrnka

Description

@misotrnka

I would like to use CropResistantHash to quickly find near-duplicates from a large set of reference images.

With other hash functions I would normaly use some kind of approximate nearest neighbor index, such as NMSLib or Annoy. The challenge is that CropResistantHash is variable length and cannot be compared using one of the standard distance functions (Angular, Hamming, Manhattan, ...).

Can anyone point me to an alternative solution? How do you use this with large datasets?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions