Supporting ARM SVE, the newer extended vector instruction set for aarch64

# Summary

Dear @mdouze and all,

ARM SVE is a vector instruction set extension newer than NEON, and it is supported on CPUs such as AWS Graviton3 and Fujitsu A64FX.
I have added SVE support to faiss, implemented some functions with SVE, and compared their execution times against the current implementation.
My implementation appears to improve performance in some environments.
This is only a first implementation to show what SVE can do, and I plan to implement SVE versions of other functions that have not yet been ported.

It may not be possible to test this on CircleCI at the moment, but would you mind if I submitted it as a PR?

# Platform

OS: Ubuntu 22.04

Faiss version: a3296f42adee7a0159b7ac09d7642e862edb142f, and mine

Installed from: compiled by myself

Faiss compilation options: `cmake -B build -DFAISS_ENABLE_GPU=OFF -DPython_EXECUTABLE=$(which python3) -DCMAKE_BUILD_TYPE=Release -DBUILD_TESTING=ON -DFAISS_OPT_LEVEL=sve` (`-DFAISS_OPT_LEVEL=sve` is a new opt level introduced by my changes)
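Since an SVE build only makes sense on a CPU that actually reports SVE, here is a minimal sketch for checking that before building. It assumes a Linux aarch64 host where `/proc/cpuinfo` exposes a `Features` line; the helper names `cpu_features` and `supports_sve` are hypothetical and are not part of faiss.

```python
from pathlib import Path


def cpu_features(cpuinfo_text):
    """Collect the flags listed on 'Features' lines (Linux aarch64 /proc/cpuinfo layout)."""
    features = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            features.update(value.split())
    return features


def supports_sve(cpuinfo_text):
    """Return True only if 'sve' appears as a whole flag, not as part of another flag."""
    return "sve" in cpu_features(cpuinfo_text)


if __name__ == "__main__":
    # Assumption: the file exists only on Linux; elsewhere we report False.
    path = Path("/proc/cpuinfo")
    text = path.read_text() if path.exists() else ""
    print("SVE available:", supports_sve(text))
```

On a machine without the `sve` flag, the SVE opt level would not be usable at run time, so this check is a cheap guard before running the benchmark.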

Running on:
- [x] CPU
- [ ] GPU

Interface: 
- [ ] C++
- [x] Python

# Reproduction instructions

I am only posting the results for searching SIFT1M. If you need more detailed information, please let me know.

![benchmark result](https://github.com/facebookresearch/faiss/assets/40021161/ac34630a-7470-42e0-945a-baa0bea012ff)

- Evaluated on an AWS EC2 c7g.large instance, run faiss on 
- `original` is the current (a3296f42adee7a0159b7ac09d7642e862edb142f) implementation
- `SVE` is the result of my implementation supporting ARM SVE

![image](https://github.com/facebookresearch/faiss/assets/40021161/8edfc585-a4dd-4a43-93f5-217c9418e9ba)

The image above shows the speedup ratio.

- In the best case, `SVE` is approx. 2.26x faster than `original` (IndexIVFPQ + IndexHNSWFlat, M: 32, nprobe: 16)
    - `original` : 0.618 ms
    - `SVE` : 0.274 ms
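For reference, the best-case ratio can be reproduced from the two timings above. This is only a small arithmetic sketch; the function names are hypothetical and the inputs are the two values reported in this issue.

```python
def speedup(original_ms, sve_ms):
    """Ratio of the baseline time to the SVE time (values above 1 mean SVE is faster)."""
    return original_ms / sve_ms


def time_reduction(original_ms, sve_ms):
    """Fraction of the baseline time that the SVE build saves."""
    return 1 - sve_ms / original_ms


if __name__ == "__main__":
    original_ms = 0.618  # `original`, best case reported above
    sve_ms = 0.274       # `SVE`, best case reported above
    print(f"speedup: {speedup(original_ms, sve_ms):.2f}x")                # speedup: 2.26x
    print(f"time reduction: {time_reduction(original_ms, sve_ms):.1%}")  # time reduction: 55.7%
```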
