[WIP] feature : add zero copy support for vector index#1032

Closed
foxspy wants to merge 1 commit into zilliztech:main from foxspy:zero_copy_support

Conversation

@foxspy
Collaborator

@foxspy foxspy commented Jan 15, 2025

issue: #1031

@sre-ci-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: foxspy


@foxspy foxspy changed the title feature : add zero copy support for vector index [WIP] feature : add zero copy support for vector index Jan 15, 2025
@mergify mergify bot added the dco-passed label Jan 15, 2025
@mergify

mergify bot commented Jan 15, 2025

@foxspy 🔍 Important: PR Classification Needed!

For efficient project management and a seamless review process, it's essential to classify your PR correctly. Here's how:

  1. If you're fixing a bug, label it as kind/bug.
  2. For small tweaks (less than 20 lines without altering any functionality), please use kind/improvement.
  3. Significant changes that don't modify existing functionalities should be tagged as kind/enhancement.
  4. Adjusting APIs or changing functionality? Go with kind/feature.

For any PR outside the kind/improvement category, ensure you link to the associated issue using the format: “issue: #”.

Thanks for your efforts and contribution to the community!

@foxspy foxspy force-pushed the zero_copy_support branch 3 times, most recently from 5385cae to dc5a6ac Compare January 15, 2025 13:03
@codecov

codecov bot commented Jan 15, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 72.81%. Comparing base (3c46f4c) to head (771d36e).
Report is 344 commits behind head on main.


@@            Coverage Diff            @@
##           main    #1032       +/-   ##
=========================================
+ Coverage      0   72.81%   +72.81%     
=========================================
  Files         0       82       +82     
  Lines         0     7529     +7529     
=========================================
+ Hits          0     5482     +5482     
- Misses        0     2047     +2047     

see 82 files with indirect coverage changes


Collaborator

@alexanderguzhva alexanderguzhva left a comment

one topic to discuss. Otherwise, lgtm

MetricType metric_type;
float metric_arg; ///< argument of the metric type

std::shared_ptr<MmappedFileMappingOwner> mmap_owner;
Collaborator

This is a very tricky point. Although I understand why this was introduced, I'd like to avoid adding it here. The approach is similar to a std::string_view, which has no business maintaining the memory it refers to. Similarly, here: it is not the Index instance's business to maintain the memory behind the view.

I'm open for discussions.
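
The string_view analogy above can be made concrete with a small sketch. All names below (`FloatView`, `IndexLike`, `LoadedIndex`, `load_without_copy`) are hypothetical and are not part of Faiss or Knowhere; the sketch only assumes that the caller, rather than the index object, bundles the owning buffer with the view.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical non-owning view, analogous to std::string_view.
struct FloatView {
    const float* data = nullptr;
    std::size_t size = 0;
};

// Hypothetical index-like object: it only reads through the view and
// knows nothing about who owns the underlying memory.
struct IndexLike {
    FloatView codes;

    float sum() const {
        float s = 0.0f;
        for (std::size_t i = 0; i < codes.size; ++i) {
            s += codes.data[i];
        }
        return s;
    }
};

// Hypothetical wrapper held by the caller. `owner` is declared first, so it
// is destroyed last and therefore outlives the view stored in `index`.
struct LoadedIndex {
    std::shared_ptr<std::vector<float>> owner;
    IndexLike index;
};

inline LoadedIndex load_without_copy(std::shared_ptr<std::vector<float>> buffer) {
    assert(buffer != nullptr);
    LoadedIndex out;
    // Point the view at the buffer's storage; no element is copied.
    out.index.codes = FloatView{buffer->data(), buffer->size()};
    // Moving the shared_ptr transfers the handle, not the vector contents.
    out.owner = std::move(buffer);
    return out;
}
```

With this layout the index type stays free of any ownership member, at the cost of requiring every caller to keep the wrapper (or an equivalent owner) alive for as long as the index is used.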

Collaborator

@foxspy any comments on this? Basically, I doubt that we'll be able to introduce this variable in vanilla Faiss when porting back.

}
}

ZeroCopyIOReader* zr = dynamic_cast<ZeroCopyIOReader*>(f);
Collaborator

This is a todo for the future: basically, a reader needs to instantiate an object. But that is for later, because it needs to be properly synchronized with vanilla Faiss first.
So, let's keep the change the way you proposed it.

Collaborator Author

@foxspy foxspy Jan 16, 2025

Yes, that is why I put the PR up first. Zero copy is needed here mainly for:

  1. Avoiding the memory growth caused by copying during loading.
  2. Letting memory be managed uniformly through Eyrie later, so that the index does not hold data, only views.
  3. Unifying the implementations of mmap and zero copy (mmap externally first, then load through zero copy).

If you have any ideas about this, please feel free to share them. Has Faiss considered supporting this mode?
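
The difference between a copying load and a zero-copy load can be sketched as follows. `SketchReader` is a hypothetical class written for illustration; its member names mirror the constructor shown in this PR (`data_`, `rp_`, `total_`), but its methods are assumptions and not the actual `ZeroCopyIOReader` interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical reader over a caller-owned buffer.
class SketchReader {
public:
    SketchReader(const std::uint8_t* data, std::size_t size)
        : data_(data), rp_(0), total_(size) {
        assert(data != nullptr || size == 0);
    }

    // Copying read: duplicates `n` bytes into `dst`, so the loaded object
    // needs its own storage in addition to the source buffer.
    bool read_copy(void* dst, std::size_t n) {
        if (n > total_ - rp_) {
            return false;
        }
        std::memcpy(dst, data_ + rp_, n);
        rp_ += n;
        return true;
    }

    // Zero-copy read: returns a pointer into the source buffer (or nullptr
    // if fewer than `n` bytes remain). The caller must keep the buffer alive.
    const std::uint8_t* read_view(std::size_t n) {
        if (n > total_ - rp_) {
            return nullptr;
        }
        const std::uint8_t* p = data_ + rp_;
        rp_ += n;
        return p;
    }

    std::size_t position() const { return rp_; }

private:
    const std::uint8_t* data_;
    std::size_t rp_;
    std::size_t total_;
};
```

Under this sketch, an mmap-backed load and an in-memory zero-copy load differ only in where the buffer comes from, which is the unification described in point 3.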

@foxspy foxspy force-pushed the zero_copy_support branch from dc5a6ac to 7dc2032 Compare January 16, 2025 02:39
@foxspy foxspy force-pushed the zero_copy_support branch 2 times, most recently from b58a68a to 91cfcde Compare February 6, 2025 08:06
Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
@foxspy foxspy force-pushed the zero_copy_support branch from 91cfcde to 771d36e Compare February 6, 2025 08:09
@mergify mergify bot added the ci-passed label Feb 6, 2025

MemoryIOReader reader(binary->data.get(), binary->size);
int io_flags = faiss::IO_FLAG_ZERO_COPY;
faiss::ZeroCopyIOReader reader(binary->data.get(), binary->size);
Collaborator

Such an approach won't lead to a memory leak, correct?

Collaborator Author

The binary will be held by the index; so there will be no memory leak

namespace faiss {

ZeroCopyIOReader::ZeroCopyIOReader(uint8_t* data, size_t size) : data_(data), rp_(0), total_(size) {
Collaborator

This duplicates code similar to src/io/memory_io.h.

Collaborator Author

This is because Knowhere is a hub that adapts to multiple index libraries. For indexes other than Faiss, it uses memory_io.h. Faiss needs to keep its own independent zerocopy_io.cpp and must not rely on Knowhere objects.

@foxspy
Collaborator Author

foxspy commented Feb 24, 2025

Zero copy may cause data misalignment, because data serialization in Faiss does not require alignment, which may cause performance degradation or even crashes in some scenarios. @alexanderguzhva Do you have any suggestions for this?

@foxspy foxspy closed this Feb 24, 2025
@foxspy foxspy reopened this Feb 24, 2025
@mergify mergify bot removed the ci-passed label Feb 24, 2025
@alexanderguzhva
Collaborator

> Zero copy may cause data misalignment, because data serialization in Faiss does not require alignment, which may cause performance degradation or even crashes in some scenarios. @alexanderguzhva Do you have any suggestions for this?

I'm not sure I follow. Any examples?
As far as I was able to understand your code, it is about avoiding memory duplication, so I'm not sure where data misalignment and a crash might occur.

@alexanderguzhva
Collaborator

I mean that this PR totally makes sense, with the exception of adding a memory owner directly to the faiss::Index object

return;
}
}
target.resize(size);
Collaborator

this particular line 118 is not needed

@foxspy
Collaborator Author

foxspy commented Feb 25, 2025

> > Zero copy may cause data misalignment, because data serialization in Faiss does not require alignment, which may cause performance degradation or even crashes in some scenarios. @alexanderguzhva Do you have any suggestions for this?
>
> I'm not sure I follow. Any examples? As far as I was able to understand your code, it is about avoiding memory duplication, so I'm not sure where data misalignment and a crash might occur.

A simple example: if https://github.com/zilliztech/knowhere/pull/1032/files#diff-d05f16d9b1473d49c536582f9433cca57d6a2909a5a729dec54c383f598b57c4R119 assigns an address such as 0x5ffffff3 to a float *, memory accesses through that pointer will be misaligned. I have tried this on x64 and Arm machines and it did not crash, but I know that it generally leads to performance degradation (a single load may require multiple memory accesses), and some older machines may show strange problems. For example, scann also has SIMD alignment requirements (https://github.com/zilliztech/knowhere/blob/main/thirdparty/faiss/faiss/utils/AlignedTable.h#L115), which will cause memory expansion (duplicated binary + memory copy) if not handled.
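
The alignment concern can be illustrated with a short sketch. The helpers below (`is_aligned`, `load_float_unaligned`, `view_or_copy`, `FloatSpan`) are hypothetical names chosen for illustration and do not appear in the PR; the fallback policy of copying misaligned data is an assumption about one possible mitigation, and it reintroduces exactly the duplication that zero copy is meant to avoid.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// True if `p` is aligned to `alignment` bytes (alignment must be a power of two).
inline bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Reads one float from a possibly misaligned address. memcpy avoids
// dereferencing a misaligned float*, which is undefined behavior in C++.
inline float load_float_unaligned(const std::uint8_t* p) {
    float v;
    std::memcpy(&v, p, sizeof(v));
    return v;
}

struct FloatSpan {
    const float* data;
    std::size_t size;
    bool copied;  // true if the data had to be duplicated into `scratch`
};

// Hypothetical policy: view the bytes in place when they are suitably
// aligned for float, otherwise copy them into aligned scratch storage.
inline FloatSpan view_or_copy(const std::uint8_t* bytes,
                              std::size_t count,
                              std::vector<float>& scratch) {
    if (is_aligned(bytes, alignof(float))) {
        // Assumption: the bytes were serialized from floats on this platform.
        return FloatSpan{reinterpret_cast<const float*>(bytes), count, false};
    }
    scratch.resize(count);
    if (count != 0) {
        std::memcpy(scratch.data(), bytes, count * sizeof(float));
    }
    return FloatSpan{scratch.data(), count, true};
}
```

SIMD code paths typically need stricter alignment than `alignof(float)`, so a real check would use the alignment the consuming kernel requires.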

facebook-github-bot pushed a commit to facebookresearch/faiss that referenced this pull request Mar 11, 2025
Summary:
This PR introduces a backport of a combination of zilliztech/knowhere#996 and zilliztech/knowhere#1032, which allows memory-mapped and zero-copy indices.

The root underlying idea is that we replace certain `std::vector<>` containers with a custom `faiss::MaybeOwnedVector<>` container, which may behave either as `std::vector<>`, or as a view of a certain pointer / descriptor. We don't replace all the instances of `std::vector<>`, but the largest ones.

This change affects `IndexFlatCodes`-based and `IndexHNSW` CPU indices.

(done) alter IVF lists as well.
(done) alter binary indices as well.
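
The container idea described above can be sketched in a simplified form. `MaybeOwned` below is a hypothetical stand-in written for illustration and is not the actual `faiss::MaybeOwnedVector<>`; in particular, the `keep_alive` handle and the read-only interface are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical container that either owns its elements like std::vector
// or acts as a view over memory owned elsewhere.
template <typename T>
class MaybeOwned {
public:
    // Owning mode: takes over a vector and exposes its storage.
    explicit MaybeOwned(std::vector<T> v)
        : owned_(std::move(v)),
          is_owned_(true),
          ptr_(owned_.data()),
          size_(owned_.size()) {}

    // View mode: points at external memory. `keep_alive` is any handle
    // (for example a mapping or buffer owner) that keeps the memory valid.
    MaybeOwned(const T* p, std::size_t n, std::shared_ptr<void> keep_alive)
        : keeper_(std::move(keep_alive)),
          is_owned_(false),
          ptr_(p),
          size_(n) {}

    // Copying is disabled so that ptr_ can never dangle into another
    // instance's owned_ vector.
    MaybeOwned(const MaybeOwned&) = delete;
    MaybeOwned& operator=(const MaybeOwned&) = delete;

    bool is_owned() const { return is_owned_; }
    const T* data() const { return ptr_; }
    std::size_t size() const { return size_; }

    const T& operator[](std::size_t i) const {
        assert(i < size_);
        return ptr_[i];
    }

private:
    // Declaration order matters: owned_ is initialized before ptr_.
    std::vector<T> owned_;
    std::shared_ptr<void> keeper_;
    bool is_owned_;
    const T* ptr_;
    std::size_t size_;
};
```

Replacing only the largest vectors with such a type keeps the rest of the index code unchanged while letting the bulk data live in a mapped file or a caller-provided buffer.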

Memory-mapped index works like this:
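```C++
#include <faiss/index_io.h>  // declares faiss::read_index

// `filename` is assumed to be a std::string holding the index file path.
std::unique_ptr<faiss::Index> index_mm(
        faiss::read_index(filename.c_str(), faiss::IO_FLAG_MMAP_IFC));
```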
In theory, it should be ready to be used from Python. All the descriptor management should be working.

Zero-copy index works like this:
```C++
#include <faiss/impl/zerocopy_io.h>

faiss::ZeroCopyIOReader reader(buffer.data(), buffer.size());
std::unique_ptr<faiss::Index> index_zc(faiss::read_index(&reader));
```
All the pointer management for `faiss::ZeroCopyIOReader` should be handled manually.
I'm not sure how to plug this into Python yet; maybe some ref-counting is required.

(done) some refactoring

Pull Request resolved: facebookresearch/faiss#4199

Reviewed By: mengdilin

Differential Revision: D69972250

Pulled By: mdouze

fbshipit-source-id: 98a3f94d6884814873d3534ee25f960892ef1076
@github-actions github-actions bot added the stale label Mar 28, 2025
@github-actions github-actions bot closed this Apr 4, 2025
samanthawaters8882michaeldonovan added a commit to samanthawaters8882michaeldonovan/faiss that referenced this pull request Oct 12, 2025
dimitraseferiadi pushed a commit to dimitraseferiadi/SuCo that referenced this pull request Mar 8, 2026
3 participants