
Add OpenSearch performance blog for 3.3 #4023

Merged

natebower merged 10 commits into opensearch-project:main from kolchfa-aws:performance-blog-3 on Dec 9, 2025

Conversation

@kolchfa-aws
Collaborator

@kolchfa-aws kolchfa-aws commented Dec 2, 2025

Add OpenSearch performance blog for 3.3

Closes #4024

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the BSD-3-Clause License.

@github-actions

github-actions bot commented Dec 2, 2025

Thank you for submitting a blog post!

The blog post review process is: Submit a PR -> (Optional) Peer review -> Doc review -> Editorial review -> Marketing review -> Published.

@github-actions

github-actions bot commented Dec 2, 2025

Hi @kolchfa-aws,

It looks like you're adding a new blog post but don't have an issue mentioned. Please link this PR to an open issue using one of these keywords in the PR description:

  • Closes #issue-number
  • Fixes #issue-number
  • Resolves #issue-number

If an issue hasn't been created yet, please create one and then link it to this PR.

@kolchfa-aws kolchfa-aws added the Editorial review The blog is under editorial review label Dec 2, 2025
Collaborator

@natebower natebower left a comment

Editorial review

natebower previously approved these changes Dec 2, 2025
Collaborator

@natebower natebower left a comment

Thanks @kolchfa-aws! LGTM

@natebower natebower added Done and ready to publish The blog is approved and ready to publish and removed Editorial review The blog is under editorial review labels Dec 2, 2025
@natebower natebower removed their assignment Dec 2, 2025
Signed-off-by: Fanit Kolchina <[email protected]>
Contributor

Requesting to replace this with:
Latency performance - OS 3 3

Collaborator Author

@asimmahmood1 Should the first label be "OS 1.3"?

Collaborator Author

Also, could we make all top labels plural? ("Text queries", "Terms aggregations", and "Date histograms")

Contributor

Also, could we make all top labels plural? ("Text queries", "Terms aggregations", and "Date histograms")

The table in the blog has these titles, so those will need to change too. Let me do that.

Contributor

Latency performance - OS 3 3-update

Collaborator Author

I added the new image, but I would call the graph "Query latency" explicitly. If you call it "Performance", it looks like OpenSearch performance is going down with each new version.

Contributor

Latency performance - OS 3 3-update


![Memory-optimized search performance improvements](/assets/media/blog-images/2025-12-01-opensearch-performance-3.3/memory-optimized-performance-improvement.png){:style="width: 100%; max-width: 800px; height: auto; text-align: center"}

* **GPU performance**: OpenSearch's vector engine also gained features beyond core performance improvements, including GPU-accelerated k-NN indexing (OpenSearch 3.0) for **9× faster** index builds, disk-based ANN search with hybrid quantization (OpenSearch 3.1), and advanced filtering for multi-tenant vector data.
Contributor

@asimmahmood1 asimmahmood1 Dec 2, 2025

@navneet1v requested to replace with:

GPU performance: It's worth noting that OpenSearch's vector engine also gained many features in the 3.x line beyond core performance: for example, GPU-accelerated k-NN indexing is now GA (3.1) and provides 9× faster index builds. GPU-based index acceleration now supports building byte, binary, and quantized indexes (2×, 8×, 16×, and 32× compression) for both in-memory and on-disk modes.
Screenshot 2025-12-02 at 8:02:58 PM


* **[Adding support for BFloat16 with Faiss scalar quantizer for extended range](https://github.com/opensearch-project/k-NN/issues/2510)**: Adding BFloat16 support to the Faiss scalar quantizer will allow the k-NN plugin to overcome the range limitation of the existing FP16 implementation. FP16 restricts input vectors to [-65,504, 65,504] and prevents it from being a default data type, despite its 50% memory reduction and comparable performance to FP32. BFloat16 (SQbf16) provides the same extended range as FP32 (approximately ±3.4 × 10³⁸) while maintaining 50% memory savings by trading off precision, supporting 2–3 decimal values (7 mantissa bits). The k-NN engine can use Intel AVX512 BF16 instruction sets for hardware-accelerated performance on newer processors. This makes 16-bit quantization viable for a wider range of vector search use cases without range constraints.

* **[Moving k-NN interfaces to OpenSearch core](https://github.com/opensearch-project/OpenSearch/issues/20050)**: Moving k-NN vector interfaces from the k-NN plugin to OpenSearch core will address extensibility challenges and plugin dependencies in the growing vector search landscape. Currently, the k-NN plugin (supporting Lucene, Faiss, and the deprecated NMSLib engines) and the newer JVector plugin each implement their own interfaces, but there is no standardized approach. This proposal aims to elevate the common Lucene-based k-NN interfaces into OpenSearch core. Doing so will enable better extensibility for new vector engines, remove the hard dependency of the Neural Search plugin relying on the k-NN plugin, and allow any vector plugin (k-NN, JVector, or future engines) to integrate with the Neural Search plugin. It also provides a standardized contract that simplifies onboarding and encourages innovation in the vector search space.
Contributor

@asimmahmood1 asimmahmood1 Dec 2, 2025

@navneet1v requested to delete "Moving k-NN interfaces to OpenSearch core", since it's not performance related.

@vamshin do you want to add more here instead?

Contributor

Add:

  • Making memory-optimized search the default: In OpenSearch k-NN 3.3, we observed a significant improvement with memory-optimized search by combining the HNSW graph traversal algorithm from the Lucene library with C++ bulk SIMD-based distance computation. Further down the road, we plan to add more optimizations, including a warmup for memory-optimized search indexes to reduce tail latencies and making FP16 the default with memory-optimized search to cut the memory footprint in half.
  • Disk-based vector search V2: In version 2.17, the OpenSearch k-NN plugin added disk-based vector search support, allowing searches to run in lower-memory environments. In V2 of disk-based vector search, we will work on reducing disk reads by reordering vectors on disk to maximize the number of vectors retrieved per disk access, using techniques like bipartite graph partitioning (BPGP) and Gorder priority queue (Gorder-PQ). Along with this, we will add different flavors of Better Binary Quantization (flat and approximate search) to the vector engine.
  • Accelerating indexing and search: OpenSearch continues to leverage hardware acceleration using new SIMD instructions like avx512_fp16, BFloat16, and ARM SVE to boost search performance on x86- and ARM-based instances. Further, for remote index builds using GPUs, we plan to cut down index file transfer to and from GPU machines. This is expected to improve GPU-based index builds by 2×.
  • Making the OpenSearch vector engine extensible: OpenSearch proposes moving vector search interfaces from the k-NN plugin to OpenSearch core to address extensibility challenges and plugin dependencies in the growing vector search ecosystem. Currently, the k-NN plugin (supporting Lucene and Faiss) and the newer JVector plugin operate with their own implementations but lack standardized interfaces. The proposal aims to elevate common vector search interfaces into OpenSearch core, enabling better extensibility for new vector engines, eliminating the hard dependency of the neural-search plugin (used for hybrid search) on the k-NN plugin, and allowing you to choose any vector plugin (k-NN, JVector, or future engines) with neural-search, while providing a standardized contract that simplifies onboarding new vector engines and encourages innovation in the vector search space.
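The HNSW-style graph traversal mentioned above can be illustrated with a toy greedy best-first search in Python. This is a simplified single-layer sketch under invented data, not the actual Lucene or k-NN implementation:

```python
import heapq

def greedy_graph_search(graph, vectors, query, entry, k=2):
    """Toy best-first traversal of a proximity graph (HNSW-like, one layer).

    graph:   node -> list of neighbor nodes
    vectors: node -> coordinate tuple
    entry:   node where the traversal starts
    Returns the k nearest nodes as (squared_distance, node) pairs.
    """
    def dist(i):
        return sum((a - b) ** 2 for a, b in zip(vectors[i], query))

    visited = {entry}
    candidates = [(dist(entry), entry)]   # min-heap: frontier, nearest first
    best = [(-dist(entry), entry)]        # max-heap (negated): current top-k
    while candidates:
        d, node = heapq.heappop(candidates)
        # Stop once the nearest frontier node is worse than our worst result.
        if d > -best[0][0] and len(best) >= k:
            break
        for nbr in graph[node]:
            if nbr in visited:
                continue
            visited.add(nbr)
            dn = dist(nbr)
            heapq.heappush(candidates, (dn, nbr))
            heapq.heappush(best, (-dn, nbr))
            if len(best) > k:
                heapq.heappop(best)  # evict the current worst
    return sorted((-d, i) for d, i in best)
```

Real implementations layer this traversal hierarchically and replace the Python distance loop with bulk SIMD computation, which is where the gains described above come from.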

Contributor

@asimmahmood1 asimmahmood1 left a comment

Added comments to add/replace content. Otherwise, LGTM!

@kolchfa-aws @natebower



Signed-off-by: Fanit Kolchina <[email protected]>

* **Adding support for BFloat16 with Faiss scalar quantizer for extended range**: [Adding BFloat16 support to the Faiss scalar quantizer](https://github.com/opensearch-project/k-NN/issues/2510) will allow the k-NN plugin to overcome the range limitation of the existing FP16 implementation. FP16 restricts input vectors to [-65,504, 65,504] and prevents it from being a default data type, despite its 50% memory reduction and comparable performance to FP32. BFloat16 (SQbf16) provides the same extended range as FP32 (approximately ±3.4 × 10³⁸) while maintaining 50% memory savings by trading off precision, supporting 2–3 decimal values (7 mantissa bits). The k-NN engine can use Intel AVX512 BF16 instruction sets for hardware-accelerated performance on newer processors. This makes 16-bit quantization viable for a wider range of vector search use cases without range constraints.
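The range tradeoff described above can be sanity-checked in plain Python. This is a rough sketch: bfloat16 is emulated by truncating the low 16 bits of a float32 encoding (real hardware may round to nearest), and FP16 overflow is detected via the standard library's half-precision packing:

```python
import math
import struct

def to_bfloat16(x: float) -> float:
    """Emulate bfloat16 by zeroing the low 16 bits of the float32 encoding."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]

def fits_fp16(x: float) -> bool:
    """True if x packs into IEEE 754 half precision without overflow."""
    try:
        struct.pack(">e", x)  # 'e' is the half-precision format code
        return True
    except OverflowError:
        return False

# FP16 tops out at 65,504, while bfloat16 keeps float32's exponent range
# at the cost of mantissa precision (7 bits, roughly 2-3 decimal digits).
print(fits_fp16(65504.0))                  # within the FP16 range
print(fits_fp16(1.0e6))                    # overflows FP16
print(math.isfinite(to_bfloat16(3.0e38)))  # still finite in bfloat16
```

This mirrors why SQbf16 can cover FP32's range while keeping the 50% memory savings: it reuses float32's 8-bit exponent and trades away mantissa bits instead.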

* **Making memory-optimized search the default**: In OpenSearch 3.3, the k-NN plugin achieved significant improvements with memory-optimized search by combining the HNSW graph traversal algorithm from the Lucene library with C++ bulk SIMD-based distance computation. Future optimizations include adding [warmup functionality](https://github.com/opensearch-project/k-NN/issues/2939) for memory-optimized search indexes to reduce tail latencies and [making FP16 the default](https://github.com/opensearch-project/k-NN/issues/2924) with memory-optimized search to reduce the memory footprint by 50%.
Collaborator

"the" should precede the first instance of "default". "indices" => "indexes"

* **Accelerating indexing and search performance**: OpenSearch continues to use hardware acceleration with new SIMD instructions like [avx512_fp16](https://github.com/facebookresearch/faiss/pull/4225), BFloat16, and ARM SVE in order to improve search performance on x86 and ARM instances. For remote index builds using GPUs, planned optimizations include reducing [index file transfer](https://github.com/opensearch-project/remote-vector-index-builder/issues/94) between GPU machines, which is expected to improve GPU-based index builds by 2×.

* **Making the OpenSearch vector engine extensible**: [Moving vector search interfaces from the k-NN plugin to OpenSearch core](https://github.com/opensearch-project/OpenSearch/issues/20050) will address extensibility challenges and plugin dependencies in the growing vector search environment. Currently, the k-NN plugin (supporting Lucene and Faiss) and the newer JVector plugin operate with their own implementations but lack standardized interfaces. This proposal elevates common vector search interfaces into OpenSearch core, enabling better extensibility for new vector engines, eliminating the hard dependency where the neural-search plugin relies directly on the k-NN plugin, and allowing users to choose any vector plugin (k-NN, JVector, or future engines) with neural-search while providing a standardized contract that simplifies onboarding new vector engines and encourages innovation.
Collaborator

"eliminating the hard dependency of the Neural Search plugin relying directly on the k-NN plugin". Does the next instance of "neural-search" also refer to the Neural Search plugin?

Collaborator Author

Yes
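As a loose illustration of the standardized-contract idea under discussion, here is a hypothetical Python sketch. It is not the actual OpenSearch core interface; all names are invented:

```python
from abc import ABC, abstractmethod

class VectorEngine(ABC):
    """Hypothetical standardized contract a vector plugin could implement.

    Any engine (k-NN, JVector, a future plugin) honoring this interface
    would be swappable behind a consumer like a hybrid-search feature.
    """

    @abstractmethod
    def index(self, doc_id: str, vector: list) -> None:
        """Store a vector under a document ID."""

    @abstractmethod
    def search(self, query: list, k: int) -> list:
        """Return the k nearest (doc_id, squared_distance) pairs."""

class BruteForceEngine(VectorEngine):
    """Trivial reference engine used only to demonstrate the contract."""

    def __init__(self):
        self._vecs = {}

    def index(self, doc_id, vector):
        self._vecs[doc_id] = vector

    def search(self, query, k):
        def dist(v):
            return sum((a - b) ** 2 for a, b in zip(v, query))
        return sorted(((i, dist(v)) for i, v in self._vecs.items()),
                      key=lambda t: t[1])[:k]
```

The point of the proposal is that consumers would depend only on the abstract contract, so swapping `BruteForceEngine` for any other implementation requires no changes on the consumer side.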

Signed-off-by: Fanit Kolchina <[email protected]>
natebower previously approved these changes Dec 3, 2025
Collaborator

@natebower natebower left a comment

LGTM

Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
categories:
- technical-posts
- community
meta_keywords: OpenSearch 3.3, performance improvements, gRPC transport, vector search, query optimization, streaming aggregations, derived source, hybrid search

Updated meta:
meta_keywords: OpenSearch performance optimization, vector search, AI search, search latency, query performance, indexing performance, gRPC, aggregation optimization, machine learning, hybrid search, GPU acceleration, OpenSearch 3.3, database performance, search engine

meta_description: OpenSearch 3.3 delivers breakthrough performance with 11× faster queries, optimized vector search, and enhanced AI capabilities. Discover the latest innovations in search and analytics performance.

- mgodwani
- vamshin
- navneev
date: 2025-12-01

Publish date: 2025-12-08

@pajuric

pajuric commented Dec 8, 2025

@natebower - Please complete the final merge and close. The blog is live here: https://opensearch.org/blog/opensearch-3-3-performance-innovations-for-ai-search-solutions/

Collaborator

@natebower natebower left a comment

LGTM

@natebower natebower merged commit eef0983 into opensearch-project:main Dec 9, 2025
5 checks passed

Labels

Done and ready to publish The blog is approved and ready to publish

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BLOG] OpenSearch performance 3.3

4 participants