feat:support columnar search result to better performance by jac0626 · Pull Request #3214 · milvus-io/pymilvus

jac0626 · 2026-01-19T07:37:28Z

sre-ci-robot · 2026-01-19T07:37:36Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: jac0626
To complete the pull request process, please assign longjiquan after the PR has been reviewed.
You can assign the PR to them by writing /assign @longjiquan in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

gemini-code-assist · 2026-01-19T07:37:49Z

Summary of Changes

Hello @jac0626, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant performance enhancement for search results by implementing a new ColumnarSearchResult class. This new class stores search data in a columnar format, allowing for lazy access and avoiding the overhead of eagerly creating numerous Python objects. The change is seamlessly integrated into the existing gRPC handlers, providing an option to retrieve results in this optimized format while maintaining full API compatibility with the previous row-based SearchResult. This feature aims to improve the efficiency of handling large search result sets by reducing memory footprint and speeding up initialization.

Highlights

New Columnar Search Result Implementation: Introduces ColumnarSearchResult as a high-performance alternative to the existing SearchResult, designed for efficiency in handling large result sets.
Performance Optimization: The new implementation utilizes columnar storage, lazy data access, and reduced object creation (O(1) initialization) to significantly improve performance and minimize memory usage, especially when only a subset of results is accessed.
API Compatibility: The ColumnarSearchResult is designed to be a drop-in replacement, maintaining full API compatibility with the original SearchResult's iteration patterns, field return types, and dict-like interfaces.
Integration into gRPC Handlers: Both synchronous (grpc_handler.py) and asynchronous (async_grpc_handler.py) gRPC search and hybrid search methods have been updated to conditionally return ColumnarSearchResult based on result_type or use_columnar parameters.
Comprehensive Testing: New test files (test_columnar_compatibility.py and test_columnar_search_result.py) have been added to ensure type compatibility, iteration behavior, dict-like interface, performance benefits, and cover various data types and edge cases for the new columnar result structure.
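The columnar, lazy-access idea summarized above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `ColumnarHits`, `LazyHit`, and the plain dict-of-lists input are hypothetical names and shapes chosen for the example. The container only keeps references to the columns, and a per-row view is built when a row is actually requested.

```python
from typing import Any, Dict, Iterator, List


class LazyHit:
    """Hypothetical row view: reads from the shared columns only on access."""

    __slots__ = ("_columns", "_row")

    def __init__(self, columns: Dict[str, List[Any]], row: int) -> None:
        self._columns = columns
        self._row = row

    def __getitem__(self, field: str) -> Any:
        return self._columns[field][self._row]

    @property
    def id(self) -> Any:
        return self._columns["id"][self._row]

    def to_dict(self) -> Dict[str, Any]:
        # Materialize a full row only when the caller explicitly asks for it.
        return {name: values[self._row] for name, values in self._columns.items()}


class ColumnarHits:
    """Hypothetical container: stores columns, creates no per-row objects up front."""

    def __init__(self, columns: Dict[str, List[Any]]) -> None:
        lengths = {len(values) for values in columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same length")
        self._columns = columns
        self._size = lengths.pop() if lengths else 0

    def __len__(self) -> int:
        return self._size

    def __getitem__(self, row: int) -> LazyHit:
        if not -self._size <= row < self._size:
            raise IndexError(row)
        return LazyHit(self._columns, row % self._size)

    def __iter__(self) -> Iterator[LazyHit]:
        return (LazyHit(self._columns, i) for i in range(self._size))


hits = ColumnarHits({"id": [1, 2, 3], "distance": [0.1, 0.2, 0.3], "title": ["a", "b", "c"]})
second_id = hits[1].id          # touches one cell of one column
last_title = hits[-1]["title"]  # negative indexing maps onto the same columns
```

Construction cost here depends on the number of columns rather than the number of rows, which is the property the summary describes as O(1) initialization.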

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces ColumnarSearchResult as a performance-optimized, drop-in replacement for SearchResult, focusing on lazy data access and reduced object creation. The implementation is well-done and includes an extensive and thorough test suite, which is excellent for ensuring compatibility and correctness. My review includes a few suggestions for improvement, mainly concerning a potential Liskov Substitution Principle violation in an accessor class, an opportunity to enhance performance when handling dynamic fields, and a note on the use of contextlib.suppress which could mask underlying issues.
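The concern about contextlib.suppress can be illustrated with a generic example. The review summary does not show how the PR uses it, so the functions below are hypothetical and only demonstrate the general risk: a broad suppression turns unexpected failures into silent defaults, while a narrow handler keeps them visible.

```python
import contextlib


def parse_score_suppressed(raw: dict) -> float:
    """Broad suppression: any failure silently yields the default value."""
    score = 0.0
    with contextlib.suppress(Exception):
        score = float(raw["score"])
    return score


def parse_score_narrow(raw: dict) -> float:
    """Only a missing key is treated as expected; malformed values still raise."""
    try:
        value = raw["score"]
    except KeyError:
        return 0.0
    return float(value)


# A malformed value is hidden by the broad version and reported as 0.0.
masked = parse_score_suppressed({"score": "oops"})
```

With the narrow version, the same malformed input raises ValueError, so the underlying data problem surfaces instead of being masked.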

pymilvus/client/columnar_search_result.py

codecov · 2026-01-19T08:34:18Z

Codecov Report

❌ Patch coverage is 93.30922% with 37 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.27%. Comparing base (f90b685) to head (ad267f8).

Files with missing lines	Patch %	Lines
pymilvus/client/columnar_search_result.py	93.71%	34 Missing ⚠️
pymilvus/orm/collection.py	33.33%	2 Missing ⚠️
pymilvus/orm/future.py	66.66%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #3214      +/-   ##
==========================================
+ Coverage   76.06%   76.27%   +0.21%     
==========================================
  Files          62       63       +1     
  Lines       13018    13559     +541     
==========================================
+ Hits         9902    10342     +440     
- Misses       3116     3217     +101


jac0626 · 2026-01-19T08:49:00Z

More work is needed in the future to improve the code's readability, maintainability, extensibility, and performance @jac0626:

- Extract base class: dedupe shared logic with SearchResult (highlight, metadata parsing)
- Replace if-elif chains: use a TypeHandler registry pattern in _bind_accessor()
- Consolidate Accessor classes: reduce boilerplate with a generic/factory pattern
- Unify dynamic field handling: merge the $meta path with the static field path
- Define Protocol/ABC: formal interface for type checking compatibility
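As a rough sketch of the registry idea for replacing if-elif chains, the snippet below dispatches on a type tag through a dictionary of factories. Everything here is an assumption made for illustration: the string type tags, the dict-shaped field descriptor, and the function names are hypothetical and are not taken from the PR's `_bind_accessor()`.

```python
from typing import Any, Callable, Dict, List

# Hypothetical aliases; real code would likely key on Milvus DataType values.
Accessor = Callable[[int], Any]
AccessorFactory = Callable[[Dict[str, Any]], Accessor]

_ACCESSOR_FACTORIES: Dict[str, AccessorFactory] = {}


def register(type_name: str) -> Callable[[AccessorFactory], AccessorFactory]:
    """Decorator that records a factory under a type tag."""
    def decorator(factory: AccessorFactory) -> AccessorFactory:
        _ACCESSOR_FACTORIES[type_name] = factory
        return factory
    return decorator


@register("int64")
def _int64_accessor(field: Dict[str, Any]) -> Accessor:
    data: List[int] = field["data"]
    return lambda row: data[row]


@register("float_vector")
def _float_vector_accessor(field: Dict[str, Any]) -> Accessor:
    # Vectors are stored flat; slice out `dim` values per row.
    data: List[float] = field["data"]
    dim: int = field["dim"]
    return lambda row: data[row * dim:(row + 1) * dim]


def bind_accessor(field: Dict[str, Any]) -> Accessor:
    """Look up the factory for the field's type instead of walking an if-elif chain."""
    type_name = field["type"]
    factory = _ACCESSOR_FACTORIES.get(type_name)
    if factory is None:
        raise TypeError(f"unsupported field type: {type_name!r}")
    return factory(field)


vec = bind_accessor({"type": "float_vector", "dim": 2, "data": [1.0, 2.0, 3.0, 4.0]})
ids = bind_accessor({"type": "int64", "data": [10, 20]})
```

Supporting a new type then means registering one more factory, without editing the dispatch function.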

XuanYang-cn · 2026-01-22T09:22:49Z

#3208 made some changes in search_result, which might need extra attention.

But let's not hurry into this new feature. I believe there are some small changes that we missed in the perf test, mainly engineering improvements. Let's not skip those; they are more likely to be released quickly and harmlessly.

And for a new feature like this one, we need a design, not a decision. In the design, I'd like some of these questions answered:

Why do we choose to add a flag in search and make it completely compatible with the old search_result? Why not replace it?
Why do we need a completely compatible result?
Can we provide to_pandas, to_arrow, etc.? Can we make the new return results typed for better usability?
Why do we choose to ignore those complex types, which are the most likely to benefit from this feature?

Anyway, let's discuss designs based on the issue #3213, and then implement based on the final design.

For the compatibility test (if we're choosing this way): the new code should pass the OLD unit tests; new unit tests don't prove anything.

jac0626 · 2026-01-22T09:55:14Z


I will do the engineering improvements first, then upload a design doc soon.

Signed-off-by: silas.jiang <[email protected]>

jac0626 · 2026-01-29T09:50:35Z


@XuanYang-cn Thanks for the feedback!

Regarding #3208: Already addressed. The columnar implementation has been updated to accommodate those changes.

On engineering improvements: Done, see #3240. I have been investigating some other improvements, but they make little difference.

On design: I've prepared a design doc:

To answer your specific questions:

Why compatible, not replace?
The last version provided a flag; now we are doing a direct replacement, with no flag and no dual versions. "Compatible" here means the new ColumnarSearchResult maintains the same API contract as the old SearchResult, so existing user code continues to work without changes.

Return types:
- For the standard iteration API (hit.id, hit['field']), return types are identical to the original.
- For the new get_column() API:
  - return_type="list": works for all types.
  - return_type="numpy": returns a native np.ndarray for numeric/vector types. For complex types (JSON, dynamic, sparse, etc.) where numpy offers no benefit, we will raise an error instead of returning inefficient object arrays.

to_pandas / to_arrow:
Not implemented yet. We can add these as follow-up work.

Complex types:
Fully supported: JSON, ARRAY, dynamic fields ($meta), and all vector types are covered.
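The get_column() behaviour described above might look roughly like the sketch below. Only the method name and the return_type values "list" and "numpy" come from the comment; the standalone function signature, the string type tags, the dict-shaped column store, and the choice of TypeError are assumptions made for this example.

```python
from typing import Any, Dict

# Hypothetical type tags for which a native ndarray is meaningful.
NUMPY_FRIENDLY = {"int64", "float", "double", "bool", "float_vector"}


def get_column(columns: Dict[str, Dict[str, Any]], name: str, return_type: str = "list") -> Any:
    """Return one whole column, either as a list or as a numpy array."""
    column = columns[name]
    if return_type == "list":
        # Works for every type, including JSON and dynamic fields.
        return list(column["data"])
    if return_type == "numpy":
        if column["type"] not in NUMPY_FRIENDLY:
            # Assumed error type; the comment only says "raise an error".
            raise TypeError(
                f"column {name!r} of type {column['type']!r} has no efficient numpy form; "
                "use return_type='list'"
            )
        import numpy as np  # imported lazily so the list path needs no numpy

        return np.asarray(column["data"])
    raise ValueError(f"unknown return_type: {return_type!r}")


columns = {
    "price": {"type": "double", "data": [9.5, 3.0]},
    "attrs": {"type": "json", "data": [{"k": 1}, {"k": 2}]},
}
attrs_as_list = get_column(columns, "attrs")
```

Failing fast for complex types avoids handing callers an object-dtype array that looks like a numpy result but brings none of its performance benefits.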

On testing:
Currently using patch-based compatibility tests (test_columnar_compat.py) to verify the new code passes the old SearchResult tests. Full validation would benefit from e2e testing against a real Milvus instance.
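A patch-based compatibility check of this kind can be sketched with the standard library alone. The snippet is an illustration under stated assumptions, not the contents of test_columnar_compat.py: `legacy`, `OldResult`, `NewResult`, and `search()` are hypothetical stand-ins for the library module, the two result classes, and the code path that constructs results.

```python
import types
import unittest
from unittest import mock


class OldResult:
    """Row-based stand-in: eagerly builds one dict per hit."""

    def __init__(self, ids, distances):
        self._rows = [{"id": i, "distance": d} for i, d in zip(ids, distances)]

    def __len__(self):
        return len(self._rows)

    def __getitem__(self, idx):
        return self._rows[idx]


class NewResult:
    """Columnar stand-in: keeps columns and builds a row only on access."""

    def __init__(self, ids, distances):
        self._ids, self._distances = list(ids), list(distances)

    def __len__(self):
        return len(self._ids)

    def __getitem__(self, idx):
        return {"id": self._ids[idx], "distance": self._distances[idx]}


# Stand-in for the module whose result class gets swapped out.
legacy = types.SimpleNamespace(Result=OldResult)


def search():
    return legacy.Result([7, 9], [0.1, 0.4])


class LegacyResultTests(unittest.TestCase):
    """Plays the role of the pre-existing tests written against the old class."""

    def test_len_and_index(self):
        hits = search()
        self.assertEqual(len(hits), 2)
        self.assertEqual(hits[0]["id"], 7)

    def test_iteration_order(self):
        self.assertEqual([h["distance"] for h in search()], [0.1, 0.4])


def run_suite() -> unittest.TestResult:
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(LegacyResultTests)
    result = unittest.TestResult()
    suite.run(result)
    return result


baseline = run_suite()
with mock.patch.object(legacy, "Result", NewResult):
    patched = run_suite()  # same old tests, now exercising the new class
```

The point of the pattern is that the test bodies stay untouched; only the class under test changes for the duration of the patch.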

Let me know if you'd like me to update anything in the design doc!

sre-ci-robot requested review from longjiquan and tedxu January 19, 2026 07:37

sre-ci-robot added the size/XXL label Jan 19, 2026

mergify bot added the dco-passed label Jan 19, 2026

gemini-code-assist bot reviewed Jan 19, 2026


jac0626 force-pushed the feature/columnar-search-result branch from 552ab0b to 1fe4346 Compare January 19, 2026 07:45

jac0626 changed the title ~~feat:support columnar search result to better performance~~ [WIP]feat:support columnar search result to better performance Jan 19, 2026

sre-ci-robot added the do-not-merge/work-in-progress label Jan 19, 2026

jac0626 force-pushed the feature/columnar-search-result branch from 76c71ab to da00c8a Compare January 19, 2026 07:56

mergify bot added the ci-passed label Jan 19, 2026

jac0626 changed the title ~~[WIP]feat:support columnar search result to better performance~~ feat:support columnar search result to better performance Jan 20, 2026

sre-ci-robot removed the do-not-merge/work-in-progress label Jan 20, 2026

jac0626 force-pushed the feature/columnar-search-result branch 7 times, most recently from 32e8221 to 1a3c4a2 Compare January 29, 2026 08:14

feat:support columnar search result to better performance

86573ae

Signed-off-by: silas.jiang <[email protected]>

jac0626 force-pushed the feature/columnar-search-result branch from 1a3c4a2 to 86573ae Compare January 29, 2026 08:27

docs:add design doc forcolumnar search

9808767

Signed-off-by: silas.jiang <[email protected]>

mergify bot added needs-dco and removed dco-passed labels Jan 29, 2026

feat: enforce strict numpy return types in get_column

ad267f8

Signed-off-by: silas.jiang <[email protected]>

jac0626 force-pushed the feature/columnar-search-result branch from e7db13d to ad267f8 Compare January 29, 2026 09:47

mergify bot added dco-passed and removed needs-dco labels Jan 29, 2026

jac0626 wants to merge 3 commits into milvus-io:master from jac0626:feature/columnar-search-result
