find related docs in single query #35727

Merged: andreer merged 1 commit into master from andreer/nn-by-id-searcher on Jan 29, 2026

andreer (Member) commented Jan 29, 2026

I confirm that this contribution is made under the terms of the license found in the root directory of this repository's source tree and that I have the authority necessary to make this contribution on behalf of its copyright owner.

Copilot AI (Contributor) left a comment

Pull request overview

This PR introduces a new searcher that enables finding related documents using nearest neighbor search based on document embeddings. The searcher fetches an embedding from a source document and performs a nearest neighbor query to find similar documents.

Changes:

  • Added RelatedDocumentsByNearestNeighborSearcher to enable "find similar documents" functionality in a single query
  • Added comprehensive test coverage for the new searcher with various scenarios
  • Updated ABI spec to include the new public API

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| container-search/src/main/java/com/yahoo/search/searchers/RelatedDocumentsByNearestNeighborSearcher.java | New searcher that fetches embeddings from source documents and performs nearest neighbor search to find related documents |
| container-search/src/test/java/com/yahoo/search/searchers/RelatedDocumentsByNearestNeighborSearcherTestCase.java | Comprehensive test suite covering error cases, query construction, and parameter handling |
| container-search/abi-spec.json | ABI specification update to expose the new searcher as public API |
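
The two-step flow described in the overview (fetch the source document's embedding, then search for its nearest neighbors) can be illustrated with a small in-memory sketch. This is not the PR's implementation and does not use the Vespa API: the class `RelatedDocsSketch`, its methods, the squared Euclidean distance, and the choice to exclude the source document from its own results are all assumptions made for illustration. Only the "Could not find document" wording is taken from the PR's test.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory illustration; none of these names come from the PR or from Vespa.
final class RelatedDocsSketch {

    // Step 1: look up the embedding stored on the source document.
    static Optional<double[]> fetchEmbedding(Map<String, double[]> index, String docId) {
        return Optional.ofNullable(index.get(docId));
    }

    // Squared Euclidean distance; an assumed metric, chosen only to keep the sketch short.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Step 2: rank all other documents by distance to that embedding and keep the closest k.
    static List<String> findRelated(Map<String, double[]> index, String docId, int k) {
        double[] source = fetchEmbedding(index, docId)
                .orElseThrow(() -> new IllegalArgumentException("Could not find document " + docId));
        List<String> candidates = new ArrayList<>();
        for (String id : index.keySet()) {
            if (!id.equals(docId)) candidates.add(id); // assumption: skip the source document itself
        }
        candidates.sort(Comparator.comparingDouble((String id) -> squaredDistance(source, index.get(id))));
        return candidates.subList(0, Math.min(k, candidates.size()));
    }

    public static void main(String[] args) {
        Map<String, double[]> index = Map.of(
                "doc1", new double[] {1.0, 0.0},
                "doc2", new double[] {0.9, 0.1},
                "doc3", new double[] {0.0, 1.0});
        System.out.println(findRelated(index, "doc1", 1)); // prints [doc2]
    }
}
```

A real searcher would delegate step 2 to the backend's nearest neighbor operator rather than scanning every document; the brute-force scan here only makes the fetch-then-search structure visible.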

Comment on lines +100 to +105
    Result result = execution.search(fetchQuery);
    execution.fill(result, summary);

    if (result.hits().size() < 1) {
        return null;
    }

Copilot AI Jan 29, 2026

The fetchEmbedding method should check for errors in the result from execution.search(fetchQuery) before accessing result.hits(). If the fetch query fails (e.g., due to backend communication errors), the method should handle the error appropriately rather than proceeding to check hit count. Consider adding a check like: if (result.hits().getError() != null) return null; after line 100.
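
The ordering suggested above (check for an error first, then check the hit count) can be sketched with stand-in types. `FetchErrorCheckSketch`, `FakeResult`, and `firstHitOrNull` are hypothetical names that exist only so the example compiles on its own; they are not the container's `Result` type or the PR's `fetchEmbedding` method.

```java
import java.util.List;

// Hypothetical stand-ins for the container's result types, so the sketch is self-contained.
final class FetchErrorCheckSketch {

    static final class FakeResult {
        final String error;       // null when the fetch succeeded
        final List<String> hits;  // ids of the documents that matched

        FakeResult(String error, List<String> hits) {
            this.error = error;
            this.hits = hits;
        }
    }

    // Returns the first hit, or null when the fetch failed or matched nothing.
    static String firstHitOrNull(FakeResult result) {
        if (result.error != null) return null;   // error check comes before any hit access
        if (result.hits.isEmpty()) return null;  // no error, but the document was not found
        return result.hits.get(0);
    }

    public static void main(String[] args) {
        System.out.println(firstHitOrNull(new FakeResult("backend timeout", List.of("doc1")))); // prints null
        System.out.println(firstHitOrNull(new FakeResult(null, List.of("doc1"))));              // prints doc1
    }
}
```

Returning null in both branches follows the suggestion as written, but it leaves the caller unable to tell a backend failure from a missing document; propagating the error to the caller may be preferable if that distinction matters.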

andreer (Member, Author) replied:

@copilot open a new pull request to apply changes based on this feedback

Comment on lines +61 to +69
    void testDocumentNotFoundReturnsError() {
        var searcher = new RelatedDocumentsByNearestNeighborSearcher();
        var query = new Query("?relatedTo.id=doc1&relatedTo.embeddingField=embedding&relatedTo.queryTensorName=q");
        var result = executeWithMockBackend(searcher, query);

        assertNotNull(result.hits().getError(), "Expected error but got none");
        assertTrue(result.hits().getError().getDetailedMessage().contains("Could not find document"),
                "Error message was: " + result.hits().getError().getDetailedMessage());
    }

Copilot AI Jan 29, 2026

There is no test coverage for the case where the embedding field exists but is not a Tensor (lines 109-112 in the main code). Consider adding a test case that verifies the behavior when the field contains a non-Tensor value and checks that the user gets an appropriate error message.
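
The kind of check such a test would exercise can be sketched in isolation. Everything here is hypothetical: `FakeTensor` stands in for the real tensor type, and `requireTensor` and its error message are invented for illustration, not taken from the searcher.

```java
// Hypothetical sketch of a type check on the embedding field and a test-style probe for it.
final class EmbeddingFieldCheckSketch {

    // Stand-in for the real tensor type, so the sketch compiles without external libraries.
    static final class FakeTensor {
        final double[] values;

        FakeTensor(double[] values) {
            this.values = values;
        }
    }

    // Accepts the raw field value and rejects anything that is not a tensor.
    static FakeTensor requireTensor(String fieldName, Object fieldValue) {
        if (!(fieldValue instanceof FakeTensor)) {
            // Assumed wording; the searcher's actual message may differ.
            throw new IllegalArgumentException("Field '" + fieldName + "' does not contain a tensor");
        }
        return (FakeTensor) fieldValue;
    }

    public static void main(String[] args) {
        // A string where a tensor is expected should be rejected with a message naming the field.
        try {
            requireTensor("embedding", "not a tensor");
            System.out.println("no error raised");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A test for the real searcher would follow the same shape as `testDocumentNotFoundReturnsError`: have the mock backend return a hit whose embedding field holds a non-tensor value, then assert on the error attached to the result.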

andreer (Member, Author) replied:

@copilot open a new pull request to apply changes based on this feedback

Copilot AI (Contributor) commented Jan 29, 2026

@andreer I've opened a new pull request, #35729, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot AI (Contributor) commented Jan 29, 2026

@andreer I've opened a new pull request, #35728, to work on those changes. Once the pull request is ready, I'll request review from you.

andreer merged commit 33a1030 into master on Jan 29, 2026 (8 of 9 checks passed)
andreer deleted the andreer/nn-by-id-searcher branch January 29, 2026 22:21

4 participants