
fix CONTAINSANY index use for lists of embedded documents #3051

Merged
robfrank merged 3 commits into ArcadeData:main from gramian:main
Jan 4, 2026

Conversation

@gramian
Collaborator

@gramian gramian commented Dec 19, 2025

What does this PR do?

This change ports to CONTAINSANY the changes already made to CONTAINS for index use on lists of embedded documents.
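For readers unfamiliar with the feature, here is a minimal sketch of what an index over list items buys a CONTAINSANY query. It is not ArcadeDB code: `ByItemIndexSketch`, `put`, and `containsAny` are hypothetical names, and the in-memory map only stands in for a real index. Each embedded document in a list contributes one key, so the query reduces to a union of key lookups instead of a scan over every record.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; ArcadeDB's real index classes are not used here.
class ByItemIndexSketch {
    // Key: value of the nested property in one list item.
    // Value: ids of the records whose list contains an item with that value.
    private final Map<Object, Set<Integer>> postings = new HashMap<>();

    // Index one record: every embedded document in the list adds its own key.
    void put(int recordId, List<? extends Map<String, ?>> embeddedDocs, String nestedProperty) {
        for (Map<String, ?> doc : embeddedDocs) {
            Object key = doc.get(nestedProperty);
            if (key != null) {
                postings.computeIfAbsent(key, k -> new TreeSet<>()).add(recordId);
            }
        }
    }

    // CONTAINSANY-style lookup: union of the postings of every candidate value.
    Set<Integer> containsAny(Collection<?> candidates) {
        Set<Integer> result = new TreeSet<>();
        for (Object candidate : candidates) {
            Set<Integer> hits = postings.get(candidate);
            if (hits != null) {
                result.addAll(hits);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ByItemIndexSketch index = new ByItemIndexSketch();
        index.put(1, List.of(Map.of("id", 100), Map.of("id", 101)), "id");
        index.put(2, List.of(Map.of("id", 102)), "id");
        index.put(3, List.of(Map.of("id", 103)), "id");
        // Records whose list holds an item with id 101 or 102.
        System.out.println(index.containsAny(List.of(101, 102)));
    }
}
```

With the three sample records above, the lookup for ids 101 and 102 returns records 1 and 2 without touching record 3.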

Motivation

Manual tests

Related issues

#2802

Checklist

  • I have run the build using mvn clean package command
  • My unit tests cover both failure and success scenarios

@gemini-code-assist
Contributor

Summary of Changes

Hello @gramian, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a limitation in the CONTAINSANY operator's ability to leverage indexes, particularly when dealing with lists of embedded documents and nested properties. By mirroring the index awareness logic previously implemented for the CONTAINS operator, this change ensures that CONTAINSANY can more effectively utilize BY-ITEM indexes, leading to improved query performance for relevant data structures.

Highlights

  • Improved Index Awareness for CONTAINSANY: The CONTAINSANY operator now correctly identifies and utilizes indexes for queries involving lists of embedded documents and nested properties, aligning its behavior with the CONTAINS operator.
  • Support for Nested Properties: The isIndexAware method in ContainsAnyCondition.java has been enhanced to parse and recognize nested property paths (e.g., 'tags.id') when determining index applicability.
  • BY-ITEM Index Restriction: Index usage for CONTAINSANY is now explicitly restricted to BY-ITEM indexes, ensuring correct and efficient index application for list elements.
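The two checks named in these highlights can be illustrated with a small sketch. It is not the code in ContainsAnyCondition.java: `IndexDef` and `usableIndex` are hypothetical, the index model is reduced to a property path plus a by-item flag, and the path pattern is the one quoted in this PR's review thread.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical illustration of the decision, not ArcadeDB's isIndexAware method.
class ContainsAnyIndexAwarenessSketch {
    // Simplified index description (assumption): one indexed path and a by-item flag.
    record IndexDef(String property, boolean byItem) {}

    // Pattern quoted in the review: a leading identifier character followed by
    // identifier characters and dots.
    static final String SIMPLE_PATH = "[a-zA-Z_][a-zA-Z0-9_.]*";

    // Returns an index that a CONTAINSANY condition on leftExpression could use, if any.
    static Optional<IndexDef> usableIndex(String leftExpression, List<IndexDef> indexes) {
        // Step 1: the left-hand side must look like a plain or nested property path.
        if (!leftExpression.matches(SIMPLE_PATH)) {
            return Optional.empty();
        }
        // Step 2: only a by-item index on exactly that path qualifies.
        return indexes.stream()
                .filter(IndexDef::byItem)
                .filter(ix -> ix.property().equals(leftExpression))
                .findFirst();
    }

    public static void main(String[] args) {
        List<IndexDef> indexes = List.of(
                new IndexDef("tags.id", true),   // by-item index on a nested property
                new IndexDef("name", false));    // ordinary index, not by-item
        System.out.println(usableIndex("tags.id", indexes).isPresent());
        System.out.println(usableIndex("name", indexes).isPresent());
        System.out.println(usableIndex("tags.size()", indexes).isPresent());
    }
}
```

In this sketch a condition without a matching by-item index simply reports no usable index, which would leave the query to be answered without one.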


Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@mergify
Contributor

mergify bot commented Dec 19, 2025

🧪 CI Insights

Here's what we observed from your CI run for f65ce89.

🟢 All jobs passed!

But CI Insights is watching 👀

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request ports the index usage logic for lists of embedded documents from CONTAINS to CONTAINSANY. The changes correctly extend index awareness to nested properties and add a necessary check for BY-ITEM indexes.

I've identified a couple of areas for improvement:

  • A comment incorrectly refers to CONTAINS instead of CONTAINSANY.
  • The regular expression for validating property paths has some issues: it doesn't support quoted identifiers (which is a correctness issue for properties with special names) and it incorrectly allows invalid paths with multiple dots. Since this logic is copied from ContainsCondition, the same issue might exist there and should ideally be fixed in both places.

Overall, the change is in the right direction to fix the reported issue.

// Might be a nested property - try using the string representation
String leftStr = left.toString();
// Check if this looks like a simple property path (alphanumeric with dots)
if (leftStr.matches("[a-zA-Z_][a-zA-Z0-9_.]*")) {

high

The regular expression used to validate the property path is both too restrictive and too permissive.

  1. Too restrictive: It doesn't handle quoted identifiers (using backticks), which are necessary for property names with special characters like hyphens (e.g., `my-property`.name). This will cause index lookups to fail for such valid property paths.
  2. Too permissive: It allows multiple consecutive dots (e.g., tags..id), which is not a valid property path syntax.

A more accurate validation might be needed here. Using a more robust regular expression or parsing the path to validate each segment would make this more reliable.
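One possible shape for a stricter check, offered as a sketch rather than a proposed patch: validate the path segment by segment, where a segment is either a plain identifier or a backtick-quoted name. `PropertyPathValidator`, `STRICT`, and `isPropertyPath` are hypothetical names, and the exact quoting rules of the SQL grammar are an assumption here (any run of non-backtick characters between backticks is accepted).

```java
import java.util.regex.Pattern;

// Hypothetical validator; the names and the quoting rule are assumptions.
class PropertyPathValidator {
    // Pattern from the reviewed code, kept for comparison.
    static final Pattern ORIGINAL = Pattern.compile("[a-zA-Z_][a-zA-Z0-9_.]*");

    // One segment: a plain identifier, or a non-empty backtick-quoted name.
    private static final String SEGMENT = "(?:[a-zA-Z_][a-zA-Z0-9_]*|`[^`]+`)";

    // A path: one segment, then zero or more ".segment" continuations.
    // Empty segments ("a..b", "a.") cannot match.
    static final Pattern STRICT = Pattern.compile(SEGMENT + "(?:\\." + SEGMENT + ")*");

    static boolean isPropertyPath(String candidate) {
        return STRICT.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"tags.id", "tags..id", "tags.", "`my-property`.name"};
        for (String sample : samples) {
            System.out.printf("%-20s original=%b strict=%b%n",
                    sample, ORIGINAL.matcher(sample).matches(), isPropertyPath(sample));
        }
    }
}
```

On these samples, `tags..id` and `tags.` pass the original pattern but fail the stricter one, while `` `my-property`.name `` fails the original and passes the stricter one. Whether a quoted segment must be unquoted before the index lookup is not covered by this sketch.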

gramian and others added 2 commits December 19, 2025 20:42
…Condition.java

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Contributor

@lvca lvca left a comment


It looks good to me!

@lvca lvca added this to the 25.12.1 milestone Dec 21, 2025
@gramian
Collaborator Author

gramian commented Dec 27, 2025

Can this be merged? The remark about the regex is valid, but the fix needs to be applied not only to CONTAINSANY but potentially to all relevant filters.

@robfrank robfrank merged commit 51a058b into ArcadeData:main Jan 4, 2026
11 of 13 checks passed
mergify bot added a commit to robfrank/linklift that referenced this pull request Jan 9, 2026
….1 [skip ci]

Bumps [com.arcadedb:arcadedb-network](https://github.com/ArcadeData/arcadedb) from 25.11.1 to 25.12.1.
Release notes

*Sourced from [com.arcadedb:arcadedb-network's releases](https://github.com/ArcadeData/arcadedb/releases).*

> 25.12.1
> -------
>
> ArcadeDB 25.12.1 Release Notes
> ==============================
>
> We're excited to announce the release of ArcadeDB v25.12.1! This release includes significant bug fixes, new features, performance improvements, and dependency updates.
>
> Highlights
> ----------
>
> ### Vector Search Enhancements
>
> * **Fixed critical vector quantization bug** ([#3052](https://redirect.github.com/ArcadeData/arcadedb/issues/3052), [#3053](https://redirect.github.com/ArcadeData/arcadedb/issues/3053)) - INT8 and BINARY vector quantization now works correctly across all dimensions
> * **New filtered vector search** ([#3071](https://redirect.github.com/ArcadeData/arcadedb/issues/3071), [#3072](https://redirect.github.com/ArcadeData/arcadedb/issues/3072)) - LSMVectorIndex now supports filtered searches for more precise queries
> * **Better vector type support** ([#3090](https://redirect.github.com/ArcadeData/arcadedb/issues/3090)) - Added support for `List<Float>` in vector indexes
> * **Improved compression** ([#2911](https://redirect.github.com/ArcadeData/arcadedb/issues/2911)) - Enhanced compression for LSM vector indexes
> * **Fixed HNSW graph persistence** ([#2916](https://redirect.github.com/ArcadeData/arcadedb/issues/2916)) - Ensures JVector HNSW graph file is properly closed and flushed to disk
>
> ### SQL and Query Improvements
>
> * **Fixed IF statement execution** ([#2775](https://redirect.github.com/ArcadeData/arcadedb/issues/2775)) - SQL scripts with IF statements now execute correctly from console
> * **Fixed index creation with IF NOT EXISTS** ([#1819](https://redirect.github.com/ArcadeData/arcadedb/issues/1819)) - Console no longer errors when creating existing indexes with IF NOT EXISTS clause
> * **Custom function parameter binding** ([#3046](https://redirect.github.com/ArcadeData/arcadedb/issues/3046), [#3049](https://redirect.github.com/ArcadeData/arcadedb/issues/3049)) - Fixed parameter binding for SQL and JavaScript custom functions
> * **SQL method consistency** ([#2964](https://redirect.github.com/ArcadeData/arcadedb/issues/2964), [#2967](https://redirect.github.com/ArcadeData/arcadedb/issues/2967)) - `values()` method now behaves consistently with `keys()` method
> * **CONTAINSANY index fix** ([#3051](https://redirect.github.com/ArcadeData/arcadedb/issues/3051)) - Fixed index usage for lists of embedded documents with CONTAINSANY
>
> ### Transaction Management
>
> * **Revised transaction logic** ([#3074](https://redirect.github.com/ArcadeData/arcadedb/issues/3074)) - Improved transaction handling and consistency
> * **Fixed edge index invalidation** ([#3091](https://redirect.github.com/ArcadeData/arcadedb/issues/3091)) - Edge indexes now remain valid in edge-case scenarios
>
> ### New Features
>
> * **Database size API** ([#3045](https://redirect.github.com/ArcadeData/arcadedb/issues/3045)) - Added new `database.getSize()` API method
> * **Version display enhancement** ([#2905](https://redirect.github.com/ArcadeData/arcadedb/issues/2905)) - Server log version number now displayed consistently
>
> What's Changed
> --------------
>
> ### Bug Fixes
>
> * Fix INT8 and BINARY vector quantization offset bug in LSMVectorIndex page loading by [`@​Copilot`](https://github.com/Copilot) in [ArcadeData/arcadedb#3053](https://redirect.github.com/ArcadeData/arcadedb/pull/3053)
> * fix: revert SQL grammar changes and disable deep level JSON insert tests by [`@​robfrank`](https://github.com/robfrank) in [ArcadeData/arcadedb#2961](https://redirect.github.com/ArcadeData/arcadedb/pull/2961)
> * [#2915](https://redirect.github.com/ArcadeData/arcadedb/issues/2915) fix: ensure Jvector HNSW graph file is closed and flushed to disk on database close by [`@​robfrank`](https://github.com/robfrank) in [ArcadeData/arcadedb#2916](https://redirect.github.com/ArcadeData/arcadedb/pull/2916)
> * fix: make values method behave like keys method by [`@​gramian`](https://github.com/gramian) in [ArcadeData/arcadedb#2967](https://redirect.github.com/ArcadeData/arcadedb/pull/2967)
> * Fix custom function parameter binding for SQL and JavaScript functions by [`@​Copilot`](https://github.com/Copilot) in [ArcadeData/arcadedb#3049](https://redirect.github.com/ArcadeData/arcadedb/pull/3049)
> * fix CONTAINSANY index use for lists of embedded documents by [`@​gramian`](https://github.com/gramian) in [ArcadeData/arcadedb#3051](https://redirect.github.com/ArcadeData/arcadedb/pull/3051)
> * fix: support List<Float> in vector index by [`@​szekelyszabi`](https://github.com/szekelyszabi) in [ArcadeData/arcadedb#3090](https://redirect.github.com/ArcadeData/arcadedb/pull/3090)
>
> ### Features
>
> * Show version number same as in server log by [`@​gramian`](https://github.com/gramian) in [ArcadeData/arcadedb#2905](https://redirect.github.com/ArcadeData/arcadedb/pull/2905)
> * feat: added new `database.getSize()` api by [`@​lvca`](https://github.com/lvca) in [ArcadeData/arcadedb#3045](https://redirect.github.com/ArcadeData/arcadedb/pull/3045)
> * Add filtered vector search support to LSMVectorIndex by [`@​Copilot`](https://github.com/Copilot) in [ArcadeData/arcadedb#3072](https://redirect.github.com/ArcadeData/arcadedb/pull/3072)
> * add stars chart by [`@​robfrank`](https://github.com/robfrank) in [ArcadeData/arcadedb#3084](https://redirect.github.com/ArcadeData/arcadedb/pull/3084)
>
> ### Performance Improvements
>
> * Lsm vector fix by [`@​lvca`](https://github.com/lvca) in [ArcadeData/arcadedb#2907](https://redirect.github.com/ArcadeData/arcadedb/pull/2907)
> * perf: improved compression with lsm vectors by [`@​lvca`](https://github.com/lvca) in [ArcadeData/arcadedb#2911](https://redirect.github.com/ArcadeData/arcadedb/pull/2911)

... (truncated)


Commits

* [`6290454`](ArcadeData/arcadedb@6290454) Set release version to 25.12.1
* [`5bdbdfa`](ArcadeData/arcadedb@5bdbdfa) chore: removed system.out
* [`5764b95`](ArcadeData/arcadedb@5764b95) fix: deletion of light edge after last fix
* [`a81163a`](ArcadeData/arcadedb@a81163a) fix: avoid reuse of deleted record in same tx
* [`a42ae5e`](ArcadeData/arcadedb@a42ae5e) perf: avoid conversion of float[] into List<Float> in SQL engine
* [`c8fb3e5`](ArcadeData/arcadedb@c8fb3e5) chore: refactoring conversion functions to float[] in a centralized method
* [`de9bfcf`](ArcadeData/arcadedb@de9bfcf) fix: support List<Float> in vector index ([#3090](https://redirect.github.com/ArcadeData/arcadedb/issues/3090))
* [`9e964ef`](ArcadeData/arcadedb@9e964ef) Merge branch 'main' of <https://github.com/ArcadeData/arcadedb>
* [`07c7d3e`](ArcadeData/arcadedb@07c7d3e) Fixed failing test using java
* [`51a058b`](ArcadeData/arcadedb@51a058b) fix CONTAINSANY index use for lists of embedded documents ([#3051](https://redirect.github.com/ArcadeData/arcadedb/issues/3051))
* Additional commits viewable in [compare view](ArcadeData/arcadedb@25.11.1...25.12.1)
  
robfrank pushed a commit that referenced this pull request Feb 11, 2026
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
(cherry picked from commit 51a058b)